Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-12 Thread Boris Brezillon
On Wed, 11 Apr 2018 11:12:24 +0200
Ladislav Michl  wrote:

> On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote:
> > On Wed, 11 Apr 2018 10:27:46 +0200
> > Ladislav Michl  wrote:
> >   
> > > On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote:  
> > > > On Wed, 11 Apr 2018 09:36:56 +0200
> > > > Ladislav Michl  wrote:
> > > > 
> > > > > Hi Boris,
> > > > > 
> > > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:
> > > [...]  
> > > > > > Not sure this approach is safe on all archs: if the cache is VIVT or
> > > > > > VIPT, you may have several entries pointing to the same phys page, 
> > > > > > and
> > > > > > then, when dma_map_page() does its cache maintenance operations, 
> > > > > > it's
> > > > > > only taking one of these entries into account.  
> > > > > 
> > > > > Hmm, I used the same approach Samsung OneNAND driver does since commit
> > > > > dcf08227e964a53a2cb39130b74842c7dcb6adde.
> > > > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> > > > > is VIPT. In that case samsung's driver code has the same problem.
> > > > > 
> > > > > > In other parts of the MTD subsystem, we tend to not do DMA on 
> > > > > > buffers
> > > > > > that have been vmalloc-ed.
> > > > > > 
> > > > > > You can do something like
> > > > > > 
> > > > > > if (virt_addr_valid(buf))
> > > > > > /* Use DMA */
> > > > > > else
> > > > > > /*
> > > > > >  * Do not use DMA, or use a bounce buffer
> > > > > >  * allocated with kmalloc
> > > > > >  */  
> > > > > 
> > > > > Okay, I'll use this approach then, but first I'd like to be sure 
> > > > > above is
> > > > > correct. Anyone?
> > > > 
> > > > See this discussion [1]. The problem came up a few times already, so
> > > > might find other threads describing why it's not safe.
> > > > 
> > > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html
> > > > 
> > > 
> > > Question was more likely whenever there might exist more that one mapping
> > > of the same page.  
> > 
> > I'm clearly not an expert, so I might be wrong, but I think a physical
> > page can be both in the identity mapping and mapped in the vmalloc
> > space. Now, is there a real risk that both ends are accessed in
> > parallel thus making different cache entries pointing to the same phys
> > page dirty, I'm not sure. Or maybe there are other corner case, but
> > you'll have to ask Russell (or any other ARM expert) to get a clearer
> > answer.  
> 
> Thank you anyway. As DMA was disabled completely for all DT enabled boards
> until 4.16 let's play safe and disable it for HIGHMEM case as you suggested.
> Later, we might eventually use the same {read,write}_bufferram for both
> OMAP and S5PC110.
> 
> > > But okay, I'll disable DMA for highmem. Patch will follow.
> > > 
> > >   ladis  
> 
> Andreas, Nikolaus, could you please test patch bellow? It is against
> linux.git and should apply also against 4.16 once you modify path.
> 

Looks like it fixes the problem. Could you please rebase on top of
v4.17-rc1 when it's out and send a new version? Note that onenand code
has moved to drivers/mtd/nand/onenand, so we'll have to manually
backport the fix to 4.16.

Regards,

Boris

> Thank you,
>   ladis
> 
> --- >8 ---  
> 
> From: Ladislav Michl 
> Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers
> 
> dma_map_single doesn't get the proper DMA address for vmalloced area,
> so disable DMA in this case.
> 
> Signed-off-by: Ladislav Michl 
> Reported-by: "H. Nikolaus Schaller" 
> ---
>  drivers/mtd/nand/onenand/omap2.c | 105 +++
>  1 file changed, 38 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/mtd/nand/onenand/omap2.c 
> b/drivers/mtd/nand/onenand/omap2.c
> index 9c159f0dd9a6..321137158ff3 100644
> --- a/drivers/mtd/nand/onenand/omap2.c
> +++ b/drivers/mtd/nand/onenand/omap2.c
> @@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
>  {
>   struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
>   struct onenand_chip *this = mtd->priv;
> - dma_addr_t dma_src, dma_dst;
> - int bram_offset;
> + struct device *dev = &c->pdev->dev;
>   void *buf = (void *)buffer;
> + dma_addr_t dma_src, dma_dst;
> + int bram_offset, err;
>   size_t xtra;
> - int ret;
>  
>   bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
> - if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
> - goto out_copy;
> -
> - /* panic_write() may be in an interrupt context */
> - if (in_interrupt() || oops_in_progress)
> + /*
> +  * If the buffer address is not DMA-able, len is not long enough to make
> +  * DMA transfers profitable or panic_write() may be in an interrupt
> +  * context fallback to PIO 

Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-12 Thread H. Nikolaus Schaller
Hi Tony,

> Am 12.04.2018 um 17:03 schrieb Tony Lindgren :
> 
> * H. Nikolaus Schaller  [180411 07:05]:
>>> Am 10.04.2018 um 22:56 schrieb Ladislav Michl :
>>> Well, that's a shame... However, this code is in production for several
>>> months now,
>> 
>> Well, we could have tested earlier release-candidates...
> 
> I've found it's way easier to test Linux next on regular basis
> and report the regressions compared to chasing regressions after
> the merge window or tagged kernels. By that time you may have
> already multiple regressions to deal with which is always fun.
> 
> If you have products running with mainline kernel, maybe set up
> automated boot testing for core devices to catch regressions
> early?

Well, it is rare that we face upstream regressions of this severe
kind. Every 2-3 years... We much more often have to face missing
features in upstream so that we add our own patch sets on top
of each -rc.

In this case here we could have found this right after the merge window
when v4.16-rc1 came out. But by bad chance (and holidays and other
more important tasks) we did only test on GTA04 boards with standard
NAND. And just now did boot a board with OneNAND installed and were
surprised by this boot problem.

BR,
Nikolaus





Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-12 Thread Tony Lindgren
* H. Nikolaus Schaller  [180411 07:05]:
> > Am 10.04.2018 um 22:56 schrieb Ladislav Michl :
> > Well, that's a shame... However, this code is in production for several
> > months now,
> 
> Well, we could have tested earlier release-candidates...

I've found it's way easier to test Linux next on regular basis
and report the regressions compared to chasing regressions after
the merge window or tagged kernels. By that time you may have
already multiple regressions to deal with which is always fun.

If you have products running with mainline kernel, maybe set up
automated boot testing for core devices to catch regressions
early?

Regards,

Tony


Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread H. Nikolaus Schaller

> Am 11.04.2018 um 11:12 schrieb Ladislav Michl :
> 
> On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote:
>> On Wed, 11 Apr 2018 10:27:46 +0200
>> Ladislav Michl  wrote:
>> 
>>> On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote:
 On Wed, 11 Apr 2018 09:36:56 +0200
 Ladislav Michl  wrote:
 
> Hi Boris,
> 
> On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:  
>>> [...]
>> Not sure this approach is safe on all archs: if the cache is VIVT or
>> VIPT, you may have several entries pointing to the same phys page, and
>> then, when dma_map_page() does its cache maintenance operations, it's
>> only taking one of these entries into account.
> 
> Hmm, I used the same approach Samsung OneNAND driver does since commit
> dcf08227e964a53a2cb39130b74842c7dcb6adde.
> Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> is VIPT. In that case samsung's driver code has the same problem.
> 
>> In other parts of the MTD subsystem, we tend to not do DMA on buffers
>> that have been vmalloc-ed.
>> 
>> You can do something like
>> 
>>  if (virt_addr_valid(buf))
>>  /* Use DMA */
>>  else
>>  /*
>>   * Do not use DMA, or use a bounce buffer
>>   * allocated with kmalloc
>>   */
> 
> Okay, I'll use this approach then, but first I'd like to be sure above is
> correct. Anyone?  
 
 See this discussion [1]. The problem came up a few times already, so
 might find other threads describing why it's not safe.
 
 [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html
   
>>> 
>>> Question was more likely whenever there might exist more that one mapping
>>> of the same page.
>> 
>> I'm clearly not an expert, so I might be wrong, but I think a physical
>> page can be both in the identity mapping and mapped in the vmalloc
>> space. Now, is there a real risk that both ends are accessed in
>> parallel thus making different cache entries pointing to the same phys
>> page dirty, I'm not sure. Or maybe there are other corner case, but
>> you'll have to ask Russell (or any other ARM expert) to get a clearer
>> answer.
> 
> Thank you anyway. As DMA was disabled completely for all DT enabled boards
> until 4.16 let's play safe and disable it for HIGHMEM case as you suggested.
> Later, we might eventually use the same {read,write}_bufferram for both
> OMAP and S5PC110.
> 
>>> But okay, I'll disable DMA for highmem. Patch will follow.
>>> 
>>> ladis
> 
> Andreas, Nikolaus, could you please test patch bellow? It is against
> linux.git and should apply also against 4.16 once you modify path.

Works with v4.16!

I have done "tar cf /dev/null /mnt/nand" and have only
got some fixable bit flips. Writing a test file to OneNAND
also worked and doing a sync did move the weak block to a new PEB.

So I think read & write works again.

> 
> Thank you,
>   ladis

BR and thanks,
Nikolaus

> 
> --- >8 ---
> 
> From: Ladislav Michl 
> Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers
> 
> dma_map_single doesn't get the proper DMA address for vmalloced area,
> so disable DMA in this case.
> 
> Signed-off-by: Ladislav Michl 
> Reported-by: "H. Nikolaus Schaller" 
> ---
> drivers/mtd/nand/onenand/omap2.c | 105 +++
> 1 file changed, 38 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/mtd/nand/onenand/omap2.c 
> b/drivers/mtd/nand/onenand/omap2.c
> index 9c159f0dd9a6..321137158ff3 100644
> --- a/drivers/mtd/nand/onenand/omap2.c
> +++ b/drivers/mtd/nand/onenand/omap2.c
> @@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
> {
>   struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
>   struct onenand_chip *this = mtd->priv;
> - dma_addr_t dma_src, dma_dst;
> - int bram_offset;
> + struct device *dev = &c->pdev->dev;
>   void *buf = (void *)buffer;
> + dma_addr_t dma_src, dma_dst;
> + int bram_offset, err;
>   size_t xtra;
> - int ret;
> 
>   bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
> - if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
> - goto out_copy;
> -
> - /* panic_write() may be in an interrupt context */
> - if (in_interrupt() || oops_in_progress)
> + /*
> +  * If the buffer address is not DMA-able, len is not long enough to make
> +  * DMA transfers profitable or panic_write() may be in an interrupt
> +  * context fallback to PIO mode.
> +  */
> + if (!virt_addr_valid(buf) || bram_offset & 3 || (size_t)buf & 3 ||
> + count < 384 || in_interrupt() || oops_in_progress )
>   goto out_copy;
> 
> - if (buf >= high_memory) {
> - struct page *p1;
> -
> - 

Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Ladislav Michl
On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote:
> On Wed, 11 Apr 2018 10:27:46 +0200
> Ladislav Michl  wrote:
> 
> > On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote:
> > > On Wed, 11 Apr 2018 09:36:56 +0200
> > > Ladislav Michl  wrote:
> > >   
> > > > Hi Boris,
> > > > 
> > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:  
> > [...]
> > > > > Not sure this approach is safe on all archs: if the cache is VIVT or
> > > > > VIPT, you may have several entries pointing to the same phys page, and
> > > > > then, when dma_map_page() does its cache maintenance operations, it's
> > > > > only taking one of these entries into account.
> > > > 
> > > > Hmm, I used the same approach Samsung OneNAND driver does since commit
> > > > dcf08227e964a53a2cb39130b74842c7dcb6adde.
> > > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> > > > is VIPT. In that case samsung's driver code has the same problem.
> > > >   
> > > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers
> > > > > that have been vmalloc-ed.
> > > > > 
> > > > > You can do something like
> > > > > 
> > > > >   if (virt_addr_valid(buf))
> > > > >   /* Use DMA */
> > > > >   else
> > > > >   /*
> > > > >* Do not use DMA, or use a bounce buffer
> > > > >* allocated with kmalloc
> > > > >*/
> > > > 
> > > > Okay, I'll use this approach then, but first I'd like to be sure above 
> > > > is
> > > > correct. Anyone?  
> > > 
> > > See this discussion [1]. The problem came up a few times already, so
> > > might find other threads describing why it's not safe.
> > > 
> > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html
> > >   
> > 
> > Question was more likely whenever there might exist more that one mapping
> > of the same page.
> 
> I'm clearly not an expert, so I might be wrong, but I think a physical
> page can be both in the identity mapping and mapped in the vmalloc
> space. Now, is there a real risk that both ends are accessed in
> parallel thus making different cache entries pointing to the same phys
> page dirty, I'm not sure. Or maybe there are other corner case, but
> you'll have to ask Russell (or any other ARM expert) to get a clearer
> answer.

Thank you anyway. As DMA was disabled completely for all DT enabled boards
until 4.16 let's play safe and disable it for HIGHMEM case as you suggested.
Later, we might eventually use the same {read,write}_bufferram for both
OMAP and S5PC110.

> > But okay, I'll disable DMA for highmem. Patch will follow.
> > 
> > ladis

Andreas, Nikolaus, could you please test patch bellow? It is against
linux.git and should apply also against 4.16 once you modify path.

Thank you,
ladis

--- >8 ---

From: Ladislav Michl 
Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers

dma_map_single doesn't get the proper DMA address for vmalloced area,
so disable DMA in this case.

Signed-off-by: Ladislav Michl 
Reported-by: "H. Nikolaus Schaller" 
---
 drivers/mtd/nand/onenand/omap2.c | 105 +++
 1 file changed, 38 insertions(+), 67 deletions(-)

diff --git a/drivers/mtd/nand/onenand/omap2.c b/drivers/mtd/nand/onenand/omap2.c
index 9c159f0dd9a6..321137158ff3 100644
--- a/drivers/mtd/nand/onenand/omap2.c
+++ b/drivers/mtd/nand/onenand/omap2.c
@@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
*mtd, int area,
 {
struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
struct onenand_chip *this = mtd->priv;
-   dma_addr_t dma_src, dma_dst;
-   int bram_offset;
+   struct device *dev = &c->pdev->dev;
void *buf = (void *)buffer;
+   dma_addr_t dma_src, dma_dst;
+   int bram_offset, err;
size_t xtra;
-   int ret;
 
bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
-   if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
-   goto out_copy;
-
-   /* panic_write() may be in an interrupt context */
-   if (in_interrupt() || oops_in_progress)
+   /*
+* If the buffer address is not DMA-able, len is not long enough to make
+* DMA transfers profitable or panic_write() may be in an interrupt
+* context fallback to PIO mode.
+*/
+   if (!virt_addr_valid(buf) || bram_offset & 3 || (size_t)buf & 3 ||
+   count < 384 || in_interrupt() || oops_in_progress )
goto out_copy;
 
-   if (buf >= high_memory) {
-   struct page *p1;
-
-   if (((size_t)buf & PAGE_MASK) !=
-   ((size_t)(buf + count - 1) & PAGE_MASK))
-   goto out_copy;
-   p1 = vmalloc_to_page(buf);
-   if (!p1)
-   goto out_copy;
-   buf = page_addre

Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Boris Brezillon
On Wed, 11 Apr 2018 10:27:46 +0200
Ladislav Michl  wrote:

> On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote:
> > On Wed, 11 Apr 2018 09:36:56 +0200
> > Ladislav Michl  wrote:
> >   
> > > Hi Boris,
> > > 
> > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:  
> [...]
> > > > Not sure this approach is safe on all archs: if the cache is VIVT or
> > > > VIPT, you may have several entries pointing to the same phys page, and
> > > > then, when dma_map_page() does its cache maintenance operations, it's
> > > > only taking one of these entries into account.
> > > 
> > > Hmm, I used the same approach Samsung OneNAND driver does since commit
> > > dcf08227e964a53a2cb39130b74842c7dcb6adde.
> > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> > > is VIPT. In that case samsung's driver code has the same problem.
> > >   
> > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers
> > > > that have been vmalloc-ed.
> > > > 
> > > > You can do something like
> > > > 
> > > > if (virt_addr_valid(buf))
> > > > /* Use DMA */
> > > > else
> > > > /*
> > > >  * Do not use DMA, or use a bounce buffer
> > > >  * allocated with kmalloc
> > > >  */
> > > 
> > > Okay, I'll use this approach then, but first I'd like to be sure above is
> > > correct. Anyone?  
> > 
> > See this discussion [1]. The problem came up a few times already, so
> > might find other threads describing why it's not safe.
> > 
> > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html 
> >  
> 
> Question was more likely whenever there might exist more that one mapping
> of the same page.

I'm clearly not an expert, so I might be wrong, but I think a physical
page can be both in the identity mapping and mapped in the vmalloc
space. Now, is there a real risk that both ends are accessed in
parallel thus making different cache entries pointing to the same phys
page dirty, I'm not sure. Or maybe there are other corner case, but
you'll have to ask Russell (or any other ARM expert) to get a clearer
answer.

> But okay, I'll disable DMA for highmem. Patch will follow.
> 
>   ladis



Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Ladislav Michl
On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote:
> On Wed, 11 Apr 2018 09:36:56 +0200
> Ladislav Michl  wrote:
> 
> > Hi Boris,
> > 
> > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:
[...]
> > > Not sure this approach is safe on all archs: if the cache is VIVT or
> > > VIPT, you may have several entries pointing to the same phys page, and
> > > then, when dma_map_page() does its cache maintenance operations, it's
> > > only taking one of these entries into account.  
> > 
> > Hmm, I used the same approach Samsung OneNAND driver does since commit
> > dcf08227e964a53a2cb39130b74842c7dcb6adde.
> > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> > is VIPT. In that case samsung's driver code has the same problem.
> > 
> > > In other parts of the MTD subsystem, we tend to not do DMA on buffers
> > > that have been vmalloc-ed.
> > > 
> > > You can do something like
> > > 
> > >   if (virt_addr_valid(buf))
> > >   /* Use DMA */
> > >   else
> > >   /*
> > >* Do not use DMA, or use a bounce buffer
> > >* allocated with kmalloc
> > >*/  
> > 
> > Okay, I'll use this approach then, but first I'd like to be sure above is
> > correct. Anyone?
> 
> See this discussion [1]. The problem came up a few times already, so
> might find other threads describing why it's not safe.
> 
> [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html

Question was more likely whenever there might exist more that one mapping
of the same page. But okay, I'll disable DMA for highmem. Patch will follow.

ladis


Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Boris Brezillon
On Wed, 11 Apr 2018 09:36:56 +0200
Ladislav Michl  wrote:

> Hi Boris,
> 
> On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:
> > Hi Ladislav,
> > 
> > On Wed, 11 Apr 2018 08:26:07 +0200
> > Ladislav Michl  wrote:
> >   
> > > Hi Andreas,
> > > 
> > > On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote:  
> > > > Hi Ladis,
> > > > 
> > > > On Tue, 10 Apr 2018 22:56:43 +0200
> > > > Ladislav Michl  wrote:
> > > > 
> > > > > Hi Nikolaus,
> > > > > 
> > > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: 
> > > > >
> > > > > > Hi,
> > > > > > we just started testing the v4.16 kernel and found the
> > > > > > device no longer bootable (works with v4.15). It turned
> > > > > > out that there was a harmful modification somewhere between
> > > > > > v4.15.0 and v4.16-rc1.
> > > > > > 
> > > > > > A git bisect points to this patch:  
> > > > > 
> > > > > Well, that's a shame... However, this code is in production for 
> > > > > several
> > > > > months now, so could you, please put 'goto out_copy' if 'buf >= 
> > > > > high_memory'
> > > > > condition is met, ie:
> > > > > --- a/drivers/mtd/nand/onenand/omap2.c
> > > > > +++ b/drivers/mtd/nand/onenand/omap2.c
> > > > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct 
> > > > > mtd_info *mtd, int area,
> > > > >   if (buf >= high_memory) {
> > > > >   struct page *p1;
> > > > >  
> > > > > + goto out_copy;
> > > > >   if (((size_t)buf & PAGE_MASK) !=
> > > > >   ((size_t)(buf + count - 1) & PAGE_MASK))
> > > > >   goto out_copy;
> > > > 
> > > > I had the same problem here, and that snippet  helps here. ubiattach
> > > > -p /dev/mtdX does not cause kernel oopses here anymore
> > > 
> > > It seems reviving old code always comes at a price :-) Could you try
> > > following patch, so far compile tested only?
> > > (we'll need to do the same for omap2_onenand_write_bufferram, but
> > > it sould be enough for testing purposes now)
> > > 
> > > diff --git a/drivers/mtd/nand/onenand/omap2.c 
> > > b/drivers/mtd/nand/onenand/omap2.c
> > > index 9c159f0dd9a6..04cefd7a6487 100644
> > > --- a/drivers/mtd/nand/onenand/omap2.c
> > > +++ b/drivers/mtd/nand/onenand/omap2.c
> > > @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct 
> > > mtd_info *mtd, int area,
> > >  {
> > >   struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
> > >   struct onenand_chip *this = mtd->priv;
> > > + struct device *dev = &c->pdev->dev;
> > >   dma_addr_t dma_src, dma_dst;
> > >   int bram_offset;
> > >   void *buf = (void *)buffer;
> > >   size_t xtra;
> > > - int ret;
> > > + int ret, page_dma = 0;
> > >  
> > >   bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
> > >   if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
> > > @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct 
> > > mtd_info *mtd, int area,
> > >   if (in_interrupt() || oops_in_progress)
> > >   goto out_copy;
> > >  
> > > + xtra = count & 3;
> > > + if (xtra) {
> > > + count -= xtra;
> > > + memcpy(buf + count, this->base + bram_offset + count, xtra);
> > > + }
> > > +
> > > + /* Handle vmalloc address */
> > >   if (buf >= high_memory) {
> > > - struct page *p1;
> > > + struct page *page;
> > >  
> > >   if (((size_t)buf & PAGE_MASK) !=
> > >   ((size_t)(buf + count - 1) & PAGE_MASK))
> > >   goto out_copy;
> > > - p1 = vmalloc_to_page(buf);
> > > - if (!p1)
> > > + page = vmalloc_to_page(buf);  
> > 
> > Not sure this approach is safe on all archs: if the cache is VIVT or
> > VIPT, you may have several entries pointing to the same phys page, and
> > then, when dma_map_page() does its cache maintenance operations, it's
> > only taking one of these entries into account.  
> 
> Hmm, I used the same approach Samsung OneNAND driver does since commit
> dcf08227e964a53a2cb39130b74842c7dcb6adde.
> Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
> is VIPT. In that case samsung's driver code has the same problem.
> 
> > In other parts of the MTD subsystem, we tend to not do DMA on buffers
> > that have been vmalloc-ed.
> > 
> > You can do something like
> > 
> > if (virt_addr_valid(buf))
> > /* Use DMA */
> > else
> > /*
> >  * Do not use DMA, or use a bounce buffer
> >  * allocated with kmalloc
> >  */  
> 
> Okay, I'll use this approach then, but first I'd like to be sure above is
> correct. Anyone?

See this discussion [1]. The problem came up a few times already, so
might find other threads describing why it's not safe.

[1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html


Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Ladislav Michl
Hi Boris,

On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote:
> Hi Ladislav,
> 
> On Wed, 11 Apr 2018 08:26:07 +0200
> Ladislav Michl  wrote:
> 
> > Hi Andreas,
> > 
> > On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote:
> > > Hi Ladis,
> > > 
> > > On Tue, 10 Apr 2018 22:56:43 +0200
> > > Ladislav Michl  wrote:
> > >   
> > > > Hi Nikolaus,
> > > > 
> > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:  
> > > > > Hi,
> > > > > we just started testing the v4.16 kernel and found the
> > > > > device no longer bootable (works with v4.15). It turned
> > > > > out that there was a harmful modification somewhere between
> > > > > v4.15.0 and v4.16-rc1.
> > > > > 
> > > > > A git bisect points to this patch:
> > > > 
> > > > Well, that's a shame... However, this code is in production for several
> > > > months now, so could you, please put 'goto out_copy' if 'buf >= 
> > > > high_memory'
> > > > condition is met, ie:
> > > > --- a/drivers/mtd/nand/onenand/omap2.c
> > > > +++ b/drivers/mtd/nand/onenand/omap2.c
> > > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct 
> > > > mtd_info *mtd, int area,
> > > > if (buf >= high_memory) {
> > > > struct page *p1;
> > > >  
> > > > +   goto out_copy;
> > > > if (((size_t)buf & PAGE_MASK) !=
> > > > ((size_t)(buf + count - 1) & PAGE_MASK))
> > > > goto out_copy;  
> > > 
> > > I had the same problem here, and that snippet  helps here. ubiattach
> > > -p /dev/mtdX does not cause kernel oopses here anymore  
> > 
> > It seems reviving old code always comes at a price :-) Could you try
> > following patch, so far compile tested only?
> > (we'll need to do the same for omap2_onenand_write_bufferram, but
> > it sould be enough for testing purposes now)
> > 
> > diff --git a/drivers/mtd/nand/onenand/omap2.c 
> > b/drivers/mtd/nand/onenand/omap2.c
> > index 9c159f0dd9a6..04cefd7a6487 100644
> > --- a/drivers/mtd/nand/onenand/omap2.c
> > +++ b/drivers/mtd/nand/onenand/omap2.c
> > @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct 
> > mtd_info *mtd, int area,
> >  {
> > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
> > struct onenand_chip *this = mtd->priv;
> > +   struct device *dev = &c->pdev->dev;
> > dma_addr_t dma_src, dma_dst;
> > int bram_offset;
> > void *buf = (void *)buffer;
> > size_t xtra;
> > -   int ret;
> > +   int ret, page_dma = 0;
> >  
> > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
> > if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
> > @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct 
> > mtd_info *mtd, int area,
> > if (in_interrupt() || oops_in_progress)
> > goto out_copy;
> >  
> > +   xtra = count & 3;
> > +   if (xtra) {
> > +   count -= xtra;
> > +   memcpy(buf + count, this->base + bram_offset + count, xtra);
> > +   }
> > +
> > +   /* Handle vmalloc address */
> > if (buf >= high_memory) {
> > -   struct page *p1;
> > +   struct page *page;
> >  
> > if (((size_t)buf & PAGE_MASK) !=
> > ((size_t)(buf + count - 1) & PAGE_MASK))
> > goto out_copy;
> > -   p1 = vmalloc_to_page(buf);
> > -   if (!p1)
> > +   page = vmalloc_to_page(buf);
> 
> Not sure this approach is safe on all archs: if the cache is VIVT or
> VIPT, you may have several entries pointing to the same phys page, and
> then, when dma_map_page() does its cache maintenance operations, it's
> only taking one of these entries into account.

Hmm, I used the same approach Samsung OneNAND driver does since commit
dcf08227e964a53a2cb39130b74842c7dcb6adde.
Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which
is VIPT. In that case samsung's driver code has the same problem.

> In other parts of the MTD subsystem, we tend to not do DMA on buffers
> that have been vmalloc-ed.
> 
> You can do something like
> 
>   if (virt_addr_valid(buf))
>   /* Use DMA */
>   else
>   /*
>* Do not use DMA, or use a bounce buffer
>* allocated with kmalloc
>*/

Okay, I'll use this approach then, but first I'd like to be sure above is
correct. Anyone?

> Regards,
> 
> Boris

Thank you,
ladis

> > +   if (!page)
> > goto out_copy;
> > -   buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK);
> > -   }
> > -
> > -   xtra = count & 3;
> > -   if (xtra) {
> > -   count -= xtra;
> > -   memcpy(buf + count, this->base + bram_offset + count, xtra);
> > +   page_dma = 1;
> > +   dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK,
> > +  count, DMA_FROM_

Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread Boris Brezillon
Hi Ladislav,

On Wed, 11 Apr 2018 08:26:07 +0200
Ladislav Michl  wrote:

> Hi Andreas,
> 
> On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote:
> > Hi Ladis,
> > 
> > On Tue, 10 Apr 2018 22:56:43 +0200
> > Ladislav Michl  wrote:
> >   
> > > Hi Nikolaus,
> > > 
> > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:  
> > > > Hi,
> > > > we just started testing the v4.16 kernel and found the
> > > > device no longer bootable (works with v4.15). It turned
> > > > out that there was a harmful modification somewhere between
> > > > v4.15.0 and v4.16-rc1.
> > > > 
> > > > A git bisect points to this patch:
> > > 
> > > Well, that's a shame... However, this code is in production for several
> > > months now, so could you, please put 'goto out_copy' if 'buf >= 
> > > high_memory'
> > > condition is met, ie:
> > > --- a/drivers/mtd/nand/onenand/omap2.c
> > > +++ b/drivers/mtd/nand/onenand/omap2.c
> > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct 
> > > mtd_info *mtd, int area,
> > >   if (buf >= high_memory) {
> > >   struct page *p1;
> > >  
> > > + goto out_copy;
> > >   if (((size_t)buf & PAGE_MASK) !=
> > >   ((size_t)(buf + count - 1) & PAGE_MASK))
> > >   goto out_copy;  
> > 
> > I had the same problem here, and that snippet  helps here. ubiattach
> > -p /dev/mtdX does not cause kernel oopses here anymore  
> 
> It seems reviving old code always comes at a price :-) Could you try
> following patch, so far compile tested only?
> (we'll need to do the same for omap2_onenand_write_bufferram, but
> it sould be enough for testing purposes now)
> 
> diff --git a/drivers/mtd/nand/onenand/omap2.c 
> b/drivers/mtd/nand/onenand/omap2.c
> index 9c159f0dd9a6..04cefd7a6487 100644
> --- a/drivers/mtd/nand/onenand/omap2.c
> +++ b/drivers/mtd/nand/onenand/omap2.c
> @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
>  {
>   struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
>   struct onenand_chip *this = mtd->priv;
> + struct device *dev = &c->pdev->dev;
>   dma_addr_t dma_src, dma_dst;
>   int bram_offset;
>   void *buf = (void *)buffer;
>   size_t xtra;
> - int ret;
> + int ret, page_dma = 0;
>  
>   bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
>   if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
> @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
>   if (in_interrupt() || oops_in_progress)
>   goto out_copy;
>  
> + xtra = count & 3;
> + if (xtra) {
> + count -= xtra;
> + memcpy(buf + count, this->base + bram_offset + count, xtra);
> + }
> +
> + /* Handle vmalloc address */
>   if (buf >= high_memory) {
> - struct page *p1;
> + struct page *page;
>  
>   if (((size_t)buf & PAGE_MASK) !=
>   ((size_t)(buf + count - 1) & PAGE_MASK))
>   goto out_copy;
> - p1 = vmalloc_to_page(buf);
> - if (!p1)
> + page = vmalloc_to_page(buf);

Not sure this approach is safe on all archs: if the cache is VIVT or
VIPT, you may have several entries pointing to the same phys page, and
then, when dma_map_page() does its cache maintenance operations, it's
only taking one of these entries into account.

In other parts of the MTD subsystem, we tend to not do DMA on buffers
that have been vmalloc-ed.

You can do something like

if (virt_addr_valid(buf))
/* Use DMA */
else
/*
 * Do not use DMA, or use a bounce buffer
 * allocated with kmalloc
 */

Regards,

Boris

> + if (!page)
>   goto out_copy;
> - buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK);
> - }
> -
> - xtra = count & 3;
> - if (xtra) {
> - count -= xtra;
> - memcpy(buf + count, this->base + bram_offset + count, xtra);
> + page_dma = 1;
> + dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK,
> +count, DMA_FROM_DEVICE);
> + } else {
> + dma_dst = dma_map_single(dev, buf, count, DMA_FROM_DEVICE);
>   }
> -
>   dma_src = c->phys_base + bram_offset;
> - dma_dst = dma_map_single(&c->pdev->dev, buf, count, DMA_FROM_DEVICE);
> - if (dma_mapping_error(&c->pdev->dev, dma_dst)) {
> - dev_err(&c->pdev->dev,
> - "Couldn't DMA map a %d byte buffer\n",
> - count);
> +
> + if (dma_mapping_error(dev, dma_dst)) {
> + dev_err(dev, "Couldn't DMA map a %d byte buffer\n", count);
>   goto out_copy;
>   }
>  
>   ret = omap2_on

Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-11 Thread H. Nikolaus Schaller
Hi Ladislav,

> Am 10.04.2018 um 22:56 schrieb Ladislav Michl :
> 
> Hi Nikolaus,
> 
> On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:
>> Hi,
>> we just started testing the v4.16 kernel and found the
>> device no longer bootable (works with v4.15). It turned
>> out that there was a harmful modification somewhere between
>> v4.15.0 and v4.16-rc1.
>> 
>> A git bisect points to this patch:
> 
> Well, that's a shame... However, this code is in production for several
> months now,

Well, we could have tested earlier release-candidates...

> so could you, please put 'goto out_copy' if 'buf >= high_memory'
> condition is met, ie:
> --- a/drivers/mtd/nand/onenand/omap2.c
> +++ b/drivers/mtd/nand/onenand/omap2.c
> @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
>   if (buf >= high_memory) {
>   struct page *p1;
> 
> + goto out_copy;
>   if (((size_t)buf & PAGE_MASK) !=
>   ((size_t)(buf + count - 1) & PAGE_MASK))
>   goto out_copy;
> and in case it does not help put the same goto at the very beginning of
> omap2_onenand_read_bufferram function and report result?

Yes, works for me. Device boots and I can mount the NAND to inspect the
ubifs.

Only omap2_onenand_write_bufferram has the same problem, but that is to
be expected.

I'll try you new patch now.

> 
> Thank you for cooperation,
>   ladis

Thanks for the quick reply and analysis.

BR,
Nikolaus

> 
>> commit bdaca9345d41fd9420995469d27603ea62054691
>> Author: Ladislav Michl 
>> Date:   Fri Jan 12 14:16:57 2018 +0100
>> 
>>   mtd: onenand: omap2: Decouple DMA enabling from INT pin availability
>> 
>>   INT pin (gpio_irq) is not really needed for DMA but only for notification
>>   when a command that needs wait has completed. DMA memcpy can be still used
>>   even without gpio_irq available, so enable it unconditionally.
>> 
>>   Signed-off-by: Ladislav Michl 
>>   Reviewed-by: Peter Ujfalusi 
>>   Tested-by: Tony Lindgren 
>>   Tested-by: Aaro Koskinen 
>>   Acked-by: Roger Quadros 
>>   Signed-off-by: Boris Brezillon 
>> 
>> Kernel panic log is attached which indeed indicates a DMA problem.
>> 
>> Note that we have added compatible = "ti,omap2-onenand";
>> 
>> Any suggestions, fixes?
>> 
>> BR and thanks,
>> Nikolaus Schaller
>> 
>> 
>> 
>> ## Booting kernel from Legacy Image at 8200 ...
>>   Image Name:   Linux-4.16.0-letux+
>>   Image Type:   ARM Linux Kernel Image (uncompressed)
>>   Data Size:4456744 Bytes = 4.3 MiB
>>   Load Address: 80008000
>>   Entry Point:  80008000
>>   Verifying Checksum ... OK
>> ## Flattened Device Tree blob at 81c0
>>   Booting using the fdt blob at 0x81c0
>>   Loading Kernel Image ... OK
>>   Using Device Tree in place at 81c0, end 81c14a1e
>> 
>> Starting kernel ...
>> 
>> [0.00] Booting Linux on physical CPU 0x0
>> [0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version 
>> 4.9.2 (GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018
>> [0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), 
>> cr=10c5387d
>> [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
>> instruction cache
>> [0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804
>> [0.00] debug: ignoring loglevel setting.
>> [0.00] Memory policy: Data cache writeback
>> [0.00] cma: Reserved 16 MiB at 0xbe80
>> [0.00] On node 0 totalpages: 261632
>> [0.00]   Normal zone: 1536 pages used for memmap
>> [0.00]   Normal zone: 0 pages reserved
>> [0.00]   Normal zone: 196608 pages, LIFO batch:31
>> [0.00]   HighMem zone: 65024 pages, LIFO batch:15
>> [0.00] CPU: All CPU(s) started in SVC mode.
>> [0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk)
>> [0.00] random: fast init done
>> [0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 
>> u69632
>> [0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096
>> [0.00] pcpu-alloc: [0] 0 
>> [0.00] Built 1 zonelists, mobility grouping on.  Total pages: 260096
>> [0.00] Kernel command line: console=ttyO2,115200n8 
>> mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw 
>> rootfstype=ext4,ext3 rootwait console=ttyO2,115200n8 vram=12M 
>> omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait 
>> twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk
>> [0.00] log_buf_len: 8388608 bytes
>> [0.00] early log buf free: 63924(97%)
>> [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 
>> bytes)
>> [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
>> [0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K 
>> rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K 
>> cma-reserved, 243712K highmem)
>> [   

Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-10 Thread Ladislav Michl
Hi Andreas,

On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote:
> Hi Ladis,
> 
> On Tue, 10 Apr 2018 22:56:43 +0200
> Ladislav Michl  wrote:
> 
> > Hi Nikolaus,
> > 
> > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:
> > > Hi,
> > > we just started testing the v4.16 kernel and found the
> > > device no longer bootable (works with v4.15). It turned
> > > out that there was a harmful modification somewhere between
> > > v4.15.0 and v4.16-rc1.
> > > 
> > > A git bisect points to this patch:  
> > 
> > Well, that's a shame... However, this code is in production for several
> > months now, so could you, please put 'goto out_copy' if 'buf >= high_memory'
> > condition is met, ie:
> > --- a/drivers/mtd/nand/onenand/omap2.c
> > +++ b/drivers/mtd/nand/onenand/omap2.c
> > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> > *mtd, int area,
> > if (buf >= high_memory) {
> > struct page *p1;
> >  
> > +   goto out_copy;
> > if (((size_t)buf & PAGE_MASK) !=
> > ((size_t)(buf + count - 1) & PAGE_MASK))
> > goto out_copy;
> 
> I had the same problem here, and that snippet  helps here. ubiattach
> -p /dev/mtdX does not cause kernel oopses here anymore

It seems reviving old code always comes at a price :-) Could you try
following patch, so far compile tested only?
(we'll need to do the same for omap2_onenand_write_bufferram, but
it sould be enough for testing purposes now)

diff --git a/drivers/mtd/nand/onenand/omap2.c b/drivers/mtd/nand/onenand/omap2.c
index 9c159f0dd9a6..04cefd7a6487 100644
--- a/drivers/mtd/nand/onenand/omap2.c
+++ b/drivers/mtd/nand/onenand/omap2.c
@@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
*mtd, int area,
 {
struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
struct onenand_chip *this = mtd->priv;
+   struct device *dev = &c->pdev->dev;
dma_addr_t dma_src, dma_dst;
int bram_offset;
void *buf = (void *)buffer;
size_t xtra;
-   int ret;
+   int ret, page_dma = 0;
 
bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
@@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
*mtd, int area,
if (in_interrupt() || oops_in_progress)
goto out_copy;
 
+   xtra = count & 3;
+   if (xtra) {
+   count -= xtra;
+   memcpy(buf + count, this->base + bram_offset + count, xtra);
+   }
+
+   /* Handle vmalloc address */
if (buf >= high_memory) {
-   struct page *p1;
+   struct page *page;
 
if (((size_t)buf & PAGE_MASK) !=
((size_t)(buf + count - 1) & PAGE_MASK))
goto out_copy;
-   p1 = vmalloc_to_page(buf);
-   if (!p1)
+   page = vmalloc_to_page(buf);
+   if (!page)
goto out_copy;
-   buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK);
-   }
-
-   xtra = count & 3;
-   if (xtra) {
-   count -= xtra;
-   memcpy(buf + count, this->base + bram_offset + count, xtra);
+   page_dma = 1;
+   dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK,
+  count, DMA_FROM_DEVICE);
+   } else {
+   dma_dst = dma_map_single(dev, buf, count, DMA_FROM_DEVICE);
}
-
dma_src = c->phys_base + bram_offset;
-   dma_dst = dma_map_single(&c->pdev->dev, buf, count, DMA_FROM_DEVICE);
-   if (dma_mapping_error(&c->pdev->dev, dma_dst)) {
-   dev_err(&c->pdev->dev,
-   "Couldn't DMA map a %d byte buffer\n",
-   count);
+
+   if (dma_mapping_error(dev, dma_dst)) {
+   dev_err(dev, "Couldn't DMA map a %d byte buffer\n", count);
goto out_copy;
}
 
ret = omap2_onenand_dma_transfer(c, dma_src, dma_dst, count);
-   dma_unmap_single(&c->pdev->dev, dma_dst, count, DMA_FROM_DEVICE);
+   if (page_dma)
+   dma_unmap_page(dev, dma_dst, count, DMA_FROM_DEVICE);
+   else
+   dma_unmap_single(dev, dma_dst, count, DMA_FROM_DEVICE);
 
if (ret) {
-   dev_err(&c->pdev->dev, "timeout waiting for DMA\n");
+   dev_err(dev, "timeout waiting for DMA\n");
goto out_copy;
}
 


Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-10 Thread Andreas Kemnade
Hi Ladis,

On Tue, 10 Apr 2018 22:56:43 +0200
Ladislav Michl  wrote:

> Hi Nikolaus,
> 
> On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:
> > Hi,
> > we just started testing the v4.16 kernel and found the
> > device no longer bootable (works with v4.15). It turned
> > out that there was a harmful modification somewhere between
> > v4.15.0 and v4.16-rc1.
> > 
> > A git bisect points to this patch:  
> 
> Well, that's a shame... However, this code is in production for several
> months now, so could you, please put 'goto out_copy' if 'buf >= high_memory'
> condition is met, ie:
> --- a/drivers/mtd/nand/onenand/omap2.c
> +++ b/drivers/mtd/nand/onenand/omap2.c
> @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
> *mtd, int area,
>   if (buf >= high_memory) {
>   struct page *p1;
>  
> + goto out_copy;
>   if (((size_t)buf & PAGE_MASK) !=
>   ((size_t)(buf + count - 1) & PAGE_MASK))
>   goto out_copy;

I had the same problem here, and that snippet  helps here. ubiattach
-p /dev/mtdX does not cause kernel oopses here anymore


Regards,
Andreas


pgpMtvtkaSvQW.pgp
Description: OpenPGP digital signature


Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-10 Thread Ladislav Michl
Hi Nikolaus,

On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote:
> Hi,
> we just started testing the v4.16 kernel and found the
> device no longer bootable (works with v4.15). It turned
> out that there was a harmful modification somewhere between
> v4.15.0 and v4.16-rc1.
> 
> A git bisect points to this patch:

Well, that's a shame... However, this code is in production for several
months now, so could you, please put 'goto out_copy' if 'buf >= high_memory'
condition is met, ie:
--- a/drivers/mtd/nand/onenand/omap2.c
+++ b/drivers/mtd/nand/onenand/omap2.c
@@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info 
*mtd, int area,
if (buf >= high_memory) {
struct page *p1;
 
+   goto out_copy;
if (((size_t)buf & PAGE_MASK) !=
((size_t)(buf + count - 1) & PAGE_MASK))
goto out_copy;
and in case it does not help put the same goto at the very beginning of
omap2_onenand_read_bufferram function and report result?

Thank you for cooperation,
ladis

> commit bdaca9345d41fd9420995469d27603ea62054691
> Author: Ladislav Michl 
> Date:   Fri Jan 12 14:16:57 2018 +0100
> 
>mtd: onenand: omap2: Decouple DMA enabling from INT pin availability
> 
>INT pin (gpio_irq) is not really needed for DMA but only for notification
>when a command that needs wait has completed. DMA memcpy can be still used
>even without gpio_irq available, so enable it unconditionally.
> 
>Signed-off-by: Ladislav Michl 
>Reviewed-by: Peter Ujfalusi 
>Tested-by: Tony Lindgren 
>Tested-by: Aaro Koskinen 
>Acked-by: Roger Quadros 
>Signed-off-by: Boris Brezillon 
> 
> Kernel panic log is attached which indeed indicates a DMA problem.
> 
> Note that we have added compatible = "ti,omap2-onenand";
> 
> Any suggestions, fixes?
> 
> BR and thanks,
> Nikolaus Schaller
> 
> 
> 
> ## Booting kernel from Legacy Image at 8200 ...
>Image Name:   Linux-4.16.0-letux+
>Image Type:   ARM Linux Kernel Image (uncompressed)
>Data Size:4456744 Bytes = 4.3 MiB
>Load Address: 80008000
>Entry Point:  80008000
>Verifying Checksum ... OK
> ## Flattened Device Tree blob at 81c0
>Booting using the fdt blob at 0x81c0
>Loading Kernel Image ... OK
>Using Device Tree in place at 81c0, end 81c14a1e
> 
> Starting kernel ...
> 
> [0.00] Booting Linux on physical CPU 0x0
> [0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version 
> 4.9.2 (GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018
> [0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
> [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
> instruction cache
> [0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804
> [0.00] debug: ignoring loglevel setting.
> [0.00] Memory policy: Data cache writeback
> [0.00] cma: Reserved 16 MiB at 0xbe80
> [0.00] On node 0 totalpages: 261632
> [0.00]   Normal zone: 1536 pages used for memmap
> [0.00]   Normal zone: 0 pages reserved
> [0.00]   Normal zone: 196608 pages, LIFO batch:31
> [0.00]   HighMem zone: 65024 pages, LIFO batch:15
> [0.00] CPU: All CPU(s) started in SVC mode.
> [0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk)
> [0.00] random: fast init done
> [0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 
> u69632
> [0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096
> [0.00] pcpu-alloc: [0] 0 
> [0.00] Built 1 zonelists, mobility grouping on.  Total pages: 260096
> [0.00] Kernel command line: console=ttyO2,115200n8 
> mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw 
> rootfstype=ext4,ext3 rootwait console=ttyO2,115200n8 vram=12M 
> omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait 
> twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk
> [0.00] log_buf_len: 8388608 bytes
> [0.00] early log buf free: 63924(97%)
> [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 
> bytes)
> [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> [0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K 
> rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K 
> cma-reserved, 243712K highmem)
> [0.00] Virtual kernel memory layout:
> [0.00] vector  : 0x - 0x1000   (   4 kB)
> [0.00] fixmap  : 0xffc0 - 0xfff0   (3072 kB)
> [0.00] vmalloc : 0xf080 - 0xff80   ( 240 MB)
> [0.00] lowmem  : 0xc000 - 0xf000   ( 768 MB)
> [0.00] pkmap   : 0xbfe0 - 0xc000   (   2 MB)
> [0.00] modules : 0xbf00 - 0xbfe0   (  14 MB)
> [0.00]   .text : 0x(ptrval) 

[Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5

2018-04-10 Thread H. Nikolaus Schaller
Hi,
we just started testing the v4.16 kernel and found the
device no longer bootable (works with v4.15). It turned
out that there was a harmful modification somewhere between
v4.15.0 and v4.16-rc1.

A git bisect points to this patch:

commit bdaca9345d41fd9420995469d27603ea62054691
Author: Ladislav Michl 
Date:   Fri Jan 12 14:16:57 2018 +0100

   mtd: onenand: omap2: Decouple DMA enabling from INT pin availability

   INT pin (gpio_irq) is not really needed for DMA but only for notification
   when a command that needs wait has completed. DMA memcpy can be still used
   even without gpio_irq available, so enable it unconditionally.

   Signed-off-by: Ladislav Michl 
   Reviewed-by: Peter Ujfalusi 
   Tested-by: Tony Lindgren 
   Tested-by: Aaro Koskinen 
   Acked-by: Roger Quadros 
   Signed-off-by: Boris Brezillon 

Kernel panic log is attached which indeed indicates a DMA problem.

Note that we have added compatible = "ti,omap2-onenand";

Any suggestions, fixes?

BR and thanks,
Nikolaus Schaller



## Booting kernel from Legacy Image at 8200 ...
   Image Name:   Linux-4.16.0-letux+
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:4456744 Bytes = 4.3 MiB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 81c0
   Booting using the fdt blob at 0x81c0
   Loading Kernel Image ... OK
   Using Device Tree in place at 81c0, end 81c14a1e

Starting kernel ...

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version 4.9.2 
(GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018
[0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
instruction cache
[0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804
[0.00] debug: ignoring loglevel setting.
[0.00] Memory policy: Data cache writeback
[0.00] cma: Reserved 16 MiB at 0xbe80
[0.00] On node 0 totalpages: 261632
[0.00]   Normal zone: 1536 pages used for memmap
[0.00]   Normal zone: 0 pages reserved
[0.00]   Normal zone: 196608 pages, LIFO batch:31
[0.00]   HighMem zone: 65024 pages, LIFO batch:15
[0.00] CPU: All CPU(s) started in SVC mode.
[0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk)
[0.00] random: fast init done
[0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 
u69632
[0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096
[0.00] pcpu-alloc: [0] 0 
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 260096
[0.00] Kernel command line: console=ttyO2,115200n8 
mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw rootfstype=ext4,ext3 
rootwait console=ttyO2,115200n8 vram=12M omapfb.vram=0:8M,1:4M 
omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait twl4030_charger.allow_usb=1 
log_buf_len=8M ignore_loglevel earlyprintk
[0.00] log_buf_len: 8388608 bytes
[0.00] early log buf free: 63924(97%)
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K 
rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K 
cma-reserved, 243712K highmem)
[0.00] Virtual kernel memory layout:
[0.00] vector  : 0x - 0x1000   (   4 kB)
[0.00] fixmap  : 0xffc0 - 0xfff0   (3072 kB)
[0.00] vmalloc : 0xf080 - 0xff80   ( 240 MB)
[0.00] lowmem  : 0xc000 - 0xf000   ( 768 MB)
[0.00] pkmap   : 0xbfe0 - 0xc000   (   2 MB)
[0.00] modules : 0xbf00 - 0xbfe0   (  14 MB)
[0.00]   .text : 0x(ptrval) - 0x(ptrval)   (7162 kB)
[0.00]   .init : 0x(ptrval) - 0x(ptrval)   (1024 kB)
[0.00]   .data : 0x(ptrval) - 0x(ptrval)   ( 627 kB)
[0.00].bss : 0x(ptrval) - 0x(ptrval)   ( 219 kB)
[0.00] Preemptible hierarchical RCU implementation.
[0.00]  RCU restricting CPUs from NR_CPUS=2 to nr_cpu_ids=1.
[0.00]  Tasks RCU enabled.
[0.00] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[0.00] IRQ: Found an INTC at 0x(ptrval) (revision 4.0) with 96 
interrupts
[0.00] Clocking rate (Crystal/Core/MPU): 26.0/400/600 MHz
[0.00] OMAP clockevent source: timer1 at 32768 Hz
[0.00] clocksource: 32k_counter: mask: 0x max_cycles: 
0x, max_idle_ns: 58327039986419 ns
[0.00] sched_clock: 32 bits at 32kHz, resolution 30517ns, wraps every 
6553584741ns
[0.00] OMAP clocksource: 32k_counter at 32768 Hz
[0.001159] Console: colour dummy