Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
On Wed, 11 Apr 2018 11:12:24 +0200 Ladislav Michl wrote: > On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote: > > On Wed, 11 Apr 2018 10:27:46 +0200 > > Ladislav Michl wrote: > > > > > On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote: > > > > On Wed, 11 Apr 2018 09:36:56 +0200 > > > > Ladislav Michl wrote: > > > > > > > > > Hi Boris, > > > > > > > > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: > > > [...] > > > > > > Not sure this approach is safe on all archs: if the cache is VIVT or > > > > > > VIPT, you may have several entries pointing to the same phys page, > > > > > > and > > > > > > then, when dma_map_page() does its cache maintenance operations, > > > > > > it's > > > > > > only taking one of these entries into account. > > > > > > > > > > Hmm, I used the same approach Samsung OneNAND driver does since commit > > > > > dcf08227e964a53a2cb39130b74842c7dcb6adde. > > > > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > > > > > is VIPT. In that case samsung's driver code has the same problem. > > > > > > > > > > > In other parts of the MTD subsystem, we tend to not do DMA on > > > > > > buffers > > > > > > that have been vmalloc-ed. > > > > > > > > > > > > You can do something like > > > > > > > > > > > > if (virt_addr_valid(buf)) > > > > > > /* Use DMA */ > > > > > > else > > > > > > /* > > > > > > * Do not use DMA, or use a bounce buffer > > > > > > * allocated with kmalloc > > > > > > */ > > > > > > > > > > Okay, I'll use this approach then, but first I'd like to be sure > > > > > above is > > > > > correct. Anyone? > > > > > > > > See this discussion [1]. The problem came up a few times already, so > > > > might find other threads describing why it's not safe. > > > > > > > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html > > > > > > > > > > Question was more likely whenever there might exist more that one mapping > > > of the same page. > > > > I'm clearly not an expert, so I might be wrong, but I think a physical > > page can be both in the identity mapping and mapped in the vmalloc > > space. Now, is there a real risk that both ends are accessed in > > parallel thus making different cache entries pointing to the same phys > > page dirty, I'm not sure. Or maybe there are other corner case, but > > you'll have to ask Russell (or any other ARM expert) to get a clearer > > answer. > > Thank you anyway. As DMA was disabled completely for all DT enabled boards > until 4.16 let's play safe and disable it for HIGHMEM case as you suggested. > Later, we might eventually use the same {read,write}_bufferram for both > OMAP and S5PC110. > > > > But okay, I'll disable DMA for highmem. Patch will follow. > > > > > > ladis > > Andreas, Nikolaus, could you please test patch bellow? It is against > linux.git and should apply also against 4.16 once you modify path. > Looks like it fixes the problem. Could you please rebase on top of v4.17-rc1 when it's out and send a new version? Note that onenand code has moved to drivers/mtd/nand/onenand, so we'll have to manually backport the fix to 4.16. Regards, Boris > Thank you, > ladis > > --- >8 --- > > From: Ladislav Michl > Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers > > dma_map_single doesn't get the proper DMA address for vmalloced area, > so disable DMA in this case. > > Signed-off-by: Ladislav Michl > Reported-by: "H. Nikolaus Schaller" > --- > drivers/mtd/nand/onenand/omap2.c | 105 +++ > 1 file changed, 38 insertions(+), 67 deletions(-) > > diff --git a/drivers/mtd/nand/onenand/omap2.c > b/drivers/mtd/nand/onenand/omap2.c > index 9c159f0dd9a6..321137158ff3 100644 > --- a/drivers/mtd/nand/onenand/omap2.c > +++ b/drivers/mtd/nand/onenand/omap2.c > @@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > { > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); > struct onenand_chip *this = mtd->priv; > - dma_addr_t dma_src, dma_dst; > - int bram_offset; > + struct device *dev = &c->pdev->dev; > void *buf = (void *)buffer; > + dma_addr_t dma_src, dma_dst; > + int bram_offset, err; > size_t xtra; > - int ret; > > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; > - if (bram_offset & 3 || (size_t)buf & 3 || count < 384) > - goto out_copy; > - > - /* panic_write() may be in an interrupt context */ > - if (in_interrupt() || oops_in_progress) > + /* > + * If the buffer address is not DMA-able, len is not long enough to make > + * DMA transfers profitable or panic_write() may be in an interrupt > + * context fallback to PIO
Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Tony, > Am 12.04.2018 um 17:03 schrieb Tony Lindgren : > > * H. Nikolaus Schaller [180411 07:05]: >>> Am 10.04.2018 um 22:56 schrieb Ladislav Michl : >>> Well, that's a shame... However, this code is in production for several >>> months now, >> >> Well, we could have tested earlier release-candidates... > > I've found it's way easier to test Linux next on regular basis > and report the regressions compared to chasing regressions after > the merge window or tagged kernels. By that time you may have > already multiple regressions to deal with which is always fun. > > If you have products running with mainline kernel, maybe set up > automated boot testing for core devices to catch regressions > early? Well, it is rare that we face upstream regressions of this severe kind. Every 2-3 years... We much more often have to face missing features in upstream so that we add our own patch sets on top of each -rc. In this case here we could have found this right after the merge window when v4.16-rc1 came out. But by bad chance (and holidays and other more important tasks) we did only test on GTA04 boards with standard NAND. And just now did boot a board with OneNAND installed and were surprised by this boot problem. BR, Nikolaus
Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
* H. Nikolaus Schaller [180411 07:05]: > > Am 10.04.2018 um 22:56 schrieb Ladislav Michl : > > Well, that's a shame... However, this code is in production for several > > months now, > > Well, we could have tested earlier release-candidates... I've found it's way easier to test Linux next on regular basis and report the regressions compared to chasing regressions after the merge window or tagged kernels. By that time you may have already multiple regressions to deal with which is always fun. If you have products running with mainline kernel, maybe set up automated boot testing for core devices to catch regressions early? Regards, Tony
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
> Am 11.04.2018 um 11:12 schrieb Ladislav Michl : > > On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote: >> On Wed, 11 Apr 2018 10:27:46 +0200 >> Ladislav Michl wrote: >> >>> On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote: On Wed, 11 Apr 2018 09:36:56 +0200 Ladislav Michl wrote: > Hi Boris, > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: >>> [...] >> Not sure this approach is safe on all archs: if the cache is VIVT or >> VIPT, you may have several entries pointing to the same phys page, and >> then, when dma_map_page() does its cache maintenance operations, it's >> only taking one of these entries into account. > > Hmm, I used the same approach Samsung OneNAND driver does since commit > dcf08227e964a53a2cb39130b74842c7dcb6adde. > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > is VIPT. In that case samsung's driver code has the same problem. > >> In other parts of the MTD subsystem, we tend to not do DMA on buffers >> that have been vmalloc-ed. >> >> You can do something like >> >> if (virt_addr_valid(buf)) >> /* Use DMA */ >> else >> /* >> * Do not use DMA, or use a bounce buffer >> * allocated with kmalloc >> */ > > Okay, I'll use this approach then, but first I'd like to be sure above is > correct. Anyone? See this discussion [1]. The problem came up a few times already, so might find other threads describing why it's not safe. [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html >>> >>> Question was more likely whenever there might exist more that one mapping >>> of the same page. >> >> I'm clearly not an expert, so I might be wrong, but I think a physical >> page can be both in the identity mapping and mapped in the vmalloc >> space. Now, is there a real risk that both ends are accessed in >> parallel thus making different cache entries pointing to the same phys >> page dirty, I'm not sure. Or maybe there are other corner case, but >> you'll have to ask Russell (or any other ARM expert) to get a clearer >> answer. > > Thank you anyway. As DMA was disabled completely for all DT enabled boards > until 4.16 let's play safe and disable it for HIGHMEM case as you suggested. > Later, we might eventually use the same {read,write}_bufferram for both > OMAP and S5PC110. > >>> But okay, I'll disable DMA for highmem. Patch will follow. >>> >>> ladis > > Andreas, Nikolaus, could you please test patch bellow? It is against > linux.git and should apply also against 4.16 once you modify path. Works with v4.16! I have done "tar cf /dev/null /mnt/nand" and have only got some fixable bit flips. Writing a test file to OneNAND also worked and doing a sync did move the weak block to a new PEB. So I think read & write works again. > > Thank you, > ladis BR and thanks, Nikolaus > > --- >8 --- > > From: Ladislav Michl > Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers > > dma_map_single doesn't get the proper DMA address for vmalloced area, > so disable DMA in this case. > > Signed-off-by: Ladislav Michl > Reported-by: "H. Nikolaus Schaller" > --- > drivers/mtd/nand/onenand/omap2.c | 105 +++ > 1 file changed, 38 insertions(+), 67 deletions(-) > > diff --git a/drivers/mtd/nand/onenand/omap2.c > b/drivers/mtd/nand/onenand/omap2.c > index 9c159f0dd9a6..321137158ff3 100644 > --- a/drivers/mtd/nand/onenand/omap2.c > +++ b/drivers/mtd/nand/onenand/omap2.c > @@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > { > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); > struct onenand_chip *this = mtd->priv; > - dma_addr_t dma_src, dma_dst; > - int bram_offset; > + struct device *dev = &c->pdev->dev; > void *buf = (void *)buffer; > + dma_addr_t dma_src, dma_dst; > + int bram_offset, err; > size_t xtra; > - int ret; > > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; > - if (bram_offset & 3 || (size_t)buf & 3 || count < 384) > - goto out_copy; > - > - /* panic_write() may be in an interrupt context */ > - if (in_interrupt() || oops_in_progress) > + /* > + * If the buffer address is not DMA-able, len is not long enough to make > + * DMA transfers profitable or panic_write() may be in an interrupt > + * context fallback to PIO mode. > + */ > + if (!virt_addr_valid(buf) || bram_offset & 3 || (size_t)buf & 3 || > + count < 384 || in_interrupt() || oops_in_progress ) > goto out_copy; > > - if (buf >= high_memory) { > - struct page *p1; > - > -
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
On Wed, Apr 11, 2018 at 10:52:01AM +0200, Boris Brezillon wrote: > On Wed, 11 Apr 2018 10:27:46 +0200 > Ladislav Michl wrote: > > > On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote: > > > On Wed, 11 Apr 2018 09:36:56 +0200 > > > Ladislav Michl wrote: > > > > > > > Hi Boris, > > > > > > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: > > [...] > > > > > Not sure this approach is safe on all archs: if the cache is VIVT or > > > > > VIPT, you may have several entries pointing to the same phys page, and > > > > > then, when dma_map_page() does its cache maintenance operations, it's > > > > > only taking one of these entries into account. > > > > > > > > Hmm, I used the same approach Samsung OneNAND driver does since commit > > > > dcf08227e964a53a2cb39130b74842c7dcb6adde. > > > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > > > > is VIPT. In that case samsung's driver code has the same problem. > > > > > > > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers > > > > > that have been vmalloc-ed. > > > > > > > > > > You can do something like > > > > > > > > > > if (virt_addr_valid(buf)) > > > > > /* Use DMA */ > > > > > else > > > > > /* > > > > >* Do not use DMA, or use a bounce buffer > > > > >* allocated with kmalloc > > > > >*/ > > > > > > > > Okay, I'll use this approach then, but first I'd like to be sure above > > > > is > > > > correct. Anyone? > > > > > > See this discussion [1]. The problem came up a few times already, so > > > might find other threads describing why it's not safe. > > > > > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html > > > > > > > Question was more likely whenever there might exist more that one mapping > > of the same page. > > I'm clearly not an expert, so I might be wrong, but I think a physical > page can be both in the identity mapping and mapped in the vmalloc > space. Now, is there a real risk that both ends are accessed in > parallel thus making different cache entries pointing to the same phys > page dirty, I'm not sure. Or maybe there are other corner case, but > you'll have to ask Russell (or any other ARM expert) to get a clearer > answer. Thank you anyway. As DMA was disabled completely for all DT enabled boards until 4.16 let's play safe and disable it for HIGHMEM case as you suggested. Later, we might eventually use the same {read,write}_bufferram for both OMAP and S5PC110. > > But okay, I'll disable DMA for highmem. Patch will follow. > > > > ladis Andreas, Nikolaus, could you please test patch bellow? It is against linux.git and should apply also against 4.16 once you modify path. Thank you, ladis --- >8 --- From: Ladislav Michl Subject: [PATCH] mtd: onenand: omap2: Disable DMA for HIGHMEM buffers dma_map_single doesn't get the proper DMA address for vmalloced area, so disable DMA in this case. Signed-off-by: Ladislav Michl Reported-by: "H. Nikolaus Schaller" --- drivers/mtd/nand/onenand/omap2.c | 105 +++ 1 file changed, 38 insertions(+), 67 deletions(-) diff --git a/drivers/mtd/nand/onenand/omap2.c b/drivers/mtd/nand/onenand/omap2.c index 9c159f0dd9a6..321137158ff3 100644 --- a/drivers/mtd/nand/onenand/omap2.c +++ b/drivers/mtd/nand/onenand/omap2.c @@ -375,56 +375,42 @@ static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area, { struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); struct onenand_chip *this = mtd->priv; - dma_addr_t dma_src, dma_dst; - int bram_offset; + struct device *dev = &c->pdev->dev; void *buf = (void *)buffer; + dma_addr_t dma_src, dma_dst; + int bram_offset, err; size_t xtra; - int ret; bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; - if (bram_offset & 3 || (size_t)buf & 3 || count < 384) - goto out_copy; - - /* panic_write() may be in an interrupt context */ - if (in_interrupt() || oops_in_progress) + /* +* If the buffer address is not DMA-able, len is not long enough to make +* DMA transfers profitable or panic_write() may be in an interrupt +* context fallback to PIO mode. +*/ + if (!virt_addr_valid(buf) || bram_offset & 3 || (size_t)buf & 3 || + count < 384 || in_interrupt() || oops_in_progress ) goto out_copy; - if (buf >= high_memory) { - struct page *p1; - - if (((size_t)buf & PAGE_MASK) != - ((size_t)(buf + count - 1) & PAGE_MASK)) - goto out_copy; - p1 = vmalloc_to_page(buf); - if (!p1) - goto out_copy; - buf = page_addre
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
On Wed, 11 Apr 2018 10:27:46 +0200 Ladislav Michl wrote: > On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote: > > On Wed, 11 Apr 2018 09:36:56 +0200 > > Ladislav Michl wrote: > > > > > Hi Boris, > > > > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: > [...] > > > > Not sure this approach is safe on all archs: if the cache is VIVT or > > > > VIPT, you may have several entries pointing to the same phys page, and > > > > then, when dma_map_page() does its cache maintenance operations, it's > > > > only taking one of these entries into account. > > > > > > Hmm, I used the same approach Samsung OneNAND driver does since commit > > > dcf08227e964a53a2cb39130b74842c7dcb6adde. > > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > > > is VIPT. In that case samsung's driver code has the same problem. > > > > > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers > > > > that have been vmalloc-ed. > > > > > > > > You can do something like > > > > > > > > if (virt_addr_valid(buf)) > > > > /* Use DMA */ > > > > else > > > > /* > > > > * Do not use DMA, or use a bounce buffer > > > > * allocated with kmalloc > > > > */ > > > > > > Okay, I'll use this approach then, but first I'd like to be sure above is > > > correct. Anyone? > > > > See this discussion [1]. The problem came up a few times already, so > > might find other threads describing why it's not safe. > > > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html > > > > Question was more likely whenever there might exist more that one mapping > of the same page. I'm clearly not an expert, so I might be wrong, but I think a physical page can be both in the identity mapping and mapped in the vmalloc space. Now, is there a real risk that both ends are accessed in parallel thus making different cache entries pointing to the same phys page dirty, I'm not sure. Or maybe there are other corner case, but you'll have to ask Russell (or any other ARM expert) to get a clearer answer. > But okay, I'll disable DMA for highmem. Patch will follow. > > ladis
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
On Wed, Apr 11, 2018 at 10:08:06AM +0200, Boris Brezillon wrote: > On Wed, 11 Apr 2018 09:36:56 +0200 > Ladislav Michl wrote: > > > Hi Boris, > > > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: [...] > > > Not sure this approach is safe on all archs: if the cache is VIVT or > > > VIPT, you may have several entries pointing to the same phys page, and > > > then, when dma_map_page() does its cache maintenance operations, it's > > > only taking one of these entries into account. > > > > Hmm, I used the same approach Samsung OneNAND driver does since commit > > dcf08227e964a53a2cb39130b74842c7dcb6adde. > > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > > is VIPT. In that case samsung's driver code has the same problem. > > > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers > > > that have been vmalloc-ed. > > > > > > You can do something like > > > > > > if (virt_addr_valid(buf)) > > > /* Use DMA */ > > > else > > > /* > > >* Do not use DMA, or use a bounce buffer > > >* allocated with kmalloc > > >*/ > > > > Okay, I'll use this approach then, but first I'd like to be sure above is > > correct. Anyone? > > See this discussion [1]. The problem came up a few times already, so > might find other threads describing why it's not safe. > > [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html Question was more likely whenever there might exist more that one mapping of the same page. But okay, I'll disable DMA for highmem. Patch will follow. ladis
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
On Wed, 11 Apr 2018 09:36:56 +0200 Ladislav Michl wrote: > Hi Boris, > > On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: > > Hi Ladislav, > > > > On Wed, 11 Apr 2018 08:26:07 +0200 > > Ladislav Michl wrote: > > > > > Hi Andreas, > > > > > > On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote: > > > > Hi Ladis, > > > > > > > > On Tue, 10 Apr 2018 22:56:43 +0200 > > > > Ladislav Michl wrote: > > > > > > > > > Hi Nikolaus, > > > > > > > > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > > > > > > > > > > > Hi, > > > > > > we just started testing the v4.16 kernel and found the > > > > > > device no longer bootable (works with v4.15). It turned > > > > > > out that there was a harmful modification somewhere between > > > > > > v4.15.0 and v4.16-rc1. > > > > > > > > > > > > A git bisect points to this patch: > > > > > > > > > > Well, that's a shame... However, this code is in production for > > > > > several > > > > > months now, so could you, please put 'goto out_copy' if 'buf >= > > > > > high_memory' > > > > > condition is met, ie: > > > > > --- a/drivers/mtd/nand/onenand/omap2.c > > > > > +++ b/drivers/mtd/nand/onenand/omap2.c > > > > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct > > > > > mtd_info *mtd, int area, > > > > > if (buf >= high_memory) { > > > > > struct page *p1; > > > > > > > > > > + goto out_copy; > > > > > if (((size_t)buf & PAGE_MASK) != > > > > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > > > > goto out_copy; > > > > > > > > I had the same problem here, and that snippet helps here. ubiattach > > > > -p /dev/mtdX does not cause kernel oopses here anymore > > > > > > It seems reviving old code always comes at a price :-) Could you try > > > following patch, so far compile tested only? > > > (we'll need to do the same for omap2_onenand_write_bufferram, but > > > it sould be enough for testing purposes now) > > > > > > diff --git a/drivers/mtd/nand/onenand/omap2.c > > > b/drivers/mtd/nand/onenand/omap2.c > > > index 9c159f0dd9a6..04cefd7a6487 100644 > > > --- a/drivers/mtd/nand/onenand/omap2.c > > > +++ b/drivers/mtd/nand/onenand/omap2.c > > > @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct > > > mtd_info *mtd, int area, > > > { > > > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); > > > struct onenand_chip *this = mtd->priv; > > > + struct device *dev = &c->pdev->dev; > > > dma_addr_t dma_src, dma_dst; > > > int bram_offset; > > > void *buf = (void *)buffer; > > > size_t xtra; > > > - int ret; > > > + int ret, page_dma = 0; > > > > > > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; > > > if (bram_offset & 3 || (size_t)buf & 3 || count < 384) > > > @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct > > > mtd_info *mtd, int area, > > > if (in_interrupt() || oops_in_progress) > > > goto out_copy; > > > > > > + xtra = count & 3; > > > + if (xtra) { > > > + count -= xtra; > > > + memcpy(buf + count, this->base + bram_offset + count, xtra); > > > + } > > > + > > > + /* Handle vmalloc address */ > > > if (buf >= high_memory) { > > > - struct page *p1; > > > + struct page *page; > > > > > > if (((size_t)buf & PAGE_MASK) != > > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > > goto out_copy; > > > - p1 = vmalloc_to_page(buf); > > > - if (!p1) > > > + page = vmalloc_to_page(buf); > > > > Not sure this approach is safe on all archs: if the cache is VIVT or > > VIPT, you may have several entries pointing to the same phys page, and > > then, when dma_map_page() does its cache maintenance operations, it's > > only taking one of these entries into account. > > Hmm, I used the same approach Samsung OneNAND driver does since commit > dcf08227e964a53a2cb39130b74842c7dcb6adde. > Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which > is VIPT. In that case samsung's driver code has the same problem. > > > In other parts of the MTD subsystem, we tend to not do DMA on buffers > > that have been vmalloc-ed. > > > > You can do something like > > > > if (virt_addr_valid(buf)) > > /* Use DMA */ > > else > > /* > > * Do not use DMA, or use a bounce buffer > > * allocated with kmalloc > > */ > > Okay, I'll use this approach then, but first I'd like to be sure above is > correct. Anyone? See this discussion [1]. The problem came up a few times already, so might find other threads describing why it's not safe. [1]https://lists.linuxfoundation.org/pipermail/iommu/2016-March/016240.html
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Boris, On Wed, Apr 11, 2018 at 09:15:28AM +0200, Boris Brezillon wrote: > Hi Ladislav, > > On Wed, 11 Apr 2018 08:26:07 +0200 > Ladislav Michl wrote: > > > Hi Andreas, > > > > On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote: > > > Hi Ladis, > > > > > > On Tue, 10 Apr 2018 22:56:43 +0200 > > > Ladislav Michl wrote: > > > > > > > Hi Nikolaus, > > > > > > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > > > > > Hi, > > > > > we just started testing the v4.16 kernel and found the > > > > > device no longer bootable (works with v4.15). It turned > > > > > out that there was a harmful modification somewhere between > > > > > v4.15.0 and v4.16-rc1. > > > > > > > > > > A git bisect points to this patch: > > > > > > > > Well, that's a shame... However, this code is in production for several > > > > months now, so could you, please put 'goto out_copy' if 'buf >= > > > > high_memory' > > > > condition is met, ie: > > > > --- a/drivers/mtd/nand/onenand/omap2.c > > > > +++ b/drivers/mtd/nand/onenand/omap2.c > > > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct > > > > mtd_info *mtd, int area, > > > > if (buf >= high_memory) { > > > > struct page *p1; > > > > > > > > + goto out_copy; > > > > if (((size_t)buf & PAGE_MASK) != > > > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > > > goto out_copy; > > > > > > I had the same problem here, and that snippet helps here. ubiattach > > > -p /dev/mtdX does not cause kernel oopses here anymore > > > > It seems reviving old code always comes at a price :-) Could you try > > following patch, so far compile tested only? > > (we'll need to do the same for omap2_onenand_write_bufferram, but > > it sould be enough for testing purposes now) > > > > diff --git a/drivers/mtd/nand/onenand/omap2.c > > b/drivers/mtd/nand/onenand/omap2.c > > index 9c159f0dd9a6..04cefd7a6487 100644 > > --- a/drivers/mtd/nand/onenand/omap2.c > > +++ b/drivers/mtd/nand/onenand/omap2.c > > @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct > > mtd_info *mtd, int area, > > { > > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); > > struct onenand_chip *this = mtd->priv; > > + struct device *dev = &c->pdev->dev; > > dma_addr_t dma_src, dma_dst; > > int bram_offset; > > void *buf = (void *)buffer; > > size_t xtra; > > - int ret; > > + int ret, page_dma = 0; > > > > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; > > if (bram_offset & 3 || (size_t)buf & 3 || count < 384) > > @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct > > mtd_info *mtd, int area, > > if (in_interrupt() || oops_in_progress) > > goto out_copy; > > > > + xtra = count & 3; > > + if (xtra) { > > + count -= xtra; > > + memcpy(buf + count, this->base + bram_offset + count, xtra); > > + } > > + > > + /* Handle vmalloc address */ > > if (buf >= high_memory) { > > - struct page *p1; > > + struct page *page; > > > > if (((size_t)buf & PAGE_MASK) != > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > goto out_copy; > > - p1 = vmalloc_to_page(buf); > > - if (!p1) > > + page = vmalloc_to_page(buf); > > Not sure this approach is safe on all archs: if the cache is VIVT or > VIPT, you may have several entries pointing to the same phys page, and > then, when dma_map_page() does its cache maintenance operations, it's > only taking one of these entries into account. Hmm, I used the same approach Samsung OneNAND driver does since commit dcf08227e964a53a2cb39130b74842c7dcb6adde. Both TI OMAP3630 and Samsung S5PC110 are using Cortex-A8 which is VIPT. In that case samsung's driver code has the same problem. > In other parts of the MTD subsystem, we tend to not do DMA on buffers > that have been vmalloc-ed. > > You can do something like > > if (virt_addr_valid(buf)) > /* Use DMA */ > else > /* >* Do not use DMA, or use a bounce buffer >* allocated with kmalloc >*/ Okay, I'll use this approach then, but first I'd like to be sure above is correct. Anyone? > Regards, > > Boris Thank you, ladis > > + if (!page) > > goto out_copy; > > - buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK); > > - } > > - > > - xtra = count & 3; > > - if (xtra) { > > - count -= xtra; > > - memcpy(buf + count, this->base + bram_offset + count, xtra); > > + page_dma = 1; > > + dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK, > > + count, DMA_FROM_
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Ladislav, On Wed, 11 Apr 2018 08:26:07 +0200 Ladislav Michl wrote: > Hi Andreas, > > On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote: > > Hi Ladis, > > > > On Tue, 10 Apr 2018 22:56:43 +0200 > > Ladislav Michl wrote: > > > > > Hi Nikolaus, > > > > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > > > > Hi, > > > > we just started testing the v4.16 kernel and found the > > > > device no longer bootable (works with v4.15). It turned > > > > out that there was a harmful modification somewhere between > > > > v4.15.0 and v4.16-rc1. > > > > > > > > A git bisect points to this patch: > > > > > > Well, that's a shame... However, this code is in production for several > > > months now, so could you, please put 'goto out_copy' if 'buf >= > > > high_memory' > > > condition is met, ie: > > > --- a/drivers/mtd/nand/onenand/omap2.c > > > +++ b/drivers/mtd/nand/onenand/omap2.c > > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct > > > mtd_info *mtd, int area, > > > if (buf >= high_memory) { > > > struct page *p1; > > > > > > + goto out_copy; > > > if (((size_t)buf & PAGE_MASK) != > > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > > goto out_copy; > > > > I had the same problem here, and that snippet helps here. ubiattach > > -p /dev/mtdX does not cause kernel oopses here anymore > > It seems reviving old code always comes at a price :-) Could you try > following patch, so far compile tested only? > (we'll need to do the same for omap2_onenand_write_bufferram, but > it sould be enough for testing purposes now) > > diff --git a/drivers/mtd/nand/onenand/omap2.c > b/drivers/mtd/nand/onenand/omap2.c > index 9c159f0dd9a6..04cefd7a6487 100644 > --- a/drivers/mtd/nand/onenand/omap2.c > +++ b/drivers/mtd/nand/onenand/omap2.c > @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > { > struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); > struct onenand_chip *this = mtd->priv; > + struct device *dev = &c->pdev->dev; > dma_addr_t dma_src, dma_dst; > int bram_offset; > void *buf = (void *)buffer; > size_t xtra; > - int ret; > + int ret, page_dma = 0; > > bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; > if (bram_offset & 3 || (size_t)buf & 3 || count < 384) > @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > if (in_interrupt() || oops_in_progress) > goto out_copy; > > + xtra = count & 3; > + if (xtra) { > + count -= xtra; > + memcpy(buf + count, this->base + bram_offset + count, xtra); > + } > + > + /* Handle vmalloc address */ > if (buf >= high_memory) { > - struct page *p1; > + struct page *page; > > if (((size_t)buf & PAGE_MASK) != > ((size_t)(buf + count - 1) & PAGE_MASK)) > goto out_copy; > - p1 = vmalloc_to_page(buf); > - if (!p1) > + page = vmalloc_to_page(buf); Not sure this approach is safe on all archs: if the cache is VIVT or VIPT, you may have several entries pointing to the same phys page, and then, when dma_map_page() does its cache maintenance operations, it's only taking one of these entries into account. In other parts of the MTD subsystem, we tend to not do DMA on buffers that have been vmalloc-ed. You can do something like if (virt_addr_valid(buf)) /* Use DMA */ else /* * Do not use DMA, or use a bounce buffer * allocated with kmalloc */ Regards, Boris > + if (!page) > goto out_copy; > - buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK); > - } > - > - xtra = count & 3; > - if (xtra) { > - count -= xtra; > - memcpy(buf + count, this->base + bram_offset + count, xtra); > + page_dma = 1; > + dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK, > +count, DMA_FROM_DEVICE); > + } else { > + dma_dst = dma_map_single(dev, buf, count, DMA_FROM_DEVICE); > } > - > dma_src = c->phys_base + bram_offset; > - dma_dst = dma_map_single(&c->pdev->dev, buf, count, DMA_FROM_DEVICE); > - if (dma_mapping_error(&c->pdev->dev, dma_dst)) { > - dev_err(&c->pdev->dev, > - "Couldn't DMA map a %d byte buffer\n", > - count); > + > + if (dma_mapping_error(dev, dma_dst)) { > + dev_err(dev, "Couldn't DMA map a %d byte buffer\n", count); > goto out_copy; > } > > ret = omap2_on
Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Ladislav, > Am 10.04.2018 um 22:56 schrieb Ladislav Michl : > > Hi Nikolaus, > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: >> Hi, >> we just started testing the v4.16 kernel and found the >> device no longer bootable (works with v4.15). It turned >> out that there was a harmful modification somewhere between >> v4.15.0 and v4.16-rc1. >> >> A git bisect points to this patch: > > Well, that's a shame... However, this code is in production for several > months now, Well, we could have tested earlier release-candidates... > so could you, please put 'goto out_copy' if 'buf >= high_memory' > condition is met, ie: > --- a/drivers/mtd/nand/onenand/omap2.c > +++ b/drivers/mtd/nand/onenand/omap2.c > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > if (buf >= high_memory) { > struct page *p1; > > + goto out_copy; > if (((size_t)buf & PAGE_MASK) != > ((size_t)(buf + count - 1) & PAGE_MASK)) > goto out_copy; > and in case it does not help put the same goto at the very beginning of > omap2_onenand_read_bufferram function and report result? Yes, works for me. Device boots and I can mount the NAND to inspect the ubifs. Only omap2_onenand_write_bufferram has the same problem, but that is to be expected. I'll try you new patch now. > > Thank you for cooperation, > ladis Thanks for the quick reply and analysis. BR, Nikolaus > >> commit bdaca9345d41fd9420995469d27603ea62054691 >> Author: Ladislav Michl >> Date: Fri Jan 12 14:16:57 2018 +0100 >> >> mtd: onenand: omap2: Decouple DMA enabling from INT pin availability >> >> INT pin (gpio_irq) is not really needed for DMA but only for notification >> when a command that needs wait has completed. DMA memcpy can be still used >> even without gpio_irq available, so enable it unconditionally. >> >> Signed-off-by: Ladislav Michl >> Reviewed-by: Peter Ujfalusi >> Tested-by: Tony Lindgren >> Tested-by: Aaro Koskinen >> Acked-by: Roger Quadros >> Signed-off-by: Boris Brezillon >> >> Kernel panic log is attached which indeed indicates a DMA problem. >> >> Note that we have added compatible = "ti,omap2-onenand"; >> >> Any suggestions, fixes? >> >> BR and thanks, >> Nikolaus Schaller >> >> >> >> ## Booting kernel from Legacy Image at 8200 ... >> Image Name: Linux-4.16.0-letux+ >> Image Type: ARM Linux Kernel Image (uncompressed) >> Data Size:4456744 Bytes = 4.3 MiB >> Load Address: 80008000 >> Entry Point: 80008000 >> Verifying Checksum ... OK >> ## Flattened Device Tree blob at 81c0 >> Booting using the fdt blob at 0x81c0 >> Loading Kernel Image ... OK >> Using Device Tree in place at 81c0, end 81c14a1e >> >> Starting kernel ... >> >> [0.00] Booting Linux on physical CPU 0x0 >> [0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version >> 4.9.2 (GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018 >> [0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), >> cr=10c5387d >> [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing >> instruction cache >> [0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804 >> [0.00] debug: ignoring loglevel setting. >> [0.00] Memory policy: Data cache writeback >> [0.00] cma: Reserved 16 MiB at 0xbe80 >> [0.00] On node 0 totalpages: 261632 >> [0.00] Normal zone: 1536 pages used for memmap >> [0.00] Normal zone: 0 pages reserved >> [0.00] Normal zone: 196608 pages, LIFO batch:31 >> [0.00] HighMem zone: 65024 pages, LIFO batch:15 >> [0.00] CPU: All CPU(s) started in SVC mode. >> [0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk) >> [0.00] random: fast init done >> [0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 >> u69632 >> [0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096 >> [0.00] pcpu-alloc: [0] 0 >> [0.00] Built 1 zonelists, mobility grouping on. Total pages: 260096 >> [0.00] Kernel command line: console=ttyO2,115200n8 >> mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw >> rootfstype=ext4,ext3 rootwait console=ttyO2,115200n8 vram=12M >> omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait >> twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk >> [0.00] log_buf_len: 8388608 bytes >> [0.00] early log buf free: 63924(97%) >> [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 >> bytes) >> [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) >> [0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K >> rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K >> cma-reserved, 243712K highmem) >> [
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Andreas, On Wed, Apr 11, 2018 at 06:59:03AM +0200, Andreas Kemnade wrote: > Hi Ladis, > > On Tue, 10 Apr 2018 22:56:43 +0200 > Ladislav Michl wrote: > > > Hi Nikolaus, > > > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > > > Hi, > > > we just started testing the v4.16 kernel and found the > > > device no longer bootable (works with v4.15). It turned > > > out that there was a harmful modification somewhere between > > > v4.15.0 and v4.16-rc1. > > > > > > A git bisect points to this patch: > > > > Well, that's a shame... However, this code is in production for several > > months now, so could you, please put 'goto out_copy' if 'buf >= high_memory' > > condition is met, ie: > > --- a/drivers/mtd/nand/onenand/omap2.c > > +++ b/drivers/mtd/nand/onenand/omap2.c > > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info > > *mtd, int area, > > if (buf >= high_memory) { > > struct page *p1; > > > > + goto out_copy; > > if (((size_t)buf & PAGE_MASK) != > > ((size_t)(buf + count - 1) & PAGE_MASK)) > > goto out_copy; > > I had the same problem here, and that snippet helps here. ubiattach > -p /dev/mtdX does not cause kernel oopses here anymore It seems reviving old code always comes at a price :-) Could you try following patch, so far compile tested only? (we'll need to do the same for omap2_onenand_write_bufferram, but it sould be enough for testing purposes now) diff --git a/drivers/mtd/nand/onenand/omap2.c b/drivers/mtd/nand/onenand/omap2.c index 9c159f0dd9a6..04cefd7a6487 100644 --- a/drivers/mtd/nand/onenand/omap2.c +++ b/drivers/mtd/nand/onenand/omap2.c @@ -375,11 +375,12 @@ static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area, { struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd); struct onenand_chip *this = mtd->priv; + struct device *dev = &c->pdev->dev; dma_addr_t dma_src, dma_dst; int bram_offset; void *buf = (void *)buffer; size_t xtra; - int ret; + int ret, page_dma = 0; bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset; if (bram_offset & 3 || (size_t)buf & 3 || count < 384) @@ -389,38 +390,43 @@ static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area, if (in_interrupt() || oops_in_progress) goto out_copy; + xtra = count & 3; + if (xtra) { + count -= xtra; + memcpy(buf + count, this->base + bram_offset + count, xtra); + } + + /* Handle vmalloc address */ if (buf >= high_memory) { - struct page *p1; + struct page *page; if (((size_t)buf & PAGE_MASK) != ((size_t)(buf + count - 1) & PAGE_MASK)) goto out_copy; - p1 = vmalloc_to_page(buf); - if (!p1) + page = vmalloc_to_page(buf); + if (!page) goto out_copy; - buf = page_address(p1) + ((size_t)buf & ~PAGE_MASK); - } - - xtra = count & 3; - if (xtra) { - count -= xtra; - memcpy(buf + count, this->base + bram_offset + count, xtra); + page_dma = 1; + dma_dst = dma_map_page(dev, page, (size_t) buf & ~PAGE_MASK, + count, DMA_FROM_DEVICE); + } else { + dma_dst = dma_map_single(dev, buf, count, DMA_FROM_DEVICE); } - dma_src = c->phys_base + bram_offset; - dma_dst = dma_map_single(&c->pdev->dev, buf, count, DMA_FROM_DEVICE); - if (dma_mapping_error(&c->pdev->dev, dma_dst)) { - dev_err(&c->pdev->dev, - "Couldn't DMA map a %d byte buffer\n", - count); + + if (dma_mapping_error(dev, dma_dst)) { + dev_err(dev, "Couldn't DMA map a %d byte buffer\n", count); goto out_copy; } ret = omap2_onenand_dma_transfer(c, dma_src, dma_dst, count); - dma_unmap_single(&c->pdev->dev, dma_dst, count, DMA_FROM_DEVICE); + if (page_dma) + dma_unmap_page(dev, dma_dst, count, DMA_FROM_DEVICE); + else + dma_unmap_single(dev, dma_dst, count, DMA_FROM_DEVICE); if (ret) { - dev_err(&c->pdev->dev, "timeout waiting for DMA\n"); + dev_err(dev, "timeout waiting for DMA\n"); goto out_copy; }
Re: [Letux-kernel] [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Ladis, On Tue, 10 Apr 2018 22:56:43 +0200 Ladislav Michl wrote: > Hi Nikolaus, > > On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > > Hi, > > we just started testing the v4.16 kernel and found the > > device no longer bootable (works with v4.15). It turned > > out that there was a harmful modification somewhere between > > v4.15.0 and v4.16-rc1. > > > > A git bisect points to this patch: > > Well, that's a shame... However, this code is in production for several > months now, so could you, please put 'goto out_copy' if 'buf >= high_memory' > condition is met, ie: > --- a/drivers/mtd/nand/onenand/omap2.c > +++ b/drivers/mtd/nand/onenand/omap2.c > @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info > *mtd, int area, > if (buf >= high_memory) { > struct page *p1; > > + goto out_copy; > if (((size_t)buf & PAGE_MASK) != > ((size_t)(buf + count - 1) & PAGE_MASK)) > goto out_copy; I had the same problem here, and that snippet helps here. ubiattach -p /dev/mtdX does not cause kernel oopses here anymore Regards, Andreas pgpMtvtkaSvQW.pgp Description: OpenPGP digital signature
Re: [Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi Nikolaus, On Tue, Apr 10, 2018 at 06:25:17PM +0200, H. Nikolaus Schaller wrote: > Hi, > we just started testing the v4.16 kernel and found the > device no longer bootable (works with v4.15). It turned > out that there was a harmful modification somewhere between > v4.15.0 and v4.16-rc1. > > A git bisect points to this patch: Well, that's a shame... However, this code is in production for several months now, so could you, please put 'goto out_copy' if 'buf >= high_memory' condition is met, ie: --- a/drivers/mtd/nand/onenand/omap2.c +++ b/drivers/mtd/nand/onenand/omap2.c @@ -392,6 +392,7 @@ static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area, if (buf >= high_memory) { struct page *p1; + goto out_copy; if (((size_t)buf & PAGE_MASK) != ((size_t)(buf + count - 1) & PAGE_MASK)) goto out_copy; and in case it does not help put the same goto at the very beginning of omap2_onenand_read_bufferram function and report result? Thank you for cooperation, ladis > commit bdaca9345d41fd9420995469d27603ea62054691 > Author: Ladislav Michl > Date: Fri Jan 12 14:16:57 2018 +0100 > >mtd: onenand: omap2: Decouple DMA enabling from INT pin availability > >INT pin (gpio_irq) is not really needed for DMA but only for notification >when a command that needs wait has completed. DMA memcpy can be still used >even without gpio_irq available, so enable it unconditionally. > >Signed-off-by: Ladislav Michl >Reviewed-by: Peter Ujfalusi >Tested-by: Tony Lindgren >Tested-by: Aaro Koskinen >Acked-by: Roger Quadros >Signed-off-by: Boris Brezillon > > Kernel panic log is attached which indeed indicates a DMA problem. > > Note that we have added compatible = "ti,omap2-onenand"; > > Any suggestions, fixes? > > BR and thanks, > Nikolaus Schaller > > > > ## Booting kernel from Legacy Image at 8200 ... >Image Name: Linux-4.16.0-letux+ >Image Type: ARM Linux Kernel Image (uncompressed) >Data Size:4456744 Bytes = 4.3 MiB >Load Address: 80008000 >Entry Point: 80008000 >Verifying Checksum ... OK > ## Flattened Device Tree blob at 81c0 >Booting using the fdt blob at 0x81c0 >Loading Kernel Image ... OK >Using Device Tree in place at 81c0, end 81c14a1e > > Starting kernel ... > > [0.00] Booting Linux on physical CPU 0x0 > [0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version > 4.9.2 (GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018 > [0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d > [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing > instruction cache > [0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804 > [0.00] debug: ignoring loglevel setting. > [0.00] Memory policy: Data cache writeback > [0.00] cma: Reserved 16 MiB at 0xbe80 > [0.00] On node 0 totalpages: 261632 > [0.00] Normal zone: 1536 pages used for memmap > [0.00] Normal zone: 0 pages reserved > [0.00] Normal zone: 196608 pages, LIFO batch:31 > [0.00] HighMem zone: 65024 pages, LIFO batch:15 > [0.00] CPU: All CPU(s) started in SVC mode. > [0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk) > [0.00] random: fast init done > [0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 > u69632 > [0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096 > [0.00] pcpu-alloc: [0] 0 > [0.00] Built 1 zonelists, mobility grouping on. Total pages: 260096 > [0.00] Kernel command line: console=ttyO2,115200n8 > mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw > rootfstype=ext4,ext3 rootwait console=ttyO2,115200n8 vram=12M > omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait > twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk > [0.00] log_buf_len: 8388608 bytes > [0.00] early log buf free: 63924(97%) > [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 > bytes) > [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) > [0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K > rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K > cma-reserved, 243712K highmem) > [0.00] Virtual kernel memory layout: > [0.00] vector : 0x - 0x1000 ( 4 kB) > [0.00] fixmap : 0xffc0 - 0xfff0 (3072 kB) > [0.00] vmalloc : 0xf080 - 0xff80 ( 240 MB) > [0.00] lowmem : 0xc000 - 0xf000 ( 768 MB) > [0.00] pkmap : 0xbfe0 - 0xc000 ( 2 MB) > [0.00] modules : 0xbf00 - 0xbfe0 ( 14 MB) > [0.00] .text : 0x(ptrval)
[Bug]: mtd: onenand: omap2plus: kernel panic with OneNAND on OMAP3 (DM3730) device GTA04A5
Hi, we just started testing the v4.16 kernel and found the device no longer bootable (works with v4.15). It turned out that there was a harmful modification somewhere between v4.15.0 and v4.16-rc1. A git bisect points to this patch: commit bdaca9345d41fd9420995469d27603ea62054691 Author: Ladislav Michl Date: Fri Jan 12 14:16:57 2018 +0100 mtd: onenand: omap2: Decouple DMA enabling from INT pin availability INT pin (gpio_irq) is not really needed for DMA but only for notification when a command that needs wait has completed. DMA memcpy can be still used even without gpio_irq available, so enable it unconditionally. Signed-off-by: Ladislav Michl Reviewed-by: Peter Ujfalusi Tested-by: Tony Lindgren Tested-by: Aaro Koskinen Acked-by: Roger Quadros Signed-off-by: Boris Brezillon Kernel panic log is attached which indeed indicates a DMA problem. Note that we have added compatible = "ti,omap2-onenand"; Any suggestions, fixes? BR and thanks, Nikolaus Schaller ## Booting kernel from Legacy Image at 8200 ... Image Name: Linux-4.16.0-letux+ Image Type: ARM Linux Kernel Image (uncompressed) Data Size:4456744 Bytes = 4.3 MiB Load Address: 80008000 Entry Point: 80008000 Verifying Checksum ... OK ## Flattened Device Tree blob at 81c0 Booting using the fdt blob at 0x81c0 Loading Kernel Image ... OK Using Device Tree in place at 81c0, end 81c14a1e Starting kernel ... [0.00] Booting Linux on physical CPU 0x0 [0.00] Linux version 4.16.0-letux+ (hns@iMac.local) (gcc version 4.9.2 (GCC)) #2187 SMP PREEMPT Tue Apr 10 16:23:45 CEST 2018 [0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache [0.00] OF: fdt: Machine model: Goldelico GTA04A5/Letux 2804 [0.00] debug: ignoring loglevel setting. [0.00] Memory policy: Data cache writeback [0.00] cma: Reserved 16 MiB at 0xbe80 [0.00] On node 0 totalpages: 261632 [0.00] Normal zone: 1536 pages used for memmap [0.00] Normal zone: 0 pages reserved [0.00] Normal zone: 196608 pages, LIFO batch:31 [0.00] HighMem zone: 65024 pages, LIFO batch:15 [0.00] CPU: All CPU(s) started in SVC mode. [0.00] OMAP3630/DM3730 ES1.2 (l2cache iva sgx neon isp 192mhz_clk) [0.00] random: fast init done [0.00] percpu: Embedded 17 pages/cpu @(ptrval) s39744 r8192 d21696 u69632 [0.00] pcpu-alloc: s39744 r8192 d21696 u69632 alloc=17*4096 [0.00] pcpu-alloc: [0] 0 [0.00] Built 1 zonelists, mobility grouping on. Total pages: 260096 [0.00] Kernel command line: console=ttyO2,115200n8 mtdoops.mtddev=omap2.nand ubi.mtd=4 root=/dev/mmcblk0p1 rw rootfstype=ext4,ext3 rootwait console=ttyO2,115200n8 vram=12M omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk [0.00] log_buf_len: 8388608 bytes [0.00] early log buf free: 63924(97%) [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) [0.00] Memory: 1002524K/1046528K available (6169K kernel code, 626K rwdata, 1660K rodata, 1024K init, 218K bss, 27620K reserved, 16384K cma-reserved, 243712K highmem) [0.00] Virtual kernel memory layout: [0.00] vector : 0x - 0x1000 ( 4 kB) [0.00] fixmap : 0xffc0 - 0xfff0 (3072 kB) [0.00] vmalloc : 0xf080 - 0xff80 ( 240 MB) [0.00] lowmem : 0xc000 - 0xf000 ( 768 MB) [0.00] pkmap : 0xbfe0 - 0xc000 ( 2 MB) [0.00] modules : 0xbf00 - 0xbfe0 ( 14 MB) [0.00] .text : 0x(ptrval) - 0x(ptrval) (7162 kB) [0.00] .init : 0x(ptrval) - 0x(ptrval) (1024 kB) [0.00] .data : 0x(ptrval) - 0x(ptrval) ( 627 kB) [0.00].bss : 0x(ptrval) - 0x(ptrval) ( 219 kB) [0.00] Preemptible hierarchical RCU implementation. [0.00] RCU restricting CPUs from NR_CPUS=2 to nr_cpu_ids=1. [0.00] Tasks RCU enabled. [0.00] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1 [0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16 [0.00] IRQ: Found an INTC at 0x(ptrval) (revision 4.0) with 96 interrupts [0.00] Clocking rate (Crystal/Core/MPU): 26.0/400/600 MHz [0.00] OMAP clockevent source: timer1 at 32768 Hz [0.00] clocksource: 32k_counter: mask: 0x max_cycles: 0x, max_idle_ns: 58327039986419 ns [0.00] sched_clock: 32 bits at 32kHz, resolution 30517ns, wraps every 6553584741ns [0.00] OMAP clocksource: 32k_counter at 32768 Hz [0.001159] Console: colour dummy