Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Thu, 12 Apr 2018 12:32:54 +0200, Robin Murphy wrote: > > On 12/04/18 09:27, Takashi Iwai wrote: > > On Thu, 12 Apr 2018 10:19:05 +0200, > > Takashi Iwai wrote: > >> > >> On Thu, 12 Apr 2018 10:03:56 +0200, > >> Takashi Iwai wrote: > >>> > >>> On Thu, 12 Apr 2018 08:02:27 +0200, > >>> Christoph Hellwig wrote: > > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: > >> But we should try a GFP_DMA32 allocation first, so this is a bit > >> surprising. > > > > Hm, do we really try that? > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, > > it's 36bit, so GFP_DMA isn't set. > > Oh, yes - it is using an odd dma mask, and amdgpu seems to use an > just as odd 40-bit dma mask. > > > We had a fallback allocation with GFP_DMA32 in the past, but this > > seems gone long time ago along with cleanups (commit c647c3bb2d16). > > > > But I haven't followed about this topic for long time, so I might have > > missed obviously... > > I think a fallback would be much better here rather than relying on the > limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also > used for x86) already has a GFP_DMA fallback, so extending this for > GFP_DMA32 as well would seem reasonable. > > Any volunteers? > >>> > >>> Below is a quick attempt, totally untested. Actually the retry with > >>> GFP_DMA is superfluous for archs without it, so the first patch > >>> corrects it. > >> > >> Gah, scratch this, it doesn't work. A different check is needed... > > > > The v2 patches are below, replaced with IS_ENABLED(CONFIG_ZONE_DMA). > > That looks pretty reasonable to me - I'd be tempted to try a factoring > out a helper function to avoid the "goto again" logic, but that's > hardly crucial. Right, this change won't make the code flow more complex than before, so I guess it's OK as a first step. I'll submit this second patch as RFC while the first one as a safe fix (in addition to another regression fix I stumbled on). > What I'd really love to see is something like the alloc_pages_mask() > proposal from years ago to come back such that GFP_DMA32 could die > entirely, but I don't know anywhere near enough about the mm layer to > consider taking that on myself :( Yeah, it'd be lovely, but I'm afraid it's hard to fly high... thanks, Takashi
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On 12/04/18 09:27, Takashi Iwai wrote: On Thu, 12 Apr 2018 10:19:05 +0200, Takashi Iwai wrote: On Thu, 12 Apr 2018 10:03:56 +0200, Takashi Iwai wrote: On Thu, 12 Apr 2018 08:02:27 +0200, Christoph Hellwig wrote: On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: But we should try a GFP_DMA32 allocation first, so this is a bit surprising. Hm, do we really try that? Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, it's 36bit, so GFP_DMA isn't set. Oh, yes - it is using an odd dma mask, and amdgpu seems to use an just as odd 40-bit dma mask. We had a fallback allocation with GFP_DMA32 in the past, but this seems gone long time ago along with cleanups (commit c647c3bb2d16). But I haven't followed about this topic for long time, so I might have missed obviously... I think a fallback would be much better here rather than relying on the limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also used for x86) already has a GFP_DMA fallback, so extending this for GFP_DMA32 as well would seem reasonable. Any volunteers? Below is a quick attempt, totally untested. Actually the retry with GFP_DMA is superfluous for archs without it, so the first patch corrects it. Gah, scratch this, it doesn't work. A different check is needed... The v2 patches are below, replaced with IS_ENABLED(CONFIG_ZONE_DMA). That looks pretty reasonable to me - I'd be tempted to try a factoring out a helper function to avoid the "goto again" logic, but that's hardly crucial. What I'd really love to see is something like the alloc_pages_mask() proposal from years ago to come back such that GFP_DMA32 could die entirely, but I don't know anywhere near enough about the mm layer to consider taking that on myself :( Robin.
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Thu, 12 Apr 2018 10:19:05 +0200, Takashi Iwai wrote: > > On Thu, 12 Apr 2018 10:03:56 +0200, > Takashi Iwai wrote: > > > > On Thu, 12 Apr 2018 08:02:27 +0200, > > Christoph Hellwig wrote: > > > > > > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: > > > > > But we should try a GFP_DMA32 allocation first, so this is a bit > > > > > surprising. > > > > > > > > Hm, do we really try that? > > > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 > > > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, > > > > it's 36bit, so GFP_DMA isn't set. > > > > > > Oh, yes - it is using an odd dma mask, and amdgpu seems to use an > > > just as odd 40-bit dma mask. > > > > > > > We had a fallback allocation with GFP_DMA32 in the past, but this > > > > seems gone long time ago along with cleanups (commit c647c3bb2d16). > > > > > > > > But I haven't followed about this topic for long time, so I might have > > > > missed obviously... > > > > > > I think a fallback would be much better here rather than relying on the > > > limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also > > > used for x86) already has a GFP_DMA fallback, so extending this for > > > GFP_DMA32 as well would seem reasonable. > > > > > > Any volunteers? > > > > Below is a quick attempt, totally untested. Actually the retry with > > GFP_DMA is superfluous for archs without it, so the first patch > > corrects it. > > Gah, scratch this, it doesn't work. A different check is needed... The v2 patches are below, replaced with IS_ENABLED(CONFIG_ZONE_DMA). Takashi 0001-dma-direct-Don-t-repeat-allocation-for-no-op-GFP_DMA.patch Description: Binary data 0002-dma-direct-Try-reallocation-with-GFP_DMA32-if-possib.patch Description: Binary data
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Thu, 12 Apr 2018 10:03:56 +0200, Takashi Iwai wrote: > > On Thu, 12 Apr 2018 08:02:27 +0200, > Christoph Hellwig wrote: > > > > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: > > > > But we should try a GFP_DMA32 allocation first, so this is a bit > > > > surprising. > > > > > > Hm, do we really try that? > > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 > > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, > > > it's 36bit, so GFP_DMA isn't set. > > > > Oh, yes - it is using an odd dma mask, and amdgpu seems to use an > > just as odd 40-bit dma mask. > > > > > We had a fallback allocation with GFP_DMA32 in the past, but this > > > seems gone long time ago along with cleanups (commit c647c3bb2d16). > > > > > > But I haven't followed about this topic for long time, so I might have > > > missed obviously... > > > > I think a fallback would be much better here rather than relying on the > > limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also > > used for x86) already has a GFP_DMA fallback, so extending this for > > GFP_DMA32 as well would seem reasonable. > > > > Any volunteers? > > Below is a quick attempt, totally untested. Actually the retry with > GFP_DMA is superfluous for archs without it, so the first patch > corrects it. Gah, scratch this, it doesn't work. A different check is needed... Takashi
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Thu, 12 Apr 2018 08:02:27 +0200, Christoph Hellwig wrote: > > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: > > > But we should try a GFP_DMA32 allocation first, so this is a bit > > > surprising. > > > > Hm, do we really try that? > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, > > it's 36bit, so GFP_DMA isn't set. > > Oh, yes - it is using an odd dma mask, and amdgpu seems to use an > just as odd 40-bit dma mask. > > > We had a fallback allocation with GFP_DMA32 in the past, but this > > seems gone long time ago along with cleanups (commit c647c3bb2d16). > > > > But I haven't followed about this topic for long time, so I might have > > missed obviously... > > I think a fallback would be much better here rather than relying on the > limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also > used for x86) already has a GFP_DMA fallback, so extending this for > GFP_DMA32 as well would seem reasonable. > > Any volunteers? Below is a quick attempt, totally untested. Actually the retry with GFP_DMA is superfluous for archs without it, so the first patch corrects it. The second patch adds the retry with GFP_DMA32. I'll resubmit properly if these are OK (and better if anyone could test them :) thanks, Takashi 0001-dma-direct-Don-t-repeat-allocation-for-no-op-GFP_DMA.patch Description: Binary data 0002-dma-direct-Try-reallocation-with-GFP_DMA32-if-possib.patch Description: Binary data
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote: > > But we should try a GFP_DMA32 allocation first, so this is a bit > > surprising. > > Hm, do we really try that? > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, > it's 36bit, so GFP_DMA isn't set. Oh, yes - it is using an odd dma mask, and amdgpu seems to use an just as odd 40-bit dma mask. > We had a fallback allocation with GFP_DMA32 in the past, but this > seems gone long time ago along with cleanups (commit c647c3bb2d16). > > But I haven't followed about this topic for long time, so I might have > missed obviously... I think a fallback would be much better here rather than relying on the limited swiotlb buffer bool. dma_direct_alloc (which in 4.17 is also used for x86) already has a GFP_DMA fallback, so extending this for GFP_DMA32 as well would seem reasonable. Any volunteers?
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Tue, 10 Apr 2018 20:10:20 +0200, Christoph Hellwig wrote: > > On Tue, Apr 10, 2018 at 06:50:04PM +0100, Robin Murphy wrote: > > In the first one, the machine appears to have enough RAM that most of it is > > beyond the device's 36-bit DMA mask, thus it's fairly likely for the > > initial direct alloc to come back with something unsuitable. > > But we should try a GFP_DMA32 allocation first, so this is a bit > surprising. Hm, do we really try that? Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32 only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi, it's 36bit, so GFP_DMA isn't set. We had a fallback allocation with GFP_DMA32 in the past, but this seems gone long time ago along with cleanups (commit c647c3bb2d16). But I haven't followed about this topic for long time, so I might have missed obviously... thanks, Takashi
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Tue, Apr 10, 2018 at 06:50:04PM +0100, Robin Murphy wrote: > In the first one, the machine appears to have enough RAM that most of it is > beyond the device's 36-bit DMA mask, thus it's fairly likely for the > initial direct alloc to come back with something unsuitable. But we should try a GFP_DMA32 allocation first, so this is a bit surprising.
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On 10/04/18 18:07, Takashi Iwai wrote: On Tue, 10 Apr 2018 19:06:15 +0200, Christoph Hellwig wrote: On Tue, Apr 10, 2018 at 07:05:13PM +0200, Takashi Iwai wrote: The code refactoring by commit 0176adb00406 ("swiotlb: refactor coherent buffer allocation") made swiotlb_alloc_buffer() almost always failing due to a thinko: namely, the function evaluates the dma_coherent_ok() call incorrectly and dealing as if it's invalid. This ends up with weird errors like iwlwifi probe failure or amdgpu screen flickering. This patch corrects the logic error. This looks ok, although even hitting this code is a bad sign. Do you know why the previous dma_direct_alloc didn't succeed? That I have no idea, sorry. I just figured out the regression just from the dmesg output of 4.16 kernel in bugzilla reports. Could you take a look at the bugzilla reports there? In the first one, the machine appears to have enough RAM that most of it is beyond the device's 36-bit DMA mask, thus it's fairly likely for the initial direct alloc to come back with something unsuitable. Robin. thanks, Takashi Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088658 Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088902 Fixes: 0176adb00406 ("swiotlb: refactor coherent buffer allocation") Cc: # v4.16+ Signed-off-by: Takashi Iwai --- lib/swiotlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 47aeb04c1997..de7cc540450f 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -719,7 +719,7 @@ swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle, goto out_warn; *dma_handle = __phys_to_dma(dev, phys_addr); - if (dma_coherent_ok(dev, *dma_handle, size)) + if (!dma_coherent_ok(dev, *dma_handle, size)) goto out_unmap; memset(phys_to_virt(phys_addr), 0, size); -- 2.16.3 ---end quoted text--- ___ iommu mailing list io...@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Tue, 10 Apr 2018 19:06:15 +0200, Christoph Hellwig wrote: > > On Tue, Apr 10, 2018 at 07:05:13PM +0200, Takashi Iwai wrote: > > The code refactoring by commit 0176adb00406 ("swiotlb: refactor > > coherent buffer allocation") made swiotlb_alloc_buffer() almost always > > failing due to a thinko: namely, the function evaluates the > > dma_coherent_ok() call incorrectly and dealing as if it's invalid. > > This ends up with weird errors like iwlwifi probe failure or amdgpu > > screen flickering. > > > > This patch corrects the logic error. > > This looks ok, although even hitting this code is a bad sign. Do you > know why the previous dma_direct_alloc didn't succeed? That I have no idea, sorry. I just figured out the regression just from the dmesg output of 4.16 kernel in bugzilla reports. Could you take a look at the bugzilla reports there? thanks, Takashi > > Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088658 > > Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088902 > > Fixes: 0176adb00406 ("swiotlb: refactor coherent buffer allocation") > > Cc: # v4.16+ > > Signed-off-by: Takashi Iwai > > --- > > lib/swiotlb.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > index 47aeb04c1997..de7cc540450f 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -719,7 +719,7 @@ swiotlb_alloc_buffer(struct device *dev, size_t size, > > dma_addr_t *dma_handle, > > goto out_warn; > > > > *dma_handle = __phys_to_dma(dev, phys_addr); > > - if (dma_coherent_ok(dev, *dma_handle, size)) > > + if (!dma_coherent_ok(dev, *dma_handle, size)) > > goto out_unmap; > > > > memset(phys_to_virt(phys_addr), 0, size); > > -- > > 2.16.3 > ---end quoted text--- >
Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures
On Tue, Apr 10, 2018 at 07:05:13PM +0200, Takashi Iwai wrote: > The code refactoring by commit 0176adb00406 ("swiotlb: refactor > coherent buffer allocation") made swiotlb_alloc_buffer() almost always > failing due to a thinko: namely, the function evaluates the > dma_coherent_ok() call incorrectly and dealing as if it's invalid. > This ends up with weird errors like iwlwifi probe failure or amdgpu > screen flickering. > > This patch corrects the logic error. This looks ok, although even hitting this code is a bad sign. Do you know why the previous dma_direct_alloc didn't succeed? > > Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088658 > Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1088902 > Fixes: 0176adb00406 ("swiotlb: refactor coherent buffer allocation") > Cc: # v4.16+ > Signed-off-by: Takashi Iwai > --- > lib/swiotlb.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index 47aeb04c1997..de7cc540450f 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -719,7 +719,7 @@ swiotlb_alloc_buffer(struct device *dev, size_t size, > dma_addr_t *dma_handle, > goto out_warn; > > *dma_handle = __phys_to_dma(dev, phys_addr); > - if (dma_coherent_ok(dev, *dma_handle, size)) > + if (!dma_coherent_ok(dev, *dma_handle, size)) > goto out_unmap; > > memset(phys_to_virt(phys_addr), 0, size); > -- > 2.16.3 ---end quoted text---