Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Fri, Dec 02, 2016 at 11:26:02AM +0100, Lucas Stach wrote: > Am Donnerstag, den 01.12.2016, 15:11 +0100 schrieb Michal Hocko: > > Let's also CC Marek > > > > On Thu 01-12-16 08:43:40, Vlastimil Babka wrote: > > > On 12/01/2016 08:21 AM, Michal Hocko wrote: > > > > Forgot to CC Joonsoo. The email thread starts more or less here > > > > http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz > > > > > > > > On Thu 01-12-16 08:15:07, Michal Hocko wrote: > > > > > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: > > > > > [...] > > > > > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy > > > > > > > > > > Huh, do I get it right that the request was for a _single_ page? Why > > > > > do > > > > > we need CMA for that? > > > > > > Ugh, good point. I assumed that was just the PFNs that it failed to > > > migrate > > > away, but it seems that's indeed the whole requested range. Yeah sounds > > > some > > > part of the dma-cma chain could be smarter and attempt CMA only for e.g. > > > costly orders. > > > > Is there any reason why the DMA api doesn't try the page allocator first > > before falling back to the CMA? I simply have a hard time to see why the > > CMA should be used (and fragment) for small requests size. > > On x86 that is true, but on ARM CMA is the only (low memory) region that > can change the memory attributes, by being excluded from the lowmem > section mapping. Changing the memory attributes to > uncached/writecombined for DMA is crucial on ARM to fulfill the > requirement that no there aren't any conflicting mappings of the same > physical page. > > On ARM we can possibly do the optimization of asking the page allocator, > but only if we can request _only_ highmem pages. 
So this memory allocation strategy should only apply to ARM and not x86. We already had fallout a couple of years ago when Ubuntu decided to enable CMA on x86, where it does not make sense, as I don't think we have any single device we care about that is not behind an IOMMU and thus requires contiguous memory allocation. The DMA API should only use CMA on architectures where it is necessary, not on all of them. Cheers, Jérôme
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Am Donnerstag, den 01.12.2016, 15:11 +0100 schrieb Michal Hocko: > Let's also CC Marek > > On Thu 01-12-16 08:43:40, Vlastimil Babka wrote: > > On 12/01/2016 08:21 AM, Michal Hocko wrote: > > > Forgot to CC Joonsoo. The email thread starts more or less here > > > http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz > > > > > > On Thu 01-12-16 08:15:07, Michal Hocko wrote: > > > > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: > > > > [...] > > > > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy > > > > > > > > Huh, do I get it right that the request was for a _single_ page? Why do > > > > we need CMA for that? > > > > Ugh, good point. I assumed that was just the PFNs that it failed to migrate > > away, but it seems that's indeed the whole requested range. Yeah sounds some > > part of the dma-cma chain could be smarter and attempt CMA only for e.g. > > costly orders. > > Is there any reason why the DMA api doesn't try the page allocator first > before falling back to the CMA? I simply have a hard time to see why the > CMA should be used (and fragment) for small requests size. On x86 that is true, but on ARM CMA is the only (low memory) region that can change the memory attributes, by being excluded from the lowmem section mapping. Changing the memory attributes to uncached/writecombined for DMA is crucial on ARM to fulfill the requirement that there are no conflicting mappings of the same physical page. On ARM we can possibly do the optimization of asking the page allocator, but only if we can request _only_ highmem pages. Regards, Lucas
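Lucas's distinction above can be summarized as a small decision table. A toy userspace model of it (names invented for illustration, not kernel code): on x86 any page works for coherent DMA, so the page allocator is always a fine first choice; on ARM, lowmem pages sit in the kernel's section mapping with conflicting attributes, so only CMA pages (excluded from that mapping) or highmem pages (mapped on demand) are safe.

```c
#include <assert.h>
#include <stdbool.h>

enum dma_source { PAGE_ALLOCATOR, CMA_AREA };

/* Toy model, NOT kernel code: decide where a coherent DMA buffer
 * could come from, per the constraints Lucas describes. */
enum dma_source pick_dma_source(bool lowmem_has_section_mapping,
                                bool can_alloc_highmem_only)
{
    if (!lowmem_has_section_mapping)   /* e.g. x86: any page is fine */
        return PAGE_ALLOCATOR;
    if (can_alloc_highmem_only)        /* Lucas's proposed ARM optimization */
        return PAGE_ALLOCATOR;
    return CMA_AREA;                   /* ARM lowmem: must come from CMA */
}
```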
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On 12/01/2016 10:02 PM, Michal Nazarewicz wrote: On Thu, Dec 01 2016, Michal Hocko wrote: I am not familiar with this code so I cannot really argue but a quick look at rmem_cma_setup doesn't suggest any specific placing or anything... early_cma parses ‘cma’ command line argument which can specify where exactly the default CMA area is to be located. Furthermore, CMA areas can be assigned per-device (via the Device Tree IIRC). OK, but the context of this bug report is a generic cma pool and generic dma alloc, which tries cma first and then falls back to alloc_pages_node(). If a device really requires specific placing as you suggest, then it probably uses a different allocation interface, otherwise there would be some flag to disallow the alloc_pages_node() fallback?
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu, Dec 01 2016, Michal Hocko wrote: > I am not familiar with this code so I cannot really argue but a quick > look at rmem_cma_setup doesn't suggest any specific placing or > anything... early_cma parses the ‘cma’ command line argument which can specify where exactly the default CMA area is to be located. Furthermore, CMA areas can be assigned per-device (via the Device Tree IIRC). -- Best regards ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ «If at first you don’t succeed, give up skydiving»
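The two mechanisms Michal Nazarewicz mentions can be made concrete. The default area's size and placement come from the kernel command line, e.g. `cma=64M@0-4G` (syntax: `cma=nn[MG]@[start[MG][-end[MG]]]`), while a per-device area is declared through a `reserved-memory` Device Tree node. A sketch, with all addresses, sizes, and labels invented for illustration:

```dts
/ {
	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		/* Hypothetical fixed-placement CMA pool for one device.
		 * "reusable" is what makes it a CMA area rather than a
		 * plain carveout. */
		display_pool: display-pool@10000000 {
			compatible = "shared-dma-pool";
			reusable;
			reg = <0x10000000 0x4000000>; /* 64 MiB at a fixed address */
		};
	};

	display@20000000 {
		/* hypothetical device claiming the pool above */
		memory-region = <&display_pool>;
	};
};
```

A device bound to such a node gets its DMA allocations from that exact physical range, which is the kind of placement requirement that would justify going to CMA even for small requests.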
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu 01-12-16 17:03:52, Michal Nazarewicz wrote: > On Thu, Dec 01 2016, Michal Hocko wrote: > > Let's also CC Marek > > > > On Thu 01-12-16 08:43:40, Vlastimil Babka wrote: > >> On 12/01/2016 08:21 AM, Michal Hocko wrote: > >> > Forgot to CC Joonsoo. The email thread starts more or less here > >> > http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz > >> > > >> > On Thu 01-12-16 08:15:07, Michal Hocko wrote: > >> > > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: > >> > > [...] > >> > > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy > >> > > > >> > > Huh, do I get it right that the request was for a _single_ page? Why do > >> > > we need CMA for that? > >> > >> Ugh, good point. I assumed that was just the PFNs that it failed to migrate > >> away, but it seems that's indeed the whole requested range. Yeah sounds > >> some > >> part of the dma-cma chain could be smarter and attempt CMA only for e.g. > >> costly orders. > > > > Is there any reason why the DMA api doesn't try the page allocator first > > before falling back to the CMA? I simply have a hard time to see why the > > CMA should be used (and fragment) for small requests size. > > There actually may be reasons to always go with CMA even if small > regions are requested. CMA areas may be defined to map to particular > physical addresses and given device may require allocations from those > addresses. This may be more than just a matter of DMA address space. > I cannot give you specific examples though and I might be talking > nonsense. I am not familiar with this code so I cannot really argue but a quick look at rmem_cma_setup doesn't suggest any specific placing or anything... -- Michal Hocko SUSE Labs
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu, Dec 01 2016, Michal Hocko wrote: > Let's also CC Marek > > On Thu 01-12-16 08:43:40, Vlastimil Babka wrote: >> On 12/01/2016 08:21 AM, Michal Hocko wrote: >> > Forgot to CC Joonsoo. The email thread starts more or less here >> > http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz >> > >> > On Thu 01-12-16 08:15:07, Michal Hocko wrote: >> > > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: >> > > [...] >> > > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy >> > > >> > > Huh, do I get it right that the request was for a _single_ page? Why do >> > > we need CMA for that? >> >> Ugh, good point. I assumed that was just the PFNs that it failed to migrate >> away, but it seems that's indeed the whole requested range. Yeah sounds some >> part of the dma-cma chain could be smarter and attempt CMA only for e.g. >> costly orders. > > Is there any reason why the DMA api doesn't try the page allocator first > before falling back to the CMA? I simply have a hard time to see why the > CMA should be used (and fragment) for small requests size. There actually may be reasons to always go with CMA even if small regions are requested. CMA areas may be defined to map to particular physical addresses and given device may require allocations from those addresses. This may be more than just a matter of DMA address space. I cannot give you specific examples though and I might be talking nonsense. > -- > Michal Hocko > SUSE Labs -- Best regards ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ «If at first you don’t succeed, give up skydiving»
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Let's also CC Marek On Thu 01-12-16 08:43:40, Vlastimil Babka wrote: > On 12/01/2016 08:21 AM, Michal Hocko wrote: > > Forgot to CC Joonsoo. The email thread starts more or less here > > http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz > > > > On Thu 01-12-16 08:15:07, Michal Hocko wrote: > > > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: > > > [...] > > > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy > > > > > > Huh, do I get it right that the request was for a _single_ page? Why do > > > we need CMA for that? > > Ugh, good point. I assumed that was just the PFNs that it failed to migrate > away, but it seems that's indeed the whole requested range. Yeah sounds some > part of the dma-cma chain could be smarter and attempt CMA only for e.g. > costly orders. Is there any reason why the DMA api doesn't try the page allocator first before falling back to the CMA? I simply have a hard time to see why the CMA should be used (and fragment) for small requests size. -- Michal Hocko SUSE Labs
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu, Dec 01, 2016 at 08:38:15AM +0100, Vlastimil Babka wrote: > >> By default config this should not be used on x86. > > What do you mean by that statement? > > I mean that the 16 mbytes for generic CMA area is not a default on x86: > > config CMA_SIZE_MBYTES > int "Size in Mega Bytes" > depends on !CMA_SIZE_SEL_PERCENTAGE > default 0 if X86 > default 16 d7be003a9d275299f5ee36bbdf156654f59e08e9 (v3.18-2122-gd7be003a9d27) is where the 0MB if-x86 default was added to the tree. Prior to that, it was 16MiB, and that's where my system picked up the value from. I have a record of all my kconfigs, because I use oldconfig each time (going back 8 years to 2.6.27) # Added in 3.12.0-1-g5f258d0 CONFIG_CMA=y # Added in 3.16.0-rc6-00042-g67dd8f3 CONFIG_CMA_ALIGNMENT=8 CONFIG_CMA_AREAS=7 CONFIG_CMA_SIZE_MBYTES=16 CONFIG_CMA_SIZE_SEL_MBYTES=y CONFIG_DMA_CMA=y So the next question is why I picked up CMA in 3.16.0-rc6-00042-g67dd8f3... I'll poke at that. > > Yes, I'd say if there's a fallback without much penalty, nowarn makes > > sense. If the fallback just tries multiple addresses until success, then > > the warning should only be issued when too many attempts have been made. > On the other hand, if the warnings are correlated with high kernel CPU usage, > it's arguably better to be warned. Keep the rate-limit on the warning for cases like this?
> > There's high kernel CPU usage that seems to roughly correlate with the > > messages, but I can't yet tell if that's due to the syslog itself, or > > repeated alloc_contig_range requests. > You could try running perf top. Will do in the morning. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
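The rate-limit Robin mentions is essentially what the kernel's printk_ratelimited() machinery does: allow a burst of messages per time window and drop the rest. A toy userspace model of that windowed limiting (the clock is passed in explicitly so the logic is easy to test; this is an illustration, not the kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* Allow at most `burst` messages per `interval` seconds; drop the rest. */
struct ratelimit_state {
    int interval;  /* window length, seconds */
    int burst;     /* messages allowed per window */
    int begin;     /* start of the current window */
    int printed;   /* messages emitted in the current window */
};

bool ratelimit_allow(struct ratelimit_state *rs, int now)
{
    if (now - rs->begin >= rs->interval) {  /* window expired: reset it */
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;                        /* emit the message */
    }
    return false;                           /* suppress it */
}
```

With interval 5 and burst 3, a flood of "PFNs busy" lines would shrink to three per five seconds, which addresses the 2.5GB/hour syslog growth without hiding the condition entirely.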
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On 12/01/2016 08:21 AM, Michal Hocko wrote: Forgot to CC Joonsoo. The email thread starts more or less here http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz On Thu 01-12-16 08:15:07, Michal Hocko wrote: On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: [...] > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy Huh, do I get it right that the request was for a _single_ page? Why do we need CMA for that? Ugh, good point. I assumed that was just the PFNs that it failed to migrate away, but it seems that's indeed the whole requested range. Yeah sounds some part of the dma-cma chain could be smarter and attempt CMA only for e.g. costly orders.
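Vlastimil's "costly orders" idea can be sketched in plain C. PAGE_ALLOC_COSTLY_ORDER really is 3 in the kernel; the policy function itself is hypothetical, just illustrating the gate being proposed (a single page is order 0 and would never reach alloc_contig_range under it):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096UL
#define PAGE_ALLOC_COSTLY_ORDER 3  /* the kernel's "costly" threshold */

/* Allocation order: log2 of the page count, rounded up. */
unsigned int size_to_order(size_t bytes)
{
    size_t pages = (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/* Hypothetical dma-cma policy: leave small requests to the buddy
 * allocator and reserve CMA for orders the buddy allocator is likely
 * to fail on. */
int worth_trying_cma(size_t bytes)
{
    return size_to_order(bytes) > PAGE_ALLOC_COSTLY_ORDER;
}
```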
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On 12/01/2016 07:21 AM, Robin H. Johnson wrote: On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote: [add more CC's] On 11/30/2016 09:19 PM, Robin H. Johnson wrote: > Somewhere in the Radeon/DRM codebase, CMA page allocation has either > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is > doing something different with pages. Could be that it didn't use dma_generic_alloc_coherent() before, or you didn't have the generic CMA pool configured. v4.9-rc7-23-gded6e842cf49: [0.00] cma: Reserved 16 MiB at 0x00083e40 [0.00] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved) What's the output of "grep CMA" on your .config? # grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC CONFIG_CMA=y # CONFIG_CMA_DEBUG is not set # CONFIG_CMA_DEBUGFS is not set CONFIG_CMA_AREAS=7 CONFIG_DMA_CMA=y CONFIG_CMA_SIZE_MBYTES=16 CONFIG_CMA_SIZE_SEL_MBYTES=y # CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set # CONFIG_CMA_SIZE_SEL_MIN is not set # CONFIG_CMA_SIZE_SEL_MAX is not set CONFIG_CMA_ALIGNMENT=8 Or any kernel boot options with cma in name? None. By default config this should not be used on x86. What do you mean by that statement? I mean that the 16 mbytes for generic CMA area is not a default on x86: config CMA_SIZE_MBYTES int "Size in Mega Bytes" depends on !CMA_SIZE_SEL_PERCENTAGE default 0 if X86 default 16 Which explains why it's rare to see these reports in the context such as yours. I'd recommend just disabling it, as the primary use case for CMA are devices on mobile phones that don't have any other fallback (unlike the dma alloc). It should be disallowed to enable CONFIG_CMA? Radeon and CMA should be mutually exclusive? I don't think this is a specific problem of radeon. But looks like it's a heavy user of the dma alloc. There might be others. 
> Given that I haven't seen ANY other reports of this, I'm inclined to > believe the problem is drm/radeon specific (if I don't start X, I can't > reproduce the problem). It's rather CMA specific, the allocation attemps just can't be 100% reliable due to how CMA works. The question is if it should be spewing in the log in the context of dma-cma, which has a fallback allocation option. It even uses __GFP_NOWARN, perhaps the CMA path should respect that? Yes, I'd say if there's a fallback without much penalty, nowarn makes sense. If the fallback just tries multiple addresses until success, then the warning should only be issued when too many attempts have been made. On the other hand, if the warnings are correlated with high kernel CPU usage, it's arguably better to be warned. > The rate of the problem starts slow, and also is relatively low on an idle > system (my screens blank at night, no xscreensaver running), but it still ramps > up over time (to the point of generating 2.5GB/hour of "(timestamp) > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100 > unique ranges for a day). > > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9 > virtual desktops per monitor). So IIUC, except the messages, everything actually works fine? There's high kernel CPU usage that seems to roughly correlate with the messages, but I can't yet tell if that's due to the syslog itself, or repeated alloc_contig_range requests. You could try running perf top.
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Forgot to CC Joonsoo. The email thread starts more or less here http://lkml.kernel.org/r/20161130092239.gd18...@dhcp22.suse.cz On Thu 01-12-16 08:15:07, Michal Hocko wrote: > On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: > [...] > > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy > > Huh, do I get it right that the request was for a _single_ page? Why do > we need CMA for that? -- Michal Hocko SUSE Labs
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Wed 30-11-16 20:19:03, Robin H. Johnson wrote: [...] > alloc_contig_range: [83f2a3, 83f2a4) PFNs busy Huh, do I get it right that the request was for a _single_ page? Why do we need CMA for that? -- Michal Hocko SUSE Labs
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote: > [add more CC's] > > On 11/30/2016 09:19 PM, Robin H. Johnson wrote: > > Somewhere in the Radeon/DRM codebase, CMA page allocation has either > > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is > > doing something different with pages. > > Could be that it didn't use dma_generic_alloc_coherent() before, or you > didn't > have the generic CMA pool configured. v4.9-rc7-23-gded6e842cf49: [0.00] cma: Reserved 16 MiB at 0x00083e40 [0.00] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved) > What's the output of "grep CMA" on your > .config? # grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC CONFIG_CMA=y # CONFIG_CMA_DEBUG is not set # CONFIG_CMA_DEBUGFS is not set CONFIG_CMA_AREAS=7 CONFIG_DMA_CMA=y CONFIG_CMA_SIZE_MBYTES=16 CONFIG_CMA_SIZE_SEL_MBYTES=y # CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set # CONFIG_CMA_SIZE_SEL_MIN is not set # CONFIG_CMA_SIZE_SEL_MAX is not set CONFIG_CMA_ALIGNMENT=8 > Or any kernel boot options with cma in name? None. > By default config this should not be used on x86. What do you mean by that statement? It should be disallowed to enable CONFIG_CMA? Radeon and CMA should be mutually exclusive? > > Given that I haven't seen ANY other reports of this, I'm inclined to > > believe the problem is drm/radeon specific (if I don't start X, I can't > > reproduce the problem). > > It's rather CMA specific, the allocation attemps just can't be 100% reliable > due > to how CMA works. The question is if it should be spewing in the log in the > context of dma-cma, which has a fallback allocation option. It even uses > __GFP_NOWARN, perhaps the CMA path should respect that? Yes, I'd say if there's a fallback without much penalty, nowarn makes sense. 
If the fallback just tries multiple addresses until success, then the warning should only be issued when too many attempts have been made. > > > The rate of the problem starts slow, and also is relatively low on an idle > > system (my screens blank at night, no xscreensaver running), but it still > > ramps > > up over time (to the point of generating 2.5GB/hour of "(timestamp) > > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses > > (~100 > > unique ranges for a day). > > > > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ > > 9 > > virtual desktops per monitor). > So IIUC, except the messages, everything actually works fine? There's high kernel CPU usage that seems to roughly correlate with the messages, but I can't yet tell if that's due to the syslog itself, or repeated alloc_contig_range requests. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
[add more CC's] On 11/30/2016 09:19 PM, Robin H. Johnson wrote: Somewhere in the Radeon/DRM codebase, CMA page allocation has either regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is doing something different with pages. Could be that it didn't use dma_generic_alloc_coherent() before, or you didn't have the generic CMA pool configured. What's the output of "grep CMA" on your .config? Or any kernel boot options with cma in name? By default config this should not be used on x86. Given that I haven't seen ANY other reports of this, I'm inclined to believe the problem is drm/radeon specific (if I don't start X, I can't reproduce the problem). It's rather CMA specific, the allocation attempts just can't be 100% reliable due to how CMA works. The question is if it should be spewing in the log in the context of dma-cma, which has a fallback allocation option. It even uses __GFP_NOWARN, perhaps the CMA path should respect that? The rate of the problem starts slow, and also is relatively low on an idle system (my screens blank at night, no xscreensaver running), but it still ramps up over time (to the point of generating 2.5GB/hour of "(timestamp) alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100 unique ranges for a day). My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9 virtual desktops per monitor). So IIUC, except the messages, everything actually works fine? I added a stack trace & rate limit to alloc_contig_range's PFNs busy message (patch in previous email on LKML/-MM lists); and they point to radeon.
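The __GFP_NOWARN suggestion amounts to threading a "no warn" bit from the dma-cma caller down into the CMA allocator, so the "PFNs busy" message is suppressed when a fallback exists. A toy userspace model of that idea (names are invented; this is not the real kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int pfns_busy_warnings;  /* counts messages that reached the log */

/* Model: a CMA allocation that hits busy PFNs fails, and only logs the
 * failure when the caller did NOT ask for silence (__GFP_NOWARN-style). */
static bool model_cma_alloc(bool range_is_busy, bool no_warn)
{
    if (range_is_busy) {
        if (!no_warn) {
            printf("alloc_contig_range: [..., ...) PFNs busy\n");
            pfns_busy_warnings++;
        }
        return false;  /* caller falls back to the page allocator */
    }
    return true;
}
```

The caller that has a fallback passes no_warn=true and retries elsewhere quietly; a caller with no fallback keeps the diagnostic.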
alloc_contig_range: [83f2a3, 83f2a4) PFNs busy
CPU: 3 PID: 8518 Comm: X Not tainted 4.9.0-rc7-00024-g6ad4037e18ec #27
Hardware name: System manufacturer System Product Name/P8Z68 DELUXE, BIOS 0501 05/09/2011
 ad50c3d7f730 b236c873 0083f2a3 0083f2a4
 ad50c3d7f810 b2183b38 999dff4d8040 20fca8c0
 0083f400 0083f000 0083f2a3 0004
Call Trace:
 [] dump_stack+0x85/0xc2
 [] alloc_contig_range+0x368/0x370
 [] cma_alloc+0x127/0x2e0
 [] dma_alloc_from_contiguous+0x38/0x40
 [] dma_generic_alloc_coherent+0x91/0x1d0
 [] x86_swiotlb_alloc_coherent+0x25/0x50
 [] ttm_dma_populate+0x48a/0x9a0 [ttm]
 [] ? __kmalloc+0x1b6/0x250
 [] radeon_ttm_tt_populate+0x22a/0x2d0 [radeon]
 [] ? ttm_dma_tt_init+0x67/0xc0 [ttm]
 [] ttm_tt_bind+0x37/0x70 [ttm]
 [] ttm_bo_handle_move_mem+0x528/0x5a0 [ttm]
 [] ? shmem_alloc_inode+0x1a/0x30
 [] ttm_bo_validate+0x114/0x130 [ttm]
 [] ? _raw_write_unlock+0xe/0x10
 [] ttm_bo_init+0x31d/0x3f0 [ttm]
 [] radeon_bo_create+0x19b/0x260 [radeon]
 [] ? radeon_update_memory_usage.isra.0+0x50/0x50 [radeon]
 [] radeon_gem_object_create+0xad/0x180 [radeon]
 [] radeon_gem_create_ioctl+0x5f/0xf0 [radeon]
 [] drm_ioctl+0x21b/0x4d0 [drm]
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
 [] radeon_drm_ioctl+0x4c/0x80 [radeon]
 [] do_vfs_ioctl+0x92/0x5c0
 [] SyS_ioctl+0x79/0x90
 [] do_syscall_64+0x73/0x190
 [] entry_SYSCALL64_slow_path+0x25/0x25

The Radeon card in my case is a VisionTek HD 7750 Eyefinity 6, which is reported as:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] (prog-if 00 [VGA controller])
	Subsystem: VISIONTEK Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
	Flags: bus master, fast devsel, latency 0, IRQ 58
	Memory at c000 (64-bit, prefetchable) [size=256M]
	Memory at fbe0 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at 000c [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
	Capabilities: [150] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon, amdgpu
drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Somewhere in the Radeon/DRM codebase, CMA page allocation has either regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is doing something different with pages. Given that I haven't seen ANY other reports of this, I'm inclined to believe the problem is drm/radeon specific (if I don't start X, I can't reproduce the problem).

The rate of the problem starts slow, and also is relatively low on an idle system (my screens blank at night, no xscreensaver running), but it still ramps up over time (to the point of generating 2.5GB/hour of "(timestamp) alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100 unique ranges for a day).

My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9 virtual desktops per monitor).

I added a stack trace & rate limit to alloc_contig_range's PFNs busy message (patch in previous email on LKML/-MM lists); and they point to radeon.

alloc_contig_range: [83f2a3, 83f2a4) PFNs busy
CPU: 3 PID: 8518 Comm: X Not tainted 4.9.0-rc7-00024-g6ad4037e18ec #27
Hardware name: System manufacturer System Product Name/P8Z68 DELUXE, BIOS 0501 05/09/2011
 ad50c3d7f730 b236c873 0083f2a3 0083f2a4
 ad50c3d7f810 b2183b38 999dff4d8040 20fca8c0
 0083f400 0083f000 0083f2a3 0004
Call Trace:
 [] dump_stack+0x85/0xc2
 [] alloc_contig_range+0x368/0x370
 [] cma_alloc+0x127/0x2e0
 [] dma_alloc_from_contiguous+0x38/0x40
 [] dma_generic_alloc_coherent+0x91/0x1d0
 [] x86_swiotlb_alloc_coherent+0x25/0x50
 [] ttm_dma_populate+0x48a/0x9a0 [ttm]
 [] ? __kmalloc+0x1b6/0x250
 [] radeon_ttm_tt_populate+0x22a/0x2d0 [radeon]
 [] ? ttm_dma_tt_init+0x67/0xc0 [ttm]
 [] ttm_tt_bind+0x37/0x70 [ttm]
 [] ttm_bo_handle_move_mem+0x528/0x5a0 [ttm]
 [] ? shmem_alloc_inode+0x1a/0x30
 [] ttm_bo_validate+0x114/0x130 [ttm]
 [] ? _raw_write_unlock+0xe/0x10
 [] ttm_bo_init+0x31d/0x3f0 [ttm]
 [] radeon_bo_create+0x19b/0x260 [radeon]
 [] ? radeon_update_memory_usage.isra.0+0x50/0x50 [radeon]
 [] radeon_gem_object_create+0xad/0x180 [radeon]
 [] radeon_gem_create_ioctl+0x5f/0xf0 [radeon]
 [] drm_ioctl+0x21b/0x4d0 [drm]
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
 [] radeon_drm_ioctl+0x4c/0x80 [radeon]
 [] do_vfs_ioctl+0x92/0x5c0
 [] SyS_ioctl+0x79/0x90
 [] do_syscall_64+0x73/0x190
 [] entry_SYSCALL64_slow_path+0x25/0x25

The Radeon card in my case is a VisionTek HD 7750 Eyefinity 6, which is reported as:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] (prog-if 00 [VGA controller])
	Subsystem: VISIONTEK Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
	Flags: bus master, fast devsel, latency 0, IRQ 58
	Memory at c000 (64-bit, prefetchable) [size=256M]
	Memory at fbe0 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at 000c [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
	Capabilities: [150] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon, amdgpu

-- Robin Hugh Johnson E-Mail : robb...@orbis-terrarum.net Home Page : http://www.orbis-terrarum.net/?l=people.robbat2 ICQ# : 30269588 or 41961639 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85