Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-28 Thread Dave Hansen
On 04/26/2014 06:10 AM, Chris Wilson wrote:
>>> Thanks for the pointer to
>>> register_oom_notifier(), I can use that to make sure that we do purge
>>> everything from the GPU, and do a sanity check at the same time, before
>>> we start killing processes.
>> 
>> Actually, that one doesn't get called until we're *SURE* we are going to
>> OOM.  Any action taken in there won't be taken into account.
> 
> blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> if (freed > 0)
> 	/* Got some memory back in the last second. */
> 	return;
> 
> That looks like it should abort the oom and so repeat the allocation
> attempt? Or is that too hopeful?

You're correct.  I was reading the code utterly wrong.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-26 Thread Chris Wilson
On Fri, Apr 25, 2014 at 10:18:57AM -0700, Dave Hansen wrote:
> On 04/25/2014 12:23 AM, Chris Wilson wrote:
> > On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
> >> On 04/24/2014 08:39 AM, Chris Wilson wrote:
> >>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> >>>> Is it possible that there's still a get_page() reference that's holding
> >>>> those pages in place from the graphics code?
> >>>
> >>> Not from i915.ko. The last resort of our shrinker is to drop all page
> >>> refs held by the GPU, which is invoked if we are asked to free memory
> >>> and we have no inactive objects left.
> >>
> >> How sure are we that this was performed before the OOM?
> > 
> > Only by virtue of how shrink_slabs() works.
> 
> Could we try to raise the level of assurance there, please? :)
> 
> So this "last resort" is i915_gem_shrink_all()?  It seems like we might
> have some problems getting down to that part of the code if we have
> problems getting the mutex.

In general, but not in this example where the load is tightly controlled.
 
> We have tracepoints for the shrinkers in here (it says slab, but it's
> all the shrinkers, I checked):
> 
> /sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable
> and another for OOMs:
> /sys/kernel/debug/tracing/events/oom/enable
> 
> Could you collect a trace during one of these OOM events and see what
> the i915 shrinker is doing?  Just enable those two and then collect a
> copy of:
> 
>   /sys/kernel/debug/tracing/trace
> 
> That'll give us some insight about how well the shrinker is working.  If
> the VM gave up on calling into it, it might reveal why we didn't get
> all the way down into i915_gem_shrink_all().

I'll add it to the list for QA to try.
 
> > Thanks for the pointer to
> > register_oom_notifier(), I can use that to make sure that we do purge
> > everything from the GPU, and do a sanity check at the same time, before
> > we start killing processes.
> 
> Actually, that one doesn't get called until we're *SURE* we are going to
> OOM.  Any action taken in there won't be taken into account.

blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
if (freed > 0)
	/* Got some memory back in the last second. */
	return;

That looks like it should abort the oom and so repeat the allocation
attempt? Or is that too hopeful?

> >> Also, forgive me for being an idiot wrt the way graphics work, but are
> >> there any good candidates that you can think of that could be holding a
> >> reference?  I've honestly never seen an OOM like this.
> > 
> > Here the only place that we take a page reference is in
> > i915_gem_object_get_pages(). We do this when we first bind the pages
> > into the GPU's translation table, but we only release the pages once the
> > object is destroyed or the system experiences memory pressure. (Once the
> > GPU touches the pages, we no longer consider them to be cache coherent
> > with the CPU and so migrating them between the GPU and CPU requires
> > clflushing, which is expensive.)
> > 
> > Aside from CPU mmaps of the shmemfs filp, all operations on our
> > graphical objects should lead to i915_gem_object_get_pages(). However
> > not all objects are recoverable as some may be pinned due to hardware
> > access.
> 
> In that oom callback, could you dump out the aggregate number of
> obj->pages_pin_count across all the objects?  That would be a very
> interesting piece of information to have.  It would also be very
> insightful for folks who see OOMs in practice with i915 in their systems.

Indeed.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-25 Thread Dave Hansen
Poking around with those tracepoints, I don't see the i915 shrinker
getting run, only i915_gem_inactive_count() being called.  It must be
returning 0 because we're never even _getting_ to the tracepoints
themselves after calling i915_gem_inactive_count().

This is on my laptop, and I haven't been able to coax i915 in to
reclaiming a single page in 10 or 15 minutes.  That seems fishy to me.
Surely *SOMETHING* has become reclaimable in that time.

Here's /sys/kernel/debug/dri/0/i915_gem_objects:

> 919 objects, 354914304 bytes
> 874 [333] objects, 291004416 [93614080] bytes in gtt
>   0 [0] active objects, 0 [0] bytes
>   874 [333] inactive objects, 291004416 [93614080] bytes
> 0 unbound objects, 0 bytes
> 199 purgeable objects, 92844032 bytes
> 30 pinned mappable objects, 18989056 bytes
> 139 fault mappable objects, 17371136 bytes
> 2145386496 [268435456] gtt total
> 
> Xorg: 632 objects, 235450368 bytes (0 active, 180899840 inactive, 21262336 unbound)
> gnome-control-c: 11 objects, 110592 bytes (0 active, 0 inactive, 49152 unbound)
> chromium-browse: 266 objects, 101367808 bytes (0 active, 101330944 inactive, 0 unbound)
> Xorg: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-25 Thread Dave Hansen
On 04/25/2014 12:23 AM, Chris Wilson wrote:
> On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
>> On 04/24/2014 08:39 AM, Chris Wilson wrote:
>>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
>>>> Is it possible that there's still a get_page() reference that's holding
>>>> those pages in place from the graphics code?
>>>
>>> Not from i915.ko. The last resort of our shrinker is to drop all page
>>> refs held by the GPU, which is invoked if we are asked to free memory
>>> and we have no inactive objects left.
>>
>> How sure are we that this was performed before the OOM?
> 
> Only by virtue of how shrink_slabs() works.

Could we try to raise the level of assurance there, please? :)

So this "last resort" is i915_gem_shrink_all()?  It seems like we might
have some problems getting down to that part of the code if we have
problems getting the mutex.

We have tracepoints for the shrinkers in here (it says slab, but it's
all the shrinkers, I checked):

/sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable
and another for OOMs:
/sys/kernel/debug/tracing/events/oom/enable

Could you collect a trace during one of these OOM events and see what
the i915 shrinker is doing?  Just enable those two and then collect a
copy of:

/sys/kernel/debug/tracing/trace

That'll give us some insight about how well the shrinker is working.  If
the VM gave up on calling into it, it might reveal why we didn't get
all the way down into i915_gem_shrink_all().
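For reference, the enable-and-collect sequence described above amounts to something like the following (a hypothetical session, not part of the original mail; it requires root and debugfs mounted at /sys/kernel/debug):

```shell
# Enable the shrinker and OOM tracepoints named above, reproduce the
# OOM, then save the trace buffer for analysis.
cd /sys/kernel/debug/tracing
for e in events/vmscan/mm_shrink_slab_*/enable events/oom/enable; do
	echo 1 > "$e"
done
# ... reproduce the OOM here (e.g. run the failing test) ...
cp trace /tmp/oom-trace.txt
```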

> Thanks for the pointer to
> register_oom_notifier(), I can use that to make sure that we do purge
> everything from the GPU, and do a sanity check at the same time, before
> we start killing processes.

Actually, that one doesn't get called until we're *SURE* we are going to
OOM.  Any action taken in there won't be taken into account.

>> Also, forgive me for being an idiot wrt the way graphics work, but are
>> there any good candidates that you can think of that could be holding a
>> reference?  I've honestly never seen an OOM like this.
> 
> Here the only place that we take a page reference is in
> i915_gem_object_get_pages(). We do this when we first bind the pages
> into the GPU's translation table, but we only release the pages once the
> object is destroyed or the system experiences memory pressure. (Once the
> GPU touches the pages, we no longer consider them to be cache coherent
> with the CPU and so migrating them between the GPU and CPU requires
> clflushing, which is expensive.)
> 
> Aside from CPU mmaps of the shmemfs filp, all operations on our
> graphical objects should lead to i915_gem_object_get_pages(). However
> not all objects are recoverable as some may be pinned due to hardware
> access.

In that oom callback, could you dump out the aggregate number of
obj->pages_pin_count across all the objects?  That would be a very
interesting piece of information to have.  It would also be very
insightful for folks who see OOMs in practice with i915 in their systems.
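The aggregate Dave asks for could be computed along these lines. This is a hedged userspace sketch, not driver code: `struct gem_object`, its list linkage, and `count_pins()` are stand-ins invented here; only the field name `pages_pin_count` comes from the thread.

```c
#include <stddef.h>

/* Stand-in for struct drm_i915_gem_object: only the fields the
 * aggregation needs. */
struct gem_object {
	unsigned long size_bytes;
	unsigned int pages_pin_count;	/* >0 means unreclaimable right now */
	struct gem_object *next;
};

struct pin_stats {
	unsigned long pinned_objects;
	unsigned long pinned_bytes;
	unsigned long total_pin_refs;
};

/* Walk an object list and aggregate pin counts, as an OOM callback
 * might do before the kill decision is made. */
static struct pin_stats count_pins(const struct gem_object *head)
{
	struct pin_stats s = {0, 0, 0};
	const struct gem_object *obj;

	for (obj = head; obj; obj = obj->next) {
		s.total_pin_refs += obj->pages_pin_count;
		if (obj->pages_pin_count) {
			s.pinned_objects++;
			s.pinned_bytes += obj->size_bytes;
		}
	}
	return s;
}
```

In the driver the walk would be over the real object lists under the struct_mutex; the point is just that the sum is cheap to collect at OOM time.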




Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-25 Thread Chris Wilson
On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
> On 04/24/2014 08:39 AM, Chris Wilson wrote:
> > On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> >> Is it possible that there's still a get_page() reference that's holding
> >> those pages in place from the graphics code?
> > 
> > Not from i915.ko. The last resort of our shrinker is to drop all page
> > refs held by the GPU, which is invoked if we are asked to free memory
> > and we have no inactive objects left.
> 
> How sure are we that this was performed before the OOM?

Only by virtue of how shrink_slabs() works. Thanks for the pointer to
register_oom_notifier(), I can use that to make sure that we do purge
everything from the GPU, and do a sanity check at the same time, before
we start killing processes.
 
> Also, forgive me for being an idiot wrt the way graphics work, but are
> there any good candidates that you can think of that could be holding a
> reference?  I've honestly never seen an OOM like this.

Here the only place that we take a page reference is in
i915_gem_object_get_pages(). We do this when we first bind the pages
into the GPU's translation table, but we only release the pages once the
object is destroyed or the system experiences memory pressure. (Once the
GPU touches the pages, we no longer consider them to be cache coherent
with the CPU and so migrating them between the GPU and CPU requires
clflushing, which is expensive.)

Aside from CPU mmaps of the shmemfs filp, all operations on our
graphical objects should lead to i915_gem_object_get_pages(). However
not all objects are recoverable as some may be pinned due to hardware
access.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-24 Thread Dave Hansen
On 04/24/2014 08:39 AM, Chris Wilson wrote:
> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
>> Is it possible that there's still a get_page() reference that's holding
>> those pages in place from the graphics code?
> 
> Not from i915.ko. The last resort of our shrinker is to drop all page
> refs held by the GPU, which is invoked if we are asked to free memory
> and we have no inactive objects left.

How sure are we that this was performed before the OOM?

Also, forgive me for being an idiot wrt the way graphics work, but are
there any good candidates that you can think of that could be holding a
reference?  I've honestly never seen an OOM like this.

Somewhat rhetorical question for the mm folks on cc: should we be
sticking the pages on which you're holding a reference onto our
unreclaimable list?

> If we could get a callback for the oom report, I could dump some details
> about what the GPU is holding onto. That seems like a useful extension to
> add to the shrinkers.

There's a register_oom_notifier().  Is that sufficient for your use, or
is there something additional that would help?
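The contract under discussion, where a notifier callback frees what it can and the OOM killer backs off if anything came back, can be sketched in userspace C. All names below are illustrative stand-ins; the real kernel API is `register_oom_notifier()` feeding `blocking_notifier_call_chain(&oom_notify_list, ...)` at the top of the OOM path.

```c
/* Minimal stand-in for the kernel's OOM notifier chain: each callback
 * may release memory and reports how many pages it freed. */
typedef void (*oom_notifier_fn)(unsigned long *freed);

#define MAX_NOTIFIERS 8
static oom_notifier_fn oom_notify_list_sim[MAX_NOTIFIERS];
static int oom_notifier_count;

static void register_oom_notifier_sim(oom_notifier_fn fn)
{
	if (oom_notifier_count < MAX_NOTIFIERS)
		oom_notify_list_sim[oom_notifier_count++] = fn;
}

/* Mimics the OOM path: run the chain first; if any callback freed
 * pages, abort the kill so the allocator can retry.  Returns 1 if a
 * process would have been killed. */
static int out_of_memory_sim(void)
{
	unsigned long freed = 0;
	int i;

	for (i = 0; i < oom_notifier_count; i++)
		oom_notify_list_sim[i](&freed);
	if (freed > 0)
		return 0;	/* got some memory back; retry allocation */
	return 1;		/* proceed to kill a process */
}

/* A toy "GPU" cache the notifier purges, like i915 dropping the page
 * references held by the GPU. */
static unsigned long gpu_bound_pages = 4096;

static void i915_oom_notify_sim(unsigned long *freed)
{
	*freed += gpu_bound_pages;
	gpu_bound_pages = 0;
}
```

With the notifier registered, the first simulated OOM is aborted because the purge freed pages; a second one, with nothing left to purge, proceeds to the kill.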


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-24 Thread Chris Wilson
On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> On 04/23/2014 10:58 PM, Chris Wilson wrote:
> > [ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB 
> > active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB 
> > unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB 
> > managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB 
> > slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB 
> > pagetables:48kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
> > pages_scanned:76 all_unreclaimable? yes
> > [ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
> > [ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB 
> > active_anon:36172kB inactive_anon:3351408kB active_file:92kB 
> > inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
> > present:3518336kB managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB 
> > mapped:12kB shmem:1661420kB slab_reclaimable:17624kB 
> > slab_unreclaimable:14400kB kernel_stack:696kB pagetables:4324kB 
> > unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:327 
> > all_unreclaimable? yes
> > [ 4756.751341] lowmem_reserve[]: 0 0 322 322
> > [ 4756.752889] Node 0 Normal free:328kB min:680kB low:848kB high:1020kB 
> > active_anon:61372kB inactive_anon:250740kB active_file:0kB 
> > inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
> > present:393216kB managed:330360kB mlocked:0kB dirty:0kB writeback:0kB 
> > mapped:0kB shmem:227740kB slab_reclaimable:3032kB slab_unreclaimable:5128kB 
> > kernel_stack:400kB pagetables:624kB unstable:0kB bounce:0kB free_cma:0kB 
> > writeback_tmp:0kB pages_scanned:6 all_unreclaimable? yes
> > [ 4756.757635] lowmem_reserve[]: 0 0 0 0
> > [ 4756.759294] Node 0 DMA: 2*4kB (UM) 2*8kB (UM) 3*16kB (UEM) 4*32kB (UEM) 
> > 2*64kB (UM) 4*128kB (UEM) 2*256kB (EM) 2*512kB (EM) 2*1024kB (UM) 3*2048kB 
> > (EMR) 1*4096kB (M) = 14664kB
> > [ 4756.762776] Node 0 DMA32: 424*4kB (UEM) 171*8kB (UEM) 21*16kB (UEM) 
> > 1*32kB (R) 1*64kB (R) 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 1*2048kB 
> > (R) 0*4096kB = 7208kB
> > [ 4756.766284] Node 0 Normal: 26*4kB (UER) 18*8kB (UER) 3*16kB (E) 1*32kB 
> > (R) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 328kB
> > [ 4756.768198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
> > hugepages_size=2048kB
> > [ 4756.770026] 916139 total pagecache pages
> > [ 4756.771857] 443703 pages in swap cache
> > [ 4756.773695] Swap cache stats: add 15363874, delete 14920171, find 
> > 6533699/7512215
> > [ 4756.775592] Free swap  = 0kB
> > [ 4756.777505] Total swap = 2047996kB
> 
> OK, so here's my theory as to what happens:
> 
> 1. The graphics pages got put on the LRU
> 2. System is low on memory, they get on (and *STAY* on) the inactive
>LRU.
> 3. VM adds graphics pages to the swap cache, and writes them out, and
>we see the writeout from the vmstat, and lots of adds/removes from
>the swap cache.
> 4. But, despite all the swap writeout, we don't get helped by seeing
>much memory get freed.  Why?
> 
> I _suspect_ that the graphics drivers here are holding a reference to
> the page.  During reclaim, we're mostly concerned with the pages being
> mapped.  If we manage to get them unmapped, we'll go ahead and swap
> them, which I _think_ is what we're seeing.  But, when it comes time to
> _actually_ free them, that last reference on the page keeps them from
> being freed.
> 
> Is it possible that there's still a get_page() reference that's holding
> those pages in place from the graphics code?

Not from i915.ko. The last resort of our shrinker is to drop all page
refs held by the GPU, which is invoked if we are asked to free memory
and we have no inactive objects left.

If we could get a callback for the oom report, I could dump some details
about what the GPU is holding onto. That seems like a useful extension to
add to the shrinkers.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-24 Thread Dave Hansen
On 04/23/2014 10:58 PM, Chris Wilson wrote:
> [ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB 
> active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB 
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB 
> managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB 
> slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB 
> pagetables:48kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
> pages_scanned:76 all_unreclaimable? yes
> [ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
> [ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB 
> active_anon:36172kB inactive_anon:3351408kB active_file:92kB 
> inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
> present:3518336kB managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB 
> mapped:12kB shmem:1661420kB slab_reclaimable:17624kB 
> slab_unreclaimable:14400kB kernel_stack:696kB pagetables:4324kB unstable:0kB 
> bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:327 
> all_unreclaimable? yes
> [ 4756.751341] lowmem_reserve[]: 0 0 322 322
> [ 4756.752889] Node 0 Normal free:328kB min:680kB low:848kB high:1020kB 
> active_anon:61372kB inactive_anon:250740kB active_file:0kB inactive_file:4kB 
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:393216kB 
> managed:330360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB 
> shmem:227740kB slab_reclaimable:3032kB slab_unreclaimable:5128kB 
> kernel_stack:400kB pagetables:624kB unstable:0kB bounce:0kB free_cma:0kB 
> writeback_tmp:0kB pages_scanned:6 all_unreclaimable? yes
> [ 4756.757635] lowmem_reserve[]: 0 0 0 0
> [ 4756.759294] Node 0 DMA: 2*4kB (UM) 2*8kB (UM) 3*16kB (UEM) 4*32kB (UEM) 
> 2*64kB (UM) 4*128kB (UEM) 2*256kB (EM) 2*512kB (EM) 2*1024kB (UM) 3*2048kB 
> (EMR) 1*4096kB (M) = 14664kB
> [ 4756.762776] Node 0 DMA32: 424*4kB (UEM) 171*8kB (UEM) 21*16kB (UEM) 1*32kB 
> (R) 1*64kB (R) 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 1*2048kB (R) 
> 0*4096kB = 7208kB
> [ 4756.766284] Node 0 Normal: 26*4kB (UER) 18*8kB (UER) 3*16kB (E) 1*32kB (R) 
> 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 328kB
> [ 4756.768198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
> hugepages_size=2048kB
> [ 4756.770026] 916139 total pagecache pages
> [ 4756.771857] 443703 pages in swap cache
> [ 4756.773695] Swap cache stats: add 15363874, delete 14920171, find 
> 6533699/7512215
> [ 4756.775592] Free swap  = 0kB
> [ 4756.777505] Total swap = 2047996kB

OK, so here's my theory as to what happens:

1. The graphics pages got put on the LRU
2. System is low on memory, they get on (and *STAY* on) the inactive
   LRU.
3. VM adds graphics pages to the swap cache, and writes them out, and
   we see the writeout from the vmstat, and lots of adds/removes from
   the swap cache.
4. But, despite all the swap writeout, we don't get helped by seeing
   much memory get freed.  Why?

I _suspect_ that the graphics drivers here are holding a reference to
the page.  During reclaim, we're mostly concerned with the pages being
mapped.  If we manage to get them unmapped, we'll go ahead and swap
them, which I _think_ is what we're seeing.  But, when it comes time to
_actually_ free them, that last reference on the page keeps them from
being freed.

Is it possible that there's still a get_page() reference that's holding
those pages in place from the graphics code?
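The "last reference keeps them from being freed" point can be seen with an ordinary refcount sketch. This is a hedged toy model, not kernel code: real pages use an atomic `_count` with `get_page()`/`put_page()`, and `toy_*` names are invented here.

```c
/* Toy model of a page's reference count: reclaim can unmap and swap
 * the page, but it is only actually freed when the count hits zero. */
struct toy_page {
	int refcount;
	int freed;
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	if (--p->refcount == 0)
		p->freed = 1;
}

/* Reclaim drops the mapping's reference.  If a driver still holds one
 * (as i915 does between binding pages and its shrinker running), the
 * memory is not returned to the allocator even after writeout. */
static int toy_reclaim(struct toy_page *p)
{
	toy_put_page(p);	/* unmap: drop the page-table reference */
	return p->freed;	/* 0 if an extra reference pins the page */
}
```

So swap I/O can churn away while free memory barely moves, which matches the vmstat traces in the bug.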

>> Also, the vmstat output from the bug:
>>
>>> https://bugs.freedesktop.org/show_bug.cgi?id=72742
>>
>> shows there being an *AWFUL* lot of swap I/O going on here.  From the
>> looks of it, we stuck ~2GB in swap and evicted another 1.5GB of page
>> cache (although I guess that could be double-counting tmpfs getting
>> swapped out too).  Hmmm, was this one of the cases where you actually
>> ran _out_ of swap?
> 
> Yes. This bug is a little odd because they always run out of swap. We
> have another category of bug (which appears to be fixed, touch wood)
> where we trigger oom without even touching swap. The test case is
> designed to only just swap (use at most 1/4 of the available swap space)
> and checks that its working set should fit into available memory + swap.
> However, when QA run the test, their systems run completely out of
> virtual memory. There is a discrepancy on their machines where
> anon_inactive is reported as being 2x shmem, but we only expect
> anon_inactive to be our own shmem allocations. I don't know how to track
> what else is using anon_inactive. Suggestions?

Let's tackle one bug at a time.  They might be the same thing.



Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-23 Thread Chris Wilson
On Wed, Apr 23, 2014 at 02:14:36PM -0700, Dave Hansen wrote:
> On 04/22/2014 12:30 PM, Daniel Vetter wrote:
> > > > During testing of i915.ko with working texture sets larger than RAM, we
> > > > encounter OOM with plenty of memory still trapped within writeback, e.g:
> > > > 
> > > > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> > > >  active_file:33 inactive_file:39 isolated_file:0
> > > >  unevictable:0 dirty:0 writeback:337627 unstable:0
> > > >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> > > >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> > > > 
> > > > If we throttle for writeback following shrink_slab, this gives us time
> > > > to wait upon the writeback generated by the i915.ko shrinker:
> > > > 
> > > > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> > > >  active_file:23 inactive_file:20 isolated_file:0
> > > >  unevictable:0 dirty:0 writeback:0 unstable:0
> > > >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> > > >  mapped:3 shmem:472393 pagetables:1249 bounce:0
> 
> Could you get some dumps of the entire set of OOM information?  These
> are only tiny snippets.

For reference the last oom report after flushing all the writeback:

[ 4756.749554] crond invoked oom-killer: gfp_mask=0x201da, order=0, 
oom_score_adj=0
[ 4756.749603] crond cpuset=/ mems_allowed=0
[ 4756.749628] CPU: 0 PID: 3574 Comm: crond Tainted: GW
3.14.0_prts_de579f_20140410 #2
[ 4756.749676] Hardware name: Gigabyte Technology Co., Ltd. 
H55M-UD2H/H55M-UD2H, BIOS F4 12/02/2009
[ 4756.749723]   000201da 81717273 
8800d235dc40
[ 4756.749762]  81714541 0400 8800cb6f3b10 
880117ff8000
[ 4756.749800]  81072266 0206 812d6ebe 
880112f25c40
[ 4756.749838] Call Trace:
[ 4756.749856]  [] ? dump_stack+0x41/0x51
[ 4756.749881]  [] ? dump_header.isra.8+0x69/0x191
[ 4756.749911]  [] ? ktime_get_ts+0x49/0xab
[ 4756.749938]  [] ? ___ratelimit+0xae/0xc8
[ 4756.749965]  [] ? oom_kill_process+0x76/0x32c
[ 4756.749992]  [] ? find_lock_task_mm+0x22/0x6e
[ 4756.750018]  [] ? out_of_memory+0x41c/0x44f
[ 4756.750045]  [] ? __alloc_pages_nodemask+0x680/0x78d
[ 4756.750076]  [] ? alloc_pages_current+0xbf/0xdc
[ 4756.750103]  [] ? filemap_fault+0x266/0x38b
[ 4756.750130]  [] ? __do_fault+0xac/0x3bf
[ 4756.750155]  [] ? handle_mm_fault+0x1e7/0x7e2
[ 4756.750181]  [] ? tlb_flush_mmu+0x4b/0x64
[ 4756.750219]  [] ? timerqueue_add+0x79/0x98
[ 4756.750254]  [] ? enqueue_hrtimer+0x15/0x37
[ 4756.750287]  [] ? __do_page_fault+0x42e/0x47b
[ 4756.750319]  [] ? hrtimer_try_to_cancel+0x67/0x70
[ 4756.750353]  [] ? hrtimer_cancel+0xc/0x16
[ 4756.750385]  [] ? do_nanosleep+0xb3/0xf1
[ 4756.750415]  [] ? hrtimer_nanosleep+0x89/0x10b
[ 4756.750447]  [] ? page_fault+0x22/0x30
[ 4756.750476] Mem-Info:
[ 4756.750490] Node 0 DMA per-cpu:
[ 4756.750510] CPU0: hi:0, btch:   1 usd:   0
[ 4756.750533] CPU1: hi:0, btch:   1 usd:   0
[ 4756.750555] CPU2: hi:0, btch:   1 usd:   0
[ 4756.750576] CPU3: hi:0, btch:   1 usd:   0
[ 4756.750598] Node 0 DMA32 per-cpu:
[ 4756.750615] CPU0: hi:  186, btch:  31 usd:   0
[ 4756.750637] CPU1: hi:  186, btch:  31 usd:   0
[ 4756.750660] CPU2: hi:  186, btch:  31 usd:   0
[ 4756.750681] CPU3: hi:  186, btch:  31 usd:   0
[ 4756.750702] Node 0 Normal per-cpu:
[ 4756.750720] CPU0: hi:   90, btch:  15 usd:   0
[ 4756.750742] CPU1: hi:   90, btch:  15 usd:   0
[ 4756.750763] CPU2: hi:   90, btch:  15 usd:   0
[ 4756.750785] CPU3: hi:   90, btch:  15 usd:   0
[ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
 active_file:23 inactive_file:20 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
 mapped:3 shmem:472393 pagetables:1249 bounce:0
 free_cma:0
[ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB 
active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB 
managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB 
slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB pagetables:48kB 
unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:76 
all_unreclaimable? yes
[ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
[ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB 
active_anon:36172kB inactive_anon:3351408kB active_file:92kB inactive_file:72kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3518336kB 
managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB mapped:12kB 
shmem:1661420kB slab_reclaimable:17624kB slab_unreclaimable:14400kB 
kernel_stack:696kB pagetables:4324kB unstable:0kB bounce:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:327 all_unreclaimable? yes
[ 4756.751341] lowmem_reserve[]: 0 0 322 322

Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-23 Thread Dave Hansen
On 04/22/2014 12:30 PM, Daniel Vetter wrote:
> > > During testing of i915.ko with working texture sets larger than RAM, we
> > > encounter OOM with plenty of memory still trapped within writeback, e.g:
> > > 
> > > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> > >  active_file:33 inactive_file:39 isolated_file:0
> > >  unevictable:0 dirty:0 writeback:337627 unstable:0
> > >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> > >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> > > 
> > > If we throttle for writeback following shrink_slab, this gives us time
> > > to wait upon the writeback generated by the i915.ko shrinker:
> > > 
> > > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> > >  active_file:23 inactive_file:20 isolated_file:0
> > >  unevictable:0 dirty:0 writeback:0 unstable:0
> > >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> > >  mapped:3 shmem:472393 pagetables:1249 bounce:0

Could you get some dumps of the entire set of OOM information?  These
are only tiny snippets.

Also, the vmstat output from the bug:

> https://bugs.freedesktop.org/show_bug.cgi?id=72742

shows there being an *AWFUL* lot of swap I/O going on here.  From the
looks of it, we stuck ~2GB in swap and evicted another 1.5GB of page
cache (although I guess that could be double-counting tmpfs getting
swapped out too).  Hmmm, was this one of the cases where you actually
ran _out_ of swap?

>  2  0  19472  33952296 36103240 19472 0 19472 1474  151  3 27 71  0
>  4  0 484964  66468296 31758640 465492 0 465516 2597 1395  0 32 66  2
>  0  2 751940  23692980 30228840 266976   688 266976 3681  636  0 27 66  6
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  2  1 1244580 295336988 26069840 492896 0 492908 1237  311  1  9 50 41
>  0  2 2047996  28760988 20371440 803160 0 803160 1221 1291  1 15 69 14




Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-22 Thread Daniel Vetter
On Fri, Apr 18, 2014 at 12:14:16PM -0700, Andrew Morton wrote:
> On Thu, 10 Apr 2014 08:05:06 +0100 Chris Wilson  
> wrote:
> 
> > During testing of i915.ko with working texture sets larger than RAM, we
> > encounter OOM with plenty of memory still trapped within writeback, e.g:
> > 
> > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> >  active_file:33 inactive_file:39 isolated_file:0
> >  unevictable:0 dirty:0 writeback:337627 unstable:0
> >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> > 
> > If we throttle for writeback following shrink_slab, this gives us time
> > to wait upon the writeback generated by the i915.ko shrinker:
> > 
> > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> >  active_file:23 inactive_file:20 isolated_file:0
> >  unevictable:0 dirty:0 writeback:0 unstable:0
> >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> >  mapped:3 shmem:472393 pagetables:1249 bounce:0
> > 
> > (Sadly though the test is still failing.)
> > 
> > Testcase: igt/gem_tiled_swapping
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
> 
> i915_gem_object_get_pages_gtt() makes my head spin, but
> https://bugs.freedesktop.org/attachment.cgi?id=90818 says
> "gfp_mask=0x201da" which is 
> 
> ___GFP_HARDWALL|___GFP_COLD|___GFP_FS|___GFP_IO|___GFP_WAIT|___GFP_MOVABLE|___GFP_HIGHMEM
> 
> so this allocation should work and it is very bad if the page allocator is
> declaring oom while there is so much writeback in flight, assuming the
> writeback is to eligible zones.

For more head spinning look at the lock stealing dance we do in our
shrinker callbacks i915_gem_inactive_scan|count(). It's not pretty at all,
but it helps to avoid the dreaded oom in a few more cases. Some review of
our mess of ducttape from -mm developers with actual clue would be really
appreciated ...
-Daniel
 
> Mel, Johannes: could you take a look please?
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-18 Thread Andrew Morton
On Thu, 10 Apr 2014 08:05:06 +0100 Chris Wilson  
wrote:

> During testing of i915.ko with working texture sets larger than RAM, we
> encounter OOM with plenty of memory still trapped within writeback, e.g:
> 
> [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
>  active_file:33 inactive_file:39 isolated_file:0
>  unevictable:0 dirty:0 writeback:337627 unstable:0
>  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
>  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> 
> If we throttle for writeback following shrink_slab, this gives us time
> to wait upon the writeback generated by the i915.ko shrinker:
> 
> [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
>  active_file:23 inactive_file:20 isolated_file:0
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
>  mapped:3 shmem:472393 pagetables:1249 bounce:0
> 
> (Sadly though the test is still failing.)
> 
> Testcase: igt/gem_tiled_swapping
> References: https://bugs.freedesktop.org/show_bug.cgi?id=72742

i915_gem_object_get_pages_gtt() makes my head spin, but
https://bugs.freedesktop.org/attachment.cgi?id=90818 says
"gfp_mask=0x201da" which is 

___GFP_HARDWALL|___GFP_COLD|___GFP_FS|___GFP_IO|___GFP_WAIT|___GFP_MOVABLE|___GFP_HIGHMEM

so this allocation should work and it is very bad if the page allocator is
declaring oom while there is so much writeback in flight, assuming the
writeback is to eligible zones.

Mel, Johannes: could you take a look please?


[Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-10 Thread Chris Wilson
During testing of i915.ko with working texture sets larger than RAM, we
encounter OOM with plenty of memory still trapped within writeback, e.g:

[   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
 active_file:33 inactive_file:39 isolated_file:0
 unevictable:0 dirty:0 writeback:337627 unstable:0
 free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
 mapped:41 shmem:1560769 pagetables:1276 bounce:0

If we throttle for writeback following shrink_slab, this gives us time
to wait upon the writeback generated by the i915.ko shrinker:

[ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
 active_file:23 inactive_file:20 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
 mapped:3 shmem:472393 pagetables:1249 bounce:0

(Sadly though the test is still failing.)

Testcase: igt/gem_tiled_swapping
References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
Signed-off-by: Chris Wilson 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Michal Hocko 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Dave Chinner 
Cc: Glauber Costa 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
---
 mm/vmscan.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9c74b409681..8c2cb1150d17 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -135,6 +135,10 @@ unsigned long vm_total_pages;	/* The total number of pages which the VM controls */
 static LIST_HEAD(shrinker_list);
 static DECLARE_RWSEM(shrinker_rwsem);
 
+static bool throttle_direct_reclaim(gfp_t gfp_mask,
+				    struct zonelist *zonelist,
+				    nodemask_t *nodemask);
+
 #ifdef CONFIG_MEMCG
 static bool global_reclaim(struct scan_control *sc)
 {
@@ -1521,7 +1525,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * of pages under pages flagged for immediate reclaim and stall if any
 	 * are encountered in the nr_immediate check below.
 	 */
-	if (nr_writeback && nr_writeback == nr_taken)
+	if (nr_writeback > nr_taken / 2)
 		zone_set_flag(zone, ZONE_WRITEBACK);
 
 	/*
@@ -2465,6 +2469,12 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 					WB_REASON_TRY_TO_FREE_PAGES);
 			sc->may_writepage = 1;
 		}
+
+		if (global_reclaim(sc) &&
+		    throttle_direct_reclaim(sc->gfp_mask,
+					    zonelist,
+					    sc->nodemask))
+			aborted_reclaim = true;
 	} while (--sc->priority >= 0 && !aborted_reclaim);
 
 out:
-- 
1.9.1
