Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On 04/26/2014 06:10 AM, Chris Wilson wrote:
>>> Thanks for the pointer to register_oom_notifier(), I can use that to
>>> make sure that we do purge everything from the GPU, and do a sanity
>>> check at the same time, before we start killing processes.
>>
>> Actually, that one doesn't get called until we're *SURE* we are going to
>> OOM. Any action taken in there won't be taken in to account.
>
> 	blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> 	if (freed > 0)
> 		/* Got some memory back in the last second. */
> 		return;
>
> That looks like it should abort the oom and so repeat the allocation
> attempt? Or is that too hopeful?

You're correct. I was reading the code utterly wrong.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Fri, Apr 25, 2014 at 10:18:57AM -0700, Dave Hansen wrote:
> On 04/25/2014 12:23 AM, Chris Wilson wrote:
> > On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
> >> On 04/24/2014 08:39 AM, Chris Wilson wrote:
> >>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> >>>> Is it possible that there's still a get_page() reference that's holding
> >>>> those pages in place from the graphics code?
> >>>
> >>> Not from i915.ko. The last resort of our shrinker is to drop all page
> >>> refs held by the GPU, which is invoked if we are asked to free memory
> >>> and we have no inactive objects left.
> >>
> >> How sure are we that this was performed before the OOM?
> >
> > Only by virtue of how shrink_slabs() works.
>
> Could we try to raise the level of assurance there, please? :)
>
> So this "last resort" is i915_gem_shrink_all()? It seems like we might
> have some problems getting down to that part of the code if we have
> problems getting the mutex.

In general, but not in this example where the load is tightly controlled.

> We have tracepoints for the shrinkers in here (it says slab, but it's
> all the shrinkers, I checked):
>
> 	/sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable
>
> and another for OOMs:
>
> 	/sys/kernel/debug/tracing/events/oom/enable
>
> Could you collect a trace during one of these OOM events and see what
> the i915 shrinker is doing? Just enable those two and then collect a
> copy of:
>
> 	/sys/kernel/debug/tracing/trace
>
> That'll give us some insight about how well the shrinker is working. If
> the VM gave up on calling in to it, it might reveal why we didn't get
> all the way down in to i915_gem_shrink_all().

I'll add it to the list for QA to try.

> > Thanks for the pointer to
> > register_oom_notifier(), I can use that to make sure that we do purge
> > everything from the GPU, and do a sanity check at the same time, before
> > we start killing processes.
>
> Actually, that one doesn't get called until we're *SURE* we are going to
> OOM. Any action taken in there won't be taken in to account.

	blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
	if (freed > 0)
		/* Got some memory back in the last second. */
		return;

That looks like it should abort the oom and so repeat the allocation
attempt? Or is that too hopeful?

> >> Also, forgive me for being an idiot wrt the way graphics work, but are
> >> there any good candidates that you can think of that could be holding a
> >> reference? I've honestly never seen an OOM like this.
> >
> > Here the only place that we take a page reference is in
> > i915_gem_object_get_pages(). We do this when we first bind the pages
> > into the GPU's translation table, but we only release the pages once the
> > object is destroyed or the system experiences memory pressure. (Once the
> > GPU touches the pages, we no longer consider them to be cache coherent
> > with the CPU and so migrating them between the GPU and CPU requires
> > clflushing, which is expensive.)
> >
> > Aside from CPU mmaps of the shmemfs filp, all operations on our
> > graphical objects should lead to i915_gem_object_get_pages(). However
> > not all objects are recoverable as some may be pinned due to hardware
> > access.
>
> In that oom callback, could you dump out the aggregate number of
> obj->pages_pin_count across all the objects? That would be a very
> interesting piece of information to have. It would also be very
> insightful for folks who see OOMs in practice with i915 in their systems.

Indeed.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
Poking around with those tracepoints, I don't see the i915 shrinker
getting run, only i915_gem_inactive_count() being called. It must be
returning 0 because we're never even _getting_ to the tracepoints
themselves after calling i915_gem_inactive_count().

This is on my laptop, and I haven't been able to coax i915 in to
reclaiming a single page in 10 or 15 minutes. That seems fishy to me.
Surely *SOMETHING* has become reclaimable in that time.

Here's /sys/kernel/debug/dri/0/i915_gem_objects:

> 919 objects, 354914304 bytes
> 874 [333] objects, 291004416 [93614080] bytes in gtt
>   0 [0] active objects, 0 [0] bytes
>   874 [333] inactive objects, 291004416 [93614080] bytes
> 0 unbound objects, 0 bytes
> 199 purgeable objects, 92844032 bytes
> 30 pinned mappable objects, 18989056 bytes
> 139 fault mappable objects, 17371136 bytes
> 2145386496 [268435456] gtt total
>
> Xorg: 632 objects, 235450368 bytes (0 active, 180899840 inactive, 21262336 unbound)
> gnome-control-c: 11 objects, 110592 bytes (0 active, 0 inactive, 49152 unbound)
> chromium-browse: 266 objects, 101367808 bytes (0 active, 101330944 inactive, 0 unbound)
> Xorg: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On 04/25/2014 12:23 AM, Chris Wilson wrote:
> On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
>> On 04/24/2014 08:39 AM, Chris Wilson wrote:
>>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
>>>> Is it possible that there's still a get_page() reference that's holding
>>>> those pages in place from the graphics code?
>>>
>>> Not from i915.ko. The last resort of our shrinker is to drop all page
>>> refs held by the GPU, which is invoked if we are asked to free memory
>>> and we have no inactive objects left.
>>
>> How sure are we that this was performed before the OOM?
>
> Only by virtue of how shrink_slabs() works.

Could we try to raise the level of assurance there, please? :)

So this "last resort" is i915_gem_shrink_all()? It seems like we might
have some problems getting down to that part of the code if we have
problems getting the mutex.

We have tracepoints for the shrinkers in here (it says slab, but it's
all the shrinkers, I checked):

	/sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable

and another for OOMs:

	/sys/kernel/debug/tracing/events/oom/enable

Could you collect a trace during one of these OOM events and see what
the i915 shrinker is doing? Just enable those two and then collect a
copy of:

	/sys/kernel/debug/tracing/trace

That'll give us some insight about how well the shrinker is working. If
the VM gave up on calling in to it, it might reveal why we didn't get
all the way down in to i915_gem_shrink_all().

> Thanks for the pointer to
> register_oom_notifier(), I can use that to make sure that we do purge
> everything from the GPU, and do a sanity check at the same time, before
> we start killing processes.

Actually, that one doesn't get called until we're *SURE* we are going to
OOM. Any action taken in there won't be taken in to account.

>> Also, forgive me for being an idiot wrt the way graphics work, but are
>> there any good candidates that you can think of that could be holding a
>> reference? I've honestly never seen an OOM like this.
>
> Here the only place that we take a page reference is in
> i915_gem_object_get_pages(). We do this when we first bind the pages
> into the GPU's translation table, but we only release the pages once the
> object is destroyed or the system experiences memory pressure. (Once the
> GPU touches the pages, we no longer consider them to be cache coherent
> with the CPU and so migrating them between the GPU and CPU requires
> clflushing, which is expensive.)
>
> Aside from CPU mmaps of the shmemfs filp, all operations on our
> graphical objects should lead to i915_gem_object_get_pages(). However
> not all objects are recoverable as some may be pinned due to hardware
> access.

In that oom callback, could you dump out the aggregate number of
obj->pages_pin_count across all the objects? That would be a very
interesting piece of information to have. It would also be very
insightful for folks who see OOMs in practice with i915 in their systems.
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
> On 04/24/2014 08:39 AM, Chris Wilson wrote:
> > On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> >> Is it possible that there's still a get_page() reference that's holding
> >> those pages in place from the graphics code?
> >
> > Not from i915.ko. The last resort of our shrinker is to drop all page
> > refs held by the GPU, which is invoked if we are asked to free memory
> > and we have no inactive objects left.
>
> How sure are we that this was performed before the OOM?

Only by virtue of how shrink_slabs() works. Thanks for the pointer to
register_oom_notifier(), I can use that to make sure that we do purge
everything from the GPU, and do a sanity check at the same time, before
we start killing processes.

> Also, forgive me for being an idiot wrt the way graphics work, but are
> there any good candidates that you can think of that could be holding a
> reference? I've honestly never seen an OOM like this.

Here the only place that we take a page reference is in
i915_gem_object_get_pages(). We do this when we first bind the pages
into the GPU's translation table, but we only release the pages once the
object is destroyed or the system experiences memory pressure. (Once the
GPU touches the pages, we no longer consider them to be cache coherent
with the CPU and so migrating them between the GPU and CPU requires
clflushing, which is expensive.)

Aside from CPU mmaps of the shmemfs filp, all operations on our
graphical objects should lead to i915_gem_object_get_pages(). However
not all objects are recoverable as some may be pinned due to hardware
access.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On 04/24/2014 08:39 AM, Chris Wilson wrote:
> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
>> Is it possible that there's still a get_page() reference that's holding
>> those pages in place from the graphics code?
>
> Not from i915.ko. The last resort of our shrinker is to drop all page
> refs held by the GPU, which is invoked if we are asked to free memory
> and we have no inactive objects left.

How sure are we that this was performed before the OOM?

Also, forgive me for being an idiot wrt the way graphics work, but are
there any good candidates that you can think of that could be holding a
reference? I've honestly never seen an OOM like this.

Somewhat rhetorical question for the mm folks on cc: should we be
sticking the pages on which you're holding a reference on our
unreclaimable list?

> If we could get a callback for the oom report, I could dump some details
> about what the GPU is holding onto. That seems like a useful extension to
> add to the shrinkers.

There's a register_oom_notifier(). Is that sufficient for your use, or
is there something additional that would help?
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> On 04/23/2014 10:58 PM, Chris Wilson wrote:
> > [ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB
> >  active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB
> >  unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB
> >  managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB
> >  slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB
> >  pagetables:48kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
> >  pages_scanned:76 all_unreclaimable? yes
> > [ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
> > [ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB
> >  active_anon:36172kB inactive_anon:3351408kB active_file:92kB
> >  inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
> >  present:3518336kB managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB
> >  mapped:12kB shmem:1661420kB slab_reclaimable:17624kB
> >  slab_unreclaimable:14400kB kernel_stack:696kB pagetables:4324kB
> >  unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:327
> >  all_unreclaimable? yes
> > [ 4756.751341] lowmem_reserve[]: 0 0 322 322
> > [ 4756.752889] Node 0 Normal free:328kB min:680kB low:848kB high:1020kB
> >  active_anon:61372kB inactive_anon:250740kB active_file:0kB
> >  inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
> >  present:393216kB managed:330360kB mlocked:0kB dirty:0kB writeback:0kB
> >  mapped:0kB shmem:227740kB slab_reclaimable:3032kB slab_unreclaimable:5128kB
> >  kernel_stack:400kB pagetables:624kB unstable:0kB bounce:0kB free_cma:0kB
> >  writeback_tmp:0kB pages_scanned:6 all_unreclaimable? yes
> > [ 4756.757635] lowmem_reserve[]: 0 0 0 0
> > [ 4756.759294] Node 0 DMA: 2*4kB (UM) 2*8kB (UM) 3*16kB (UEM) 4*32kB (UEM)
> >  2*64kB (UM) 4*128kB (UEM) 2*256kB (EM) 2*512kB (EM) 2*1024kB (UM)
> >  3*2048kB (EMR) 1*4096kB (M) = 14664kB
> > [ 4756.762776] Node 0 DMA32: 424*4kB (UEM) 171*8kB (UEM) 21*16kB (UEM)
> >  1*32kB (R) 1*64kB (R) 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R)
> >  1*2048kB (R) 0*4096kB = 7208kB
> > [ 4756.766284] Node 0 Normal: 26*4kB (UER) 18*8kB (UER) 3*16kB (E) 1*32kB
> >  (R) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 328kB
> > [ 4756.768198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0
> >  hugepages_size=2048kB
> > [ 4756.770026] 916139 total pagecache pages
> > [ 4756.771857] 443703 pages in swap cache
> > [ 4756.773695] Swap cache stats: add 15363874, delete 14920171, find
> >  6533699/7512215
> > [ 4756.775592] Free swap  = 0kB
> > [ 4756.777505] Total swap = 2047996kB
>
> OK, so here's my theory as to what happens:
>
> 1. The graphics pages got put on the LRU
> 2. System is low on memory, they get on (and *STAY* on) the inactive
>    LRU.
> 3. VM adds graphics pages to the swap cache, and writes them out, and
>    we see the writeout from the vmstat, and lots of adds/removes from
>    the swap cache.
> 4. But, despite all the swap writeout, we don't get helped by seeing
>    much memory get freed. Why?
>
> I _suspect_ that the graphics drivers here are holding a reference to
> the page. During reclaim, we're mostly concerned with the pages being
> mapped. If we manage to get them unmapped, we'll go ahead and swap
> them, which I _think_ is what we're seeing. But, when it comes time to
> _actually_ free them, that last reference on the page keeps them from
> being freed.
>
> Is it possible that there's still a get_page() reference that's holding
> those pages in place from the graphics code?

Not from i915.ko. The last resort of our shrinker is to drop all page
refs held by the GPU, which is invoked if we are asked to free memory
and we have no inactive objects left.

If we could get a callback for the oom report, I could dump some details
about what the GPU is holding onto. That seems like a useful extension to
add to the shrinkers.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On 04/23/2014 10:58 PM, Chris Wilson wrote:
> [ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB
>  active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB
>  unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB
>  managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB
>  slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB
>  pagetables:48kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
>  pages_scanned:76 all_unreclaimable? yes
> [ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
> [ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB
>  active_anon:36172kB inactive_anon:3351408kB active_file:92kB
>  inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
>  present:3518336kB managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB
>  mapped:12kB shmem:1661420kB slab_reclaimable:17624kB
>  slab_unreclaimable:14400kB kernel_stack:696kB pagetables:4324kB
>  unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:327
>  all_unreclaimable? yes
> [ 4756.751341] lowmem_reserve[]: 0 0 322 322
> [ 4756.752889] Node 0 Normal free:328kB min:680kB low:848kB high:1020kB
>  active_anon:61372kB inactive_anon:250740kB active_file:0kB inactive_file:4kB
>  unevictable:0kB isolated(anon):0kB isolated(file):0kB present:393216kB
>  managed:330360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB
>  shmem:227740kB slab_reclaimable:3032kB slab_unreclaimable:5128kB
>  kernel_stack:400kB pagetables:624kB unstable:0kB bounce:0kB free_cma:0kB
>  writeback_tmp:0kB pages_scanned:6 all_unreclaimable? yes
> [ 4756.757635] lowmem_reserve[]: 0 0 0 0
> [ 4756.759294] Node 0 DMA: 2*4kB (UM) 2*8kB (UM) 3*16kB (UEM) 4*32kB (UEM)
>  2*64kB (UM) 4*128kB (UEM) 2*256kB (EM) 2*512kB (EM) 2*1024kB (UM)
>  3*2048kB (EMR) 1*4096kB (M) = 14664kB
> [ 4756.762776] Node 0 DMA32: 424*4kB (UEM) 171*8kB (UEM) 21*16kB (UEM)
>  1*32kB (R) 1*64kB (R) 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R)
>  1*2048kB (R) 0*4096kB = 7208kB
> [ 4756.766284] Node 0 Normal: 26*4kB (UER) 18*8kB (UER) 3*16kB (E) 1*32kB
>  (R) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 328kB
> [ 4756.768198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0
>  hugepages_size=2048kB
> [ 4756.770026] 916139 total pagecache pages
> [ 4756.771857] 443703 pages in swap cache
> [ 4756.773695] Swap cache stats: add 15363874, delete 14920171, find
>  6533699/7512215
> [ 4756.775592] Free swap  = 0kB
> [ 4756.777505] Total swap = 2047996kB

OK, so here's my theory as to what happens:

1. The graphics pages got put on the LRU
2. System is low on memory, they get on (and *STAY* on) the inactive
   LRU.
3. VM adds graphics pages to the swap cache, and writes them out, and
   we see the writeout from the vmstat, and lots of adds/removes from
   the swap cache.
4. But, despite all the swap writeout, we don't get helped by seeing
   much memory get freed. Why?

I _suspect_ that the graphics drivers here are holding a reference to
the page. During reclaim, we're mostly concerned with the pages being
mapped. If we manage to get them unmapped, we'll go ahead and swap
them, which I _think_ is what we're seeing. But, when it comes time to
_actually_ free them, that last reference on the page keeps them from
being freed.

Is it possible that there's still a get_page() reference that's holding
those pages in place from the graphics code?

>> Also, the vmstat output from the bug:
>>
>>> https://bugs.freedesktop.org/show_bug.cgi?id=72742
>>
>> shows there being an *AWFUL* lot of swap I/O going on here. From the
>> looks of it, we stuck ~2GB in swap and evicted another 1.5GB of page
>> cache (although I guess that could be double-counting tmpfs getting
>> swapped out too). Hmmm, was this one of the cases where you actually
>> ran _out_ of swap?
>
> Yes. This bug is a little odd because they always run out of swap. We
> have another category of bug (which appears to be fixed, touch wood)
> where we trigger oom without even touching swap. The test case is
> designed to only just swap (use at most 1/4 of the available swap space)
> and checks that its working set should fit into available memory + swap.
> However, when QA run the test, their systems run completely out of
> virtual memory. There is a discrepancy on their machines where
> anon_inactive is reported as being 2x shmem, but we only expect
> anon_inactive to be our own shmem allocations. I don't know how to track
> what else is using anon_inactive. Suggestions?

Let's tackle one bug at a time. They might be the same thing.
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Wed, Apr 23, 2014 at 02:14:36PM -0700, Dave Hansen wrote:
> On 04/22/2014 12:30 PM, Daniel Vetter wrote:
> > > During testing of i915.ko with working texture sets larger than RAM, we
> > > encounter OOM with plenty of memory still trapped within writeback,
> > > e.g:
> > >
> > > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> > >  active_file:33 inactive_file:39 isolated_file:0
> > >  unevictable:0 dirty:0 writeback:337627 unstable:0
> > >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> > >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> > >
> > > If we throttle for writeback following shrink_slab, this gives us time
> > > to wait upon the writeback generated by the i915.ko shrinker:
> > >
> > > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> > >  active_file:23 inactive_file:20 isolated_file:0
> > >  unevictable:0 dirty:0 writeback:0 unstable:0
> > >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> > >  mapped:3 shmem:472393 pagetables:1249 bounce:0
>
> Could you get some dumps of the entire set of OOM information? These
> are only tiny snippets.

For reference the last oom report after flushing all the writeback:

[ 4756.749554] crond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[ 4756.749603] crond cpuset=/ mems_allowed=0
[ 4756.749628] CPU: 0 PID: 3574 Comm: crond Tainted: G        W    3.14.0_prts_de579f_20140410 #2
[ 4756.749676] Hardware name: Gigabyte Technology Co., Ltd. H55M-UD2H/H55M-UD2H, BIOS F4 12/02/2009
[ 4756.749723]  000201da 81717273 8800d235dc40
[ 4756.749762]  81714541 0400 8800cb6f3b10 880117ff8000
[ 4756.749800]  81072266 0206 812d6ebe 880112f25c40
[ 4756.749838] Call Trace:
[ 4756.749856]  [] ? dump_stack+0x41/0x51
[ 4756.749881]  [] ? dump_header.isra.8+0x69/0x191
[ 4756.749911]  [] ? ktime_get_ts+0x49/0xab
[ 4756.749938]  [] ? ___ratelimit+0xae/0xc8
[ 4756.749965]  [] ? oom_kill_process+0x76/0x32c
[ 4756.749992]  [] ? find_lock_task_mm+0x22/0x6e
[ 4756.750018]  [] ? out_of_memory+0x41c/0x44f
[ 4756.750045]  [] ? __alloc_pages_nodemask+0x680/0x78d
[ 4756.750076]  [] ? alloc_pages_current+0xbf/0xdc
[ 4756.750103]  [] ? filemap_fault+0x266/0x38b
[ 4756.750130]  [] ? __do_fault+0xac/0x3bf
[ 4756.750155]  [] ? handle_mm_fault+0x1e7/0x7e2
[ 4756.750181]  [] ? tlb_flush_mmu+0x4b/0x64
[ 4756.750219]  [] ? timerqueue_add+0x79/0x98
[ 4756.750254]  [] ? enqueue_hrtimer+0x15/0x37
[ 4756.750287]  [] ? __do_page_fault+0x42e/0x47b
[ 4756.750319]  [] ? hrtimer_try_to_cancel+0x67/0x70
[ 4756.750353]  [] ? hrtimer_cancel+0xc/0x16
[ 4756.750385]  [] ? do_nanosleep+0xb3/0xf1
[ 4756.750415]  [] ? hrtimer_nanosleep+0x89/0x10b
[ 4756.750447]  [] ? page_fault+0x22/0x30
[ 4756.750476] Mem-Info:
[ 4756.750490] Node 0 DMA per-cpu:
[ 4756.750510] CPU0: hi: 0, btch: 1 usd: 0
[ 4756.750533] CPU1: hi: 0, btch: 1 usd: 0
[ 4756.750555] CPU2: hi: 0, btch: 1 usd: 0
[ 4756.750576] CPU3: hi: 0, btch: 1 usd: 0
[ 4756.750598] Node 0 DMA32 per-cpu:
[ 4756.750615] CPU0: hi: 186, btch: 31 usd: 0
[ 4756.750637] CPU1: hi: 186, btch: 31 usd: 0
[ 4756.750660] CPU2: hi: 186, btch: 31 usd: 0
[ 4756.750681] CPU3: hi: 186, btch: 31 usd: 0
[ 4756.750702] Node 0 Normal per-cpu:
[ 4756.750720] CPU0: hi: 90, btch: 15 usd: 0
[ 4756.750742] CPU1: hi: 90, btch: 15 usd: 0
[ 4756.750763] CPU2: hi: 90, btch: 15 usd: 0
[ 4756.750785] CPU3: hi: 90, btch: 15 usd: 0
[ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
 active_file:23 inactive_file:20 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
 mapped:3 shmem:472393 pagetables:1249 bounce:0
 free_cma:0
[ 4756.750938] Node 0 DMA free:14664kB min:32kB low:40kB high:48kB
 active_anon:0kB inactive_anon:1024kB active_file:0kB inactive_file:4kB
 unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB
 managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:412kB
 slab_reclaimable:80kB slab_unreclaimable:24kB kernel_stack:0kB
 pagetables:48kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
 pages_scanned:76 all_unreclaimable? yes
[ 4756.751103] lowmem_reserve[]: 0 3337 3660 3660
[ 4756.751133] Node 0 DMA32 free:7208kB min:7044kB low:8804kB high:10564kB
 active_anon:36172kB inactive_anon:3351408kB active_file:92kB
 inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
 present:3518336kB managed:3440548kB mlocked:0kB dirty:0kB writeback:0kB
 mapped:12kB shmem:1661420kB slab_reclaimable:17624kB
 slab_unreclaimable:14400kB kernel_stack:696kB pagetables:4324kB
 unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:327
 all_unreclaimable? yes
[ 4756.751341] lowmem_reserve[]: 0 0 322 32
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On 04/22/2014 12:30 PM, Daniel Vetter wrote:
> > During testing of i915.ko with working texture sets larger than RAM, we
> > encounter OOM with plenty of memory still trapped within writeback, e.g:
> >
> > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> >  active_file:33 inactive_file:39 isolated_file:0
> >  unevictable:0 dirty:0 writeback:337627 unstable:0
> >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> >
> > If we throttle for writeback following shrink_slab, this gives us time
> > to wait upon the writeback generated by the i915.ko shrinker:
> >
> > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> >  active_file:23 inactive_file:20 isolated_file:0
> >  unevictable:0 dirty:0 writeback:0 unstable:0
> >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> >  mapped:3 shmem:472393 pagetables:1249 bounce:0

Could you get some dumps of the entire set of OOM information? These
are only tiny snippets.

Also, the vmstat output from the bug:

> https://bugs.freedesktop.org/show_bug.cgi?id=72742

shows there being an *AWFUL* lot of swap I/O going on here. From the
looks of it, we stuck ~2GB in swap and evicted another 1.5GB of page
cache (although I guess that could be double-counting tmpfs getting
swapped out too). Hmmm, was this one of the cases where you actually
ran _out_ of swap?

> 2 0   19472 33952296 36103240  19472   0  19472 1474  151  3 27 71  0
> 4 0  484964 66468296 31758640 465492   0 465516 2597 1395  0 32 66  2
> 0 2  751940 23692980 30228840 266976 688 266976 3681  636  0 27 66  6
> procs ---memory-- ---swap-- -io -system-- cpu
> r b    swpd     free     buff  cache     si     so   bi   bo in cs us sy id wa
> 2 1 1244580 295336988 26069840 492896   0 492908 1237  311  1  9 50 41
> 0 2 2047996 28760988 20371440 803160   0 803160 1221 1291  1 15 69 14
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Fri, Apr 18, 2014 at 12:14:16PM -0700, Andrew Morton wrote:
> On Thu, 10 Apr 2014 08:05:06 +0100 Chris Wilson wrote:
>
> > During testing of i915.ko with working texture sets larger than RAM, we
> > encounter OOM with plenty of memory still trapped within writeback, e.g:
> >
> > [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
> >  active_file:33 inactive_file:39 isolated_file:0
> >  unevictable:0 dirty:0 writeback:337627 unstable:0
> >  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
> >  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> >
> > If we throttle for writeback following shrink_slab, this gives us time
> > to wait upon the writeback generated by the i915.ko shrinker:
> >
> > [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
> >  active_file:23 inactive_file:20 isolated_file:0
> >  unevictable:0 dirty:0 writeback:0 unstable:0
> >  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
> >  mapped:3 shmem:472393 pagetables:1249 bounce:0
> >
> > (Sadly though the test is still failing.)
> >
> > Testcase: igt/gem_tiled_swapping
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
>
> i915_gem_object_get_pages_gtt() makes my head spin, but
> https://bugs.freedesktop.org/attachment.cgi?id=90818 says
> "gfp_mask=0x201da" which is
>
> ___GFP_HARDWALL|___GFP_COLD|___GFP_FS|___GFP_IO|___GFP_WAIT|___GFP_MOVABLE|___GFP_HIGHMEM
>
> so this allocation should work and it is very bad if the page allocator
> is declaring oom while there is so much writeback in flight, assuming
> the writeback is to eligible zones.

For more head spinning look at the lock stealing dance we do in our
shrinker callbacks i915_gem_inactive_scan|count(). It's not pretty at
all, but it helps to avoid the dreaded oom in a few more cases.

Some review of our mess of ducttape from -mm developers with actual
clue would be really appreciated ...
-Daniel

> Mel, Johannes: could you take a look please?
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder
On Thu, 10 Apr 2014 08:05:06 +0100 Chris Wilson wrote:

> During testing of i915.ko with working texture sets larger than RAM, we
> encounter OOM with plenty of memory still trapped within writeback, e.g:
>
> [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
>  active_file:33 inactive_file:39 isolated_file:0
>  unevictable:0 dirty:0 writeback:337627 unstable:0
>  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
>  mapped:41 shmem:1560769 pagetables:1276 bounce:0
>
> If we throttle for writeback following shrink_slab, this gives us time
> to wait upon the writeback generated by the i915.ko shrinker:
>
> [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
>  active_file:23 inactive_file:20 isolated_file:0
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
>  mapped:3 shmem:472393 pagetables:1249 bounce:0
>
> (Sadly though the test is still failing.)
>
> Testcase: igt/gem_tiled_swapping
> References: https://bugs.freedesktop.org/show_bug.cgi?id=72742

i915_gem_object_get_pages_gtt() makes my head spin, but
https://bugs.freedesktop.org/attachment.cgi?id=90818 says
"gfp_mask=0x201da" which is

___GFP_HARDWALL|___GFP_COLD|___GFP_FS|___GFP_IO|___GFP_WAIT|___GFP_MOVABLE|___GFP_HIGHMEM

so this allocation should work and it is very bad if the page allocator
is declaring oom while there is so much writeback in flight, assuming
the writeback is to eligible zones.

Mel, Johannes: could you take a look please?
[Intel-gfx] [PATCH] mm: Throttle shrinkers harder
During testing of i915.ko with working texture sets larger than RAM, we
encounter OOM with plenty of memory still trapped within writeback, e.g:

[   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
 active_file:33 inactive_file:39 isolated_file:0
 unevictable:0 dirty:0 writeback:337627 unstable:0
 free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
 mapped:41 shmem:1560769 pagetables:1276 bounce:0

If we throttle for writeback following shrink_slab, this gives us time
to wait upon the writeback generated by the i915.ko shrinker:

[ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
 active_file:23 inactive_file:20 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
 mapped:3 shmem:472393 pagetables:1249 bounce:0

(Sadly though the test is still failing.)

Testcase: igt/gem_tiled_swapping
References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
Signed-off-by: Chris Wilson
Cc: Andrew Morton
Cc: Mel Gorman
Cc: Michal Hocko
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: Dave Chinner
Cc: Glauber Costa
Cc: Hugh Dickins
Cc: linux...@kvack.org
---
 mm/vmscan.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9c74b409681..8c2cb1150d17 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -135,6 +135,10 @@ unsigned long vm_total_pages;	/* The total number of pages which the VM controls */
 static LIST_HEAD(shrinker_list);
 static DECLARE_RWSEM(shrinker_rwsem);
 
+static bool throttle_direct_reclaim(gfp_t gfp_mask,
+				    struct zonelist *zonelist,
+				    nodemask_t *nodemask);
+
 #ifdef CONFIG_MEMCG
 static bool global_reclaim(struct scan_control *sc)
 {
@@ -1521,7 +1525,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * of pages under pages flagged for immediate reclaim and stall if any
 	 * are encountered in the nr_immediate check below.
 	 */
-	if (nr_writeback && nr_writeback == nr_taken)
+	if (nr_writeback > nr_taken / 2)
 		zone_set_flag(zone, ZONE_WRITEBACK);
 
 	/*
@@ -2465,6 +2469,12 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 						WB_REASON_TRY_TO_FREE_PAGES);
 			sc->may_writepage = 1;
 		}
+
+		if (global_reclaim(sc) &&
+		    throttle_direct_reclaim(sc->gfp_mask,
+					    zonelist,
+					    sc->nodemask))
+			aborted_reclaim = true;
 	} while (--sc->priority >= 0 && !aborted_reclaim);
 
 out:
-- 
1.9.1