Re: [PATCH v3] kmemleak: survive in a low-memory situation
On 3/26/19 12:06 PM, Catalin Marinas wrote:
> I wonder whether we'd be better off to replace the metadata allocator
> with gen_pool. This way we'd also get rid of early logging/replaying of
> the memory allocations since we can populate the gen_pool early with a
> static buffer.

I suppose this is not going to work well, as DMA_API_DEBUG uses a similar
approach [1], but I still saw it struggling in a low-memory situation and
disabling itself occasionally.

[1] https://lkml.org/lkml/2018/12/10/383
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On Tue 26-03-19 16:20:41, Catalin Marinas wrote:
> On Tue, Mar 26, 2019 at 09:05:36AM -0700, Matthew Wilcox wrote:
> > On Tue, Mar 26, 2019 at 11:43:38AM -0400, Qian Cai wrote:
> > > Unless there is a brave soul to reimplement the kmemleak to embed it's
> > > metadata into the tracked memory itself in a foreseeable future, this
> > > provides a good balance between enabling kmemleak in a low-memory
> > > situation and not introducing too much hackiness into the existing
> > > code for now.
> >
> > I don't understand kmemleak. Kirill pointed me at this a few days ago:
> >
> > https://gist.github.com/kiryl/3225e235fea390aa2e49bf625bbe83ec
> >
> > It's caused by the XArray allocating memory using GFP_NOWAIT | __GFP_NOWARN.
> > kmemleak then decides it needs to allocate memory to track this memory.
> > So it calls kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >
> > #define gfp_kmemleak_mask(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
> >                                 __GFP_NORETRY | __GFP_NOMEMALLOC | \
> >                                 __GFP_NOWARN | __GFP_NOFAIL)
> >
> > then the page allocator gets to see GFP_NOFAIL | GFP_NOWAIT and gets angry.
> >
> > But I don't understand why kmemleak needs to mess with the GFP flags at
> > all.
>
> Originally, it was just preserving GFP_KERNEL | GFP_ATOMIC. Starting
> with commit 6ae4bd1f0bc4 ("kmemleak: Allow kmemleak metadata allocations
> to fail"), this mask changed, aimed at making kmemleak allocation
> failures less verbose (i.e. just disable it since it's a debug tool).
>
> Commit d9570ee3bd1d ("kmemleak: allow to coexist with fault injection")
> introduced __GFP_NOFAIL but this came with its own problems which have
> been previously reported (the warning you mentioned is another one of
> these). We didn't get to any clear conclusion on how best to allow
> allocations to fail with fault injection but not for the kmemleak
> metadata. Your suggestion below would probably do the trick.

I have objected to that on several occasions. An implicit __GFP_NOFAIL
is simply broken, and GFP_NOWAIT allocations are a shining example of
that. You cannot loop inside the allocator for an unbounded amount of
time, potentially with locks held. I have heard that there are some
plans to deal with that, but nothing has really materialized AFAIK.
d9570ee3bd1d should be reverted, I believe.

The proper way around is to keep a pool of objects and keep spare
objects for restricted allocation contexts.
-- 
Michal Hocko
SUSE Labs
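
The "pool of objects with spares for restricted contexts" idea can be sketched in userspace C. This is only an illustration under stated assumptions, not kmemleak code: malloc() stands in for the slab allocator, the can_block argument models whether the caller may sleep, and every name here (struct meta, struct meta_reserve, RESERVE_MAX, meta_alloc, and so on) is hypothetical. Locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_MAX 4	/* hypothetical number of spare objects */

/* Stand-in for the tracking metadata object. */
struct meta {
	unsigned long ptr;
	size_t size;
};

struct meta_reserve {
	struct meta *spare[RESERVE_MAX];
	size_t count;
};

/* Top up the reserve. Only meant for contexts that are allowed to block. */
static void reserve_refill(struct meta_reserve *r)
{
	while (r->count < RESERVE_MAX) {
		struct meta *m = malloc(sizeof(*m));

		if (!m)
			break;
		r->spare[r->count++] = m;
	}
}

/*
 * can_block == false models an atomic or GFP_NOWAIT caller: the general
 * allocator is never entered, so there is nothing to loop on.
 */
static struct meta *meta_alloc(struct meta_reserve *r, bool can_block)
{
	if (can_block) {
		struct meta *m = malloc(sizeof(*m));

		reserve_refill(r);	/* use the opportunity to restock */
		if (m)
			return m;
	}
	/* Restricted context, or the allocator failed: hand out a spare. */
	return r->count ? r->spare[--r->count] : NULL;
}

/* Freed objects restock the reserve before going back to the allocator. */
static void meta_free(struct meta_reserve *r, struct meta *m)
{
	if (!m)
		return;
	if (r->count < RESERVE_MAX)
		r->spare[r->count++] = m;
	else
		free(m);
}
```

With this shape a restricted caller either gets a spare or gets NULL immediately, so the failure path still has to exist; the reserve only makes it much rarer.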
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On 3/26/19 12:00 PM, Christopher Lameter wrote:
>> +		 */
>> +		gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC :
>> +		       gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>> +		object = kmem_cache_alloc(object_cache, gfp);
>> +	}
>> +
>>  	if (!object) {
>
> If the alloc must succeed then this check is no longer necessary.

Well, GFP_ATOMIC could still fail. It looks like the only thing that will
never fail is (__GFP_DIRECT_RECLAIM | __GFP_NOFAIL) as it keeps retrying in
__alloc_pages_slowpath().
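
Why the second NULL check stays can be shown with a small userspace model of the retry logic. This is a sketch under assumptions, not the patch itself: the three modes stand in for the original mask, GFP_ATOMIC, and the mask plus __GFP_DIRECT_RECLAIM, and fake_alloc() is an invented allocator whose atomic reserve can run dry. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three flag combinations discussed above. */
enum alloc_mode {
	MODE_DEFAULT,		/* first attempt with the masked caller flags */
	MODE_ATOMIC_RESERVES,	/* retry in atomic context (GFP_ATOMIC) */
	MODE_DIRECT_RECLAIM,	/* retry where sleeping is allowed */
};

typedef void *(*alloc_fn)(enum alloc_mode mode, void *ctx);

/* Two-step allocation: one retry with a stronger mode, then give up. */
static void *alloc_metadata(alloc_fn alloc, void *ctx, bool atomic_ctx)
{
	void *object = alloc(MODE_DEFAULT, ctx);

	if (!object) {
		enum alloc_mode retry = atomic_ctx ? MODE_ATOMIC_RESERVES
						   : MODE_DIRECT_RECLAIM;

		object = alloc(retry, ctx);
	}
	return object;	/* may still be NULL: the caller must check */
}

/* Invented allocator: the default mode always fails, reserves are finite. */
struct fake_mem {
	int atomic_reserve;
	int last_mode;
};

static int dummy_object;

static void *fake_alloc(enum alloc_mode mode, void *ctx)
{
	struct fake_mem *m = ctx;

	m->last_mode = mode;
	switch (mode) {
	case MODE_DIRECT_RECLAIM:
		return &dummy_object;
	case MODE_ATOMIC_RESERVES:
		if (m->atomic_reserve > 0) {
			m->atomic_reserve--;
			return &dummy_object;
		}
		return NULL;
	default:
		return NULL;
	}
}
```

Once the atomic reserve is used up, the retry in atomic context returns NULL as well, which is the case the remaining check has to handle.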
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On Tue, Mar 26, 2019 at 09:05:36AM -0700, Matthew Wilcox wrote:
> On Tue, Mar 26, 2019 at 11:43:38AM -0400, Qian Cai wrote:
> > Unless there is a brave soul to reimplement the kmemleak to embed it's
> > metadata into the tracked memory itself in a foreseeable future, this
> > provides a good balance between enabling kmemleak in a low-memory
> > situation and not introducing too much hackiness into the existing
> > code for now.
>
> I don't understand kmemleak. Kirill pointed me at this a few days ago:
>
> https://gist.github.com/kiryl/3225e235fea390aa2e49bf625bbe83ec
>
> It's caused by the XArray allocating memory using GFP_NOWAIT | __GFP_NOWARN.
> kmemleak then decides it needs to allocate memory to track this memory.
> So it calls kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>
> #define gfp_kmemleak_mask(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
>                                 __GFP_NORETRY | __GFP_NOMEMALLOC | \
>                                 __GFP_NOWARN | __GFP_NOFAIL)
>
> then the page allocator gets to see GFP_NOFAIL | GFP_NOWAIT and gets angry.
>
> But I don't understand why kmemleak needs to mess with the GFP flags at
> all.

Originally, it was just preserving GFP_KERNEL | GFP_ATOMIC. Starting
with commit 6ae4bd1f0bc4 ("kmemleak: Allow kmemleak metadata allocations
to fail"), this mask changed, aimed at making kmemleak allocation
failures less verbose (i.e. just disable it since it's a debug tool).

Commit d9570ee3bd1d ("kmemleak: allow to coexist with fault injection")
introduced __GFP_NOFAIL but this came with its own problems which have
been previously reported (the warning you mentioned is another one of
these). We didn't get to any clear conclusion on how best to allow
allocations to fail with fault injection but not for the kmemleak
metadata. Your suggestion below would probably do the trick.

> Just allocate using the same flags as the caller, and fail the original
> allocation if the kmemleak allocation fails. Like this:
>
> +++ b/mm/slab.h
> @@ -435,12 +435,22 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
>  	for (i = 0; i < size; i++) {
>  		p[i] = kasan_slab_alloc(s, p[i], flags);
>  		/* As p[i] might get tagged, call kmemleak hook after KASAN. */
> -		kmemleak_alloc_recursive(p[i], s->object_size, 1,
> -					 s->flags, flags);
> +		if (kmemleak_alloc_recursive(p[i], s->object_size, 1,
> +					 s->flags, flags))
> +			goto fail;
>  	}
>
>  	if (memcg_kmem_enabled())
>  		memcg_kmem_put_cache(s);
> +	return;
> +
> +fail:
> +	while (i > 0) {
> +		kasan_blah(...);
> +		kmemleak_blah();
> +		i--;
> +	}
> +	free_blah(p);
> +	*p = NULL;
>  }
>
>  #ifndef CONFIG_SLOB
>
>
> and if we had something like this, we wouldn't need kmemleak to have this
> self-disabling or must-succeed property.

We'd still need the self-disabling in place since there are a few other
places where we call kmemleak_alloc() from.

-- 
Catalin
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On Tue, Mar 26, 2019 at 11:43:38AM -0400, Qian Cai wrote:
> Kmemleak could quickly fail to allocate an object structure and then
> disable itself in a low-memory situation. For example, running a mmap()
> workload triggering swapping and OOM. This is especially problematic for
> running things like LTP testsuite where one OOM test case would disable
> the whole kmemleak and render the rest of test cases without kmemleak
> watching for leaking.
>
> Kmemleak allocation could fail even though the tracked memory is
> succeeded. Hence, it could still try to start a direct reclaim if it is
> not executed in an atomic context (spinlock, irq-handler etc), or a
> high-priority allocation in an atomic context as a last-ditch effort.
> Since kmemleak is a debug feature, it is unlikely to be used in
> production that memory resources is scarce where direct reclaim or
> high-priority atomic allocations should not be granted lightly.
>
> Unless there is a brave soul to reimplement the kmemleak to embed it's
> metadata into the tracked memory itself in a foreseeable future, this
> provides a good balance between enabling kmemleak in a low-memory
> situation and not introducing too much hackiness into the existing
> code for now.

Embedding the metadata would help with the slab allocations (though not
with vmalloc) but it comes with its own potential issues. There are some
bits of kmemleak that rely on deferred freeing of metadata for RCU
traversal, so this wouldn't go well with embedding it.

I wonder whether we'd be better off to replace the metadata allocator
with gen_pool. This way we'd also get rid of early logging/replaying of
the memory allocations since we can populate the gen_pool early with a
static buffer.

-- 
Catalin
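
The static-buffer idea can be illustrated with a fixed-size free list in userspace C. This is a sketch under assumptions and does not use the kernel's genalloc interface: the pool lives in a static array, so it is usable before any dynamic allocator is up, which is the property that would make early logging and replaying unnecessary. POOL_OBJECTS, struct tracked_object, union slot, and the pool_* functions are hypothetical names; locking and the RCU-deferred freeing mentioned above are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_OBJECTS 8	/* hypothetical pool capacity */

/* Stand-in for the metadata kept per tracked allocation. */
struct tracked_object {
	unsigned long pointer;
	size_t size;
};

/* A slot is either on the free list or holds a live object. */
union slot {
	union slot *next_free;
	struct tracked_object obj;
};

/* Static backing store: available from the very first call. */
static union slot pool_buf[POOL_OBJECTS];
static union slot *free_head;

/* Thread every slot onto the free list. */
static void pool_init(void)
{
	free_head = NULL;
	for (size_t i = 0; i < POOL_OBJECTS; i++) {
		pool_buf[i].next_free = free_head;
		free_head = &pool_buf[i];
	}
}

/* O(1), never sleeps, returns NULL once the pool is exhausted. */
static struct tracked_object *pool_alloc(void)
{
	union slot *s = free_head;

	if (!s)
		return NULL;
	free_head = s->next_free;
	return &s->obj;
}

/* Push the slot back; the object must have come from pool_alloc(). */
static void pool_free(struct tracked_object *obj)
{
	union slot *s = (union slot *)obj;

	assert(s >= pool_buf && s < pool_buf + POOL_OBJECTS);
	s->next_free = free_head;
	free_head = s;
}
```

A fixed pool still runs out, so exhaustion needs a policy of its own (grow the pool from a context that may sleep, or disable tracking).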
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On Tue, Mar 26, 2019 at 11:43:38AM -0400, Qian Cai wrote:
> Unless there is a brave soul to reimplement the kmemleak to embed it's
> metadata into the tracked memory itself in a foreseeable future, this
> provides a good balance between enabling kmemleak in a low-memory
> situation and not introducing too much hackiness into the existing
> code for now.

I don't understand kmemleak. Kirill pointed me at this a few days ago:

https://gist.github.com/kiryl/3225e235fea390aa2e49bf625bbe83ec

It's caused by the XArray allocating memory using GFP_NOWAIT | __GFP_NOWARN.
kmemleak then decides it needs to allocate memory to track this memory.
So it calls kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));

#define gfp_kmemleak_mask(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
                                __GFP_NORETRY | __GFP_NOMEMALLOC | \
                                __GFP_NOWARN | __GFP_NOFAIL)

then the page allocator gets to see GFP_NOFAIL | GFP_NOWAIT and gets angry.

But I don't understand why kmemleak needs to mess with the GFP flags at
all. Just allocate using the same flags as the caller, and fail the original
allocation if the kmemleak allocation fails. Like this:

+++ b/mm/slab.h
@@ -435,12 +435,22 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
 	for (i = 0; i < size; i++) {
 		p[i] = kasan_slab_alloc(s, p[i], flags);
 		/* As p[i] might get tagged, call kmemleak hook after KASAN. */
-		kmemleak_alloc_recursive(p[i], s->object_size, 1,
-					 s->flags, flags);
+		if (kmemleak_alloc_recursive(p[i], s->object_size, 1,
+					 s->flags, flags))
+			goto fail;
 	}

 	if (memcg_kmem_enabled())
 		memcg_kmem_put_cache(s);
+	return;
+
+fail:
+	while (i > 0) {
+		kasan_blah(...);
+		kmemleak_blah();
+		i--;
+	}
+	free_blah(p);
+	*p = NULL;
 }

 #ifndef CONFIG_SLOB

and if we had something like this, we wouldn't need kmemleak to have this
self-disabling or must-succeed property.
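
The flag arithmetic behind the warning can be worked through in a few lines of userspace C. This is a sketch with assumptions stated loudly: the bit positions below are invented for illustration (the real ones are in include/linux/gfp.h), and the composite definitions of GFP_KERNEL, GFP_ATOMIC, and GFP_NOWAIT follow the layout as I understand it for kernels of that period. Only the gfp_kmemleak_mask() macro is taken from the message above; nofail_without_direct_reclaim() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t gfp_t;

/* Invented bit positions, for illustration only. */
#define __GFP_HIGH		(1u << 0)
#define __GFP_IO		(1u << 1)
#define __GFP_FS		(1u << 2)
#define __GFP_DIRECT_RECLAIM	(1u << 3)
#define __GFP_KSWAPD_RECLAIM	(1u << 4)
#define __GFP_ATOMIC		(1u << 5)
#define __GFP_NOWARN		(1u << 6)
#define __GFP_NORETRY		(1u << 7)
#define __GFP_NOMEMALLOC	(1u << 8)
#define __GFP_NOFAIL		(1u << 9)

/* Composite flags, assumed to match the layout of that era. */
#define __GFP_RECLAIM	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_ATOMIC	(__GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM)
#define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)

/* The mask as quoted in the thread. */
#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
				 __GFP_NOWARN | __GFP_NOFAIL)

/*
 * Hypothetical helper: "must not fail" combined with "must not sleep"
 * is the contradiction the page allocator complains about.
 */
static bool nofail_without_direct_reclaim(gfp_t gfp)
{
	return (gfp & __GFP_NOFAIL) && !(gfp & __GFP_DIRECT_RECLAIM);
}
```

Under these definitions a GFP_NOWAIT | __GFP_NOWARN caller keeps only __GFP_KSWAPD_RECLAIM through the AND, and the mask then adds __GFP_NOFAIL on top, so the metadata allocation asks for a guarantee that only direct reclaim could provide. A GFP_KERNEL caller keeps __GFP_DIRECT_RECLAIM and does not hit the contradiction.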
Re: [PATCH v3] kmemleak: survive in a low-memory situation
On Tue, 26 Mar 2019, Qian Cai wrote:

> +	if (!object) {
> +		/*
> +		 * The tracked memory was allocated successful, if the kmemleak
> +		 * object failed to allocate for some reasons, it ends up with
> +		 * the whole kmemleak disabled, so let it success at all cost.

"let it succeed at all costs"

> +		 */
> +		gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC :
> +		       gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> +		object = kmem_cache_alloc(object_cache, gfp);
> +	}
> +
>  	if (!object) {

If the alloc must succeed then this check is no longer necessary.
[PATCH v3] kmemleak: survive in a low-memory situation
Kmemleak could quickly fail to allocate an object structure and then
disable itself in a low-memory situation. For example, running a mmap()
workload triggering swapping and OOM. This is especially problematic for
running things like LTP testsuite where one OOM test case would disable
the whole kmemleak and render the rest of test cases without kmemleak
watching for leaking.

Kmemleak allocation could fail even though the tracked memory is
succeeded. Hence, it could still try to start a direct reclaim if it is
not executed in an atomic context (spinlock, irq-handler etc), or a
high-priority allocation in an atomic context as a last-ditch effort.
Since kmemleak is a debug feature, it is unlikely to be used in
production that memory resources is scarce where direct reclaim or
high-priority atomic allocations should not be granted lightly.

Unless there is a brave soul to reimplement the kmemleak to embed it's
metadata into the tracked memory itself in a foreseeable future, this
provides a good balance between enabling kmemleak in a low-memory
situation and not introducing too much hackiness into the existing
code for now.

Signed-off-by: Qian Cai
---

v3: Update the commit log.
    Simplify the code inspired by graph_trace_open() from ftrace.
v2: Remove the needless checking for NULL objects in slab_post_alloc_hook()
    per Catalin.

 mm/kmemleak.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index a2d894d3de07..239927166894 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -581,6 +581,17 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 	unsigned long untagged_ptr;

 	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
+	if (!object) {
+		/*
+		 * The tracked memory was allocated successful, if the kmemleak
+		 * object failed to allocate for some reasons, it ends up with
+		 * the whole kmemleak disabled, so let it success at all cost.
+		 */
+		gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC :
+		       gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
+		object = kmem_cache_alloc(object_cache, gfp);
+	}
+
 	if (!object) {
 		pr_warn("Cannot allocate a kmemleak_object structure\n");
 		kmemleak_disable();
-- 
2.17.2 (Apple Git-113)