Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sun, Apr 07, 2019 at 07:35:34PM -1000, Linus Torvalds wrote: > On Sat, Apr 6, 2019 at 12:59 PM Qian Cai wrote: > > > > The commit 510ded33e075 ("slab: implement slab_root_caches list") > > changes the name of the list node within "struct kmem_cache" from > > "list" to "root_caches_node", but leaks_show() still use the "list" > > which causes a crash when reading /proc/slab_allocators. > > The patch does seem to be correct, and I have applied it. > > However, it does strike me that apparently this wasn't caught for two > years. Which makes me wonder whether we should (once again) discuss > just removing SLAB entirely, or at least removing the > /proc/slab_allocators file. Apparently it has never been used in the > last two years. At some point a "this can't have worked if anybody > ever tried to use it" situation means that the code should likely be > excised. The bug doesn't trigger on every read of /proc/slab_allocators (as noted later in this thread by Qian). I tried to repro it with a bunch of different configs and often `cat /proc/slab_allocators` just returns empty. I've got a patchset ready to go sitting in my tree that removes SLAB, I could just send it to start the conversation :) Tobin
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sun, 7 Apr 2019, Linus Torvalds wrote: > On Sat, Apr 6, 2019 at 12:59 PM Qian Cai wrote: > > > > The commit 510ded33e075 ("slab: implement slab_root_caches list") > > changes the name of the list node within "struct kmem_cache" from > > "list" to "root_caches_node", but leaks_show() still use the "list" > > which causes a crash when reading /proc/slab_allocators. > > The patch does seem to be correct, and I have applied it. > > However, it does strike me that apparently this wasn't caught for two > years. Which makes me wonder whether we should (once again) discuss > just removing SLAB entirely, or at least removing the > /proc/slab_allocators file. Apparently it has never been used in the > last two years. At some point a "this can't have worked if anybody > ever tried to use it" situation means that the code should likely be > excised. This is only occurring with specially build kernels so that memory leaks can be investigated. The same is done with other tools (kasan and friends) today I guess and also the SLUB debugging tools are much more user friendly. So this means that some esoteric debugging feature of SLAB was broken.
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sun, 2019-04-07 at 19:35 -1000, Linus Torvalds wrote: > On Sat, Apr 6, 2019 at 12:59 PM Qian Cai wrote: > > > > The commit 510ded33e075 ("slab: implement slab_root_caches list") > > changes the name of the list node within "struct kmem_cache" from > > "list" to "root_caches_node", but leaks_show() still use the "list" > > which causes a crash when reading /proc/slab_allocators. > > The patch does seem to be correct, and I have applied it. > > However, it does strike me that apparently this wasn't caught for two > years. Which makes me wonder whether we should (once again) discuss > just removing SLAB entirely, or at least removing the > /proc/slab_allocators file. Apparently it has never been used in the > last two years. At some point a "this can't have worked if anybody > ever tried to use it" situation means that the code should likely be > excised. > > Qian, how did you end up noticing and debugging this? There are some nice texts for CONFIG_SLAB Kconfig written in 2007, "The regular slab allocator that is established and known to work well in all environments." "tricked" me into enabling it in a debug kernel for running testing where LTP proc01 test case (read all files in procfs) would usually trigger the crash (Sometimes, "cat /proc/slab_allocators" would just end up printing nothing). Normally, all those debug kernels would use CONFIG_KASAN which would set CONFIG_DEBUG_SLAB=n. However, there is no KASAN for powerpc yet, so it selects CONFIG_DEBUG_SLAB=y there, and then the testing found the issue.
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sat, Apr 6, 2019 at 12:59 PM Qian Cai wrote: > > The commit 510ded33e075 ("slab: implement slab_root_caches list") > changes the name of the list node within "struct kmem_cache" from > "list" to "root_caches_node", but leaks_show() still use the "list" > which causes a crash when reading /proc/slab_allocators. The patch does seem to be correct, and I have applied it. However, it does strike me that apparently this wasn't caught for two years. Which makes me wonder whether we should (once again) discuss just removing SLAB entirely, or at least removing the /proc/slab_allocators file. Apparently it has never been used in the last two years. At some point a "this can't have worked if anybody ever tried to use it" situation means that the code should likely be excised. Qian, how did you end up noticing and debugging this? Linus
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote: > The commit 510ded33e075 ("slab: implement slab_root_caches list") > changes the name of the list node within "struct kmem_cache" from > "list" to "root_caches_node", but leaks_show() still use the "list" > which causes a crash when reading /proc/slab_allocators. > > BUG: unable to handle kernel NULL pointer dereference at > 00aa > PGD 0 P4D 0 > Oops: [#1] SMP DEBUG_PAGEALLOC PTI > CPU: 3 PID: 5925 Comm: ldd Not tainted 5.1.0-rc3-mm1+ #6 > RIP: 0010:__lock_acquire.isra.14+0x4b4/0xa50 > Call Trace: > > lock_acquire+0xa3/0x180 > _raw_spin_lock+0x2f/0x40 > do_drain+0x61/0xc0 > flush_smp_call_function_queue+0x3a/0x110 > generic_smp_call_function_single_interrupt+0x13/0x2b > smp_call_function_interrupt+0x66/0x1a0 > call_function_interrupt+0xf/0x20 > > RIP: 0010:__tlb_remove_page_size+0x8c/0xe0 > zap_pte_range+0x39f/0xc80 > unmap_page_range+0x38a/0x550 > unmap_single_vma+0x7d/0xe0 > unmap_vmas+0xae/0xd0 > exit_mmap+0xae/0x190 > mmput+0x7a/0x150 > do_exit+0x2d9/0xd40 > do_group_exit+0x41/0xd0 > __x64_sys_exit_group+0x18/0x20 > do_syscall_64+0x68/0x381 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > Fixes: 510ded33e075 ("slab: implement slab_root_caches list") > Signed-off-by: Qian Cai > --- > mm/slab.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/slab.c b/mm/slab.c > index 46a6e084222b..9142ee992493 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -4307,7 +4307,8 @@ static void show_symbol(struct seq_file *m, unsigned > long address) > > static int leaks_show(struct seq_file *m, void *p) > { > - struct kmem_cache *cachep = list_entry(p, struct kmem_cache, list); > + struct kmem_cache *cachep = list_entry(p, struct kmem_cache, > +root_caches_node); > struct page *page; > struct kmem_cache_node *n; > const char *name; > -- > 2.17.2 (Apple Git-113) > For what its worth Reviewed-by: Tobin C. Harding thanks, Tobin.
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On 4/7/19 9:59 PM, Tobin C. Harding wrote: > On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote: >> The commit 510ded33e075 ("slab: implement slab_root_caches list") >> changes the name of the list node within "struct kmem_cache" from >> "list" to "root_caches_node" > > Are you sure? It looks to me like it adds a member to the memcg_cache_array > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index a0cc7a77cda2..af1a5bef80f4 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -556,6 +556,8 @@ struct memcg_cache_array { > * used to index child cachces during allocation and cleared > * early during shutdown. > * > + * @root_caches_node: List node for slab_root_caches list. > + * > * @children: List of all child caches. While the child caches are also > * reachable through @memcg_caches, a child cache remains on > * this list until it is actually destroyed. > @@ -573,6 +575,7 @@ struct memcg_cache_params { > union { > struct { > struct memcg_cache_array __rcu *memcg_caches; > + struct list_head __root_caches_node; > struct list_head children; > }; > > And then defines 'root_caches_node' to be 'memcg_params.__root_caches_node' > if we have CONFIG_MEMCG otherwise defines 'root_caches_node' to be 'list' > > >> but leaks_show() still use the "list" > > I believe it should since 'list' is used to add to slab_caches list. See the offensive commit 510ded33e075 which changed those. @@ -1136,12 +1146,12 @@ static void print_slabinfo_header(struct seq_file *m) void *slab_start(struct seq_file *m, loff_t *pos) { mutex_lock(_mutex); - return seq_list_start(_caches, *pos); + return seq_list_start(_root_caches, *pos); } void *slab_next(struct seq_file *m, void *p, loff_t *pos) { - return seq_list_next(p, _caches, pos); + return seq_list_next(p, _root_caches, pos); } and then memcg_link_cache() does, if (is_root_cache(s)) { list_add(>root_caches_node, _root_caches); memcg_unlink_cache() does, if (is_root_cache(s)) { list_del(>root_caches_node); It also changed /proc/slabinfo but forgot to change /proc/slab_allocators. @@ -1193,12 +1203,11 @@ static void cache_show(struct kmem_cache *s, struct seq_file *m) static int slab_show(struct seq_file *m, void *p) { - struct kmem_cache *s = list_entry(p, struct kmem_cache, list); + struct kmem_cache *s = list_entry(p, struct kmem_cache, root_caches_node); > >> which causes a crash when reading /proc/slab_allocators. > > I was unable to reproduce this crash, I built with > > # CONFIG_MEMCG is not set > CONFIG_SLAB=y > CONFIG_SLAB_MERGE_DEFAULT=y > CONFIG_SLAB_FREELIST_RANDOM=y > CONFIG_DEBUG_SLAB=y > CONFIG_DEBUG_SLAB_LEAK=y > > I then booted in Qemu and successfully ran > $ cat slab_allocators > > Perhaps you could post your config? Yes, it won't be reproducible without CONFIG_MEMCG=y, because it has, /* If !memcg, all caches are root. */ #define slab_root_caches slab_caches #define root_caches_node list Anyway, https://git.sr.ht/~cai/linux-debug/blob/master/config
Re: [PATCH] slab: fix a crash by reading /proc/slab_allocators
On Sat, Apr 06, 2019 at 06:59:01PM -0400, Qian Cai wrote: > The commit 510ded33e075 ("slab: implement slab_root_caches list") > changes the name of the list node within "struct kmem_cache" from > "list" to "root_caches_node" Are you sure? It looks to me like it adds a member to the memcg_cache_array diff --git a/include/linux/slab.h b/include/linux/slab.h index a0cc7a77cda2..af1a5bef80f4 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -556,6 +556,8 @@ struct memcg_cache_array { * used to index child cachces during allocation and cleared * early during shutdown. * + * @root_caches_node: List node for slab_root_caches list. + * * @children: List of all child caches. While the child caches are also * reachable through @memcg_caches, a child cache remains on * this list until it is actually destroyed. @@ -573,6 +575,7 @@ struct memcg_cache_params { union { struct { struct memcg_cache_array __rcu *memcg_caches; + struct list_head __root_caches_node; struct list_head children; }; And then defines 'root_caches_node' to be 'memcg_params.__root_caches_node' if we have CONFIG_MEMCG otherwise defines 'root_caches_node' to be 'list' > but leaks_show() still use the "list" I believe it should since 'list' is used to add to slab_caches list. > which causes a crash when reading /proc/slab_allocators. I was unable to reproduce this crash, I built with # CONFIG_MEMCG is not set CONFIG_SLAB=y CONFIG_SLAB_MERGE_DEFAULT=y CONFIG_SLAB_FREELIST_RANDOM=y CONFIG_DEBUG_SLAB=y CONFIG_DEBUG_SLAB_LEAK=y I then booted in Qemu and successfully ran $ cat slab_allocators Perhaps you could post your config? Hope this helps, Tobin.
[PATCH] slab: fix a crash by reading /proc/slab_allocators
The commit 510ded33e075 ("slab: implement slab_root_caches list") changes the name of the list node within "struct kmem_cache" from "list" to "root_caches_node", but leaks_show() still use the "list" which causes a crash when reading /proc/slab_allocators. BUG: unable to handle kernel NULL pointer dereference at 00aa PGD 0 P4D 0 Oops: [#1] SMP DEBUG_PAGEALLOC PTI CPU: 3 PID: 5925 Comm: ldd Not tainted 5.1.0-rc3-mm1+ #6 RIP: 0010:__lock_acquire.isra.14+0x4b4/0xa50 Call Trace: lock_acquire+0xa3/0x180 _raw_spin_lock+0x2f/0x40 do_drain+0x61/0xc0 flush_smp_call_function_queue+0x3a/0x110 generic_smp_call_function_single_interrupt+0x13/0x2b smp_call_function_interrupt+0x66/0x1a0 call_function_interrupt+0xf/0x20 RIP: 0010:__tlb_remove_page_size+0x8c/0xe0 zap_pte_range+0x39f/0xc80 unmap_page_range+0x38a/0x550 unmap_single_vma+0x7d/0xe0 unmap_vmas+0xae/0xd0 exit_mmap+0xae/0x190 mmput+0x7a/0x150 do_exit+0x2d9/0xd40 do_group_exit+0x41/0xd0 __x64_sys_exit_group+0x18/0x20 do_syscall_64+0x68/0x381 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 510ded33e075 ("slab: implement slab_root_caches list") Signed-off-by: Qian Cai --- mm/slab.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/slab.c b/mm/slab.c index 46a6e084222b..9142ee992493 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -4307,7 +4307,8 @@ static void show_symbol(struct seq_file *m, unsigned long address) static int leaks_show(struct seq_file *m, void *p) { - struct kmem_cache *cachep = list_entry(p, struct kmem_cache, list); + struct kmem_cache *cachep = list_entry(p, struct kmem_cache, + root_caches_node); struct page *page; struct kmem_cache_node *n; const char *name; -- 2.17.2 (Apple Git-113)