Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Thursday 17 February 2005 08:38 pm, Badari Pulavarty wrote:
> > On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > > So it's probably an ndiswrapper bug?
> >
> > Andrew,
> > It looks like it is a kernel bug triggered by NdisWrapper. Without
> > NdisWrapper, and with just 8139too plus some light network activity the
> > size-64 grew from ~1100 to 4500 overnight. Is this normal? I will keep
> > it running to see where it goes.

[OT] Didn't want to keep this hanging - it turned out to be a strange
ndiswrapper bug. It seems that the other OS in question allows the following
without a leak ;) -

ptr = Allocate(...);
ptr = Allocate(...);
: repeat this a zillion times without ever fearing that 'ptr' will leak..

I sent a fix to the ndiswrapper-general mailing list on SourceForge, in case
anyone is using ndiswrapper and having a similar problem.

Parag
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Thu, 2005-02-17 at 05:00, Parag Warudkar wrote:
> On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > So it's probably an ndiswrapper bug?
> Andrew,
> It looks like it is a kernel bug triggered by NdisWrapper. Without
> NdisWrapper, and with just 8139too plus some light network activity the
> size-64 grew from ~1100 to 4500 overnight. Is this normal? I will keep it
> running to see where it goes.
>
> A question - is it safe to assume it is a kmalloc based leak? (I am thinking
> of tracking it down by using kprobes to insert a probe into __kmalloc and
> record the stack to see what is causing so many allocations.)

Last time I debugged something like this, I ended up adding dump_stack()
in kmem_cache_alloc() for the specific slab. If you are really interested,
you can try to get the following jprobe module working. (You need to teach
it about the kmem_cache_t structure to get it to compile, export the
kallsyms_lookup_name() symbol, etc.)

Thanks,
Badari

#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>
#include <linux/kdev_t.h>

MODULE_PARM_DESC(kmod, "\n");

int count = 0;

/* jprobe handler: fires on every kmem_cache_alloc(); dump a stack
 * trace every 100th allocation from the size-64 slab */
void fastcall inst_kmem_cache_alloc(kmem_cache_t *cachep, int flags)
{
	if (cachep->objsize == 64) {
		if (count++ == 100) {
			dump_stack();
			count = 0;
		}
	}
	jprobe_return();
}

static char *fn_names[] = {
	"kmem_cache_alloc",
};

static struct jprobe kmem_probes[] = {
	{
		.entry = (kprobe_opcode_t *) inst_kmem_cache_alloc,
		.kp.addr = (kprobe_opcode_t *) 0,
	}
};

#define MAX_KMEM_ROUTINE (sizeof(kmem_probes)/sizeof(struct jprobe))

/* installs the probes in the appropriate places */
static int init_kmods(void)
{
	int i;

	for (i = 0; i < MAX_KMEM_ROUTINE; i++) {
		kmem_probes[i].kp.addr =
			(kprobe_opcode_t *) kallsyms_lookup_name(fn_names[i]);
		if (kmem_probes[i].kp.addr) {
			printk("plant jprobe at name %s %p, handler addr %p\n",
				fn_names[i], kmem_probes[i].kp.addr,
				kmem_probes[i].entry);
			register_jprobe(&kmem_probes[i]);
		}
	}
	return 0;
}

static void cleanup_kmods(void)
{
	int i;

	for (i = 0; i < MAX_KMEM_ROUTINE; i++)
		unregister_jprobe(&kmem_probes[i]);
}

module_init(init_kmods);
module_exit(cleanup_kmods);
MODULE_LICENSE("GPL");
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Thu, 17 Feb 2005, Parag Warudkar wrote:
>
> A question - is it safe to assume it is a kmalloc based leak? (I am thinking
> of tracking it down by using kprobes to insert a probe into __kmalloc and
> record the stack to see what is causing so many allocations.)

It's definitely kmalloc-based, but you may not catch it in __kmalloc. The
"kmalloc()" function is actually an inline function which has some magic
compile-time code that statically determines when the size is constant and
can be turned into a direct call to "kmem_cache_alloc()" with the proper
cache descriptor.

So you'd need to either instrument kmem_cache_alloc() (and trigger on the
proper slab descriptor), or you would need to modify the kmalloc()
definition in <linux/slab.h> to not do the constant-size optimization, at
which point you can instrument just __kmalloc() and avoid some of the
overhead.

		Linus
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Wednesday 16 February 2005 10:48 pm, Horst von Brand wrote:
> Does x86_64 use up a (freeable) register for the frame pointer or not?
> I.e., does -fomit-frame-pointer have any effect on the generated code?

{Took Linus out of the loop as he probably isn't interested}

The generated code is different in the two cases, but for some reason gcc
has trouble with __builtin_return_address on x86-64. For example, with
gcc -fomit-frame-pointer, a function produces the following assembly:

method_1:
.LFB2:
	subq	$8, %rsp
.LCFI0:
	movl	$__FUNCTION__.0, %esi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	movl	$0, %eax
	addq	$8, %rsp
	ret

And with -fno-omit-frame-pointer, the same function yields:

method_1:
.LFB2:
	pushq	%rbp
.LCFI0:
	movq	%rsp, %rbp
.LCFI1:
	movl	$__FUNCTION__.0, %esi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	movl	$0, %eax
	leave
	ret
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> So it's probably an ndiswrapper bug?

Andrew,
It looks like it is a kernel bug triggered by NdisWrapper. Without
NdisWrapper, and with just 8139too plus some light network activity, the
size-64 grew from ~1100 to 4500 overnight. Is this normal? I will keep it
running to see where it goes.

A question - is it safe to assume it is a kmalloc based leak? (I am thinking
of tracking it down by using kprobes to insert a probe into __kmalloc and
record the stack to see what is causing so many allocations.)

Thanks
Parag
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
Andrew Morton <[EMAIL PROTECTED]> said:
> Parag Warudkar <[EMAIL PROTECTED]> wrote:
[...]
> > Is there a reason X86_64 doesnt have CONFIG_FRAME_POINTER anywhere in
> > the .config?
> No good reason, I suspect.

Does x86_64 use up a (freeable) register for the frame pointer or not?
I.e., does -fomit-frame-pointer have any effect on the generated code?
--
Dr. Horst H. von Brand                      User #22616 counter.li.org
Departamento de Informatica                 Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria          +56 32 654239
Casilla 110-V, Valparaiso, Chile            Fax:  +56 32 797513
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Wednesday 16 February 2005 06:51 pm, Andrew Morton wrote:
> 81002fe8 is the address of the slab object. 08a8 is
> supposed to be the caller's text address. It appears that
> __builtin_return_address(0) is returning junk. Perhaps due to
> -fomit-frame-pointer.

I tried manually removing -fomit-frame-pointer from the Makefile and adding
-fno-omit-frame-pointer, but with the same results - junk return addresses.
Probably an X86_64 issue.

> So it's probably an ndiswrapper bug?

I looked at the ndiswrapper mailing lists and found this explanation for the
same issue of growing size-64 with ndiswrapper -

--
"It looks like the problem is kernel-version related, not ndiswrapper.
ndiswrapper just uses some API that starts the memory leak but the problem
is indeed in the kernel itself. versions from 2.6.10 up to .11-rc3 have this
problem afaik. haven't tested rc4 but maybe this one doesn't have the
problem anymore, we will see"
--

I tested -rc4 and it has the problem too. Moreover, with the plain old
8139too driver the slab still continues to grow, albeit slowly. So there is
reason to suspect a kernel leak as well. I will try binary searching...

Parag
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > Plenty of moisture there.
> >
> > Could you please use this patch? Make sure that you enable
> > CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> > but let's be sure). Also enable CONFIG_DEBUG_SLAB.
>
> Will try that out. For now I tried -rc4 and a couple of other things -
> removing the nvidia module doesn't make any difference, but removing
> ndiswrapper, with no networking, stops the slab growth. With the 8139too
> driver and network the growth is there but much slower than with
> ndiswrapper. With 8139too + some network activity the slab seems to shrink
> sometimes.

OK.

> Seems either an ndiswrapper or a networking related leak. Will report the
> results with Manfred's patch tomorrow.

So it's probably an ndiswrapper bug?
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > echo "size-4096 0 0 0" > /proc/slabinfo
>
> Is there a reason X86_64 doesnt have CONFIG_FRAME_POINTER anywhere in
> the .config?

No good reason, I suspect.

> I tried -rc4 with Manfred's patch and with CONFIG_DEBUG_SLAB and
> CONFIG_DEBUG.

Thanks.

> I get the following output from
> echo "size-64 0 0 0" > /proc/slabinfo
>
> obj 81002fe8/0: 08a8 <0x8a8>
> obj 81002fe8/1: 08a8 <0x8a8>
> obj 81002fe8/2: 08a8 <0x8a8>
> : 3
> : 4
> : :
> obj 81002fe8/43: 08a8 <0x8a8>
> obj 81002fe8/44: 08a8 <0x8a8>
>
> How do I know what is at 81002fe8? I tried the normal tricks (gdb
> -c /proc/kcore vmlinux, objdump -d etc.) but none of the places list this
> address.

81002fe8 is the address of the slab object. 08a8 is supposed to be the
caller's text address. It appears that __builtin_return_address(0) is
returning junk. Perhaps due to -fomit-frame-pointer.
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> Plenty of moisture there.
>
> Could you please use this patch? Make sure that you enable
> CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> but let's be sure). Also enable CONFIG_DEBUG_SLAB.

Will try that out. For now I tried -rc4 and a couple of other things -
removing the nvidia module doesn't make any difference, but removing
ndiswrapper, with no networking, stops the slab growth. With the 8139too
driver and network the growth is there but much slower than with
ndiswrapper. With 8139too + some network activity the slab seems to shrink
sometimes.

Seems either an ndiswrapper or a networking related leak. Will report the
results with Manfred's patch tomorrow.

Thanks
Parag
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after
> use, mainly due to growing swap use. It has 768M of RAM and a Gig of swap.
> After following this thread, I started monitoring /proc/slabinfo. It seems
> size-64 is continuously growing, and doing a compile run seems to make it
> grow noticeably faster. After a day's uptime the size-64 line in
> /proc/slabinfo looks like
>
> size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0

Plenty of moisture there.

Could you please use this patch? Make sure that you enable
CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
but let's be sure). Also enable CONFIG_DEBUG_SLAB.

From: Manfred Spraul <[EMAIL PROTECTED]>

With the patch applied,

	echo "size-4096 0 0 0" > /proc/slabinfo

walks the objects in the size-4096 slab, printing out the calling address
of whoever allocated that object. It is for leak detection.

diff -puN mm/slab.c~slab-leak-detector mm/slab.c
--- 25/mm/slab.c~slab-leak-detector	2005-02-15 21:06:44.0 -0800
+++ 25-akpm/mm/slab.c	2005-02-15 21:06:44.0 -0800
@@ -2116,6 +2116,15 @@ cache_alloc_debugcheck_after(kmem_cache_
 		*dbg_redzone1(cachep, objp) = RED_ACTIVE;
 		*dbg_redzone2(cachep, objp) = RED_ACTIVE;
 	}
+	{
+		int objnr;
+		struct slab *slabp;
+
+		slabp = GET_PAGE_SLAB(virt_to_page(objp));
+
+		objnr = (objp - slabp->s_mem) / cachep->objsize;
+		slab_bufctl(slabp)[objnr] = (unsigned long)caller;
+	}
 	objp += obj_dbghead(cachep);
 	if (cachep->ctor && cachep->flags & SLAB_POISON) {
 		unsigned long ctor_flags = SLAB_CTOR_CONSTRUCTOR;
@@ -2179,12 +2188,14 @@ static void free_block(kmem_cache_t *cac
 		objnr = (objp - slabp->s_mem) / cachep->objsize;
 		check_slabp(cachep, slabp);
 #if DEBUG
+#if 0
 		if (slab_bufctl(slabp)[objnr] != BUFCTL_FREE) {
 			printk(KERN_ERR "slab: double free detected in cache '%s', objp %p.\n",
 						cachep->name, objp);
 			BUG();
 		}
 #endif
+#endif
 		slab_bufctl(slabp)[objnr] = slabp->free;
 		slabp->free = objnr;
 		STATS_DEC_ACTIVE(cachep);
@@ -2998,6 +3009,29 @@ struct seq_operations slabinfo_op = {
 	.show = s_show,
 };

+static void do_dump_slabp(kmem_cache_t *cachep)
+{
+#if DEBUG
+	struct list_head *q;
+
+	check_irq_on();
+	spin_lock_irq(&cachep->spinlock);
+	list_for_each(q, &cachep->lists.slabs_full) {
+		struct slab *slabp;
+		int i;
+		slabp = list_entry(q, struct slab, list);
+		for (i = 0; i < cachep->num; i++) {
+			unsigned long sym = slab_bufctl(slabp)[i];
+
+			printk("obj %p/%d: %p", slabp, i, (void *)sym);
+			print_symbol(" <%s>", sym);
+			printk("\n");
+		}
+	}
+	spin_unlock_irq(&cachep->spinlock);
+#endif
+}
+
 #define MAX_SLABINFO_WRITE 128
 /**
  * slabinfo_write - Tuning for the slab allocator
@@ -3038,9 +3072,11 @@ ssize_t slabinfo_write(struct file *file
 			    batchcount < 1 ||
 			    batchcount > limit ||
 			    shared < 0) {
-				res = -EINVAL;
+				do_dump_slabp(cachep);
+				res = 0;
 			} else {
-				res = do_tune_cpucache(cachep, limit, batchcount, shared);
+				res = do_tune_cpucache(cachep, limit,
+							batchcount, shared);
 			}
 			break;
 		}
_
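Once the patch is applied and the `obj .../N: addr <symbol>` lines land in the kernel log, the dominant allocation site can be found by tallying the dump per caller address. A sketch of such a pipeline; the sample lines below are illustrative, not real output:

```shell
# Tally leak-detector output by caller address (field 3) and rank the
# callers by allocation count; the heredoc stands in for `dmesg` output.
awk '/^obj / { count[$3]++ }
     END { for (a in count) print count[a], a }' <<'EOF' | sort -rn
obj 81002fe8/0: 08a8 <0x8a8>
obj 81002fe8/1: 08a8 <0x8a8>
obj 81002fe8/2: 0b12 <0xb12>
EOF
```

With the sample input, the caller seen twice (`08a8`) sorts to the top.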
-rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after
use, mainly due to growing swap use. It has 768M of RAM and a Gig of swap.
After following this thread, I started monitoring /proc/slabinfo. It seems
size-64 is continuously growing, and doing a compile run seems to make it
grow noticeably faster. After a day's uptime the size-64 line in
/proc/slabinfo looks like

size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0

Since this doesn't seem to be bio, I think we have another slab leak
somewhere. The box recently went OOM during a gcc compile run after I killed
the swap. Output from free, the OOM killer, and /proc/slabinfo is down
below.

free output -

             total       used       free     shared    buffers     cached
Mem:        767996     758120       9876          0       5276     130360
-/+ buffers/cache:      622484     145512
Swap:      1052248      67668     984580

OOM killer output -

oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty
Free pages: 7260kB (0kB HighMem)
Active:62385 inactive:850 dirty:0 writeback:0 unstable:0 free:1815
slab:120136 mapped:62334 pagetables:2110
DMA free:3076kB min:72kB low:88kB high:108kB active:3328kB inactive:0kB
present:16384kB pages_scanned:4446 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:4184kB min:3468kB low:4332kB high:5200kB active:246212kB
inactive:3400kB present:769472kB pages_scanned:3834 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB
1*2048kB 0*4096kB = 3076kB
Normal: 170*4kB 10*8kB 2*16kB 0*32kB 1*64kB 0*128kB 1*256kB 2*512kB
0*1024kB 1*2048kB 0*4096kB = 4184kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4898 (klauncher).

oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty
Free pages: 7020kB (0kB HighMem)
Active:62308 inactive:648 dirty:0 writeback:0 unstable:0 free:1755
slab:120439 mapped:62199 pagetables:2020
DMA free:3076kB min:72kB low:88kB high:108kB active:3336kB inactive:0kB
present:16384kB pages_scanned:7087 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:3944kB min:3468kB low:4332kB high:5200kB active:245896kB
inactive:2592kB present:769472kB pages_scanned:3861 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB
1*2048kB 0*4096kB = 3076kB
Normal: 112*4kB 9*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 2*512kB
0*1024kB 1*2048kB 0*4096kB = 3944kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4918 (kwin).
/proc/slabinfo output -

ipx_sock               0      0    896    4    1 : tunables   54   27    0 : slabdata      0      0      0
scsi_cmd_cache         3      7    576    7    1 : tunables   54   27    0 : slabdata      1      1      0
ip_fib_alias          10    119     32  119    1 : tunables  120   60    0 : slabdata      1      1      0
ip_fib_hash           10     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    0 : slabdata     32     32      0
sgpool-64             32     32   2048    2    1 : tunables   24   12    0 : slabdata     16     16      0
sgpool-32             32     32   1024    4    1 : tunables   54   27    0 : slabdata      8      8      0
sgpool-16             32     32    512    8    1 : tunables   54   27    0 : slabdata      4      4      0
sgpool-8              32     45    256   15    1 : tunables  120   60    0 : slabdata      3      3      0
ext3_inode_cache    2805   3063   1224    3    1 : tunables   24   12    0 : slabdata   1021   1021      0
ext3_xattr             0      0     88   45    1 : tunables  120   60    0 : slabdata      0      0      0
journal_handle        16    156     24  156    1 : tunables  120   60    0 : slabdata      1      1      0
journal_head          49    180     88   45    1 : tunables  120   60    0 : slabdata      4      4      0
revoke_table           6    225     16  225    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_record          0      0     32  119    1 :
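The size-64 growth reported above can be tracked with a one-liner that pulls the slab's line out of /proc/slabinfo and converts object count to memory (active_objs * objsize). A sketch; the echoed sample line stands in for the real file so the pipeline can be tried anywhere:

```shell
# Report object count and approximate memory held by the size-64 slab.
# Field 2 is active_objs, field 3 is objsize in bytes (2.6 slabinfo format).
echo "size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0" |
awk '$1 == "size-64" {
	printf "%s: %d objects, ~%d kB\n", $1, $2, $2 * $3 / 1024
}'
```

On a live box, replace the `echo` with `cat /proc/slabinfo` and run it periodically to watch the trend.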
-rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after use, mainly due to growing swap use. It has 768M of RAM and a gig of swap.

After following this thread, I started monitoring /proc/slabinfo. It seems size-64 is continuously growing, and doing a compile run seems to make it grow noticeably faster. After a day's uptime the size-64 line in /proc/slabinfo looks like

size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0

Since this doesn't seem to be bio, I think we have another slab leak somewhere. The box recently went OOM during a gcc compile run after I killed the swap. Output from free, the OOM killer, and /proc/slabinfo is down below.

free output -

             total       used       free     shared    buffers     cached
Mem:        767996     758120       9876          0       5276     130360
-/+ buffers/cache:     622484     145512
Swap:      1052248      67668     984580

OOM Killer output -

oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 7260kB (0kB HighMem)
Active:62385 inactive:850 dirty:0 writeback:0 unstable:0 free:1815 slab:120136 mapped:62334 pagetables:2110
DMA free:3076kB min:72kB low:88kB high:108kB active:3328kB inactive:0kB present:16384kB pages_scanned:4446 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:4184kB min:3468kB low:4332kB high:5200kB active:246212kB inactive:3400kB present:769472kB pages_scanned:3834 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3076kB
Normal: 170*4kB 10*8kB 2*16kB 0*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 4184kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap = 0kB
Total swap = 0kB
Out of Memory: Killed process 4898 (klauncher).

oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 7020kB (0kB HighMem)
Active:62308 inactive:648 dirty:0 writeback:0 unstable:0 free:1755 slab:120439 mapped:62199 pagetables:2020
DMA free:3076kB min:72kB low:88kB high:108kB active:3336kB inactive:0kB present:16384kB pages_scanned:7087 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:3944kB min:3468kB low:4332kB high:5200kB active:245896kB inactive:2592kB present:769472kB pages_scanned:3861 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3076kB
Normal: 112*4kB 9*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 3944kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap = 0kB
Total swap = 0kB
Out of Memory: Killed process 4918 (kwin).
/proc/slabinfo output -

ipx_sock               0      0    896    4    1 : tunables   54   27    0 : slabdata      0      0      0
scsi_cmd_cache         3      7    576    7    1 : tunables   54   27    0 : slabdata      1      1      0
ip_fib_alias          10    119     32  119    1 : tunables  120   60    0 : slabdata      1      1      0
ip_fib_hash           10     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    0 : slabdata     32     32      0
sgpool-64             32     32   2048    2    1 : tunables   24   12    0 : slabdata     16     16      0
sgpool-32             32     32   1024    4    1 : tunables   54   27    0 : slabdata      8      8      0
sgpool-16             32     32    512    8    1 : tunables   54   27    0 : slabdata      4      4      0
sgpool-8              32     45    256   15    1 : tunables  120   60    0 : slabdata      3      3      0
ext3_inode_cache    2805   3063   1224    3    1 : tunables   24   12    0 : slabdata   1021   1021      0
ext3_xattr             0      0     88   45    1 : tunables  120   60    0 : slabdata      0      0      0
journal_handle        16    156     24  156    1 : tunables  120   60    0 : slabdata      1      1      0
journal_head          49    180     88   45    1 : tunables  120   60    0 : slabdata      4      4      0
revoke_table           6    225     16  225    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_record          0      0     32  119    1 :
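As a quick cross-check (a sketch, not part of the original report): each /proc/slabinfo line carries name, active objects, total objects, and object size, so the cache's footprint can be estimated with awk. The first line below is the size-64 line quoted above; the 120439 slab page count comes from the second OOM report.

```shell
# Estimate the size-64 cache footprint from its /proc/slabinfo line
# (fields: name active_objs num_objs objsize objperslab pagesperslab ...).
line='size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0'
echo "$line" | awk '{ printf "%s: %d objs x %d B = %d MB\n", $1, $3, $4, int($3 * $4 / 1048576) }'

# Compare with the OOM killer's total slab page count (4 kB pages):
awk 'BEGIN { printf "OOM report slab pages: %d MB\n", int(120439 * 4096 / 1048576) }'
```

That puts size-64 at roughly 440 MB of the roughly 470 MB of slab in the OOM report, which is consistent with blaming this one cache for the leak.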
Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]
Parag Warudkar <[EMAIL PROTECTED]> wrote:

> I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish
> after use, mainly due to growing swap use. It has 768M of RAM and a gig
> of swap. After following this thread, I started monitoring
> /proc/slabinfo. It seems size-64 is continuously growing, and doing a
> compile run seems to make it grow noticeably faster. After a day's
> uptime the size-64 line in /proc/slabinfo looks like
>
> size-64 7216543 7216544 64 61 1 : tunables 120 60 0 : slabdata 118304 118304 0

Plenty of moisture there.

Could you please use this patch? Make sure that you enable CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0), but let's be sure). Also enable CONFIG_DEBUG_SLAB.

From: Manfred Spraul <[EMAIL PROTECTED]>

With the patch applied,

	echo "size-4096 0 0 0" > /proc/slabinfo

walks the objects in the size-4096 slab, printing out the calling address of whoever allocated that object. It is for leak detection.

diff -puN mm/slab.c~slab-leak-detector mm/slab.c
--- 25/mm/slab.c~slab-leak-detector	2005-02-15 21:06:44.000000000 -0800
+++ 25-akpm/mm/slab.c	2005-02-15 21:06:44.000000000 -0800
@@ -2116,6 +2116,15 @@ cache_alloc_debugcheck_after(kmem_cache_
 		*dbg_redzone1(cachep, objp) = RED_ACTIVE;
 		*dbg_redzone2(cachep, objp) = RED_ACTIVE;
 	}
+	{
+		int objnr;
+		struct slab *slabp;
+
+		slabp = GET_PAGE_SLAB(virt_to_page(objp));
+
+		objnr = (objp - slabp->s_mem) / cachep->objsize;
+		slab_bufctl(slabp)[objnr] = (unsigned long)caller;
+	}
 	objp += obj_dbghead(cachep);
 	if (cachep->ctor && cachep->flags & SLAB_POISON) {
 		unsigned long ctor_flags = SLAB_CTOR_CONSTRUCTOR;
@@ -2179,12 +2188,14 @@ static void free_block(kmem_cache_t *cac
 		objnr = (objp - slabp->s_mem) / cachep->objsize;
 		check_slabp(cachep, slabp);
 #if DEBUG
+#if 0
 		if (slab_bufctl(slabp)[objnr] != BUFCTL_FREE) {
 			printk(KERN_ERR "slab: double free detected in cache '%s', objp %p.\n",
 				cachep->name, objp);
 			BUG();
 		}
 #endif
+#endif
 		slab_bufctl(slabp)[objnr] = slabp->free;
 		slabp->free = objnr;
 		STATS_DEC_ACTIVE(cachep);
@@ -2998,6 +3009,29 @@ struct seq_operations slabinfo_op = {
 	.show = s_show,
 };
 
+static void do_dump_slabp(kmem_cache_t *cachep)
+{
+#if DEBUG
+	struct list_head *q;
+
+	check_irq_on();
+	spin_lock_irq(&cachep->spinlock);
+	list_for_each(q, &cachep->lists.slabs_full) {
+		struct slab *slabp;
+		int i;
+		slabp = list_entry(q, struct slab, list);
+		for (i = 0; i < cachep->num; i++) {
+			unsigned long sym = slab_bufctl(slabp)[i];
+
+			printk("obj %p/%d: %p", slabp, i, (void *)sym);
+			print_symbol(" %s", sym);
+			printk("\n");
+		}
+	}
+	spin_unlock_irq(&cachep->spinlock);
+#endif
+}
+
 #define MAX_SLABINFO_WRITE 128
 /**
  * slabinfo_write - Tuning for the slab allocator
@@ -3038,9 +3072,11 @@ ssize_t slabinfo_write(struct file *file
 		    batchcount < 1 || batchcount > limit || shared < 0) {
-			res = -EINVAL;
+			do_dump_slabp(cachep);
+			res = 0;
 		} else {
-			res = do_tune_cpucache(cachep, limit, batchcount, shared);
+			res = do_tune_cpucache(cachep, limit,
+						batchcount, shared);
 		}
 		break;
 	}
_
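A plausible way to digest the dump this patch produces (a sketch, not from the original mail; it assumes the patch is applied and CONFIG_DEBUG_SLAB is on): trigger the per-object dump by writing bogus tunables for the leaking cache, then count objects per allocation site from the kernel log. The dmesg lines below are fabricated stand-ins for the real printk output, and the symbol names are hypothetical.

```shell
# With the patch applied, dump every live object in the leaking cache:
#   echo "size-64 0 0 0" > /proc/slabinfo
# Each printed line looks like "obj <slabp>/<idx>: <addr> <symbol>", so
# ranking the caller addresses shows which allocation site owns the most
# objects. Simulated dmesg input stands in for the real log here:
printf '%s\n' \
    'obj c1000000/0: c0123456 hypothetical_caller+0x10/0x80' \
    'obj c1000000/1: c0123456 hypothetical_caller+0x10/0x80' \
    'obj c1000000/2: c0999999 other_caller+0x4/0x40' |
awk '{ print $3, $4 }' | sort | uniq -c | sort -rn
```

The top line of the output names the caller that holds the most objects, which for a leak like the ndiswrapper one above points straight at the offending Allocate() site.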