Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-20 Thread Parag Warudkar
On Thursday 17 February 2005 08:38 pm, Badari Pulavarty wrote:
> > On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > > So it's probably an ndiswrapper bug?
> >
> > Andrew,
> > It looks like it is a kernel bug triggered by NdisWrapper. Without
> > NdisWrapper, and with just 8139too plus some light network activity the
> > size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep
> > it running to see where it goes.

[OT]

Didn't want to leave this hanging - it turned out to be a strange ndiswrapper 
bug. It seems that the other OS in question allows the following without a 
leak ;) -
ptr = Allocate(...);
ptr = Allocate(...);
:
repeat this a zillion times without ever fearing that 'ptr' will leak..

I sent a fix to the ndiswrapper-general mailing list on SourceForge, in case 
anyone is using ndiswrapper and hitting a similar problem.

Parag
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-17 Thread Badari Pulavarty
On Thu, 2005-02-17 at 05:00, Parag Warudkar wrote:
> On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> > So it's probably an ndiswrapper bug?
> Andrew, 
> It looks like it is a kernel bug triggered by NdisWrapper. Without 
> NdisWrapper, and with just 8139too plus some light network activity the 
> size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep it 
> running to see where it goes.
> 
> A question - is it safe to assume it is  a kmalloc based leak? (I am thinking 
> of tracking it down by using kprobes to insert a probe into __kmalloc and 
> record the stack to see what is causing so many allocations.)
> 

Last time I debugged something like this, I ended up adding dump_stack()
in kmem_cache_alloc() for the specific slab.

If you are really interested, you can try to get the following jprobe
module working. (You need to teach it about the kmem_cache_t structure to
get it to compile, export the kallsyms_lookup_name() symbol, etc.)

Thanks,
Badari



#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>
#include <linux/kdev_t.h>

MODULE_PARM_DESC(kmod, "\n");

int count = 0;
void fastcall inst_kmem_cache_alloc(kmem_cache_t *cachep, int flags)
{
	if (cachep->objsize == 64) {
		if (count++ == 100) {
			dump_stack();
			count = 0;
		}
	}
	jprobe_return();
}
static char *fn_names[] = {
	"kmem_cache_alloc",
};

static struct jprobe kmem_probes[] = {
	{
		.entry = (kprobe_opcode_t *) inst_kmem_cache_alloc,
		.kp.addr = (kprobe_opcode_t *) 0,
	}
};

#define MAX_KMEM_ROUTINE (sizeof(kmem_probes)/sizeof(struct jprobe))

/* installs the probes in the appropriate places */
static int init_kmods(void)
{
	int i;

	for (i = 0; i < MAX_KMEM_ROUTINE; i++) {
		kmem_probes[i].kp.addr =
			(kprobe_opcode_t *) kallsyms_lookup_name(fn_names[i]);
		if (kmem_probes[i].kp.addr) {
			printk("plant jprobe at name %s %p, handler addr %p\n",
			       fn_names[i], kmem_probes[i].kp.addr,
			       kmem_probes[i].entry);
			register_jprobe(&kmem_probes[i]);
		}
		}
	}
	return 0;
}

static void cleanup_kmods(void)
{
	int i;
	for (i = 0; i < MAX_KMEM_ROUTINE; i++) {
		unregister_jprobe(&kmem_probes[i]);
	}
}

module_init(init_kmods);
module_exit(cleanup_kmods);
MODULE_LICENSE("GPL");


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-17 Thread Linus Torvalds


On Thu, 17 Feb 2005, Parag Warudkar wrote:
> 
> A question - is it safe to assume it is  a kmalloc based leak? (I am thinking 
> of tracking it down by using kprobes to insert a probe into __kmalloc and 
> record the stack to see what is causing so many allocations.)

It's definitely kmalloc-based, but you may not catch it in __kmalloc. The 
"kmalloc()" function is actually an inline function which has some magic 
compile-time code that statically determines when the size is constant and 
can be turned into a direct call to "kmem_cache_alloc()" with the proper 
cache descriptor.

So you'd need to either instrument kmem_cache_alloc() (and trigger on the 
proper slab descriptor) or you would need to modify the kmalloc() 
definition in <linux/slab.h> to not do the constant size optimization, at 
which point you can instrument just __kmalloc() and avoid some of the 
overhead.

Linus


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-17 Thread Parag Warudkar
On Wednesday 16 February 2005 10:48 pm, Horst von Brand wrote:
> Does x86_64 use up a (freeable) register for the frame pointer or not?
> I.e., does -fomit-frame-pointer have any effect on the generated code?

{Took Linus out of the loop as he probably isn't interested}

The generated code is different for both cases but for some reason gcc has 
trouble with __builtin_return_address on x86-64.

For example, compiling with gcc -fomit-frame-pointer, a function produces the following assembly.

method_1:
.LFB2:
	subq	$8, %rsp
.LCFI0:
	movl	$__FUNCTION__.0, %esi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	movl	$0, %eax
	addq	$8, %rsp
	ret

And with -fno-omit-frame-pointer, the same function yields 

method_1:
.LFB2:
	pushq	%rbp
.LCFI0:
	movq	%rsp, %rbp
.LCFI1:
	movl	$__FUNCTION__.0, %esi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	movl	$0, %eax
	leave
	ret


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-17 Thread Parag Warudkar
On Wednesday 16 February 2005 06:52 pm, Andrew Morton wrote:
> So it's probably an ndiswrapper bug?
Andrew, 
It looks like it is a kernel bug triggered by NdisWrapper. Without 
NdisWrapper, and with just 8139too plus some light network activity the 
size-64 grew from ~ 1100 to 4500 overnight. Is this normal? I will keep it 
running to see where it goes.

A question - is it safe to assume it is a kmalloc-based leak? (I am thinking 
of tracking it down by using kprobes to insert a probe into __kmalloc and 
record the stack to see what is causing so many allocations.)

Thanks
Parag


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-16 Thread Horst von Brand
Andrew Morton <[EMAIL PROTECTED]> said:
> Parag Warudkar <[EMAIL PROTECTED]> wrote:

[...]

> > Is there a reason X86_64 doesnt have CONFIG_FRAME_POINTER anywhere in 
> > the .config?

> No good reason, I suspect.

Does x86_64 use up a (freeable) register for the frame pointer or not?
I.e., does -fomit-frame-pointer have any effect on the generated code?
-- 
Dr. Horst H. von Brand   User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria  +56 32 654239
Casilla 110-V, Valparaiso, ChileFax:  +56 32 797513


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-16 Thread Parag Warudkar
On Wednesday 16 February 2005 06:51 pm, Andrew Morton wrote:
> 81002fe8 is the address of the slab object.  08a8 is
> supposed to be the caller's text address.  It appears that
> __builtin_return_address(0) is returning junk.  Perhaps due to
> -fomit-frame-pointer.
I tried manually removing -fomit-frame-pointer from the Makefile and adding 
-fno-omit-frame-pointer, but with the same results - junk return addresses. 
Probably an x86_64 issue.

>So it's probably an ndiswrapper bug? 
I looked at the ndiswrapper mailing lists and found this explanation for the 
same issue of growing size-64 with ndiswrapper -
--
"It looks like the problem is kernel-version related, not ndiswrapper. 
 ndiswrapper just uses some API that starts the memory leak but the 
 problem is indeed in the kernel itself. versions from 2.6.10 up to 
 .11-rc3 have this problem afaik. haven't tested rc4 but maybe this one 
 doesn't have the problem anymore, we will see"
--

I tested -rc4 and it has the problem too. Moreover, with the plain old 8139too 
driver the slab still continues to grow, albeit slowly. So there is reason to 
suspect a kernel leak as well. I will try binary searching...

Parag


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-16 Thread Andrew Morton
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > Plenty of moisture there.
> >
> > Could you please use this patch?  Make sure that you enable
> > CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> > but let's be sure).  Also enable CONFIG_DEBUG_SLAB.
> 
> Will try that out. For now I tried -rc4 and a couple of other things - removing 
> the nvidia module doesn't make any difference, but removing ndiswrapper and 
> running with no networking stops the slab growth. With the 8139too driver and 
> network activity the growth is there but much slower than with ndiswrapper. With 
> 8139too + some network activity the slab even seems to shrink sometimes.

OK.

> Seems like either an ndiswrapper or a networking-related leak. Will report the 
> results with Manfred's patch tomorrow.

So it's probably an ndiswrapper bug?


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-16 Thread Andrew Morton
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> > echo "size-4096 0 0 0" > /proc/slabinfo
> 
> Is there a reason X86_64 doesnt have CONFIG_FRAME_POINTER anywhere in 
> the .config?

No good reason, I suspect.

> I tried -rc4 with Manfred's patch and with CONFIG_DEBUG_SLAB and 
> CONFIG_DEBUG.

Thanks.

> I get the following output from
> echo "size-64 0 0 0" > /proc/slabinfo
> 
> obj 81002fe8/0: 08a8 <0x8a8>
> obj 81002fe8/1: 08a8 <0x8a8>
> obj 81002fe8/2: 08a8 <0x8a8>
> : 3
> : 4
> : :
> obj 81002fe8/43: 08a8 <0x8a8>
> obj 81002fe8/44: 08a8 <0x8a8>
>  
> How do I know what is at 81002fe8? I tried the normal tricks (gdb 
> -c /proc/kcore vmlinux, objdump -d etc.) but none of the places list this 
> address.

81002fe8 is the address of the slab object.  08a8 is
supposed to be the caller's text address.  It appears that
__builtin_return_address(0) is returning junk.  Perhaps due to
-fomit-frame-pointer.



Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-16 Thread Parag Warudkar
On Wednesday 16 February 2005 12:12 am, Andrew Morton wrote:
> Plenty of moisture there.
>
> Could you please use this patch?  Make sure that you enable
> CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
> but let's be sure).  Also enable CONFIG_DEBUG_SLAB.

Will try that out. For now I tried -rc4 and a couple of other things - removing 
the nvidia module doesn't make any difference, but removing ndiswrapper and 
running with no networking stops the slab growth. With the 8139too driver and 
network activity the growth is there but much slower than with ndiswrapper. With 
8139too + some network activity the slab even seems to shrink sometimes.

Seems like either an ndiswrapper or a networking-related leak. Will report the 
results with Manfred's patch tomorrow.

Thanks
Parag


Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-15 Thread Andrew Morton
Parag Warudkar <[EMAIL PROTECTED]> wrote:
>
> I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after 
> use mainly due to growing swap use.  It has 768M of RAM and a Gig of swap. 
> After following this thread, I started monitoring /proc/slabinfo. It seems 
> size-64 is continuously growing and doing a compile run seem to make it grow 
> noticeably faster. After a day's uptime size-64 line in /proc/slabinfo looks 
> like 
> 
> size-64   7216543 7216544 64   611 : tunables  120   600 
> : 
> slabdata 118304 118304  0

Plenty of moisture there.

Could you please use this patch?  Make sure that you enable
CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
but let's be sure).  Also enable CONFIG_DEBUG_SLAB.



From: Manfred Spraul <[EMAIL PROTECTED]>

With the patch applied,

echo "size-4096 0 0 0" > /proc/slabinfo

walks the objects in the size-4096 slab, printing out the calling address
of whoever allocated that object.

It is for leak detection.


diff -puN mm/slab.c~slab-leak-detector mm/slab.c
--- 25/mm/slab.c~slab-leak-detector 2005-02-15 21:06:44.0 -0800
+++ 25-akpm/mm/slab.c   2005-02-15 21:06:44.0 -0800
@@ -2116,6 +2116,15 @@ cache_alloc_debugcheck_after(kmem_cache_
*dbg_redzone1(cachep, objp) = RED_ACTIVE;
*dbg_redzone2(cachep, objp) = RED_ACTIVE;
}
+   {
+   int objnr;
+   struct slab *slabp;
+
+   slabp = GET_PAGE_SLAB(virt_to_page(objp));
+
+   objnr = (objp - slabp->s_mem) / cachep->objsize;
+   slab_bufctl(slabp)[objnr] = (unsigned long)caller;
+   }
objp += obj_dbghead(cachep);
if (cachep->ctor && cachep->flags & SLAB_POISON) {
unsigned long   ctor_flags = SLAB_CTOR_CONSTRUCTOR;
@@ -2179,12 +2188,14 @@ static void free_block(kmem_cache_t *cac
objnr = (objp - slabp->s_mem) / cachep->objsize;
check_slabp(cachep, slabp);
 #if DEBUG
+#if 0
if (slab_bufctl(slabp)[objnr] != BUFCTL_FREE) {
printk(KERN_ERR "slab: double free detected in cache '%s', objp %p.\n",
cachep->name, objp);
BUG();
}
 #endif
+#endif
slab_bufctl(slabp)[objnr] = slabp->free;
slabp->free = objnr;
STATS_DEC_ACTIVE(cachep);
@@ -2998,6 +3009,29 @@ struct seq_operations slabinfo_op = {
.show   = s_show,
 };
 
+static void do_dump_slabp(kmem_cache_t *cachep)
+{
+#if DEBUG
+   struct list_head *q;
+
+   check_irq_on();
+   spin_lock_irq(&cachep->spinlock);
+   list_for_each(q, &cachep->lists.slabs_full) {
+   struct slab *slabp;
+   int i;
+   slabp = list_entry(q, struct slab, list);
+   for (i = 0; i < cachep->num; i++) {
+   unsigned long sym = slab_bufctl(slabp)[i];
+
+   printk("obj %p/%d: %p", slabp, i, (void *)sym);
+   print_symbol(" <%s>", sym);
+   printk("\n");
+   }
+   }
+   spin_unlock_irq(&cachep->spinlock);
+#endif
+}
+
 #define MAX_SLABINFO_WRITE 128
 /**
  * slabinfo_write - Tuning for the slab allocator
@@ -3038,9 +3072,11 @@ ssize_t slabinfo_write(struct file *file
batchcount < 1 ||
batchcount > limit ||
shared < 0) {
-   res = -EINVAL;
+   do_dump_slabp(cachep);
+   res = 0;
} else {
-   res = do_tune_cpucache(cachep, limit, 
batchcount, shared);
+   res = do_tune_cpucache(cachep, limit,
+   batchcount, shared);
}
break;
}
_



-rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-15 Thread Parag Warudkar
I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after 
use, mainly due to growing swap use.  It has 768M of RAM and a gig of swap. 
After following this thread, I started monitoring /proc/slabinfo. It seems 
size-64 is continuously growing, and doing a compile run seems to make it grow 
noticeably faster. After a day's uptime the size-64 line in /proc/slabinfo looks 
like 

size-64   7216543 7216544  64  61  1 : tunables  120  60  0 : 
slabdata 118304 118304  0

Since this doesn't seem to be bio, I think we have another slab leak somewhere. 
The box recently went OOM during a gcc compile run after I killed the swap.

Output from free, the OOM killer, and /proc/slabinfo is below.

free output -
             total       used       free     shared    buffers     cached
Mem:        767996     758120       9876          0       5276     130360
-/+ buffers/cache:     622484     145512
Swap:      1052248      67668     984580

OOM Killer Output
oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:7260kB (0kB HighMem)
Active:62385 inactive:850 dirty:0 writeback:0 unstable:0 free:1815 slab:120136 
-rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-15 Thread Parag Warudkar
I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after 
use, mainly due to growing swap use.  It has 768M of RAM and a Gig of swap. 
After following this thread, I started monitoring /proc/slabinfo. It seems 
size-64 is continuously growing, and doing a compile run seems to make it grow 
noticeably faster. After a day's uptime the size-64 line in /proc/slabinfo looks 
like 

size-64   7216543 7216544     64   61    1 : tunables  120   60    0 : slabdata 118304 118304      0

Since this doesn't seem to be bio, I think we have another slab leak somewhere. 
The box recently went OOM during a gcc compile run after I killed the swap.

Output from free, the OOM killer, and /proc/slabinfo is below.

free output -
             total       used       free     shared    buffers     cached
Mem:        767996     758120       9876          0       5276     130360
-/+ buffers/cache:     622484     145512
Swap:      1052248      67668     984580

OOM Killer Output
oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:7260kB (0kB HighMem)
Active:62385 inactive:850 dirty:0 writeback:0 unstable:0 free:1815 slab:120136 
mapped:62334 pagetables:2110
DMA free:3076kB min:72kB low:88kB high:108kB active:3328kB inactive:0kB 
present:16384kB pages_scanned:4446 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:4184kB min:3468kB low:4332kB high:5200kB active:246212kB 
inactive:3400kB present:769472kB pages_scanned:3834 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 
1*2048kB 0*4096kB = 3076kB
Normal: 170*4kB 10*8kB 2*16kB 0*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 
1*2048kB 0*4096kB = 4184kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4898 (klauncher).
oom-killer: gfp_mask=0x1d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:7020kB (0kB HighMem)
Active:62308 inactive:648 dirty:0 writeback:0 unstable:0 free:1755 slab:120439 
mapped:62199 pagetables:2020
DMA free:3076kB min:72kB low:88kB high:108kB active:3336kB inactive:0kB 
present:16384kB pages_scanned:7087 all_unreclaimable? yes
lowmem_reserve[]: 0 751 751
Normal free:3944kB min:3468kB low:4332kB high:5200kB active:245896kB 
inactive:2592kB present:769472kB pages_scanned:3861 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 
1*2048kB 0*4096kB = 3076kB
Normal: 112*4kB 9*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 2*512kB 0*1024kB 
1*2048kB 0*4096kB = 3944kB
HighMem: empty
Swap cache: add 310423, delete 310423, find 74707/105490, race 0+0
Free swap  = 0kB
Total swap = 0kB
Out of Memory: Killed process 4918 (kwin).

/proc/slabinfo output

ipx_sock               0      0    896    4    1 : tunables   54   27    0 : slabdata      0      0      0
scsi_cmd_cache         3      7    576    7    1 : tunables   54   27    0 : slabdata      1      1      0
ip_fib_alias          10    119     32  119    1 : tunables  120   60    0 : slabdata      1      1      0
ip_fib_hash           10     61     64   61    1 : tunables  120   60    0 : slabdata      1      1      0
sgpool-128            32     32   4096    1    1 : tunables   24   12    0 : slabdata     32     32      0
sgpool-64             32     32   2048    2    1 : tunables   24   12    0 : slabdata     16     16      0
sgpool-32             32     32   1024    4    1 : tunables   54   27    0 : slabdata      8      8      0
sgpool-16             32     32    512    8    1 : tunables   54   27    0 : slabdata      4      4      0
sgpool-8              32     45    256   15    1 : tunables  120   60    0 : slabdata      3      3      0
ext3_inode_cache    2805   3063   1224    3    1 : tunables   24   12    0 : slabdata   1021   1021      0
ext3_xattr             0      0     88   45    1 : tunables  120   60    0 : slabdata      0      0      0
journal_handle        16    156     24  156    1 : tunables  120   60    0 : slabdata      1      1      0
journal_head          49    180     88   45    1 : tunables  120   60    0 : slabdata      4      4      0
revoke_table           6    225     16  225    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_record          0      0     32  119    1 : 

Re: -rc3 leaking NOT BIO [Was: Memory leak in 2.6.11-rc1?]

2005-02-15 Thread Andrew Morton
Parag Warudkar <[EMAIL PROTECTED]> wrote:

> I am running -rc3 on my AMD64 laptop and I noticed it becomes sluggish after 
> use, mainly due to growing swap use.  It has 768M of RAM and a Gig of swap. 
> After following this thread, I started monitoring /proc/slabinfo. It seems 
> size-64 is continuously growing, and doing a compile run seems to make it grow 
> noticeably faster. After a day's uptime the size-64 line in /proc/slabinfo looks 
> like 
> 
> size-64   7216543 7216544     64   61    1 : tunables  120   60    0 : 
> slabdata 118304 118304      0

Plenty of moisture there.

Could you please use this patch?  Make sure that you enable
CONFIG_FRAME_POINTER (might not be needed for __builtin_return_address(0),
but let's be sure).  Also enable CONFIG_DEBUG_SLAB.
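For reference, the resulting .config would contain something like the following fragment (note that CONFIG_DEBUG_SLAB lives under CONFIG_DEBUG_KERNEL):

```
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SLAB=y
CONFIG_FRAME_POINTER=y
```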



From: Manfred Spraul <[EMAIL PROTECTED]>

With the patch applied,

echo "size-4096 0 0 0" > /proc/slabinfo

walks the objects in the size-4096 slab, printing out the calling address
of whoever allocated that object.

It is for leak detection.


diff -puN mm/slab.c~slab-leak-detector mm/slab.c
--- 25/mm/slab.c~slab-leak-detector 2005-02-15 21:06:44.0 -0800
+++ 25-akpm/mm/slab.c   2005-02-15 21:06:44.0 -0800
@@ -2116,6 +2116,15 @@ cache_alloc_debugcheck_after(kmem_cache_
*dbg_redzone1(cachep, objp) = RED_ACTIVE;
*dbg_redzone2(cachep, objp) = RED_ACTIVE;
}
+   {
+   int objnr;
+   struct slab *slabp;
+
+   slabp = GET_PAGE_SLAB(virt_to_page(objp));
+
+   objnr = (objp - slabp->s_mem) / cachep->objsize;
+   slab_bufctl(slabp)[objnr] = (unsigned long)caller;
+   }
objp += obj_dbghead(cachep);
if (cachep->ctor && cachep->flags & SLAB_POISON) {
unsigned long   ctor_flags = SLAB_CTOR_CONSTRUCTOR;
@@ -2179,12 +2188,14 @@ static void free_block(kmem_cache_t *cac
objnr = (objp - slabp->s_mem) / cachep->objsize;
check_slabp(cachep, slabp);
 #if DEBUG
+#if 0
if (slab_bufctl(slabp)[objnr] != BUFCTL_FREE) {
printk(KERN_ERR "slab: double free detected in cache "
"'%s', objp %p.\n",
cachep->name, objp);
BUG();
}
 #endif
+#endif
slab_bufctl(slabp)[objnr] = slabp->free;
slabp->free = objnr;
STATS_DEC_ACTIVE(cachep);
@@ -2998,6 +3009,29 @@ struct seq_operations slabinfo_op = {
.show   = s_show,
 };
 
+static void do_dump_slabp(kmem_cache_t *cachep)
+{
+#if DEBUG
+   struct list_head *q;
+
+   check_irq_on();
+   spin_lock_irq(&cachep->spinlock);
+   list_for_each(q, &cachep->lists.slabs_full) {
+   struct slab *slabp;
+   int i;
+   slabp = list_entry(q, struct slab, list);
+   for (i = 0; i < cachep->num; i++) {
+   unsigned long sym = slab_bufctl(slabp)[i];
+
+   printk("obj %p/%d: %p", slabp, i, (void *)sym);
+   print_symbol(" %s", sym);
+   printk("\n");
+   }
+   }
+   spin_unlock_irq(&cachep->spinlock);
+#endif
+}
+
 #define MAX_SLABINFO_WRITE 128
 /**
  * slabinfo_write - Tuning for the slab allocator
@@ -3038,9 +3072,11 @@ ssize_t slabinfo_write(struct file *file
batchcount < 1 ||
batchcount > limit ||
shared < 0) {
-   res = -EINVAL;
+   do_dump_slabp(cachep);
+   res = 0;
} else {
-   res = do_tune_cpucache(cachep, limit, 
batchcount, shared);
+   res = do_tune_cpucache(cachep, limit,
+   batchcount, shared);
}
break;
}
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/