Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-10 Thread Kevin Oberman
 Date: Sun, 09 Nov 2003 22:43:47 -0700
 From: Scott Long [EMAIL PROTECTED]
 
 Kevin Oberman wrote:
  Tested. It's much better, although ATA request keeps adding more
  memory all the time when mplayer is playing, but it's now increasing
  at about 20K/minute which is a huge improvement. Still, I don't
  understand why it should just continue to grow all of the time. The
  data rate is about constant. I would expect that it should grow to a
  size where the data being processed can be accommodated and then stop
  growing. I don't see it stopping.
  
  Thanks for the quick fix.
 
 Well, it sounds like there is still a memory leak somewhere.  Make sure
 that you have rev 1.27 of atapi-cam.c to be sure.  If so, please let me
 know which malloc type in vmstat -m is growing.

Oh, crap! I guess I pulled the new version too quickly yesterday when
your message arrived. I had 1.26. And I don't have a DVD with me, so I
was seeing a much slower leak because the CD transfers data so much
more slowly.
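
A stale source file like this can be caught before rebuilding by reading the revision out of the file's $FreeBSD$ tag (for example from the output of ident(1)). The following is only a minimal sketch: the helper names are hypothetical, and the date and author in the sample tag are placeholders, not the real commit data.

```python
import re

# Matches the RCS-style tag embedded in FreeBSD sources, e.g.
#   $FreeBSD: src/sys/dev/ata/atapi-cam.c,v 1.26 <date> <time> <author> Exp $
ID_TAG = re.compile(r"\$FreeBSD: (?P<path>\S+),v (?P<rev>\d+(?:\.\d+)+) ")


def file_revision(text):
    """Return the revision string from the first $FreeBSD$ tag, or None."""
    match = ID_TAG.search(text)
    return match.group("rev") if match else None


def _as_tuple(rev):
    # "1.27" -> (1, 27), so that 1.9 sorts before 1.27.
    return tuple(int(part) for part in rev.split("."))


def at_least(rev, wanted):
    """True if revision `rev` is the same as or newer than `wanted`."""
    return _as_tuple(rev) >= _as_tuple(wanted)


# Placeholder tag: only the path and revision number come from this thread.
sample = "$FreeBSD: src/sys/dev/ata/atapi-cam.c,v 1.26 2003/01/01 00:00:00 someone Exp $"
has_fix = at_least(file_revision(sample), "1.27")
```

Revisions are compared as tuples of integers rather than as strings or floats, because a plain numeric comparison would treat 1.9 as newer than 1.27.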

After a kernel rebuild I see:
ATA request 0 0K  1K 7285  128
after reading some bulk data off of a CD.

Thanks!
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-09 Thread Scott Long
Fixed.  Please retest.

Scott Long wrote:
Kevin Oberman wrote:

Date: Thu, 6 Nov 2003 11:23:30 -0500 (EST)
From: Robert Watson [EMAIL PROTECTED]
On Thu, 6 Nov 2003, Kevin Oberman wrote:


I have learned a bit more about the problems I have been having with
the DVD drive on my T30 laptop. When I have run the drive for an
extended time (like 2 or 3 hours), I invariably have my system lock up
because it can't malloc kernel memory for the ATAPI/CAM or ATA
device. (Usually it's both.)
The only recovery seems to be to reboot the system.


Is it possible to drop to DDB and generate a coredump at that point?  If
so, you can run vmstat on the core to look at memory use statistics in a
post-mortem way.  As to what to look for: big numbers is about the limit
of what I can suggest, I'm afraid :-).  Usually the activity of choice is
to compare vmstat statistics (with -m and -z) during normal operation and
when the leak has occurred, and look for any marked differences.  It's
worth observing that there are two failure modes here that appear almost
identical: (1) a memory leak resulting in address space exhaustion for the
kernel, and (2) a tunable maximum allocation being too high for the
available address space.  Note that (2) isn't a leak, simply a poorly
tuned value.  We've noticed a number of tuned memory limits were set when
memory sizes on systems were much lower, and so we've had to readjust the
tuning parameters for large memory systems.  Likewise, a number of
problems were observed when PAE was introduced, as some of the tuning
parameters scaled with the amount of physical memory, not with the
addressable space for the kernel.  So we probably want to be on the look
out for both of these possibilities.


Well, I have no details to this point, but 'vmstat -m' makes the
problem obvious. The amount of kernel memory allocated to ATA request
climbs forever and after enough data is transferred, it runs out of
KVM. This is a continual leak, and monitoring it on the running system
makes it pretty clear that something is leaking. I don't think (2) is
the issue. Because the fields allocated in vmstat are not large enough,
this is a bit hard to read. The fields all merge into some REALLY large
numbers. After reboot, it is 5K. When running mencoder I see this
increasing at a rate of a bit under 1.9 MB per minute.
It does not look like a tuning issue. No matter how big KVM is allowed
to grow, it's only a matter of time until it is gone.
I am going to do some testing to see what operations seem to cause
this. I assume it does not happen all of the time or everyone would
have seen it. I suspect it only happens with ATAPI/CAM activity,
possibly only with simultaneous ATA and ATAPI/CAM activity.


Does vmstat -m show which malloc type is growing?  Knowing this will
greatly speed up the debugging process.
Thanks!

Scott





Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-09 Thread Alex Wilkinson
On Thu, Nov 06, 2003 at 08:08:31AM -0800, Kevin Oberman wrote:

Any ideas on where I can look for more information? I'm going to try
doing some monitoring with vmstat while running to see if I can spot
anything, but I am not sure just what I am looking for. The VM system
is not something I know much about, but I did read Terry Lambert's
excellent message to current on KVM tuning and I'm hoping that this
might help, but, if there really is a memory leak, tuning will not fix
it.

Got a link to ...Terry Lambert's excellent message to current on KVM tuning.. ?

Very keen to read it.

 - aW


Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-09 Thread Kevin Oberman
 Date: Mon, 10 Nov 2003 11:37:00 +1030
 From: Alex Wilkinson [EMAIL PROTECTED]
 
   On Thu, Nov 06, 2003 at 08:08:31AM -0800, Kevin Oberman wrote:
   
   Any ideas on where I can look for more information? I'm going to try
   doing some monitoring with vmstat while running to see if I can spot
   anything, but I am not sure just what I am looking for. The VM system
   is not something I know much about, but I did read Terry Lambert's
   excellent message to current on KVM tuning and I'm hoping that this
   might help, but, if there really is a memory leak, tuning will not fix
   it.
 
 Got a link to ...Terry Lambert's excellent message to current on KVM tuning.. ?

I found it in Google Groups. Search for mailing.freebsd.current
Lambert kmem_map and you will find several good articles. It's really
worth reading the entire threads to get a better understanding of how
all of this works.

The one I was referring to was at:
http://groups.google.com/groups?q=mailing.freebsd.current+Lambert+kmem_map&start=10&hl=en&lr=&ie=UTF-8&selm=bde8n6%244jb%241%40FreeBSD.csie.NCTU.edu.tw&rnum=12

-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-09 Thread Kevin Oberman
Tested. It's much better, although ATA request keeps adding more
memory all the time when mplayer is playing, but it's now increasing
at about 20K/minute which is a huge improvement. Still, I don't
understand why it should just continue to grow all of the time. The
data rate is about constant. I would expect that it should grow to a
size where the data being processed can be accommodated and then stop
growing. I don't see it stopping.

Thanks for the quick fix.

Sorry to have taken so long to test it, but I am at SC2003 in Phoenix
for the next two weeks building and running the show network. About
40 10Gig links and about 150 100K and 1Gig links this
year. I have no idea how many miles of fiber in the convention
center, but we start installing it tomorrow morning. We also should be
bringing up the OC-192s to the major research nets over the next two
days. 

If any of you are at the show, stop by the NOC and say Hi.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-09 Thread Scott Long
Kevin Oberman wrote:
Tested. It's much better, although ATA request keeps adding more
memory all the time when mplayer is playing, but it's now increasing
at about 20K/minute which is a huge improvement. Still, I don't
understand why it should just continue to grow all of the time. The
data rate is about constant. I would expect that it should grow to a
size where the data being processed can be accommodated and then stop
growing. I don't see it stopping.
Thanks for the quick fix.
Well, it sounds like there is still a memory leak somewhere.  Make sure
that you have rev 1.27 of atapi-cam.c to be sure.  If so, please let me
know which malloc type in vmstat -m is growing.
Scott



Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-07 Thread Scott Long
Kevin Oberman wrote:
Date: Thu, 6 Nov 2003 11:23:30 -0500 (EST)
From: Robert Watson [EMAIL PROTECTED]
On Thu, 6 Nov 2003, Kevin Oberman wrote:


I have learned a bit more about the problems I have been having with
the DVD drive on my T30 laptop. When I have run the drive for an
extended time (like 2 or 3 hours), I invariably have my system lock up
because it can't malloc kernel memory for the ATAPI/CAM or ATA
device. (Usually it's both.)
The only recovery seems to be to reboot the system.
Is it possible to drop to DDB and generate a coredump at that point?  If
so, you can run vmstat on the core to look at memory use statistics in a
post-mortem way.  As to what to look for: big numbers is about the limit
of what I can suggest, I'm afraid :-).  Usually the activity of choice is
to compare vmstat statistics (with -m and -z) during normal operation and
when the leak has occurred, and look for any marked differences.  It's
worth observing that there are two failure modes here that appear almost
identical: (1) a memory leak resulting in address space exhaustion for the
kernel, and (2) a tunable maximum allocation being too high for the
available address space.  Note that (2) isn't a leak, simply a poorly
tuned value.  We've noticed a number of tuned memory limits were set when
memory sizes on systems were much lower, and so we've had to readjust the
tuning parameters for large memory systems.  Likewise, a number of
problems were observed when PAE was introduced, as some of the tuning
parameters scaled with the amount of physical memory, not with the
addressable space for the kernel.  So we probably want to be on the look
out for both of these possibilities.


Well, I have no details to this point, but 'vmstat -m' makes the
problem obvious. The amount of kernel memory allocated to ATA request
climbs forever and after enough data is transferred, it runs out of
KVM. This is a continual leak, and monitoring it on the running system
makes it pretty clear that something is leaking. I don't think (2) is
the issue. Because the fields allocated in vmstat are not large enough,
this is a bit hard to read. The fields all merge into some REALLY large
numbers. After reboot, it is 5K. When running mencoder I see this
increasing at a rate of a bit under 1.9 MB per minute.
It does not look like a tuning issue. No matter how big KVM is allowed
to grow, it's only a matter of time until it is gone.
I am going to do some testing to see what operations seem to cause
this. I assume it does not happen all of the time or everyone would
have seen it. I suspect it only happens with ATAPI/CAM activity,
possibly only with simultaneous ATA and ATAPI/CAM activity.
Does vmstat -m show which malloc type is growing?  Knowing this will
greatly speed up the debugging process.
Thanks!

Scott



Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-07 Thread Kevin Oberman
 Date: Fri, 07 Nov 2003 00:45:47 -0700
 From: Scott Long [EMAIL PROTECTED]
 
 Kevin Oberman wrote:
 Date: Thu, 6 Nov 2003 11:23:30 -0500 (EST)
 From: Robert Watson [EMAIL PROTECTED]
 
 
 On Thu, 6 Nov 2003, Kevin Oberman wrote:
 
 
 I have learned a bit more about the problems I have been having with
 the DVD drive on my T30 laptop. When I have run the drive for an
 extended time (like 2 or 3 hours), I invariably have my system lock up
 because it can't malloc kernel memory for the ATAPI/CAM or ATA
 device. (Usually it's both.)
 
 The only recovery seems to be to reboot the system.
 
 Is it possible to drop to DDB and generate a coredump at that point?  If
 so, you can run vmstat on the core to look at memory use statistics in a
 post-mortem way.  As to what to look for: big numbers is about the limit
 of what I can suggest, I'm afraid :-).  Usually the activity of choice is
 to compare vmstat statistics (with -m and -z) during normal operation and
 when the leak has occurred, and look for any marked differences.  It's
 worth observing that there are two failure modes here that appear almost
 identical: (1) a memory leak resulting in address space exhaustion for the
 kernel, and (2) a tunable maximum allocation being too high for the
 available address space.  Note that (2) isn't a leak, simply a poorly
 tuned value.  We've noticed a number of tuned memory limits were set when
 memory sizes on systems were much lower, and so we've had to readjust the
 tuning parameters for large memory systems.  Likewise, a number of
 problems were observed when PAE was introduced, as some of the tuning
 parameters scaled with the amount of physical memory, not with the
 addressable space for the kernel.  So we probably want to be on the look
 out for both of these possibilities.
  
  
  Well, I have no details to this point, but 'vmstat -m' makes the
  problem obvious. The amount of kernel memory allocated to ATA request
  climbs forever and after enough data is transferred, it runs out of
  KVM. This is a continual leak, and monitoring it on the running system
  makes it pretty clear that something is leaking. I don't think (2) is
  the issue. Because the fields allocated in vmstat are not large enough,
  this is a bit hard to read. The fields all merge into some REALLY large
  numbers. After reboot, it is 5K. When running mencoder I see this
  increasing at a rate of a bit under 1.9 MB per minute.
  
  It does not look like a tuning issue. No matter how big KVM is allowed
  to grow, it's only a matter of time until it is gone.
  
  I am going to do some testing to see what operations seem to cause
  this. I assume it does not happen all of the time or everyone would
  have seen it. I suspect it only happens with ATAPI/CAM activity,
  possibly only with simultaneous ATA and ATAPI/CAM activity.
 
 Does vmstat -m show which malloc type is growing?  Knowing this will
 greatly speed up the debugging process.

I'm not sure I follow. The leak is in ATA request. Is there
something more to be seen in vmstat -m?

I have confirmed that it seems to happen with any reads from the
DVD device, but my testing has been done with mplayer. Makes it
a bit tough to watch a full-length movie!

I have opened kern/59043 on the problem. Let me know if I can do
further testing.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-06 Thread Kevin Oberman
I have learned a bit more about the problems I have been having with
the DVD drive on my T30 laptop. When I have run the drive for an
extended time (like 2 or 3 hours), I invariably have my system lock up
because it can't malloc kernel memory for the ATAPI/CAM or ATA
device. (Usually it's both.)

The only recovery seems to be to reboot the system.

I suspect a memory leak because it seems to be linked to total amount
of data transferred, even in multiple invocations of the program. Of
course, once the kernel grabs VM, I guess it generally does not
actually release it, but it should re-use the existing allocation and
not keep allocating more.

I posted my config and dmesg files yesterday. I have tried tuning
KVM_SIZE stuff with no real success, just the loss of the ability to
run with APM loaded. (This is possibly due to mis-tuning.)

Any ideas on where I can look for more information? I'm going to try
doing some monitoring with vmstat while running to see if I can spot
anything, but I am not sure just what I am looking for. The VM system
is not something I know much about, but I did read Terry Lambert's
excellent message to current on KVM tuning and I'm hoping that this
might help, but, if there really is a memory leak, tuning will not fix
it.

FWIW, this problem did not exist a few months ago prior to ATAng.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-06 Thread Robert Watson

On Thu, 6 Nov 2003, Kevin Oberman wrote:

 I have learned a bit more about the problems I have been having with
 the DVD drive on my T30 laptop. When I have run the drive for an
 extended time (like 2 or 3 hours), I invariably have my system lock up
 because it can't malloc kernel memory for the ATAPI/CAM or ATA
 device. (Usually it's both.)
 
 The only recovery seems to be to reboot the system.

Is it possible to drop to DDB and generate a coredump at that point?  If
so, you can run vmstat on the core to look at memory use statistics in a
post-mortem way.  As to what to look for: big numbers is about the limit
of what I can suggest, I'm afraid :-).  Usually the activity of choice is
to compare vmstat statistics (with -m and -z) during normal operation and
when the leak has occurred, and look for any marked differences.  It's
worth observing that there are two failure modes here that appear almost
identical: (1) a memory leak resulting in address space exhaustion for the
kernel, and (2) a tunable maximum allocation being too high for the
available address space.  Note that (2) isn't a leak, simply a poorly
tuned value.  We've noticed a number of tuned memory limits were set when
memory sizes on systems were much lower, and so we've had to readjust the
tuning parameters for large memory systems.  Likewise, a number of
problems were observed when PAE was introduced, as some of the tuning
parameters scaled with the amount of physical memory, not with the
addressable space for the kernel.  So we probably want to be on the look
out for both of these possibilities.
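
The comparison described above can be sketched roughly as follows: take one `vmstat -m` snapshot during normal operation and one after the suspected leak, then list the malloc types whose memory use grew. This is a minimal sketch only. It assumes each row has the columns Type, InUse, MemUse, HighUse, Requests and Size(s) in that order; the function names are hypothetical, and rows whose fields have run together will simply be skipped.

```python
import re

# Assumed row layout: Type  InUse  MemUse  HighUse  Requests  Size(s)
# The type name may contain spaces (e.g. "ATA request"), so it is matched
# lazily and the numeric columns anchor the rest of the line.
ROW = re.compile(
    r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<memuse>\d+)K\s+"
    r"(?P<highuse>\S+)\s+(?P<requests>\d+)\s*(?P<sizes>\S*)\s*$"
)


def parse_vmstat_m(text):
    """Map malloc type name -> MemUse in KB; lines that do not match are skipped."""
    usage = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            usage[match.group("type")] = int(match.group("memuse"))
    return usage


def growth(before, after, min_kb=1):
    """Return (type, delta_kb) pairs that grew by at least min_kb, largest first."""
    old, new = parse_vmstat_m(before), parse_vmstat_m(after)
    deltas = [(name, new[name] - old.get(name, 0)) for name in new]
    grown = [pair for pair in deltas if pair[1] >= min_kb]
    return sorted(grown, key=lambda pair: -pair[1])
```

Feeding it the saved output of two `vmstat -m` runs, or of vmstat run against a core dump, gives a short list of candidate types to investigate. A type that keeps growing between every pair of snapshots points to failure mode (1) rather than (2).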

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: Kernel memory leak in ATAPI/CAM or ATAng?

2003-11-06 Thread Kevin Oberman
 Date: Thu, 6 Nov 2003 11:23:30 -0500 (EST)
 From: Robert Watson [EMAIL PROTECTED]
 
 
 On Thu, 6 Nov 2003, Kevin Oberman wrote:
 
  I have learned a bit more about the problems I have been having with
  the DVD drive on my T30 laptop. When I have run the drive for an
  extended time (like 2 or 3 hours), I invariably have my system lock up
  because it can't malloc kernel memory for the ATAPI/CAM or ATA
  device. (Usually it's both.)
  
  The only recovery seems to be to reboot the system.
 
 Is it possible to drop to DDB and generate a coredump at that point?  If
 so, you can run vmstat on the core to look at memory use statistics in a
 post-mortem way.  As to what to look for: big numbers is about the limit
 of what I can suggest, I'm afraid :-).  Usually the activity of choice is
 to compare vmstat statistics (with -m and -z) during normal operation and
 when the leak has occurred, and look for any marked differences.  It's
 worth observing that there are two failure modes here that appear almost
 identical: (1) a memory leak resulting in address space exhaustion for the
 kernel, and (2) a tunable maximum allocation being too high for the
 available address space.  Note that (2) isn't a leak, simply a poorly
 tuned value.  We've noticed a number of tuned memory limits were set when
 memory sizes on systems were much lower, and so we've had to readjust the
 tuning parameters for large memory systems.  Likewise, a number of
 problems were observed when PAE was introduced, as some of the tuning
 parameters scaled with the amount of physical memory, not with the
 addressable space for the kernel.  So we probably want to be on the look
 out for both of these possibilities.

Well, I have no details to this point, but 'vmstat -m' makes the
problem obvious. The amount of kernel memory allocated to ATA request
climbs forever and after enough data is transferred, it runs out of
KVM. This is a continual leak, and monitoring it on the running system
makes it pretty clear that something is leaking. I don't think (2) is
the issue. Because the fields allocated in vmstat are not large enough,
this is a bit hard to read. The fields all merge into some REALLY large
numbers. After reboot, it is 5K. When running mencoder I see this
increasing at a rate of a bit under 1.9 MB per minute.

It does not look like a tuning issue. No matter how big KVM is allowed
to grow, it's only a matter of time until it is gone.
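
The arithmetic behind that statement can be sketched as below. With a steady leak, the time to exhaustion is the remaining headroom divided by the growth rate, so a larger KVM only delays the lockup. This is a rough sketch: the function names are hypothetical, and any limit passed in is an assumed figure, not a value measured on this system.

```python
def growth_rate_kb_per_min(samples):
    """Average growth in KB/minute between the first and last sample.

    samples: list of (elapsed_seconds, memuse_kb) pairs, oldest first.
    """
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (m1 - m0) * 60.0 / (t1 - t0)


def minutes_until_exhausted(current_kb, limit_kb, rate_kb_per_min):
    """Minutes until a steadily growing type reaches limit_kb, or None if not growing."""
    if rate_kb_per_min <= 0:
        return None
    return max(limit_kb - current_kb, 0) / rate_kb_per_min


# Three readings one minute apart, growing by 1900 KB each minute.
rate = growth_rate_kb_per_min([(0, 5), (60, 1905), (120, 3805)])
# Doubling the (assumed) limit only doubles the time before the memory is gone.
small = minutes_until_exhausted(5, 190005, rate)
large = minutes_until_exhausted(5, 380005, rate)
```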

I am going to do some testing to see what operations seem to cause
this. I assume it does not happen all of the time or everyone would
have seen it. I suspect it only happens with ATAPI/CAM activity,
possibly only with simultaneous ATA and ATAPI/CAM activity.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634