Re: strange high system cpu usage.

2007-03-30 Thread Elliott Johnson
Lee

Thanks for your help.  In testing different kernels we found that using an 
unpatched kernel from kernel.org seems to fix the problem.  I'm assuming that a 
patch added in the gentoo-sources patch set was creating the problem.  Our once 
8 minute untar is now down to 7-8 seconds with a vanilla 2.6.18.6 kernel.
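
For anyone wanting to repeat this kind of before/after comparison, here is a minimal sketch of one way to time a workload and split its CPU time into user and system components. It is an illustration only, not the test actually run in this thread; the function name time_command is hypothetical, and it assumes a Unix-like system where Python's os.times() reports child process times.

```python
import os
import subprocess
import sys


def time_command(argv):
    """Run argv to completion and return (wall, user, system) seconds.

    user and system are the CPU times charged to the child process, taken
    from the children_* fields of os.times() before and after the run.
    """
    before = os.times()
    subprocess.run(argv, check=True)  # waits for the child to exit
    after = os.times()
    wall = after.elapsed - before.elapsed
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return wall, user, system
```

Calling it with something like time_command(["tar", "-xf", "archive.tar"]) (archive.tar being a placeholder) under each kernel gives three numbers to compare; a system share that dominates the wall time on only one kernel would be consistent with the symptom described here.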

If anyone is interested in our oprofile code or other info, just ask and I'll 
post it.  Otherwise I'll be reporting this to the gentoo developers.

-E

> - Original Message -
> From: "Elliott Johnson" <[EMAIL PROTECTED]>
> To: linux-kernel@vger.kernel.org
> Subject: Re: strange high system cpu usage.
> Date: Fri, 30 Mar 2007 11:54:57 +0800
> 
> 
> > What problem are you trying to solve?  IOW, how do you know it's not
> > just an artifact of different load average calculation between 2.4 and
> > 2.6?
> >
> > Are you actually seeing reduced throughput/performance?  Or are you
> > just looking at load average?
> >
> > Lee
> 
> Well, the problem is apparent: we are seeing abnormally high CPU usage.
> It's about a 20-40% performance hit.
> 
> The load calculations were not compared between 2.4 and 2.6 kernel
> versions, but between 2.6.8 and 2.6.19.  Sorry if this wasn't very clear
> from my last email.
> 
> In trying to diagnose the problem I also looked at memory stats (vmstat)
> and found the 'buffered' memory statistic way off from the comparable
> Debian (2.6.8) install (0-300 kB versus 500 MB).
> 
> The vmstat man page has little information on this statistic, and there
> seem to be varying explanations on the web.  I was hoping for a decisive
> explanation (or link) and possibly advice on toggling this value (or
> reasons not to).
> 
> I'm still working on this at my end.  Some recent tests show that it
> might be related to the megasas driver or to the large number of small
> files we are using on an XFS-formatted 10T array.  I'll keep at it.
> 
> Thanks for your response,
> 
> -Elliott
> 
> =
> Search for products and services at:
> http://search.mail.com
> 
> --
> Powered by Outblaze
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



Re: strange high system cpu usage.

2007-03-29 Thread Lee Revell

On 3/29/07, Elliott Johnson <[EMAIL PROTECTED]> wrote:

> > What problem are you trying to solve?  IOW, how do you know it's not
> > just an artifact of different load average calculation between 2.4 and
> > 2.6?
> >
> > Are you actually seeing reduced throughput/performance?  Or are you
> > just looking at load average?
> >
> > Lee
>
> Well, the problem is apparent: we are seeing abnormally high CPU usage.
> It's about a 20-40% performance hit.



Please post a kernel profile for the problematic workload with the
"good" and "bad" kernels (search the list archive for Andrew Morton's
instructions on doing it with oprofile, email me privately if you
can't find it).
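
Once two such profiles exist, one way to compare them is to look at how each symbol's share of the total samples changes between the kernels. The sketch below is only an illustration under stated assumptions: it assumes the per-symbol sample counts have already been extracted into plain dictionaries (how to get them out of the profiler's report is not shown), and the name share_changes is hypothetical.

```python
def share_changes(good, bad):
    """Compare two profiles given as {symbol: sample_count} dictionaries.

    Returns a list of (symbol, good_share, bad_share) tuples, where each
    share is that symbol's fraction of all samples in its profile, sorted
    so the symbols whose share grew most in the "bad" profile come first.
    """
    good_total = sum(good.values()) or 1  # avoid dividing by zero
    bad_total = sum(bad.values()) or 1
    rows = []
    for symbol in set(good) | set(bad):
        good_share = good.get(symbol, 0) / good_total
        bad_share = bad.get(symbol, 0) / bad_total
        rows.append((symbol, good_share, bad_share))
    rows.sort(key=lambda row: row[2] - row[1], reverse=True)
    return rows
```

Comparing shares rather than raw counts keeps the result meaningful when the two runs collected different numbers of samples.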


> The vmstat man page has little information on this statistic, and there
> seem to be varying explanations on the web.  I was hoping for a decisive
> explanation (or link) and possibly advice on toggling this value (or
> reasons not to).


The meaning of these numbers can change drastically from one minor
release to the next, and the docs often lag behind the code.

I would not focus on tweaking VM knobs, but on describing the problem
in enough detail to fix the kernel - it's a bug if the same workload
regresses significantly from one release to another.

Lee


Re: strange high system cpu usage.

2007-03-29 Thread Elliott Johnson
>What problem are you trying to solve?  IOW, how do you know it's not
>just an artifact of different load average calculation between 2.4 and
>2.6?
>
>Are you actually seeing reduced throughput/performance?  Or are you
>just looking at load average?
>
>Lee

Well, the problem is apparent: we are seeing abnormally high CPU usage.  It's
about a 20-40% performance hit.

The load calculations were not compared between 2.4 and 2.6 kernel versions,
but between 2.6.8 and 2.6.19.  Sorry if this wasn't very clear from my last
email.

In trying to diagnose the problem I also looked at memory stats (vmstat) and
found the 'buffered' memory statistic way off from the comparable Debian
(2.6.8) install (0-300 kB versus 500 MB).

The vmstat man page has little information on this statistic, and there seem
to be varying explanations on the web.  I was hoping for a decisive
explanation (or link) and possibly advice on toggling this value (or reasons
not to).
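
As a cross-check on the vmstat figure, the same numbers can be read straight from /proc/meminfo. To my understanding vmstat's "buff" column corresponds to the Buffers line there, but treat that mapping as an assumption and compare against your own vmstat output. A minimal sketch, with hypothetical function names parse_meminfo and buffers_and_cached:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most values are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def buffers_and_cached(path="/proc/meminfo"):
    """Return the (Buffers, Cached) values from the given meminfo file."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    return info.get("Buffers"), info.get("Cached")
```

Sampling buffers_and_cached() on both installs while the same workload runs gives a like-for-like comparison that does not depend on how each vmstat version labels its columns.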

I'm still working on this at my end.  Some recent tests show that it might be
related to the megasas driver or to the large number of small files we are
using on an XFS-formatted 10T array.  I'll keep at it.

Thanks for your response,

-Elliott



Re: strange high system cpu usage.

2007-03-29 Thread Lee Revell

On 3/29/07, Elliott Johnson <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I've been upgrading a few machines here at work and noticed some problems
> with high system cpu usage on one machine.  In trying to debug the problem
> I've come across a few confusing stats that I was hoping could be cleared
> up by someone on this list.


What problem are you trying to solve?  IOW, how do you know it's not
just an artifact of different load average calculation between 2.4 and
2.6?

Are you actually seeing reduced throughput/performance?  Or are you
just looking at load average?

Lee

