Re: [dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

Brendan Gregg - Sun Microsystems Thu, 14 Feb 2008 13:14:23 -0800

G'Day Sean,

On Thu, Feb 14, 2008 at 12:41:35PM -0800, Sean Liu wrote:
> It turned out ps wasn't really what was using most CPU resource. The MQ agent 
> ( and its probes ) is.
> I put in a small script to output continuously every second:
> #!/usr/sbin/dtrace -qs
> 
> profile-1001
> /pid == $target || progenyof($target)/
> {
>  @a["on-CPU, 1001 Hertz count:"] = count();
> }
> 
> tick-1
> {
>  printa(@a);
>  trunc(@a)
> }


Yes, this makes sense.

> This is from Application Agent:
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                       510
>   on-CPU, 1001 Hertz count:                                       410
>   on-CPU, 1001 Hertz count:                                       677
>   on-CPU, 1001 Hertz count:                                       796
>   on-CPU, 1001 Hertz count:                                       750
>   on-CPU, 1001 Hertz count:                                       828
>   on-CPU, 1001 Hertz count:                                       518
>   on-CPU, 1001 Hertz count:                                       751
>   on-CPU, 1001 Hertz count:                                      1086
>   on-CPU, 1001 Hertz count:                                       604
>   on-CPU, 1001 Hertz count:                                       992
>   on-CPU, 1001 Hertz count:                                       568
>   on-CPU, 1001 Hertz count:                                       736
>  ...
> 
> And MQ agent:
> ...
>   on-CPU, 1001 Hertz count:                                      1696
>   on-CPU, 1001 Hertz count:                                      1743
>   on-CPU, 1001 Hertz count:                                      1784
>   on-CPU, 1001 Hertz count:                                      1679
>   on-CPU, 1001 Hertz count:                                      1691
>   on-CPU, 1001 Hertz count:                                      1547
>  ...
> 
> Since profile probes fire on both CPUs, a 1001 Hz probe should fire 2002 
> times a second correct?

Yes, it will fire 2002 times a second.  There may be a tiny variance (such
as 0.1%) in the output (due to how aggregates are printed from the user-land
dtrace consumer, and how it is CPU scheduled) - but the numbers will sum
correctly.

> So 1743 fires is almost like 85% of CPU used.

Yes, it strongly suggests that (1743 out of 2002 is closer to 87%).  We don't
know that for certain since this is a simple (yet useful) script which
samples rather than traces.  For what you are doing (identifying the largest
CPU consumer) sampling is probably adequate.
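
To make the arithmetic concrete, here is a rough sketch in Python (used only
as a calculator; the function name is made up for illustration, and the 2 CPU
count is taken from this thread) that turns the MQ agent sample counts quoted
above into an estimated share of all CPU time:

```python
RATE_HZ = 1001  # profile-1001 fires this many times per second on each CPU
NCPUS = 2       # assumption from the thread: a 2 CPU system


def utilisation_pct(samples, rate_hz=RATE_HZ, ncpus=NCPUS):
    """Share of one second's possible samples that landed on the target."""
    return 100.0 * samples / (rate_hz * ncpus)


# Three of the per-second MQ agent counts quoted earlier in the thread.
for samples in (1696, 1743, 1784):
    print(samples, round(utilisation_pct(samples), 1))
# prints:
# 1696 84.7
# 1743 87.1
# 1784 89.1
```

These are estimates from sampling, so treat them as approximate.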

You can get it to print out percentages, if that helps readability.
Something like:

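#!/usr/sbin/dtrace -qs

/* sample at 1001 Hz on every CPU; count hits in the target or its progeny */
profile-1001
/pid == $target || progenyof($target)/
{
        @a["on-CPU, percentage:"] = count();
}

/* once a second: scale the count to a percentage, print it, then reset */
tick-1
{
        normalize(@a, (1001 * `ncpus_online) / 100);
        printa(@a);
        trunc(@a);
}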

The normalize() won't be spot on, since I believe the divisor is evaluated
with integer arithmetic: on 2 CPUs, (1001 * 2) / 100 comes out as 20 rather
than 20.02.  So there will be some small rounding error, in addition to the
error from it being sample based.  But again, if the point is to identify
large CPU consumers - it should work fine.

Those percentages are of all CPUs combined, so 100 means every CPU was
busy.  Drop the "* `ncpus_online" for percentages of a single CPU (which can
then reach 200 on a 2 CPU system).
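
To put numbers on that rounding, here is a small sketch in Python, with floor
division standing in for integer arithmetic.  It assumes normalize() simply
integer-divides the count by the factor, which is my reading rather than
something checked against the implementation; normalized() is a made-up
helper, not a DTrace function.

```python
def normalized(count, rate_hz=1001, ncpus=2):
    """Emulate normalize(@a, (rate_hz * ncpus) / 100) with integer division."""
    factor = (rate_hz * ncpus) // 100  # 2002 // 100 == 20, not 20.02
    return count // factor


count = 1743  # one of the per-second MQ agent counts from the thread

print(normalized(count))                     # all CPUs: prints 87
print(round(100.0 * count / (1001 * 2), 2))  # exact value: prints 87.06
print(normalized(count, ncpus=1))            # single CPU scale: prints 174
```

In this sketch the integer result lands within about one percentage point of
the exact figure, which is plenty for spotting the big CPU consumers.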

> Thanks Brendan for the idea.

no worries,

Brendan

-- 
Brendan
[CA, USA]
