G'Day Bob,
On Fri, Jun 27, 2008 at 05:07:21AM -0700, Bob Resendes wrote:
> >
> > prstat -mL is giving a by-thread summary, which is going to be better than
> > either prstat -m, pfilestat, or anything else that tries to represent
> > multiple thread info by-process.
> Brendan, thanks for the quick reply. Are you suggesting that a single
> threaded application should reflect "closer" numbers? If so, then I'm still
> curious about the differences I'm seeing. For example, the following are
> (representative) snapshots of both commands for the same (single-threaded)
> process:
>
> prstat -mL -p 10997 5
> ===================
>    PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>  10997 40011     14 6.1 0.0 0.0 0.0 0.0  79 1.1 730  19  3K   0 postgres/1
>
> [Note: LAT = 1.1% here]
>
> ./pfilestat 10997
> ==============
>    STATE  FDNUM  Time Filename
>     read      4    0% /zones/dbzone05/<snip>
>     read     17    0% /zones/dbzone05/<snip>
>    write     20    0% /zones/dbzone05/<snip>
>     read     19    0% /zones/dbzone05/<snip>
>     read     23    0% /zones/dbzone05/<snip>
>    write     19    1% /zones/dbzone05/<snip>
>  sleep-w      0    2%
>  waitcpu      0    4%
>  running      0   13%
>    sleep      0   76%
> [snip]
> Total event time (ms): 4999 Total Mbytes/sec: 1
>
> [Note: waitcpu is 4% here.]
>
> I'm fine with the answer that they are measuring things differently. I'm just
> curious as to what accounts for the 1% vs 4% difference.
Yes, I often use other tools as a sanity check of DTrace scripts, and I do
worry about small percentage differences, since chasing them often leads to
finding bugs or to a better understanding of system behaviour...
> Again, I'm not really complaining. I think these scripts are going to be a
> tremendous help in getting started. I'm just concerned that I'm missing
> something about the internal workings of DTrace. For example, something like
> "the sched:::* probes are done in a different context than the application
> and therefore the numbers don't match".
No, pfilestat is careful to follow the documented sched provider:

    http://wikis.sun.com/display/DTrace/sched+Provider

For example, it uses 'pid' for probes that fire in the context of the thread
being traced:

    sched:::off-cpu
    /pid == PID/

and 'args[1]->pr_pid' for probes that may fire in some other thread's context:

    sched:::dequeue
    /args[1]->pr_pid == PID/
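
As an independent cross-check, here is a minimal, untested sketch that uses
the second predicate form to sum the time a process's threads spend on a
dispatcher queue, between sched:::enqueue and sched:::dequeue. It is not part
of pfilestat. The script name (qtime.d), the variable and aggregation names,
and the 5 second interval are choices made for illustration only, and it
assumes the PID is passed as the first argument.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Thread is placed on a run queue. This probe may fire in another
 * thread's context, so match on args[1]->pr_pid rather than pid.
 */
sched:::enqueue
/args[1]->pr_pid == $1/
{
	qstart[args[0]->pr_lwpid] = timestamp;
}

/* Thread is taken off the run queue: add up the time it was queued. */
sched:::dequeue
/args[1]->pr_pid == $1 && qstart[args[0]->pr_lwpid]/
{
	@queued = sum(timestamp - qstart[args[0]->pr_lwpid]);
	qstart[args[0]->pr_lwpid] = 0;
}

/* Illustrative interval, chosen to match "prstat ... 5". */
profile:::tick-5sec
{
	normalize(@queued, 1000000);	/* nanoseconds to milliseconds */
	printa("time on run queue (ms): %@d\n", @queued);
	clear(@queued);
}
```

Run it as "dtrace -s qtime.d 10997". Dividing the printed milliseconds by the
5000 ms interval gives a percentage to set beside prstat's LAT column, on the
assumption that LAT and run-queue time measure roughly the same thing.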
However, I did spot a couple of things in pfilestat that might be improved,
like this:
--- /opt/DTT/Bin/pfilestat Wed Aug 1 11:01:38 2007
+++ /tmp/pfilestat Sat Jun 28 03:47:48 2008
@@ -184,7 +184,8 @@
  sched:::on-cpu
  /pid == PID/
  {
-        @logical["waitcpu", (uint64_t)0, ""] = sum(timestamp - last);
+        @logical[probefunc == "resume" ? "running" : "waitcpu",
+            (uint64_t)0, ""] = sum(timestamp - last);
         totaltime += timestamp - last;
         last = timestamp;
  }
@@ -233,7 +234,8 @@
  sched:::enqueue
  /args[1]->pr_pid == PID/
  {
-        @logical["waitcpu", (uint64_t)0, ""] = sum(timestamp - last);
+        @logical[pid == args[1]->pr_pid ? "running" : "sleep",
+            (uint64_t)0, ""] = sum(timestamp - last);
         totaltime += timestamp - last;
         last = timestamp;
  }
If you don't mind, try that out and let us know how it then compares to
prstat -m.
cheers,
Brendan
--
Brendan
[CA, USA]
_______________________________________________
dtrace-discuss mailing list