G'Day Folks,

On Fri, Jan 18, 2008 at 04:52:58PM -0500, Bryan Cantrill wrote:
> 
> On Fri, Jan 18, 2008 at 04:33:17PM -0500, Colin Burgess wrote:
> > I see Intel has released a new tool.  Oh, it requires some patches to 
> > the kernel to record
> > latency times.  Good thing people don't mind patching their kernels, eh?
> > 
> > So who can write the equivalent latencytop.d the fastest? ;-)
> 
> Brendan, I'm sure -- if he hasn't by the time I'm writing this...

Aw, come on - I've barely had half an hour since I saw the message -
couldn't you have waited a bit longer before tempting me to reply? :)

It has about 40 statically defined trace points and I've coded 8 so far.
A bunch of these are specific to Linux, so I'd need to think of what was
the closest useful match for Solaris.

Anyway, here's where I'm at:

# ./latencytop.d 
Tracing... Hit Ctrl-C to end.
  Cause                                   Count        Maximum        Average
  mutex lock                                 10        80 usec        30 usec
  Reading from file                         450     22822 usec      1900 usec

  Cause                                   Count        Maximum        Average
  mutex lock                                  2        11 usec        10 usec
  Reading from file                         474     26716 usec      1836 usec

  Cause                                   Count        Maximum        Average
  mutex lock                                  2        85 usec        48 usec
  Reading from file                         392     29972 usec      2312 usec

  Cause                                   Count        Maximum        Average
  mutex lock                                 19      1379 usec       574 usec
  Writing to file                           162     43011 usec      1655 usec
  Reading from file                         302     80847 usec      2945 usec

  Cause                                   Count        Maximum        Average
  mutex lock                                  1         8 usec         8 usec
  Reading from file                         468     22978 usec      1813 usec
[...]

Yes, I added a "Count" column since it proved of limited use without one.

Of course, bounding this to Count/Maximum/Average is also of limited use -
I'd really like to include distribution plots as well.

Source so far is (hot off the keyboard, so beware of typos):
--------------------------------------------------------------
#!/usr/sbin/dtrace -Cs

#pragma D option quiet

#define RW_READER 0x1

dtrace:::BEGIN
{
        printf("Tracing... Hit Ctrl-C to end.\n");
}

/*
 * Reading from file
 * Writing to file
 */
io:::start
{
        start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
}

io:::done
/start[args[0]->b_edev, args[0]->b_blkno]/
{
        this->delta = timestamp - start[args[0]->b_edev, args[0]->b_blkno];
        this->cause = args[0]->b_flags & B_READ ?
             "Reading from file" : "Writing to file";
        @lcount[this->cause] = count();
        @lmax[this->cause] = max(this->delta);
        @lavg[this->cause] = avg(this->delta);
}

/*
 * mutex lock
 */
lockstat:::adaptive-block
{
        @lcount["mutex lock"] = count();
        @lmax["mutex lock"] = max(arg1);
        @lavg["mutex lock"] = avg(arg1);
}

/*
 * rwsem read lock
 * rwsem write lock
 */
lockstat:::rw-block
{
        this->cause = arg2 == RW_READER ?
            "rwsem read lock" : "rwsum write lock";
        @lcount[this->cause] = count();
        @lmax[this->cause] = max(arg1);
        @lavg[this->cause] = avg(arg1);
}

/*
 * Unlinking file
 */
syscall::unlink:entry
{
        self->start_unlink = timestamp;
}

syscall::unlink:return
/self->start_unlink/
{
        this->delta = timestamp - self->start_unlink;
        @lcount["Unlink file"] = count();
        @lmax["Unlink file"] = max(this->delta);
        @lavg["Unlink file"] = avg(this->delta);
        self->start_unlink = 0;
}

/*
 * sync system call
 */
syscall::syssync:entry
{
        self->start_syssync = timestamp;
}

syscall::syssync:return
/self->start_syssync/
{
        this->delta = timestamp - self->start_syssync;
        @lcount["sync system call"] = count();
        @lmax["sync system call"] = max(this->delta);
        @lavg["sync system call"] = avg(this->delta);
        self->start_syssync = 0;
}

/*
 * fsync system call
 */
syscall::fdsync:entry
{
        self->start_fdsync = timestamp;
}

syscall::fdsync:return
/self->start_fdsync/
{
        this->delta = timestamp - self->start_fdsync;
        @lcount["fsync system call"] = count();
        @lmax["fsync system call"] = max(this->delta);
        @lavg["fsync system call"] = avg(this->delta);
        self->start_fdsync = 0;
}

/*
 * Print output summary
 */
profile:::tick-1s,
dtrace:::END
{
        normalize(@lmax, 1000);
        normalize(@lavg, 1000);

        printf("  %-32s %12s %14s %14s\n", "Cause", "Count", "Maximum",
            "Average");
        printa("  %-32s [EMAIL PROTECTED] [EMAIL PROTECTED] usec [EMAIL 
PROTECTED] usec\n", @lcount, @lmax, @lavg);
        printf("\n");

        trunc(@lcount);
        trunc(@lmax);
        trunc(@lavg);
}
--------------------------------------------------------------

I'll add some more later - I have other work to be doing right now. :)
If anyone would like to work on some of the other metrics, please do and
post the bits here.

> (Of course, that said it's a pretty lame tool.  Isn't obvious that this
> is providing such paltry information that it's raising more questions than
> answers?  Fine, better than nothing -- but not much better...)

Yes, not that impressive.  Although getting customers to look and think
about latency is certainly worthwhile, and the start of the road to DTrace.

Brendan

-- 
Brendan
[CA, USA]
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org

Reply via email to