Hey Rafael,
Rafael Vanoni wrote:
> I've been working on DTrace scripts to figure out exactly what's causing
> CPU wakeups and would like to get your opinion on this one.
>
> Since idle-state-transition is fired by the scheduler, there's no way of
> associating the offender with the transition itself. That is, the thread
> that runs through idle-state is not the same one that led to the
> transition.
Right, in fact the thread that runs through idle-state is probably (by
definition) the idle thread. :)
> So I've been working around the following script, which uses
> a global variable to pick up the next thread that goes online after the
> transition.
>
> #!/usr/sbin/dtrace -s
> #pragma D option quiet
>
> int up, res;
>
> sdt:::idle-state-transition
> /arg0 != 0/
> {
> self->start = timestamp;
> }
>
> sdt:::idle-state-transition
> /arg0 == 0 && self->start/
> {
> res = (timestamp - self->start);
> up = 1;
> }
>
> syscall:::entry
> /up == 1 && pid != 0/
> {
> printf("0 %d %d %d %d %d %s %s\n",
> cpu,
> pid,
> tid,
> timestamp,
> res,
> execname,
> probefunc);
> self->start = 0;
> up = 0;
> res = 0;
> }
>
>
> The problem is somewhat obvious. There's no guarantee that the next
> thread is the actual offender. It's very likely, but not guaranteed :(
>
> Any thoughts?
>
I think the approach is probably OK. Like you said, the trick is to
figure out what's causing the idle-state transition, and earlier I think
we figured it was one of:
1. device interrupt
2. "poke" because something became runnable on the local queue
3. "poke" because something became runnable on a remote queue that
   was already busy
4. xcall for some other reason
- You might want to use the sched:::on-cpu probe instead of
syscall:::entry to catch the first thread that this CPU is now going to
run (what if the next thread is a kernel service thread?). If that's the
next probe to fire, it's probably scenario 2.
- If the sched:::dequeue probe fires before sched:::on-cpu, it's probably
scenario 3.
- If an interrupt probe fires before the idle-state transition probe
(which will happen, since the CPU wakes up into the interrupt handler),
then it's probably scenario 1.
Then I would be curious how much is left...which I assume would fall
into the scenario 4 bucket. :)
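Something along these lines might be a starting point. It's a rough,
untested sketch, not a finished script. The variable names and the
aggregation labels are made up. It assumes arg0 == 0 means "leaving idle"
(as in your script), that sdt:::interrupt-start is available on your
platform, and that a local dequeue can also show up before on-cpu, which
is why the remote case checks args[2]->cpu_id instead of just looking for
any dequeue. State is kept per CPU rather than in a single global, so
wakeups on different CPUs don't step on each other.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Per-CPU state, indexed by the cpu built-in (names are made up). */
int was_idle[int];	/* CPU is currently in an idle state */
int saw_intr[int];	/* an interrupt started while the CPU was idle */
int pending[int];	/* CPU left idle; wakeup not yet classified */

/* Back to idle with nothing classified: whatever is left (scenario 4). */
sdt:::idle-state-transition
/arg0 != 0 && pending[cpu]/
{
	@why["4: other (xcall?)"] = count();
}

/* Entering idle: reset this CPU's state. */
sdt:::idle-state-transition
/arg0 != 0/
{
	was_idle[cpu] = 1;
	saw_intr[cpu] = 0;
	pending[cpu] = 0;
}

/* Assumption: sdt:::interrupt-start exists on this platform. */
sdt:::interrupt-start
/was_idle[cpu]/
{
	saw_intr[cpu] = 1;
}

/* Scenario 1: an interrupt fired before we left idle. */
sdt:::idle-state-transition
/arg0 == 0 && was_idle[cpu] && saw_intr[cpu]/
{
	@why["1: device interrupt"] = count();
	was_idle[cpu] = 0;
}

/* Otherwise wait for the next scheduler probe on this CPU. */
sdt:::idle-state-transition
/arg0 == 0 && was_idle[cpu]/
{
	was_idle[cpu] = 0;
	pending[cpu] = 1;
}

/* Scenario 3: this CPU pulls a thread off another CPU's queue. */
sched:::dequeue
/pending[cpu] && args[2]->cpu_id != cpu/
{
	@why["3: runnable on remote queue"] = count();
	pending[cpu] = 0;
}

/* Scenario 2: first thread on-cpu with no remote dequeue before it. */
sched:::on-cpu
/pending[cpu]/
{
	@why["2: runnable on local queue"] = count();
	pending[cpu] = 0;
}
```

Run it for a while and hit Ctrl-C; the @why aggregation prints the count
per bucket on exit. If you also want the offender, adding execname and
pid as extra aggregation keys in the on-cpu clause should do it.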
Thanks,
-Eric