http://sources.redhat.com/bugzilla/show_bug.cgi?id=10799

Here is a stap puzzler:
Why does the output of these two scripts differ so much?

$ stap -e 'global syscalls=0; probe syscall.* { syscalls++ } probe end {
printf("syscalls: %d\n", syscalls) }' -c 'sleep 5'
syscalls: 22492

$ stap -e 'global count=0; probe syscall.* { count++ } probe end {
printf("syscalls: %d\n", count) }' -c 'sleep 5'
syscalls: 5

Note that while the number of syscalls reported by the first script fluctuates
somewhat depending on what is being run, the second always reports exactly 5.

Looking at the generated source code, we see two probe_ functions for the first
variant: one that hooks each syscall kernel.function, and one for the end
probe. But for the second there are more...

In the second case each syscall probe alias that defines a local variable
'count' generates a new probe_ function body. In such a function body we see
something curious:

 {
   (void) 
   ({
     l->__tmp0 = 
     ({
       function__dwarf_tvar_get_count_1410 (c);
       if (unlikely(c->last_error)) goto out;
       c->locals[c->nesting+1].function__dwarf_tvar_get_count_1410.__retvalue;
     });
     global.s_count = l->__tmp0;
     l->__tmp0;
   });
   
   (void) 
   ({
     l->__tmp3 = global.s_count;
     global.s_count += 1;
     l->__tmp3;
   });
   
 }

Not only is the local dwarf count variable retrieved, but global.s_count is
first assigned that value before 1 is added to it. Whoa!

------- Additional Comment #1 From Frank Ch. Eigler 2009-10-17 12:21 -------
It's a namespace collision.  Some syscall alias must be defining
the "count" value, so its value gets reset every time that syscall
gets hit.

We could warn against this by detecting if a probe alias variable
could resolve to a global.

------- Additional Comment #2 From Mark Wielaard 2009-10-17 14:16 -------
(In reply to comment #1)
> It's a namespace collision.
> 
> We could warn against this by detecting if a probe alias variable
> could resolve to a global.

Yes, definitely. This issue is very surprising if you are probing aliases with
a wildcard. Depending on which aliases are matched, your results could be
totally off when one of them happens to define a local variable with the same
name as a global in your script, since that variable then resets the global.
