On 7/21/2010 2:18 AM, Adam Leventhal wrote:
Not generally possible: I guess this is technically true, thanks to
trampolines, tail calls, and JMPL whose source registers have probably been
clobbered. However, we should at least be able to detect vanilla CALL reliably
by decoding %i7 (I've never heard of a compiler clobbering it). The attached
scripts do this. The dtrace script reads in the instruction at %i7, decodes it
and computes the target when possible, and emits the result at the head of each
stack trace. The awk script (which takes the dtrace script's output as input)
then checks whether %i7's target points to the same function as the leaf's
caller and decides whether to filter it out.
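
For anyone who wants to see the decode step without opening the attachments, here is a minimal sketch in Python. It is not the attached dtrace script, only an illustration of the same idea, and it assumes the usual SPARC convention that %i7 holds the address of the CALL instruction itself. The function name and the addr_bits parameter are made up for the example. A CALL has op = 01 in the top two bits and a signed 30-bit word displacement in the rest; anything else (JMPL included) is reported as undecodable here.

```python
def decode_call_target(pc, insn, addr_bits=64):
    """Return the target of a SPARC CALL located at address pc.

    pc is the value read from %i7 and insn is the 32-bit word stored there.
    Returns None when insn is not a CALL (for example a JMPL, whose target
    depends on register contents that may no longer be available).
    addr_bits is a hypothetical knob for 32-bit vs 64-bit address wraparound.
    """
    # Format 1 instructions (CALL) have op == 0b01 in bits 31:30.
    if (insn >> 30) & 0x3 != 0b01:
        return None

    # Bits 29:0 hold disp30, a signed displacement counted in 4-byte words.
    disp30 = insn & 0x3FFFFFFF
    if disp30 & 0x20000000:      # sign bit of the 30-bit field
        disp30 -= 1 << 30

    # The displacement is relative to the address of the CALL itself.
    return (pc + 4 * disp30) & ((1 << addr_bits) - 1)
```

For example, the word 0x40000004 found at address 0x10000 decodes to a CALL whose target is 0x10010.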
It is not generally possible to determine when the probe fires whether or not
the thread is currently executing in leaf context. All symbolic translations
happen in user-land after the fact for both ustack() and uaddr().
I'm afraid you lost me. What does symbolic translation have to do with
detecting the presence of a stack frame? I could imagine it being
possible to stash this info in debug sections (at least for dwarf2), but
I don't think that's what you're referring to here.
By 'leaf' I just meant 'the top of the user stack' (as in the analyzer's
'leaf is' predicate) but maybe there's a more precise definition of leaf
I wasn't aware of?
Running the two scripts on a process, I got 13.5k total samples, with 214
of those unable to read %i7 for some reason; of the valid samples, 8.2k had
%i7's target point to a different function than the leaf.
I've checked source code for a smattering of the samples, and every time the
i7-enhanced stack trace is the correct one.
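
The post-processing step can be sketched roughly as follows. This is not the attached awk script, just a hypothetical Python illustration under two assumptions: the stack is given as a list of function names with the leaf first, and the decoded %i7 target has already been resolved to a function name (or None when %i7 could not be read or decoded). The name repair_stack is made up for the example.

```python
def repair_stack(frames, i7_target_func):
    """Insert the caller that ustack() missed, if %i7 reveals one.

    frames: function names as reported by ustack(), leaf first.
    i7_target_func: function containing the decoded CALL target of %i7,
                    or None if %i7 could not be read or decoded.
    """
    if not frames or i7_target_func is None:
        return list(frames)          # nothing to go on, keep the trace as-is

    leaf = frames[0]
    if i7_target_func == leaf:
        # The CALL at %i7 targets the leaf itself, so the leaf already has
        # its own frame and the reported trace is complete.
        return list(frames)

    # Otherwise %i7 still belongs to the real caller: the CALL it points at
    # targets that caller, which ustack() skipped over.
    return [leaf, i7_target_func] + list(frames[1:])
```

With the foo -> bar -> baz example, a trace reported as baz, foo with an %i7 target in bar becomes baz, bar, foo, while a trace whose %i7 target is baz is left alone.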
Maybe this won't ever get baked into dtrace, but at least there's a workaround
now if other folks need it.
What we could do is have the ustack() action record %o7 as well and then figure
out in user-land whether or not it's relevant.
Actually, I tried %o7 first, and it's pretty dodgy -- once a stack frame
has been created, it can hold anything. CC seems to use it a lot for
bit-shifting and as the target of JMPL. In contrast, %i7 is (nearly)
always valid.
Also, it might be good to update the wiki with this gotcha. Missing tail calls
is one thing... silently missing legitimate callers who made a stack frame is
pretty annoying (and makes profiling significantly less useful if you're
hunting for the caller of that expensive function).
Good idea. Propose some changes to the list and we'll get it reviewed.
Two changes, I think. BTW, is this a sparc-only issue or would it also
affect x86?
http://wikis.sun.com/display/DTrace/Actions+and+Subroutines#ActionsandSubroutines-%7B%7Bustack%7D%7D,
needs something resembling:
Limitations: Because ustack() must traverse stack frames to build its
stack trace, functions which do not establish a stack frame can lead to
unpredictable results. In particular:
* Functions making tail calls will not appear because they tear down
their own stack frame before making the call.
* Except inside function entry probes, leaf functions which have not
(yet) established a stack frame sometimes prevent their caller
from appearing in the stack trace (e.g. foo -> bar -> baz will
appear as foo -> baz). See <link-to-note-at-profile-provider>.
http://wikis.sun.com/display/DTrace/profile+Provider needs a new section
at the end (before 'Stability'):
Limitations:
ustack() only reports the caller of a leaf function if the latter has
established a stack frame. Otherwise, the caller's caller appears to
call the leaf function (e.g. foo -> bar -> baz will be reported as foo ->
baz). This behavior occurs because dtrace has no way to determine
whether the current top of stack has established a frame or not. DTrace
can avoid the problem with function entry probes because they always
fire before the target has established a stack frame, and with the
return address in a known location, neither of which is necessarily
true in profile probe context.
Thoughts?
Ryan
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org