On 13/02/2015 23:26, Brendan Gregg wrote:
Thanks Bertrand and John for explaining the invokedynamic issue, and
Vladimir for filing the bug.
I'll reply here (I don't have a JBS account; I would like one!).
The profilers I'm using (Linux perf, and Solaris DTrace) can already
handle a broken RBP, and we see this all the time when profiling
OracleJDK today (eg, as a flame graph:
http://www.slideshare.net/brendangregg/netflix-from-clouds-to-roots/66).
Including an option (eg, -XX:+NoOmitFramePointer, or
-XX:+ReduceOmitFramePointer, or -XX:+MoreFramePointer) which improved
RBP profiling (like my patch) would have great value for us. I'm fine
with a profiler not working 100% of the time, provided we understand
that there is an error margin, and why (Bertrand and John's
descriptions), when interpreting the profiles. Any of these options could also be
improved as follow-on changes, if and when needed.
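For readers less familiar with why a "broken RBP" only truncates a profile rather than crashing the profiler, here is a minimal sketch of frame-pointer stack walking. It is a toy model, not the code used by perf or DTrace: the function name `walk_frame_pointers`, the dictionary standing in for stack memory, and all addresses are assumptions made up for illustration. It assumes the conventional x86-64 layout where `[rbp]` holds the caller's RBP and `[rbp + 8]` holds the return address.

```python
WORD = 8  # bytes per stack slot on x86-64


def walk_frame_pointers(memory, rbp, stack_lo, stack_hi, max_frames=64):
    """Follow the saved-RBP chain and collect return addresses.

    `memory` is a dict mapping addresses to 64-bit words (a stand-in for
    reading the target's stack). The walk stops as soon as RBP stops
    looking like a frame pointer, which is what a walker sees when
    compiled code has reused RBP as a general-purpose register.
    """
    frames = []
    while len(frames) < max_frames:
        in_bounds = stack_lo <= rbp < stack_hi and rbp % WORD == 0
        if not in_bounds or rbp not in memory or rbp + WORD not in memory:
            break  # not a plausible frame pointer: stop with a partial stack
        saved_rbp = memory[rbp]
        frames.append(memory[rbp + WORD])  # return address of this frame
        if saved_rbp <= rbp:
            break  # the chain must move toward the stack base
        rbp = saved_rbp
    return frames


def build_stack():
    """Three well-formed frames; addresses and return PCs are made up."""
    return {
        0x1000: 0x1040, 0x1008: 0xAAA1,  # innermost frame
        0x1040: 0x1080, 0x1048: 0xAAA2,
        0x1080: 0x0,    0x1088: 0xAAA3,  # outermost frame
    }


intact = walk_frame_pointers(build_stack(), 0x1000, 0x1000, 0x2000)

clobbered = build_stack()
clobbered[0x1000] = 0x12345  # saved-RBP slot now holds unrelated data
truncated = walk_frame_pointers(clobbered, 0x1000, 0x1000, 0x2000)
```

In this model the intact stack yields all three return addresses, while the clobbered one yields only the innermost frame. That is the "error margin" case: the sample is still attributed to the right leaf, but its ancestry is lost.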
As long as it is clear that RBP can be misleading and that trying to fix
that would be an RFE, not a bug, I have no objection.
I'll let official Reviewers see whether a command line option is
necessary (e.g. whether there are concerns about possible performance
regressions when RBP cannot be used by the register allocator).
Regards,
Bertrand.
I haven't had a chance yet to prototype more (eg, option processing).
There's also work happening in Linux (two projects on lkml this week,
one by Stephane Eranian and another by Carl Love) for improving Java JIT
symbol support in perf_events. I think there will be more demand for
system stack walking, as perf gets more symbol translation options.
Brendan
On Thu, Jan 15, 2015 at 9:50 AM, Vladimir Kozlov
<vladimir.koz...@oracle.com> wrote:
Thank you, Bertrand and John
I added this conversation to the bug report.
Thanks,
Vladimir
On 1/15/15 3:13 AM, Bertrand Delsart wrote:
On 14/01/2015 20:12, John Rose wrote:
On Jan 14, 2015, at 6:42 AM, Bertrand Delsart
<bertrand.dels...@oracle.com> wrote:
I would not prevent the JITs from using RBP as long as the changeset
is not sufficient to guarantee the profiling will work... and IMHO
solving the JSR292 issue will be much more intrusive (impacting
HotSpot stack walking code).
Here are some thoughts on that.
SPARC uses L7 (L7_mh_SP_save) for the same purpose of method handle
support as x86 uses RBP (rbp_mh_SP_save). So there's not a hard
requirement for x86 to take over RBP.
(Deep background: This purpose, in method handle support, is to allow
an adapter to make changes to the caller's SP. The adapter is the
initial callee from the caller, but may change argument shape, and
tail-calls the ultimate callee. Because it is a tail-call, the original
caller must have a spot where his original SP can be preserved. The
preservation works because the original caller knows he is calling a
MH.invoke method, which requires the extra argument preservation. The
repertoire of argument shape changes is quite small, actually; it is not
a very general mechanism since the LF machinery was put in. Perhaps the
whole thing could be removed somehow, by finding alternative techniques
for the few remaining changes. OTOH, this SP-restoring mechanism may be
helpful in doing a more general tail-call mechanism, and perhaps in
managing int/comp mode changes more cleanly, so I'd like us to keep it.
And document it better.)
Any register or stack slot will do for this purpose, as long as (i) its
value can be recovered after the MH.invoke call returns to the caller,
and (ii) its value can be dug up somehow during stack walking. There
are only a couple of places where stack walking code needs to sample the
value, so they should be adjustable.
Both x86 and SPARC use registers which are callee-save (or "non-volatile
across calls") which satisfy properties (i) and (ii). A standard stack
slot (addressed based on caller's RBP) would probably also satisfy those
properties.
A variably-positioned stack slot would also work, which would require
registering the position in each CodeBlob. That's unpleasant extra
detail, but it would align somewhat with the current logic which allows
each CodeBlob (nmethod, actually) to advertise which call sites need the
special processing (see the function
is_method_handle_return(caller_pc)).
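To make the "register the position in each CodeBlob" idea concrete, here is a small sketch of what such per-blob bookkeeping could look like. It is a toy model and not HotSpot code: the class `CodeBlobModel`, its fields, and the method `saved_sp_address` are hypothetical names invented for illustration. Only the name `is_method_handle_return` comes from the text above, and its body here is an assumption (a simple set lookup).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class CodeBlobModel:
    """Hypothetical per-blob metadata for the variably-positioned slot."""

    # Offset (in bytes, relative to the frame's RBP) of the slot holding
    # the saved SP, or None if this blob makes no MH.invoke calls.
    mh_sp_slot_offset: Optional[int] = None
    # Return addresses of the call sites that need the special processing.
    mh_return_pcs: Set[int] = field(default_factory=set)

    def is_method_handle_return(self, return_pc: int) -> bool:
        # Toy stand-in: a real implementation would consult the blob's
        # own call-site records rather than a Python set.
        return return_pc in self.mh_return_pcs

    def saved_sp_address(self, rbp: int, return_pc: int) -> Optional[int]:
        """Address a stack walker would read to recover the caller's SP."""
        if self.mh_sp_slot_offset is None:
            return None
        if not self.is_method_handle_return(return_pc):
            return None
        return rbp + self.mh_sp_slot_offset


# Made-up values: one MH.invoke call site returning to 0x7F00, with the
# slot placed 24 bytes below this blob's RBP.
blob = CodeBlobModel(mh_sp_slot_offset=-24, mh_return_pcs={0x7F00})
slot_for_mh_call = blob.saved_sp_address(0x2000, 0x7F00)
slot_for_plain_call = blob.saved_sp_address(0x2000, 0x7F08)
```

The extra detail John calls unpleasant is visible here: every blob carries its own offset, and the walker has to look it up per frame instead of using one constant.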
I recommend reserving a dead word of space in every stack frame that
makes MH.invoke calls, at a fixed position relative to that frame's RBP.
— John
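A minimal sketch of the fixed-slot recommendation above, showing how it would satisfy properties (i) and (ii). This is a toy simulation, not a proposal for the actual C1/C2 code: the constant `MH_SP_SAVE_OFFSET`, the function names, and all addresses are assumptions chosen for illustration, and a dict stands in for stack memory.

```python
WORD = 8  # bytes per stack slot on x86-64

# Hypothetical fixed position of the reserved word, relative to the RBP
# of any frame that makes MH.invoke calls.
MH_SP_SAVE_OFFSET = -2 * WORD


def save_sp_before_mh_invoke(memory, rbp, sp):
    """Caller side: stash the original SP in the reserved word."""
    memory[rbp + MH_SP_SAVE_OFFSET] = sp


def restore_sp_after_mh_invoke(memory, rbp):
    """Property (i): the caller recovers its SP after the call returns."""
    return memory[rbp + MH_SP_SAVE_OFFSET]


def caller_sp_during_stack_walk(memory, caller_rbp, observed_sp, at_mh_return):
    """Property (ii): a stack walker can dig up the caller's original SP.

    `at_mh_return` models the answer to "is this return address a method
    handle call site?"; for ordinary call sites the observed SP is used.
    """
    if at_mh_return:
        return memory[caller_rbp + MH_SP_SAVE_OFFSET]
    return observed_sp


memory = {}
caller_rbp, caller_sp = 0x2000, 0x1F80  # made-up frame
save_sp_before_mh_invoke(memory, caller_rbp, caller_sp)

# The adapter changes the argument shape, so the SP seen while the
# ultimate callee runs no longer matches the caller's original SP.
sp_after_adapter = caller_sp - 3 * WORD

walked_sp = caller_sp_during_stack_walk(
    memory, caller_rbp, sp_after_adapter, at_mh_return=True
)
restored_sp = restore_sp_after_mh_invoke(memory, caller_rbp)
```

Because the offset is a single constant, the walker needs no per-CodeBlob lookup, at the cost of one dead word in each affected frame.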
I perfectly agree that it is doable (and with your proposed approach).
I just wanted to be sure people were aware that the RFE is more complex
than what the current changeset may suggest. We are not just talking
about reviewing and integrating a complete changeset contributed by the
community. There is more work needed, either by the community or by
Oracle. This will require changes at least in C1 and C2 call sequences,
in the stack walking, in the creation and sizing of compiled frames...
Regards,
Bertrand.
--
Bertrand Delsart, Grenoble Engineering Center
Oracle, 180 av. de l'Europe, ZIRST de Montbonnot
38330 Montbonnot Saint Martin, FRANCE
bertrand.dels...@oracle.com Phone : +33 4 76 18 81 23