Re: How badly do JFR stack traces lie?

Nitsan Wakart Sun, 03 Dec 2017 03:39:07 -0800

As apangin points out, there are cases where JFR cannot walk the stack 
safely. To add insult to injury, JFR does not report failed samples at all, 
which results in the systematic omission of certain methods from the profile. 
This is a massive reporting issue in my opinion, and it has not been fixed in 
JDK 9. I have discussed it with members of the JFR team; hopefully it will be 
fixed in the near future. Honest Profiler and Async-Profiler are both 
significantly better in that regard.
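
To make the effect of silently dropped samples concrete, here is a toy 
calculation. It is not JFR code: the method names and sample counts are made 
up, and it only assumes that failed samples are discarded rather than reported.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy arithmetic only; method names and counts are hypothetical, not JFR data. */
final class DroppedSamples {

    /**
     * Share of the reported profile per method when samples whose stack walk
     * failed are silently discarded instead of being reported as failures.
     */
    static Map<String, Double> reportedShare(Map<String, Integer> attempted,
                                             Map<String, Integer> failed) {
        Map<String, Integer> succeeded = new LinkedHashMap<>();
        int total = 0;
        for (Map.Entry<String, Integer> e : attempted.entrySet()) {
            int ok = e.getValue() - failed.getOrDefault(e.getKey(), 0);
            succeeded.put(e.getKey(), ok);
            total += ok;
        }
        Map<String, Double> share = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : succeeded.entrySet()) {
            share.put(e.getKey(), total == 0 ? 0.0 : (double) e.getValue() / total);
        }
        return share;
    }

    public static void main(String[] args) {
        // Both methods are hit by the same number of sample attempts...
        Map<String, Integer> attempted = new LinkedHashMap<>();
        attempted.put("hotLoop", 500);
        attempted.put("everythingElse", 500);
        // ...but most stack walks that land in hotLoop fail (assumed numbers).
        Map<String, Integer> failed = new LinkedHashMap<>();
        failed.put("hotLoop", 400);
        System.out.println(reportedShare(attempted, failed));
    }
}
```

In this made-up case hotLoop accounts for half of the attempted samples, but 
for only 100 of the 600 reported ones, roughly 17% of the visible profile.
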
On top of that, JFR and other Java-level profilers rely on the debug 
information provided by the JVM to translate the sampled program counter (PC) 
to a Java bytecode index (BCI) and the relevant stack trace. The debug 
information provided by default is quite sparse, and can be greatly improved 
with -XX:+DebugNonSafepoints. Even then, the translation can be lacking or 
misleading. This is due in part to certain compiler optimisations not creating 
the relevant mapping information. In any case the information is often 
incomplete, and where a mapping is not available the nearest mapping is 
taken (e.g. no bytecode is associated with PC, but there's a mapping available 
for PC+10, so that BCI is reported). The mapping of a single instruction to a 
BCI is also at times incorrect, as the instruction at that PC may in fact be 
the combined result of many bytecodes.
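
The "nearest mapping" fallback can be sketched roughly as follows. This is a 
toy model, not HotSpot's actual data structures or lookup code: the class 
name, the offsets and the BCIs are all hypothetical, and a sorted map stands 
in for the JVM's debug records.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of PC-to-BCI resolution; not how HotSpot stores debug info. */
final class NearestDebugInfo {
    // Hypothetical debug records: code offset (PC) -> bytecode index (BCI).
    private final TreeMap<Integer, Integer> pcToBci = new TreeMap<>();

    void record(int pc, int bci) {
        pcToBci.put(pc, bci);
    }

    /**
     * Returns the BCI recorded at pc or, if there is none, the BCI of the
     * nearest record at a higher pc. Returns -1 when no such record exists.
     */
    int resolve(int pc) {
        Map.Entry<Integer, Integer> nearest = pcToBci.ceilingEntry(pc);
        return nearest == null ? -1 : nearest.getValue();
    }

    public static void main(String[] args) {
        NearestDebugInfo info = new NearestDebugInfo();
        info.record(0x10, 3);
        info.record(0x2A, 17);
        // Nothing is recorded at 0x20, but there is a record at 0x20 + 10,
        // so the sample is reported against that record's BCI.
        System.out.println(info.resolve(0x20)); // prints 17
    }
}
```

The sparser the records, the further the reported BCI can sit from the code 
that was actually executing, which is why the extra records matter.
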
Furthermore, instruction profiling itself suffers from certain inaccuracies 
(skid), which lead to the reported PC being a few (normally 1-10, but on rare 
occasions a lot more) instructions after the instruction where most of the 
actual cost is.
The above complications are compounded by method inlining, which results in 
the compiler mixing code from several methods together to generate a single 
'real method'. So where before you could skid a few instructions, map to the 
wrong BCI, but still end up looking at the right method, with inlining you can 
easily skid between lines in different methods.
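
A rough illustration of how skid and inlining interact, again as a toy model: 
the region layout, the method names and the skid value are invented, and 
offsets are counted in instructions for simplicity.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one compiled 'real method' built from inlined pieces. */
final class InlineSkid {
    // Hypothetical layout: first instruction of a region -> the Java method
    // whose bytecode that region was generated from.
    private static final TreeMap<Integer, String> REGIONS = new TreeMap<>();
    static {
        REGIONS.put(0, "Outer.loop");
        REGIONS.put(40, "Inner.compute"); // inlined into Outer.loop
        REGIONS.put(48, "Outer.loop");    // back in the caller's own code
    }

    /** Method a sample at the given instruction offset is attributed to. */
    static String attribute(int pc) {
        Map.Entry<Integer, String> region = REGIONS.floorEntry(pc);
        return region == null ? "<unknown>" : region.getValue();
    }

    /** Attribution when the reported PC trails the costly one by skid. */
    static String attributeWithSkid(int costlyPc, int skid) {
        return attribute(costlyPc + skid);
    }

    public static void main(String[] args) {
        System.out.println(attribute(44));            // Inner.compute
        System.out.println(attributeWithSkid(44, 6)); // Outer.loop
    }
}
```

A skid of six instructions is well within the normal range, yet here it is 
enough to move the blame from the inlined callee to its caller.
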
This issue exists in all profilers when reporting a Java line of code or 
method. Instruction-level profilers will show the skid within the context of 
the real method, but will provide enough context, IME, for you to find the 
root cause.
The potential for error is quite large, and errors do happen. This does not 
render JFR useless, but it helps to be aware of the above and reach for other 
tools (honest-profiler, async-profiler, perf, Oracle Studio, VTune, etc.) when 
the data seems suspect. Definitely start by enabling DebugNonSafepoints 
(-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints).



> On 2 Dec 2017, at 07:29, Remko Popma <[email protected]> wrote:
> 
> For background, see https://stackoverflow.com/q/47590263/1446916
> 
> Apangin’s answer seems plausible, would like to hear insights from people on 
> this list. 
> 
> Remko 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

