[ 
https://issues.apache.org/jira/browse/LUCENE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415264#comment-17415264
 ] 

Mark Robert Miller commented on LUCENE-10099:
---------------------------------------------

The async-profiler wiki has some info on that:

“CPU profiling

In this mode profiler collects stack trace samples that include Java methods, 
native calls, JVM code and kernel functions.

The general approach is receiving call stacks generated by perf_events and 
matching them up with call stacks generated by AsyncGetCallTrace, in order to 
produce an accurate profile of both Java and native code. Additionally, 
async-profiler provides a workaround to recover stack traces in some corner 
cases where AsyncGetCallTrace fails.

This approach has the following advantages compared to using perf_events 
directly with a Java agent that translates addresses to Java method names:

- Works on older Java versions because it doesn't require
  -XX:+PreserveFramePointer, which is only available in JDK 8u60 and later.
- Does not introduce the performance overhead from -XX:+PreserveFramePointer,
  which can in rare cases be as high as 10%.
- Does not require generating a map file to map Java code addresses to method
  names.
- Works with interpreter frames.
- Does not require writing out a perf.data file for further processing in user
  space scripts.

If you wish to resolve frames within libjvm, the debug symbols are required.”
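
For anyone comparing this against the plain perf_events route the wiki mentions, it can help to know whether the JVM under test is running with -XX:+PreserveFramePointer at all. A minimal sketch, assuming a HotSpot-based JVM that exposes the com.sun.management diagnostic MXBean; the class and method names here are made up for illustration and are not part of Lucene or async-profiler:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
final class FramePointerCheck {

    /** Returns "true" or "false" for the flag, or "unavailable" if this JVM does not expose it. */
    static String preserveFramePointer() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            // Throws IllegalArgumentException when the option does not exist on this JVM.
            return bean.getVMOption("PreserveFramePointer").getValue();
        } catch (IllegalArgumentException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("PreserveFramePointer=" + preserveFramePointer());
    }
}
```

Per the wiki text above, async-profiler should not need this flag, so "false" is the expected default when profiling that way.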


> Add -Ptests.asyncprofile option.
> --------------------------------
>
>                 Key: LUCENE-10099
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10099
>             Project: Lucene - Core
>          Issue Type: Test
>          Components: modules/test-framework
>            Reporter: Mark Robert Miller
>            Assignee: Mark Robert Miller
>            Priority: Minor
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Man, I don't even remember the last time I contributed an issue to Lucene 
> that was for Lucene. Feels like it has to be near a decade.
> I recently finished some work that took quite an investment around automated 
> performance aids. It wasn't giving me that feeling of satisfaction I faintly 
> remember though. Yesterday, I was mostly feeling nostalgic about past times 
> when I would bring up some nasty inefficiency or ugly problem and not hear 
> about how it doesn't likely matter or doesn't sound right or 'meh' - I'd hear 
> "omg, you're right, this must be fixed" and then "you're an idiot, it's even 
> worse than that".
> So new little project for today - the first of two things I'd like to 
> contribute - async profiler support for the tests.profile option.
> I have a rough draft; it works roughly like tests.profile, but instead of:
> {noformat}
> PROFILE SUMMARY from 174 events (total: 174)
>   tests.profile.mode=cpu
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT       CPU SAMPLES   STACK
> 5.17%         9             java.util.zip.Inflater#inflateBytesBytes()
> 3.45%         6             java.io.FileInputStream#open0()
> 2.30%         4             java.io.UnixFileSystem#getBooleanAttributes0()
> 2.30%         4             java.io.RandomAccessFile#readBytes()
> 1.72%         3             java.io.UnixFileSystem#normalize()
> 1.72%         3             org.apache.lucene.util.automaton.RunAutomaton#<init>()
> 1.72%         3             java.util.Arrays#fill()
> 1.15%         2             java.lang.invoke.DirectMethodHandle$Holder#invokeStatic()
> 1.15%         2             java.lang.StringLatin1#charAt()
> 1.15%         2             java.io.DataOutputStream#writeUTF()
> {noformat}
> you get more like:
> {noformat}
> PROFILE SUMMARY from 368675 events (total: 368675)
>   tests.profile.mode=cpu
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT       CPU SAMPLES   STACK
> 3.89%         14353         #SymbolTable::lookup_dynamic()
> 1.67%         6171          #Interpreter()
> 1.40%         5175          #InstanceKlass::find_method_index()
> 1.34%         4958          #__tls_get_addr()
> 1.13%         4177          #I2C/C2I adapters()
> 1.09%         4022          #resource_allocate_bytes()
> 0.69%         2528          #ValueStack::values_do()
> 0.62%         2298          #no_Java_frame()
> 0.62%         2281          #ScopeDesc::decode_body()
> 0.55%         2017          #Dictionary::find()
> {noformat}
> as well as these additional files shown below. Sampling was a bit intense I 
> guess, sorry for the slow load on the flame graph ;)
> Summary
> https://elated-banach-e02e8b.netlify.app/asyncprofiler/hotspot-pid-686723-id-0-20210912-162040.txt
> Results Tree
> https://elated-banach-e02e8b.netlify.app/asyncprofiler/tree-hotspot-pid-686723-id-0-20210912-162044.html
> Results Flame Graph
> https://elated-banach-e02e8b.netlify.app/asyncprofiler/flame-hotspot-pid-686723-id-0-20210912-162040.html
>  
>  
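
A note on reading the two PROFILE SUMMARY tables quoted above: the PERCENT column appears to be each row's sample count divided by the total event count in the header line. A small sketch to check that arithmetic, assuming two-decimal rounding; the class and method names are hypothetical and not taken from the Lucene build:

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class ProfileSummaryPercent {

    /** Formats samples / totalEvents as a percentage with two decimals, e.g. "5.17%". */
    static String percent(long samples, long totalEvents) {
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * samples / totalEvents);
    }

    public static void main(String[] args) {
        // First table: 9 samples out of 174 events.
        System.out.println(percent(9, 174));        // 5.17%
        // Second table: 14353 samples out of 368675 events.
        System.out.println(percent(14353, 368675)); // 3.89%
    }
}
```

The point of the comparison is the total: 174 events in the first run against 368675 in the second, so the top frame's share is similar while the sample count behind it is far larger.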


