Also, you may want to have a look at JMH [1] and a presentation by Aleksey 
Shipilev on it [2].


Shripad

[1]: http://openjdk.java.net/projects/code-tools/jmh/
[2]: https://www.youtube.com/watch?v=VaWgOCDBxYw
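
For anyone new to it, a minimal JMH benchmark is little more than an annotated method. This is a sketch rather than a standalone program (it needs the jmh-core and jmh-generator-annprocess artifacts on the classpath, and the class/field names here are mine):

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class AdditionBench {

    // Kept in state fields so the JIT cannot constant-fold
    // the whole benchmark body away.
    long a = 1, b = 2, c = 3;

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public long add() {
        return a + b + c;
    }
}
```

JMH handles warm-up, forking, and reporting for you, which sidesteps most of the hand-rolled System.nanoTime() pitfalls discussed further down the thread.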

On Monday, September 25, 2017 at 1:18:40 AM UTC+5:30, Nathan Fisher wrote:
>
> Hi Peter,
>
> Apologies everyone if I'm polluting the mailing list; this isn't the typical 
> latency question.
>
> Thanks Peter; answers to your questions are inline below.
>
> On Sun, 24 Sep 2017 at 17:45, Peter Booth <[email protected]> wrote:
>
>>
>>
>> Nathan,
>>
>>  
>>
>> You mentioned that it was clojure startup time that you want to improve. 
>> Is it a general "all clojure apps" issue or "our clojure apps?"
>>
>
> NF> Not just me, all Clojure apps. It takes about 700ms to load 
> clojure.core from a fat jar and execute "(+ 1 2 3)" (i.e. 1 + 2 + 3). (see 
> https://dev.clojure.org/display/design/Improving+Clojure+Start+Time) 
>  
>
>> What are typical times for the entire startup that you observe? What do 
>> the clojure apps actually do?
>>
> NF> Typically tens of seconds, except for new/small projects, which start 
> in single-digit seconds.
>
>  
>>
>> Some points:
>>
>>  
>>
>> *Precision/noise:* 
>>
>> As Kirk described, calling System.nanoTime() costs about 28 nanos on a 
>> one-year-old Haswell CPU. It just doesn't work to use it to measure 
>> operations that themselves take tens or hundreds of nanos.
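
That overhead is easy to ballpark crudely yourself; the figure varies with CPU and clocksource, so treat the output as indicative only, not as a JMH-quality result:

```java
public class NanoTimeCost {

    static long sink; // accumulated so the JIT can't eliminate the calls

    // Ballpark the per-call cost of System.nanoTime() by timing
    // a tight loop of calls; the first loop is a crude warm-up.
    static long costNanos() {
        final int n = 5_000_000;
        for (int i = 0; i < n; i++) sink += System.nanoTime();
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink += System.nanoTime();
        return (System.nanoTime() - t0) / n;
    }

    public static void main(String[] args) {
        // Typically lands in the tens of nanoseconds on recent hardware.
        System.out.println("approx nanoTime() cost: " + costNanos() + " ns");
    }
}
```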
>>
> NF> Thanks for the clarification. I wasn't actually sure how long the 
> methods were taking, but it did give me the insight to look elsewhere. Naively, 
> can I assume the approach is a usable, albeit crude, technique that could be 
> applied where the latency is much larger (e.g. > 100us)? I was considering 
> using a dynamic proxy with that kind of instrumentation to collect data, but 
> static methods prevent that. I also looked at AOP, but the site was down.
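
For what it's worth, a crude dynamic-proxy timer along those lines might look like the sketch below (names are mine). It only works for interface-typed calls, which is exactly why static methods defeat it, and the numbers are only meaningful when the wrapped call costs far more than nanoTime() itself:

```java
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

public class TimingProxy {

    // Wraps any interface-typed target so each call's wall time is printed.
    @SuppressWarnings("unchecked")
    public static <T> T timed(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                TimingProxy.class.getClassLoader(),
                new Class<?>[] { iface },
                (proxy, method, args) -> {
                    long t0 = System.nanoTime();
                    try {
                        return method.invoke(target, args);
                    } finally {
                        System.out.printf("%s took %d us%n",
                                method.getName(), (System.nanoTime() - t0) / 1_000);
                    }
                });
    }

    public static void main(String[] args) {
        Supplier<String> slow = () -> {
            try { Thread.sleep(5); } catch (InterruptedException e) { throw new RuntimeException(e); }
            return "done";
        };
        Supplier<String> wrapped = (Supplier<String>) timed(Supplier.class, slow);
        System.out.println(wrapped.get());
    }
}
```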
>
>  
>>
>> *Skewing *
>>
>> Martin Thompson alluded to how measurement can skew the behavior of the 
>> underlying system. JMH can’t avoid the Heisenberg effect. perf-map-agent 
>> reduces the Heisenberg cost because you are tracing from outside the process 
>> (but still on the host). Taking measurements out-of-band is the only way I 
>> know to avoid Heisenberg.
>>
>>  
>>
> NF> Yes, I figured this would be an issue. I was instrumenting one method 
> at a time, so it only affected the caller and not the callee I was 
> measuring. It was enough to identify that the method I was measuring at the 
> time of the original e-mail might not yield a huge benefit. The attached 
> flame graph, generated with perf-map-agent, is what I was able to produce for 
> the (+ 1 2 3) example. My possible misinterpretation of the flame graph is 
> that a significant amount of time is spent loading the classes and 
> interpreting the bytecode (e.g. "Interpreter" is both wide and deep on the 
> call stacks). When started, there are around 2000 classes loaded. So I've 
> started looking into what about the class loading is slow. Some 
> thoughts so far are:
>
>    - zip compression level (0 appears to save 40-80ms, which is similar 
>    to the savings when loaded from disk).
>    - class load ordering (e.g. would loading based on a dependency graph 
>    help? would automatically loading a class from the jar as it's streamed 
>    past help? etc.).
>    - static field/execution blocks (Clojure makes heavy use of static 
>    initialisation and fields that could be deferred until after start-up in 
>    a dev scenario).
>    - custom class loader (less inclined towards this, as it introduces 
>    another dependency to "get started").
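
One cheap way to chase the class-loading angle is a delegating class loader that just logs per-class load time, to spot which of those ~2000 classes dominate. A hypothetical sketch (class name is mine), not the custom loader you'd ship:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class TimingClassLoader extends URLClassLoader {

    public TimingClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    // Intercept every lookup (including ones delegated to the parent)
    // and print how long the load took.
    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        long t0 = System.nanoTime();
        try {
            return super.loadClass(name, resolve);
        } finally {
            System.out.printf("%-60s %8d us%n", name, (System.nanoTime() - t0) / 1_000);
        }
    }

    public static void main(String[] args) throws Exception {
        // In practice you'd point the URL[] at the fat jar and load its
        // main class through this loader; here we just demonstrate delegation.
        ClassLoader cl = new TimingClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
        System.out.println(cl.loadClass("java.util.HashMap"));
    }
}
```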
>
>> *Host issues*
>>
>> When you said "spin up a linux box" did you mean a physical box, not a VM 
>> or container?
>>
> NF> VM. It's not something where I'm aiming to achieve microsecond 
> performance and a smooth latency curve; I just want to scratch an itch and 
> see if I can make some improvements.
>  
>
>> I've had a bunch of consulting projects that were different variations on 
>> “performance issues that only occur in environment X or on hardware Y”. It's 
>> common for people to assume “performance is relative: if this is a hotspot 
>> here, it will be a hotspot there”.
>>
>>  
>>
>> All of the points described here require that you have root access to 
>> physical hosts that are representative of your target hardware. In larger 
>> (and some small) shops this isn’t always easy to get.
>>
>> On Saturday, September 23, 2017 at 10:51:52 AM UTC-4, Nathan Fisher wrote:
>>
>>> Thank-you. I ran across an article by Brendan Gregg and was just starting 
>>> to dig into honest-profiler. Looks like I'll spin up a linux box instead to 
>>> use perf-map-agent.
>>>
>>>
>>> http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html
>>>
>>>
>>> On Sat, 23 Sep 2017 at 14:58 Martin Thompson <[email protected]> wrote:
>>>
>>>> This approach to measurement is likely to skew the results. I'd start 
>>>> with perf record via perf-map-agent and then use flame graphs.
>>>>
>>>>
>>>> http://psy-lob-saw.blogspot.co.uk/2017/02/flamegraphs-intro-fire-for-everyone.html
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "mechanical-sympathy" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>> -- 
>>> - sent from my mobile
>>>
>>
> -- 
> - sent from my mobile
>
