Chris Kitching wrote a pretty cool bug comment about micro-optimizations, how they're affected by the Dalvik JIT, and the resultant techniques for benchmarking [1]. The text is reproduced below - thanks Chris!
*Comment 22:*

(In reply to Michael Comella (:mcomella) from comment #21)
> What do you mean by incurs the wrath of the JIT? A link will suffice.
>
> What is an optimal way to test a micro-optimized patch?

The JIT looks to optimise hotspots in code, typically just a couple of basic blocks at a time. Tight loops which do something a million times for benchmarking are certainly going to get its attention (as your program will be spending very nearly all its time in that loop).

The JIT is a (fairly poor, in Android's case) optimising compiler. It often notices that your test function is a waste of time and optimises it out. You then find that your first three(ish) iterations are slow, and the following 999,997 take no time at all (as they're *just* the print statements).

Someone once gave me an r- after doing this. That was a fun conversation :P.

Even if you avoid the hilarious case, the JIT makes things harder to reason about. The JIT in Dalvik really likes to process just a few basic blocks at a time (you can read a little more about this here; bonus points for finding the video):

http://dl.google.com/googleio/2010/android-jit-compiler-androids-dalvik-vm.pdf

Its tendency to do this is a bit of a pain when you have a big, complex system. It's likely the system doesn't *have* any nice hotspots for the JIT to poke with a stick, but it'll probably have thousands of warm-ish bits. These warm-ish bits won't be quite hot enough to get processed promptly, so they may be executed a very large number of times in interpreted mode (and may never be processed at all). It's these warm-spots that benefit particularly well from micro-optimisations, but since their warm-ish-ness isn't preserved under benchmarking in isolation, convincing reviewers you've not lost your mind can be challenging.

To make things even more exciting, the HotSpot JVM usually processes an entire method at a time.
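(Ed: to make the "JIT optimised out your benchmark" failure mode concrete, here is a minimal sketch. It is not from the bug thread; the class, method names, and iteration counts are illustrative. The naive loop's result is never used, so an optimising JIT is free to delete the loop body entirely; accumulating into a value that is later observed keeps the work live.)

```java
public class Heisenbenchmark {
    // A field the JIT can't prove is unused keeps the measured work alive.
    static long sink;

    static long naiveLoop(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            int unused = i * i; // result discarded: eligible for dead-code elimination
        }
        return System.nanoTime() - start;
    }

    static long consumedLoop(int iterations) {
        long start = System.nanoTime();
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (long) i * i; // result feeds the sink below
        }
        sink = acc;             // observable side effect: the loop must run
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("naive:    " + naiveLoop(1_000_000) + " ns");
        System.out.println("consumed: " + consumedLoop(1_000_000) + " ns");
        System.out.println("sink = " + sink); // printing prevents whole-program elimination
    }
}
```

On a good optimising JIT the naive timing can collapse to near zero after a few iterations, which is exactly the r- scenario above.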
If you go run the system on a desktop JVM, you'll find that as soon as the whole big function becomes hot, the JIT will eat the whole thing, once again dwarfing your micro-optimisation. You may be able to observe improvement here, but since the HotSpot JIT is a really rather good optimising compiler (in contrast to Dalvik's, which does very little optimisation), it's very likely that it already did your optimisation, along with many many more, at JIT-time.

Just because an optimisation isn't interesting post-JIT isn't a reason to not do it (unless it's hurting readability a lot or something), due to these "warm-spots" I described earlier. In the presence of a method-granularity JIT, particularly one with good optimisation like HotSpot's, such work is less pointful.

Sort-of sane approaches include:

- Do something that causes the whole shaboodle to be executed a large number of times, run this in an instrumenting (not sampling!) profiler, and check the average execution time for the small thing you improved (which will most likely be a number in microseconds, which also upsets reviewers who aren't good at multiplying by a million).

- Run it on a desktop JVM with the JIT turned off (this is correct for certain sorts of optimisations, but you now have to consider the different behaviour of Dalvik vs. the desktop JVM: different bytecode instructions are differently expensive between the two platforms, which may skew your results).

- Run a large number of iterations first to *ensure* it's been JIT'd, and then make your measurements. This is useful only if you believe your optimisation affects the post-JIT performance. This is more problematic than you might think, however, thanks to such absurdity on some platforms as "dynamic deoptimisation".
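(Ed: the warm-up-then-measure approach can be sketched roughly as below. This is illustrative code, not from the thread; `workload()` stands in for whatever you're testing, and the iteration counts are placeholders. Note it inherits all the caveats above, including dynamic deoptimisation.)

```java
public class WarmupBenchmark {
    static long sink; // consumed so the JIT can't discard the measured work

    // Placeholder for the code under test.
    static long workload() {
        long acc = 0;
        for (int i = 0; i < 10_000; i++) {
            acc += Integer.rotateLeft(i, 7) ^ i;
        }
        return acc;
    }

    static double measure(int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink += workload(); // warm-up: give the JIT a chance to compile the hot path
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink += workload(); // measured region, hopefully running post-JIT
        }
        return (System.nanoTime() - start) / (double) measuredRuns;
    }

    public static void main(String[] args) {
        System.out.printf("mean: %.1f ns/run (sink=%d)%n",
                measure(20_000, 5_000), sink);
    }
}
```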
There's a rather good SO thread discussing this topic here:

http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java

That thread also links to this rather fun article:

https://www.ibm.com/developerworks/java/library/j-jtp02225/

Be mindful, though, that these are mostly discussing desktop JVMs (HotSpot and friends), which have radically different JIT characteristics to Dalvik. ART is another kettle of fish entirely.

- Check bytecode *length*. If you're really stuck, the solution with the least bytecode is *probably* marginally better, as the JIT finds it easier to swallow. (In general, some particularly dense desktop JVMs have been known to give up in the face of extremely large functions.)

- Appeal to theory.

- Electrocute your reviewer.

It's a tricky one. I typically use the profiler approach I mentioned in the event I actually want to measure it. Some things may be sufficiently awkward to measure that you stop caring and just appeal to theory (I do that a lot).

*Comment 23:*

I should also highlight the extremely excellent name for the "the JIT optimised out your benchmark" problem: Heisenbenchmark.

*Comment 24 (rnewman):*

> It's a tricky one.
> I typically use the profiler approach I mentioned in the
> event I actually want to measure it.
> Some things may be sufficiently awkward to measure that you stop caring and
> just appeal to theory (I do that a lot).

On this tangent: it's amusing that one of the first things drummed into performance-oriented engineers is "don't trust your instincts; profile, then optimize".

When it comes down to the behavior of systems like this in the large, you end up having to measure at incredibly coarse granularity -- e.g., time to page load -- because attempting to measure subsystems affects the experiment itself.

At coarse granularity your improvements disappear in the noise, and you end up having to trust to instinct… or its more formal cousin, theory.

*(Ed: clipped for relevancy) Comment 25 (ckitching):*

(In reply to Richard Newman [:rnewman] from comment #24)
> On this tangent: it's amusing that one of the first things drummed into
> performance-oriented engineers is "don't trust your instincts; profile, then
> optimize".

You might enjoy this article:

http://www.joshbarczak.com/blog/?p=580

- Mike (:mcomella)

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=732177#c22
_______________________________________________
mobile-firefox-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/mobile-firefox-dev

