On Apr 16, 2010, at 1:55 PM, Kelly O'Hair wrote:

> So often the raw performance comes down to the VM generated code

Yes, that's the main story. Performance is strongly determined by code generated by the dynamic compiler (aka "JIT"). C++-generated instructions are rarely executed, apart from the GC. If we find C++-generated code getting executed in a hot path, we take it over into either VM-compiled code or hand-written assembly. (For an example of the latter, see the arraycopy stubs.)

The next most likely bottleneck, IMO, is memory system effects.

Occasionally there is a significant effect from the GC code written in C++, so the GC code gets a good share of tweaking. (E.g., see oopDesc::size_given_klass and related functions.) It's rare to find significant overheads in the other C++ code, except during startup. (Therefore the classfile parsing code has received tuning.)

-- John
