[jvm-l] Re: Optimizations?

John Rose Wed, 25 Feb 2009 11:11:32 -0800

On Feb 24, 2009, at 10:22 AM, Charles Oliver Nutter wrote:

> - General reduction in bytecode size. Simply having less bytecode seems
> to have a large effect on how well Hotspot is able to consume the code.
> There are also size limitations on inlining code, so less bytecode often
> means more inlining.
> - Reduction in numbers/complexity of branches. This also plays into
> total bytecode size, but I've seen improvements from simply flipping
> loops around or calculating jump conditions in aggregate before making a
> single jump.
> - Outline as much code as humanly possible (as opposed to inlining).
> JRuby's compiler originally just emitted all logic straight into the
> method body. This turned out pretty badly; it was very slow, and there
> was a tremendous amount of duplication. By pulling as much as possible
> into static utility methods, bytecode size was drastically reduced and
> performance went up substantially.
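
To illustrate the outlining rule quoted above, here is a minimal sketch.
It assumes a hypothetical language runtime; RuntimeHelpers, toLong, and
GeneratedCode are made-up names for illustration and are not JRuby's
actual classes.  The point is only that the type-dispatch logic exists
once, and each use in "generated" code costs a single invokestatic.

```java
// Hypothetical runtime helper class (illustrative names, not JRuby's API).
final class RuntimeHelpers {
    private RuntimeHelpers() {}

    // The coercion logic lives here once instead of being emitted
    // inline at every place the compiler needs it.
    static long toLong(Object value) {
        if (value instanceof Long) {
            return (Long) value;
        }
        if (value instanceof Integer) {
            return (Integer) value;
        }
        if (value instanceof String) {
            return Long.parseLong((String) value);
        }
        throw new IllegalArgumentException("cannot convert to long: " + value);
    }
}

// Stand-in for a compiler-generated method body: three small calls
// rather than three copies of the dispatch sequence above.
final class GeneratedCode {
    static long addAll(Object a, Object b, Object c) {
        return RuntimeHelpers.toLong(a)
             + RuntimeHelpers.toLong(b)
             + RuntimeHelpers.toLong(c);
    }
}
```

A small helper like this stays a plausible inlining candidate for the
JIT, so the generated method can shrink without necessarily paying a
call on the hot path.  Whether it is actually inlined depends on the JVM
and its inlining limits.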


Thanks, Charlie.  Those are great rules of thumb.  They probably  
belong somewhere on the internals wiki, despite the warnings that it  
is not a tuning document.

Branch-to-branch doesn't matter much for HotSpot, since it uses SSA  
and sea-of-nodes IR.  Recently, we have added a pre-pass to make sure  
the IR is generated in RPO (reverse post-order), regardless of concrete  
bytecode structure.  This makes the initial JIT steps more strongly  
normalizing, hence more reliably optimizing. I imagine the other JVMs  
do similar "due diligence" on their JIT inputs.

For language runtimes the JVMs should supply more hooks for affecting  
JIT-level inlining, maybe something as simple (in HotSpot's case) as  
@sun.misc.Inline.  I'm not aware of a credible effort to standardize  
on it, though.

Branch-free idioms (as long as they are simple and do not greatly  
expand bytecode size) are helpful.  The HotSpot JIT is pretty good at  
turning simple control flow into branch-free code when possible, but  
in very simple cases it is more reliable to write it in Java code.   
The main problem is that bytecodes cannot express conditional move  
directly; the best you can do is the control flow from  
x = p ? x1 : x2.  The important thing to encourage conditional moves  
is that the predicate p and one or both of the conditional values  
x1, x2 should be free of side effects and exceptions, such as  
constants or variable references.
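
As a rough sketch of what that looks like in source (the class and
method names are made up for illustration), the first two methods keep
the predicate and both arms down to constants and local variable reads.
The third puts an array load, which can throw, inside one arm, and the
fourth hoists that load out, which is only valid when the index is known
to be in range on both paths.  Whether the JIT actually emits a
conditional move for any of these depends on the JVM version and the
branch profile, so treat this as a candidate shape and not a guarantee.

```java
final class BranchFree {
    private BranchFree() {}

    // Predicate is a plain comparison; both arms are variable references.
    static int max(int a, int b) {
        return (a >= b) ? a : b;
    }

    // One arm is a constant, the other a variable reference.
    static int clampToByte(int x) {
        int lo = (x < 0) ? 0 : x;
        return (lo > 255) ? 255 : lo;
    }

    // Less friendly: the arm arr[i] can throw, so it cannot simply be
    // evaluated on both paths.
    static int pickLoadInArm(boolean p, int[] arr, int i, int fallback) {
        return p ? arr[i] : fallback;
    }

    // The load now runs unconditionally, leaving two variable references
    // as arms. Only valid if arr is non-null and i is always in range.
    static int pickHoisted(boolean p, int[] arr, int i, int fallback) {
        int loaded = arr[i];
        return p ? loaded : fallback;
    }
}
```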

Always use -XX:+PrintAssembly to see what you are doing.

BTW, +PrintCompilation is a very *old* flag; +LogCompilation is  
slightly less old.  The newest thing here is probably that  
+PrintAssembly is now a diagnostic switch, which means it works for  
product builds (after being unlocked).  You need a disassembler  
plugin, and I'm pleased to announce that this is now second-sourced  
(x86/32 only at present) at http://kenai.com/projects/base-hsdis/ .   
You may still need to do a build, but hopefully the extra option will  
make it easier for some.
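
For trying this out, a small driver along the following lines may help.
It is only a sketch: WarmUp and its methods are hypothetical names, and
the iteration count is an arbitrary value chosen to be large enough that
the loop should get compiled.  The command line in the comment assumes a
HotSpot build where +PrintAssembly is a diagnostic switch and a
disassembler plugin is installed where the JVM can find it.

```java
// Possible invocation (HotSpot, disassembler plugin installed):
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly WarmUp
final class WarmUp {
    private WarmUp() {}

    // Small side-effect-free method worth looking for in the output.
    static int max(int a, int b) {
        return (a >= b) ? a : b;
    }

    // Hot loop: calls max often enough to trigger JIT compilation.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += max(i, iterations - i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Print the result so the loop cannot be discarded as dead code.
        System.out.println(run(5000000));
    }
}
```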

Best,
-- John
