What I can add is that a good "ABCD" optimization complemented
... and ABCD does nothing with BBP, does it?
with a "loop versioning" optimization will make more readable, more
stable code, AND will give a better performance gain (loop unrolling
Unfortunately no, it will not. :-(
As you probably know ;-), char is 16 bits wide in Java.
The code generated for char moves has to use 16-bit move
instructions. These instructions carry the operand-size override prefix
(66h), which the CPU decoder handles poorly. No amount of unrolling or
versioning would remove these heavy instructions.
One of the goals was to throw the heavy instructions away and replace
them with faster, more efficient ones. That is hard to do in the
codegen alone, but much easier (and cleaner) with a little help from
the HLO side; this is the rationale for the HLO pass.
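
To make the idea concrete, here is a rough sketch in plain Java of the data flow such a replacement would have: four 16-bit chars are handled as one 64-bit value instead of four separate 16-bit moves. The class and method names (WideCharCopy, copyNarrow, copyWide, pack4, unpack4) are hypothetical and are not the actual HLO pass or helper; a real helper would be native code using wide moves, and this Java model only shows the shape, not the speed.

```java
// Hypothetical illustration only: models the data flow of a "wide move"
// copy. It is not the actual helper and is not expected to be faster in Java.
final class WideCharCopy {

    // Baseline: one 16-bit element per iteration, which is the loop shape
    // that ends up as 16-bit move instructions in generated code.
    static void copyNarrow(char[] src, char[] dst, int n) {
        for (int i = 0; i < n; i++) {
            dst[i] = src[i];
        }
    }

    // Packs four consecutive chars into one 64-bit value (lowest index in
    // the lowest 16 bits).
    static long pack4(char[] a, int i) {
        return (long) a[i]
             | ((long) a[i + 1] << 16)
             | ((long) a[i + 2] << 32)
             | ((long) a[i + 3] << 48);
    }

    // Inverse of pack4: writes the four 16-bit lanes back as chars.
    static void unpack4(long v, char[] a, int i) {
        a[i]     = (char) v;
        a[i + 1] = (char) (v >>> 16);
        a[i + 2] = (char) (v >>> 32);
        a[i + 3] = (char) (v >>> 48);
    }

    // Wide version: moves four chars per step, then finishes the tail
    // (fewer than four remaining elements) one char at a time.
    static void copyWide(char[] src, char[] dst, int n) {
        int i = 0;
        for (; i + 4 <= n; i += 4) {
            unpack4(pack4(src, i), dst, i);
        }
        for (; i < n; i++) {
            dst[i] = src[i];
        }
    }
}
```

Both methods produce the same destination contents; the only difference is how many elements are moved per step.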
is awake too). Setting aside the fact that the overall design will be
more straightforward (having no interdependent passes, extra helpers,
etc)
So I vote for focusing on ABCD plus "loop versioning" and leaving
specific benchmark-oriented tricks (complicating our design) alone.
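
For readers unfamiliar with the term, "loop versioning" can be pictured roughly as below. This is only a source-level sketch with hypothetical names (LoopVersioning, sumRange); in a real JIT the transformation happens on the intermediate representation, and the two loop bodies differ in the generated code rather than in the Java source.

```java
// Hypothetical illustration of the shape of loop versioning.
final class LoopVersioning {

    static int sumRange(int[] a, int from, int to) {
        int sum = 0;
        // Guard evaluated once, before the loop.
        if (from >= 0 && to <= a.length) {
            // Fast version: every index in [from, to) is known to be in
            // bounds, so a compiler is free to drop the per-iteration
            // bounds check here.
            for (int i = from; i < to; i++) {
                sum += a[i];
            }
        } else {
            // Slow version: original loop with checks kept, so an
            // out-of-range index still throws at the same iteration.
            for (int i = from; i < to; i++) {
                sum += a[i];
            }
        }
        return sum;
    }
}
```

The point of the guard is that the program's behaviour is unchanged: in-range calls take the check-free path, and out-of-range calls still raise the exception from the original loop.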
Again, the optimization is orthogonal to ABCD and was never
positioned as a replacement for ABCD. The optimization's main targets are
string- (and XML-) intensive apps.
My 2 cents:
I hope we will use the results of the ABCD optimization to completely
replace a regular loop with a helper call