On the 0x29A day of Apache Harmony Mikhail Fursov wrote:
> On 15 Mar 2007 19:58:54 +0300, Egor Pasko <[EMAIL PROTECTED]> wrote:
> >
> > this should hypothetically improve one simple code pattern (that is
> > probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
> >
> > What I figured out looking at the patch:
> >
> > * [pass.2] does not seem to throw any AIOutOfBoundsException
> >
> > * [pass.2] does not have any useful tuning parameters such as number
> > of unrolls per loop, thus the scheme eats potential benefit
> > from loop unrolling and does not give any tuning back
>
>
> AFAIK this optimization is much more efficient than loop unrolling on
> the microtest you mentioned.
I believe loop unrolling can do more.
> > * [pass.1] detects such a rare pattern that I doubt it would benefit a
> > user (but obviously will benefit a you-know-which-benchmark
> > runner)
>
>
> Generic Arrays.fill-like methods could be optimized this way.
> + IMO even several percent in widely known
> benchmarks is a reason to implement even more complicated optimizations.
>
> > * [pass.1] has a lot of new code that introduces potential instability
> > (if the pattern is not detected properly, the code does
> > not read easily), but does not contain a single unit test
> > or the like. Together with AIOOBE issue stability becomes a
> > real question.
>
>
> All known bugs can be fixed.
Yes, and the fix might take so much time that implementing
versioning+ABCD would be faster. No one knows.
I thought stability had a higher priority than error-prone hacks
that improve performance by 0.5%.
On the other hand, all JIT technology is just an error-prone way of
optimizing an interpreter :)
> If AIOOBE is a real problem here - it looks to be easily fixed
> too. The question is whether the optimization gives any benefit or not.
Well, I agree. So, let's write tests for this first and then let the
patch go in.
> We can move it into separate HLO pass (and separate
> file) and drop it from codebase if it's not needed in future.
I love this idea
> > * back branch polling is not performed (which is probably good for
> > performance, but I would rather have a tuning option)
>
>
> Do you think that the latency of mem-copying like opt can be a problem here?
Maybe; it is not obvious.
> > What I can say more is that a good "ABCD" optimization complemented
> > with "loop versioning" optimization will make a more readable, more
> > stable code, AND will give a better performance gain (loop unrolling
> > is awake too). Setting aside the fact that the overall design will be
> > more straightforward (having no interdependent passes, extra helpers, etc)
>
>
> > So I vote for focusing on ABCD plus "loop versioning" and leaving
> > specific benchmark-oriented tricks (complicating our design) alone.
>
>
> I support focusing on loop
> versioning/ABCD and other general purpose optimizations we do not have today.
> And until we get better results for your microtest from these opts, we
> can use Nikolay's approach. At least it works better today.
> ?
Okay, we can. But, IMHO, Nikolay could have written it the right way from
the beginning. That did not happen. I am not worried too much,
though. Now we have the patch and with some more testing it should
just work.
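
To make the "loop versioning" idea concrete, here is a rough hand-written
sketch of what such a pass could conceptually turn the fill loop into. It is
not taken from the patch or from the Harmony JIT; the class and method names
are made up for illustration, and Arrays.fill only stands in for a loop body
whose bounds checks have been removed. The point is that the slow version
keeps the original AIOOBE behaviour, including the partial writes made before
the exception.

```java
import java.util.Arrays;

// Hypothetical illustration only: a source-level picture of loop versioning
// applied to  for (int i = 0; i < n; i++) { a[i] = x; }
class LoopVersioningSketch {
    static void fillVersioned(int[] a, int n, int x) {
        if (n <= 0) {
            return; // the original loop body would never execute
        }
        // a.length throws NullPointerException for a null array, just as
        // the first a[i] store of the original loop would.
        if (n <= a.length) {
            // Fast version: every index in [0, n) is provably in bounds,
            // so per-iteration bounds checks are redundant here.
            Arrays.fill(a, 0, n, x);
        } else {
            // Slow version: the original loop with all checks intact.
            // It fills the whole array and then throws
            // ArrayIndexOutOfBoundsException at i == a.length.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        }
    }
}
```

A unit test for the patch could check exactly these two cases: an in-bounds
count fills only the requested prefix, and an out-of-bounds count still
throws after filling the whole array.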
--
Egor Pasko