Re: GCC 4.3.0 Status Report (2007-09-04)

Roger Sayle Sun, 09 Sep 2007 14:11:09 -0700

This is an optimization pass which leads to dramatically better code on
at least one SPEC benchmark.  Ian, Roger, Diego, would one of you care
to review this?

My concern is that, as formulated, conditional store elimination is not always a win.


Transforming

   if (cond)
     *p = x;

into

   tmp = *p;
   if (cond)
     tmp = x;
   *p = tmp;

on its own, effectively transforms a conditional write to memory into an unconditional write to memory. On many platforms, even x86, this is a pessimization. For example, the "Intel Architecture Optimization Manual", available at ftp://download.intel.com/design/PentiumII/manuals/24281603.PDF, in section 3.5.5 "Write Allocation Effects", actually recommends the inverse transformation. On page 3-21 they show how the "Sieve of Eratosthenes" benchmark can be sped up on Pentium class processors by transforming the line


   array[j] = 0;

into the equivalent

   if (array[j] != 0)
     array[j] = 0;

i.e. by introducing conditional stores.
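
As a rough illustration of the two forms in a sieve-style inner loop, here is a minimal sketch. The function names and the store counters are hypothetical, added only to make the difference observable, and `step` is assumed to be non-zero; this is not the code from the Intel manual.

```c
#include <stddef.h>

/* Hypothetical helper: clear every step-th flag from start using an
   unconditional store.  Returns the number of stores issued. */
size_t clear_multiples_plain(unsigned char *array, size_t n,
                             size_t start, size_t step)
{
    size_t stores = 0;
    for (size_t j = start; j < n; j += step) {
        array[j] = 0;            /* always writes */
        stores++;
    }
    return stores;
}

/* Same loop, but with the guarded (conditional) store: a flag that is
   already zero is only read, never written. */
size_t clear_multiples_guarded(unsigned char *array, size_t n,
                               size_t start, size_t step)
{
    size_t stores = 0;
    for (size_t j = start; j < n; j += step) {
        if (array[j] != 0) {
            array[j] = 0;        /* writes only when the value changes */
            stores++;
        }
    }
    return stores;
}
```

Both functions leave the array in the same state; they differ only in how many writes reach memory when many flags are already clear.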

The significant observation with Michael Matz's extremely impressive 26% improvement on 456.hmmer is the interaction between this transformation and other passes, which allows the conditional store to be hoisted out of a critical loop. By reading the value into a "tmp" before the loop, conditionally storing to the register tmp in the loop, then unconditionally writing the result back afterwards, we dramatically reduce the number of memory writes, rather than increase them as when this transformation is applied in isolation.
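
The loop case can be sketched as follows. This is only an illustration, not the hmmer source or the patch itself: the function names are made up, and the hoisted form assumes that p does not point into a.

```c
#include <stddef.h>

/* Conditional store inside the loop: every hit writes through p. */
void max_store_in_loop(int *p, const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] > *p)
            *p = a[i];
}

/* Loop-restricted form: read *p once, update a register copy inside
   the loop, and write the result back once afterwards.
   Assumes p does not alias any element of a. */
void max_store_hoisted(int *p, const int *a, size_t n)
{
    int tmp = *p;
    for (size_t i = 0; i < n; i++)
        if (a[i] > tmp)
            tmp = a[i];
    *p = tmp;
}
```

The first version may store on any iteration; the second issues exactly one load and one store of *p regardless of n, which is where the reduction in memory writes comes from.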

I think the correct fix is not to apply this transformation everywhere, but to correctly identify those loop cases where it helps and perform the loop transformation there, i.e. conditional induction variable identification, hoisting and sinking need to be improved instead of pessimizing code to a simpler form that allows our existing flawed passes to trigger.

I do very much like the loop-restricted version of this transformation, and its impressive impact on HMMER (whose author Sean Eddy is a good friend). Perhaps Mark might give revised versions of this patch special dispensation to be applied in stage 3. I'd not expect any correctness issues/bugs, just performance trade-offs that need to be investigated. Perhaps we should even apply this patch as is during stage 2, and allow the potential non-loop performance degradations to be addressed as follow-up patches and therefore regression fixes suitable for stage 3?

Congratulations again to Michael for this impressive performance improvement.


Roger
--
Roger Sayle, Ph.D.
OpenEye Scientific Software,
Suite #D, 9 Bisbee Court,
Santa Fe, New Mexico, 87508.
