------- Additional Comments From rakdver at gcc dot gnu dot org 2004-12-01 21:44 -------

There is no additional cost counted for "(double *) coefPtr". The result with the cast differs because PRE creates a (dead) phi node when moving the cast out of the loop. This phi node changes the estimate of register pressure used in ivopts, which leads to divergence -- we do not perform strength reduction.
The lack of strength reduction is a problem in DFmode, since [reg + reg] addressing for a DFmode object requires two extra additions:

  [reg1 + reg2] = ...   /* lower half */
  reg2 += 4;
  [reg1 + reg2] = ...   /* upper half */
  reg2 -= 4;

Whereas with [reg] addressing mode, this would only be

  [reg] = ...
  [reg + 4] = ...

Of course, when we clean up the unused phi nodes created by PRE, the regression gets hidden (Daniel is working on this). But the underlying problem -- the fact that we cannot tell that [reg + reg] addressing for DFmode is expensive -- remains.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18768
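For illustration only, here is a rough C sketch of the two loop shapes discussed above and of the instruction count described in this comment. The function names (fill_indexed, fill_reduced, df_store_insns) are hypothetical and do not come from the test case or from GCC. The count assumes a 32-bit target that stores a double as two word-sized halves.

```c
#include <stddef.h>

/* Indexed form: without strength reduction, each store addresses
   base + offset, which corresponds to [reg + reg] addressing. */
void fill_indexed(double *base, size_t n, double v)
{
    for (size_t i = 0; i < n; i++)
        base[i] = v;
}

/* Strength-reduced form: one pointer is bumped per iteration, so a
   store can use [reg] for the lower half and [reg + 4] for the upper. */
void fill_reduced(double *base, size_t n, double v)
{
    for (double *p = base, *end = base + n; p != end; p++)
        *p = v;
}

/* Hypothetical instruction count for one DFmode store, following the
   comment above: two word stores, plus two additions (reg2 += 4 and
   reg2 -= 4) when the address has the [reg + reg] form. */
int df_store_insns(int reg_plus_reg)
{
    int stores = 2;                    /* lower half and upper half */
    int fixups = reg_plus_reg ? 2 : 0; /* adjust and restore reg2 */
    return stores + fixups;
}
```

Both loops write the same values. The difference is only in the address form each store can use, which is what a cost model would need to distinguish per mode.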