https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #18 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
The apparent 'bias' is introduced by instruction scheduling: haifa-sched lifts
a +64 increment over memory accesses, transforming +0 and +32 displacements to
-64 and -32. Sometimes this helps a little bit even on modern x86 CPUs.

Also note that 'vfnmadd231pd 32(%rdx,%rax), %ymm3, %ymm0' would be
'unlaminated' (turned to 2 uops before renaming), so selecting independent IVs
for the two arrays actually helps on this testcase.
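
The displacement change described above can be sketched in plain C. This is only an illustration of why the two forms are equivalent, not the testcase from this bug and not what haifa-sched emits: the function names are hypothetical, each scalar load stands in for a 32-byte vector access, and an 8-byte double is assumed, so that element offsets 0 and 4 correspond to byte displacements +0 and +32 and a step of 8 elements corresponds to +64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: the pointer is incremented after the
   accesses, so the accesses use byte displacements +0 and +32
   (assuming sizeof(double) == 8). */
double sum_inc_after(const double *p, size_t n)
{
    assert(n % 8 == 0);          /* whole 64-byte steps only */
    const double *end = p + n;
    double s = 0.0;
    while (p != end) {
        s += p[0];               /* displacement  +0 bytes */
        s += p[4];               /* displacement +32 bytes */
        p += 8;                  /* +64 byte increment */
    }
    return s;
}

/* Same loop with the +64 increment moved above the accesses: the
   same addresses are now reached with displacements -64 and -32. */
double sum_inc_before(const double *p, size_t n)
{
    assert(n % 8 == 0);
    const double *end = p + n;
    double s = 0.0;
    while (p != end) {
        p += 8;                  /* +64 byte increment, done first */
        s += p[-8];              /* displacement -64 bytes */
        s += p[-4];              /* displacement -32 bytes */
    }
    return s;
}
```

Both functions read the same addresses in the same order, so they return identical results. Whether a compiler keeps either form as written depends on its own scheduling and addressing-mode choices.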