On Tue, Nov 11, 2014 at 08:27:22PM -0500, Michael Meissner wrote:
> > Before the patch, the final reduction used *vsx_reduc_splus_v2df; after
> > the patch, it is *vsx_reduc_plus_v2df_scalar.  The former does a vector
> > add, the latter a float add.  And it uses the same pseudoregister for the
> > accumulator throughout.  IRA decides a register is more expensive than
> > memory for this, I suppose because it wants both V2DF and DF?  It doesn't
> > seem to like the subreg very much.
>
> I haven't looked into it in detail (I've been a little busy with the upper regs
> patch), but I suspect the problem is that 128-bit and 64-bit types cannot
> overlap (i.e. rs6000_cannot_change_mode_class returns true).  This is due to
> the fact that scalars in VSX registers occupy the upper 64-bits, which would
> not match the compiler's notion that it should be in the bottom 64-bits.

You suspect correctly.  Hacking around that in cannot_change_mode_class
doesn't help; subreg_get_info disallows it next.

Changing the pattern so it does two extracts instead of an extract and
a subreg works (you get an fmr for the high part though; register alloc
doesn't know dest=src is for free here).

_Should_ the subreg thing work?  Or should the patterns be fixed?


Segher