On Mon, Sep 19, 2016 at 05:43:19PM -0500, Segher Boessenkool wrote:
> On Mon, Sep 19, 2016 at 06:02:08PM -0400, Michael Meissner wrote:
> > vector float combine (float a, float b, float c, float d)
> > {
> >   return (vector float) { a, b, c, d };
> > }
>
> [ ... ]
>
> > However ISA 2.07 (i.e. power8) added the VMRGEW instruction, which can do
> > this more simply:
> >
> >         xxpermdi 34,1,2,0
> >         xxpermdi 32,3,4,0
> >         xvcvdpsp 34,34
> >         xvcvdpsp 32,32
> >         vmrgew 2,2,0
>
> This results in {a,c,b,d} instead?

Yes.

> > --- gcc/config/rs6000/rs6000.c  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)  (revision 240142)
> > +++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000)  (working copy)
> > @@ -6821,11 +6821,26 @@ rs6000_expand_vector_init (rtx target, r
> >           rtx op2 = force_reg (SFmode, XVECEXP (vals, 0, 2));
> >           rtx op3 = force_reg (SFmode, XVECEXP (vals, 0, 3));
> >
> > -         emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op1));
> > -         emit_insn (gen_vsx_concat_v2sf (dbl_odd, op2, op3));
> > -         emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even));
> > -         emit_insn (gen_vsx_xvcvdpsp (flt_odd, dbl_odd));
> > -         rs6000_expand_extract_even (target, flt_even, flt_odd);
> > +         /* Use VMRGEW if we can instead of doing a permute.  */
> > +         if (TARGET_P8_VECTOR)
> > +           {
> > +             emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op2));
> > +             emit_insn (gen_vsx_concat_v2sf (dbl_odd, op1, op3));
>
> But this looks correct, so just the example is pastoed?

Yes, I pasted the code for -mcpu=power7 and -mcpu=power8.

The original code puts the elements in a different order, and then fixes it up
with a permute.  I changed the order so that it would match how VMRGEW works,
and I tested it on both big and little endian power8's.

The original puts the values as:

        +-------+-------+-------+-------+
        |   a   | unused|   b   | unused|
        +-------+-------+-------+-------+

        +-------+-------+-------+-------+
        |   c   | unused|   d   | unused|
        +-------+-------+-------+-------+

The VMRGEW instruction wants the registers as:

        +-------+-------+-------+-------+
        |   a   | unused|   c   | unused|
        +-------+-------+-------+-------+

        +-------+-------+-------+-------+
        |   b   | unused|   d   | unused|
        +-------+-------+-------+-------+

> Okay for trunk if you can clear that up.

Did that answer the question?

> Thanks,
>
>
> Segher

--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
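
For reference, the lane movement described above can be sketched in plain C.
This is only a model of the data movement under stated assumptions, not the
GCC implementation or the real instructions: vec4f, UNUSED_LANE,
model_concat_convert, model_vmrgew and model_vector_init are hypothetical
names, the placeholder stored in unused lanes is an assumption (the email only
says those lanes are unused), and lanes are numbered in big-endian element
order with word 0 leftmost.

```c
#include <assert.h>

/* Model of a 128-bit vector register as four 32-bit float lanes,
   numbered in big-endian element order (word 0 is leftmost).  */
typedef struct { float w[4]; } vec4f;

/* Placeholder for lanes whose contents are not used (assumption).  */
#define UNUSED_LANE (-1.0f)

/* Models the concat of two scalars into a V2DF followed by xvcvdpsp:
   the two converted floats land in words 0 and 2.  */
vec4f model_concat_convert (float first, float second)
{
  vec4f r = { { first, UNUSED_LANE, second, UNUSED_LANE } };
  return r;
}

/* Models VMRGEW: interleave the even words (0 and 2) of both inputs.  */
vec4f model_vmrgew (vec4f x, vec4f y)
{
  vec4f r = { { x.w[0], y.w[0], x.w[2], y.w[2] } };
  return r;
}

/* Models the TARGET_P8_VECTOR path of the patch: pair (a, c) and (b, d)
   so that the merge yields { a, b, c, d }.  */
vec4f model_vector_init (float a, float b, float c, float d)
{
  vec4f even = model_concat_convert (a, c);
  vec4f odd = model_concat_convert (b, d);
  return model_vmrgew (even, odd);
}

/* Small self-check of the ordering; returns 1 if the asserts hold.  */
int model_self_check (void)
{
  vec4f r = model_vector_init (1.0f, 2.0f, 3.0f, 4.0f);
  assert (r.w[0] == 1.0f && r.w[1] == 2.0f);
  assert (r.w[2] == 3.0f && r.w[3] == 4.0f);
  return 1;
}
```

In this model, pairing (a, b) and (c, d) as the old code did and then merging
would give { a, c, b, d }, which is why the patch changes the operands of the
two concats rather than adding a fix-up permute.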