On 10/19/2011 12:25 PM, Jakub Jelinek wrote:
> 2011-10-19 Jakub Jelinek <[email protected]>
>
> * config/i386/i386.c (expand_vec_perm_vpshufb2_vpermq_even_odd): Use
> d->op1 instead of d->op0 for the second vpshufb.
> (expand_vec_perm_even_odd_1): For V8SImode fix vpshufd immediates.
> (ix86_expand_vec_perm_const): If mask indicates two operands are
> needed, but both are the same and expanding them as d.op0 == d.op1
> failed, retry with d.op0 != d.op1.
> (ix86_expand_vec_perm_builtin): Likewise. Handle sorry printing
> also for d.nelt == 32.
>
> * gcc.dg/torture/vshuf-32.inc: Add interleave permutations.
> * gcc.dg/torture/vshuf-16.inc: Likewise.
> * gcc.dg/torture/vshuf-8.inc: Likewise.
> * gcc.dg/torture/vshuf-4.inc: Likewise.
Ok.
Although I think a good followup would be to fix
> +  if (which == 3 && d.op0 == d.op1)
> +    {
> +      rtx seq;
> +      bool ok;
> +
> +      memcpy (d.perm, perm, sizeof (perm));
> +      d.op1 = gen_reg_rtx (d.vmode);
> +      start_sequence ();
> +      ok = ix86_expand_vec_perm_builtin_1 (&d);
> +      seq = get_insns ();
> +      end_sequence ();
> +      if (ok)
> +        {
> +          emit_move_insn (d.op1, d.op0);
> +          emit_insn (seq);
this so that we don't need a copy to another register.
That could be done by adding a d.one_operand field and testing
that flag instead of comparing d.op0 == d.op1 explicitly everywhere.
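
Roughly what I have in mind, as a standalone sketch rather than a patch
against i386.c. Everything here is a hypothetical stand-in: struct
perm_desc, perm_desc_init and perm_desc_retry_two_operand are made-up
names, and operands are plain ints instead of rtx. The only point is
that the expanders would look at the flag, so the retry can reuse the
same operand without a fresh pseudo and a move.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NELT 32

/* Hypothetical stand-in for the permutation descriptor.  Operands are
   ints here, not rtx.  */
struct perm_desc
{
  int op0, op1;
  unsigned char perm[MAX_NELT];
  unsigned nelt;
  bool one_operand;	/* The proposed field.  */
};

/* Fill in D from selector SEL of NELT elements.  Indices below NELT
   select from OP0, the rest from OP1.  */
static void
perm_desc_init (struct perm_desc *d, int op0, int op1,
		const unsigned char *sel, unsigned nelt)
{
  unsigned i, which = 0;

  assert (nelt > 0 && nelt <= MAX_NELT && (nelt & (nelt - 1)) == 0);

  d->op0 = op0;
  d->op1 = op1;
  d->nelt = nelt;

  for (i = 0; i < nelt; i++)
    {
      unsigned e = sel[i] & (2 * nelt - 1);
      d->perm[i] = e;
      which |= (e < nelt ? 1 : 2);
    }

  /* One operand suffices if the mask references only one input, or
     if both inputs are the same value anyway.  */
  d->one_operand = (which != 3 || op0 == op1);
  if (d->one_operand)
    {
      if (which == 2)
	d->op0 = op1;
      d->op1 = d->op0;
      for (i = 0; i < nelt; i++)
	d->perm[i] &= nelt - 1;
    }
}

/* Retry as a two-operand permutation after the one-operand attempt
   failed.  Only the flag and the indices change; op0 and op1 may stay
   identical, so no copy to another register is needed.  */
static void
perm_desc_retry_two_operand (struct perm_desc *d, const unsigned char *sel)
{
  unsigned i;

  for (i = 0; i < d->nelt; i++)
    d->perm[i] = sel[i] & (2 * d->nelt - 1);
  d->one_operand = false;
}
```

The expanders would then test d->one_operand wherever they currently
test d->op0 == d->op1, assuming none of them depends on the two
operands being distinct registers for some other reason.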
r~