On 23/04/14 16:20, Alan Lawrence wrote:
> This patch is a small tidy of a more-complicated expression that just flips a
> single bit and can thus be a simple XOR.
>
> No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified
> code is indeed exercised by dg-torture.exp vshuf-v*.c).
>
> Also ok after applying TBL and testsuite patches in
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html.
>
> gcc/ChangeLog:
> 2014-04-23 Alan Lawrence <[email protected]>
>
> * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip
> expression.
>
s/tidy/Tidy/
It's not obvious from your description (or from the code, for that
matter) that this is only valid when nelt is a power of 2: only then
is nelt a single bit, so that adding it modulo 2 * nelt is the same
as XORing it.
I suggest that, above the loop, you put
    gcc_assert (nelt == (nelt & -nelt));
OK with those changes.
R.
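
For anyone who wants to convince themselves of the power-of-2 point, here
is a rough standalone sketch (not part of the patch; the helper names are
made up for illustration and plain assert stands in for gcc_assert). It
compares the old and new expressions over every index in [0, 2 * nelt).
Note that the suggested test also accepts nelt == 0, which should not
matter for a vector permute.

```c
#include <assert.h>

/* Old form: add nelt, then wrap modulo 2 * nelt using a mask.  */
static unsigned
flip_by_add_mask (unsigned idx, unsigned nelt)
{
  return (idx + nelt) & (2 * nelt - 1);
}

/* New form: toggle the one bit that selects between the two vectors.  */
static unsigned
flip_by_xor (unsigned idx, unsigned nelt)
{
  return idx ^ nelt;
}

/* The suggested check: nelt & -nelt isolates the lowest set bit, so the
   equality holds only when at most one bit is set (0 passes as well).  */
static int
is_pow2_or_zero (unsigned nelt)
{
  return nelt == (nelt & -nelt);
}

/* Return 1 if both forms agree for every idx in [0, 2 * nelt).  */
static int
forms_agree (unsigned nelt)
{
  for (unsigned idx = 0; idx < 2 * nelt; ++idx)
    if (flip_by_add_mask (idx, nelt) != flip_by_xor (idx, nelt))
      return 0;
  return 1;
}

/* XOR form guarded the way the review suggests (hypothetical helper).  */
static unsigned
flip_checked (unsigned idx, unsigned nelt)
{
  assert (is_pow2_or_zero (nelt));
  return flip_by_xor (idx, nelt);
}
```

With nelt == 4, forms_agree returns 1 and flip_checked (1, 4) gives 5,
i.e. element 1 of the other vector. With nelt == 3 the two forms already
differ at idx == 0 (the old form gives 1, the XOR gives 3).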
>
> xor_tidy.patch
>
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3147ee..b879754 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
> rtx x;
>
> for (i = 0; i < nelt; ++i)
> - d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
> + d->perm[i] ^= nelt; /* Keep the same index, but in the other vector. */
>
> x = d->op0;
> d->op0 = d->op1;
>