Following the recent vectorizer changes that implement reductions via vector shifts, AArch64 now reduces loops such as this one:

unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};

int
main (unsigned char argc, char **argv)
{
  unsigned char prod = 1;

  /* Prevent constant propagation of the entire loop below.  */
  asm volatile ("" : : : "memory");

  for (unsigned char i = 0; i < 8; i++)
    prod *= in[i];

  /* 1*3*5*7*9*11*13*15 = 2027025, i.e. 17 modulo 256.  */
  if (prod != 17)
    __builtin_printf ("Failed %d\n", prod);

  return 0;
}

using 'ext' instructions generated by aarch64_expand_vec_perm_const:

main:
        adrp    x0, .LANCHOR0
        movi    v2.2s, 0    <=== note the zero register set up here
        ldr     d1, [x0, #:lo12:.LANCHOR0]
        ext     v0.8b, v1.8b, v2.8b, #4
        mul     v1.8b, v1.8b, v0.8b
        ext     v0.8b, v1.8b, v2.8b, #2
        mul     v0.8b, v1.8b, v0.8b
        ext     v2.8b, v0.8b, v2.8b, #1
        mul     v0.8b, v0.8b, v2.8b
        umov    w1, v0.b[0]
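
Written out with ACLE NEON intrinsics, that sequence corresponds roughly to the following (a minimal sketch; reduce_prod_ext is an illustrative name, and little-endian lane numbering is assumed):

#include <arm_neon.h>

/* Illustrative only: the ext-based reduction above.  Each step pulls
   the upper lanes down, shifting in zeros from z, and multiplies;
   only lane 0 holds the full product at the end.  */
unsigned char
reduce_prod_ext (uint8x8_t v)
{
  uint8x8_t z = vdup_n_u8 (0);          /* the movi above */
  v = vmul_u8 (v, vext_u8 (v, z, 4));   /* ext #4 + mul */
  v = vmul_u8 (v, vext_u8 (v, z, 2));   /* ext #2 + mul */
  v = vmul_u8 (v, vext_u8 (v, z, 1));   /* ext #1 + mul */
  return vget_lane_u8 (v, 0);           /* umov */
}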

The 'ext' approach works for both 64-bit and 128-bit vectors, but for 64-bit vectors we can do slightly better using 'ushr': a 64-bit scalar shift pulls the upper lanes down and shifts in zeros itself, with no need for a separate zero register. This patch improves the above to:

main:
        adrp    x0, .LANCHOR0
        ldr     d0, [x0, #:lo12:.LANCHOR0]
        ushr    d1, d0, 32
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 16
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 8
        mul     v0.8b, v0.8b, v1.8b
        umov    w1, v0.b[0]
        ...
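
The ushr form looks like this in intrinsics (again a sketch with an illustrative name; the vector is reinterpreted as a 64-bit scalar so the whole-register shift is expressible):

#include <arm_neon.h>

/* Illustrative only: the ushr-based reduction, needing no zero register.
   A 64-bit scalar shift right by half the remaining width moves the
   upper lanes down and zeros the vacated ones; lane 0 ends up holding
   the product of all eight lanes.  */
unsigned char
reduce_prod_ushr (uint8x8_t v)
{
  v = vmul_u8 (v, vreinterpret_u8_u64
                    (vshr_n_u64 (vreinterpret_u64_u8 (v), 32)));  /* ushr 32 */
  v = vmul_u8 (v, vreinterpret_u8_u64
                    (vshr_n_u64 (vreinterpret_u64_u8 (v), 16)));  /* ushr 16 */
  v = vmul_u8 (v, vreinterpret_u8_u64
                    (vshr_n_u64 (vreinterpret_u64_u8 (v), 8)));   /* ushr 8 */
  return vget_lane_u8 (v, 0);                                     /* umov */
}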

Tested with bootstrap + check-gcc on aarch64-none-linux-gnu.
Cross-testing of check-gcc on aarch64_be-none-elf is in progress.

Ok if no regressions on big-endian?

Cheers,
--Alan

gcc/ChangeLog:

        * config/aarch64/aarch64-simd.md (vec_shr<mode>): New.

gcc/testsuite/ChangeLog:

        * lib/target-supports.exp
        (check_effective_target_whole_vector_shift): Add aarch64{,_be}.
