Hi,

This patch adds combine pass support for the following SVE2 bitwise logic instructions:
- EOR3 (3-way vector exclusive OR)
- BSL (bitwise select)
- NBSL (inverted bitwise select)
- BSL1N (bitwise select with first input inverted)
- BSL2N (bitwise select with second input inverted)

Example template snippet:

void foo (TYPE *a, TYPE *b, TYPE *c, TYPE *d, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = OP (b[i], c[i], d[i]);
}
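
For illustration only (this is not one of the new tests), the template can be instantiated as below. The choice of TYPE = uint64_t and the check_foo helper are assumptions made for this sketch; the OP definition is the EOR3 one. When vectorized for an SVE2 target, a loop of this shape is the kind the new patterns are meant to match.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed instantiation of the template: 64-bit elements, 3-way XOR.  */
#define TYPE uint64_t
#define OP(x, y, z) ((x) ^ (y) ^ (z))

void foo (TYPE *a, TYPE *b, TYPE *c, TYPE *d, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = OP (b[i], c[i], d[i]);
}

/* Hypothetical helper: run foo on a small input and compare each
   element against the scalar expression.  Returns 1 on success.  */
int check_foo (void)
{
  TYPE a[4] = { 0, 0, 0, 0 };
  TYPE b[4] = { 1, 2, 3, 4 };
  TYPE c[4] = { 5, 6, 7, 8 };
  TYPE d[4] = { 9, 10, 11, 12 };

  foo (a, b, c, d, 4);
  for (int i = 0; i < 4; i++)
    assert (a[i] == (b[i] ^ c[i] ^ d[i]));
  return 1;
}
```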

EOR3:

  // #define OP(x,y,z) ((x) ^ (y) ^ (z))

  before    eor     z1.d, z1.d, z2.d
            eor     z0.d, z0.d, z1.d
            ...
  after     eor3    z0.d, z0.d, z1.d, z2.d

BSL:

  // #define OP(x,y,z) (((x) & (z)) | ((y) & ~(z)))

  before    eor     z0.d, z0.d, z1.d
            and     z0.d, z0.d, z2.d
            eor     z0.d, z0.d, z1.d
            ...
  after     bsl     z0.d, z0.d, z1.d, z2.d

NBSL:

  // #define OP(x,y,z) ~(((x) & (z)) | ((y) & ~(z)))

  before    eor     z0.d, z0.d, z1.d
            and     z0.d, z0.d, z2.d
            eor     z0.d, z0.d, z1.d
            not     z0.s, p1/m, z0.s
            ...
  after     nbsl    z0.d, z0.d, z1.d, z2.d

BSL1N:

  // #define OP(x,y,z) ((~(x) & (z)) | ((y) & ~(z)))

  before    eor     z0.d, z0.d, z1.d
            bic     z0.d, z2.d, z0.d
            eor     z0.d, z0.d, z1.d
            ...
  after     bsl1n   z0.d, z0.d, z1.d, z2.d

BSL2N:

  // #define OP(x,y,z) (((x) & (z)) | (~(y) & ~(z)))

  before    orr     z0.d, z1.d, z0.d
            and     z1.d, z1.d, z2.d
            not     z0.s, p1/m, z0.s
            orr     z0.d, z0.d, z1.d
            ...
  after     bsl2n   z0.d, z0.d, z1.d, z2.d
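
As a sanity check on the BSL, NBSL and BSL1N cases, the sketch below compares each OP definition with the XOR-based form read off the corresponding "before" sequence. It assumes z0, z1 and z2 hold x, y and z respectively, which is an inference from the listings rather than something the patch states. The function names are made up for this example. Because the operations are bitwise, the constants 0xF0.., 0xCC.. and 0xAA.. cover all eight input combinations at once.

```c
#include <assert.h>
#include <stdint.h>

/* Truth-table inputs: every (x, y, z) bit combination appears in each byte.  */
#define X UINT64_C (0xF0F0F0F0F0F0F0F0)
#define Y UINT64_C (0xCCCCCCCCCCCCCCCC)
#define Z UINT64_C (0xAAAAAAAAAAAAAAAA)

/* Select forms, matching the OP macros.  */
static uint64_t bsl (uint64_t x, uint64_t y, uint64_t z)
{ return (x & z) | (y & ~z); }
static uint64_t nbsl (uint64_t x, uint64_t y, uint64_t z)
{ return ~((x & z) | (y & ~z)); }
static uint64_t bsl1n (uint64_t x, uint64_t y, uint64_t z)
{ return (~x & z) | (y & ~z); }

/* XOR forms, as read from the "before" sequences (assumed register mapping).  */
static uint64_t bsl_xor (uint64_t x, uint64_t y, uint64_t z)
{ return ((x ^ y) & z) ^ y; }        /* eor, and, eor */
static uint64_t nbsl_xor (uint64_t x, uint64_t y, uint64_t z)
{ return ~(((x ^ y) & z) ^ y); }     /* eor, and, eor, not */
static uint64_t bsl1n_xor (uint64_t x, uint64_t y, uint64_t z)
{ return (z & ~(x ^ y)) ^ y; }       /* eor, bic, eor */

/* Hypothetical helper: returns 1 if all three pairs agree.  */
int check_bsl_identities (void)
{
  assert (bsl (X, Y, Z) == bsl_xor (X, Y, Z));
  assert (nbsl (X, Y, Z) == nbsl_xor (X, Y, Z));
  assert (bsl1n (X, Y, Z) == bsl1n_xor (X, Y, Z));
  return 1;
}
```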
Additionally, vector NOR and NAND operations are now optimized with NBSL:
NOR x, y -> NBSL x, y, x
NAND x, y -> NBSL x, y, y
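
The two mappings follow from substituting into the NBSL definition ~((x & z) | (y & ~z)): with z = x the result reduces to ~(x | y), and with z = y it reduces to ~(x & y). A minimal scalar check, with helper names invented for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of NBSL, matching the OP macro given for it.  */
static uint64_t nbsl (uint64_t x, uint64_t y, uint64_t z)
{
  return ~((x & z) | (y & ~z));
}

/* Hypothetical helper: returns 1 if both substitutions hold.  */
int check_nor_nand (void)
{
  /* These two constants cover all four (x, y) bit combinations.  */
  const uint64_t x = UINT64_C (0xF0F0F0F0F0F0F0F0);
  const uint64_t y = UINT64_C (0xCCCCCCCCCCCCCCCC);

  assert (nbsl (x, y, x) == ~(x | y));   /* NOR  */
  assert (nbsl (x, y, y) == ~(x & y));   /* NAND */
  return 1;
}
```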
Built and tested on aarch64-none-elf.
Best Regards,
Yuliang Wang

gcc/ChangeLog:

2019-10-16  Yuliang Wang  <[email protected]>

	* config/aarch64/aarch64-sve2.md (aarch64_sve2_eor3<mode>)
	(aarch64_sve2_nor<mode>, aarch64_sve2_nand<mode>)
	(aarch64_sve2_bsl<mode>, aarch64_sve2_nbsl<mode>)
	(aarch64_sve2_bsl1n<mode>, aarch64_sve2_bsl2n<mode>):
	New combine patterns.
	* config/aarch64/iterators.md (BSL_3RD): New int iterator for the above.
	(bsl_1st, bsl_2nd, bsl_3rd, bsl_mov): Attributes for the above.
	* config/aarch64/aarch64.h (AARCH64_ISA_SVE2_AES, AARCH64_ISA_SVE2_SM4)
	(AARCH64_ISA_SVE2_SHA3, AARCH64_ISA_SVE2_BITPERM): New ISA flag macros.
	(TARGET_SVE2_AES, TARGET_SVE2_SM4, TARGET_SVE2_SHA3)
	(TARGET_SVE2_BITPERM): New CPU targets.

gcc/testsuite/ChangeLog:

2019-10-16  Yuliang Wang  <[email protected]>

	* gcc.target/aarch64/sve2/eor3_1.c: New test.
	* gcc.target/aarch64/sve2/eor3_2.c: As above.
	* gcc.target/aarch64/sve2/nlogic_1.c: As above.
	* gcc.target/aarch64/sve2/nlogic_2.c: As above.
	* gcc.target/aarch64/sve2/bitsel_1.c: As above.
	* gcc.target/aarch64/sve2/bitsel_2.c: As above.
	* gcc.target/aarch64/sve2/bitsel_3.c: As above.
	* gcc.target/aarch64/sve2/bitsel_4.c: As above.
rb11975.patch
