Richard Sandiford <[email protected]> writes:
> Artemiy Volkov <[email protected]> writes:
>> SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
>> predication for the following SVE bit reversal instructions:
>>
>> - REVB, REVH, REVW (Reverse bytes / halfwords / words within elements
>>   (predicated))
>> - REVD (Reverse 64-bit doublewords in elements (predicated)) (SVE2 only)
>>
>> The first three are covered by the SVE_INT_UNARY code iterator, and REVD,
>> being SVE2-only, has a standalone pattern in aarch64-sve2.md.  This patch
>> adds an alternative for the zeroing-predication forms of the original
>> instructions.  The pattern for REVD also required changes to the predicate
>> for operand 3 to accept constant zero RTX.
>>
>> The tests that have been added are based on the original SVE/SVE2 tests
>> for corresponding instructions, but all have a "_z" suffix in their name
>> since they only test codegen for the "_z" variants of the corresponding
>> intrinsics.
>>
>> gcc/ChangeLog:
>>
>>      * config/aarch64/aarch64-sve.md (@cond_<optab><mode>):
>>      New alternative for zeroing predication.  Add `arch` attribute
>>      to every alternative.
>>      * config/aarch64/aarch64-sve2.md (@cond_<optab><mode>):
>>      Accept constant zero as operand 3.  New alternative for zeroing
>>      predication.  Add `arch` attribute to every alternative.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/aarch64/sve2/acle/asm/revb_s16_z.c: New test.
>>      * gcc.target/aarch64/sve2/acle/asm/revb_s32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revb_s64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revb_u16_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revb_u32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revb_u64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_bf16_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_f16_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_f32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_f64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_s16_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_s32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_s64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_s8_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_u16_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_u32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_u64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revd_u8_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revh_s32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revh_s64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revh_u32_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revh_u64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revw_s64_z.c: Likewise.
>>      * gcc.target/aarch64/sve2/acle/asm/revw_u64_z.c: Likewise.
>> ---
>>  gcc/config/aarch64/aarch64-sve.md             |  9 ++--
>>  gcc/config/aarch64/aarch64-sve2.md            |  9 ++--
>>  .../aarch64/sve2/acle/asm/revb_s16_z.c        | 30 ++++++++++++
>>  .../aarch64/sve2/acle/asm/revb_s32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revb_s64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revb_u16_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revb_u32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revb_u64_z.c        | 29 ++++++++++++
>>  .../aarch64/sve2/acle/asm/revd_bf16_z.c       | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_f16_z.c        | 47 +++++++++++++++++++
>>  .../aarch64/sve2/acle/asm/revd_f32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_f64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_s16_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_s32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_s64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_s8_z.c         | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_u16_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_u32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_u64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revd_u8_z.c         | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revh_s32_z.c        | 30 ++++++++++++
>>  .../aarch64/sve2/acle/asm/revh_s64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revh_u32_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revh_u64_z.c        | 30 ++++++++++++
>>  .../aarch64/sve2/acle/asm/revw_s64_z.c        | 28 +++++++++++
>>  .../aarch64/sve2/acle/asm/revw_u64_z.c        | 29 ++++++++++++
>>  26 files changed, 709 insertions(+), 8 deletions(-)
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_s16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_s32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_s64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_u16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_u32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revb_u64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_bf16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s8_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u16_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u8_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revh_s32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revh_s64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revh_u32_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revh_u64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revw_s64_z.c
>>  create mode 100644 
>> gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revw_u64_z.c
>>
>> diff --git a/gcc/config/aarch64/aarch64-sve.md 
>> b/gcc/config/aarch64/aarch64-sve.md
>> index 4b4e6cec549..79f0136101a 100644
>> --- a/gcc/config/aarch64/aarch64-sve.md
>> +++ b/gcc/config/aarch64/aarch64-sve.md
>> @@ -3531,10 +3531,11 @@
>>         (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>        UNSPEC_SEL))]
>>    "TARGET_SVE && <elem_bits> >= <min_elem_bits>"
>> -  {@ [ cons: =0 , 1   , 2 , 3  ; attrs: movprfx ]
>> -     [ w        , Upl , w , 0  ; *              ] 
>> <sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
>> -     [ ?&w      , Upl , w , Dz ; yes            ] movprfx\t%0.<Vetype>, 
>> %1/z, %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
>> -     [ ?&w      , Upl , w , w  ; yes            ] movprfx\t%0, 
>> %3\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
>> +  {@ [ cons: =0 , 1   , 2 , 3  ; attrs: movprfx, arch ]
>> +     [ w        , Upl , w , 0  ; *   , *                ] 
>> <sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
>> +     [ w        , Upl , w , Dz ; *   , sve2p2_or_sme2p2 ] 
>> <sve_int_op>\t%0.<Vetype>, %1/z, %2.<Vetype>
>> +     [ ?&w      , Upl , w , Dz ; yes , *                ] 
>> movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, 
>> %2.<Vetype>
>> +     [ ?&w      , Upl , w , w  ; yes , *                ] movprfx\t%0, 
>> %3\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
>>    }
>>    [(set_attr "sve_type" "sve_int_general")]
>>  )
>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>> b/gcc/config/aarch64/aarch64-sve2.md
>> index 017efb3cf5e..45e8b0a02a4 100644
>> --- a/gcc/config/aarch64/aarch64-sve2.md
>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>> @@ -4144,12 +4144,13 @@
>>         (unspec:SVE_FULL
>>           [(match_operand:SVE_FULL 2 "register_operand")]
>>           UNSPEC_REVD_ONLY)
>> -       (match_operand:SVE_FULL 3 "register_operand")]
>> +       (match_operand:SVE_FULL 3 "aarch64_simd_reg_or_zero")]
>>        UNSPEC_SEL))]
>>    "TARGET_SVE2p1_OR_SME"
>> -  {@ [ cons: =0 , 1   , 2 , 3  ; attrs: movprfx ]
>> -     [ w        , Upl , w , 0  ; *              ] revd\t%0.q, %1/m, %2.q
>> -     [ ?&w      , Upl , w , w  ; yes            ] movprfx\t%0, 
>> %3\;revd\t%0.q, %1/m, %2.q
>> +  {@ [ cons: =0 , 1   , 2 , 3  ; attrs: movprfx, arch ]
>> +     [ w        , Upl , w , 0  ; *   , *                ] revd\t%0.q, %1/m, 
>> %2.q
>> +     [ w        , Upl , w , Dz ; *   , sve2p2_or_sme2p2 ] revd\t%0.q, %1/z, 
>> %2.q
>> +     [ ?&w      , Upl , w , w  ; yes , *                ] movprfx\t%0, 
>> %3\;revd\t%0.q, %1/m, %2.q
>
> This looks behaviourally correct, but the predicate change will have the
> effect of allowing zero for SVE2p1 before register allocation, leaving the
> allocator to find a spare register at the last minute.  It'd probably be
> better to reject zero in the C++ condition for !TARGET_SVE2p2_OR_SME2p2,
> or add a new predicate that only allows zero for TARGET_SVE2p2_OR_SME2p2.

Sorry, saw later that Kyrill made the same comment in response to patch 8.

Reply via email to