> On 7 Nov 2025, at 10:42, Kyrylo Tkachov <[email protected]> wrote:
> 
> 
> 
>> On 7 Nov 2025, at 10:23, Andre Vieira <[email protected]> wrote:
>> 
>> Expands the use of eor3 where we'd otherwise use two vector eor's.
>> 
>> Bootstrapped and regression tested on aarch64-none-linux-gnu.
>> 
>> OK for trunk?
>> 
>> gcc/ChangeLog:
>> 
>> * config/aarch64/aarch64-simd.md (*eor3q<mode>4): New insn to be used by
>>       combine after reload to optimize any grouping of eor's that are using
>>       FP registers for scalar modes.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> * gcc.target/aarch64/eor3-opt.c: New test.
>> 
>> <eor3.patch>
> 
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 0d5b02a739fa74724d6dc8b658638d55b8db6890..3bf668e25b58a463f1d35387b1c6af7cc04e3a16 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -9201,15 +9201,28 @@
> 
> ;; sha3
> 
> -(define_insn "eor3q<mode>4"
> -  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
> - (xor:VDQ_I
> - (xor:VDQ_I
> -  (match_operand:VDQ_I 2 "register_operand" "w")
> -  (match_operand:VDQ_I 3 "register_operand" "w"))
> - (match_operand:VDQ_I 1 "register_operand" "w")))]
> +(define_insn_and_split "eor3q<mode>4"
> +  [(set (match_operand:VSDQ_I 0 "register_operand")
> + (xor:VSDQ_I
> + (xor:VSDQ_I
> +  (match_operand:VSDQ_I 2 "register_operand")
> +  (match_operand:VSDQ_I 3 "register_operand"))
> + (match_operand:VSDQ_I 1 "register_operand")))]
>   "TARGET_SHA3"
> -  "eor3\\t%0.16b, %1.16b, %2.16b, %3.16b"
> +  {@ [ cons: =0 , %1 , 2 , 3 ]
> +     [ w ,  w , w , w ] eor3\t%0.16b, %1.16b, %2.16b, %3.16b
> +     [ r ,  r , r , r ] #
> +  }
> +  "&& reload_completed && GP_REGNUM_P (REGNO (operands[0]))”
> The “=r,r,r,r” alternative should only be allowed for 64-bit modes?
> Maybe it’s cleaner to split this pattern to allow the define_and_split just 
> for VD_I modes?
> 
> 
> +  [(const_int 0)]
> +  {
> +    machine_mode xor_mode = <MODE>mode == DImode ? DImode : SImode;
> As a consequence of the above, this path can also be executed for modes
> that are neither 32-bit nor 64-bit.
> 
> +    emit_move_insn (operands[0],
> +    gen_rtx_XOR (xor_mode, operands[1], operands[2]));
> That doesn’t seem like it would generate valid RTL. I think the XOR needs to 
> have the same mode as its operands.
> So operands[1], operands[2] need to be wrapped in a subregion of an 
> appropriate mode to keep things consistent

Subregion -> subreg, of course.
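
To illustrate why viewing the operands through an integer-mode subreg is fine semantically, here is a standalone C sketch (not GCC code; the function names are made up for illustration). It assumes an 8-byte vector such as V8QI and compares a lane-wise three-input XOR with the same bytes reinterpreted as a single 64-bit integer and combined with two XORs:

```c
#include <stdint.h>
#include <string.h>

/* Lane-wise three-input XOR on an 8-byte vector (models eor3 on V8QI).  */
static void
eor3_lanes (uint8_t dst[8], const uint8_t a[8],
            const uint8_t b[8], const uint8_t c[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = a[i] ^ b[i] ^ c[i];
}

/* The same bytes reinterpreted as one 64-bit integer, roughly what a
   DImode subreg of the vector operand would express, combined with two
   XORs as in the GP-register split.  */
static void
eor3_as_di (uint8_t dst[8], const uint8_t a[8],
            const uint8_t b[8], const uint8_t c[8])
{
  uint64_t x, y, z, r;
  memcpy (&x, a, sizeof x);
  memcpy (&y, b, sizeof y);
  memcpy (&z, c, sizeof z);
  r = x ^ y;   /* first XOR: operands 1 and 2 */
  r ^= z;      /* second XOR: intermediate result and operand 3 */
  memcpy (dst, &r, sizeof r);
}
```

Both functions should produce identical bytes, since XOR is bitwise and does not care about lane boundaries. The mode of the XOR just has to agree with the mode of its (subreg-wrapped) operands.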


> Thanks,
> Kyrill
> 
> +    emit_move_insn (operands[0],
> +    gen_rtx_XOR (xor_mode, operands[0], operands[3]));
> +    DONE;
> +  }
>   [(set_attr "type" "crypto_sha3")]
> )

