> On 7 Nov 2025, at 10:23, Andre Vieira <[email protected]> wrote:
> 
> Expands the use of eor3 where we'd otherwise use two vector eor's.
> 
> Bootstrapped and regression tested on aarch64-none-linux-gnu.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-simd.md (*eor3q<mode>4): New insn to be used by
>        combine after reload to optimize any grouping of eor's that are using
>        FP registers for scalar modes.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/aarch64/eor3-opt.c: New test.
> 
> <eor3.patch>

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0d5b02a739fa74724d6dc8b658638d55b8db6890..3bf668e25b58a463f1d35387b1c6af7cc04e3a16 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -9201,15 +9201,28 @@
 
 ;; sha3
 
-(define_insn "eor3q<mode>4"
-  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
-       (xor:VDQ_I
-        (xor:VDQ_I
-         (match_operand:VDQ_I 2 "register_operand" "w")
-         (match_operand:VDQ_I 3 "register_operand" "w"))
-        (match_operand:VDQ_I 1 "register_operand" "w")))]
+(define_insn_and_split "eor3q<mode>4"
+  [(set (match_operand:VSDQ_I 0 "register_operand")
+       (xor:VSDQ_I
+        (xor:VSDQ_I
+         (match_operand:VSDQ_I 2 "register_operand")
+         (match_operand:VSDQ_I 3 "register_operand"))
+        (match_operand:VSDQ_I 1 "register_operand")))]
   "TARGET_SHA3"
-  "eor3\\t%0.16b, %1.16b, %2.16b, %3.16b"
+  {@ [ cons: =0 , %1 , 2 , 3 ]
+     [ w       ,  w , w , w ] eor3\t%0.16b, %1.16b, %2.16b, %3.16b
+     [ r       ,  r , r , r ] #
+  }
+  "&& reload_completed && GP_REGNUM_P (REGNO (operands[0]))"
Shouldn’t the “=r,r,r,r” alternative only be allowed for 64-bit modes?
Maybe it would be cleaner to split this pattern so that the
define_insn_and_split applies only to the VD_I modes?


+  [(const_int 0)]
+  {
+    machine_mode xor_mode = <MODE>mode == DImode ? DImode : SImode;
As a consequence of the above, this path can also be executed for modes that
are neither 32-bit nor 64-bit.

+    emit_move_insn (operands[0],
+                   gen_rtx_XOR (xor_mode, operands[1], operands[2]));
That doesn’t seem like it would generate valid RTL. I think the XOR needs to
have the same mode as its operands, so operands[1] and operands[2] need to be
wrapped in a subreg of an appropriate mode to keep things consistent.
Thanks,
Kyrill

+    emit_move_insn (operands[0],
+                   gen_rtx_XOR (xor_mode, operands[0], operands[3]));
+    DONE;
+  }
   [(set_attr "type" "crypto_sha3")]
 )
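
As a rough illustration only (this is not part of the patch or of eor3-opt.c, and the function names are hypothetical), the split path above is meant to produce the same value as a single three-way XOR by issuing two dependent XORs through the destination. A minimal C model of that equivalence for a 64-bit scalar, assuming nothing about register allocation:

```c
#include <stdint.h>

/* Models what one EOR3 computes for a 64-bit value: op1 ^ op2 ^ op3. */
static uint64_t xor3_single(uint64_t op1, uint64_t op2, uint64_t op3)
{
    return op1 ^ op2 ^ op3;
}

/* Models the two-instruction fallback from the split:
   first op0 = op1 ^ op2, then op0 = op0 ^ op3. */
static uint64_t xor3_split(uint64_t op1, uint64_t op2, uint64_t op3)
{
    uint64_t op0 = op1 ^ op2;  /* first XOR writes the destination */
    op0 = op0 ^ op3;           /* second XOR reuses the destination */
    return op0;
}
```

Both functions return the same result for any inputs, which is why the general-register alternative can be split into two plain XORs after reload.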
