On 30/07/18 14:30, Christophe Lyon wrote:
Hi,

On Tue, 24 Jul 2018 at 17:39, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:

On 24/07/18 16:12, James Greenhalgh wrote:
On Thu, Jul 19, 2018 at 07:35:22AM -0500, Matthew Malcomson wrote:
Hi again.

Providing an updated patch to include the formatting suggestions.
Please try not to top-post replies, it makes the conversation thread
harder to follow (reply continues below!).

On 12/07/18 11:39, Sudakshina Das wrote:
Hi Matthew

On 12/07/18 11:18, Richard Sandiford wrote:
Looks good to me FWIW (not a maintainer), just a minor formatting thing:

Matthew Malcomson <matthew.malcom...@arm.com> writes:
diff --git a/gcc/config/aarch64/aarch64-simd.md
b/gcc/config/aarch64/aarch64-simd.md
index
aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9
100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3302,38 +3302,78 @@
     DONE;
   })
   -(define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>"
+(define_insn "aarch64_<ANY_EXTEND:su>subw<mode>"
     [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
-        (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand"
"w")
-            (ANY_EXTEND:<VWIDE>
-              (match_operand:VD_BHSI 2 "register_operand" "w"))))]
+            (minus:<VWIDE>
+             (match_operand:<VWIDE> 1 "register_operand" "w")
+             (ANY_EXTEND:<VWIDE>
+               (match_operand:VD_BHSI 2 "register_operand" "w"))))]
The (minus should be under the "(match_operand":

(define_insn "aarch64_<ANY_EXTEND:su>subw<mode>"
    [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
     (minus:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
                (ANY_EXTEND:<VWIDE>
              (match_operand:VD_BHSI 2 "register_operand" "w"))))]

Same for the other patterns.

Thanks,
Richard

You will need a maintainer's approval but this looks good to me.
Thanks for doing this. I would only point out one other nit which you
can choose to ignore:

+/* Ensure
+   saddw2 and one saddw for the function add()
+   ssubw2 and one ssubw for the function subtract()
+   uaddw2 and one uaddw for the function uadd()
+   usubw2 and one usubw for the function usubtract() */
+
+/* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]ssubw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]saddw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]usubw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]uaddw\[ \t\]+" 1 } } */

The scan-assembly directives for the different
functions can be placed right below each of them and that would
make it easier to read the expected results in the test and you
can get rid of the comments saying the same.
Thanks for the first-line review Sudi.

OK for trunk.

I've committed this on behalf of Matthew as: https://gcc.gnu.org/r262949

The new test fail with -mabi=ilp32:
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]saddw2[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]saddw[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]ssubw2[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]ssubw[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]uaddw2[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]uaddw[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]usubw2[ \t]+ 1
     gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [
\t]usubw[ \t]+ 1

I'm not sure how much we care about ilp32?

Looks like the usual pitfall of unsigned longs being 32-bit in ILP32.
The test should be made to use "long long"s that are guaranteed to be 64 bits
instead of just longs

Thanks,
Kyrill

Christophe

Thanks,
Kyrill

Thanks,
James


Reply via email to