[PATCH v8] rs6000: Use vector addition when left shifting by 1 [PR119702]

Avinash Jayakar Fri, 24 Oct 2025 02:20:01 -0700

Hi Segher,

Please ignore the v7 patch; it had minor capitalization errors.
Thanks for the review of the v6 patch. I have incorporated all the requested
changes in this patch.
Bootstrapped and regtested on powerpc64le-linux with no regressions.
Ok for trunk?


As mentioned in the PATCH v6 thread, the vaddudm tests fail on power8/9 for:
1. pr119702-3.c: Surya's patch for PR107757 should fix this test case once
it is upstream.
2. pr119702-4.c: Work in progress (PR122065).

Changes from v7:
        1. Minor corrections in commit message.
Changes from v6:
        1. for loop formatting in predicates.md
        2. Correct type from long to char in pr119702-2.c
        3. Add has_arch_pwr8 for vaddudm tests.
Changes from v5:
        1. Corrected formatting and define_insn name.
        2. Removed cpu specific flag for altivec tests.
        3. Remove target lp64 for altivec tests.
        4. Use native types instead of types from inttypes.h for altivec tests.
Changes from v4:
        1. Added comments for the new predicate "vector_constant_1".
        2. Added new tests for altivec vector types.
        3. Added comments in test file.
Changes from v3:
        1. Add author information before changelog.
        2. Right placement of PR target/119702.
        3. Added new test to check multiply by 2 generates vadd insn.
Changes from v2:
        1. Indentation fixes in the commit message
        2. define_insn has the name *altivec_vsl<VI_char>_const_1
        3. Iterate starting from 0 for checking vector constant = 1 and
        fixed source code formatting for the for loop.
        4. Removed unused macro in pr119702-1.c test file
Changes from v1:
        1. Use define_insn instead of define_expand to recognize left
        shift by constant 1 and generate add instruction.
        2. Updated test cases to cover integer types from byte, half-
        word, word and double word.

Thanks and regards,
Avinash Jayakar


When a vector of integers is left shifted by the constant value 1,
GCC generates the following code for the powerpc64le target:
        vspltisw 0,1
        vsld 2,2,0
Instead, the following code can be generated, which is more efficient:
        vaddudm 2,2,2
This patch adds a pattern in altivec.md that recognizes a vector left
shift by a constant vector whose elements are all 1 and generates the
corresponding add instruction.
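
As a rough illustration of why the substitution is safe, here is a minimal
scalar sketch in C. It assumes each lane behaves like an unsigned integer
with wraparound; LANES, shift_left_by_one, add_to_self and shift_matches_add
are hypothetical names used only for this sketch and are not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 2  /* models a V2DI vector: two 64-bit lanes */

/* Per-lane left shift by 1, as the original vsld sequence computes it. */
static void shift_left_by_one(const uint64_t *in, uint64_t *out)
{
  assert(in != NULL && out != NULL);
  for (int i = 0; i < LANES; i++)
    out[i] = in[i] << 1;
}

/* Per-lane add of a value to itself; unsigned addition wraps modulo 2^64,
   which is the behaviour a modulo vector add such as vaddudm provides. */
static void add_to_self(const uint64_t *in, uint64_t *out)
{
  assert(in != NULL && out != NULL);
  for (int i = 0; i < LANES; i++)
    out[i] = in[i] + in[i];
}

/* Return 1 if both computations agree on every lane, 0 otherwise. */
static int shift_matches_add(const uint64_t *in)
{
  uint64_t shifted[LANES];
  uint64_t added[LANES];

  shift_left_by_one(in, shifted);
  add_to_self(in, added);
  for (int i = 0; i < LANES; i++)
    if (shifted[i] != added[i])
      return 0;
  return 1;
}
```

The two results agree even when the top bit is set, because both the shift
and the unsigned add discard the carry out of the lane.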

It also adds a predicate in predicates.md that checks whether an rtl
node is a uniform constant vector with value 1.
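
The check the predicate performs can be sketched in plain C as follows. This
is only a model under the assumption that the vector elements are available
as an array of integers; it does not use GCC's RTL accessors, and the name
all_elements_are_one is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if every one of the nunits elements equals 1.
   Iteration starts at element 0 and stops at the first mismatch. */
static bool all_elements_are_one(const long *elts, size_t nunits)
{
  assert(elts != NULL);
  for (size_t i = 0; i < nunits; i++)
    {
      if (elts[i] != 1)
        return false;
    }
  return true;
}
```

A vector such as { 1, 1, 1, 1 } passes this check, while { 1, 1, 2, 1 }
does not, so only a uniform shift count of 1 would be treated as an add.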

2025-10-24  Avinash Jayakar  <[email protected]>

gcc/ChangeLog:
        PR target/119702
        * config/rs6000/altivec.md (*altivec_vsl<VI_char>_const_1): Recognize
        << 1 and replace with vadd insn.
        * config/rs6000/predicates.md (vector_constant_1): Predicate to check if
        all elements of a vector constant are 1.

gcc/testsuite/ChangeLog:
        PR target/119702
        * gcc.target/powerpc/pr119702-1.c: New test.
        * gcc.target/powerpc/pr119702-2.c: New test.
        * gcc.target/powerpc/pr119702-3.c: New test.
        * gcc.target/powerpc/pr119702-4.c: New test.
---
 gcc/config/rs6000/altivec.md                  |  8 +++
 gcc/config/rs6000/predicates.md               | 13 ++++
 gcc/testsuite/gcc.target/powerpc/pr119702-1.c | 60 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr119702-2.c | 59 ++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr119702-3.c | 36 +++++++++++
 gcc/testsuite/gcc.target/powerpc/pr119702-4.c | 36 +++++++++++
 6 files changed, 212 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-4.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index fa3368079ad..11b8501a7d0 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2107,6 +2107,14 @@
   "vsrv %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "*altivec_vsl<VI_char>_const_1"
+  [(set (match_operand:VI2 0 "register_operand" "=v")
+       (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")
+                     (match_operand:VI2 2 "vector_constant_1" "")))]
+  "<VI_unit>"
+  "vaddu<VI_char>m %0,%1,%1"
+)
+
 (define_insn "*altivec_vsl<VI_char>"
   [(set (match_operand:VI2 0 "register_operand" "=v")
         (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 647e89afb6a..017ff867aea 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -924,6 +924,19 @@
     }
 })
 
+;; Return 1 if the operand is a vector constant with 1 in all of the elements.
+(define_predicate "vector_constant_1"
+  (match_code "const_vector")
+{
+  unsigned nunits = GET_MODE_NUNITS (mode);
+  for (unsigned i = 0; i < nunits; i++)
+    {
+      if (INTVAL (CONST_VECTOR_ELT (op, i)) != 1)
+       return 0;
+    }
+  return 1;
+})
+
 ;; Return 1 if operand is 0.0.
 (define_predicate "zero_fp_constant"
   (and (match_code "const_double")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-1.c b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
new file mode 100644
index 00000000000..d12ae23be60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -maltivec" } */
+/* { dg-require-effective-target powerpc_altivec } */
+
+/* PR target/119702 -- verify left shift by 1 is converted into
+   a vaddu<x>m instruction.  */
+
+
+void lshift1_64(unsigned long long *a)
+{
+  a[0] <<= 1;
+  a[1] <<= 1;
+}
+
+void lshift1_32(unsigned int *a)
+{
+  a[0] <<= 1;
+  a[1] <<= 1;
+  a[2] <<= 1;
+  a[3] <<= 1;
+}
+
+void lshift1_16(unsigned short *a)
+{
+  a[0] <<= 1;
+  a[1] <<= 1;
+  a[2] <<= 1;
+  a[3] <<= 1;
+  a[4] <<= 1;
+  a[5] <<= 1;
+  a[6] <<= 1;
+  a[7] <<= 1;
+}
+
+void lshift1_8(unsigned char *a)
+{
+  a[0] <<= 1;
+  a[1] <<= 1;
+  a[2] <<= 1;
+  a[3] <<= 1;
+  a[4] <<= 1;
+  a[5] <<= 1;
+  a[6] <<= 1;
+  a[7] <<= 1;
+  a[8] <<= 1;
+  a[9] <<= 1;
+  a[10] <<= 1;
+  a[11] <<= 1;
+  a[12] <<= 1;
+  a[13] <<= 1;
+  a[14] <<= 1;
+  a[15] <<= 1;
+}
+
+
+/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
+/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-2.c b/gcc/testsuite/gcc.target/powerpc/pr119702-2.c
new file mode 100644
index 00000000000..45161f6311a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr119702-2.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -maltivec" } */
+/* { dg-require-effective-target powerpc_altivec } */
+
+/* PR target/119702 -- verify multiply by 2 is converted into
+   a vaddu<x>m instruction.  */
+
+void lshift1_64(unsigned long long *a)
+{
+  a[0] *= 2;
+  a[1] *= 2;
+}
+
+void lshift1_32(unsigned int *a)
+{
+  a[0] *= 2;
+  a[1] *= 2;
+  a[2] *= 2;
+  a[3] *= 2;
+}
+
+void lshift1_16(unsigned short *a)
+{
+  a[0] *= 2;
+  a[1] *= 2;
+  a[2] *= 2;
+  a[3] *= 2;
+  a[4] *= 2;
+  a[5] *= 2;
+  a[6] *= 2;
+  a[7] *= 2;
+}
+
+void lshift1_8(unsigned char  *a)
+{
+  a[0] *= 2;
+  a[1] *= 2;
+  a[2] *= 2;
+  a[3] *= 2;
+  a[4] *= 2;
+  a[5] *= 2;
+  a[6] *= 2;
+  a[7] *= 2;
+  a[8] *= 2;
+  a[9] *= 2;
+  a[10] *= 2;
+  a[11] *= 2;
+  a[12] *= 2;
+  a[13] *= 2;
+  a[14] *= 2;
+  a[15] *= 2;
+}
+
+
+/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
+/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-3.c b/gcc/testsuite/gcc.target/powerpc/pr119702-3.c
new file mode 100644
index 00000000000..28a53e75dd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr119702-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -maltivec" } */
+/* { dg-require-effective-target powerpc_altivec } */
+
+/* PR target/119702 -- verify vector shift left by 1 is converted into
+   a vaddu<x>m instruction.  */
+
+vector unsigned long long
+lshift1_64 (vector unsigned long long a)
+{
+  return a << (vector unsigned long long) { 1, 1 };
+}
+
+vector unsigned int
+lshift1_32 (vector unsigned int a)
+{
+  return a << (vector unsigned int) { 1, 1, 1, 1 };
+}
+
+vector unsigned short
+lshift1_16 (vector unsigned short a)
+{
+  return a << (vector unsigned short) { 1, 1, 1, 1, 1, 1, 1, 1 };
+}
+
+vector unsigned char
+lshift1_8 (vector unsigned char a)
+{
+  return a << (vector unsigned char) { 1, 1, 1, 1, 1, 1, 1, 1,
+                                       1, 1, 1, 1, 1, 1, 1, 1 };
+}
+
+/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
+/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-4.c b/gcc/testsuite/gcc.target/powerpc/pr119702-4.c
new file mode 100644
index 00000000000..759866e2cd2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr119702-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -maltivec" } */
+/* { dg-require-effective-target powerpc_altivec } */
+
+/* PR target/119702 -- verify vector multiply by 2 is converted into
+   a vaddu<x>m instruction.  */
+
+vector unsigned long long
+lshift1_64 (vector unsigned long long a)
+{
+  return a * (vector unsigned long long) { 2, 2 };
+}
+
+vector unsigned int
+lshift1_32 (vector unsigned int a)
+{
+  return a * (vector unsigned int) { 2, 2, 2, 2 };
+}
+
+vector unsigned short
+lshift1_16 (vector unsigned short a)
+{
+  return a * (vector unsigned short) { 2, 2, 2, 2, 2, 2, 2, 2 };
+}
+
+vector unsigned char
+lshift1_8 (vector unsigned char a)
+{
+  return a * (vector unsigned char) { 2, 2, 2, 2, 2, 2, 2, 2,
+                                       2, 2, 2, 2, 2, 2, 2, 2 };
+}
+
+/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
+/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
-- 
2.51.0
