work221-sha)] Update ChangeLog.*

Michael Meissner via Gcc-cvs Mon, 08 Sep 2025 13:06:32 -0700

https://gcc.gnu.org/g:440276a1e1117e1ce26b39e6d28b2e07b584d1b9


commit 440276a1e1117e1ce26b39e6d28b2e07b584d1b9
Author: Michael Meissner <meiss...@linux.ibm.com>
Date:   Mon Sep 8 16:05:57 2025 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.sha | 2426 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2426 insertions(+)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
index 0af4cdd11faf..2629fb9271e2 100644
--- a/gcc/ChangeLog.sha
+++ b/gcc/ChangeLog.sha
@@ -1,3 +1,2429 @@
+==================== Branch work221-sha, patch #445 ====================
+
+PR target/117251: Add tests
+
+This is patch #45 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+This patch adds the tests for generating 'XXEVAL' to the testsuite.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/testsuite/
+
+       PR target/117251
+       * gcc.target/powerpc/p10-vector-fused-1.c: New test.
+       * gcc.target/powerpc/p10-vector-fused-2.c: Likewise.
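
As a rough illustration of the kind of source such tests exercise, the function below expresses the 'and' feeding 'nand' pattern with GCC's generic vector extension. This is a hypothetical sketch, not the contents of p10-vector-fused-1.c or p10-vector-fused-2.c; the names v4si and and_nand are made up for the example. Whether a power10 build emits VAND/VNAND or a single XXEVAL for it depends on register allocation.

```c
/* Generic GCC vector type of four ints.  The real tests may instead use
   'vector int' from <altivec.h>; this typedef is an assumption.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical example function: an 'and' whose result feeds a 'nand'.  */
v4si
and_nand (v4si b, v4si c, v4si d)
{
  return ~((c & d) & b);
}
```

The function is portable C for GCC-compatible compilers, so its result can be checked on any host even though the instruction selection only matters on PowerPC.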
+
+==================== Branch work221-sha, patch #444 ====================
+
+PR target/117251: Improve vector and to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #44 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c & d) & b);
+
+Generates:
+
+       vand   t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,254
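
The immediate in the 'XXEVAL' instruction can be read as an 8-entry truth table. The sketch below derives it for the 'and' feeding 'nand' case. It assumes the three source operands are b, c and d in that order, with b as the most significant selector bit, and that the table entry for b=c=d=0 is the most significant bit of the immediate. The helper names nand_of_and and xxeval_imm are hypothetical, and this is not how genfusion.pl computes the value.

```c
/* Hypothetical helper: the Boolean function being fused, applied to one
   bit each of b, c and d.  This is the 'and' feeding 'nand' case.  */
static unsigned
nand_of_and (unsigned b, unsigned c, unsigned d)
{
  return !((c & d) & b);
}

/* Build the 8-bit truth table for FN.  Assumption: b is the most
   significant selector bit, d the least significant, and the entry for
   b=c=d=0 lands in the most significant bit of the immediate.  */
static unsigned
xxeval_imm (unsigned (*fn) (unsigned, unsigned, unsigned))
{
  unsigned imm = 0;

  for (unsigned idx = 0; idx < 8; idx++)
    {
      unsigned b = (idx >> 2) & 1;
      unsigned c = (idx >> 1) & 1;
      unsigned d = idx & 1;

      if (fn (b, c, d))
        imm |= 0x80u >> idx;
    }

  return imm;
}
```

Under those assumptions, xxeval_imm (nand_of_and) yields 254, matching the immediate shown above: only the b=c=d=1 entry is zero.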
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector and => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #443 ====================
+
+PR target/117251: Improve vector andc to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #43 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c & ~ d) & b);
+
+Generates:
+
+       vandc  t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,253
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector andc => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #442 ====================
+
+PR target/117251: Improve vector xor to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #42 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VXOR' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c ^ d) & b);
+
+Generates:
+
+       vxor   t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,249
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector xor => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #441 ====================
+
+PR target/117251: Improve vector or to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #41 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VOR' instruction feeding into
+'VNAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c | d) & b);
+
+Generates:
+
+       vor    t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,248
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector or => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #440 ====================
+
+PR target/117251: Improve vector nor to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #40 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNOR' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c | d)) & b);
+
+Generates:
+
+       vnor   t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,247
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nor => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #439 ====================
+
+PR target/117251: Improve vector eqv to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #39 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VEQV' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c ^ d)) & b);
+
+Generates:
+
+       veqv   t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,246
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector eqv => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #438 ====================
+
+PR target/117251: Improve vector orc to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #38 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c | ~ d) & b);
+
+Generates:
+
+       vorc   t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,244
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #437 ====================
+
+PR target/117251: Improve vector nand to vector nand fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #37 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNAND' instruction feeding
+into 'VNAND'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c & d)) & b);
+
+Generates:
+
+       vnand  t,c,d
+       vnand  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,241
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nand => nand fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #436 ====================
+
+PR target/117251: Improve vector nand to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #36 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNAND' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c & d)) | b;
+
+Generates:
+
+       vnand  t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,239
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nand => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #435 ====================
+
+PR target/117251: Improve vector nand to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #35 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNAND' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c & d)) ^ b;
+
+Generates:
+
+       vnand  t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,225
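
To check an immediate such as 225 against the C expression it replaces, a scalar model of the table lookup is enough. The sketch below works on 32-bit words rather than 128-bit vectors. It assumes the source operands are b, c and d in that order, with b as the most significant selector bit and the entry for b=c=d=0 in the most significant bit of the immediate. The function name eval_ternary is hypothetical.

```c
#include <stdint.h>

/* Bit-serial model of a three-input table lookup: for every bit position,
   the bits of b, c and d form an index 0..7 that selects one bit of IMM.
   Assumption: index 0 selects the most significant bit of IMM.  */
static uint32_t
eval_ternary (uint32_t b, uint32_t c, uint32_t d, unsigned imm)
{
  uint32_t result = 0;

  for (int bit = 0; bit < 32; bit++)
    {
      unsigned idx = (((b >> bit) & 1) << 2)
                     | (((c >> bit) & 1) << 1)
                     | ((d >> bit) & 1);

      if (imm & (0x80u >> idx))
        result |= (uint32_t) 1 << bit;
    }

  return result;
}
```

With b = 0xF0F0F0F0, c = 0xCCCCCCCC and d = 0xAAAAAAAA every input combination occurs, so comparing eval_ternary (b, c, d, 225) with (~(c & d)) ^ b covers the whole table for the 'nand' feeding 'xor' case.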
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nand => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #434 ====================
+
+PR target/117251: Improve vector and to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #34 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c & d) | b);
+
+Generates:
+
+       vand   t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,224
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector and => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #433 ====================
+
+PR target/117251: Improve vector andc to vector eqv fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #33 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VEQV'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c & ~ d) ^ b);
+
+Generates:
+
+       vandc  t,c,d
+       veqv   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,210
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector andc => eqv fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #432 ====================
+
+PR target/117251: Improve vector andc to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #32 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c & ~ d) | b);
+
+Generates:
+
+       vandc  t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,208
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector andc => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #431 ====================
+
+PR target/117251: Improve vector orc to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #31 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | ~ d) | b;
+
+Generates:
+
+       vorc   t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,191
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #430 ====================
+
+PR target/117251: Improve vector orc to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #30 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | ~ d) ^ b;
+
+Generates:
+
+       vorc   t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,180
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #429 ====================
+
+PR target/117251: Improve vector eqv to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #29 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VEQV' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c ^ d)) | b;
+
+Generates:
+
+       veqv   t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC now generates the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,159
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, an
+extra NOP instruction might be needed to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector eqv => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #428 ====================
+
+PR target/117251: Improve vector eqv to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #28 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VEQV' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c ^ d)) ^ b;
+
+Generates:
+
+       veqv   t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,150
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector eqv => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #427 ====================
+
+PR target/117251: Improve vector xor to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #27 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VXOR' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c ^ d) | b);
+
+Generates:
+
+       vxor   t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,144
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector xor => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #426 ====================
+
+PR target/117251: Improve vector nor to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #26 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNOR' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c | d)) | b;
+
+Generates:
+
+       vnor   t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,143
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nor => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #425 ====================
+
+PR target/117251: Improve vector nor to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #25 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNOR' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c | d)) ^ b;
+
+Generates:
+
+       vnor   t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,135
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nor => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #424 ====================
+
+PR target/117251: Improve vector or to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #24 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VOR' instruction feeding into
+'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c | d) | b);
+
+Generates:
+
+       vor    t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,128
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector or => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #423 ====================
+
+PR target/117251: Improve vector or to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #23 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VOR' instruction feeding into
+'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | d) | b;
+
+Generates:
+
+       vor    t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,127
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector or => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #422 ====================
+
+PR target/117251: Improve vector or to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #22 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VOR' instruction feeding into
+'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | d) ^ b;
+
+Generates:
+
+       vor    t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,120
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector or => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #421 ====================
+
+PR target/117251: Improve vector nor to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #21 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNOR' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c | d)) | b);
+
+Generates:
+
+       vnor   t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,112
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector nor => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #420 ====================
+
+PR target/117251: Improve vector xor to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #20 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VXOR' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c ^ d) | b;
+
+Generates:
+
+       vxor   t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,111
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector xor => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #419 ====================
+
+PR target/117251: Improve vector xor to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #19 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VXOR' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c ^ d) ^ b;
+
+Generates:
+
+       vxor   t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,105
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector xor => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #418 ====================
+
+PR target/117251: Improve vector eqv to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #18 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VEQV' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c ^ d)) | b);
+
+Generates:
+
+       veqv   t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,96
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector eqv => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #417 ====================
+
+PR target/117251: Improve vector orc to vector orc fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #17 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VORC'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | ~ d) | ~ b;
+
+Generates:
+
+       vorc   t,c,d
+       vorc   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,79
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => orc fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #416 ====================
+
+PR target/117251: Improve vector orc to vector eqv fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #16 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VEQV'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c | ~ d) ^ b);
+
+Generates:
+
+       vorc   t,c,d
+       veqv   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,75
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => eqv fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #415 ====================
+
+PR target/117251: Improve vector orc to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #15 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((c | ~ d) | b);
+
+Generates:
+
+       vorc   t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,64
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector orc => nor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #414 ====================
+
+PR target/117251: Improve vector andc to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #14 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & ~ d) | b;
+
+Generates:
+
+       vandc  t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,47
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector andc => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #413 ====================
+
+PR target/117251: Improve vector andc to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #13 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & ~ d) ^ b;
+
+Generates:
+
+       vandc  t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,45
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector andc => xor fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #412 ====================
+
+PR target/117251: Improve vector and to vector or fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #12 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & d) | b;
+
+Generates:
+
+       vand   t,c,d
+       vor    a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,31
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector and => or fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #411 ====================
+
+PR target/117251: Improve vector and to vector xor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #11 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VXOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & d) ^ b;
+
+Generates:
+
+       vand   t,c,d
+       vxor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR, GCC now generates the following code instead of
+adding vector move instructions:
+
+       xxeval a,b,c,d,30
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction we prefer to generate the Altivec instructions
+if we can.  In addition, because 'XXEVAL' is a prefixed instruction, it
+possibly might generate an extra NOP instruction to align the 'XXEVAL'
+instruction.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector and/xor fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #410 ====================
+
+PR target/117251: Improve vector nand to vector nor fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #10 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNAND' instruction feeding
+into 'VNOR'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = ~ ((~ (c & d)) | b);
+
+Generates:
+
+       vnand  t,c,d
+       vnor   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,16
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
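+By De Morgan's law the expression above reduces to (c & d) & ~ b.  The
+following sketch, with made-up function names, checks that on inputs
+covering all eight bit combinations of b, c, and d.
+
```c
#include <assert.h>
#include <stdint.h>

/* The expression as written in the example: vnand feeding vnor.  */
static uint64_t
nand_nor (uint64_t b, uint64_t c, uint64_t d)
{
  uint64_t t = ~(c & d);
  return ~(t | b);
}

/* Simplified form obtained with De Morgan's law.  */
static uint64_t
nand_nor_simplified (uint64_t b, uint64_t c, uint64_t d)
{
  return (c & d) & ~b;
}

static void
check_nand_nor (void)
{
  /* These three constants cover all eight (b, c, d) bit combinations.  */
  uint64_t b = 0xF0F0F0F0F0F0F0F0ULL;
  uint64_t c = 0xCCCCCCCCCCCCCCCCULL;
  uint64_t d = 0xAAAAAAAAAAAAAAAAULL;

  assert (nand_nor (b, c, d) == nand_nor_simplified (b, c, d));
}
```
+
+Only the combination b = 0, c = 1, d = 1 produces a 1 bit, which is
+consistent with an immediate that has a single bit set, such as 16.
+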
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector nand/nor fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #409 ====================
+
+PR target/117251: Improve vector nand to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #9 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNAND' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c & d)) & b;
+
+Generates:
+
+       vnand  t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,14
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector nand/and fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #408 ====================
+
+PR target/117251: Improve vector andc to vector andc fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #8 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VANDC'.  The 'XXEVAL' instruction can use all 64 vector
+registers, instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & ~ d) & ~ b;
+
+Generates:
+
+       vandc  t,c,d
+       vandc  a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,13
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector andc/andc fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #407 ====================
+
+PR target/117251: Improve vector orc to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #7 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VORC' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | ~ d) & b;
+
+Generates:
+
+       vorc   t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,11
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector orc/and fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #406 ====================
+
+PR target/117251: Improve vector eqv to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #6 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VEQV' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c ^ d)) & b;
+
+Generates:
+
+       veqv   t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,9
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector eqv/and fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #405 ====================
+
+PR target/117251: Improve vector nor to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #5 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VNOR' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (~ (c | d)) & b;
+
+Generates:
+
+       vnor   t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,8
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector nor/and fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #404 ====================
+
+PR target/117251: Improve vector or to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #4 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VOR' instruction feeding into
+'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c | d) & b;
+
+Generates:
+
+       vor    t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,7
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
+       generate vector/vector or/and fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #403 ====================
+
+PR target/117251: Improve vector xor to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #3 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VXOR' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c ^ d) & b;
+
+Generates:
+
+       vxor   t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,6
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
+       generate vector/vector xor/and fusion if XXEVAL is supported.
+
+==================== Branch work221-sha, patch #402 ====================
+
+PR target/117251: Improve vector andc to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #2 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VANDC' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & ~ d) & b;
+
+Generates:
+
+       vandc  t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,2
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add support
+       to generate vector/vector andc/and fusion if XXEVAL is
+       supported.
+
+==================== Branch work221-sha, patch #401 ====================
+
+PR target/117251: Improve vector and to vector and fusion
+
+See the following post for a complete explanation of what the patches
+for PR target/117251 do:
+
+ * https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686474.html
+
+This is patch #1 of 45 to generate the 'XXEVAL' instruction on power10
+and power11 instead of using the Altivec 'VAND' instruction feeding
+into 'VAND'.  The 'XXEVAL' instruction can use all 64 vector registers,
+instead of the 32 registers that traditional Altivec vector
+instructions use.  By allowing all of the vector registers to be used,
+it reduces the amount of spilling that a large benchmark generated.
+
+Currently the following code:
+
+       vector int a, b, c, d;
+       a = (c & d) & b;
+
+Generates:
+
+       vand   t,c,d
+       vand   a,t,b
+
+With this patch, if any of the arguments or the result is allocated to
+a traditional FPR register, GCC will generate the following code
+instead of adding vector move instructions:
+
+       xxeval a,b,c,d,1
+
+Since fusion using 2 Altivec instructions is slightly faster than using
+the 'XXEVAL' instruction, we prefer to generate the Altivec
+instructions if we can.  In addition, because 'XXEVAL' is a prefixed
+instruction, an extra NOP instruction may be needed to align it.
+
+I have tested these patches on both big endian and little endian
+PowerPC servers, with no regressions.  Can I check these patches into
+the trunk?
+
+2025-09-08  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/117251
+       * config/rs6000/fusion.md: Regenerate.
+       * config/rs6000/genfusion.pl (gen_logical_addsubf): Add
+       support to generate vector/vector and/and fusion if XXEVAL is
+       supported.
+       * config/rs6000/predicates.md (vector_fusion_operand): New
+       predicate.
+       * config/rs6000/rs6000.h (TARGET_XXEVAL): New macro.
+       * config/rs6000/rs6000.md (isa attribute): Add xxeval.
+       (enabled attribute): Add XXEVAL support.
+
+==================== Branch work221-sha, information ====================
+
+PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations
+
+History: This is version 2 of the patch.  In the original patch, all 44
+fusion opportunities were lumped together in one patch.  Outside of
+fusion.md, these changes are fairly small, in that they add one
+alternative to each of the fusion patterns to add xxeval support.
+fusion.md is a generated file (created by genfusion.pl) that contains
+all of the fusion combinations.  Because of these automated changes,
+fusion.md had 265 lines that were deleted and 397 lines that were
+added.
+
+In version 2 of the patch, I broke the original patch into 45 separate
+patches.  The first patch adds the basic support to genfusion.pl,
+predicates.md, rs6000.h, and rs6000.md, along with the first fusion
+case (vector 'AND' fusing into vector 'AND').  The next 43 patches
+each add one more fusion case.  The last patch adds the two test
+cases.
+
+The multibuff.c benchmark attached to PR target/117251 implements
+SHA3.  When compiled for Power10, it shows a slowdown with the current
+trunk and GCC 14 compared to GCC 11 - GCC 13, due to excessive amounts
+of spilling.
+
+The main function for the multibuf.c file has 3,747 lines, all of which
+are using vector unsigned long long.  There are 696 vector rotates (all
+rotates are constant), 1,824 vector xor's and 600 vector andc's.
+
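+These counts match what one fully unrolled Keccak-f[1600] permutation
+of 24 rounds would need.  That the benchmark is structured this way is
+an assumption here; the per-round counts below are the usual ones for
+the theta, rho, chi, and iota steps.
+
```c
#include <assert.h>

enum { KECCAK_ROUNDS = 24, KECCAK_LANES = 25 };

/* Rotates per round: 5 in theta plus 24 non-zero rotations in rho.  */
static int
rotates_total (void)
{
  return (5 + 24) * KECCAK_ROUNDS;
}

/* XORs per round: theta column parity (5 * 4), theta D values (5),
   theta apply (25), chi (25), and iota (1).  */
static int
xors_total (void)
{
  return (5 * 4 + 5 + KECCAK_LANES + KECCAK_LANES + 1) * KECCAK_ROUNDS;
}

/* ANDCs per round: one per lane in chi.  */
static int
andcs_total (void)
{
  return KECCAK_LANES * KECCAK_ROUNDS;
}

/* Compare against the counts quoted for the benchmark.  */
static void
check_keccak_counts (void)
{
  assert (rotates_total () == 696);
  assert (xors_total () == 1824);
  assert (andcs_total () == 600);
}
```
+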
+Looking at the generated code, the main thing that stands out is that
+the spilling and the extra vector moves are caused by the support in
+fusion.md (generated by genfusion.pl) that tries to fuse a vec_andc
+feeding into a vec_xor, and a vec_xor feeding into another vec_xor.
+
+On power10, there is a special fusion mode that applies when a VANDC
+or VXOR instruction is adjacent to a following VXOR instruction and
+the result of the VANDC/VXOR feeds into that second VXOR instruction.
+
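+In SHA3 terms, the VANDC feeding VXOR pair is what the chi step of
+Keccak produces.  A scalar sketch of one row, assuming the benchmark
+follows the standard chi definition (the function names are made up):
+
```c
#include <assert.h>
#include <stdint.h>

/* Chi step on one row of five lanes: each output needs one ANDC
   feeding one XOR, the pair that power10 can fuse.  */
static void
chi_row (uint64_t out[5], const uint64_t in[5])
{
  for (int x = 0; x < 5; x++)
    {
      uint64_t t = in[(x + 2) % 5] & ~in[(x + 1) % 5];  /* andc */
      out[x] = in[x] ^ t;                               /* xor */
    }
}

/* Worked example with one bit set per lane.  */
static void
check_chi_row (void)
{
  const uint64_t in[5] = { 1, 2, 4, 8, 16 };
  uint64_t out[5];

  chi_row (out, in);
  assert (out[0] == 5 && out[1] == 10 && out[2] == 20
          && out[3] == 9 && out[4] == 18);
}
```
+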
+While Power10 has 64 vector registers (the VSX registers, whose logical
+instructions use the XXL prefix), the fusion only works with the older
+Altivec instruction set (which uses the V prefix).  The Altivec
+instruction set only has 32 vector registers (which are overlaid on
+VSX vector registers 32-63).
+
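+The register overlay can be written down as follows.  The numbering
+(Altivec register N is VSX register N + 32) comes from the paragraph
+above; the function names and the all-operands check are a simplified
+illustration, not GCC's actual register classes.
+
```c
#include <assert.h>

/* Altivec register vN is VSX register number N + 32.  */
static int
altivec_to_vsx (int vreg)
{
  return vreg + 32;
}

/* Nonzero if VSX register VSREG can be named by an Altivec (V prefix)
   instruction.  */
static int
vsx_is_altivec (int vsreg)
{
  return vsreg >= 32 && vsreg <= 63;
}

/* Simplified view: the fused Altivec pair is only possible when the
   result and all three inputs live in Altivec registers.  */
static int
can_use_altivec_pair (int dest, int a, int b, int c)
{
  return vsx_is_altivec (dest) && vsx_is_altivec (a)
         && vsx_is_altivec (b) && vsx_is_altivec (c);
}

static void
check_overlay (void)
{
  assert (altivec_to_vsx (0) == 32);
  assert (altivec_to_vsx (31) == 63);
  assert (!vsx_is_altivec (31));
}
```
+
+A single operand in VSX registers 0-31 is enough to rule out the
+Altivec pair, which is why half of the register file goes unused by
+the fused patterns.
+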
+Because the combiner patterns fuse_vandc_vxor and fuse_vxor_vxor
+perform this fusion, the register allocator sees more register pressure
+on the traditional Altivec registers than on the rest of the VSX
+registers.
+
+In addition, since there are vector rotates, these rotates only work on
+the traditional Altivec registers, which adds to the Altivec register
+pressure.
+
+Finally, in addition to doing the explicit xor, andc, and rotates using
+the Altivec registers, we also have to load vector constants for the
+rotate amounts, and these registers are also allocated as Altivec
+registers.
+
+Current trunk and GCC 12-14 have more vector spills than GCC 11, but
+GCC 11 has many more vector moves than the later compilers.  Thus even
+though it has far fewer spills, the vector moves are why GCC 11 has the
+slowest results.
+
+There is an instruction that was added in power10 (XXEVAL) that does
+provide fusion between VSX vectors that includes ANDC->XOR and XOR->XOR
+fusion.
+
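+The immediate operand of XXEVAL is the truth table of the three-input
+function.  A sketch of how the values for these two cases can be
+derived, assuming that index 0 of the truth table maps to the most
+significant bit of the immediate and that the outer XOR operand is the
+first input (names are invented for the example):
+
```c
#include <assert.h>

typedef unsigned (*logic3_fn) (unsigned a, unsigned b, unsigned c);

/* Build the XXEVAL immediate for FN by evaluating it on all eight
   single-bit input combinations.  Index (a, b, c) = 0 maps to the most
   significant bit of the immediate.  */
static unsigned
xxeval_imm_for (logic3_fn fn)
{
  unsigned imm = 0;

  for (unsigned idx = 0; idx < 8; idx++)
    {
      unsigned a = (idx >> 2) & 1;
      unsigned b = (idx >> 1) & 1;
      unsigned c = idx & 1;

      if (fn (a, b, c) & 1)
        imm |= 1u << (7 - idx);
    }
  return imm;
}

/* ANDC feeding XOR: a ^ (b & ~c).  */
static unsigned
andc_xor (unsigned a, unsigned b, unsigned c)
{
  return a ^ (b & ~c);
}

/* XOR feeding XOR: a ^ (b ^ c).  */
static unsigned
xor_xor (unsigned a, unsigned b, unsigned c)
{
  return a ^ b ^ c;
}

static void
check_immediates (void)
{
  assert (xxeval_imm_for (andc_xor) == 45);
  assert (xxeval_imm_for (xor_xor) == 105);
}
```
+
+Under those assumptions the two functions give 45 and 105, the numbers
+shown in parentheses in the tables below.
+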
+The latency of XXEVAL is slightly more than the fused VANDC/VXOR or
+VXOR/VXOR, so I have written the patch to prefer doing the Altivec
+instructions if they don't need a temporary register.
+
+Here are the results for adding support for XXEVAL for the multibuff.c
+benchmark attached to the PR.  Note that with this patch we essentially
+recover the speed that was lost with GCC 14 and the current trunk:
+
+                               XXEVAL   Trunk   GCC15   GCC14    GCC13
+                               ------   -----   -----   -----    -----
+Multibuf time in seconds        5.600   6.151   6.129   6.053    5.539
+XXEVAL improvement percentage     ---   +9.8%   +9.4%   +8.1%    -1.1%
+
+Fuse VANDC -> VXOR                209     600     600     600      600
+Fuse VXOR -> VXOR                   0     241     241     240      120
+XXEVAL to fuse ANDC -> XOR (#45)  391       0       0       0        0
+XXEVAL to fuse XOR -> XOR (#105)  240       0       0       0        0
+
+Spill vector to stack             140     417     417     403      226
+Load spilled vector from stack    490   1,012   1,012   1,000      766
+Vector moves                        8      93     100      70       72
+
+XXLANDC or VANDC                  209     600     600     600      600
+XXLXOR or VXOR                    953   1,824   1,824   1,824    1,824
+XXEVAL                            631       0       0       0        0
+
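+The percentage row appears to be the time of each compiler relative to
+the XXEVAL time.  The formula below is inferred from the numbers in
+the table, not stated in the source:
+
```c
#include <assert.h>

/* Percentage by which OTHER_SECONDS exceeds XXEVAL_SECONDS.  */
static double
improvement_pct (double other_seconds, double xxeval_seconds)
{
  return (other_seconds / xxeval_seconds - 1.0) * 100.0;
}

/* The Trunk column of the multibuf table rounds to +9.8%.  */
static void
check_improvement (void)
{
  double pct = improvement_pct (6.151, 5.600);

  assert (pct > 9.75 && pct < 9.85);
}
```
+
+For example, 6.151 / 5.600 - 1 is about 0.098, which gives the +9.8%
+in the Trunk column.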
+
+Here are the results for adding support for XXEVAL for the singlebuff.c
+benchmark attached to the PR.  Note that adding XXEVAL greatly speeds
+up this particular benchmark:
+
+                               XXEVAL   Trunk   GCC15   GCC14    GCC13
+                               ------   -----   -----   -----    -----
+Singlebuf time in seconds       4.429   5.330   5.333   5.315    5.270
+XXEVAL improvement percentage     ---  +20.3%  +20.4%  +20.0%   +19.0%
+
+Fuse VANDC -> VXOR                210     600     600     600      600
+Fuse VXOR -> VXOR                   0     240     240     240      120
+XXEVAL to fuse ANDC -> XOR (#45)  390       0       0       0        0
+XXEVAL to fuse XOR -> XOR (#105)  240       0       0       0        0
+
+Spill vector to stack             134     388     388     388      391
+Load spilled vector from stack    357     808     808     808      769
+Vector moves                       34      80      80      80      119
+
+XXLANDC or VANDC                  210     600     600     600      600
+XXLXOR or VXOR                    954   1,824   1,824   1,824    1,824
+XXEVAL                            630       0       0       0        0
+
+
+These patches add the following fusion patterns:
+
+       xxland  => xxland       xxlandc => xxland
+       xxlxor  => xxland       xxlor   => xxland
+       xxlnor  => xxland       xxleqv  => xxland
+       xxlorc  => xxland       xxlandc => xxlandc
+       xxlnand => xxland       xxlnand => xxlnor
+       xxland  => xxlxor       xxland  => xxlor
+       xxlandc => xxlxor       xxlandc => xxlor
+       xxlorc  => xxlnor       xxlorc  => xxleqv
+       xxlorc  => xxlorc       xxleqv  => xxlnor
+       xxlxor  => xxlxor       xxlxor  => xxlor
+       xxlnor  => xxlnor       xxlor   => xxlxor
+       xxlor   => xxlor        xxlor   => xxlnor
+       xxlnor  => xxlxor       xxlnor  => xxlor
+       xxlxor  => xxlnor       xxleqv  => xxlxor
+       xxleqv  => xxlor        xxlorc  => xxlxor
+       xxlorc  => xxlor        xxlandc => xxlnor
+       xxlandc => xxleqv       xxland  => xxlnor
+       xxlnand => xxlxor       xxlnand => xxlor
+       xxlnand => xxlnand      xxlorc  => xxlnand
+       xxleqv  => xxlnand      xxlnor  => xxlnand
+       xxlor   => xxlnand      xxlxor  => xxlnand
+       xxlandc => xxlnand      xxland  => xxlnand
+
 ==================== Branch work221-sha, baseline ====================
 
 2025-09-08   Michael Meissner  <meiss...@linux.ibm.com>
