[gcc(refs/users/meissner/heads/work164-vpair)] Power10: Add options to disable load and store vector pair.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a6a927f86a58862aa30bda8f609fb5641fc76810

commit a6a927f86a58862aa30bda8f609fb5641fc76810
Author: Michael Meissner 
Date:   Tue Apr 9 00:44:44 2024 -0400

Power10: Add options to disable load and store vector pair.

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options.  The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair.  The -mno-store-vector-pair option would tell GCC to
generate two store vector instructions instead of a single store vector pair.

If either -mno-load-vector-pair or -mno-store-vector-pair is used, GCC will
not generate the indexed lxvpx or stxvpx instructions.  The reason for this is
to enable splitting the {,p}lxvp or {,p}stxvp instructions after reload without
needing a scratch GPR register.
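
To illustrate why indexed addresses get in the way of the split, here is a
minimal sketch in C.  It is a toy model, not GCC's internal representation:
the struct, the enum, and the function name are all hypothetical.  It relies
only on the fact that the second 16-byte half of a pair lives 16 bytes past the
first, so an offset address can simply be bumped by 16, while a base + index
address would need the sum placed in an extra register first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memory operand (not GCC's RTL).  */
enum addr_kind { ADDR_OFFSET, ADDR_INDEXED };

struct mem_addr {
  enum addr_kind kind;
  int base_reg;     /* base GPR number */
  int index_reg;    /* index GPR number, used by ADDR_INDEXED only */
  int64_t offset;   /* displacement, used by ADDR_OFFSET only */
};

/* Split one 32-byte access into two 16-byte accesses.
   Returns true when the split needs no scratch register.  */
bool split_pair_addr(struct mem_addr a,
                     struct mem_addr *first, struct mem_addr *second)
{
  if (a.kind == ADDR_INDEXED)
    return false;   /* base + index + 16 would need a scratch GPR */

  *first = a;
  *second = a;
  second->offset = a.offset + 16;   /* second half is 16 bytes further on */
  return true;
}
```

In this model an offset address such as 32(r3) splits into 32(r3) and 48(r3),
while an indexed address is rejected, which matches the restriction described
above.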

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.

I added code so that user code can modify these settings using either a
'#pragma GCC target' directive or __attribute__((__target__(...))) in the
function declaration.
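
As a rough sketch of what such user code might look like (this is not taken
from the new tests): the target strings "no-load-vector-pair" and
"no-store-vector-pair" are assumed to mirror the option names, and
HAVE_VECTOR_PAIR_TUNING is a hypothetical macro that the user would define only
when building with a GCC that contains this patch.  Without it, the file falls
back to a plain 32-byte struct so it still compiles elsewhere.

```c
/* HAVE_VECTOR_PAIR_TUNING is hypothetical; define it only for a compiler
   that has these options and is targeting a CPU with MMA (__MMA__).  */
#if defined(HAVE_VECTOR_PAIR_TUNING) && defined(__MMA__)
typedef __vector_pair pair_t;
# define NO_LOAD_PAIR __attribute__((__target__("no-load-vector-pair")))
# define TUNE_REGION 1
#else
typedef struct { unsigned char bytes[32]; } pair_t;   /* portable stand-in */
# define NO_LOAD_PAIR
#endif

/* Per-function setting: only this function avoids load vector pair.  */
NO_LOAD_PAIR
void copy_pair_attr(pair_t *dst, const pair_t *src)
{
  *dst = *src;
}

/* Region setting: everything up to pop_options avoids store vector pair.  */
#ifdef TUNE_REGION
#pragma GCC push_options
#pragma GCC target ("no-store-vector-pair")
#endif
void copy_pair_pragma(pair_t *dst, const pair_t *src)
{
  *dst = *src;
}
#ifdef TUNE_REGION
#pragma GCC pop_options
#endif
```

Both functions copy 32 bytes either way; only the instructions selected for the
load or the store are expected to differ when the options are in effect.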

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian power9
systems doing the normal bootstrap and test.  There were no regressions in any
of the tests, and the new tests passed.  Can I check this patch into the master
branch?

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and
-mno-store-vector-pair.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
-mload-vector-pair and -mstore-vector-pair.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
indexed mode for OOmode if we are generating both load vector pair and
store vector pair instructions.
(rs6000_option_override_internal): Add support for -mno-load-vector-pair
and -mno-store-vector-pair.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
attributes.
(enabled attribute): Likewise.
* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
(-mstore-vector-pair): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-attribute.c: New test.
* gcc.target/powerpc/vector-pair-pragma.c: New test.
* gcc.target/powerpc/vector-pair-switch1.c: New test.
* gcc.target/powerpc/vector-pair-switch2.c: New test.
* gcc.target/powerpc/vector-pair-switch3.c: New test.
* gcc.target/powerpc/vector-pair-switch4.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md                       | 19 +---
 gcc/config/rs6000/rs6000-cpus.def              |  8 +++-
 gcc/config/rs6000/rs6000.cc                    | 30 +++-
 gcc/config/rs6000/rs6000.md                    | 10 +++-
 gcc/config/rs6000/rs6000.opt                   |  8
 .../gcc.target/powerpc/vector-pair-attribute.c | 39 +++
 .../gcc.target/powerpc/vector-pair-pragma.c    | 55 ++
 .../gcc.target/powerpc/vector-pair-switch1.c   | 16 +++
 .../gcc.target/powerpc/vector-pair-switch2.c   | 17 +++
 .../gcc.target/powerpc/vector-pair-switch3.c   | 17 +++
 .../gcc.target/powerpc/vector-pair-switch4.c   | 17 +++
 11 files changed, 225 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..6a7d8a836db 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@
 gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+   (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
   "@
lxvp%X1 %x0,%1
+   #
stxvp%X0 %x1,%0
+   #
#"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (oper

[gcc(refs/users/meissner/heads/work164-vpair)] Power10: Add options to disable load and store vector pair.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:76066d95bf3f71b749084b11fc6dab1ee7cbd813

commit 76066d95bf3f71b749084b11fc6dab1ee7cbd813
Author: Michael Meissner 
Date:   Tue Apr 9 00:42:14 2024 -0400

Power10: Add options to disable load and store vector pair.

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options.  The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair.  The -mno-store-vector-pair option would tell GCC to
generate two store vector instructions instead of a single store vector pair.

If either -mno-load-vector-pair or -mno-store-vector-pair is used, GCC will
not generate the indexed lxvpx or stxvpx instructions.  The reason for this is
to enable splitting the {,p}lxvp or {,p}stxvp instructions after reload without
needing a scratch GPR register.

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.

I added code so that user code can modify these settings using either a
'#pragma GCC target' directive or __attribute__((__target__(...))) in the
function declaration.

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian power9
systems doing the normal bootstrap and test.  There were no regressions in any
of the tests, and the new tests passed.  Can I check this patch into the master
branch?

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and
-mno-store-vector-pair.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
-mload-vector-pair and -mstore-vector-pair.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
indexed mode for OOmode if we are generating both load vector pair and
store vector pair instructions.
(rs6000_option_override_internal): Add support for -mno-load-vector-pair
and -mno-store-vector-pair.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
attributes.
(enabled attribute): Likewise.
* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
(-mstore-vector-pair): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-attribute.c: New test.
* gcc.target/powerpc/vector-pair-pragma.c: New test.
* gcc.target/powerpc/vector-pair-switch1.c: New test.
* gcc.target/powerpc/vector-pair-switch2.c: New test.
* gcc.target/powerpc/vector-pair-switch3.c: New test.
* gcc.target/powerpc/vector-pair-switch4.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md          | 19 +--
 gcc/config/rs6000/rs6000-cpus.def |  8 ++--
 gcc/config/rs6000/rs6000.cc       | 30 --
 gcc/config/rs6000/rs6000.md       | 10 +-
 gcc/config/rs6000/rs6000.opt      |  8
 5 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..6a7d8a836db 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@
 gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+   (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
   "@
lxvp%X1 %x0,%1
+   #
stxvp%X0 %x1,%0
+   #
#"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
+   && ((MEM_P (operands[0]) && !TARGET_STORE_VECTOR_PAIR)
+   || (MEM_P (operands[1]) && !TARGET_LOAD_VECTOR_PAIR)
+   || (!MEM_P (operands[0]) && !MEM_P (operands[1])))"
   [(const_int 0)]
 {
   rs6000_split_multireg_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "type" "vecload,vecstore,veclogical")
+  [(set_attr "type" "vecload,vecload,vecstore,vecstore,veclogical")
(set_attr "size" "256")
-   (set_attr "length" "*