Hi,
I have tried to add support for movoo and movxo on 64-bit PowerPC
architectures. Since these are opaque modes, my assumption is that we can
use any means to achieve reg-reg, reg-mem and mem-reg transfers. For
power9 and later we can define OOmode to be 2 contiguous, aligned VSX
registers and XOmode to be 4 contiguous, aligned altivec registers. From
power10 onwards we can also define XOmode to be a proxy for accumulators.
And since these modes are linked to __vector_pair and __vector_quad, we
can define the semantics of those types in the same way.
I hope this approach is fine. If you still feel that
__vector_quad/XOmode should be restricted to MMA, I can limit XOmode
moves to just MMA as before, and only make this change for
__vector_pair/OOmode (which can be supported even on 32-bit targets).
The patch has been bootstrapped and regtested on powerpc64le and
powerpc32, with no regressions. Kindly review the patch.
Changes from v1:
1. Support movxo/movoo for POWERPC64 as well.
2. Update test cases to check errors with -m32 option.
Thanks and regards,
Avinash Jayakar
The OO and XO modes were originally added to support the MMA builtins.
But since these are opaque modes, we need to support reg-memory,
memory-reg and reg-reg transfers for the reload pass to work correctly.
This patch adds support for movxo and movoo on 64-bit PowerPC
architectures. By using 4/8 contiguous 64-bit GPRs for the OO/XO modes
respectively, we can implement the move operations on these architectures.
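To illustrate the GPR fallback described above, here is a minimal standalone sketch in C. It is not GCC code: split_move and the *_BYTES constants are hypothetical names, and each 64-bit temporary merely stands in for one GPR. It only shows how a 32-byte (OO) or 64-byte (XO) move breaks down into doubleword-sized load/store pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not GCC code: sizes of the opaque modes and of
   one 64-bit GPR, in bytes.  */
enum { OO_BYTES = 32, XO_BYTES = 64, GPR_BYTES = 8 };

/* Copy NBYTES from SRC to DST one 64-bit piece at a time, the way a
   multi-register move could be split into doubleword ld/std pairs.
   Returns the number of pieces moved.  */
static int
split_move (void *dst, const void *src, size_t nbytes)
{
  int pieces = 0;

  /* The model only handles sizes that are a whole number of GPRs.  */
  assert (nbytes % GPR_BYTES == 0);

  for (size_t off = 0; off < nbytes; off += GPR_BYTES)
    {
      uint64_t tmp;	/* Stands in for one GPR.  */
      memcpy (&tmp, (const char *) src + off, GPR_BYTES);	/* "ld"  */
      memcpy ((char *) dst + off, &tmp, GPR_BYTES);		/* "std" */
      pieces++;
    }
  return pieces;
}
```

In this model, split_move over OO_BYTES moves 4 pieces and over XO_BYTES moves 8, which matches the 4/8 GPR counts given for the two modes.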
With this change, the semantics of OOmode and XOmode are as follows:
1. 64-bit PowerPC: 4 (OO) / 8 (XO) contiguous, aligned GPRs. Memory
   loads/stores are done with 4 (OO) / 8 (XO) x ld/std insns.
2. POWER9 vector:
   OO: 2 contiguous, aligned VSX registers. Loads/stores -
       2 x lxv/stxv insns.
   XO: 4 contiguous altivec/upper 32 VSX registers. Loads/stores -
       4 x lxv/stxv insns.
3. MMA:
   OO: 2 contiguous, aligned VSX registers. Loads/stores -
       1 x lxvp/stxvp insn.
   XO: 4 contiguous altivec/upper 32 VSX registers. Loads/stores -
       2 x lxvp/stxvp insns, with prime/deprime via xxmtacc and xxmfacc.
On 32-bit architectures these modes are not supported by this patch, since
XO would need 16 contiguous 32-bit registers and register allocation would
fail. OO mode alone could still be supported there.
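The register counts and alignment rules above can be checked with a small standalone model in C. This is a sketch only, not the rs6000 implementation: regs_needed and gpr_group_ok are hypothetical helpers, and the model assumes 32 GPRs numbered 0..31 and mode sizes of 256 bits (OO) and 512 bits (XO).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code: mode sizes in bits.  */
enum { OO_BITS = 256, XO_BITS = 512 };

/* Number of consecutive registers of REG_BITS width needed to hold a
   value of MODE_BITS.  */
static int
regs_needed (int mode_bits, int reg_bits)
{
  assert (reg_bits > 0 && mode_bits % reg_bits == 0);
  return mode_bits / reg_bits;
}

/* True if a group of NREGS registers starting at GPR number REGNO is
   aligned to its own size and fits within r0..r31.  NREGS must be a
   power of two.  */
static bool
gpr_group_ok (int regno, int nregs)
{
  assert (nregs > 0 && (nregs & (nregs - 1)) == 0);
  return (regno >= 0
	  && regno + nregs - 1 <= 31
	  && (regno & (nregs - 1)) == 0);
}
```

With 64-bit registers, regs_needed gives 4 for OO and 8 for XO, so valid starting GPRs in this model are multiples of 4 and 8 respectively. With 32-bit registers, XO comes to 512 / 32 = 16 registers, half of the whole GPR file.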
2025-11-26 Avinash Jayakar <[email protected]>
PR 106736
PR 108272
gcc/ChangeLog:
* config/rs6000/mma.md (*movoo): Support on powerpc64.
(*movoo_p10): Use lxvp/stxvp with mma.
* config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached):
Use 4/8 contig-aligned GPRs for powerpc64.
(rs6000_split_multireg_move): Use DI mode for powerpc64,
V1TImode for power9 vector.
(rs6000_opaque_type_invalid_use_p): Update error messages.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106736-1.c: Fail with -m32.
* gcc.target/powerpc/pr106736-2.c: Fail with -m32.
* gcc.target/powerpc/pr106736-3.c: Fail with -m32.
* gcc.target/powerpc/pr106736-4.c: Fail with -m32.
* gcc.target/powerpc/pr106736-5.c: Fail with -m32.
* gcc.target/powerpc/pr108272-1.c: Fail with -m32.
* gcc.target/powerpc/pr108272-2.c: Fail with -m32.
* gcc.target/powerpc/pr108272-3.c: Fail with -m32.
* gcc.target/powerpc/pr108272-4.c: Fail with -m32.
---
gcc/config/rs6000/mma.md | 48 ++++++++++++-------
gcc/config/rs6000/rs6000.cc | 33 +++++++++----
gcc/testsuite/gcc.target/powerpc/pr106736-1.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr106736-2.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr106736-3.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr106736-4.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr106736-5.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr108272-1.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr108272-2.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr108272-3.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr108272-4.c | 2 +-
11 files changed, 64 insertions(+), 35 deletions(-)
diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 85f3a926682..90c369fa8b4 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -269,7 +269,7 @@
(match_operand:OO 1 "input_operand"))]
""
{
- if (TARGET_MMA)
+ if (TARGET_POWERPC64)
{
rs6000_emit_move (operands[0], operands[1], OOmode);
DONE;
@@ -291,26 +291,38 @@
gcc_assert (false);
})
-(define_insn_and_split "*movoo"
- [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
- (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+(define_insn "*movoo_p10"
+ [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO")
+ (match_operand:OO 1 "input_operand" "ZwO,wa"))]
"TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
"@
lxvp%X1 %x0,%1
- stxvp%X0 %x1,%0
+ stxvp%X0 %x1,%0"
+ [(set_attr "type" "vecload,vecstore")])
+
+(define_insn_and_split "*movoo"
+ [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa,r,Z,r")
+ (match_operand:OO 1 "input_operand" "ZwO,wa,wa,Z,r,r"))]
+ "(gpc_reg_operand (operands[0], OOmode)
+ || gpc_reg_operand (operands[1], OOmode))"
+ "@
+ #
+ #
+ #
+ #
+ #
#"
- "&& reload_completed
- && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
+ "&& reload_completed"
[(const_int 0)]
{
rs6000_split_multireg_move (operands[0], operands[1]);
DONE;
}
- [(set_attr "type" "vecload,vecstore,veclogical")
+ [(set_attr "type" "vecload,vecstore,veclogical,load,store,logical")
(set_attr "size" "256")
- (set_attr "length" "*,*,8")])
+ (set_attr "length" "*,*,8,*,*,*")])
;; Vector quad support. XOmode can only live in FPRs.
@@ -319,7 +331,7 @@
(match_operand:XO 1 "input_operand"))]
""
{
- if (TARGET_MMA)
+ if (TARGET_POWERPC64)
{
rs6000_emit_move (operands[0], operands[1], XOmode);
DONE;
@@ -339,12 +351,14 @@
})
(define_insn_and_split "*movxo"
- [(set (match_operand:XO 0 "nonimmediate_operand" "=d,ZwO,d")
- (match_operand:XO 1 "input_operand" "ZwO,d,d"))]
- "TARGET_MMA
- && (gpc_reg_operand (operands[0], XOmode)
+ [(set (match_operand:XO 0 "nonimmediate_operand" "=d,ZwO,d,r,Z,r")
+ (match_operand:XO 1 "input_operand" "ZwO,d,d,Z,r,r"))]
+ "(gpc_reg_operand (operands[0], XOmode)
|| gpc_reg_operand (operands[1], XOmode))"
"@
+ #
+ #
+ #
#
#
#"
@@ -354,9 +368,9 @@
rs6000_split_multireg_move (operands[0], operands[1]);
DONE;
}
- [(set_attr "type" "vecload,vecstore,veclogical")
- (set_attr "length" "*,*,16")
- (set_attr "max_prefixed_insns" "2,2,*")])
+ [(set_attr "type" "vecload,vecstore,veclogical,load,store,logical")
+ (set_attr "length" "*,*,16,*,*,*")
+ (set_attr "max_prefixed_insns" "2,2,*,*,*,*")])
(define_expand "vsx_assemble_pair"
[(match_operand:OO 0 "vsx_register_operand")
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 1d5cd25c0f0..b68f7e9ab82 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1861,12 +1861,27 @@ rs6000_hard_regno_mode_ok_uncached (int regno, machine_mode mode)
/* Vector pair modes need even/odd VSX register pairs. Only allow vector
registers. */
if (mode == OOmode)
- return (TARGET_MMA && VSX_REGNO_P (regno) && (regno & 1) == 0);
+ {
+ // Allow VSX registers if MMA/P9_VECTOR else fall back to GPR
+ if (TARGET_P9_VECTOR || TARGET_MMA)
+ return VSX_REGNO_P (regno) && (regno & 1) == 0;
+ else
+ return (IN_RANGE (regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+ && IN_RANGE (last_regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+ && ((regno & 3) == 0));
+ }
/* MMA accumulator modes need FPR registers divisible by 4. */
if (mode == XOmode)
- return (TARGET_MMA && FP_REGNO_P (regno) && (regno & 3) == 0);
-
+ {
+ // Allow altivec reg if MMA/P9_VECTOR else fall back to GPR
+ if (TARGET_P9_VECTOR || TARGET_MMA)
+ return FP_REGNO_P (regno) && (regno & 3) == 0;
+ else
+ return (IN_RANGE (regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+ && IN_RANGE (last_regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+ && ((regno & 7) == 0));
+ }
/* PTImode can only go in GPRs. Quad word memory operations require even/odd
register combinations, and use PTImode where we need to deal with quad
word memory operations. Don't allow quad words in the argument or frame
@@ -27403,7 +27418,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
/* If we have a vector quad register for MMA, and this is a load or store,
see if we can use vector paired load/stores. */
- if (mode == XOmode && TARGET_MMA
+ if ((mode == XOmode || mode == OOmode) && TARGET_MMA
&& (MEM_P (dst) || MEM_P (src)))
{
reg_mode = OOmode;
@@ -27412,7 +27427,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
/* If we have a vector pair/quad mode, split it into two/four separate
vectors. */
else if (mode == OOmode || mode == XOmode)
- reg_mode = V1TImode;
+ reg_mode = TARGET_P9_VECTOR ? V1TImode : DImode;
else if (FP_REGNO_P (reg))
reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode :
(TARGET_HARD_FLOAT ? DFmode : SFmode);
@@ -29292,7 +29307,7 @@ constant_generates_xxspltidp (vec_const_128bit_type *vsx_const)
/* Now we have only two opaque types, they are __vector_quad and
__vector_pair built-in types. They are target specific and
- only available when MMA is supported. With MMA supported, it
+ only available when PPC64 is supported. With PPC64 supported, it
simply returns true, otherwise it checks if the given gimple
STMT is an assignment, asm or call stmt and uses either of
these two opaque types unexpectedly, if yes, it would raise
@@ -29301,7 +29316,7 @@ constant_generates_xxspltidp (vec_const_128bit_type *vsx_const)
bool
rs6000_opaque_type_invalid_use_p (gimple *stmt)
{
- if (TARGET_MMA)
+ if (TARGET_POWERPC64)
return false;
/* If the given TYPE is one MMA opaque type, emit the corresponding
@@ -29311,12 +29326,12 @@ rs6000_opaque_type_invalid_use_p (gimple *stmt)
tree mv = TYPE_MAIN_VARIANT (type);
if (mv == vector_quad_type_node)
{
- error ("type %<__vector_quad%> requires the %qs option", "-mmma");
+ error ("type %<__vector_quad%> requires the %qs option", "-m64");
return true;
}
else if (mv == vector_pair_type_node)
{
- error ("type %<__vector_pair%> requires the %qs option", "-mmma");
+ error ("type %<__vector_pair%> requires the %qs option", "-m64");
return true;
}
return false;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106736-1.c b/gcc/testsuite/gcc.target/powerpc/pr106736-1.c
index 65bd79d3dce..cac95c03c64 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106736-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106736-1.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_quad is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106736-2.c b/gcc/testsuite/gcc.target/powerpc/pr106736-2.c
index 12ad936fccc..b646c6d3e35 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106736-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106736-2.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_pair is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106736-3.c b/gcc/testsuite/gcc.target/powerpc/pr106736-3.c
index 4fb368b8fb5..0a7f0494b04 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106736-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106736-3.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_quad is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106736-4.c b/gcc/testsuite/gcc.target/powerpc/pr106736-4.c
index 4b366416b0a..36afbc03e00 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106736-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106736-4.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_quad is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106736-5.c b/gcc/testsuite/gcc.target/powerpc/pr106736-5.c
index d7370b81e81..92896722b40 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106736-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106736-5.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_pair is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108272-1.c b/gcc/testsuite/gcc.target/powerpc/pr108272-1.c
index b99e6a4d86d..ef6ccf182b7 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108272-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108272-1.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_quad is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108272-2.c b/gcc/testsuite/gcc.target/powerpc/pr108272-2.c
index 51b2100d0f1..beb20470ce5 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108272-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108272-2.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_pair is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108272-3.c b/gcc/testsuite/gcc.target/powerpc/pr108272-3.c
index 634a529b5c8..b10972af39b 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108272-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108272-3.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_quad is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108272-4.c b/gcc/testsuite/gcc.target/powerpc/pr108272-4.c
index 7eecd6c5a0d..f7148ca6f2e 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108272-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108272-4.c
@@ -2,7 +2,7 @@
/* If the default cpu type is power10 or later, type __vector_pair is
supported. To keep the test point available all the time, this case
specifies -mdejagnu-cpu=power9 here. */
-/* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-options "-m32" } */
/* Verify there is no ICE and don't check the error messages on unsupported
type since they could be fragile and are not test points of this case. */
--
2.51.0