Some AMD GCN devices support an "XNACK" mode in which the device can
handle page-misses (and maybe other traps in memory instructions), but
it's not completely invisible to software.
We need this now to support OpenMP Unified Shared Memory (I plan to post
updated patches for that in January), and in future it may enable
support for APU devices (such as MI300).
The first patch ensures that load instructions are "restartable",
meaning that the outputs do not overwrite the input registers (address
and offsets). This maps pretty much exactly onto the GCC "early-clobber"
concept, so we just need to add extra early-clobber alternatives and then
avoid generating the problematic instructions explicitly.
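To illustrate why overlapping input and output registers break a restart, here is a toy C model. It is only a sketch and does not describe real GCN semantics: `toy_load2`, `regs`, and `mem` are made-up names, and the assumption is simply that a faulting load is replayed from the start and re-reads its address register.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical register file and memory, not real GCN.  */
enum { NREGS = 4, MEMSIZE = 8 };

static const int mem[MEMSIZE] = { 4, 2, 5, 0, 1, 6, 3, 7 };

/* Load two consecutive words from mem[regs[addr]] into regs[dst] and
   regs[dst + 1].  If fault_once is set, the load "page-misses" after
   writing the first word and is replayed from the beginning.  */
void
toy_load2 (int regs[NREGS], int dst, int addr, bool fault_once)
{
  assert (dst >= 0 && dst + 1 < NREGS && addr >= 0 && addr < NREGS);
  for (int attempt = 0; attempt < 2; attempt++)
    {
      int base = regs[addr];    /* Address is re-read on every attempt.  */
      assert (base >= 0 && base + 1 < MEMSIZE);
      regs[dst] = mem[base];
      if (fault_once && attempt == 0)
        continue;               /* Simulated page miss: replay the load.  */
      regs[dst + 1] = mem[base + 1];
      return;
    }
}
```

With regs[0] = 2 as the address, loading into regs[1] and regs[2] gives 5 and 0 whether or not the replay happens. Loading into regs[0] and regs[1] with a replay gives 6 and 3 instead, because the first attempt already overwrote the address. An early-clobber alternative is what keeps the register allocator away from the second case.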
The second patch is a workaround for the register allocation problem I
asked about on gcc@ yesterday. The early-clobber increases register
pressure, which causes compile failures when LRA is unable to spill
additional registers without needing yet more registers. The problem does
not show up so readily on gfx90a (MI200), thanks to the additional AVGPR
spill registers, and that is the only device that really supports USM so
far, so limiting XNACK to that device will work for now.
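The resulting default selection can be sketched in standalone C as follows. This is a simplified illustration, not the code in gcn.cc: the `toy_` enums and `toy_resolve_xnack` are hypothetical stand-ins, and the sketch assumes an explicit -mxnack= setting is passed through unchanged, whereas the real option-override code performs other checks first.

```c
#include <assert.h>

/* Hypothetical stand-ins for the option and architecture enums.  */
enum toy_attr { TOY_ATTR_OFF, TOY_ATTR_ON, TOY_ATTR_ANY, TOY_ATTR_DEFAULT };
enum toy_arch { TOY_FIJI, TOY_VEGA10, TOY_VEGA20, TOY_GFX908, TOY_GFX90A };

/* Resolve a "default" XNACK request to a concrete setting per device.  */
enum toy_attr
toy_resolve_xnack (enum toy_attr requested, enum toy_arch arch)
{
  if (requested != TOY_ATTR_DEFAULT)
    return requested;           /* Assumption: explicit choice is kept.  */

  switch (arch)
    {
    case TOY_GFX90A:
      return TOY_ATTR_ANY;      /* AVGPR spills available; USM supported.  */
    case TOY_FIJI:
    case TOY_VEGA10:
    case TOY_VEGA20:
    case TOY_GFX908:
      return TOY_ATTR_OFF;      /* No USM; avoid the extra pressure.  */
    }
  assert (!"unknown architecture");
  return TOY_ATTR_OFF;
}
```

So a default request resolves to "any" only on gfx90a and to "off" everywhere else, while an explicit "on" or "off" is left alone in this sketch.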
The -mxnack option was already added as a placeholder, so not much is
needed there.
Committed to master. An older version of these patches is already
committed to devel/omp/gcc-13 (OG13).
Andrew

amdgcn: Work around XNACK register allocation problem

The extra register pressure is causing infinite loops in some cases, especially
at -O0. I have not yet observed any issue on devices that have AVGPRs for
spilling, and XNACK is only really useful on those devices anyway, so change
the defaults.
gcc/ChangeLog:
* config/gcn/gcn-hsa.h (NO_XNACK): Change the defaults.
* config/gcn/gcn-opts.h (enum hsaco_attr_type): Add HSACO_ATTR_DEFAULT.
* config/gcn/gcn.cc (gcn_option_override): Set the default flag_xnack.
* config/gcn/gcn.opt: Add -mxnack=default.
* doc/invoke.texi: Document the -mxnack default.
diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gcn/gcn-hsa.h
index bfb104526c5..b44d42b02d6 100644
--- a/gcc/config/gcn/gcn-hsa.h
+++ b/gcc/config/gcn/gcn-hsa.h
@@ -75,7 +75,9 @@ extern unsigned int gcn_local_sym_hash (const char *name);
supported for gcn. */
#define GOMP_SELF_SPECS ""
-#define NO_XNACK "march=fiji:;march=gfx1030:;"
+#define NO_XNACK "march=fiji:;march=gfx1030:;" \
+/* These match the defaults set in gcn.cc. */ \
+"!mxnack*|mxnack=default:%{march=gfx900|march=gfx906|march=gfx908:-mattr=-xnack};"
#define NO_SRAM_ECC "!march=*:;march=fiji:;march=gfx900:;march=gfx906:;"
/* In HSACOv4 no attribute setting means the binary supports "any" hardware
diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
index b4f494d868c..634cec6d832 100644
--- a/gcc/config/gcn/gcn-opts.h
+++ b/gcc/config/gcn/gcn-opts.h
@@ -65,7 +65,8 @@ enum hsaco_attr_type
{
HSACO_ATTR_OFF,
HSACO_ATTR_ON,
- HSACO_ATTR_ANY
+ HSACO_ATTR_ANY,
+ HSACO_ATTR_DEFAULT
};
#endif
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index d92cd01d03f..b67551a2e8e 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -172,6 +172,29 @@ gcn_option_override (void)
/* Allow HSACO_ATTR_ANY silently because that's the default. */
flag_xnack = HSACO_ATTR_OFF;
}
+
+  /* There's no need for XNACK on devices without USM, and there are register
+     allocation problems caused by the early-clobber when AVGPR spills are not
+     available.
+     FIXME: can the regalloc mean the default can be really "any"? */
+  if (flag_xnack == HSACO_ATTR_DEFAULT)
+    switch (gcn_arch)
+      {
+      case PROCESSOR_FIJI:
+      case PROCESSOR_VEGA10:
+      case PROCESSOR_VEGA20:
+      case PROCESSOR_GFX908:
+        flag_xnack = HSACO_ATTR_OFF;
+        break;
+      case PROCESSOR_GFX90a:
+        flag_xnack = HSACO_ATTR_ANY;
+        break;
+      default:
+        gcc_unreachable ();
+      }
+
+  if (flag_sram_ecc == HSACO_ATTR_DEFAULT)
+    flag_sram_ecc = HSACO_ATTR_ANY;
}
/* }}} */
diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt
index c356a0cbb08..32486d9615f 100644
--- a/gcc/config/gcn/gcn.opt
+++ b/gcc/config/gcn/gcn.opt
@@ -97,9 +97,12 @@ Enum(hsaco_attr_type) String(on) Value(HSACO_ATTR_ON)
EnumValue
Enum(hsaco_attr_type) String(any) Value(HSACO_ATTR_ANY)
+EnumValue
+Enum(hsaco_attr_type) String(default) Value(HSACO_ATTR_DEFAULT)
+
mxnack=
-Target RejectNegative Joined ToLower Enum(hsaco_attr_type) Var(flag_xnack) Init(HSACO_ATTR_ANY)
-Compile for devices requiring XNACK enabled. Default \"any\".
+Target RejectNegative Joined ToLower Enum(hsaco_attr_type) Var(flag_xnack) Init(HSACO_ATTR_DEFAULT)
+Compile for devices requiring XNACK enabled. Default \"any\" if USM is supported.
msram-ecc=
Target RejectNegative Joined ToLower Enum(hsaco_attr_type) Var(flag_sram_ecc) Init(HSACO_ATTR_ANY)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index