On 26/01/2024 10:40, Richard Biener wrote:
The following avoids selecting an unsupported agent early, avoiding
later asserts when we rely on it being supported.
tested on x86_64-unknown-linux-gnu -> amdhsa-gcn on gfx1060
that's the alternative to the other patch. I do indeed seem to get
the other (unsupported) agent selected somehow after the other supported
agent finished a kernel run. Not sure if it's the CPU or the IGPU though.
OK? Which variant?
So, looking at it again, the original intent of the assert was to alert
toolchain developers that they missed adding a new name when porting to
a new device, but I concur that it's not ideal when the assert
encounters an unknown device in the wild.
However, if we're trying to do something more useful than merely fixing
an ugly error message, maybe we should look at removing unsupported
devices in "suitable_hsa_agent_p" instead? Unsupported GPUs wouldn't be
assigned a device number at all.
Probably devices that are GPUs but skipped because they are unsupported
should be mentioned on GOMP_DEBUG (as well as GCN_DEBUG)?
The goal should be that folks with your twin-GPU setup shouldn't have to
work around it, but I don't really want to remove the message for people
who only have one device but don't realize it is unsupported.
On the other hand, if a user has two devices that *are* supported, but
the second one is preferred, they'll have to set OMP_DEFAULT_DEVICE
explicitly, and is this so different?
As a user, WDYT?
Andrew
libgomp/
* plugin/plugin-gcn.c (get_agent_info): When the agent isn't supported
return NULL.
---
libgomp/plugin/plugin-gcn.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index d8c3907c108..f453f630e06 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1036,6 +1036,8 @@ print_kernel_dispatch (struct kernel_dispatch *dispatch,
unsigned indent)
/* }}} */
/* {{{ Utility functions */
+static const char* isa_hsa_name (int isa);
+
/* Cast the thread local storage to gcn_thread. */
static inline struct gcn_thread *
@@ -1589,6 +1591,11 @@ get_agent_info (int n)
GOMP_PLUGIN_error ("Attempt to use an uninitialized GCN agent.");
return NULL;
}
+ if (!isa_hsa_name (hsa_context.agents[n].device_isa))
+{
+ GOMP_PLUGIN_error ("Attempt to use an unsupported GCN agent.");
+ return NULL;
+}
return &hsa_context.agents[n];
}