Hi!

On 2019-11-12T13:29:16+0000, Andrew Stubbs <a...@codesourcery.com> wrote:
> This patch contributes the GCN libgomp plugin, with the various
> configure and make bits to go with it.

An issue with libgomp GCN plugin 'GCN_SUPPRESS_HOST_FALLBACK' (which is
different from the libgomp-level host-fallback execution):

> --- /dev/null
> +++ b/libgomp/plugin/plugin-gcn.c

> +/* Flag to decide if the runtime should suppress a possible fallback to host
> +   execution.  */
> +
> +static bool suppress_host_fallback;

> +static void
> +init_environment_variables (void)
> +{
> +  [...]
> +  if (secure_getenv ("GCN_SUPPRESS_HOST_FALLBACK"))
> +    suppress_host_fallback = true;
> +  else
> +    suppress_host_fallback = false;

> +/* Return true if the HSA runtime can run function FN_PTR.  */
> +
> +bool
> +GOMP_OFFLOAD_can_run (void *fn_ptr)
> +{
> +  struct kernel_info *kernel = (struct kernel_info *) fn_ptr;
> +
> +  init_kernel (kernel);
> +  if (kernel->initialization_failed)
> +    goto failure;
> +
> +  return true;
> +
> +failure:
> +  if (suppress_host_fallback)
> +    GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
> +  GCN_WARNING ("GCN target cannot be launched, doing a host fallback\n");
> +  return false;
> +}

This originates in the libgomp HSA plugin, where the idea was -- in my
understanding -- that you wouldn't have device code available for all
'fn_ptr's, and in that case transparently (shared-memory system!) do
host-fallback execution.  Or, with 'GCN_SUPPRESS_HOST_FALLBACK' set,
you'd get those diagnosed.

This has then been copied into the libgomp GCN plugin (see above).
However, is it really still applicable there; don't we assume that we're
generating device code for all relevant functions?  (I suppose everyone
really is testing with 'GCN_SUPPRESS_HOST_FALLBACK' set?)  Should we thus
actually remove 'suppress_host_fallback' (that is, make it
always-'true'), including removal of the 'can_run' hook?  (I suppose that
even in a future shared-memory "GCN" configuration, we're not expecting
to use this again; expecting always-'true' for 'can_run'?)


Now my actual issue: the libgomp GCN plugin then invented an additional
use of 'GCN_SUPPRESS_HOST_FALLBACK':

> +/* Initialize hsa_context if it has not already been done.
> +   Return TRUE on success.  */
> +
> +static bool
> +init_hsa_context (void)
> +{
> +  hsa_status_t status;
> +  int agent_index = 0;
> +
> +  if (hsa_context.initialized)
> +    return true;
> +  init_environment_variables ();
> +  if (!init_hsa_runtime_functions ())
> +    {
> +      GCN_WARNING ("Run-time could not be dynamically opened\n");
> +      if (suppress_host_fallback)
> +     GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
> +      return false;
> +    }

That is, if 'GCN_SUPPRESS_HOST_FALLBACK' is (routinely) set (for its
original purpose), and you have the libgomp GCN plugin configured, but
don't have 'libhsa-runtime64.so.1' available, you run into a fatal error.

The libgomp nvptx plugin in such cases silently disables the
plugin/device (and thus lets libgomp proper do its thing), and I propose
we do the same here.  OK to push the attached
"GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 
'init_hsa_runtime_functions' is not fatal"?


Grüße
 Thomas


>From f037d2d8274940f042633a0ecb18a53942c075f5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwi...@baylibre.com>
Date: Thu, 7 Mar 2024 10:43:15 +0100
Subject: [PATCH] GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to
 'init_hsa_runtime_functions' is not fatal

'GCN_SUPPRESS_HOST_FALLBACK' controls the libgomp GCN plugin's capability to
transparently use host-fallback execution for certain device functions; it
shouldn't control failure of libgomp GCN plugin initialization (which libgomp
handles fine: triggering use of a different plugin/device, or general
host-fallback execution, or fatal error, as applicable).

	libgomp/
	* plugin/plugin-gcn.c (init_hsa_context): Even with
	'GCN_SUPPRESS_HOST_FALLBACK' set, failure to
	'init_hsa_runtime_functions' is not fatal.
---
 libgomp/plugin/plugin-gcn.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 2771123252a..464164afb03 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1524,8 +1524,6 @@ init_hsa_context (void)
   if (!init_hsa_runtime_functions ())
     {
       GCN_WARNING ("Run-time could not be dynamically opened\n");
-      if (suppress_host_fallback)
-	GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
       return false;
     }
   status = hsa_fns.hsa_init_fn ();
-- 
2.34.1

Reply via email to