Hi Tobias!

On 2024-03-07T12:43:07+0100, Tobias Burnus <tbur...@baylibre.com> wrote:
> Thomas Schwinge wrote:
>> An issue with libgomp GCN plugin 'GCN_SUPPRESS_HOST_FALLBACK' (which is
>> different from the libgomp-level host-fallback execution):
>>> +failure:
>>> +  if (suppress_host_fallback)
>>> +    GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
>>> +  GCN_WARNING ("GCN target cannot be launched, doing a host fallback\n");
>>> +  return false;
>>> +}
>>
>> This originates in the libgomp HSA plugin, where the idea was -- in my
>> understanding -- that you wouldn't have device code available for all
>> 'fn_ptr's, and in that case transparently (shared-memory system!) do
>> host-fallback execution.  Or, with 'GCN_SUPPRESS_HOST_FALLBACK' set,
>> you'd get those diagnosed.
>>
>> This has then been copied into the libgomp GCN plugin (see above).
>> However, is it really still applicable there; don't we assume that we're
>> generating device code for all relevant functions?  (I suppose everyone
>> really is testing with 'GCN_SUPPRESS_HOST_FALLBACK' set?)
>
> First, I think most users do not set GCN_SUPPRESS_HOST_FALLBACK – and it 
> is also not really desirable.

External users probably don't, but certainly all our internal testing is
setting it, and implicitly so is all nvptx offloading testing: simply by
means of the libgomp nvptx plugin not having such a knob.  That is, the
libgomp nvptx plugin has an implicit 'suppress_host_fallback = true' for
(the original meaning of) that flag (and does not have the "init"-error
behavior that I consider bogus, and am trying to remove from the libgomp
GCN plugin).

And, one step back: how is (the original meaning of)
'suppress_host_fallback = false' even supposed to work on non-shared
memory systems as currently implemented by the libgomp GCN plugin?

> If I run on my Linux system the system compiler with nvptx + gcn support 
> installed, I get (with a nvptx permission problem):
>
> $ GCN_SUPPRESS_HOST_FALLBACK=1 ./a.out
>
> libgomp: GCN host fallback has been suppressed
>
> And exit code = 1. The same result with '-foffload=disable' or with 
> '-foffload=nvptx-none'.

I can't tell if that's what you expect to see there, or not?

(For the avoidance of doubt: I'm expecting silent host-fallback execution
in the case that the libgomp GCN and/or nvptx plugins are available, but
no corresponding devices are.  That's what my patch achieves.)
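
To spell out the behavior I have in mind, here's a minimal sketch in C.
It is not code from libgomp: the enum, the function name, and its two
parameters are made up for illustration, and it only models the case
where 'suppress_host_fallback' is always-'true'.  It deliberately leaves
out 'OMP_TARGET_OFFLOAD', which is handled elsewhere.

```c
#include <stdbool.h>

/* Hypothetical outcome names, for illustration only.  */
enum offload_outcome
{
  OFFLOAD_RUN_ON_DEVICE,
  OFFLOAD_HOST_FALLBACK,  /* Silent; decided at the libgomp level.  */
  OFFLOAD_FATAL_ERROR     /* Diagnosed by the plugin.  */
};

/* Assumption: the plugin itself has been loaded successfully.
   NUM_DEVICES is the number of devices the plugin reports; LAUNCH_OK
   says whether a kernel can actually be launched on such a device.  */
static enum offload_outcome
expected_outcome (int num_devices, bool launch_ok)
{
  /* Plugin available, but no corresponding devices: not an error, just
     silent host-fallback execution.  */
  if (num_devices == 0)
    return OFFLOAD_HOST_FALLBACK;

  /* A device exists, but the launch fails: with
     'suppress_host_fallback' always-'true', this is fatal instead of a
     plugin-level host fallback.  */
  if (!launch_ok)
    return OFFLOAD_FATAL_ERROR;

  return OFFLOAD_RUN_ON_DEVICE;
}
```

The point of the sketch is that the only host fallback left is the one
for "no devices", and the plugin never falls back on its own once a
device has been selected.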

>> Should we thus
>> actually remove 'suppress_host_fallback' (that is, make it
>> always-'true'),
>
> If we want to remove it, we can make it always false - but I am strongly 
> against making it always true.

I'm confused.  So you want the GCN and nvptx plugins to behave
differently in that regard?  What is the rationale for that?  In
particular, how does that fit with this whole concept of dynamic
plugin-level host-fallback execution being in conflict with our current
non-shared memory system configurations?


> Use OMP_TARGET_OFFLOAD=mandatory (or that GCN env) if you want to 
> prevent the host fallback, but don't break somewhat common systems.

That's an orthogonal concept?


Grüße
 Thomas
