On 11.12.23 15:31, Thomas Schwinge wrote:
On 2023-12-08T17:44:14+0100, Tobias Burnus <tob...@codesourcery.com> wrote:
On 08.12.23 15:09, Thomas Schwinge wrote:
On 22/11/2023 17:07, Tobias Burnus wrote:
Let's start with the patch itself:
--- a/libgomp/target.c
+++ b/libgomp/target.c
...
+static struct gomp_device_descr *
+get_device_for_page_locked (void)
+{
+ gomp_debug (0, "%s\n",
+ __FUNCTION__);
+
+ struct gomp_device_descr *device;
+#ifdef HAVE_SYNC_BUILTINS
+ device
+ = __atomic_load_n (&device_for_page_locked, MEMMODEL_RELAXED);
+ if (device == (void *) -1)
+ {
+ gomp_debug (0, " init\n");
+
+ gomp_init_targets_once ();
+
+ device = NULL;
+ for (int i = 0; i < num_devices; ++i)
Given that this function just sets a single variable based on whether the
page_locked_host_alloc_func function pointer exists, wouldn't it be much
simpler to just do all this handling in gomp_target_init ?
@Thomas, care to comment on this?
From what I remember, we cannot assume that 'gomp_target_init' has
already been done when we get here; therefore 'gomp_init_targets_once' is
being called here. We may get to 'get_device_for_page_locked' via
host-side OpenMP, in code that doesn't contain any OpenMP 'target'
offloading things. Therefore, this was (a) necessary to make that work,
and (b) did seem to be a useful abstraction to me.
I am not questioning the "gomp_init_targets_once ();" but I am wounding
whether only 'gomp_init_targets_once()' should remain without the
locking + loading dance - and then just set that single variable inside
gomp_target_init.
Ah, I see, thanks.
If you reach here w/o target set up, the "gomp_init_targets_once ();"
would ensure it gets initialized with all the other code inside
gomp_target_init.
And if gomp_target_init() was called before, gomp_init_targets_once()
will just return without doing anything and your are also fine.
Yes, I suppose we could do it that way. 'get_device_for_page_locked'
could then, after 'gomp_init_targets_once', unconditionally return
'device_for_page_locked' (even without '__atomic_load', right?).
Yes, that was my idea.
A disadvantage is that the setup of 'device_for_page_locked' (in
'gomp_target_init') and use of it (in 'get_device_for_page_locked') is
then split apart. I guess I don't have a strong opinion on that one.
;-)
But pro is that it avoids the #ifdef HAVE_SYNC_BUILTINS, avoiding a "-1"
initialization of using_device_for_page_locked, atomic loads all over
the place etc.
Thus, I prefer this option – but I also don't have a strong opinion,
either.
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
München, HRB 106955