On Tue, Dec 01, 2015 at 08:29:27PM +0300, Ilya Verbin wrote:
> libgomp/
> * target.c (finalized): New static variable.
> (resolve_device): Do nothing when finalized is true.
> (GOMP_offload_register_ver): Likewise.
> (GOMP_offload_unregister_ver): Likewise.
> (gomp_target_fini): New static function.
> (gomp_target_init): Call gomp_target_fini at exit.
> liboffloadmic/
> * plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove.
> (register_main_image): Do not call unregister_main_image at exit.
> (GOMP_OFFLOAD_fini_device): Allow for OpenMP. Unregister main image.
>
> diff --git a/libgomp/target.c b/libgomp/target.c
> index cf9d0e6..320178e 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -78,6 +78,10 @@ static int num_devices;
> /* Number of GOMP_OFFLOAD_CAP_OPENMP_400 devices. */
> static int num_devices_openmp;
>
> +/* True when offloading runtime is finalized. */
> +static bool finalized;
> +
> +
> /* Similar to gomp_realloc, but release register_lock before gomp_fatal. */
>
> static void *
> @@ -108,6 +112,9 @@ gomp_get_num_devices (void)
> static struct gomp_device_descr *
> resolve_device (int device_id)
> {
> + if (finalized)
> + return NULL;
> +
This is racy, tsan would tell you so.
Instead of a global var, I'd just change the devicep->is_initialized
field from bool into a 3 state field (perhaps enum), with states
uninitialized, initialized, finalized, and then say in resolve_device,
gomp_mutex_lock (&devices[device_id].lock);
if (devices[device_id].state == GOMP_DEVICE_UNINITIALIZED)
gomp_init_device (&devices[device_id]);
else if (devices[device_id].state == GOMP_DEVICE_FINALIZED)
{
gomp_mutex_unlock (&devices[device_id].lock);
return NULL;
}
gomp_mutex_unlock (&devices[device_id].lock);
Though, of course, that is incomplete, because resolve_device takes one
lock, gomp_get_target_fn_addr another one, gomp_map_vars yet another one.
So I think either we want to rewrite the locking, such that say
resolve_device returns a locked device and then you perform stuff on the
locked device (disadvantage is that gomp_map_vars will call gomp_malloc
with the lock held, which can take some time to allocate the memory),
or there needs to be the possibility that gomp_map_vars rechecks if the
device has not been finalized after taking the lock and returns to the
caller if the device has been finalized in between resolve_device and
gomp_map_vars.
Jakub