On 2015/12/1 08:01 PM, Julian Brown wrote: > On Tue, 24 Nov 2015 18:27:24 +0800 > Chung-Lin Tang <clt...@codesourcery.com> wrote: > >> Hi, this patch reworks some of the way that asynchronous copyouts are >> implemented for OpenACC in libgomp. >> >> Before this patch, we had a somewhat confusing way of implementing >> this by having two refcounts for each mapping: refcount and >> async_refcount, which I never got working again after the last wave >> of async regressions showed up. >> >> So this patch implements what I believe to be a simplification: >> async_refcount is removed, and instead of trying to queue the async >> copyouts during unmapping we actually do that during the plugin event >> handling. This requires a addition of the async stream integer as an >> argument to the register_async_cleanup plugin hook, but overall I >> think this should be more elegant than before. > > This looks OK to me I think (I've only looked fairly briefly). I vaguely > remember trying something along these lines in an earlier iteration of > the async support -- maybe hitting problems with locking (I see you > have code to mitigate problems with that, and locking generally has > probably evolved a bit since I last looked at the code in detail > anyway). > > Can event_gc ever be called when the *device* lock is held?
It only matters when the memmap_lockable argument is true, and for those cases, no the device lock is never held. > I'm slightly concerned that pushing async unmapping into event_gc means > that program-level semantics are deferred to the backend, which is > arguably the wrong place. But then I don't understand what went wrong > with the dual-refcount implementation, so maybe it's unavoidable for > some reason. I got the dual-refcounting to work again (after the regressions first showed up) in some cases briefly, but regressed in other testcases, which I don't recall the full details now. Indeed the copyout is now triggered inside the plugin, but it is still wrapped inside GOMP_PLUGIN_async_unmap_vars(), so it's probably not too ugly. Per our earlier internal discussion, I'm committing this to the gomp4 branch first. Trunk will need to wait for Jakub's approval. Thanks, Chung-Lin