On 2015/12/1 08:01 PM, Julian Brown wrote:
> On Tue, 24 Nov 2015 18:27:24 +0800
> Chung-Lin Tang <clt...@codesourcery.com> wrote:
> 
>> Hi, this patch reworks some of the way that asynchronous copyouts are
>> implemented for OpenACC in libgomp.
>>
>> Before this patch, we had a somewhat confusing way of implementing
>> this by having two refcounts for each mapping: refcount and
>> async_refcount, which I never got working again after the last wave
>> of async regressions showed up.
>>
>> So this patch implements what I believe to be a simplification:
>> async_refcount is removed, and instead of trying to queue the async
>> copyouts during unmapping we actually do that during the plugin event
>> handling. This requires a addition of the async stream integer as an
>> argument to the register_async_cleanup plugin hook, but overall I
>> think this should be more elegant than before.
> 
> This looks OK to me I think (I've only looked fairly briefly). I vaguely
> remember trying something along these lines in an earlier iteration of
> the async support -- maybe hitting problems with locking (I see you
> have code to mitigate problems with that, and locking generally has
> probably evolved a bit since I last looked at the code in detail
> anyway).
> 
> Can event_gc ever be called when the *device* lock is held?

It only matters when the memmap_lockable argument is true, and for those
cases, no the device lock is never held.

> I'm slightly concerned that pushing async unmapping into event_gc means
> that program-level semantics are deferred to the backend, which is
> arguably the wrong place. But then I don't understand what went wrong
> with the dual-refcount implementation, so maybe it's unavoidable for
> some reason.

I got the dual-refcounting to work again (after the regressions first showed up)
in some cases briefly, but regressed in other testcases, which I don't recall
the full details now.
Indeed the copyout is now triggered inside the plugin, but it is still wrapped
inside GOMP_PLUGIN_async_unmap_vars(), so it's probably not too ugly.

Per our earlier internal discussion, I'm committing this to the gomp4 branch 
first.
Trunk will need to wait for Jakub's approval.

Thanks,
Chung-Lin

Reply via email to