On Tue, 24 Nov 2015 18:27:24 +0800
Chung-Lin Tang <clt...@codesourcery.com> wrote:

> Hi, this patch reworks some of the way that asynchronous copyouts are
> implemented for OpenACC in libgomp.
> 
> Before this patch, we had a somewhat confusing way of implementing
> this by having two refcounts for each mapping: refcount and
> async_refcount, which I never got working again after the last wave
> of async regressions showed up.
> 
> So this patch implements what I believe to be a simplification:
> async_refcount is removed, and instead of trying to queue the async
> copyouts during unmapping we actually do that during the plugin event
> handling. This requires a addition of the async stream integer as an
> argument to the register_async_cleanup plugin hook, but overall I
> think this should be more elegant than before.

This looks OK to me I think (I've only looked fairly briefly). I vaguely
remember trying something along these lines in an earlier iteration of
the async support -- maybe hitting problems with locking (I see you
have code to mitigate problems with that, and locking generally has
probably evolved a bit since I last looked at the code in detail
anyway).

Can event_gc ever be called when the *device* lock is held?

I'm slightly concerned that pushing async unmapping into event_gc means
that program-level semantics are deferred to the backend, which is
arguably the wrong place. But then I don't understand what went wrong
with the dual-refcount implementation, so maybe it's unavoidable for
some reason.

HTH,

Julian

Reply via email to