https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110840

            Bug ID: 110840
           Summary: [OpenMP] Check whether device locking is really needed
                    for bare memcopy to/from devices
                    (omp_target_memcpy...)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, openacc, openmp
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

See also PR110813, a performance PR for omp_target_memcpy_rect.

Thomas wrote in https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625670.html
>
> >      gomp_mutex_lock (&src_devicep->lock);
> > -  else if (dst_devicep)
> > +  if (lock_dst)
> >      gomp_mutex_lock (&dst_devicep->lock);
>
> (Pre-existing issue, and I've not myself tried to figure out the details
> at this time -- why do we actually lock the devices here, and in similar
> other places?)

 * * *

For merely copying memory to or from a fixed pointer, the locking does not seem
to be needed: if the memory disappears during the copy, that is a user problem.
Or is there an issue in terms of handling the context, as with CUDA's call to
nvptx_attach_host_thread_to_device?

Pure memory copy without splay-tree lookups is in:

* omp_target_memcpy_copy
* omp_target_memcpy_rect_copy
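
For illustration, the user-level case in question looks roughly like the
following minimal sketch. It only uses the standard OpenMP API routines
omp_target_alloc, omp_target_memcpy and omp_target_free; the helper name
"roundtrip" is made up for this example, and the #else branch contains
host-only stand-ins (an assumption, not libgomp code) so that the sketch also
builds without -fopenmp. Both pointers are fixed addresses chosen by the
caller, so no mapping lookup should be involved in the copy itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Host-only stand-ins (assumption, for illustration only) so the sketch
   also builds without -fopenmp; every "device" is the host here.  */
static int omp_get_initial_device (void) { return 0; }
static int omp_get_default_device (void) { return 0; }
static void *omp_target_alloc (size_t size, int dev)
{ (void) dev; return malloc (size); }
static void omp_target_free (void *ptr, int dev)
{ (void) dev; free (ptr); }
static int omp_target_memcpy (void *dst, const void *src, size_t len,
                              size_t dst_off, size_t src_off,
                              int dst_dev, int src_dev)
{
  (void) dst_dev; (void) src_dev;
  memcpy ((char *) dst + dst_off, (const char *) src + src_off, len);
  return 0;
}
#endif

/* Hypothetical helper: copy N ints to the default device and back through
   a raw device pointer.  Returns 0 on success, nonzero on failure.  */
int
roundtrip (const int *in, int *out, size_t n)
{
  int host = omp_get_initial_device ();
  int dev = omp_get_default_device ();
  size_t bytes = n * sizeof (int);

  /* Raw device allocation; the pointer is not part of any 'map' clause.  */
  void *d = omp_target_alloc (bytes, dev);
  if (d == NULL)
    return 1;

  /* Host -> device, then device -> host, both between fixed pointers.  */
  int err = omp_target_memcpy (d, in, bytes, 0, 0, dev, host);
  if (!err)
    err = omp_target_memcpy (out, d, bytes, 0, 0, host, dev);

  omp_target_free (d, dev);
  return err;
}
```

If the caller frees the device buffer while such a copy is still in flight,
that is the "user problem" case mentioned above.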

 * * *

Besides pure memory copy, are there other places where the locking could be
removed? Or places where it is missing?
