https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110840
Bug ID: 110840
Summary: [OpenMP] Check whether device locking is really needed for bare memcopy to/from devices (omp_target_memcpy...)
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization, openacc, openmp
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org
Target Milestone: ---

See also PR110813, a performance PR for omp_target_memcpy_rect.

Thomas wrote in https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625670.html

> >     gomp_mutex_lock (&src_devicep->lock);
> > -  else if (dst_devicep)
> > +  if (lock_dst)
> >     gomp_mutex_lock (&dst_devicep->lock);
>
> (Pre-existing issue, and I've not myself tried to figure out the details
> at this time -- why do we actually lock the devices here, and in similar
> other places?)

* * *

For just copying memory to a fixed pointer, the lock seems unnecessary: if the memory disappears during the copy, that is a user problem.

Or is there an issue with handling the context, as in the CUDA plugin's call to nvptx_attach_host_thread_to_device?

Pure memory copy without splay-tree lookups is in:
* omp_target_memcpy_copy
* omp_target_memcpy_rect_copy

* * *

Besides pure memory copy, are there other places where the locking could be removed? Or places where it is missing?
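
To make the question concrete, here is a minimal model of the lock-around-copy pattern from the quoted hunk. It is a sketch only, not libgomp code: `fake_lock`, `model_device` and `model_memcpy` are hypothetical names, the lock is a plain counter rather than a gomp mutex, and the condition used for `lock_dst` (destination is a device and differs from the source device) is an assumption, since the quoted hunk does not show how that flag is computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device mutex; it only records usage.  */
struct fake_lock
{
  bool held;
  unsigned acquisitions;
};

/* Hypothetical stand-in for a device descriptor; NOT libgomp's type.  */
struct model_device
{
  struct fake_lock lock;
};

static void
fake_lock_acquire (struct fake_lock *l)
{
  /* Acquiring twice would deadlock a real non-recursive mutex.  */
  assert (!l->held);
  l->held = true;
  l->acquisitions++;
}

static void
fake_lock_release (struct fake_lock *l)
{
  assert (l->held);
  l->held = false;
}

/* Copy N bytes from SRC to DST.  A NULL device pointer means "host".
   USE_LOCKS selects between the locked and the lock-free variant.  */
static void
model_memcpy (void *dst, const void *src, size_t n,
	      struct model_device *dst_dev, struct model_device *src_dev,
	      bool use_locks)
{
  bool lock_src = use_locks && src_dev != NULL;
  /* Assumption: lock the destination only if it is a different device,
     so that one device is never locked twice.  */
  bool lock_dst = use_locks && dst_dev != NULL && dst_dev != src_dev;

  if (lock_src)
    fake_lock_acquire (&src_dev->lock);
  if (lock_dst)
    fake_lock_acquire (&dst_dev->lock);

  /* The bare copy: no mapping lookup, only fixed pointers.  */
  memcpy (dst, src, n);

  if (lock_dst)
    fake_lock_release (&dst_dev->lock);
  if (lock_src)
    fake_lock_release (&src_dev->lock);
}
```

In this model the copied bytes are the same with `use_locks` set to true or false, which is the intuition behind the report. The model does not cover per-thread context handling in a device plugin, so it cannot by itself show that the locks are safe to drop.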