On Mon, 26 Jan 2015 14:44:19 +0100 Thomas Schwinge <tho...@codesourcery.com> wrote:
> > On 17 Jan 02:16, Ilya Verbin wrote:
> > > Unfortunately, it broke offloading from shared libraries (I mean
> > > common libs with NEEDED entries, not dlopened).
>
> Sorry for that!
>
> > > Such things are not covered by the
> > > testsuite, that's why you missed this issue. Here is a simple
> > > testcase:
>
> <http://news.gmane.org/find-root.php?message_id=%3C20150116231632.GB48380%40msticlxl57.ims.intel.com%3E>
>
> Probably a good motivation for adding such a test case. ;-)
>
> > > So, you don't assume that a device can have multiple images from
> > > multiple libs?
> >
> > Ping?
>
> This probably is "just" a bug that we introduced with our changes?
> (Julian?)

AFAICR, we haven't yet figured out how to make (shared) libraries work
with PTX. Actually I'm not entirely sure if static libraries containing
PTX code will work either. But, multiple images (e.g. from different
object files) are supported, via the loop in gomp_target_init. (The
semantics of gomp_register_image_for_device were changed, but not --
intentionally! -- to limit the number of offloaded images to one.)

> > Also, could you please explain, why did you divide a device
> > initialization into two functions -- gomp_init_device and
> > gomp_init_tables?
>
> As I understand it (again, Julian, please correct me if I got that
> wrong), the reason is that for OpenACC support, we need these as two
> separate (independent) actions. Is this causing problems for OpenMP
> offloading?

This was certainly necessary at some point, when the support for
multiple devices of the same type in the OpenACC runtime was delegated
entirely to target-dependent code. Later (after one round of
refactoring), the gomp_device_descr and the memory map were still
separate, with the former possibly representing a number of devices,
and the latter having independent copies for each instance of a device.

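To make that intermediate layout concrete, here is a rough C sketch. It
is an illustration only, not the libgomp code: every name starting with
sketch_ is hypothetical, the fields are stand-ins, and the two init
steps only loosely mirror the gomp_init_device / gomp_init_tables split
(what the real functions do is not shown here). It assumes one
descriptor per device type and one independent memory map per device
instance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified types: NOT the real libgomp definitions. */
struct sketch_mem_map {
    size_t num_mappings;  /* stand-in for a real host-to-device mapping table */
};

/* One descriptor per device *type*; one memory map per device *instance*. */
struct sketch_device_descr {
    const char *name;
    size_t num_instances;
    struct sketch_mem_map *maps;  /* num_instances independent maps, or NULL */
};

/* Step 1 (assumed): set up the descriptor only; no maps exist yet. */
static int sketch_init_device(struct sketch_device_descr *d,
                              const char *name, size_t num_instances)
{
    d->name = name;
    d->num_instances = num_instances;
    d->maps = NULL;
    return 0;
}

/* Step 2 (assumed): allocate a zeroed map for every instance.
   Returns 0 on success, -1 if the allocation fails. */
static int sketch_init_tables(struct sketch_device_descr *d)
{
    d->maps = calloc(d->num_instances, sizeof *d->maps);
    return d->maps ? 0 : -1;
}

/* Look up the memory map that belongs to one device instance. */
static struct sketch_mem_map *sketch_map_for(struct sketch_device_descr *d,
                                             size_t idx)
{
    assert(d->maps != NULL && idx < d->num_instances);
    return &d->maps[idx];
}
```

With two separate steps like these, a caller can set up the descriptor
first and create the per-instance maps later, which is the kind of
independence the OpenACC side needed at that stage.
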
That's largely been refactored (again) away now though -- a
gomp_device_descr and its memory map are stored together, per-device
instance. So this separation of their initialisation can probably go
away, although some (somewhat delicate) code in oacc-init.c would need
to be tweaked.

Julian