I don't have much to add, but +1. :)
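For anyone wanting to try this: the CUDA flavor of these conversion helpers already ships in hwloc/cuda.h, so below is a minimal sketch of how a batch scheduler might bind a process near a GPU. hwloc_cuda_get_device_cpuset() is the real hwloc/cuda.h call; everything else is illustrative glue, and the NVML/OpenCL equivalents Brice describes below should presumably follow the same pattern. Error handling is abbreviated:

/* Sketch: bind the current process near CUDA device 0 using the
 * hwloc/cuda.h helper.  Compile against the CUDA driver API and hwloc. */
#include <cuda.h>
#include <hwloc.h>
#include <hwloc/cuda.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_cpuset_t set;
    CUdevice dev;

    cuInit(0);
    cuDeviceGet(&dev, 0);                 /* first CUDA device */

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    set = hwloc_bitmap_alloc();
    /* Fill `set` with the CPUs close to the GPU, then bind there. */
    if (!hwloc_cuda_get_device_cpuset(topology, dev, set))
        hwloc_set_cpubind(topology, set, 0);

    hwloc_bitmap_free(set);
    hwloc_topology_destroy(topology);
    return 0;
}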
On Dec 19, 2012, at 12:58 PM, Brice Goglin wrote:

> Hello,
>
> We currently have three GPU-related branches:
> (1) an (old) CUDA branch that adds "cuda0", "cuda1", ... devices inside
> PCI devices and then puts Core and Memory objects in there to describe
> the GPU internals.
> (2) a (new) NVML branch that adds "nvml0", "nvml1", ... devices inside
> NVIDIA GPU PCI devices (the order can be different in NVML and CUDA).
> This is used by batch schedulers to retrieve NVIDIA GPU locality.
> (3) a (new) OpenCL branch that adds "opencl0p0", ... devices inside AMD
> GPU PCI devices.
>
> I am going to merge the basics of (1), (2) and (3) by the end of the
> year so that users can easily retrieve the locality of CUDA/NVML/OpenCL
> devices. They'll have functions to convert a device pointer into a
> hwloc object, a device index into an object, or a device pointer into
> a cpuset.
>
> The main drawback is that the initialization of these libraries can be
> slow (about 1-2 seconds added to lstopo, which enables I/O by default)
> when they are poorly configured (NVIDIA puts GPGPU devices in
> non-persistent mode by default, and AMD GPGPUs are slower if DISPLAY
> isn't set to :0). I will document how to avoid such issues; I am not
> sure it's worth disabling all these plugins by default.
>
> Then we'll talk about the remaining part of (1) (the GPU internals). I
> still need to see whether we can do something similar with OpenCL, find
> out which numbers (compute units, SIMD units, SIMD width) actually
> matter to users, and whether we can report all this in a somewhat
> portable way.
>
> Brice
>
> _______________________________________________
> hwloc-devel mailing list
> hwloc-de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/