Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-14 Thread Brock Palen
On Feb 7, 2014, at 9:45 AM, Brice Goglin wrote:
> On 06/02/2014 21:31, Brock Palen wrote:
>> Actually that did turn out to help. The nvml# devices appear to be numbered
>> in the way that CUDA_VISIBLE_DEVICES sees them, while the cuda# devices are
>> in the order

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-07 Thread Brice Goglin
On 06/02/2014 21:31, Brock Palen wrote:
> Actually that did turn out to help. The nvml# devices appear to be numbered
> in the way that CUDA_VISIBLE_DEVICES sees them, while the cuda# devices are
> in the order that PBS and nvidia-smi see them.
By the way, did you have CUDA_VISIBLE_DEVICES

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-06 Thread Brice Goglin
On 06/02/2014 21:31, Brock Palen wrote:
> Actually that did turn out to help. The nvml# devices appear to be numbered
> in the way that CUDA_VISIBLE_DEVICES sees them, while the cuda# devices are
> in the order that PBS and nvidia-smi see them.
>
> PCIBridge
> PCIBridge
>

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-06 Thread Samuel Thibault
Brock Palen, on Thu 06 Feb 2014 21:31:42 +0100, wrote:
> GPU L#3 "nvml2"
> GPU L#5 "nvml3"
> GPU L#7 "nvml0"
> GPU L#9 "nvml1"
>
> Is the L# always going to be in the order I would expect? Because then I
> already have my map then.
No,
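The reply above opens with "No", so a mapping is probably safer when keyed on the device name (nvml0, nvml1, ...) than on the L# value. Below is a minimal sketch of that idea, assuming lstopo-style text in the `GPU L#<n> "<name><index>"` form quoted in this message. The helper name `parse_gpu_osdevs` and the regular expression are illustrative assumptions, not part of hwloc.

```python
import re

# Assumed line shape, taken from the quoted output: GPU L#3 "nvml2"
GPU_LINE = re.compile(r'GPU\s+L#(\d+)\s+"([A-Za-z]+)(\d+)"')


def parse_gpu_osdevs(text):
    """Return {backend: {device_index: logical_index}} from lstopo-style text.

    Hypothetical helper: it only reads text and does not call hwloc.
    """
    devices = {}
    for match in GPU_LINE.finditer(text):
        logical, backend, index = match.groups()
        devices.setdefault(backend, {})[int(index)] = int(logical)
    return devices


# The four GPU lines quoted in the message above.
sample = '''
GPU L#3 "nvml2"
GPU L#5 "nvml3"
GPU L#7 "nvml0"
GPU L#9 "nvml1"
'''

gpu_map = parse_gpu_osdevs(sample)
for index in sorted(gpu_map["nvml"]):
    print(f"nvml{index} -> L#{gpu_map['nvml'][index]}")
```

Looking devices up by name keeps the map valid even if the L# values come out in a different order on another node.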

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-06 Thread Brock Palen
Actually that did turn out to help. The nvml# devices appear to be numbered in the way that CUDA_VISIBLE_DEVICES sees them, while the cuda# devices are in the order that PBS and nvidia-smi see them.
PCIBridge
PCIBridge
PCIBridge
PCI 10de:1021

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-05 Thread Brice Goglin
Hello Brock,
Some people reported the same issue in the past, which is why we added the "nvml" objects. CUDA reorders devices by "performance". Batch schedulers are somehow supposed to use "nvml" for managing GPUs without actually using them with CUDA directly. And the "nvml" order is the