Hello,

This is the main corner case of hwloc-distrib: it can only return single objects, not groups of objects. The distrib algorithm is:

1) Start at the root, which has M children, with N processes to distribute.
2) If there are no children, or if N is 1, return the entire object.
3) Split N into M pieces Ni (N = sum of Ni) based on each child's weight (the number of PUs under it).
   If N >= M, all Ni can be > 0, so all children will get some process.
   If N < M, you can't split N into M nonzero integer pieces, so some Ni will be 0 and those objects won't get any process.
4) Go back to (2) and recurse into each child object with Ni instead of N.

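To make the steps above concrete, here is a small Python sketch of the recursion. It is not hwloc's code: Obj, build() and distrib() are names made up for this illustration, and the rounding rule used to pick the Ni (round up, earlier children served first) is an assumption, since the steps above do not say how the split is rounded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Obj:
    """Toy topology object (hypothetical, not an hwloc type)."""
    pus: List[int]                                   # indexes of the PUs below this object
    children: List["Obj"] = field(default_factory=list)

    @property
    def mask(self) -> int:
        """Bitmask with one bit set per PU below this object."""
        m = 0
        for pu in self.pus:
            m |= 1 << pu
        return m


def build(cores: int, pus_per_core: int) -> Obj:
    """Build a root object containing `cores` cores of `pus_per_core` PUs each."""
    core_objs = []
    for c in range(cores):
        first = c * pus_per_core
        leaves = [Obj([first + p]) for p in range(pus_per_core)]
        core_objs.append(Obj([pu for leaf in leaves for pu in leaf.pus], leaves))
    return Obj([pu for core in core_objs for pu in core.pus], core_objs)


def distrib(obj: Obj, n: int) -> List[int]:
    """Return one PU mask per process, following steps 1-4."""
    if n == 0:
        return []                                    # this object gets no process
    if n == 1 or not obj.children:
        return [obj.mask] * n                        # step 2: return the entire object
    result = []
    remaining_weight = len(obj.pus)
    for child in obj.children:
        weight = len(child.pus)                      # step 3: weight = number of PUs
        # Assumed rounding: ceil(n * weight / remaining_weight)
        share = (n * weight + remaining_weight - 1) // remaining_weight
        result.extend(distrib(child, share))         # step 4: recurse with Ni
        n -= share
        remaining_weight -= weight
    return result


for m in distrib(build(4, 4), 2):
    print(f"0x{m:08x}")
# prints:
# 0x0000000f
# 0x000000f0
```

With N=2 and 4 cores of 4 PUs, only the first two cores receive a process, and each process gets exactly one whole core. The other two cores end up with Ni = 0.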
Your case is step 3 with N=2 and M=4. It basically means that we distribute across cores without "assembling groups of cores if needed".

In your case, when you bind to 2 cores of 4 PUs each, your task only uses one PU in the end, so 1 core and 3 PUs are ignored anyway. They *may* be used, but the operating system scheduler is free to ignore them. So binding to 2 cores, binding to 1 core, or binding to 1 PU is almost equivalent. At least the latter is included in the former. And most people pass --single to get a single PU anyway.

The case where it's not equivalent is when you bind multithreaded processes. If you have 8 threads, it's better to use 2 cores than a single one.

If this case matters to you, I will look into fixing this corner case.

Brice


On 30/03/2014 07:56, Tim Creech wrote:
> Hello,
> I would like to use hwloc_distrib for a project, but I'm having some
> trouble understanding how it distributes. Specifically, it seems to
> avoid distributing multiple processes across cores, and I'm not sure
> why.
>
> As an example, consider the actual output of:
>
> $ hwloc-distrib -i "4 4" 2
> 0x0000000f
> 0x000000f0
>
> I'm expecting hwloc-distrib to tell me how to distribute 2 processes
> across the 16 PUs (4 cores by 4 PUs), but the answer only involves 8
> PUs, leaving the other 8 unused. If there were more cores on the
> machine, then potentially the vast majority of them would be unused.
>
> In other words, I might expect the output to use all of the PUs across
> cores, for example:
>
> $ hwloc-distrib -i "4 4" 2
> 0x000000ff
> 0x0000ff00
>
> Why does hwloc-distrib leave PUs unused? I'm using hwloc-1.9. Any help
> in understanding where I'm going wrong is greatly appreciated!
>
> Thanks,
> Tim