Hi Brice, hi all,

I'm using hwloc-distrib to distribute jobs on our big boxes (see the attached topology) in order to measure performance under various loads. The issue is that hwloc-distrib always starts from PU:0. This is a problem because PU#0 usually handles most of the interrupts, so it is heavily used by the OS itself; on this box, PU#0 is quite busy with system-related tasks.

Currently, if the number of jobs is lower than the number of PUs, I use this workaround to avoid PU#0:
if [[ "${Jobs}" -lt "${TotalPU}" ]] ; then
    # avoid using PU 0
    toRunOn=$(hwloc-distrib --single --taskset --restrict $(hwloc-calc machine:0 ~PU:0) ${Jobs})
else
    toRunOn=$(hwloc-distrib --single --taskset ${Jobs})
fi

Would it be possible to add a --reverse option to hwloc-distrib that distributes the jobs in the reverse direction, i.e. starting from the last PU and always preferring the last socket/core/PU (the opposite of the current behaviour, where the first socket/core/PU is always preferred)? On the attached topology the output would be:

hwloc-distrib --single --taskset --reverse 1
=> Socket 7, Core 11, PU 127

hwloc-distrib --single --taskset --reverse 8
=> Socket 7, Core 11, PU 127
=> Socket 6, Core 11, PU 119
=> Socket 5, Core 11, PU 111
=> Socket 4, Core 11, PU 103
=> Socket 3, Core 11, PU 95
=> Socket 2, Core 11, PU 87
=> Socket 1, Core 11, PU 79
=> Socket 0, Core 11, PU 71

hwloc-distrib --single --taskset --reverse 9
=> same as the output above, plus Socket 7, Core 10, PU 126

Hopefully you get the idea. The point is to always start with the last PU and, when adding the next PU, to always start from the last socket, last core and last PU on that core. With this strategy, PU#0 would be used only when the number of jobs >= the number of PUs.

What do you think about this feature? Do you find it useful? Could it be added to v1.8?

Thanks a lot!
Jirka
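P.S. While writing this I also tried to approximate the behaviour on top of the existing tools. The snippet below is only a rough, untested sketch, not a substitute for a real --reverse: it takes the chunks that plain hwloc-distrib already produces, hands them out in reverse order, and then picks the last logical PU of each chunk with hwloc-calc (assuming hwloc-calc's --intersect output and hex cpuset arguments behave the way I think they do):

# rough sketch: reverse the chunks from hwloc-distrib and take the last
# logical PU of each chunk instead of the first one
toRunOn=""
for chunk in $(hwloc-distrib ${Jobs} | tac) ; do
    # logical indexes of the PUs contained in this chunk, e.g. "120,...,127"
    pus=$(hwloc-calc --intersect PU ${chunk})
    lastPU=$(echo "${pus}" | tr ',' '\n' | tail -n 1)
    # taskset-style mask for that single PU
    toRunOn="${toRunOn} $(hwloc-calc --taskset PU:${lastPU})"
done

This should match the 1-job and 8-job examples above, but not the 9-job one: when the job count doesn't divide evenly, hwloc-distrib (as far as I can tell) still splits the first socket rather than the last, which is exactly why a --reverse inside hwloc-distrib itself would be nicer.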