Thanks Brice,

Right now, the user-facing issue is that NUMA binding is requested but there
is no NUMA node in the topology, so mpirun aborts.

But you have a good point: we could simply not bind at all in this case instead
of aborting, since the NUMA node would have been the full machine, so binding
would have been a no-op.
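
To make the idea concrete, here is a minimal sketch of that fallback. The
names (bind_policy_t, effective_policy) are hypothetical and only for
illustration; this is not the actual Open MPI code, and it assumes the caller
already knows how many NUMA node objects the topology contains.

```c
#include <stdio.h>

/* Hypothetical policy names, for illustration only. */
typedef enum { BIND_NONE, BIND_SOCKET, BIND_NUMA } bind_policy_t;

/*
 * Return the policy to actually apply, given the requested policy and the
 * number of NUMA node objects found in the topology.
 * Assumption: on a non-NUMA machine, hwloc 1.x reports zero NUMA nodes.
 */
bind_policy_t effective_policy(bind_policy_t requested, int num_numa_nodes)
{
    if (requested == BIND_NUMA && num_numa_nodes <= 0) {
        /* The single NUMA node would have covered the whole machine,
         * so binding to it would be a no-op: skip binding, do not abort. */
        return BIND_NONE;
    }
    return requested;
}

/* Small helper so the result can be printed in a usage example. */
const char *policy_name(bind_policy_t p)
{
    switch (p) {
    case BIND_NONE:   return "none";
    case BIND_SOCKET: return "socket";
    case BIND_NUMA:   return "numa";
    }
    return "unknown";
}
```

With this, effective_policy(BIND_NUMA, 0) yields BIND_NONE, while any other
request is passed through unchanged.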

FWIW,
- the default binding policy was changed from socket to numa (for better
out-of-the-box performance on KNL, iirc)
- in btl/sm we malloc(0) when there is no NUMA node, which causes some memory
corruption. The fix is trivial and I will push it tomorrow
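
For what it's worth, the malloc(0) pitfall can be illustrated with a small
sketch. This is not the btl/sm fix itself; alloc_per_numa is a hypothetical
helper that assumes one slot per NUMA node and treats "no NUMA node" as a
single node covering the whole machine.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocate one zero-initialized int slot per NUMA node.
 * malloc(0) may return NULL or a pointer that must not be dereferenced, so
 * a count of zero (or less) is clamped to one before allocating.
 * On success the number of slots is stored in *count_out.
 */
int *alloc_per_numa(int num_numa_nodes, size_t *count_out)
{
    size_t count = (num_numa_nodes > 0) ? (size_t)num_numa_nodes : 1;
    int *slots = calloc(count, sizeof *slots);

    if (slots == NULL) {
        return NULL; /* real allocation failure */
    }
    *count_out = count;
    return slots;
}
```

The caller can then safely write to slots[0] even when the topology reports
no NUMA node.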

Cheers,

Gilles

Brice Goglin <brice.gog...@inria.fr> wrote:
>
>
>Le 05/01/2017 07:07, Gilles Gouaillardet a écrit :
>> Brice,
>>
>> things would be much easier if there were an HWLOC_OBJ_NODE object in
>> the topology.
>>
>> could you please consider backporting the relevant changes from master
>> into the v1.11 branch ?
>>
>> Cheers,
>>
>> Gilles
>
>Hello
>Unfortunately, I can't backport this to 1.x. This is very intrusive and
>would break other things.
>However, what problem are you actually seeing? There is no NUMA node in
>hwloc 1.x when the machine isn't NUMA (or when there's no NUMA support
>in the operating system, but that's very unlikely). hwloc master would
>show a single NUMA node that is equivalent to the entire machine, so
>binding would be a no-op.
>Regards
>Brice
>
>_______________________________________________
>devel mailing list
>devel@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/devel