Hello,
I'm new to the list, so this issue may already have been solved
elsewhere. I found a similar question in the mailing list archive, but
no solution.
By comparing with a similar system, I found that on my 4-socket Opteron
system it is necessary to run MPI with the -bind-to numa option.
On my Ubuntu 14.04 system I get
----------------------
A request was made to bind a process, but at least one node does NOT
support binding processes to cpus.
Node: leo
This usually is due to not having libnumactl and libnumactl-devel
installed on the node.
-----------------------
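Independently of Open MPI, it may help to check whether the kernel lets
an ordinary process bind to a CPU at all. The following is only a
minimal sketch: it assumes the taskset tool from util-linux is
installed, and the function name check_binding is made up for
illustration.

```shell
# Sketch: try to run a trivial command pinned to CPU 0.
# Assumes taskset (util-linux) is available; only prints a diagnosis.
check_binding() {
    if ! command -v taskset >/dev/null 2>&1; then
        echo "taskset not found; cannot test CPU binding"
        return 0
    fi
    if taskset -c 0 true 2>/dev/null; then
        echo "binding to CPU 0 works outside MPI"
    else
        echo "binding to CPU 0 failed outside MPI"
    fi
}

check_binding
```

If binding already fails here, the problem is probably not specific to
the MPI installation.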
locate libnuma | grep so
results in
/usr/lib/x86_64-linux-gnu/libnuma.so
/usr/lib/x86_64-linux-gnu/libnuma.so.1
/usr/lib64/libnuma.so
/usr/lib64/libnuma.so.1
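To see which of these files come from a distribution package and which
from a manual build, the package database can be queried. This is a
sketch that assumes a Debian-style system with dpkg; the helper name
owner_of is hypothetical.

```shell
# Sketch: report which installed package (if any) owns each libnuma file.
# Assumes dpkg is the package manager; prints a note otherwise.
owner_of() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -S "$1" 2>/dev/null || echo "no package owns $1"
    else
        echo "dpkg not available; cannot query $1"
    fi
}

for f in /usr/lib/x86_64-linux-gnu/libnuma.so \
         /usr/lib/x86_64-linux-gnu/libnuma.so.1 \
         /usr/lib64/libnuma.so \
         /usr/lib64/libnuma.so.1; do
    owner_of "$f"
done
```

Files that no package owns were most likely installed by hand.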
When I configure hwloc 1.11.1, it detects numa (it says so at the end
of configure),
and "grep numa config.status" results in
S["HWLOC_LIBS"]="-lm -lnuma -lxml2 "
S["HWLOC_LINUX_LIBNUMA_LIBS"]="-lnuma"
When I configure openmpi-1.10.0, it also finds libnuma:
grep numa config.status
S["OMPI_WRAPPER_EXTRA_LIBS"]="-lm -lnuma -ldl -lutil "
S["ORTE_WRAPPER_EXTRA_LIBS"]="-lm -lnuma -ldl -lutil "
S["OPAL_WRAPPER_EXTRA_LIBS"]="-lm -lnuma -ldl -lutil "
S["HWLOC_EMBEDDED_LIBS"]="-lm -lnuma"
S["HWLOC_LINUX_LIBNUMA_LIBS"]="-lnuma"
D["WRAPPER_EXTRA_LIBS"]=" \"-lm -lnuma -ldl -lutil \""
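The config.status entries only show link flags. What the installed
build actually contains can be listed with ompi_info, which prints the
MCA components. A minimal sketch, assuming the ompi_info belonging to
the build under test is the first one in PATH (the function name
show_hwloc_component is made up):

```shell
# Sketch: list the hwloc component the installed Open MPI was built with.
# Assumes ompi_info from the build under test is first in PATH.
show_hwloc_component() {
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -i hwloc || echo "no hwloc component listed"
    else
        echo "ompi_info not found in PATH"
    fi
}

show_hwloc_component
```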
However, I have no idea how to install libnumactl and
libnumactl-devel. I cannot find these packages with Google.
I built numactl-2.0.9 manually, but
~/tmp/numactl-2.0.9$ ll lib*
only gives libnuma.a and libnuma.so
Even the Open MPI source gives no hint:
/tmp/openmpi-1.10.0$ grep -r numactl
opal/mca/hwloc/hwloc191/hwloc/README: * libnuma for memory binding and migration support on Linux (numactl-devel or
orte/mca/rmaps/base/help-orte-rmaps-base.txt:This usually is due to not having libnumactl and libnumactl-devel
orte/mca/rmaps/base/help-orte-rmaps-base.txt:contained in the libnumactl and libnumactl-devel packages.
orte/mca/rmaps/base/help-orte-rmaps-base.txt:contained in the libnumactl and libnumactl-devel packages.
Please help, I have no idea what to try next. The only options I
currently see are to try MPICH or Intel MPI.
Thanks,
Fabian