> Most people don't care about cache when binding with MPI, so you may
> just ignore the issue and hide the message by setting
> HWLOC_HIDE_ERRORS=1 in the environment. It may work fine (assuming
> your MPI doesn't have trouble with asymmetric topologies where some
> L3 caches are missing).
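
Setting that is easy enough; in our bash job scripts it would look
something like this (a sketch: the script name and process count are
placeholders for our setup):

    # silence hwloc's asymmetric-topology warning for this job only
    export HWLOC_HIDE_ERRORS=1
    mpirun -np 48 python mcmc_fit.py   # mcmc_fit.py is a placeholder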

We do see some problems with our MPI (that's what prompted me to
install hwloc 1.9 and Open MPI 1.8.2rc2 in the first place), but I
don't think they're related to this cache problem: when we submit an
MPI job to, say, 48 processors, only a handful of seemingly random
ones run at 100% while the rest sit idle (as seen in htop). We are
running Open MPI via mpi4py to do Markov chain Monte Carlo (MCMC)
fitting.
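
In case the binding itself is the culprit, one thing we could try is
asking Open MPI to show and enforce the bindings; a sketch using
mpirun options that exist in the 1.8 series:

    # print each rank's core binding to stderr at launch; forcing
    # --bind-to core would rule out ranks piling onto the same cores
    mpirun -np 48 --report-bindings --bind-to core python mcmc_fit.py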

> Otherwise, hwloc can load the topology from XML. So we'll just need to
> generate a fixed topology, export it to XML, and set an environment
> variable to have hwloc load it from there. A single file may even be
> enough for all similar nodes, assuming your MPI and/or applications
> don't look too deeply into the details of hwloc topologies.
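
If we go that route, my understanding of the workflow is roughly the
following (a sketch: the path is made up, the XML gets hand-edited to
add the missing L3, and HWLOC_THISSYSTEM=1 tells hwloc the XML really
describes the local machine so binding still applies):

    # 1. dump the current topology of one node to XML
    lstopo /shared/hwloc/node.xml

    # 2. hand-edit node.xml to add the missing L3 cache objects

    # 3. on each node, before launching MPI, load the fixed topology
    export HWLOC_XMLFILE=/shared/hwloc/node.xml
    export HWLOC_THISSYSTEM=1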

We have 4 identical nodes; one runs as a server and the others boot off
the server via NFS. But there are also 6 older nodes (with 4 processors
each) that boot off the server in the same way, so for those I imagine
a single fixed-topology XML won't do much good. That said, we are in
the process of phasing those 6 nodes out, and I wouldn't be terribly
sad if this gave us a good reason to do so.
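
(If we do keep them around for a while, I suppose one XML per node
type would still work, selected by hostname at boot or in the job
prologue; the hostnames and paths below are invented:)

    # pick the right topology file for this node type (names made up)
    case "$(hostname -s)" in
      node[1-4])  export HWLOC_XMLFILE=/shared/hwloc/new-node.xml ;;
      oldnode*)   export HWLOC_XMLFILE=/shared/hwloc/old-node.xml ;;
    esac
    export HWLOC_THISSYSTEM=1   # the XML describes this very machine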

Thanks a bunch,
Andrej
