On Aug 17 2009, Ralph Castain wrote:

> The problem is that the two mpiruns don't know about each other, and therefore the second mpirun doesn't know that another mpirun has already used socket 0.
>
> We hope to change that at some point in the future.

It won't help.  The problem is less likely to be two jobs both running
OpenMPI programs (that have been recently linked!) than other tasks that
are not OpenMPI at all.  I have mentioned daemons and kernel threads, but
think also of shared-memory parallel programs (OpenMP etc.); a LOT of
applications nowadays include some sort of threading.

For the ordinary multi-user system, you don't want any form of binding. The scheduler is rickety enough as it is, without confusing it further. That may change as the consequences of serious levels of multiple cores force that area to be improved, but don't hold your breath. And I haven't a clue which of the many directions scheduler design will go!

I agree that having an option, and having it easy to experiment with, is the
right way to go.  What the default should be is very much less clear.

Regards,
Nick Maclaren.

