On 03/31/09 14:50, Dave Love wrote:
Rolf Vandevaart <rolf.vandeva...@sun.com> writes:

However, I found that if I explicitly specify the "-machinefile
$TMPDIR/machines", all 8 mpi processes were spawned within a single
node, i.e. node0002.

I had that sort of behaviour recently when the tight integration was
broken on the installation we'd been given, and it took me a long time
to spot.  [Is the orte_leave_session_attached fix relevant here?]

No. orte_leave_session_attached is needed to avoid the errno=2 errors from the sm BTL. (That problem is fixed in 1.3.2 and on the trunk.)

And for what it is worth, as you have seen,
you do not need to specify a machines file.  Open MPI will use the
ones that were allocated by SGE.

Yes, but there's a problem with the recommended (as far as I remember)
setup, with one slot per node to ensure a single job per node.  In that
case, you have no control over allocation -- -bynode and -byslot are
equivalent, which apparently can badly affect some codes.  We're
currently using a starter to generate a hosts file for that reason
(complicated by having dual- and quad-core nodes) and would welcome a
better idea.
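
For illustration, a starter of the kind described above could be sketched roughly as follows. This is not the script in use at either site. It assumes the usual SGE $PE_HOSTFILE layout (hostname, slot count, queue, processor range per line) and writes Open MPI hostfile lines of the form "host slots=N". The function name pe_hostfile_to_openmpi and the cores_per_host mapping are hypothetical; the mapping stands in for whatever site-specific knowledge distinguishes the dual-core from the quad-core nodes.

```python
import os


def pe_hostfile_to_openmpi(lines, cores_per_host=None):
    """Turn SGE PE_HOSTFILE lines into Open MPI hostfile lines.

    cores_per_host is a hypothetical dict {hostname: cores}. When a host
    appears in it, that value replaces the slot count granted by SGE, which
    is 1 per node in the one-slot-per-node setup discussed here.
    """
    cores_per_host = cores_per_host or {}
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, granted = fields[0], int(fields[1])
        slots = cores_per_host.get(host, granted)
        result.append(f"{host} slots={slots}")
    return result


if __name__ == "__main__":
    # Inside an SGE parallel job, PE_HOSTFILE points at the allocation.
    path = os.environ.get("PE_HOSTFILE")
    if path:
        with open(path) as f:
            print("\n".join(pe_hostfile_to_openmpi(f)))
```

The output could then be passed to mpirun as a hostfile, so that -bynode and -byslot see the real core counts rather than one slot per node.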

I am not sure what you are asking here. Are you trying to get a single MPI process per node? If so, you could use -npernode 1. Sorry for my confusion.

Rolf

--

=========================
rolf.vandeva...@sun.com
781-442-3043
=========================
