Hi Gabriele,
We may have a problem in our LSF support. None of us has a way of
testing it, so we are programming somewhat blind here.
From the message, it looks like there is a mismatch between how many
slots were allocated and how many procs were mapped to a specific
host. I don't see your command line here; could you send it along too?
My initial guess is that mpirun is running on node0023, and that we
then mapped procs local to mpirun such that we exceeded LSF's slot
allocation on that node. We don't account for mpirun taking a process
slot in our mapping, but LSF does, hence the error. At least, I think
so.
You could test this by adding --nolocal to your command line. This
forces mpirun to map all procs onto other nodes. If my analysis is
correct, the job should then run.
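
To make that guess concrete, here is a toy model of the slot
accounting. It is only a sketch of the hypothesis, not Open MPI's
actual mapper or anything LSF really runs. The slot counts, the second
node name (node0024), and the function names are all made up for
illustration; the only thing taken from your report is node0023.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    slots: int  # slots the scheduler granted on this node (hypothetical values)


def map_procs(nodes, nprocs, launcher_host, nolocal=False):
    """Fill nodes in order, one proc per slot. Returns {node name: proc count}.

    With nolocal=True the node hosting the launcher is skipped, which is the
    behaviour the --nolocal option is meant to give.
    """
    placement = {n.name: 0 for n in nodes}
    remaining = nprocs
    for n in nodes:
        if nolocal and n.name == launcher_host:
            continue
        take = min(n.slots, remaining)
        placement[n.name] = take
        remaining -= take
    if remaining:
        raise ValueError("not enough slots for %d procs" % nprocs)
    return placement


def oversubscribed(nodes, placement, launcher_host):
    """Nodes where the scheduler would see more tasks than slots.

    ASSUMPTION (the guess in this mail): the scheduler also counts the
    launcher itself as one task on its host, while the mapper does not.
    """
    over = []
    for n in nodes:
        used = placement[n.name] + (1 if n.name == launcher_host else 0)
        if used > n.slots:
            over.append(n.name)
    return over


# Hypothetical allocation: two nodes with 4 slots each, launcher on node0023.
nodes = [Node("node0023", 4), Node("node0024", 4)]

default_map = map_procs(nodes, 4, "node0023")
print(oversubscribed(nodes, default_map, "node0023"))  # ['node0023']

nolocal_map = map_procs(nodes, 4, "node0023", nolocal=True)
print(oversubscribed(nodes, nolocal_map, "node0023"))  # []
```

In the default case the model puts 4 procs plus mpirun on a 4-slot
node, so the scheduler's count comes to 5. With nolocal, the procs
land on the other node and nothing exceeds its allocation. If the real
job behaves the same way, that would support the guess.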
Ralph
On Feb 20, 2009, at 6:46 AM, Gabriele Fatigati wrote:
Dear OpenMPI developers,
I'm running my MPI code, compiled with OpenMPI 1.3, over InfiniBand
with the LSF scheduler, but I get the attached error. I suspect that
process spawning doesn't work correctly. The same program works well
under OpenMPI 1.2.5. Could you help me?
Thanks in advance.
--
Ing. Gabriele Fatigati
Parallel programmer
CINECA Systems & Technologies Department
Supercomputing Group
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati [AT] cineca.it
<job.196571.err>