What OMPI version are you using?

On Feb 20, 2014, at 7:56 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:

> Hello!
>
> I am having problem using MPI_Comm_spawn under torque. It doesn't work when
> spawning more than 12 processes on various nodes. To be more precise,
> "sometimes" it works, and "sometimes" it doesn't!
>
> Here is my case. I obtain 5 nodes, 3 cores per node and my $PBS_NODEFILE
> looks like below.
>
> node1
> node1
> node1
> node2
> node2
> node2
> node3
> node3
> node3
> node4
> node4
> node4
> node5
> node5
> node5
>
> I started a hello program (which just spawns itself and of course, the
> children don't spawn), with
>
> mpiexec -np 3 ./hello
>
> Spawning 3 more processes (on node 2) - works!
> spawning 6 more processes (node 2 and 3) - works!
> spawning 9 processes (node 2,3,4) - "sometimes" OK, "sometimes" not!
> spawning 12 processes (node 2,3,4,5) - "mostly" not!
>
> I ideally want to spawn about 32 processes with large number of nodes, but
> this is at the moment impossible. I have attached my hello program to this
> email.
>
> I will be happy to provide any more info or verbose outputs if you could
> please tell me what exactly you would like to see.
>
> Best,
> Suraj
>
> <hello.c>
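
For anyone following the slot arithmetic in the report above, here is a rough sketch in plain C (no MPI calls) that collapses a nodefile listing into per-node slot counts and works out which node the last spawned process would land on. It assumes slots are filled in nodefile order, which matches the placements described in the message (3 extra processes on node 2, 6 on nodes 2 and 3, and so on), but is not meant as a description of how the Open MPI mapper behaves in general. The names count_slots, total_slots and last_node_used are made up for this illustration and are not taken from the attached hello.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NODES 64
#define NAME_LEN  64

/* One entry per distinct host, with the number of slots it contributes. */
struct node {
    char name[NAME_LEN];
    int  slots;
};

/* Collapse nodefile lines (one hostname per slot) into per-node counts.
 * Nodes are kept in order of first appearance.
 * Returns the number of distinct nodes found. */
static int count_slots(const char *lines[], int nlines, struct node nodes[])
{
    int nnodes = 0;
    for (int i = 0; i < nlines; i++) {
        int j;
        for (j = 0; j < nnodes; j++)
            if (strcmp(nodes[j].name, lines[i]) == 0)
                break;
        if (j == nnodes) {                  /* first time this host is seen */
            assert(nnodes < MAX_NODES);
            snprintf(nodes[j].name, NAME_LEN, "%s", lines[i]);
            nodes[j].slots = 0;
            nnodes++;
        }
        nodes[j].slots++;
    }
    return nnodes;
}

/* Total number of slots in the allocation. */
static int total_slots(const struct node nodes[], int nnodes)
{
    int sum = 0;
    for (int i = 0; i < nnodes; i++)
        sum += nodes[i].slots;
    return sum;
}

/* ASSUMPTION: slots are filled sequentially in nodefile order.
 * With `used` slots already taken and `extra` more requested, return the
 * index of the last node touched, or -1 if the request does not fit. */
static int last_node_used(const struct node nodes[], int nnodes,
                          int used, int extra)
{
    int need = used + extra;
    for (int i = 0; i < nnodes; i++) {
        need -= nodes[i].slots;
        if (need <= 0)
            return i;
    }
    return -1;
}
```

With the 15-line nodefile from the message, count_slots gives 5 nodes with 3 slots each. With 3 slots taken by the initial mpiexec -np 3, last_node_used returns index 4 (node5) for 12 extra processes and -1 for 13, so a spawn of 12 uses exactly the remaining slots in the allocation.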