Hi John,

I'm afraid the straightforward approach you're trying isn't going to work with Open MPI in its current implementation. I had plans for supporting this kind of operation, but that hasn't happened. And as you discovered, you cannot run mpiexec/mpirun in the background, and the "do-not-wait" option doesn't work (it may even be disabled by now, depending on which version you are using).

Your best bet would be to put a call in your first executable to "spawn" the second executable. You don't need to do this via MPI - you can do it directly from a non-MPI program by calling the appropriate RTE function. Several OpenRTE (the RTE underneath Open MPI) users do this regularly, myself included. I don't know what version you are using, but assuming it is 1.2 or the "trunk", you will find an example of this in a test program at orte/test/system/orte_spawn.c.
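If you'd rather stay at the MPI level instead of calling into ORTE directly, the equivalent would be MPI_Comm_spawn from the parent. Roughly along these lines - this is an untested sketch, "./worker" is just a placeholder for your second executable, and it assumes the parent itself is started with mpiexec -np 1:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm child;

    MPI_Init(&argc, &argv);

    /* Launch one copy of the second executable ("./worker" is a
       placeholder); MPI_ARGV_NULL means no extra command-line
       arguments are passed to it. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);

    /* ...do the parent's share of the work here... */

    MPI_Finalize();
    return 0;
}

The spawned process should be started within the resources already allocated to your job, so both processes end up on the nodes PBS assigned to you.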
I can provide advice/details on how to make this work, if needed (probably best done off-list, or use the OpenRTE mailing lists - see http://www.open-rte.org).

Ralph

On 4/23/07 11:18 PM, "John Borchardt" <john.borcha...@gmail.com> wrote:

> Greetings,
>
> I was hoping someone could help me with the following situation. I have a
> program which has no MPI support that I'd like to run "in parallel" by
> running a portion of my total task on N CPUs of a PBS/Maui/Open-MPI
> cluster. (The algorithm is such that there is no real need for MPI; I am
> just as well off running N processes on N CPUs as I would be adding MPI
> support to my program and then running on N CPUs.)
>
> So it's easy enough to set up a Perl script to submit N jobs to the queue
> to run on N nodes. But my cluster has two CPUs per node, and I am not
> RAM-limited, so I'd like to run two serial jobs per node, one on each
> node CPU. From what my admin tells me, I must use the mpiexec command to
> run my program so that the scheduler knows to run my program on the nodes
> which it has assigned to me.
>
> In my PBS script (this is one of N/2 similar scripts),
>
> #!/bin/bash
> #PBS -l nodes=1:ppn=2
> #PBS -l walltime=1:00:00:00
> mpiexec -pernode program-executable < inputfile1 > outputfile1
> mpiexec -pernode program-executable < inputfile2 > outputfile2
>
> does not have the desired effect. It appears that (1) the second process
> waits for the first to finish, and (2) MPI or the scheduler (I can't tell
> which) tries to restart the program a few times (you can see this in the
> output files). Adding an ampersand to the first mpiexec line appears to
> cause mpiexec to crash, and the job does not run at all. Using:
>
> mpiexec -np 1 program-executable < inputfile > outputfile
>
> avoids the strange restart problem I mentioned above, but of course does
> not use both CPUs on a node.
>
> Maybe I am making a simple mistake, but I am quite new to cluster
> computing... Any help you can offer is greatly appreciated!
>
> Thanks,
>
> --John Borchardt
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users