On Wed, Sep 01, 1999 at 09:13:28PM -0300, Enzo A. Dari wrote:
>
> Now I'm trying to submit parallel jobs using MPI 1.1.2 and PVM 3.4.0.
> Both of them are installed from source code (in /usr/local, shared
> by all nodes).
>

What you do is specify the number of nodes you want to run on (as a hard
or soft requirement), and then read those nodes out of a file supplied
by DQS. The script itself runs only on the master node.

    qsub -l linux,qty.eq.3
    env
    ^D

    output file
    -----------
    HOSTNAME=bakunin.anu.edu.au
    HOSTS_FILE=/home/dld/STDIN.hosts5849.20215
    NUM_HOSTS=3

    STDIN.hosts*
    ------------
    bakunin.anu.edu.au
    freki.anu.edu.au
    jabez.anu.edu.au

Some libraries (PVM) require that some initialization be done first.
There is support in DQS to do this (-par PVM), but it doesn't really
work. The PVM model requires that a master pvm daemon be started on one
node in the virtual machine, which then starts slave pvmds as hosts are
added. The problem is that a particular node may only be in one virtual
machine at a time (per user). Multiple jobs dispersed across SMP
machines will have trouble whether they take down the virtual machine
or leave it up: two virtual machines can't be merged, and any job on a
virtual machine that is taken down dies. It's also difficult to
communicate to a large virtual machine that the queueing system has
only allocated you these 3 nodes, not the whole thing.

MPI probably doesn't have these difficulties; you'd just
"mpirun `cat $HOSTS_FILE` program" (untested, I don't use MPI at the
moment).

-Drake
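
For the MPI case, here is a minimal sketch of how a job script might turn the DQS-supplied variables into a launch command. It is as untested against a real cluster as the one-liner above. It assumes HOSTS_FILE lists one hostname per line and NUM_HOSTS is the node count, as in the env output shown earlier. The helper name build_mpi_cmd is made up for illustration, and -np / -machinefile are the MPICH-style mpirun options, so another MPI implementation may need something different. The function only prints the command, so you can check it before running anything.

```shell
#!/bin/sh
# Sketch only: build an MPI launch command from the host list DQS hands
# to the job. build_mpi_cmd is a hypothetical helper, not part of DQS or MPI.
build_mpi_cmd() {
    hosts_file=$1      # path taken from $HOSTS_FILE
    num_hosts=$2       # count taken from $NUM_HOSTS
    shift 2            # remaining arguments: the program and its options

    if [ ! -r "$hosts_file" ]; then
        echo "cannot read hosts file: $hosts_file" >&2
        return 1
    fi

    # Assumption: MPICH-style options. Other mpirun versions may differ.
    echo "mpirun -np $num_hosts -machinefile $hosts_file $*"
}

# Inside the job script one might then run:
#   eval "$(build_mpi_cmd "$HOSTS_FILE" "$NUM_HOSTS" ./program)"
```

Passing the file via -machinefile rather than expanding it with `cat` keeps the hostnames from being parsed as program arguments, which is a guess at what the launcher expects rather than something verified here.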

