Hi all, I'm having a problem running MPI jobs with Torque.

The problem is that if I submit this script:

===========================================================
#!/bin/sh
#PBS -S /bin/sh
#PBS -m e
#PBS -l cput=500:00:00
#PBS -l nodes=3:ppn=2
cd /home/salvator
time mpiexec -boot -v -n 6 mpiblast -p blastx -d nr -i frag.0 -o frag.0.out6p_pbs
===========================================================


All 6 processes are run on a single node (each node has 2 processors) instead of being split across the 3 nodes. After some feedback from the Torque mailing list, it seems the answer is that several versions of mpiblast and mpiexec exist, and not all of them can automatically retrieve the node list from PBS; but if I understood correctly, Oscar 4 should ship the right version. If that is the case, why doesn't it work? And if I need to recompile the correct version of mpiexec, the README file says that I must also patch the Torque sources, but as far as I know the version of Torque that comes with Oscar is modified, and I don't know whether I should patch it or not.
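
To separate a Torque allocation problem from an mpiexec problem, a check like the one below could be added to the job script before the mpiexec line. It is only a sketch: it assumes Torque exports the standard PBS_NODEFILE variable inside the job (one hostname per line for each allocated processor), and the function name summarize_nodes is made up for illustration.

```shell
#!/bin/sh
# Print how many processor slots Torque assigned on each host.
# Assumption: PBS_NODEFILE points to a file listing one hostname per
# allocated processor (the usual Torque/PBS behaviour inside a job).

# summarize_nodes is a hypothetical helper name, not part of Torque.
summarize_nodes() {
    # $1: path to a node file; output is "hostname: N slot(s)" per host
    sort "$1" | uniq -c | awk '{ print $2 ": " $1 " slot(s)" }'
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodes "$PBS_NODEFILE"
else
    echo "PBS_NODEFILE is not set or not readable (not inside a Torque job?)" >&2
fi
```

With nodes=3:ppn=2 I would expect three hostnames with 2 slots each. If that is what gets printed, Torque is allocating the nodes correctly and the problem lies in how mpiexec starts the processes.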

What must I do to make Torque work correctly? Please, it's really important for me to be able to run MPI jobs without starting lamboot manually and passing it a static node list (a static list forces the job onto those nodes even if Torque has already reserved them for other jobs, and Torque will not mark them as in use).

Salvatore Di Nardo
