>>>>> "Steven" == Steven Tucker <[email protected]> writes:

Steven> Hi all, got a problem with my cluster using OpenMPI + Torque +
Steven> Maui.

Steven> I can submit 50 different jobs (single process) and the
Steven> batching system will run all 50 in parallel, but I can't get an
Steven> MPI job to run on more than 1 node. I assumed it must be my
Steven> pbs script, but I have tried just about every config I can
Steven> find/think of and still no luck.

I haven't used Torque, but if it's anything like NQS, you need a
separate batch queue that's configured with the nodes you want to be
able to use.  Also, there are typically different prologue and epilogue
scripts (differently named files) for parallel as opposed to
single-node jobs.


We used to have to do something like
   qmgr -c "set queue batch16 resources_max.nodect=16"
to allow jobs submitted to the queue `batch16' to use up to 16 nodes,
for instance.   It's been fifteen years since I used NQS so my memory
may be faulty.  And of course, Torque has its own command set
(although I believe it's based on NQS).
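
As a rough sketch only, and assuming Torque honours the usual PBS
directives, a multi-node job script might look something like the one
below.  The queue name batch16 is the one from my example above; the
job name, node count, walltime and ./my_mpi_program are placeholders I
have made up for illustration.  The snippet only writes the job script
to a file, it does not submit anything.

```shell
#!/bin/sh
# Write a hypothetical Torque job script to mpi_job.pbs.
# Nothing is submitted here; all names and sizes are placeholders.
cat > mpi_job.pbs <<'EOF'
#!/bin/sh
#PBS -N mpi_test
#PBS -q batch16
#PBS -l nodes=4:ppn=1
#PBS -l walltime=00:10:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# One line per allocated slot is listed in $PBS_NODEFILE.
NP=$(wc -l < "$PBS_NODEFILE")

# Assumes mpirun picks up the Torque allocation by itself.
mpirun -np "$NP" ./my_mpi_program
EOF

echo "wrote mpi_job.pbs"
```

You would then submit it with "qsub mpi_job.pbs".  If your Open MPI was
not built with Torque support, mpirun probably won't see the
allocation, and you may need to pass it the node list explicitly, e.g.
with -machinefile "$PBS_NODEFILE".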
--
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
http://www.ertos.nicta.com.au           ERTOS within National ICT Australia
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
