Hi,
last month there was a thread about a problem with jobs requesting
multiple nodes:
http://www.supercluster.org/pipermail/mauiusers/2011-March/004608.html
I just found that in our setup there is also a problem with this:
Maui version: 3.2.6p21
Torque version: 2.4.11
Our cluster has 16 nodes, each with 2 CPUs, and 3 nodes, each with 16 CPUs.
Ideally I would like to submit a job to the whole cluster, and if there
are no other jobs running, then the following does work OK:
[angelv@diodo ~]$ qsub -lnodes=16:ppn=2:cpus2+3:ppn=16:cpus16 runparallel.sh
But if the cluster is busy (as it is right now), then some multi-node
jobs go to the Deferred state instead of the Idle state, and never
get executed. For example:
A request for a 3-CPU job is accepted, and it goes into the Idle queue
(though wrongly reporting that it requires only 1 CPU):
[angelv@diodo ~]$ qsub -lnodes=1:ppn=1:cpus2+1:ppn=2:cpus16 runparallel.sh
88313.diodo.ll.iac.es
[angelv@diodo ~]$ showq -i | grep angelv
88313 4000 1.0 - angelv angelv 1 1:00:00 default Mon Apr 11 11:49:30
[angelv@diodo ~]$ showq | grep angelv
88313 angelv Idle 1 1:00:00 Mon Apr 11 11:49:30
If I ask for 17 CPUs, then the job goes into the "Deferred" state:
[angelv@diodo ~]$ qsub -lnodes=1:ppn=1:cpus2+1:ppn=16:cpus16 runparallel.sh
88314.diodo.ll.iac.es
[angelv@diodo ~]$ showq | grep angelv
88313 angelv Idle 1 1:00:00 Mon Apr 11 11:49:30
88314 angelv Deferred 1 1:00:00 Mon Apr 11 11:50:07
But there are plenty of cpus16 resources (though busy right now), and I
can submit a job requesting 48 CPUs without issues, as long as it uses a
single node specification rather than a multi-node one:
[angelv@diodo ~]$ qsub -lnodes=3:ppn=16:cpus16 runparallel.sh
88315.diodo.ll.iac.es
[angelv@diodo ~]$ showq | grep angelv
88315 angelv Idle 48 1:00:00 Mon Apr 11 11:50:24
88313 angelv Idle 1 1:00:00 Mon Apr 11 11:49:30
88314 angelv Deferred 1 1:00:00 Mon Apr 11 11:50:07
[angelv@diodo ~]$
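To make the expected processor counts explicit, here is a minimal Python sketch that adds up count x ppn over the '+'-separated parts of a nodes specification. The helper name total_procs is hypothetical (it is not part of Torque or Maui), and the sketch assumes every part has the count:ppn=N:property form used in this message, with ppn defaulting to 1 when omitted.

```python
def total_procs(nodes_spec: str) -> int:
    """Sum count * ppn over the '+'-separated parts of a nodes spec.

    Assumes each part starts with an integer node count (as in the
    requests above); parts that start with a hostname are not handled.
    """
    total = 0
    for part in nodes_spec.split("+"):
        fields = part.split(":")
        count = int(fields[0])   # leading field: number of nodes
        ppn = 1                  # processors per node when ppn is omitted
        for field in fields[1:]:
            if field.startswith("ppn="):
                ppn = int(field[len("ppn="):])
            # other fields (cpus2, cpus16) are node properties; ignored here
        total += count * ppn
    return total


if __name__ == "__main__":
    specs = [
        "1:ppn=1:cpus2+1:ppn=2:cpus16",    # job 88313
        "1:ppn=1:cpus2+1:ppn=16:cpus16",   # job 88314
        "3:ppn=16:cpus16",                 # job 88315
        "16:ppn=2:cpus2+3:ppn=16:cpus16",  # whole cluster
    ]
    for spec in specs:
        print(f"nodes={spec}: {total_procs(spec)} processors")
```

For the four requests above this gives 3, 17, 48 and 80 processors, whereas showq reports 1 for both 88313 and 88314 and 48 for 88315. So the count looks wrong only for the requests that contain a '+'.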
Any ideas?
Thanks,
Ángel de Vicente
--
http://www.iac.es/galeria/angelv/
High Performance Computing Support PostDoc
Instituto de Astrofísica de Canarias