Dear all,
I have installed a two-node Rocks cluster with SGE for a course. Each
node has 4 cores.
I shut down one node, so only 4 cores are available:
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          -NA-     linux-x64     au
But I have run into a strange issue that I cannot explain yet:
my user submits a parallel job requesting 8 cores.
When I check the job, which stays in the "qw" state, I get this message:
$ qstat -j 58
../..
scheduling info:            queue instance "all.q@compute-0-1.local" dropped because it is temporarily not available
                            cannot run in PE "orte" because it only offers 7 slots
If I power the second node back on, the message is the same:
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          0.10     linux-x64
$ qstat -j 58
../..
parallel environment: orte range: 8
version: 3
scheduling info:            cannot run in PE "orte" because it only offers 7 slots
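To make the slot arithmetic explicit, here is a small Python sketch that tallies the free slots from the two `qstat -f` listings above. It is only an illustration: the function name `free_slots` is hypothetical, and it assumes that a queue instance is usable only when its states column is empty. It shows what I would expect the scheduler to see (4, then 8); it does not explain where the 7 comes from.

```python
import re

# Listings copied from the `qstat -f` output in this message.
QSTAT_ONE_NODE_DOWN = """\
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          -NA-     linux-x64     au
"""

QSTAT_BOTH_NODES_UP = """\
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          0.10     linux-x64
"""

# One queue-instance row: name, qtype, resv/used/tot, load_avg, arch, optional states.
ROW = re.compile(
    r"^(\S+@\S+)\s+(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)\s+(\S+)(?:\s+(\S+))?\s*$"
)


def free_slots(qstat_text):
    """Sum free slots over queue instances whose states column is empty.

    Assumption (mine, not taken from the SGE documentation): any non-empty
    state such as "au" makes the instance unusable for scheduling.
    """
    total = 0
    for line in qstat_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header or separator line
        resv, used, tot = (int(match.group(i)) for i in (3, 4, 5))
        states = match.group(8)
        if states:
            continue  # instance is flagged, skip it
        total += tot - resv - used
    return total


print(free_slots(QSTAT_ONE_NODE_DOWN))   # 4
print(free_slots(QSTAT_BOTH_NODES_UP))   # 8
```

With both nodes up the tally is 8, which matches the "range: 8" request of the job, so the 7 reported for PE "orte" must come from somewhere other than the queue instances themselves.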
I have searched through the whole SGE configuration, and I have also
reinstalled both nodes. The same message still appears, saying that
only 7 slots are offered!
Can someone help me?
Regards
--
-- Jérôme
Nobody has ever seen a blind man in a nudist camp.
(Woody Allen)
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users