This is a problem that has plagued me and various other people time and
again, and every time it gets fixed, the method and cause seem to get
lost in the aether. I've hit it several times over the years, and each
time the problem and solution seem to vanish...
I do seem to recall that the message about the PE is basically irrelevant,
and entirely misleading.
Here is the problem:
$ qrsh -clear -w v -pe make-dedicated 4 -b y /bin/true
Job 1953957 cannot run in PE "make-dedicated" because it only offers 0 slots
verification: no suitable queues
Yet:
$ qrsh -clear -w v -b y /bin/true
verification: found suitable queue(s)
The problem only appears with parallel jobs. Serial jobs are not
affected.
I am running SGE 6.2u5.
The PE is attached to various queues. The PE has sufficient slots, and
the various exec nodes are sufficiently idle to take a trivial 4-slot
job. The PE definition:
$ qconf -sp make-dedicated
pe_name make-dedicated
slots 5000
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $pe_slots
control_slaves FALSE
job_is_first_task TRUE
urgency_slots max
accounting_summary TRUE
All exec hosts have 8 or more CPUs and slots.
The problem turned out to be a typo in a complex name that had been
assigned to the queues.
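
For anyone hunting a similar mismatch, here is a minimal sketch of one way
to look for it. It is not necessarily how the typo above was found. It
compares the names in a queue's complex_values line against the names and
shortcuts listed by "qconf -sc". The function name find_unknown_complexes
and the temp file paths are made up for illustration, and it assumes the
complex_values list sits on a single line (no backslash continuation).

```shell
# Print every complex name used in a queue's complex_values that is not
# defined in the complex list.
#   $1: file holding the output of `qconf -sc`
#   $2: file holding the output of `qconf -sq <queue>`
find_unknown_complexes() {
    sc_file=$1
    sq_file=$2

    # Known names: the name and shortcut columns of each non-comment line.
    known=$(awk '!/^#/ && NF { print $1; print $2 }' "$sc_file")

    # Assigned names: the keys of the "name=value,name=value" list.
    awk '$1 == "complex_values" { print $2 }' "$sq_file" |
        tr ',' '\n' |
        cut -d= -f1 |
        while read -r name; do
            # Skip blanks and the NONE placeholder.
            if [ -z "$name" ] || [ "$name" = "NONE" ]; then
                continue
            fi
            # Report the name unless it matches a known name exactly.
            printf '%s\n' "$known" | grep -qxF -- "$name" || echo "$name"
        done
}

# Possible usage on a live cluster (hypothetical file names):
#   qconf -sc > /tmp/sc.txt
#   for q in $(qconf -sql); do
#       qconf -sq "$q" > /tmp/sq.txt
#       find_unknown_complexes /tmp/sc.txt /tmp/sq.txt | sed "s/^/$q: /"
#   done
```

Any name it prints is worth a second look. The same idea should extend to
the complex_values of exec hosts (qconf -se), though I have not tried that
here.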
Hopefully this will help someone, somewhere, at some point.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
:(){ :&:};: