This is a problem that has plagued me and various other people time and
again, and every time it gets fixed, the method and cause seem to get
lost in the aether.  I've hit it several times over the years.

I do seem to recall that the message about the PE is basically irrelevant,
and entirely misleading.

Here is the problem:

        $ qrsh -clear -w v -pe make-dedicated 4 -b y /bin/true
        Job 1953957 cannot run in PE "make-dedicated" because it only offers 0 slots
        verification: no suitable queues

Yet:

        $ qrsh -clear -w v  -b y /bin/true
        verification: found suitable queue(s)

The problem only appears with parallel jobs.  Serial jobs are not
affected.

I am running SGE 6.2u5.

The PE is attached to various queues.  The PE has sufficient slots, and
the various exec nodes are sufficiently idle to take a trivial 4-slot
job.  The PE definition:

        $ qconf -sp make-dedicated
        pe_name            make-dedicated
        slots              5000
        user_lists         NONE
        xuser_lists        NONE
        start_proc_args    /bin/true
        stop_proc_args     /bin/true
        allocation_rule    $pe_slots
        control_slaves     FALSE
        job_is_first_task  TRUE
        urgency_slots      max
        accounting_summary TRUE

All exec hosts have 8 or more CPUs and slots.


The problem turned out to be a typo in a complex name that had been
assigned to the queues.
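
For anyone chasing the same symptom, here is a minimal sketch of one way
to look for such a typo: compare the names used in each queue's
complex_values against the names and shortcuts listed by "qconf -sc".
The function names are made up for illustration.  It assumes the simple
one-line "name=value,..." form and does not handle host-specific
[host=...] overrides or backslash-continued lines.  It also only catches
names that are not defined at all; a typo that happens to match a
different, valid complex would slip through.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SGE.  They read qconf output on stdin,
# so they can be tried against saved output as well as a live cluster.

# Read `qconf -sc` output; print every complex name and shortcut.
known_complexes() {
    awk '!/^#/ && NF >= 2 { print $1; print $2 }'
}

# Read `qconf -sq <queue>` output; print the names used in complex_values.
used_complexes() {
    awk '$1 == "complex_values" && $2 != "NONE" {
        n = split($2, kv, ",")
        for (i = 1; i <= n; i++) { split(kv[i], p, "="); print p[1] }
    }'
}

# unknown_complexes FILE: FILE holds the known names, one per line;
# stdin is a queue configuration.  Prints names that are not in FILE.
unknown_complexes() {
    used_complexes | grep -vxF -f "$1"
}

# Possible use on a live cluster:
#   qconf -sc | known_complexes > /tmp/known_complexes
#   for q in $(qconf -sql); do
#       qconf -sq "$q" | unknown_complexes /tmp/known_complexes | sed "s/^/$q: /"
#   done
```

Any name this prints is worth a second look, as is any complex that is
attached to the queues but was meant to be a different one.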

Hopefully this will help someone, somewhere, at some point.

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
:(){ :&:};:
_______________________________________________
users mailing list
https://gridengine.org/mailman/listinfo/users