We're using Grid Engine 6.2u3 here.  I recently added a per-job
consumable (mic) to try to limit access to a co-processor.  What I'm
finding is that the consumable value doesn't seem to be checked.

Jobs requesting the consumable still try to start even when, according
to qhost -F mic, none of the consumable is left.  The reported amount
of the resource available even drops below 0 for a short while as the
job tries to start.  Fortunately I have a prolog script that uses lock
files to allocate specific co-processors (a fake RSMAP), and this has
the side effect of bouncing the extra jobs back into the queue, so the
impact is mostly limited to a lot of jobs flipping between Rr and Rq
states.  I can probably minimise this by converting the resource to a
non-consumable load sensor that looks for my lock files to calculate
availability of the co-processors, but it looks like per-job
consumables don't work in any useful way on our cluster.
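
The load sensor workaround could look roughly like the sketch below.
It is only an illustration: the lock directory, the mic<N>.lock naming
and the number of co-processors per host are hypothetical stand-ins for
whatever the prolog really does, and the host name may need to match
the name Grid Engine uses for the execution host.

```python
#!/usr/bin/env python3
"""Sketch of a load sensor reporting free co-processors.

Assumptions (not taken from the real setup): the prolog creates one
lock file per allocated co-processor in LOCK_DIR, named mic<N>.lock,
and every host has TOTAL_MICS co-processors.
"""
import glob
import os
import socket
import sys

LOCK_DIR = "/var/lock/mic"   # hypothetical lock file directory
TOTAL_MICS = 2               # hypothetical co-processors per host


def free_mics(lock_dir=LOCK_DIR, total=TOTAL_MICS):
    """Return how many co-processors are not claimed by a lock file."""
    locked = len(glob.glob(os.path.join(lock_dir, "mic*.lock")))
    return max(total - locked, 0)


def report(host, value, name="mic"):
    """Format one load report as begin / host:name:value / end."""
    return "begin\n{}:{}:{}\nend\n".format(host, name, value)


def main():
    host = socket.gethostname()
    # The execution daemon writes a line to stdin whenever it wants a
    # load report and sends "quit" when the sensor should exit.
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        sys.stdout.write(report(host, free_mics()))
        sys.stdout.flush()

# When deployed as the sensor script, call main() here.
```

Jobs would then request the value with something like -l mic=1 against
a non-consumable complex, with the prolog lock files remaining the real
allocation mechanism.  Two jobs dispatched within the same load report
interval could still both see the old value, so the prolog bounce would
still be needed as a backstop.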

Is this a known issue?  Is it fixed in later versions?

William

 
