We're running Grid Engine 6.2u3 here. I recently added a per-job consumable (mic) to try to limit access to a co-processor. What I'm finding is that the consumable's value doesn't appear to be checked.

Jobs requesting the consumable still try to start even when, according to qhost -F mic, none of it is left. The amount available (again as reported by qhost -F mic) even drops below 0 for a short while as the job tries to start.

Fortunately I have a prolog script that uses lock files to allocate specific co-processors (a fake RSMAP), and this has the side effect of bouncing the extra jobs back into the queue, so the impact is mostly limited to a lot of jobs flipping between the Rr and Rq states.

I can probably minimise this by converting the resource to a non-consumable load sensor that looks at my lock files to calculate how many co-processors are available, but it looks like per-job consumables don't work in any useful way on our cluster.

Is this a known issue? Is it fixed in later versions?

William
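
For reference, a rough sketch of the load sensor workaround mentioned above, assuming the usual load sensor protocol (the execd writes a line to the sensor's stdin each load interval and "quit" on shutdown, and the sensor answers with a begin / host:name:value / end block). The lock file location, the number of co-processors per host and the complex name mic_free are placeholders made up for illustration, not what is actually configured here.

```python
#!/usr/bin/env python3
"""Sketch of a Grid Engine load sensor reporting free co-processors."""
import glob
import socket
import sys

TOTAL_MICS = 2                          # assumption: co-processors per host
LOCK_GLOB = "/var/lock/mic/mic*.lock"   # assumption: one lock file per allocated card
RESOURCE = "mic_free"                   # assumption: hypothetical non-consumable complex


def free_mics(total=TOTAL_MICS, lock_glob=LOCK_GLOB):
    """Return how many co-processors are not claimed by a lock file."""
    return max(total - len(glob.glob(lock_glob)), 0)


def report(host, value, resource=RESOURCE):
    """Format one load report as a begin / host:name:value / end block."""
    return "begin\n{}:{}:{}\nend\n".format(host, resource, value)


def main():
    # Assumption: this matches the host name Grid Engine uses for the node.
    host = socket.gethostname()
    # One line arrives on stdin per load interval; "quit" means shut down.
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        sys.stdout.write(report(host, free_mics()))
        sys.stdout.flush()  # the execd reads from a pipe, so flush every report


if __name__ == "__main__":
    main()
```

Jobs would then request something like -l mic_free=1 against a non-consumable complex. Load values are only refreshed once per load report interval, so this would probably reduce the bouncing rather than eliminate it, and the prolog lock files would still be needed.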
