Can anyone shed any light on an error that is repeated thousands of times in 
my Grid Engine messages log?  It appears when a submitted job is prevented 
from running by an RQS rule I have set up.  The error is:

07/14/2017 09:27:08|schedu|rocks1|C|not a single host excluded in rqs_excluded_hosts()
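
To measure how quickly the message accumulates, a minimal Python sketch along these lines could tally the messages log. It assumes every entry sits on one line with the five pipe-separated fields seen above (timestamp, component, host, level, text). The function name count_messages and the second sample timestamp are made up for illustration.

```python
from collections import Counter


def count_messages(lines):
    """Count log entries per (component, level, text), ignoring timestamps."""
    counts = Counter()
    for line in lines:
        # Split into at most five fields so any "|" inside the text survives.
        parts = line.rstrip("\n").split("|", 4)
        if len(parts) != 5:
            continue  # skip lines that do not match the assumed layout
        _timestamp, component, _host, level, text = parts
        counts[(component, level, text)] += 1
    return counts


# First line is the entry quoted above; the second timestamp is invented.
sample = [
    "07/14/2017 09:27:08|schedu|rocks1|C|not a single host excluded in rqs_excluded_hosts()",
    "07/14/2017 09:27:09|schedu|rocks1|C|not a single host excluded in rqs_excluded_hosts()",
]

for (component, level, text), n in count_messages(sample).most_common():
    print(n, component, level, text)
```

For a real log, passing an open file object instead of the sample list should work, since the function only iterates over lines.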

The RQS ruleset I have which triggers this looks like:

{
   name         per_user_slot_limit
   description  "limit the number of slots per user"
   enabled      TRUE
   limit        users {*} hosts {@interactive} to slots=8
   limit        users {andrewss} to slots=2
   limit        users {@bioinf} to slots=616
   limit        users {*} to slots=411
}

The rule seems to work: jobs are held and then started as expected.  A job 
that cannot yet be scheduled shows scheduling info like this:

scheduling info:            cannot run in queue instance "all.q@compute-1-6.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-5.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-7.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-0.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-3.local" because it is not of type batch
                            cannot run because it exceeds limit "andrewss/////" in rule "per_user_slot_limit/3"
                            cannot run in queue instance "all.q@compute-1-4.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-1.local" because it is not of type batch
                            cannot run in queue instance "all.q@compute-1-2.local" because it is not of type batch
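
When many jobs are queued, the quota-related reason is easy to miss among the "not of type batch" entries. The following Python sketch pulls it out of a scheduling info block like the one above. It assumes the text has been captured to a string (for example from `qstat -j <job_id>`) and that every reason starts with "cannot run". The helper names scheduling_reasons and limit_reasons are made up for illustration.

```python
import re


def scheduling_reasons(text):
    """Split a 'scheduling info' block into one reason string per entry."""
    body = text.split("scheduling info:", 1)[-1]
    flat = " ".join(body.split())  # undo line wrapping and indentation
    # Each reason runs from "cannot run" up to the next one or the end.
    return re.findall(r"cannot run .*?(?= cannot run |$)", flat)


def limit_reasons(text):
    """Keep only the entries that mention a resource quota limit."""
    return [r for r in scheduling_reasons(text) if "exceeds limit" in r]


# Shortened excerpt of the output above, wrapped as in the original mail.
sample = '''scheduling info:            cannot run in queue instance
"all.q@compute-1-3.local" because it is not of type batch
                            cannot run because it exceeds limit "andrewss/////"
in rule "per_user_slot_limit/3"
                            cannot run in queue instance
"all.q@compute-1-4.local" because it is not of type batch
'''

print(limit_reasons(sample))
```

Flattening the whitespace first means the same code handles both wrapped and unwrapped output.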

So the scheduler sees the rule and applies it correctly, but the spurious 
errors make my messages file grow quickly when there are a lot of queued 
jobs.

Can anyone suggest how to debug or fix this?  Searching for the specific error 
turns up nothing relevant beyond the library API it comes from.

This is using SGE-6.2u5p2-1.x86_64.

Thanks for any help you can offer!

Simon.


The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered 
Charity No. 1053902.
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
