This would actually be a bit more involved. When this check is done, the nodes haven't been assigned to the job yet, so we would have to pull this logic into the select plugin as well to pick the correct nodes a job could use without going over the limit. IMHO that is far more complexity and overhead than the return on investment would justify.
Danny

On 09/12/12 09:53, Moe Jette wrote:
> The code currently increments and decrements a counter when jobs start
> and end. It would be possible to track the specific nodes allocated to
> each group and avoid counting nodes twice, but that would require code
> changes and higher overhead than simply incrementing and decrementing
> a counter.
>
> Quoting Evren Yurtesen IB <[email protected]>:
>
>> On Tue, 11 Sep 2012, Danny Auble wrote:
>>
>>> On 09/11/12 05:59, Evren Yurtesen IB wrote:
>>>> Hello,
>>>>
>>>> We have a cluster with 12 cores on each node. I made a QOS entry
>>>> with GrpNodes 4 (I don't want this group of users to be able to use
>>>> more than 4 nodes).
>>>>
>>>> If somebody queues tasks running (2 jobs on the same node):
>>>>
>>>> job 1 - node 1
>>>> job 2 - node 1
>>>> job 3 - node 2
>>>> job 4 - node 2
>>>>
>>>> It looks like slurm is thinking 4 nodes are used? Because I see
>>>> the next task queued in the system shows Nodes 1 and pending due
>>>> to QOSResourceLimit. In my opinion, 2 nodes are used? :)
>>>>
>>>> Could it be that it counts the same nodes again because they are in
>>>> different jobs? (v2.4.2 is used on this system)
>>>>
>>> Yes
>> Well, is there a way to make it count each node once? :)
>>
>> Thanks,
>> Evren
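For anyone following along, the difference between the two accounting schemes discussed above can be illustrated with a minimal sketch. This is hypothetical pseudocode, not actual Slurm internals: the per-job counter adds each job's node count (so a node shared by two jobs is counted twice, as in v2.4.2), while tracking the specific nodes per group takes the union and counts each node once, at the cost of extra bookkeeping.

```python
def group_nodes_counter(jobs):
    """Counter scheme: add each job's node count on start, subtract
    on end. A node shared by two jobs is counted once per job."""
    used = 0
    for nodes in jobs:
        used += len(nodes)
    return used

def group_nodes_set(jobs):
    """Node-tracking scheme: keep the set of nodes allocated to the
    group, so each node counts once regardless of how many jobs use it."""
    allocated = set()
    for nodes in jobs:
        allocated |= set(nodes)
    return len(allocated)

# The four jobs from the original report: two jobs each on node1 and node2.
jobs = [{"node1"}, {"node1"}, {"node2"}, {"node2"}]
print(group_nodes_counter(jobs))  # 4 -> job 5 pends against GrpNodes=4
print(group_nodes_set(jobs))      # 2 -> job 5 would still be allowed
```

The counter reaches the GrpNodes limit of 4 even though only 2 distinct nodes are in use, which matches the pending-due-to-QOSResourceLimit behavior Evren observed.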
