Hi Stephen,

Rather than separate queues for different hardware resources, I would recommend using requestable complexes, unless you really need something that only a queue can provide (suspend configuration, resubmit behavior, etc.). You can attach urgency values to those complexes to ensure that jobs depending on a given resource get a high enough priority to run. Resource reservations can help too, although note that the overhead of scheduling reservations can be quite high if you have lots of running jobs.
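As a concrete sketch (the complex name, urgency value, and hostname below are just illustrative), a consumable GPU complex could look something like this:

  # Add a row to the complex list (qconf -mc opens it in an editor):
  #name  shortcut  type  relop  requestable  consumable  default  urgency
  gpu    gpu       INT   <=     YES          YES         0        1000

  # Advertise two GPUs on an execution host:
  qconf -aattr exechost complex_values gpu=2 node01

  # Jobs then request the resource at submission time:
  qsub -l gpu=1 myjob.sh

  # If the resource is scarce, turning on reservation for big requests
  # keeps them from being starved by a stream of smaller jobs:
  qsub -R y -l gpu=2 bigjob.sh

With a non-zero urgency weight in the scheduler configuration, each unit of "gpu" a job requests adds the complex's urgency value to its priority, so GPU jobs float above non-GPU jobs in the pending list without needing a dedicated queue.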
On Sun, Jan 18, 2015 at 12:20:36PM -0800, Stephen Spencer wrote:
> Chris (and all who've responded),
>
> Thank you for the responses. It's given me some directions to explore.
>
> At present, I have just the default queue on one cluster, and two queues -
> one for the machines with GPUs, and one for machines without GPUs - on the
> other cluster.
>
> The "fairshare-by-user" policy sounds very interesting; users typically
> only complain when one user is monopolizing the cluster, submitting
> hundreds of jobs and forcing everyone else to join the end of the line
> with their jobs.
>
> Best,
> Stephen
>
> On Fri, Jan 16, 2015 at 12:51 PM, Chris Dagdigian <[email protected]> wrote:
> >
> > Queues are just one piece of the puzzle when it comes to handling
> > resource allocation on a multi-user system. What (if any) scheduling
> > policies and resource quotas are you currently using?
> >
> > That said, you are using the queue methods in a good way. There are
> > certain things that can only really be done on a per-queue basis, and at
> > the top of the list would be ACL protection and the ability to impose
> > hard or soft wallclock limits.
> >
> > A fairshare-by-user policy with the queue structure you set up would be a
> > decent starting point from which you can gather more data and user
> > feedback.
> >
> > Thoughts:
> >
> > - A resource quota would perfectly handle the "only N jobs per user can
> > run in the long-job.q cluster queue" requirement.
> >
> > - I've had little success putting wallclock limits on interactive queues;
> > in many cases there are legitimate business/scientific reasons for a
> > long-running interactive session. You might want to poll the users or
> > collect data on this. In a few different environments I've had decent
> > success by leaving interactive queue slots unrestricted but putting a
> > resource quota around how many slots a single user can consume. It's also
> > pretty easy to set up tools that let you dynamically adjust the
> > size/count of the interactive slot pool to account for changing demand -
> > it's particularly easy when used with SGE hostgroup objects.
> >
> > My $.02
> >
> > Stephen Spencer <mailto:[email protected]> wrote on January 16, 2015
> > at 2:50 PM:
> >> Good morning.
> >>
> >> With the number of users on our clusters growing, it's becoming less
> >> realistic to say "play fair 'cause you're not the only user of the
> >> cluster."
> >>
> >> I'm looking for suggestions on setting up queues, both the "why" and the
> >> "how," that will allow more of our users access to the cluster.
> >>
> >> What I'm thinking of is a multi-queue approach:
> >>
> >> * some limited number of "interactive" slots (and they'd be
> >>   time-limited)
> >> * a queue for jobs with a short time duration - the "express" queue
> >> * a queue for jobs that will run longer... but only so many of these
> >>   per user
> >>
> >> Any and all suggestions are welcome.
> >>
> >> Thank you!
> >>
> >> Best,
> >> --
> >> Stephen Spencer
> >> [email protected] <mailto:[email protected]>
>
> --
> Stephen Spencer
> [email protected]

--
-- Skylar Thompson ([email protected])
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
