Hey fellow Slurm users,

A quick question to see what other Slurm users are using for their queue size. We had the default queue limit of 10,000 jobs in place, and a user submitted 18,000 jobs. That has made the entire system basically unusable for everybody else for the past 3 days, since many of her jobs run 24 hours or longer.

We plan to mitigate this by increasing the queue size so that no "reasonable" number of jobs is refused, and by enabling a fair-share priority model so that abusive users' queued jobs are outranked by those of non-abusive users.

Our cluster is quite small, at 640 total slots, so we didn't want to set the queue size to ~10,000,000 or something stupid, though that would likely solve our issue with jobs not getting into the queue. Even at the default value of 10,000 we are "oversubscribed" on queue vs. slots by roughly 16 to 1.

Thoughts, observations, or suggestions from other Slurm users would be very much appreciated.
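
For concreteness, here is a minimal sketch of the kind of slurm.conf change described above. The numeric values are placeholders chosen purely for illustration, not tested recommendations, and as far as I understand the fair-share factor also depends on job accounting (slurmdbd) being set up, which is assumed here and not shown.

```
# slurm.conf fragment (illustrative values only, not a tested recommendation)
MaxJobCount=50000                   # queue size limit; the default is 10000
PriorityType=priority/multifactor   # use the multifactor priority plugin
PriorityWeightFairshare=10000       # weight given to the fair-share factor
PriorityWeightAge=1000              # weight given to time spent waiting in the queue
PriorityDecayHalfLife=7-0           # past usage decays with a 7-day half-life
```

With weights like these, fair share would dominate queue age, so a user with heavy recent usage should fall behind lighter users even if her jobs were submitted first.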
AC
