[slurm-dev] Insane message length

2013-06-19 Thread Pancorbo, Juan
Hi all, today a single user submitted 7000 jobs, and squeue and scancel now return the error message: Insane Message Length. I have read on a previous topic in slurm devel

[slurm-dev] Job Groups

2013-06-19 Thread Paul Edmon
I have a group here that wants to submit a ton of jobs to the queue, but wants to restrict how many they have running at any given time so that they don't torch their fileserver. They were using bgmod -L in LSF to do this, and they were wondering if there was a similar way to do so in SLURM.

[slurm-dev] Re: Job Groups

2013-06-19 Thread Marcin Stolarek
2013/6/19 Paul Edmon ped...@cfa.harvard.edu: I have a group here that wants to submit a ton of jobs to the queue, but wants to restrict how many they have running at any given time so that they don't torch their fileserver. They were using bgmod -L in LSF to do this, but they were wondering

[slurm-dev] Re: Job Groups

2013-06-19 Thread Ralph Castain
Could you just create a dedicated queue for those jobs, and then configure its priority and max simultaneous settings? Then all they would have to do is ensure they submit those jobs to that queue. On Jun 19, 2013, at 8:36 AM, Paul Edmon ped...@cfa.harvard.edu wrote: I have a group here

[slurm-dev] Re: Job Groups

2013-06-19 Thread Danny Auble
Sounds like something you would use a QOS for. That way you get all the limits from accounting, but they apply only to certain jobs. On 06/19/13 09:03, Ralph Castain wrote: Could you just create a dedicated queue for those jobs, and then configure its priority and max simultaneous settings?

[slurm-dev] Re: Job Groups

2013-06-19 Thread John Thiltges
On 06/19/2013 10:36 AM, Paul Edmon wrote: I have a group here that wants to submit a ton of jobs to the queue, but wants to restrict how many they have running at any given time so that they don't torch their fileserver. The licenses feature might work OK for this. Create a license for the
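
The licenses suggestion above can be sketched roughly as follows. This is only an illustration under assumptions, not the configuration from the thread: the license name `fs_slot`, the counts, and the script name `job.sh` are hypothetical, and the helper merely builds the `slurm.conf` line and the `sbatch` command as strings without running anything. The idea is that each job holds some number of licenses while it runs, so the pool size caps how many such jobs run at once.

```python
def license_plan(license_name, total, per_job=1):
    """Return (slurm.conf line, sbatch command, max concurrent jobs).

    All names and counts are hypothetical placeholders; nothing is executed.
    """
    if per_job < 1 or total < per_job:
        raise ValueError("need 1 <= per_job <= total")
    # Cluster-wide pool of licenses, declared in slurm.conf.
    conf_line = f"Licenses={license_name}:{total}"
    # Each job requests licenses at submission time.
    submit_cmd = f"sbatch --licenses={license_name}:{per_job} job.sh"
    # Jobs that can hold licenses simultaneously.
    max_concurrent = total // per_job
    return conf_line, submit_cmd, max_concurrent


if __name__ == "__main__":
    for item in license_plan("fs_slot", 50, 2):
        print(item)
```

Note that this only limits jobs that actually request the license, so it relies on users adding the option to their submissions.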

[slurm-dev] Re: Job Groups

2013-06-19 Thread Ryan Cox
Paul, We were discussing this yesterday due to a user not limiting the number of jobs hammering our storage. A QOS with a GrpJobs limit sounds like the best approach for both us and you. Ryan On 06/19/2013 09:36 AM, Paul Edmon wrote: I have a group here that wants to submit a ton of jobs
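
A QOS with a GrpJobs limit, as suggested above, might be set up along these lines. This is a minimal sketch under assumptions: the QOS name `fs_limited`, the limit of 50, the user `alice`, and the script `job.sh` are all hypothetical, and the helper only assembles `sacctmgr` and `sbatch` command lines as strings rather than running them. It also assumes the cluster enforces QOS limits through accounting (for example via `AccountingStorageEnforce` in `slurm.conf`).

```python
import shlex


def qos_limit_commands(qos_name, max_running_jobs, user):
    """Build the command lines for a QOS that caps running jobs.

    All names are hypothetical placeholders; nothing is executed.
    """
    if max_running_jobs < 1:
        raise ValueError("max_running_jobs must be at least 1")
    commands = [
        # Create the QOS.
        ["sacctmgr", "-i", "add", "qos", qos_name],
        # Cap the number of jobs running under this QOS at any one time.
        ["sacctmgr", "-i", "modify", "qos", qos_name,
         "set", f"GrpJobs={max_running_jobs}"],
        # Allow the user to submit under the QOS.
        ["sacctmgr", "-i", "modify", "user", "where", f"name={user}",
         "set", f"qos+={qos_name}"],
        # Submit a job under the limited QOS.
        ["sbatch", f"--qos={qos_name}", "job.sh"],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in qos_limit_commands("fs_limited", 50, "alice"):
        print(line)
```

Jobs submitted beyond the limit stay pending until running jobs in the same QOS finish, which is what keeps the load on the fileserver bounded.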

[slurm-dev] Resubmit on failure

2013-06-19 Thread Mario Kadastik
Hi, I've tried to look for this, but is there any way to have a job resubmitted automatically if it fails? We occasionally have hiccups on random nodes where a job might fail due to temporary network loss or loss of a storage mount or whatnot, and when users send thousands of jobs and say 0.1%

[slurm-dev] Re: Job Groups

2013-06-19 Thread Paul Edmon
Thanks for the input. Can GrpJobs be modified from the user side? -Paul Edmon- On 06/19/2013 12:15 PM, Ryan Cox wrote: Paul, We were discussing this yesterday due to a user not limiting the number of jobs hammering our storage. A QOS with a GrpJobs limit sounds like the best approach

[slurm-dev] Re: jobacct_gather plugins

2013-06-19 Thread Eva Hocks
I second that! Sounds like the correct approach for data intensive computing. Thanks Eva -- University of California, San Diego SDSC, MC 0505 9500 Gilman Drive La Jolla, Ca 92093-0505 Web: http://www.sdsc.edu/~hocks (858) 822-0954 email: ho...@sdsc.edu

[slurm-dev] Re: Resubmit on failure

2013-06-19 Thread Moe Jette
One note: Only batch jobs will be requeued. We can't do much for jobs initiated by salloc or srun. Quoting Aaron Knister aaron.knis...@gmail.com: Hi Mario, SLURM can and will, I believe by default, resubmit jobs that fail due to node failures recognized by slurmctld that put the node

[slurm-dev] Re: Job Groups

2013-06-19 Thread Paul Edmon
Okay, thanks. -Paul Edmon- On 06/19/2013 04:32 PM, Ryan Cox wrote: Not that I'm aware of. I don't know of a way to give users control over a QOS like you can do with account coordinators for accounts. Ryan On 06/19/2013 10:55 AM, Paul Edmon wrote: Thanks for the input. Can GrpJobs be

[slurm-dev] Re: Job Groups

2013-06-19 Thread Ryan Cox
Not that I'm aware of. I don't know of a way to give users control over a QOS like you can do with account coordinators for accounts. Ryan On 06/19/2013 10:55 AM, Paul Edmon wrote: Thanks for the input. Can GrpJobs be modified from the user side? -Paul Edmon- On 06/19/2013 12:15 PM,