I have to second Phil's point as it's relevant to where I'd like to go with
SLURM on my cluster. Perhaps it could be a configurable parameter in
slurm.conf, something like AllowUserChangeQOS?
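
A rough sketch of how such a switch might gate the check, written as a
stand-alone model rather than a patch against job_mgr.c. Every name in it
(qos_policy, check_qos_change, the enum values) is made up for illustration,
and AllowUserChangeQOS is only the proposed option, not something slurm.conf
has today. It assumes pending jobs stay modifiable as they are now, that
root/admin may always change a running job's QOS (per Phil's note), and that
the job owner may do so only when the site opts in.

```c
#include <stdbool.h>

/* Stand-alone model, not SLURM source: all names here are hypothetical. */
enum job_state { JOB_STATE_PENDING, JOB_STATE_RUNNING };

enum qos_result { QOS_CHANGE_OK = 0, QOS_CHANGE_DISABLED = 1 };

struct qos_policy {
    /* Imagined slurm.conf setting AllowUserChangeQOS (proposal only). */
    bool allow_user_change_qos;
};

/* Decide whether a QOS change request should be accepted. */
enum qos_result check_qos_change(const struct qos_policy *policy,
                                 enum job_state state, bool is_admin)
{
    /* Pending jobs: changing the QOS is already permitted. */
    if (state == JOB_STATE_PENDING)
        return QOS_CHANGE_OK;

    /* Running jobs: root/admin may always modify the QOS. */
    if (is_admin)
        return QOS_CHANGE_OK;

    /* Running jobs, owner request: only if the site has opted in. */
    if (policy->allow_user_change_qos)
        return QOS_CHANGE_OK;

    return QOS_CHANGE_DISABLED;
}
```

With the option left off, a site like Phil's would keep the current
behavior for job owners, so a job submitted under a preemptable QOS could
not be promoted by its owner once it is running.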

On Fri, Jul 22, 2011 at 11:09 AM, Eckert, Phil <[email protected]> wrote:

> At our site we have qos's defined that allow jobs to
> start much sooner than others, but with the penalty of being
> preempted when higher priority qos's jobs are submitted.
>
> If the qos of the running job is allowed to be modified, this
> will allow users to game the system. They can submit the job
> with the preemptable qos and then modify it to a non-preemptable
> qos after it starts running. I can also imagine other qos
> definitions that might be abused in this manner.
>
> I do not think it would be wrong to allow root/admin to modify
> it, but it would cause problems (for us anyway) if the owner
> of the job were allowed to do this.
>
> Phil Eckert
> LLNL
>
>
> On 7/21/11 9:49 AM, "[email protected]" <[email protected]> wrote:
>
> > Mike,
> >
> > Here is some updated information. This patch will make accounting
> > inconsistent since we don't create a new job record that says the job
> > ran for so much time under each QOS. If that capability is important,
> > it would require some SLURM development work to change the accounting.
> >
> > Moe Jette
> > SchedMD LLC
> >
> > Quoting [email protected]:
> >
> >> We believe that removing that test will be fine, but it would take some
> >> work to be certain. Removing the test could break some QOS-related
> >> functionality.
> >>
> >> Quoting Mike Schachter <[email protected]>:
> >>
> >>> I just posted a question yesterday about this, but it might
> >>> be better in a separate thread for archival purposes.
> >>>
> >>> I want to change the QOS of a job that is in a RUNNING state,
> >>> but line 5882 of slurmctld/job_mgr.c (version 2.2.7) is preventing
> >>> me from doing so:
> >>>
> >>> } else if (job_specs->qos) {
> >>>   slurmdb_qos_rec_t qos_rec;
> >>>   if (!IS_JOB_PENDING(job_ptr))              // i want to remove this
> >>>     error_code = ESLURM_DISABLED;
> >>>   else {
> >>>
> >>> Is there any danger in removing that check, and allowing qos to
> >>> be changed for a running job?
> >>>
> >>>  mike
> >>>
> >
> >
> >
> >
>
>
>


-- 
Aaron Knister
Systems Administrator
Division of Information Technology
University of Maryland, Baltimore County
[email protected]
