I have to second Phil's point, as it's relevant to where I'd like to go with SLURM on my cluster. Perhaps it could be a configurable parameter in slurm.conf, something like AllowUserChangeQOS?
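
As a rough sketch of the idea (not actual SLURM code: the type, the error codes, and the AllowUserChangeQOS flag below are hypothetical stand-ins, and the real check in job_mgr.c has more context around it), the state test could be gated on who is making the request:

```c
#include <stdbool.h>

/* Hypothetical error codes; the real ones live in SLURM's own headers. */
enum { SKETCH_SUCCESS = 0, SKETCH_DISABLED = 1 };

/* Simplified stand-in for the job state that IS_JOB_PENDING() inspects. */
typedef enum { JOB_SKETCH_PENDING, JOB_SKETCH_RUNNING } job_sketch_state_t;

/*
 * Decide whether a QOS update request passes the job-state check.
 * allow_user_change_qos models the proposed slurm.conf parameter
 * AllowUserChangeQOS, which is an assumption and not an existing option.
 */
int qos_change_check(job_sketch_state_t state, bool is_admin,
                     bool allow_user_change_qos)
{
    if (state == JOB_SKETCH_PENDING)
        return SKETCH_SUCCESS;   /* pending jobs pass the state check */
    if (is_admin)
        return SKETCH_SUCCESS;   /* root/admin may modify a running job */
    if (allow_user_change_qos)
        return SKETCH_SUCCESS;   /* site has opted in to owner changes */
    return SKETCH_DISABLED;      /* default: owner cannot change a running job */
}
```

With the flag off by default, a site like Phil's keeps the current protection against users moving a running job out of a preemptable QOS, while admins can still make the change. The accounting inconsistency Moe describes would be unaffected by this check.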

On Fri, Jul 22, 2011 at 11:09 AM, Eckert, Phil <[email protected]> wrote:
> At our site we have qos's defined that allow jobs to
> start much sooner than others, but with the penalty of being
> preempted when higher priority qos's jobs are submitted.
>
> If the qos of the running job is allowed to be modified, this
> will allow users to game the system. They can submit the job
> with the preemptable qos and then modify it to a non-preemptable
> qos after it starts running. I can also imagine other qos
> definitions that might be abused in this manner.
>
> I do not think it would be wrong to allow root/admin to modify
> it, but it would cause problems (for us anyway) if the owner
> of the job were allowed to do this.
>
> Phil Eckert
> LLNL
>
>
> On 7/21/11 9:49 AM, "[email protected]" <[email protected]> wrote:
>
> > Mike,
> >
> > Here is some updated information. This patch will make accounting
> > inconsistent since we don't create a new job record that says the job
> > ran for so much time under each QOS. If that capability is important,
> > it would require some SLURM development work to change the accounting.
> >
> > Moe Jette
> > SchedMD LLC
> >
> > Quoting [email protected]:
> >
> >> We believe that removing that test will be fine, but it would take some
> >> work to be certain. Removing the test could break some QOS-related
> >> functionality.
> >>
> >> Quoting Mike Schachter <[email protected]>:
> >>
> >>> I just posted a question yesterday about this, but it might
> >>> be better on a separate thread for archival purposes.
> >>>
> >>> I want to change the QOS of a job that is in a RUNNING state,
> >>> but line 5882 of slurmctld/job_mgr.c (version 2.2.7) is preventing
> >>> me from doing so:
> >>>
> >>> } else if (job_specs->qos) {
> >>>         slurmdb_qos_rec_t qos_rec;
> >>>         if (!IS_JOB_PENDING(job_ptr)) // i want to remove this
> >>>                 error_code = ESLURM_DISABLED;
> >>>         else {
> >>>
> >>> Is there any danger in removing that check, and allowing qos to
> >>> be changed for a running job?
> >>>
> >>> mike
> >>>

--
Aaron Knister
Systems Administrator
Division of Information Technology
University of Maryland, Baltimore County
[email protected]
