You would need to update slurm_errno.c and slurm_errno.h and rebuild everything that uses the new error code (e.g. srun, sbatch, salloc, and your plugin). There should be no need to rebuild libslurm.so, at least with recent versions of Slurm, although doing so isn't a bad idea for anyone using the APIs directly.
Note that you can configure Slurm with a default memory value for the jobs, either system-wide or on a per partition basis. See "man slurm.conf" and the parameters "DefMemPerNode" and "DefMemPerCPU", which may be a better solution than returning an error.

Quoting Loris Bennett <[email protected]>:

>
> "Loris Bennett" <[email protected]>
> writes:
>
>> "Loris Bennett" <[email protected]>
>> writes:
>>
>>> Hi,
>>>
>>> I have written a job_submit plugin to check whether memory requirements
>>> have been given and reject the job if not. This works, but I am unsure
>>> how to generate a specific error message on the command line when sbatch
>>> fails due to the requirement not being met.
>>>
>>> I have added a new error ESLURM_NO_MEMORY_SPECIFIED with errno 8000 to
>>> slurm_errno.h and a corresponding error message to slurm_errno.c and
>>> have the following in the plugin:
>>>
>>>
>>> ,------------------------------------------------------------------------
>>> | if (job_desc->pn_min_memory == unlimited_mem_mb) {
>>> |     error("job submit defaults plugin: (almost) unlimited memory given");
>>> |     slurm_seterrno(ESLURM_NO_MEMORY_SPECIFIED);
>>> |     slurm_perror("No memory specified via --mem<GB>, job rejected");
>>> |     return ESLURM_NO_MEMORY_SPECIFIED;
>>> `------------------------------------------------------------------------
>>>
>>> (I known I should probably be using slurm_strerror to get the error from
>>> slurm_errno.c).
>>>
>>> However, I now get the following response on the command line:
>>>
>>> ,---------------------------------------------------------------
>>> | sbatch: error: Batch job submission failed: Unknown error 8000
>>> `---------------------------------------------------------------
>>>
>>> What's missing to get the appropriate error message printed?
>>>
>>> Cheers,
>>>
>>> Loris
>>
>> Having no solution yet, I am having another look at this problem.
>>
>> As I understand it, if I add a new error number to slurm_errno.h, which
>> I can include in job_submit_defaults.c. However, the error message
>> associated with the new error number is added to slurm_errno.c, which
>> seems just to be compiled into libslurm.so and the main binaries.
>>
>> Thus, to have a plugin return a non-standard error message, I would have
>> to replace the libslurm.so and possibly the slurm daemons.
>>
>> Is my understanding correct?
>>
>> Regards
>>
>> Loris
>
> Can anyone shed any light on the topic above?
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email [email protected]
>
