You would need to update slurm_errno.c and slurm_errno.h and rebuild
everything that uses the new error code (e.g. srun, sbatch, salloc, and
your plugin). There is no need to rebuild libslurm.so (at least with
recent versions of Slurm), but doing so isn't a bad idea for the sake of
anyone directly using the APIs.
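
To illustrate why the client prints "Unknown error 8000": the message
lookup happens in whatever binary reports the error, using the table
that binary was built with. The following is only a minimal standalone
sketch of that pattern, not the actual contents of slurm_errno.c; the
names demo_errtab_t, demo_errtab and demo_strerror, and the message
text, are made up for illustration, and the value 8000 is simply the
one from your mail.

```c
#include <stddef.h>
#include <string.h>

/* Value taken from the quoted mail; an assumption, not a stock Slurm errno. */
#define ESLURM_NO_MEMORY_SPECIFIED 8000

/* Hypothetical number-to-message table, compiled into the calling binary. */
typedef struct {
	int number;
	const char *message;
} demo_errtab_t;

static const demo_errtab_t demo_errtab[] = {
	{ 0, "No error" },
	{ ESLURM_NO_MEMORY_SPECIFIED, "No memory specified, job rejected" },
};

/*
 * Look errnum up in the local table. If the table this binary was built
 * with has no entry for it, fall back to the C library's strerror(),
 * which typically yields an "Unknown error N" style string.
 */
const char *demo_strerror(int errnum)
{
	size_t i;

	for (i = 0; i < sizeof(demo_errtab) / sizeof(demo_errtab[0]); i++) {
		if (demo_errtab[i].number == errnum)
			return demo_errtab[i].message;
	}
	return strerror(errnum);
}
```

A binary built without the table entry for 8000 would take the
strerror() fallback, which is consistent with the output you are
seeing from sbatch.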

Note that you can configure Slurm with a default memory value for
jobs, either system-wide or on a per-partition basis. See "man
slurm.conf" and the parameters "DefMemPerNode" and "DefMemPerCPU".
Setting a default may be a better solution than returning an error.
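
As a rough sketch of what that could look like in slurm.conf (the
values, the partition name and the node names below are purely
illustrative assumptions; check "man slurm.conf" for your version):

```
# System-wide default, in MB per allocated CPU, applied when a job
# does not request memory itself (value is illustrative)
DefMemPerCPU=2048

# Per-partition default; partition and node names are hypothetical
PartitionName=small Nodes=node[01-04] DefMemPerCPU=1024
```

With defaults like these, a job submitted without a memory request
would get the default rather than being rejected.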

Quoting Loris Bennett <[email protected]>:

>
> "Loris Bennett" <[email protected]>
> writes:
>
>> "Loris Bennett" <[email protected]>
>> writes:
>>
>>> Hi,
>>>
>>> I have written a job_submit plugin to check whether memory requirements
>>> have been given and reject the job if not.  This works, but I am unsure
>>> how to generate a specific error message on the command line when sbatch
>>> fails due to the requirement not being met.
>>>
>>> I have added a new error ESLURM_NO_MEMORY_SPECIFIED with errno 8000 to
>>> slurm_errno.h and a corresponding error message to slurm_errno.c and
>>> have the following in the plugin:
>>>
>>>
>>> ,------------------------------------------------------------------------
>>> | if (job_desc->pn_min_memory == unlimited_mem_mb) {
>>> |   error("job submit defaults plugin: (almost) unlimited memory given");
>>> |   slurm_seterrno(ESLURM_NO_MEMORY_SPECIFIED);
>>> |   slurm_perror("No memory specified via --mem<GB>, job rejected");
>>> |   return ESLURM_NO_MEMORY_SPECIFIED;
>>> | }
>>> `------------------------------------------------------------------------
>>>
>>> (I know I should probably be using slurm_strerror to get the error from
>>> slurm_errno.c).
>>>
>>> However, I now get the following response on the command line:
>>>
>>> ,---------------------------------------------------------------
>>> | sbatch: error: Batch job submission failed: Unknown error 8000
>>> `---------------------------------------------------------------
>>>
>>> What's missing to get the appropriate error message printed?
>>>
>>> Cheers,
>>>
>>> Loris
>>
>> Having no solution yet, I am having another look at this problem.
>>
>> As I understand it, I can add a new error number to slurm_errno.h, which
>> I can then include in job_submit_defaults.c.  However, the error message
>> associated with the new error number is added to slurm_errno.c, which
>> seems just to be compiled into libslurm.so and the main binaries.
>>
>> Thus, to have a plugin return a non-standard error message, I would have
>> to replace the libslurm.so and possibly the slurm daemons.
>>
>> Is my understanding correct?
>>
>> Regards
>>
>> Loris
>
> Can anyone shed any light on the topic above?
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin         Email [email protected]
>
