Update: In the interest of helping anyone who runs into this problem in the
future, I'm posting the solution.
All I had to do was add the line
JobSubmitPlugins=job_submit/require_timelimit
to slurm.conf. It would have saved a lot of time and trouble if this
syntax were documented in the slurm.conf man page, but I couldn't find
anything anywhere on how to use it properly. I just had to go through a lot
of trial and error.
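
For anyone finding this later, here is a minimal sketch of the relevant
slurm.conf lines. It assumes the plugins live in /usr/lib64/slurm, as they
do on my system; adjust the path for your install. The comments are my own
notes, not text from the man page.

```
# slurm.conf (fragment)

# Directory holding the *.so plugin files (path from my install; yours may differ).
PluginDir=/usr/lib64/slurm

# Job submit plugins are named as job_submit/<name>, not by file name.
# job_submit/require_timelimit corresponds to job_submit_require_timelimit.so.
JobSubmitPlugins=job_submit/require_timelimit
```

As far as I can tell, plugstack.conf is only for SPANK plugins, which would
explain the "exports 0 symbols" error I got when I listed the job submit
plugin there. After restarting slurmctld, "scontrol show config | grep
JobSubmitPlugins" should show whether the setting was picked up.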
On Thu, Mar 19, 2015 at 2:37 PM, Michael Kit Gilbert <[email protected]> wrote:
> Thanks again for the help!
> OS: CentOS 6.5
> Slurm version: 14.11.2
> Compiler: gcc 4.4.7
>
> Getting rid of the "plugstack.conf" file allowed me to start running jobs
> again, but the plugin that I want to use doesn't appear to be enabled.
>
> There are a bunch of *.so plugin files in the /usr/lib64/slurm directory.
> One of them is "job_submit_require_timelimit.so". Since these are installed
> here, I assume they were compiled when Slurm was installed. So, given that
> they're in this directory, how do I enable them? I want people to be
> forced to enter a time limit, and that is what this plugin appears to do.
>
> On Thu, Mar 19, 2015 at 1:46 PM, Andy Riebs <[email protected]> wrote:
>
>> OK, we (or at least I) have reached the point where you need to provide
>> some more information:
>> * What operating system and version?
>> * What Slurm version?
>> * What compiler?
>>
>> You apparently have some kind of build problem, as Slurm plugins are
>> required to export a specific set of symbols; they seem not to be exported
>> in your plugin.
>>
>> Have you gotten Slurm to run without the plugin? That's a useful first
>> step before adding anything that is optional. (BTW, did you discover that
>> MailProg is a requirement, once you get further down the road?)
>>
>> Andy
>>
>>
>> On 03/19/2015 04:33 PM, Michael Kit Gilbert wrote:
>>
>> Update: So, I have figured out the problem with slurm not running
>> properly. It had to do with my fstab file being incorrect and not mounting
>> /var/spool correctly.
>>
>> Now I can start slurm correctly. However, when trying to run a job,
>> slurm doesn't load the plugin properly, so it fails with the following
>> message:
>>
>> sbatch: error: spank: "/usr/lib64/slurm/job_submit_require_timelimit.so" exports 0 symbols
>> sbatch: error: spank: /etc/slurm/plugstack.conf:7: Failed to load plugin /usr/lib64/slurm/job_submit_require_timelimit.so. Aborting.
>> sbatch: error: Failed to initialize plugin stack
>>
>> I posted the slurm.conf and plugstack.conf changes I made in the first
>> post. Thanks for any help!
>>
>> On Thu, Mar 19, 2015 at 11:55 AM, Michael Kit Gilbert <[email protected]>
>> wrote:
>>
>>> Thank you so much for the reply, Andy. Well, apparently there's a lot
>>> happening that may be causing the issue. First, I can't seem to get
>>> slurmctld running properly. When I run "slurmctld -D", this is my output:
>>>
>>> slurmctld: error: Can't save state, create file /var/spool/slurm/last_config_lite.new error Permission denied
>>> slurmctld: error: Configured MailProg is invalid
>>> slurmctld: Job accounting information stored, but details not gathered
>>> slurmctld: fatal: Incorrect permissions on state save loc: /var/spool/slurm
>>>
>>> I have the MailProg line in slurm.conf commented out, so does it have
>>> to be specified to work? Also, since I'm root and root is the owner of the
>>> /var/spool/slurm directory, I'm not sure why it's telling me the
>>> permissions are incorrect...
>>>
>>> On Thu, Mar 19, 2015 at 10:20 AM, Andy Riebs <[email protected]> wrote:
>>>
>>>> Michael,
>>>>
>>>> Try running "slurmctld -D" which should result in output telling you
>>>> what's going wrong.
>>>>
>>>> Andy
>>>>
>>>>
>>>>
>>>> On 03/19/2015 01:15 PM, Michael Kit Gilbert wrote:
>>>>
>>>> Sorry for the basic question, but I am new to slurm and am having some
>>>> basic problems with plugins. What I'd like to do is make the
>>>> job_submit_require_timelimit.so plugin that is found in the source code
>>>> active and required for all jobs.
>>>>
>>>> What I've done so far is I've added the line
>>>>
>>>> PluginDir=/usr/lib64/slurm
>>>>
>>>> to slurm.conf and I've created a plugstack.conf file that has one
>>>> line in it:
>>>>
>>>> required job_submit_require_timelimit.so
>>>>
>>>> And now slurm won't start at all. So obviously I've made a huge
>>>> newbie error. I've verified that our plugins are found in the
>>>> /usr/lib64/slurm directory, but I can't tell what else I need to do.
>>>> Does this plugin require arguments? Is there something else I'm missing?
>>>>
>>>> Thanks,
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>
>>
>>
>