Re: [slurm-users] Slurm setup question

2018-04-11 Thread Douglas Jacobsen
It looks like your slurm.conf sets the state save directory
(StateSaveLocation) to /var/spool, and
`fatal: Incorrect permissions on state save loc: /var/spool`
indicates that SlurmUser (another setting in slurm.conf) does not have
write access to it.  It might be a good idea to make a directory
dedicated to this purpose, e.g. /var/spool/slurm/_state, and
then make sure that the SlurmUser (usually either "slurm" or root,
depending on your needs) can access that directory.
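
As a rough sketch of the suggestion above, the directory could be prepared along these lines. The helper name `prepare_state_dir`, the "slurm" account, and the 755 mode are assumptions for illustration, not something the original message specifies; the path is the example given above.

```shell
#!/bin/sh
# Sketch only: prepare_state_dir is a hypothetical helper, not a Slurm tool.
# It creates DIR (and any missing parents), makes USER its owner, and sets
# mode 755 so that only the owner can write to it.
prepare_state_dir() {
    dir="$1"
    user="$2"
    mkdir -p "$dir" || return 1
    chown "$user" "$dir" || return 1
    chmod 755 "$dir" || return 1
}

# Typical use, run as root ("slurm" is an assumed SlurmUser value):
#   prepare_state_dir /var/spool/slurm/_state slurm
# Then point slurm.conf at the new directory and restart the controller:
#   StateSaveLocation=/var/spool/slurm/_state
#   systemctl restart slurmctld
```

If SlurmUser is root on your system, the chown step is effectively a no-op for a directory root just created, but the dedicated directory still keeps Slurm's state files out of /var/spool itself.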


Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center 
dmjacob...@lbl.gov

- __o
-- _ '\<,_
--(_)/  (_)__


On Wed, Apr 11, 2018 at 5:44 AM, Ole Holm Nielsen <
ole.h.niel...@fysik.dtu.dk> wrote:

> Hi Matt,
>
> You might want to take a look at my Slurm Wiki, which focuses on
> CentOS/RHEL 7: https://wiki.fysik.dtu.dk/niflheim/SLURM.  Complete
> instructions for Slurm installation, configuration, etc. are in the Wiki.
>
> /Ole
>
>
> On 04/11/2018 02:26 PM, Matt Hohmeister wrote:
>
>> I’m brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a
>> proof of concept before I deploy it. After following the instructions on
>> https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
>> (sorry, site not working now), I can get slurmd to start perfectly, but
>> slurmctld fails to start with the following journalctl -xe; I was wondering
>> if anyone has run into this or could shed some light on this…thanks in
>> advance!
>>
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent --notify-fd 5 --fallbac
>> Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...
>> -- Subject: Unit slurmctld.service has begun start-up
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit slurmctld.service has begun starting up.
>> Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
>> Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.
>> -- Subject: Unit slurmctld.service has finished start-up
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit slurmctld.service has finished starting up.
>> --
>> -- The start-up result is done.
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent for unix-process:1779:31362 (system bus name :1.26, object path /org/freedesktop/PolicyKit1/A
>> Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions on state save loc: /var/spool
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
>> Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered failed state.
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.
>>
>> Matt Hohmeister
>> Systems and Network Administrator
>> Department of Psychology
>> Florida State University
>> PO Box 3064301
>> Tallahassee, FL 32306-4301
>> Phone: +1 850 645 1902
>> Fax: +1 850 644 7739
>>
>
>


Re: [slurm-users] Slurm setup question

2018-04-11 Thread Ole Holm Nielsen

Hi Matt,

You might want to take a look at my Slurm Wiki, which focuses on
CentOS/RHEL 7: https://wiki.fysik.dtu.dk/niflheim/SLURM.  Complete
instructions for Slurm installation, configuration, etc. are in the Wiki.


/Ole

On 04/11/2018 02:26 PM, Matt Hohmeister wrote:
I’m brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a 
proof of concept before I deploy it. After following the instructions on 
https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/ 
(sorry, site not working now), I can get slurmd to start perfectly, but 
slurmctld fails to start with the following journalctl -xe; I was 
wondering if anyone has run into this or could shed some light on 
this…thanks in advance!


Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent --notify-fd 5 --fallbac
Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...
-- Subject: Unit slurmctld.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurmctld.service has begun starting up.
Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.
-- Subject: Unit slurmctld.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurmctld.service has finished starting up.
--
-- The start-up result is done.
Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent for unix-process:1779:31362 (system bus name :1.26, object path /org/freedesktop/PolicyKit1/A
Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions on state save loc: /var/spool
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered failed state.
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.

Matt Hohmeister
Systems and Network Administrator
Department of Psychology
Florida State University
PO Box 3064301
Tallahassee, FL 32306-4301
Phone: +1 850 645 1902
Fax: +1 850 644 7739