Actually, I’ve gotten it all working; I had just overlooked some things in 
slurm.conf.

I had previously been trying to get Son of Grid Engine working, but after ripping 
out half my hair, I switched to Slurm, since it’s what our research computing 
center uses. 😊

Matt Hohmeister
Systems and Network Administrator
Department of Psychology
Florida State University
PO Box 3064301
Tallahassee, FL 32306-4301
Phone: +1 850 645 1902
Fax: +1 850 644 7739

From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Sean Caron
Sent: Wednesday, April 11, 2018 3:36 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>; Sean Caron <sca...@umich.edu>
Subject: Re: [slurm-users] FSU & Slurm

Hi Matt,

As a protest to asking questions on this list and getting solicitations for 
pay-for support, let me give you some advice for free :)

If you look at your slurm.conf you'll see there are two directories that your 
slurm user and group need to have write access to.

One is whatever you configure as SlurmdSpoolDir. This needs to be available on 
all worker nodes that are running slurmd. Set ownership to slurm user and slurm 
group and mode 755.

The other is StateSaveLocation. This needs to be present just on your 
controller (where slurmctld runs). Again, this should have ownership of slurm 
user and slurm group and mode 755.

You probably want to use something more specific than just /var/spool for your 
StateSaveLocation, such as a dedicated subdirectory that the slurm user owns.

Best,

Sean


On Wed, Apr 11, 2018 at 1:48 PM, Jess Arrington <j...@schedmd.com> wrote:
Hi Matt,

I hope your day is treating you well.


Thank you for your posts on the Slurm user list.


By chance, do you work with Paul Van Der Mark?


Would there be interest on your side in seeing a Slurm support contract for your 
systems at FSU?

Sites running Slurm with support tell us that it is invaluable and a great 
return for the organization: configs optimized by our experts give much better 
system utilization (which pays for the support contract in and of itself), and 
the sites do not have to rely on in-house, best-effort support hacks that get 
very expensive and turn into complicated chaos and potentially down systems.


Additionally, support keeps the Slurm project alive and going strong.


Please let me know your thoughts or if you would like me to reach out to 
another contact at FSU to chat about this further.



Take care,



Jess Arrington
Director of Sales | 801-616-7823
204 N 1200 E #203 Lehi, UT 84043

On Wed, Apr 11, 2018 at 6:26 AM, Matt Hohmeister <hohmeis...@psy.fsu.edu> wrote:
I’m brand-new to Slurm and am setting it up on a single RHEL 7.4 VM as a proof of 
concept before I deploy it. After following the instructions at 
https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/ 
(sorry, the site is not working right now), I can get slurmd to start perfectly, 
but slurmctld fails to start, with the journalctl -xe output below. Has anyone 
run into this, or can anyone shed some light on it? Thanks in advance!

Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent --notify-fd 5 --fallbac
Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...
-- Subject: Unit slurmctld.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurmctld.service has begun starting up.
Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.
-- Subject: Unit slurmctld.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurmctld.service has finished starting up.
--
-- The start-up result is done.
Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent for unix-process:1779:31362 (system bus name :1.26, object path /org/freedesktop/PolicyKit1/A
Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions on state save loc: /var/spool
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered failed state.
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.
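
In output like the above, systemd reports the unit as started and then failed, while the actual cause appears only in the line logged by slurmctld itself. As a rough illustration, the Python sketch below pulls out just the messages written by one daemon. The function name daemon_messages and the regular expression are hypothetical, and the sketch assumes each journal entry sits on a single line in the syslog-style layout seen here (mail clients often wrap long lines, which would need undoing first).

```python
import re

# Syslog-style entry: "<Mon> <day> <HH:MM:SS> <host> <daemon>[<pid>]: <message>"
ENTRY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<unit>[^\[\s]+)\[(?P<pid>\d+)\]:\s"
    r"(?P<msg>.*)$"
)


def daemon_messages(journal_text, unit="slurmctld"):
    """Return the message part of every entry logged by `unit`.

    Lines that do not look like journal entries (for example the
    "-- Subject: ..." explanation lines added by journalctl -x) are skipped.
    """
    messages = []
    for line in journal_text.splitlines():
        match = ENTRY_RE.match(line)
        if match and match.group("unit") == unit:
            messages.append(match.group("msg"))
    return messages
```

Applied to the entries in this message, the only slurmctld line is the fatal one naming /var/spool as the state save location, which is the setting to look at in slurm.conf.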



