+1 for using package managers in general.
On our CentOS clusters, I do the munge and slurm installs using pkgsrc
(+ pkgsrc-wip).
http://acadix.biz/pkgsrc.php
I use Yum for most system services and for libraries required by
commercial software, and pkgsrc for all the latest open source software.
I rsync a small pkgsrc tree to the local drive of each node for munge,
slurm, and a few basic tools, and keep a separate, more extensive tree on
the NFS share for scientific software.
The current slurm package is pretty outdated, but we'll bring it
up-to-date soon.
Regards,
Jason
On 3/26/15 4:23 AM, Paddy Doyle wrote:
+1 for local installs.
We build the RPMs and put them in a local repo (Scientific Linux 6), and so
installing/upgrading via Salt/Puppet/Ansible etc. is quite scalable.
It works for us, but of course YMMV.
Paddy
On Tue, Mar 24, 2015 at 06:46:46PM -0700, Paul Edmon wrote:
Yeah, we've been running CentOS 6 and slurm in this fashion for
about a year and a half on about a thousand machines and haven't
really had a problem with this. Though I don't know if this method
scales indefinitely. We just have a symlink back to our conf from
/etc/slurm/slurm.conf. We then control the version via RPM
installs.
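
For illustration, the symlink arrangement described above might be set up along these lines. This is only a minimal Python sketch, not the actual setup used on that cluster: the shared path /shared/slurm/etc/slurm.conf and the helper name link_conf are assumptions, and only /etc/slurm/slurm.conf comes from the message. The demo runs in a temporary directory so it does not touch a real /etc.

```python
import os
import tempfile
from pathlib import Path


def link_conf(shared_conf: Path, local_link: Path) -> None:
    """Make local_link a symlink to shared_conf, replacing any old file or link.

    The link is created under a temporary name and then renamed into place,
    so readers never see a moment where the config path is missing.
    """
    local_link.parent.mkdir(parents=True, exist_ok=True)
    tmp = local_link.with_name(local_link.name + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    tmp.symlink_to(shared_conf)
    os.replace(tmp, local_link)  # renames the link itself, not its target


if __name__ == "__main__":
    # Demo in a scratch directory. On a real node the arguments would be
    # something like Path("/shared/slurm/etc/slurm.conf") (hypothetical)
    # and Path("/etc/slurm/slurm.conf").
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        shared = root / "shared" / "slurm.conf"
        shared.parent.mkdir()
        shared.write_text("ClusterName=demo\n")
        link = root / "etc" / "slurm" / "slurm.conf"
        link_conf(shared, link)
        print(link, "->", os.readlink(link))
```

Running the same function again after the shared file moves simply repoints the link, which keeps the step safe to repeat from a configuration-management tool.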
-Paul Edmon-
On 3/24/2015 4:22 PM, Jason Bacon wrote:
I ran one of our CentOS clusters this way for about a year and
found it to be more trouble than it was worth.
I recently reconfigured it to run all system services from local
disks so that nodes are as independent of each other as possible.
Assuming you have ssh keys on all the nodes, syncing slurm.conf
and other files is a snap using a simple shell script. We only
use NFS for data files and user applications at this point.
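
As a rough idea of what such a sync can look like, here is a minimal sketch written in Python rather than shell. It is a stand-in, not the script actually used: the node names, the function names, and the /etc/slurm target directory are all assumptions. It relies on rsync over ssh with keys already in place, and it defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List, Sequence


def build_sync_commands(nodes: Sequence[str], files: Sequence[str],
                        remote_dir: str = "/etc/slurm") -> List[List[str]]:
    """Return one rsync command per node that copies files into remote_dir."""
    return [["rsync", "-a", *files, f"{node}:{remote_dir}/"] for node in nodes]


def sync(nodes: Sequence[str], files: Sequence[str],
         remote_dir: str = "/etc/slurm", dry_run: bool = True) -> List[str]:
    """Push files to every node; return the nodes where rsync failed."""
    failed = []
    commands = build_sync_commands(nodes, files, remote_dir)
    for node, cmd in zip(nodes, commands):
        if dry_run:
            print(" ".join(cmd))  # show what would run, change nothing
            continue
        if subprocess.run(cmd).returncode != 0:
            failed.append(node)
    return failed


if __name__ == "__main__":
    # Hypothetical node names; replace with the cluster's own host list.
    compute_nodes = [f"compute-{i:03d}" for i in range(1, 5)]
    sync(compute_nodes, ["/etc/slurm/slurm.conf"])
```

Collecting the failed nodes instead of stopping at the first error means one unreachable node does not block the update for the rest.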
Of course, if your compute nodes don't have local disks, that's
another story.
Jason
On 03/24/15 14:42, Jeff Layton wrote:
Good afternoon,
I apologize for the newb question, but I'm setting up slurm
for the first time in a very long time. I've got a small cluster
of a master node and 4 compute nodes. I'd like to install
slurm on an NFS file system that is exported from the master
node and mounted on the compute nodes. I've been reading
a bit about this but does anyone have recommendations on
what to watch out for?
Thanks!
Jeff
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jason W. Bacon
[email protected]
If a problem can be solved,
there's no need to worry.
If it cannot be solved, then
worrying will do no good.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~