I am slowly migrating my slurmctld instances from RHEL 5 to FreeBSD for
greater reliability. My cluster support systems all run FreeBSD and are
not accessible to end users, whereas the RHEL 5 systems are all
end-user accessible. I've copied a working slurm.conf file over to
FreeBSD, and slurmctld segfaults at startup if I have
AllowGroups=Something in a partition definition. The group definitions
are stored in LDAP, if that makes a difference.

I'm running Slurm 2.6.4, as provided by the FreeBSD Ports tree.

A slurm.conf excerpt follows. If any partition other than tiger is
uncommented, slurmctld segfaults at startup. If I remove the group
restriction (AllowGroups) from the partitions, it works.

# COMPUTE NODES
NodeName=lion-[1-48] RealMemory=1800 Sockets=2 CoresPerSocket=1 ThreadsPerCore=1 State=unknown
NodeName=tiger-[1-75] Procs=4 RealMemory=7800 Sockets=2 CoresPerSocket=2 ThreadsPerCore=1 State=unknown
NodeName=calvin-[1-8] RealMemory=16000 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 State=unknown
NodeName=neutron-[1-8] RealMemory=16000 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 State=unknown
NodeName=jaguar Procs=48 RealMemory=63000 Sockets=4 CoresPerSocket=12 ThreadsPerCore=1 State=unknown

#PartitionName=tiger Nodes=tiger-[1-75] Default=YES MaxTime=2880 State=DOWN MaxNodes=16
PartitionName=jaguar Nodes=jaguar Default=NO MaxTime=2880 State=DOWN AllowGroups=lion.che_cluster_access
#PartitionName=calvin Nodes=calvin-[1-8] Default=NO MaxTime=2880 State=DOWN AllowGroups=calvin_che_access
#PartitionName=neutron Nodes=neutron-[1-8] Default=NO MaxTime=2880 State=DOWN AllowGroups=neutron_mat_access
#PartitionName=lion Nodes=lion-[1-48] Default=NO MaxTime=2880 State=DOWN AllowGroups=lion.che_cluster_access
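
Since the crash only shows up with AllowGroups and the groups live in
LDAP, one way to narrow it down is to check whether those groups resolve
on the FreeBSD host at all, independently of slurmctld. The following is
a minimal sketch, not part of the original setup: it assumes Python is
available on the controller, the helper names allowed_groups and resolve
are made up for illustration, and it does not reproduce whatever call
path slurmctld itself uses. It only pulls the AllowGroups names out of
the uncommented lines of a slurm.conf excerpt and looks each one up in
the system group database.

```python
import grp
import re


def allowed_groups(conf_text):
    """Return group names from AllowGroups= on uncommented config lines."""
    groups = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop everything after a comment marker
        match = re.search(r"AllowGroups=(\S+)", line, re.IGNORECASE)
        if not match:
            continue
        for name in match.group(1).split(","):
            if name and name not in groups:
                groups.append(name)
    return groups


def resolve(name):
    """Look a group up in the system group database.

    Returns (gid, members) or None if the group is unknown. On a host
    configured for LDAP groups, this lookup should go through LDAP too.
    """
    try:
        entry = grp.getgrnam(name)
    except KeyError:
        return None
    return entry.gr_gid, list(entry.gr_mem)


# Sample input taken from the excerpt above (illustrative subset).
SAMPLE = """\
#PartitionName=tiger Nodes=tiger-[1-75] Default=YES MaxTime=2880 State=DOWN MaxNodes=16
PartitionName=jaguar Nodes=jaguar Default=NO MaxTime=2880 State=DOWN AllowGroups=lion.che_cluster_access
#PartitionName=calvin Nodes=calvin-[1-8] Default=NO MaxTime=2880 State=DOWN AllowGroups=calvin_che_access
"""

if __name__ == "__main__":
    for group_name in allowed_groups(SAMPLE):
        print(group_name, resolve(group_name))
```

If a group comes back as None, or the lookup itself misbehaves, that
would point at the group/LDAP configuration on the host rather than at
slurm.conf. If every group resolves cleanly, the problem is more likely
inside slurmctld.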
--
Bob Healey
Systems Administrator
Biocomputation and Bioinformatics Constellation
and Molecularium
[email protected]
(518) 276-4407