Hello Bob,

FYI, the FreeBSD port is very new and the groups feature is untested as far as I know. If you can provide any additional information, such as a debugger stack trace from the crash, it might expedite a solution.

I'm currently working on the srun --pty flag and there are probably a few other issues I'm not yet aware of. If you discover any other issues, please report them to this list and I'll put them on the to-do list.

Regards,

    Jason

On 2/20/14 3:08 PM, Bob Healey wrote:

I am slowly migrating my slurmctld instances from RHEL 5 to FreeBSD for greater reliability. My cluster support systems are all FreeBSD and are not accessible to end users. The RHEL 5 systems are all end-user accessible. I've copied a working slurm.conf file over to FreeBSD, and slurmctld segfaults at startup if I have AllowGroups=Something in the partition definition. The group definitions are stored in LDAP, if that makes a difference.

I'm running 2.6.4, as provided by the FreeBSD Ports tree.


Slurm.conf excerpt. If I have any partition except tiger uncommented, slurmctld segfaults at startup. If I remove the group restriction from the partitions, it works.
# COMPUTE NODES
NodeName=lion-[1-48] RealMemory=1800 Sockets=2 CoresPerSocket=1 ThreadsPerCore=1 State=unknown
NodeName=tiger-[1-75] Procs=4 RealMemory=7800 Sockets=2 CoresPerSocket=2 ThreadsPerCore=1 State=unknown
NodeName=calvin-[1-8] RealMemory=16000 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 State=unknown
Nodename=neutron-[1-8] RealMemory=16000 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 State=unknown
NodeName=jaguar Procs=48 RealMemory=63000 Sockets=4 CoresPerSocket=12 ThreadsPerCore=1 State=unknown
#PartitionName=tiger Nodes=tiger-[1-75] Default=YES MaxTime=2880 State=DOWN MaxNodes=16
PartitionName=jaguar Nodes=jaguar Default=NO MaxTime=2880 State=DOWN AllowGroups=lion.che_cluster_access
#PartitionName=calvin Nodes=calvin-[1-8] Default=NO MaxTime=2880 State=DOWN AllowGroups=calvin_che_access
#PartitionName=neutron Nodes=neutron-[1-8] Default=NO MaxTime=2880 State=DOWN AllowGroups=neutron_mat_access
#PartitionName=lion Nodes=lion-[1-48] Default=NO MaxTime=2880 State=DOWN AllowGroups=lion.che_cluster_access
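
Since the groups live in LDAP, one thing that may be worth ruling out is whether the FreeBSD host can resolve the AllowGroups names at all. The following is only a rough sketch, not anything slurmctld itself does: it pulls AllowGroups out of the uncommented PartitionName lines and looks each group up through Python's standard grp module, which goes through the system's normal group database. The helper names allowed_groups and unresolved_groups are made up for illustration, and a clean result here would not by itself explain or rule out the segfault.

```python
import grp


def allowed_groups(conf_text):
    """Map each uncommented partition name to its AllowGroups list."""
    result = {}
    for raw in conf_text.splitlines():
        # Drop trailing comments and fully commented-out lines.
        line = raw.split("#", 1)[0].strip()
        if not line.lower().startswith("partitionname="):
            continue
        # slurm.conf keywords are case-insensitive, so normalize the keys.
        fields = {}
        for token in line.split():
            if "=" in token:
                key, value = token.split("=", 1)
                fields[key.lower()] = value
        groups = fields.get("allowgroups", "")
        result[fields["partitionname"]] = [g for g in groups.split(",") if g]
    return result


def unresolved_groups(names, lookup=grp.getgrnam):
    """Return the group names the system group database cannot resolve."""
    missing = []
    for name in names:
        try:
            lookup(name)  # grp.getgrnam raises KeyError for unknown groups
        except KeyError:
            missing.append(name)
    return missing
```

To try it, read the slurm.conf text from wherever the port installed it, pass it to allowed_groups(), and feed each partition's list to unresolved_groups(). An empty list for every partition would suggest the LDAP lookup side is working and the problem is more likely inside slurmctld.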



--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Jason W. Bacon
  [email protected]

  Circumstances don't make a man:
  They reveal him.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
