For some projects, we use group passwords and have users authenticate into the group when they need to access those files. The password is set with gpasswd and stored in /etc/gshadow. However, after a user authenticates into a group, they can no longer run SLURM jobs; their jobs go into status "(launch failed requeued held)".
In slurmd.log I see:

[2016-07-05T13:28:55.167] Launching batch job 2844 for UID 1000
[2016-07-05T13:28:55.175] uid 1000 is not a member of gid 5000
[2016-07-05T13:28:55.175] batch_stepd_step_rec_create() failed: Group ID not found on host
[2016-07-05T13:28:55.175] _step_setup: no job returned
[2016-07-05T13:28:55.175] done with job

It looks like this is a sanity check looking if a user is a static member of the group - which they're not, so it fails. Is there any way to turn off this sanity check?

Below is an example of the commands to reach this point:

[prout@login-0 ~]$ id
uid=1000(prout) gid=1000(prout) groups=1000(prout),10(wheel)
[prout@login-0 ~]$ newgrp ProjectX
Password: abc123
[prout@login-0 ~]$ id
uid=1000(prout) gid=5000(ProjectX) groups=1000(prout),10(wheel),5000(ProjectX)
[prout@login-0 ~]$ sbatch --wrap="srun /bin/sleep 300"
Submitted batch job 2844
[prout@login-0 ~]$ squeue
JOBID PARTITION NAME USER  ST TIME NODES NODELIST(REASON)
2844  normal    wrap prout PD 0:00 1     (launch failed requeued held)
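
To illustrate what a static membership check of this kind might look like, here is a minimal Python sketch. It is not Slurm's code and I have not confirmed how slurmd implements the test; the function names are hypothetical. It only shows why a lookup against the passwd and group databases would reject a gid obtained through newgrp: the user's primary gid is still 1000, and the user is not listed as a member of group 5000. The empty member list for ProjectX is an assumption, based on the first `id` output above.

```python
import grp
import pwd


def is_static_member(user, primary_gid, gid, members):
    """Return True if gid is the user's primary group or the user is a listed member."""
    return gid == primary_gid or user in members


def check_user_in_group(uid, gid):
    """Apply the same test using the system passwd and group databases."""
    pw = pwd.getpwuid(uid)   # raises KeyError if the uid is unknown
    gr = grp.getgrgid(gid)   # raises KeyError if the gid is unknown
    return is_static_member(pw.pw_name, pw.pw_gid, gid, gr.gr_mem)


# Values taken from the session above. The empty member list is an assumption:
# prout is not a listed member of ProjectX (gid 5000) before running newgrp.
print(is_static_member("prout", 1000, 5000, []))  # False: job gid 5000 is rejected
print(is_static_member("prout", 1000, 1000, []))  # True: primary group is accepted
```

Under that assumption, a check like this would pass only if the job ran with gid 1000, or if prout were added to ProjectX as a regular member in the group database.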