Dear list,

Am I correct in assuming that if I use munge with Slurm, then only slurmd and munged need to run on the Slurm nodes, and I can keep all Unix users on another server, for instance the one running slurmctld?

For example, I create a user with useradd on that server. When this user runs a job in Slurm, the node recognises his uid and gid with the help of munge, even though this Unix user has not been created on the node itself.

If so, why might I be getting this error:

[2014-11-05T10:24:30.729] launch task 602.0 request from 1007.1007@IP (port 22410)
[2014-11-05T10:24:30.751] error: _send_slurmstepd_init getpwuid_r: No error
[2014-11-05T10:24:30.751] error: Unable to init slurmstepd
[2014-11-05T10:24:30.751] uid 1007 not found on system
[2014-11-05T10:24:30.752] _step_setup: no job returned
[2014-11-05T10:24:30.752] Unable to send "fail" to slurmd
[2014-11-05T10:24:30.752] done with job

The munge keys are exactly the same on the Slurm server and on all nodes.

From the description here (http://linux.die.net/man/7/munge) it looks like all Slurm users are kept in one place. But if that is not the case, what is munge for? Or must a user with the specific uid and gid exist on each node?

Thanks a lot for your help,
Anna
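
The log above shows the lookup of uid 1007 failing in getpwuid_r on the node. A minimal sketch for reproducing that lookup by hand, assuming Python 3 is available on the node; the function name uid_resolves and its lookup parameter are made up for illustration and are not part of Slurm or munge:

```python
import pwd


def uid_resolves(uid, lookup=pwd.getpwuid):
    """Return True if the local user database has an entry for this uid.

    pwd.getpwuid consults the node's configured passwd sources and
    raises KeyError when no entry exists for the uid.
    The lookup parameter is hypothetical and only exists so the check
    can be exercised without depending on the local user database.
    """
    try:
        lookup(uid)
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    # 1007 is the uid from the slurmd log above.
    print("uid 1007 resolves on this node:", uid_resolves(1007))
```

Running this on the node that logged the error and on the server where the user was created with useradd would show whether the two machines see the same user database.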
