Hi Anna,

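I'm sorry to say that the user information has to be available on
all nodes. You cannot run jobs under UIDs that the local system does
not know. Munge only authenticates the UID and GID of the requesting
process between hosts; it does not distribute account information.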
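
To see whether a given node can resolve a UID, you could run a small
check like the one below on each compute node. This is only a sketch:
the function name uid_known is made up for illustration, and the
default UID 1007 is simply the one from your log. pwd.getpwuid is the
standard Python call and uses the system's normal user lookup.

```python
import pwd
import sys


def uid_known(uid):
    """Return True if the local system can resolve this numeric UID."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        # pwd.getpwuid raises KeyError when no matching entry exists.
        return False


if __name__ == "__main__":
    # UIDs to check come from the command line; 1007 (taken from the
    # log in the original mail) is only an illustrative default.
    uids = [int(arg) for arg in sys.argv[1:] if arg.isdigit()] or [1007]
    for uid in uids:
        state = "found" if uid_known(uid) else "NOT found"
        print("uid %d: %s on this node" % (uid, state))
```

If this reports "NOT found" on a compute node, slurmd on that node
will not be able to start jobs for that user.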

If you don't want to distribute your /etc/passwd, /etc/shadow and
/etc/group every time a user is added or removed, the best option would
probably be a centralized user directory, e.g. LDAP.

Regards,

        Uwe

On 05.11.2014 at 20:32, Anna Kostikova wrote:
> 
> Dear list,
> 
> Am I correct in assuming that if I use munge for slurm, then only
> slurmd and munged need to run on all slurm nodes, and I can keep all
> unix users on another server, for instance the one running slurmctld?
> For instance, I create a user with useradd, and when this user runs a
> job in slurm, then, with the help of munge, the slurm node will
> recognise his uid and gid, even though this unix user has not been
> created on that node. If so, why might I be getting this error:
> 
> [2014-11-05T10:24:30.729] launch task 602.0 request from 1007.1007@IP
> (port 22410)
> [2014-11-05T10:24:30.751] error: _send_slurmstepd_init getpwuid_r: No error
> [2014-11-05T10:24:30.751] error: Unable to init slurmstepd
> [2014-11-05T10:24:30.751] uid 1007 not found on system
> [2014-11-05T10:24:30.752] _step_setup: no job returned
> [2014-11-05T10:24:30.752] Unable to send "fail" to slurmd
> [2014-11-05T10:24:30.752] done with job
> The munge keys are exactly the same on the slurm server and all nodes.
> 
> From the description here (http://linux.die.net/man/7/munge) it looks
> like all slurm users are kept in one place. But if not, then why is
> there munge?
> Or must a user with the specific uid and gid exist on each node?
> 
> Thanks a lot for your help,
> Anna
> 
