Hi y'all,

This may be more of a Slurm/MPI installation question than an elasticluster question, but the problem is occurring on a Slurm/MPI cluster I built using elasticluster.

When I add new users using `adduser`, their home directories are created correctly. But when those users connect to a worker (via `srun -N 4 -t 10 --pty bash`), they have a user ID but no username, and no matching entry in /etc/passwd. As a consequence, they cannot run jobs. Everything works fine for the original default user, however.

The specific error is as follows:

    rieffelj@master001:~$ srun -N 4 -t 10 --pty bash
    I have no name!@worker001:~$ mpirun -np 4 ./a.out
    --------------------------------------------------------------------------
    An ORTE daemon has unexpectedly failed after launch and before
    communicating back to mpirun. This could be caused by a number of
    factors, including an inability to create a connection back
    to mpirun due to a lack of common network interfaces and/or no
    route found between them. Please check network connectivity
    (including firewalls and network routing requirements).
    --------------------------------------------------------------------------

Any thoughts on what I'm doing wrong? I'm certain I used this same workflow two years ago without any problems.

jr
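
P.S. In case it helps narrow things down, here is a minimal sketch of a check that could be run on both the master and a worker to compare whether a given user ID has a local /etc/passwd entry. The helper name `uid_in_passwd_file` is hypothetical (it is not part of Slurm, MPI, or elasticluster), and the sketch assumes a POSIX shell with `awk` and `id` available. It only looks at the flat file, so it says nothing about accounts served by other name services.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not from any tool):
# succeeds if the numeric UID in $1 appears in the third field of the
# passwd-format file in $2 (defaults to /etc/passwd).
uid_in_passwd_file() {
    uid="$1"
    file="${2:-/etc/passwd}"
    awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$file"
}

# Report whether the current user's UID has a local entry on this host.
my_uid="$(id -u)"
if uid_in_passwd_file "$my_uid"; then
    echo "uid $my_uid has an entry in /etc/passwd on $(uname -n)"
else
    echo "uid $my_uid has NO entry in /etc/passwd on $(uname -n)"
fi
```

Running this once on master001 and once inside the `srun` session on a worker should show whether the two nodes disagree about the same UID. On glibc systems, `getent passwd "$(id -u)"` is a related check that also consults any other configured name services.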
