Hi Y'all,

This may be more of a Slurm/MPI installation question than an elasticluster
question, but the problem is occurring on a Slurm/MPI cluster I built
using elasticluster.

When I add new users using `adduser`, new home directories are created for
them correctly. But when those users connect to a worker (via
`srun -N 4 -t 10 --pty bash`), they have a UID but no username, and there is
no matching entry in /etc/passwd on the worker. As a consequence, they cannot
run jobs. However, everything works fine for the original default user.
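
A check along these lines should show whether a given UID resolves to a name on whichever node it is run on. It is only a sketch: the function name `uid_has_name` is made up for illustration, and it assumes `getent` is installed on the nodes.

```shell
#!/bin/sh
# Report whether a numeric UID resolves to a username through the local
# name-service databases (/etc/passwd and anything else configured in NSS).
# NOTE: uid_has_name is a hypothetical helper name, not part of Slurm or
# elasticluster.
uid_has_name() {
    uid="$1"
    if entry=$(getent passwd "$uid"); then
        # The first colon-separated field of a passwd entry is the username.
        echo "uid $uid -> ${entry%%:*}"
        return 0
    else
        echo "uid $uid has no passwd entry on $(uname -n)"
        return 1
    fi
}

# Check the invoking user; "|| true" keeps a failed lookup from aborting
# a calling script that uses "set -e".
uid_has_name "$(id -u)" || true
```

Running it once on the master and once inside the `srun -N 4 -t 10 --pty bash` session should show whether the two nodes disagree about the account.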

The specific error is as follows:

rieffelj@master001:~$ srun -N 4 -t 10 --pty bash
I have no name!@worker001:~$ mpirun -np 4 ./a.
I have no name!@worker001:~$ mpirun -np 4 ./a.out
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------

Any thoughts on what I'm doing wrong?  I'm certain I used this same 
workflow two years ago without any problems.


jr

