It's because Open MPI uses a tree-based ssh startup pattern. (Amusingly enough, I'm literally halfway through writing up a blog entry about this exact same issue. :-) )
That is, not only does Open MPI ssh from your mpirun-server to host1, Open MPI may also ssh from host1 to host2 (or host1 to host3).

In short, if you're not using a resource manager (such as Torque or SLURM), then you can't predict the ssh pattern, and you need passwordless/passphraseless ssh logins from each server to each other server.

Make sense?

> On Jan 16, 2015, at 3:29 PM, Chan, Elbert <ec...@csuchico.edu> wrote:
>
> Hi
>
> I'm hoping that someone will be able to help me figure out a problem with
> connecting to multiple nodes with v1.8.4.
>
> Currently, I'm running into this issue:
> $ mpirun --host host1 hostname
> host1
>
> $ mpirun --host host2,host3 hostname
> host2
> host3
>
> Running this command on 1 or 2 nodes generates the expected result. However:
> $ mpirun --host host1,host2,host3 hostname
> Permission denied, please try again.
> Permission denied, please try again.
> Permission denied (publickey,password,keyboard-interactive).
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
> one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
> Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
> Please check with your sys admin to determine the correct location to use.
>
> * compilation of the orted with dynamic libraries when static are required
> (e.g., on Cray). Please check your configure cmd line and consider using
> one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
> lack of common network interfaces and/or no route found between
> them. Please check network connectivity (including firewalls
> and network routing requirements).
> --------------------------------------------------------------------------
>
> This is set up with passwordless logins with passphrases/ssh-agent. When I
> run passphraseless, I get the expected result.
>
> What am I doing wrong? What can I look at to see where my problem could be?
>
> Elbert
>
> --
> ********************************
> Elbert Chan
> Operating Systems Analyst
> College of ECC
> CSU, Chico
> 530-898-6481
> ********************************
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/01/26207.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
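
P.S. Since the ssh pattern can't be predicted, one way to find the missing logins is to test every ordered pair of hosts. Below is a rough sketch of that idea, not anything shipped with Open MPI: the `ssh_pairs` helper and the `HOSTS` variable are made-up names, and the host names are just the ones from the example above. It only prints the check commands, so nothing is contacted until you run them yourself. `-o BatchMode=yes` makes ssh fail instead of prompting for a password or passphrase, and `-n` keeps the outer ssh from reading stdin.

```shell
#!/bin/sh
# Hypothetical helper script: print an ssh check for every ordered host pair.
# HOSTS is an assumed variable name; the defaults are the hosts from the post.
HOSTS="${HOSTS:-host1 host2 host3}"

# Emit one "src dst" line for each ordered pair of distinct hosts.
ssh_pairs() {
    for src in $HOSTS; do
        for dst in $HOSTS; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

# Print (do not run) a nested ssh command per pair: log in to src, and from
# there try to log in to dst without any interactive prompt.
ssh_pairs | while read -r src dst; do
    echo "ssh -n -o BatchMode=yes $src ssh -o BatchMode=yes $dst true"
done
```

Any printed command that fails with "Permission denied" when run from the mpirun server points at a host-to-host login that still needs a password or passphrase. Adding the mpirun server itself to `HOSTS` would cover that machine as well.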