Hi Jeff,

Thank you for picking up on the thread. I followed your suggestion and also
did some research. I think I fixed the issue.
My problem came down to two things:

1) From the master node, I made sure I could open a password-less ssh
connection to the other nodes via the IP address, the hostname, and the
fully qualified domain name (FQDN) of each node.
I noticed that the ssh session by IP address was fine, but I had to
allow/trust the connection by entering “yes” the first time I ran
password-less ssh sessions by hostname and by FQDN.
That’s interesting to me because all the posts, forums, and tutorials I read
only talk about password-less sessions with IP addresses and never mention
hostnames or FQDNs.
Now I wonder: is it because my /etc/hosts file includes the IPs and hostnames
as well as the FQDNs?
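
For reference, ssh looks up host keys in ~/.ssh/known_hosts by the name used
to connect, so an address, a short hostname, and an FQDN that have not been
seen before each get their own first-time prompt. Below is a minimal Python
sketch, not a definitive tool, that lists every address and name found in
/etc/hosts-style text so each one can be connected to once. The sample
address and the node2 names are made up for illustration.

```python
def ssh_targets(hosts_text):
    """Return every address and name listed in /etc/hosts-style text."""
    table = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        table.setdefault(address, []).extend(names)
    targets = []
    for address, names in table.items():
        targets.append(address)
        targets.extend(names)
    return targets


# Hypothetical sample; substitute the contents of the real /etc/hosts.
SAMPLE = """\
127.0.0.1   localhost
10.0.0.2    node2.example.com node2   # compute node
"""

for target in ssh_targets(SAMPLE):
    print(target)
```

Running "ssh <target> true" once from the master node for each printed
target gives a chance to accept the host key under that name.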

2) I had forgotten to set the environment path in the /etc/environment file. I
had to add the following (in bold)

Now I can run: mpirun -hostfile hostsfile -np 72

I can see the 12 cores per node (6 nodes in total) working!!
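
The -np 72 figure is simply 6 nodes times 12 cores. As a rough sketch,
assuming the hostfile uses Open MPI's "slots=" syntax, the following Python
snippet generates such a file and computes the matching process count. The
node1..node6 names are hypothetical placeholders.

```python
def build_hostfile(nodes, slots_per_node):
    """Return hostfile text with one 'host slots=N' line per node."""
    return "".join(f"{node} slots={slots_per_node}\n" for node in nodes)


def total_slots(nodes, slots_per_node):
    """Total number of processes that fit without oversubscribing."""
    return len(nodes) * slots_per_node


NODES = [f"node{i}" for i in range(1, 7)]  # hypothetical node names

print(build_hostfile(NODES, 12), end="")
print("np =", total_slots(NODES, 12))
```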

Thank you Jeff (and Gille) for pointing me in the right direction. Now my
next step is to bind Python to Open MPI.



Eric F. Alemany
System Administrator for Research

Division of Radiation & Cancer Biology
Department of Radiation Oncology

Stanford University School of Medicine
Stanford, California 94305

Tel: 1-650-498-7969  No Texting
