Re: [Wien] dstart_mpi error

2018-07-20 Thread karima Physique
Dear Dr. Gavin Abo, thank you very much for your answers. My system does not have an HFI card, but I will try the second solution. On Fri, 20 Jul 2018 at 02:54, Gavin Abo wrote: > Good to hear that the "unable to get host address" and "unable to connect > to server" errors are gone after you

Re: [Wien] dstart_mpi error

2018-07-19 Thread Gavin Abo
Good to hear that the "unable to get host address" and "unable to connect to server" errors are gone after you fixed the hosts file on each node. Regarding the "no hfi units are available" error, if your system has Intel OP HFI cards, then maybe they just need to be configured to work [

Re: [Wien] dstart_mpi error

2018-07-19 Thread Laurence Marks
As I said, this is in your IB (or similar) fabric. On Thu, Jul 19, 2018 at 11:54 AM, karima Physique wrote: > Dear Prof. Laurence Marks > > I note that I am using the latest version of the Intel compilers (Intel > Parallel Studio Cluster Edition) > I read about the possible solution but I did

Re: [Wien] dstart_mpi error

2018-07-19 Thread karima Physique
Dear Prof. Laurence Marks, I note that I am using the latest version of the Intel compilers (Intel Parallel Studio Cluster Edition). I read about the possible solution but I did not find a solution related to Intel. Do you have any solution for this problem? On Thu, 19 Jul 2018 at 16:02,

Re: [Wien] dstart_mpi error

2018-07-19 Thread Laurence Marks
See https://www.google.com/search?q=no+hfi+units+are+available+(err%3D23)=no+hfi+units+are+available+(err%3D23)=chrome..69i57.481j0j4=chrome=UTF-8 This appears to be an issue with your local MPI/fabric. On Thu, Jul 19, 2018 at 8:03 AM, karima Physique wrote: > Dear Dr. Gavin Abo, > actually,

Re: [Wien] dstart_mpi error

2018-07-19 Thread karima Physique
Dear Dr. Gavin Abo, actually, the problem was solved by adding the hostname to the hosts file on all the nodes, not only on the master node. Now the calculation works very well, but at each execution of LAPW0 in the SCF cycle I get this error, which does not affect the calculations: "calcul.23539PSM2
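
The fix described above (every node name must appear in the hosts file on every node, not only on the master) can be checked with a small script. The following is only a sketch: the function names `parse_hosts` and `missing_nodes`, the sample addresses, and the node names `node1` and `node2` are hypothetical and do not come from the thread. It parses text in the usual hosts-file layout (an address followed by one or more names) and reports which expected node names have no entry.

```python
def parse_hosts(text):
    """Return a dict mapping each hostname or alias to its IP address.

    Assumes the usual hosts-file layout: "IP name [alias ...]",
    with "#" starting a comment.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # first entry wins
    return table


def missing_nodes(hosts_text, node_names):
    """Return the node names that have no entry in the hosts text."""
    table = parse_hosts(hosts_text)
    return [name for name in node_names if name not in table]


if __name__ == "__main__":
    # Hypothetical hosts-file content; on a real node one would read
    # the node's own hosts file instead.
    sample = "127.0.0.1 localhost\n192.168.1.10 node1  # master\n"
    print(missing_nodes(sample, ["node1", "node2"]))
```

Running the same check on each node, against that node's own hosts file, would show whether any machine is missing an entry for one of the others.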

Re: [Wien] dstart_mpi error

2018-07-19 Thread karima Physique
Thank you, Dr. Gavin Abo. I checked the /etc/hosts file and it is OK, but why does lapw1_mpi work fine on all the nodes while dstart_mpi and lapw0_mpi do not work on the nodes? On Thu, 19 Jul 2018 at 04:23, Gavin Abo wrote: > As the error message says, one possible cause is the connection

Re: [Wien] dstart_mpi error

2018-07-18 Thread Gavin Abo
As the error message says, one possible cause is the connection being blocked by a firewall. Another possible cause is an SSH passwordless access problem: https://stackoverflow.com/questions/19565795/unable-to-execute-mpich2-on-multiple-machines-on-ubuntu-12-04-hydu-sock-connect Yet another
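
One rough way to narrow down the firewall possibility mentioned above is to test whether a TCP connection to a given port on another node can be opened at all. The sketch below uses only Python's standard `socket` module; the function name `port_open` is hypothetical, and which host and port to probe (for example the SSH port, commonly 22) is an assumption that depends on the local setup. A successful connection does not prove that MPI traffic will pass, since the MPI launcher may use other ports.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Any OSError (connection refused, timeout, failed name lookup)
    is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_open("node2", 22)` called from the master node (with `node2` standing in for a real node name) would return False if the name cannot be resolved or if the connection is refused or filtered.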