Dear Ralph Castain,

Thank you for your reply!!!
Actually, I had already adjusted my /etc/security/limits.conf file and raised the "soft nofile" and "hard nofile" values to 65535. Over the past few days I tried other limit settings as well, including "soft memlock", "hard memlock", and /proc/sys/fs/file-max, but it still didn't work...

In the end, I figured my "soft nofile" and "hard nofile" values might still not be large enough. After I raised those values, it finally works!!! lol

Thank you again for your suggestion!!! It was very useful!!! :))))

On Sun, Jun 2, 2013 at 10:38 PM, Ralph Castain <r...@open-mpi.org> wrote:

> I would suspect you are hitting limits on the number of open sockets -
> check your limits settings on file descriptors.
>
> On Jun 1, 2013, at 11:43 AM, vacate <vacatehop...@gmail.com> wrote:
>
> Hello everybody,
>
> I'm a beginner in the Open MPI world.
> Maybe it's a simple problem, but I cannot figure out what is happening...
>
> My situation is:
> I use 4 hosts in total, and their IP addresses are static.
> I can't run *mpirun* more than about 1500 times at almost the same time
> (but it's always okay with fewer than 1000 times).
> I get many "*ssh_exchange_identification: connection closed by remote host*" errors.
>
> --------------------------------------------------------------------------
> My Open MPI version : 1.6.2
> --------------------------------------------------------------------------
> I use a simple bash shell script to run my Open MPI program (named
> openMPI_test).
> Here is my script content:
>
> for (( index=0; index<2000 ; index++))
> do
>     (time mpirun --hostfile my_hostfile openMPI_test &) >> file 2>&1
> done
>
> p.s.1 The "my_hostfile" file lists the 4 hosts' IP addresses.
> p.s.2 The "openMPI_test" program asks each host to do the same thing, that is:
>
> if(rank == 0){ for(i=0 ; i<65535 ; i++) temp = i/(i+1); }
> else if(rank == 1){ for(i=0 ; i<65535 ; i++) temp = i/(i+1); }
> else if(rank == 2){ for(i=0 ; i<65535 ; i++) temp = i/(i+1); }
> else if(rank == 3){ for(i=0 ; i<65535 ; i++) temp = i/(i+1); }
>
> --------------------------------------------------------------------------
>
> At first, I thought I had some system problem,
> so I tried modifying my /etc/ssh/sshd_config file.
> I set MaxSessions to 65535 and MaxStartups to 65535,
> but the result made me so sad because it still didn't work :((
>
> I'm not sure whether there are some settings or limits in Open MPI,
> or whether I just have some other system problem?
>
> I really hope someone can help me..
> Thank you all very, very much!!
>
> Best Wishes,
> Jen
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
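
For anyone checking the same thing: a minimal sketch for confirming which open-file ("nofile") limits a shell actually has before launching many mpirun processes. The helper name nofile_at_least is hypothetical, and 65535 is only the value mentioned earlier in this thread; the value that finally worked is not stated, so treat the threshold as an assumption to adjust.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): succeed if the soft
# open-file limit of this shell is at least the number given in $1.
nofile_at_least() {
    required=$1
    current=$(ulimit -S -n)
    # A limit reported as "unlimited" satisfies any requirement.
    if [ "$current" = "unlimited" ]; then
        return 0
    fi
    [ "$current" -ge "$required" ]
}

# Show the limits the current shell is running with.
echo "soft nofile: $(ulimit -S -n)"
echo "hard nofile: $(ulimit -H -n)"

# 65535 is an assumed threshold taken from the thread, not a known-good value.
if nofile_at_least 65535; then
    echo "soft nofile is at least 65535"
else
    echo "soft nofile is below 65535"
fi
```

Changes to /etc/security/limits.conf normally apply only to new login sessions, so it is worth running this in the same kind of session (for example, over ssh on each host) that will start the jobs.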