I don't think so.
It is always the 66th node that fails, even if I swap the 65th and 66th hosts.
I also get the same error when setting np=66 with only 65 hosts in the
hostfile.
(I am using only the tcp btl.)


Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:    +972 74 712 9244
Mobile:  +972 54 554 0233
Fax:        +972 72 257 9400

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Monday, August 11, 2014 1:07 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI fails with np > 65

Looks to me like your 65th host is missing the dstore library - is it possible 
you don't have your paths set correctly on all hosts in your hostfile?
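
One way to check the path question above, as a minimal sketch only: print the library path that a non-interactive remote shell sees on each host in the hostfile, then compare the lines. The function name check_hosts and the REMOTE variable are hypothetical names introduced here, not part of Open MPI. The sketch assumes passwordless ssh to each host and a hostfile whose first field on each line is the host name.

```shell
#!/bin/sh
# Sketch: show the LD_LIBRARY_PATH seen by a non-interactive shell on each host.
# check_hosts and REMOTE are illustrative names, not Open MPI features.
check_hosts() {
    hostfile=$1
    remote=${REMOTE:-ssh}   # command used to reach a host (assumed: ssh)
    # Keep the first field of each line (e.g. "node01 slots=4" -> "node01"),
    # skipping blank lines, comment lines, and repeated host names.
    awk 'NF && $1 !~ /^#/ && !seen[$1]++ { print $1 }' "$hostfile" |
    while read -r host; do
        printf '%s: ' "$host"
        # stdin is redirected so the remote command cannot consume the host list.
        $remote "$host" 'echo "$LD_LIBRARY_PATH"' </dev/null || echo "UNREACHABLE"
    done
}
```

Calling check_hosts with the path to the hostfile prints one line per host. A host whose line differs from the others, or reads UNREACHABLE, would be the first place to look.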


On Aug 10, 2014, at 1:13 PM, Lenny Verkhovsky 
<len...@mellanox.com<mailto:len...@mellanox.com>> wrote:


Hi all,

While trying to run OpenMPI (trunk revision 32428), I ran into a problem
running OMPI with more than 65 procs.
It looks like MPI fails to open the 66th connection, even when just running
`hostname` over tcp.
It also seems to be unrelated to any specific host.
All hosts run Ubuntu 12.04.1 LTS.

mpirun -np 66 --hostfile /proj/SSA/Mellanox/tmp//20140810_070156_hostfile.txt --mca btl tcp,self hostname
[nodename] [[4452,0],65] ORTE_ERROR_LOG: Error in file base/ess_base_std_orted.c at line 288

.......................................
It looks like an environment issue, but I can't find any related limit.
Any ideas?
Thanks.
Lenny Verkhovsky

_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/08/24961.php