Lenny,

Thanks for the info. It still doesn't seem to be working. My command line is:
/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -H d2-ib,d3-ib -mca btl openib,self -mca btl_openib_of_pkey_val 33033 /cluster/pallas/x86_64-ib/IMB-MPI1

I don't have a "/sys/class/infiniband/mthca0/ports/1/pkeys/" but I do have "/sys/class/infiniband/mlx4_0/ports/1/pkeys/". Its contents are:

0    106  114  122  16  24  32  40  49  57  65  73  81  9   98
1    107  115  123  17  25  33  41  5   58  66  74  82  90  99
10   108  116  124  18  26  34  42  50  59  67  75  83  91
100  109  117  125  19  27  35  43  51  6   68  76  84  92
101  11   118  126  2   28  36  44  52  60  69  77  85  93
102  110  119  127  20  29  37  45  53  61  7   78  86  94
103  111  12   13   21  3   38  46  54  62  70  79  87  95
104  112  120  14   22  30  39  47  55  63  71  8   88  96
105  113  121  15   23  31  4   48  56  64  72  80  89  97

We aren't using OpenSM, but Voltaire's SM on a 2012 switch.

Thanks again,
Matt

On Tue, Oct 7, 2008 at 9:37 AM, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:

> Hi Matt,
>
> It seems that the right way to do it is the following:
>
> -mca btl openib,self -mca btl_openib_ib_pkey_val 33033
>
> where the value is the decimal form of the pkey (in your case 0x8109 =
> 33033); there is no need for a btl_openib_ib_pkey_ix value.
>
> ex.
>
> mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
> btl_openib_ib_pkey_val 32769 ./mpi_p1_4_1_2 -t lt
> LT (2) (size min max avg) 1 3.511429 3.511429 3.511429
>
> If it's not working, check cat /sys/class/infiniband/mthca0/ports/1/pkeys/*
> for the pkeys and the SM; maybe it's a setup issue.
>
> Pasha is currently checking this issue.
>
> Best regards,
>
> Lenny.
>
>
> On 10/7/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> FWIW, if this configuration is for all of your users, you might want to
>> specify these MCA params in the default MCA param file, or the environment,
>> ...etc. Just so that you don't have to specify it on every mpirun command
>> line.
>>
>> See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
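A minimal sketch of the two approaches mentioned above (environment variables and a per-user MCA parameter file), in case it helps. It assumes the parameter name btl_openib_ib_pkey_val and the value 33033 from Lenny's message, and the per-user file location $HOME/.openmpi/mca-params.conf described in the Open MPI FAQ. The helper name write_mca_params is hypothetical and only for illustration.

```shell
#!/bin/sh
# Option 1: environment variables. Open MPI reads OMPI_MCA_<name> as the
# MCA parameter <name>. The values below are the ones quoted in this thread.
export OMPI_MCA_btl=openib,self
export OMPI_MCA_btl_openib_ib_pkey_val=33033

# Option 2: a parameter file. write_mca_params is a hypothetical helper that
# appends the same two settings to <dir>/mca-params.conf.
write_mca_params() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    cat >> "$conf_dir/mca-params.conf" <<'EOF'
btl = openib,self
btl_openib_ib_pkey_val = 33033
EOF
}
```

For a single user this might be called as write_mca_params "$HOME/.openmpi"; with either option in place, the -mca arguments could then be left off the mpirun command line.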
>>
>> On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:
>>
>>> Sorry, misunderstood the question,
>>>
>>> thanks to Pasha, the right command line will be
>>>
>>> -mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca
>>> btl_openib_of_pkey_ix 1
>>>
>>> ex.
>>>
>>> #mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
>>> btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK
>>> -t lt
>>> LT (2) (size min max avg) 1 3.443480 3.443480 3.443480
>>>
>>>
>>> Best regards
>>>
>>> Lenny.
>>>
>>>
>>> On 10/6/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>>> On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:
>>>
>>> you should probably use -mca tcp,self -mca btl_openib_if_include
>>> ib0.8109
>>>
>>>
>>> Really? I thought we only took OpenFabrics device names in the
>>> openib_if_include MCA param...? It looks like ib0.8109 is an IPoIB device
>>> name.
>>>
>>>
>>> Lenny.
>>>
>>>
>>> On 10/3/08, Matt Burgess <burgess.m...@gmail.com> wrote:
>>> Hi,
>>>
>>>
>>> I'm trying to get openmpi working over openib partitions. On this
>>> cluster, the partition number is 0x109.
>>> The ib interfaces are pingable over
>>> the appropriate ib0.8109 interface:
>>>
>>> d2:/opt/openmpi-ib # ifconfig ib0.8109
>>> ib0.8109 Link encap:UNSPEC HWaddr
>>> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>>> inet addr:10.21.48.2 Bcast:10.21.255.255 Mask:255.255.0.0
>>> inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>>> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
>>> RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>>> collisions:0 txqueuelen:256
>>> RX bytes:102229428 (97.4 Mb) TX bytes:102324172 (97.5 Mb)
>>>
>>>
>>> I have tried the following:
>>>
>>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>>> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>>> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
>>>
>>> but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am
>>> missing?
>>>
>>> I was successful using tcp only:
>>>
>>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>>> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>>> /cluster/pallas/x86_64-ib/IMB-MPI1
>>>
>>>
>>> Thanks,
>>> Matt Burgess
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
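The pkey arithmetic used in this thread (partition number 0x109, pkey 0x8109, decimal 33033) and the sysfs check can be sketched roughly as below. This is only an illustration under stated assumptions: the function names full_pkey, pkey_decimal and find_pkey_index are hypothetical, the 0x8000 bit is taken to be the InfiniBand full-membership bit, and each file under the pkeys directory is assumed to hold one hex value such as 0xffff.

```shell
#!/bin/sh
# Hypothetical helper: set the full-membership bit (0x8000) on a partition
# number, e.g. 0x109 becomes 0x8109.
full_pkey() {
    printf '0x%04x\n' $(( $1 | 0x8000 ))
}

# Hypothetical helper: print a pkey in decimal, the form Lenny's later
# message passes to btl_openib_ib_pkey_val.
pkey_decimal() {
    printf '%d\n' $(( $1 ))
}

# Hypothetical helper: print the index (file name) of the first entry in a
# sysfs pkeys directory whose value equals the wanted pkey.
# Returns non-zero when the pkey is not in the table.
find_pkey_index() {
    pkeys_dir=$1
    want=$(( $2 ))
    for f in "$pkeys_dir"/*; do
        [ -f "$f" ] || continue
        val=$(cat "$f")
        [ -n "$val" ] || continue
        if [ $(( $val )) -eq "$want" ]; then
            basename "$f"
            return 0
        fi
    done
    return 1
}
```

For the setup described above, a call might look like find_pkey_index /sys/class/infiniband/mlx4_0/ports/1/pkeys 0x8109. If nothing is printed, the port's table does not contain that pkey, which would point at the SM or partition configuration rather than at the mpirun options.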