Oops, forwarding to the right list. Any help with this issue would be appreciated.
Thanks

> Hello,
>
> I am running into the following issue while trying to run osu_latency:
>
> --
> -bash-3.2$ mpiexec --mca btl openib,self -mca btl_openib_warn_default_gid_prefix 0 -np 2 --hostfile mpihosts /home/jagga/osu-micro-benchmarks-3.3/openmpi/ofed-1.5.2/bin/osu_latency
> # OSU MPI Latency Test v3.3
> # Size            Latency (us)
> [amber04][[10252,1],1][connect/btl_openib_connect_oob.c:325:qp_connect_all]
> error modifing QP to RTR errno says Invalid argument
> [amber04][[10252,1],1][connect/btl_openib_connect_oob.c:815:rml_recv_cb]
> error in endpoint reply start connect
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 1 with PID 6781 on
> node amber04 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
> --
>
> I can get around this by adding the "--mca btl_openib_cpc_include rdmacm"
> option. However, I have another host with a different HCA, but with the
> same drivers and software versions, on which this same command runs
> successfully without the rdmacm option. What could be causing one of my
> environments to fail while the other works fine (without the rdmacm option)?
>
> --
> [root@amber03 ~]# ofed_info | grep OFED
> MLNX_OFED_LINUX-1.5.2-1.0.0 (OFED-1.5.2-20101020-1520):
> MLNX_OFED_LINUX-1.5.2-1.0.0
> (/mswg/release/ofed-1.5.2-rpms/rnfs-utils/rnfs-utils-1.1.5-10.OFED.src.rpm):
>
> [root@amber03 ~]# ibv_devinfo
> hca_id: mlx4_0
>         transport:              InfiniBand (0)
>         fw_ver:                 2.7.9294
>         node_guid:              78e7:d103:0021:8884
>         sys_image_guid:         78e7:d103:0021:8887
>         vendor_id:              0x02c9
>         vendor_part_id:         26438
>         hw_ver:                 0xB0
>         board_id:               HP_0200000003
>         phys_port_cnt:          2
>                 port:   1
>                         state:          PORT_ACTIVE (4)
>                         max_mtu:        2048 (4)
>                         active_mtu:     2048 (4)
>                         sm_lid:         1
>                         port_lid:       20
>                         port_lmc:       0x00
>                         link_layer:     IB
>
>                 port:   2
>                         state:          PORT_ACTIVE (4)
>                         max_mtu:        2048 (4)
>                         active_mtu:     1024 (3)
>                         sm_lid:         0
>                         port_lid:       0
>                         port_lmc:       0x00
>                         link_layer:     Ethernet
> --
>
> Any help would be greatly appreciated.
>
> Thanks,
> -J
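
For reference, the workaround described in the quoted message can be sketched as below. This is a minimal sketch that assumes the rdmacm option is simply added to the original command line. The variable names `BENCH` and `CMD` are made up for illustration, and the snippet only prints the command rather than launching it.

```shell
# Sketch only: assemble the workaround command and print it; nothing is launched.
# BENCH and CMD are illustrative names, not part of the original report.
BENCH=/home/jagga/osu-micro-benchmarks-3.3/openmpi/ofed-1.5.2/bin/osu_latency

# Same command as in the report, plus "--mca btl_openib_cpc_include rdmacm".
CMD="mpiexec --mca btl openib,self --mca btl_openib_cpc_include rdmacm -mca btl_openib_warn_default_gid_prefix 0 -np 2 --hostfile mpihosts $BENCH"

echo "$CMD"
```

Running the printed command on both hosts, with and without the added option, would show whether the failure is specific to the default connection method on amber04.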
