Oops, forwarding to the right list.  Any help with this issue would be 
appreciated.

Thanks

> Hello,
> 
> I am running into the following issue while trying to run osu_latency:
> 
> --
> -bash-3.2$ mpiexec --mca btl openib,self -mca btl_openib_warn_default_gid_prefix 0 -np 2 --hostfile mpihosts /home/jagga/osu-micro-benchmarks-3.3/openmpi/ofed-1.5.2/bin/osu_latency
> # OSU MPI Latency Test v3.3
> # Size            Latency (us)
> [amber04][[10252,1],1][connect/btl_openib_connect_oob.c:325:qp_connect_all] error modifing QP to RTR errno says Invalid argument
> [amber04][[10252,1],1][connect/btl_openib_connect_oob.c:815:rml_recv_cb] error in endpoint reply start connect
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 1 with PID 6781 on
> node amber04 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
> --
> 
> I can get around this by adding the "--mca btl_openib_cpc_include rdmacm" 
> option.  However, I have another host with a different HCA, but with all the 
> same drivers and software versions, on which this same command runs 
> successfully without the rdmacm option.  What could be causing one of my 
> environments to fail while the other works fine?
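
For reference, here is a sketch of the workaround invocation, i.e. the failing
command with only the connection-manager option added.  The rank count, host
file and benchmark path are copied from the run above; the variable names
(OSU_LATENCY, CPC_OPTS, WORKAROUND_CMD) are only illustrative.

```shell
# Benchmark binary, as used in the failing run.
OSU_LATENCY=/home/jagga/osu-micro-benchmarks-3.3/openmpi/ofed-1.5.2/bin/osu_latency

# The only difference from the failing run: select the rdmacm connection manager.
CPC_OPTS="--mca btl_openib_cpc_include rdmacm"

# Assemble the full command line (paths contain no spaces, so a plain string is enough).
WORKAROUND_CMD="mpiexec --mca btl openib,self -mca btl_openib_warn_default_gid_prefix 0 $CPC_OPTS -np 2 --hostfile mpihosts $OSU_LATENCY"

# Print the command for review.  To launch it, run the printed line,
# or use: $WORKAROUND_CMD   (unquoted, relying on word splitting)
echo "$WORKAROUND_CMD"
```

Dropping CPC_OPTS from the string gives back the original failing command,
which makes it easy to run both variants back to back on each host.
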
> 
> --
> [root@amber03 ~]# ofed_info | grep OFED
> MLNX_OFED_LINUX-1.5.2-1.0.0 (OFED-1.5.2-20101020-1520):
> MLNX_OFED_LINUX-1.5.2-1.0.0 (/mswg/release/ofed-1.5.2-rpms/rnfs-utils/rnfs-utils-1.1.5-10.OFED.src.rpm):
> 
> [root@amber03 ~]# ibv_devinfo 
> hca_id:    mlx4_0
>     transport:            InfiniBand (0)
>     fw_ver:                2.7.9294
>     node_guid:            78e7:d103:0021:8884
>     sys_image_guid:            78e7:d103:0021:8887
>     vendor_id:            0x02c9
>     vendor_part_id:            26438
>     hw_ver:                0xB0
>     board_id:            HP_0200000003
>     phys_port_cnt:            2
>         port:    1
>             state:            PORT_ACTIVE (4)
>             max_mtu:        2048 (4)
>             active_mtu:        2048 (4)
>             sm_lid:            1
>             port_lid:        20
>             port_lmc:        0x00
>             link_layer:        IB
> 
>         port:    2
>             state:            PORT_ACTIVE (4)
>             max_mtu:        2048 (4)
>             active_mtu:        1024 (3)
>             sm_lid:            0
>             port_lid:        0
>             port_lmc:        0x00
>             link_layer:        Ethernet
> --
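
To make it easier to compare the two hosts port by port, a small filter along
these lines could condense the ibv_devinfo output to one line per port.  This
is only a sketch: summarize_ports is a made-up helper name, and it assumes the
field layout shown above, with link_layer printed after state for each port.

```shell
# Condense ibv_devinfo output (read from stdin) to: port number, state, link layer.
# Assumes "port:", "state:" and "link_layer:" appear in that order for each port,
# as in the output quoted above.
summarize_ports() {
    awk '
        $1 == "port:"       { port = $2 }                    # remember current port number
        $1 == "state:"      { state = $2 }                   # e.g. PORT_ACTIVE
        $1 == "link_layer:" { print "port " port ": " state " " $2 }
    '
}

# Usage, on each host:
#   ibv_devinfo | summarize_ports
```

Running the same filter on the working host and diffing the two results would
show at a glance whether the port configuration differs between them.
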
> 
> Any help would be greatly appreciated.
> 
> Thanks,
> -J
_______________________________________________
ewg mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
