3.0
Boris Shpolyansky Application Engineer Mellanox Technologies Inc. 2900 Stender Way Santa Clara, CA 95054 Tel.: (408) 916 0014 Fax: (408) 970 3403 Cell: (408) 834 9365 www.mellanox.com ----- Original Message ----- From: Scott Weitzenkamp (sweitzen) <[EMAIL PROTECTED]> To: Boris Shpolyansky; [email protected] <[email protected]> Sent: Sat Mar 10 14:45:25 2007 Subject: RE: [ofa-general] uDAPL question What version of Intel MPI are you using? ________________________________ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Boris Shpolyansky Sent: Friday, March 09, 2007 8:40 PM To: [email protected] Subject: [ofa-general] uDAPL question Hi, I'm trying to get simple Intel MPI benchmark running over IB (uDAPL) using OFED-1.1 stack. I'm consistently getting the following error: [EMAIL PROTECTED] ~]# ./runjob_I_MPI.boris 2 Task 0 of 2 tasks started on host ibd005.ibd.mti.com clock_resolution = 1.00e-06 s Task 1 of 2 tasks started on host ibd006.ibd.mti.com [0:ibd005] unexpected DAPL event 4006 from 1:ibd006 [1:ibd006] unexpected DAPL event 4006 from 0:ibd005 rank 0 in job 14 ibd005_36193 caused collective abort of all ranks exit status of rank 0: return code 254 I did some digging and found out that event 4006 (actually 0x4006) means DAT_CONNECTION_EVENT_BROKEN and it is returned by function dat_rmr_bind. So my question is why this function consistently fails. I'm using standard dat.conf file: OpenIB-cma u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so mv_dapl.1.2 "ib0 0" "" Appreciate your help, Boris Shpolyansky
_______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
