On Thu, 3 Jul 2008, Pavel Shamis (Pasha) wrote:

I had a similar issue recently. It would be nice to have an option to disable/enable *CM via config flags.

Not sure if this is related... I am looking at using a nightly 1.3 snapshot and I get this type of error message when running:

[n020205][[36506,1],0][connect/btl_openib_connect_ibcm.c:723:ibcm_component_query]
 failed to open IB CM device: /dev/infiniband/ucm0

which is actually right, as /dev/infiniband on the nodes doesn't contain ucm0. On the same cluster, OpenMPI 1.2.7rc2 works fine; the configure options for building them are the same.
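For anyone who wants to confirm the same condition on their own nodes, here is a minimal sketch in Python of the check described above. It only tests whether the device node named in the error message exists; the function name and its default arguments are my own illustration, not anything provided by Open MPI or OFED.

```python
import os


def ibcm_device_present(dev_dir="/dev/infiniband", name="ucm0"):
    """Return True if the IB CM device node exists under dev_dir.

    The defaults mirror the path from the error message above; the
    function itself is a hypothetical helper, not part of any library.
    """
    return os.path.exists(os.path.join(dev_dir, name))


if __name__ == "__main__":
    # On the nodes described in this message this would report False.
    print("ucm0 present:", ibcm_device_present())
```

Running this on each compute node should show quickly whether the missing device is specific to some nodes or common to the whole cluster.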

For the binary linked with 1.3a, the output of ldd shows:

libibcm.so.1 => /opt/ofed/1.2.5.4/lib64/libibcm.so.1

while this is missing from the binary linked with 1.2. Even after printing these messages, the binary linked with 1.3a works; it works even when I specify "--mca btl openib,self" so I think that the IB stack is still being used (there is also a TCP/GigE network which could be chosen otherwise).
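The ldd comparison above can be scripted if several binaries need checking. The following is a rough sketch in Python, assuming the usual `name => path (address)` layout of ldd output; the helper names and the binary path `./my_mpi_app` are hypothetical.

```python
import subprocess


def linked_libs(ldd_output):
    """Parse ldd output into a dict of {library name: resolved path or None}."""
    libs = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # skip lines without a resolution arrow (e.g. the loader)
        name, _, rest = line.partition("=>")
        parts = rest.split()
        # Only absolute paths count; "not found" or a bare address maps to None.
        path = parts[0] if parts and parts[0].startswith("/") else None
        libs[name.strip()] = path
    return libs


def has_libibcm(ldd_output):
    """Return True if any libibcm shared object appears in the ldd output."""
    return any(name.startswith("libibcm.so") for name in linked_libs(ldd_output))


def run_ldd(binary):
    """Run ldd on a binary and return its standard output as text."""
    result = subprocess.run(["ldd", binary], capture_output=True,
                            text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # "./my_mpi_app" is a placeholder for an MPI binary built on your system.
    print("links libibcm:", has_libibcm(run_ldd("./my_mpi_app")))
```

This only reports what the dynamic linker would load; it says nothing about which connection method is actually selected at run time.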

I don't know whether this is caused by a somewhat inconsistent setup of the system, but I would welcome an option to make 1.3a behave like 1.2.

--
Bogdan Costescu

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
E-mail: bogdan.coste...@iwr.uni-heidelberg.de
