The ompi_info command shows the following description for the
"btl_openib_max_btls" parameter:
MCA btl: parameter "btl_openib_max_btls" (current value: "-1")  Maximum
number of HCA ports to use (-1 = use all available, otherwise must be >= 1)

Even though I specify "mpirun --mca btl_openib_max_btls 1 .....", 2 openib
BTLs are created (the HCA has 2 ports).
This causes a problem when I run Open MPI across 2 nodes where one node has
an HCA with 2 ports and the other has only one port. Both endpoints on the
2-port node send their QP information over to the peer, but only one
endpoint exists at the peer, so it prints the following error messages:
[0,1,1][btl_openib_endpoint.c:706:mca_btl_openib_endpoint_recv] can't find
suitable endpoint for this peer

[0,1,0][btl_openib_endpoint.c:913:mca_btl_openib_endpoint_connect] error
posting receive errno says Operation now in progress

[0,1,0][btl_openib_endpoint.c:737:mca_btl_openib_endpoint_recv] endpoint
connect error: -1

Is "btl_openib_max_btls" meant to be the maximum total number of BTLs, or
the maximum number of BTLs per port (which is what the current
implementation in "init_one_hca()" appears to do)?
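
To make the two readings concrete, here is a small sketch in C. It is not
the actual Open MPI code: the function names and the "candidates_per_port"
argument are made up for illustration, and I am assuming one candidate BTL
per active port. It only shows how the same value of the parameter gives a
different BTL count depending on where the cap is applied.

```c
#include <assert.h>

/* Clamp n to cap; a negative cap means "no limit", mirroring the
 * "-1 = use all available" convention in the parameter description. */
static int clamp_to_cap(int n, int cap)
{
    return (cap < 0 || n < cap) ? n : cap;
}

/* Reading A (hypothetical): max_btls limits the total number of BTLs
 * created, no matter how many ports the HCA has. */
int btls_with_global_cap(int num_ports, int candidates_per_port, int max_btls)
{
    assert(num_ports >= 0 && candidates_per_port >= 0);
    return clamp_to_cap(num_ports * candidates_per_port, max_btls);
}

/* Reading B (hypothetical): max_btls is applied to each port on its own,
 * so every port can still contribute up to max_btls BTLs. */
int btls_with_per_port_cap(int num_ports, int candidates_per_port, int max_btls)
{
    assert(num_ports >= 0 && candidates_per_port >= 0);
    return num_ports * clamp_to_cap(candidates_per_port, max_btls);
}
```

With 2 ports, one candidate BTL per port, and max_btls set to 1, reading A
gives 1 BTL while reading B gives 2, which matches what I see on the 2-port
node.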

-Nysal
