Hi Marcin,

2015-09-30 9:19 GMT-06:00 marcin.krotkiewski <marcin.krotkiew...@gmail.com>:

> Thank you, and Jeff, for clarification.
>
> Before I bother you all more without the need, I should probably say I was
> hoping to use libfabric/OpenMPI on an InfiniBand cluster. Somehow now I
> feel I have confused this altogether, so maybe I should go one step back:
>
>  1. libfabric is hardware independent, and does support Infiniband, right?
>

The short answer is yes: libfabric is hardware independent (and, on good
days, it works on OS X as well as Linux).
The longer answer is that the amount of work that has gone into the
providers (the plugins that let libfabric talk to different networks)
varies from one network to another.

There is a sockets provider.  That gets a good amount of attention because
it's a base reference provider.
psm/psm2 providers are available.  I have used the psm provider a bit on a
truescale cluster.  It doesn't offer better performance than just using
psm directly, but it does appear to work.

There is an mxm provider, but it was not implemented by Mellanox, and I
can't get it to compile on my ConnectX-3 system using mxm 1.5.

There is a vanilla verbs provider, but it doesn't support the FI_EP_RDM
endpoint type, which is what the available non-Cisco component of
Open MPI (the ofi mtl) uses.

When you build and install libfabric, there should be an fi_info binary
installed in $(LIBFABRIC_INSTALL_DIR)/bin.
On my truescale cluster the output is:

psm: psm
    version: 0.9
    type: FI_EP_RDM
    protocol: FI_PROTO_PSMX
verbs: IB-0x80fe
    version: 1.0
    type: FI_EP_MSG
    protocol: FI_PROTO_RDMA_CM_IB_RC
sockets: IP
    version: 1.0
    type: FI_EP_MSG
    protocol: FI_PROTO_SOCK_TCP
sockets: IP
    version: 1.0
    type: FI_EP_DGRAM
    protocol: FI_PROTO_SOCK_TCP
sockets: IP
    version: 1.0
    type: FI_EP_RDM
    protocol: FI_PROTO_SOCK_TCP

To use mtl/ofi, a provider needs, at a minimum, to support the FI_EP_RDM
endpoint type (see above).  Note that on the truescale cluster the verbs
provider is built, but it only supports the FI_EP_MSG endpoint type, so
mtl/ofi can't use it.
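
If you want to script that check, here is a minimal Python sketch.  It
assumes the fi_info output looks like what I pasted above (other libfabric
versions may print something different), and the names SAMPLE and
providers_with_endpoint are just made up for illustration:

```python
# Sketch only: parses text in the fi_info format shown above.
# SAMPLE is a copy of the output from the truescale cluster.
SAMPLE = """\
psm: psm
    version: 0.9
    type: FI_EP_RDM
    protocol: FI_PROTO_PSMX
verbs: IB-0x80fe
    version: 1.0
    type: FI_EP_MSG
    protocol: FI_PROTO_RDMA_CM_IB_RC
sockets: IP
    version: 1.0
    type: FI_EP_MSG
    protocol: FI_PROTO_SOCK_TCP
sockets: IP
    version: 1.0
    type: FI_EP_DGRAM
    protocol: FI_PROTO_SOCK_TCP
sockets: IP
    version: 1.0
    type: FI_EP_RDM
    protocol: FI_PROTO_SOCK_TCP
"""


def providers_with_endpoint(fi_info_text, ep_type="FI_EP_RDM"):
    """Return provider names (in order, no repeats) that report ep_type."""
    found = []
    current = None
    for line in fi_info_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between entries
        key, _, value = line.partition(":")
        if not line[0].isspace():
            # Unindented line starts a new entry: "<provider>: <fabric>"
            current = key.strip()
        elif key.strip() == "type" and value.strip() == ep_type:
            if current is not None and current not in found:
                found.append(current)
    return found


if __name__ == "__main__":
    # Providers that mtl/ofi could use, per the FI_EP_RDM requirement
    print(providers_with_endpoint(SAMPLE))
```

On the output above this reports psm and sockets, and leaves out verbs,
which matches what I described: verbs only shows up with FI_EP_MSG.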



>  2. I read that OpenMPI provides interface to libfabric through btl/usnic
> and mtl/ofi.  can any of those use libfabric on Infiniband networks?
>

If you have Intel truescale or its follow-on, then the answer is yes,
although by default Open MPI uses mtl/psm on that network.



>
> Please forgive my ignorance, the amount of different options is rather
> overwhelming..
>
> Marcin
>
>
>
> On 09/30/2015 04:26 PM, Howard Pritchard wrote:
>
> Hello Marcin
>
> What configure options are you using besides with-libfabric?
>
> Could you post your config.log file to the list?
>
> Looks like you only install fi_ext_usnic.h if you could build the usnic
> libfab provider.  When you configured libfabric what providers were listed
> at the end of configure run? Maybe attach config.log from the libfabric
> build ?
>
> If your cluster has Cisco usNICs you should probably be using
> libfabric/cisco openmpi.  If you are using Intel Omni-Path you may want to
> try the ofi mtl.  It's not selected by default, however.
>
> Howard
>
> ----------
>
> sent from my smart phonr so no good type.
>
> Howard
> On Sep 30, 2015 5:35 AM, "Marcin Krotkiewski" <
> marcin.krotkiew...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to compile the 2.x branch with libfabric support, but get
>> this error during configure:
>>
>> configure:100708: checking rdma/fi_ext_usnic.h presence
>> configure:100708: gcc -E
>> -I/cluster/software/VERSIONS/openmpi.gnu.2.x/include
>> -I/usit/abel/u1/marcink/software/ompi-release-2.x/opal/mca/hwloc/hwloc1110/hwloc/include
>> conftest.c
>> conftest.c:688:31: fatal error: rdma/fi_ext_usnic.h: No such file or
>> directory
>> [...]
>> configure:100708: checking for rdma/fi_ext_usnic.h
>> configure:100708: result: no
>> configure:101253: checking if MCA component btl:usnic can compile
>> configure:101255: result: no
>>
>> Which is correct - the file is not there. I have downloaded fresh
>> libfabric-1.1.0.tar.bz2 and it does not have this file. Probably OpenMPI
>> needs some updates?
>>
>> I am also wondering what is the state of libfabric support in OpenMPI
>> nowadays. I have seen recent (March) presentation about it, so it seems to
>> be an actively developed feature. Is this correct? It seemed from the
>> presentation that there are benefits to this approach, but is it mature
>> enough in OpenMPI, or it will yet take some time?
>>
>> Thanks!
>>
>> Marcin
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/09/27728.php
>>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/09/27733.php
>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/09/27743.php
>
