Charles,

If you are using InfiniBand hardware, the recommended way is to use UCX.

Cheers,

Gilles

On Thursday, June 14, 2018, Charles A Taylor <chas...@ufl.edu> wrote:

> Because of the issues we are having with OpenMPI and the openib BTL
> (questions previously asked), I've been looking into what other transports
> are available. I was particularly interested in OFI/libfabric support but
> cannot find any information on it more recent than a reference to the usNIC
> BTL from 2015 (Jeff Squyres, Cisco). Unfortunately, the openmpi-org
> website FAQs covering OpenFabrics support don't mention anything beyond
> OpenMPI 1.8. Given that 3.1 is the current stable version, that seems odd.
>
> That being the case, I thought I'd ask here. After laying down the
> libfabric-devel RPM and building (3.1.0) with --with-libfabric=/usr, I end
> up with an "ofi" MTL but nothing else. I can run with OMPI_MCA_mtl=ofi
> and OMPI_MCA_btl="self,vader,openib" but it eventually crashes in
> libopen-pal.so (mpi_waitall() higher up the stack).
>
> GIZMO:9185 terminated with signal 11 at PC=2b4d4b68a91d SP=7ffcfbde9ff0.
> Backtrace:
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libopen-pal.so.40(+0x9391d)[0x2b4d4b68a91d]
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libopen-pal.so.40(opal_progress+0x24)[0x2b4d4b632754]
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libmpi.so.40(ompi_request_default_wait_all+0x11f)[0x2b4d47be2a6f]
> /apps/mpi/intel/2018.1.163/openmpi/3.1.0/lib64/libmpi.so.40(PMPI_Waitall+0xbd)[0x2b4d47c2ce4d]
>
> Questions: Am I using the OFI MTL as intended? Should there be an "ofi"
> BTL? Does anyone use this?
>
> Thanks,
>
> Charlie Taylor
> UF Research Computing
>
> PS - If you could use some help updating the FAQs, I'd be willing to put
> in some time. I'd probably learn a lot.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
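
For reference, a minimal sketch of how the UCX suggestion might be tried, using the same OMPI_MCA_* environment-variable style as the original message. It assumes Open MPI was configured with UCX support (for example with --with-ucx), which the thread does not confirm for this build; ./my_app and the rank count are hypothetical placeholders, so the launch line is left commented out.

```shell
# Ask Open MPI to use the UCX PML for point-to-point traffic.
# Equivalent to passing "--mca pml ucx" on the mpirun command line.
export OMPI_MCA_pml=ucx

# Check whether this installation actually contains UCX components.
# ompi_info ships with Open MPI; skip the check if it is not on PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i ucx || echo "no UCX components found in this build"
fi

# Hypothetical launch line (application name and rank count are placeholders):
# mpirun -np 4 ./my_app
```

If the ompi_info check prints no UCX components, the build would likely need to be reconfigured with UCX support before this setting has any effect.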