Hi, I have two servers with Mellanox CX4-LX (50GbE) adapters connected back-to-back. I am using Ubuntu 14.04. I have MVAPICH2 working, and I have confirmed by packet capture that both RoCE and RoCEv2 work well.
However, I still cannot get Open MPI to work with RoCE. I am using Open MPI 2.1.1. It looks like this version does not recognize the CX4-LX, so I added vendor part ID 4117 to mca-btl-openib-device-params.ini and also updated opal/mca/common/verbs/common_verbs_port.c to support the CX4-LX, which reports speed 64 and width 1.

But I am still getting:

"WARNING: There was an error initializing an OpenFabrics device.
Local host: chguo-msr-linux1
Local device: mlx5_0"

Any hints on what is missing?

Thanks,
CX
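
For context, the speed/width handling mentioned above can be sketched roughly as follows. This is only an illustrative, self-contained sketch and not the actual code in common_verbs_port.c: the function names are hypothetical, the per-lane rates are approximate nominal values, and the encodings (speed 64 treated as roughly 50 Gb/s per lane, width 1 as a single lane) are assumptions based on the values the port reports.

```c
#include <stdint.h>

/* Hypothetical helper: map an active_speed code to an approximate
 * per-lane rate in Mb/s. Returns 0 on success, -1 for an unknown code.
 * The case for 64 is the assumed addition needed for the CX4-LX. */
static int lane_speed_mbps(uint8_t active_speed, uint32_t *mbps)
{
    switch (active_speed) {
    case 1:  *mbps = 2500;  return 0;  /* SDR */
    case 2:  *mbps = 5000;  return 0;  /* DDR */
    case 4:  *mbps = 10000; return 0;  /* QDR / 10GbE */
    case 8:  *mbps = 10000; return 0;  /* FDR-10 */
    case 16: *mbps = 14000; return 0;  /* FDR */
    case 32: *mbps = 25000; return 0;  /* EDR */
    case 64: *mbps = 50000; return 0;  /* assumed: 50 Gb/s per lane */
    default: return -1;                /* unrecognized speed code */
    }
}

/* Hypothetical helper: map an active_width code to a lane count. */
static int lane_count(uint8_t active_width, uint32_t *lanes)
{
    switch (active_width) {
    case 1: *lanes = 1;  return 0;
    case 2: *lanes = 4;  return 0;
    case 4: *lanes = 8;  return 0;
    case 8: *lanes = 12; return 0;
    default: return -1;                /* unrecognized width code */
    }
}

/* Combine both codes into an approximate port bandwidth in Mb/s.
 * Fails with -1 if either code is not recognized, which is the kind of
 * failure an unpatched speed table would produce for speed 64. */
static int port_bandwidth_mbps(uint8_t active_speed, uint8_t active_width,
                               uint32_t *bandwidth)
{
    uint32_t mbps, lanes;

    if (lane_speed_mbps(active_speed, &mbps) != 0 ||
        lane_count(active_width, &lanes) != 0) {
        return -1;
    }
    *bandwidth = mbps * lanes;
    return 0;
}
```

With these assumed tables, speed 64 and width 1 give 50000 Mb/s, which matches the 50GbE link; without the case for 64 the lookup fails outright.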
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel