It's a four-node cluster of QEMU/KVM VMs, each running Ubuntu 16.04 with kernel 4.4.0-112, x86_64. Node1 is an NFS server, and nodes 2, 3, and 4 mount /nfs. The libfabric, fabtests, and mpich binaries are all on /nfs.
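One quick way to double-check the 32-bit versus 64-bit question on each node is sketched below. It assumes Python 3 is installed on the VMs, which the thread does not state, and the function name `describe_word_size` is only illustrative. It reports the kernel's machine string and the pointer width of the running process, so a 32-bit userspace on a 64-bit kernel would show up as a mismatch.

```python
import platform
import struct


def describe_word_size():
    """Return the machine architecture and the pointer width of this process."""
    return {
        # Architecture string reported by the kernel, e.g. "x86_64".
        "machine": platform.machine(),
        # Size of a C pointer in this interpreter, in bits (32 or 64).
        "pointer_bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    info = describe_word_size()
    print(f"{platform.node()}: machine={info['machine']}, "
          f"pointer_bits={info['pointer_bits']}")
```

Saving the script on the shared /nfs mount and running it once per node would give one line of output per host to compare.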
Without libfabric:

$ /nfs/mpich3/bin/mpirun -f /nfs/hosts -n 4 /nfs/mpitests/mpi_hello_world.exe
Hello world from processor node1, rank 0 out of 4 processors
Hello world from processor node3, rank 2 out of 4 processors
Hello world from processor node2, rank 1 out of 4 processors
Hello world from processor node4, rank 3 out of 4 processors
$

With libfabric:

$ /nfs/mpich3/bin/mpirun -f /nfs/hosts -n 4 /nfs/mpitests/mpi_hello_world.exe
Hello world from processor node3, rank 2 out of 4 processors
Hello world from processor node4, rank 3 out of 4 processors
Hello world from processor node1, rank 0 out of 4 processors
Hello world from processor node2, rank 1 out of 4 processors
^C[mpiexec@node1] Sending Ctrl-C to processes as requested
[mpiexec@node1] Press Ctrl-C again to force abort
$

John

-----Original Message-----
From: Hefty, Sean [mailto:[email protected]]
Sent: Monday, February 05, 2018 3:24 PM
To: Wilkes, John <[email protected]>; [email protected]; [email protected]
Subject: RE: libfabric hangs on QEMU/KVM virtual cluster

> Yes, running over the socket provider. I configured libfabric-1.5.3
> with default providers; udp and socket are the only ones - plus rxm
> and rxd, but I don't think they apply.
>
> FWIW, I saw the same hang with 1.3.0 and 1.4.2, and I see the same
> hang with OpenVPN and libfabric on QEMU (though I haven't looked into
> OpenVPN in as much detail).
>
> It shouldn't matter, but I'm running QEMU/KVM on an AMD box, so there
> could be some hidden Intel-ism that's causing the problem. (My latent
> paranoia is showing...)

The socket provider is standard BSD sockets, without any CPU specific code. That will change in v1.6.0 in order to add CPU specific instructions to handle persistent memory. But the code should still work fine across any supported platform. I'm just limited on my testing environment.

Is the VM 32-bit or 64-bit?
- Sean
