Is anyone using RoCE with good results? We are planning to use it, but initial tests are not great: we get much better performance with plain Ethernet over the exact same links.
It's up and working: I can see RDMA connections and counters, and there are no errors, but performance is unstable, and worse than plain Ethernet, which was only meant to be a sanity check!

Things I've checked, based on the Lenovo and IBM guides, and which I think are all configured correctly (rough sketches of the corresponding commands and GPFS settings are below my signature):

* RoCE interfaces all on the same subnet
* IPv6 enabled on all of them, with addresses using the eui64 addr-gen-mode
* DSCP trust mode on the NICs
* PFC flow control on the NICs
* Global pause disabled on the NICs
* ToS configured for RDMA_CM
* Source-based routing for multiple interfaces on the same subnet
* Switches (NVIDIA Cumulus) all enabled for RoCE QoS

Iperf and GPFS over plain Ethernet both get nearly 3 GB/s, which is close to line speed for the NIC in question (25 Gbps). Testing basic RDMA connections with ib_send_bw gets about the same. But GPFS over RoCE varies between 0.7 GB/s and 1.9 GB/s.

The servers have 4x 200G Mellanox cards; the client has 1x 25G card. What's frustrating and confusing is that we get better performance when only one card is enabled at the server end, and also better performance with one fabric ID per NIC on the server (and all four fabric IDs on the same NIC at the client end).

I can go into more detail if anyone has experience with this. Does it sound familiar to anyone? I'm planning to open a call with Lenovo and/or IBM, as I'm not quite sure where to look next.

Cheers,

Luke

--
Luke Sudbery
Principal Engineer (HPC and Storage)
Architecture, Infrastructure and Systems
Advanced Research Computing, IT Services
Room 132, Computer Centre G5, Elms Road

Please note I don't work on Monday.
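To make the checklist above more concrete, here is roughly what the per-NIC host settings look like. The connection name (roce0), interface (ens3f0) and the PFC priority are placeholders rather than our exact values:

    # IPv6 addresses derived with EUI-64 (NetworkManager)
    nmcli connection modify roce0 ipv6.addr-gen-mode eui64

    # Trust DSCP markings on the NIC
    mlnx_qos -i ens3f0 --trust dscp

    # Enable PFC on priority 3 only (the lossless class carrying RoCE)
    mlnx_qos -i ens3f0 --pfc 0,0,0,1,0,0,0,0

    # Disable global pause so only per-priority flow control is used
    ethtool -A ens3f0 rx off tx off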
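The RDMA_CM ToS and the source-based routing look roughly like this; device names, addresses and routing tables are placeholders, and ToS 106 is just the commonly used value that maps to DSCP 26 (this needs the rdma_cm module loaded and configfs mounted at /sys/kernel/config):

    # Default ToS for RDMA_CM connections on this device
    mkdir -p /sys/kernel/config/rdma_cm/mlx5_0
    echo 106 > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_tos

    # Source-based routing so traffic leaves via the interface that owns the source address
    ip rule add from 10.10.0.11/32 table 101
    ip route add 10.10.0.0/24 dev ens3f0 src 10.10.0.11 table 101

For the raw RDMA check, ib_send_bw's -R flag makes it connect via rdma_cm (and so pick up the ToS above), e.g. "ib_send_bw -d mlx5_0 -R --report_gbits" on the server and the same command plus the server's IP on the client.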
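On the GPFS side, the fabric ID experiment I mentioned is just the third field of verbsPorts. A sketch of the two layouts, with hypothetical device names and node classes (nsdNodes/clientNodes); RoCE also needs verbsRdmaCm=enable so connections are set up via RDMA_CM, which is why the ToS setting matters:

    # Baseline: all four server ports in a single fabric
    mmchconfig verbsRdma=enable,verbsRdmaCm=enable -N nsdNodes
    mmchconfig verbsPorts="mlx5_0/1/1 mlx5_1/1/1 mlx5_2/1/1 mlx5_3/1/1" -N nsdNodes

    # Variant that performs better for us: one fabric number per server NIC,
    # with the client's single port listed in all four fabrics
    mmchconfig verbsPorts="mlx5_0/1/1 mlx5_1/1/2 mlx5_2/1/3 mlx5_3/1/4" -N nsdNodes
    mmchconfig verbsPorts="mlx5_0/1/1 mlx5_0/1/2 mlx5_0/1/3 mlx5_0/1/4" -N clientNodes

    # Confirm what is active
    mmlsconfig | grep -i verbs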
