Thanks for the insights, Mohammad and Roman. Interesting read. My interest in RDMA is purely from a testing perspective.
Still, I would be interested if somebody who has RDMA enabled and running could share their ceph.conf. My RDMA-related entries are taken from the Mellanox blog here: https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide. They used Luminous and built it from source; I'm running the binary distribution of Mimic here.

ms_type = async+rdma
ms_cluster = async+rdma
ms_async_rdma_device_name = mlx5_0
ms_async_rdma_polling_us = 0
ms_async_rdma_local_gid = <node's_gid>

Alternatively, if somebody with knowledge of the code could tell me when this "RDMAConnectedSocketImpl" error gets printed, that might also be helpful:

2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush map has features 1009089991638532096, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:33.138 7f52b8548140  0 mon.rio@-1(probing) e5 my rank is now 0 (was -1)
2018-12-19 21:45:33.141 7f529f3fe700 -1 RDMAConnectedSocketImpl activate failed to transition to RTR state: (113) No route to host
2018-12-19 21:45:33.142 7f529f3fe700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: In function 'void RDMAConnectedSocketImpl::handle_connection()' thread 7f529f3fe700 time 2018-12-19 21:45:33.141972
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: 224: FAILED assert(!r)
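For what it's worth, the assert at RDMAConnectedSocketImpl.cc:224 fires when activate() returns non-zero, and activate() is where the queue pair is moved to the RTR (ready-to-receive) state. I'm not claiming the sketch below is Ceph's exact code (the function name and field values are mine, purely illustrative), but the libibverbs call underneath looks roughly like this; errno 113 (EHOSTUNREACH, the "No route to host" in my log) from it usually points at an addressing problem, e.g. a wrong or non-routable GID:

#include <infiniband/verbs.h>
#include <cstring>

// Illustrative sketch: transition a queue pair to RTR. Ceph's
// RDMAConnectedSocketImpl::activate() does essentially this, and
// handle_connection() asserts when it fails.
int move_qp_to_rtr(ibv_qp *qp, uint32_t peer_qpn, uint32_t peer_psn,
                   const ibv_gid &peer_gid, uint8_t local_gid_index)
{
    ibv_qp_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.qp_state           = IBV_QPS_RTR;
    attr.path_mtu           = IBV_MTU_1024;         // illustrative value
    attr.dest_qp_num        = peer_qpn;             // exchanged out of band
    attr.rq_psn             = peer_psn;
    attr.max_dest_rd_atomic = 1;
    attr.min_rnr_timer      = 12;
    attr.ah_attr.is_global      = 1;                // RoCE routes via the GRH
    attr.ah_attr.grh.dgid       = peer_gid;         // the remote side's GID
    attr.ah_attr.grh.sgid_index = local_gid_index;  // index of the local GID
    attr.ah_attr.grh.hop_limit  = 1;
    attr.ah_attr.port_num       = 1;

    // Returns 0 on success, an errno value on failure -- e.g. 113
    // (EHOSTUNREACH) when the path to the peer GID can't be resolved.
    return ibv_modify_qp(qp, &attr,
                         IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
                         IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
                         IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER);
}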
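Separately, in case it helps anyone reproduce my setup: the value I plug into ms_async_rdma_local_gid comes straight from sysfs. The device name (mlx5_0), port and GID index below are just what I use here, adjust for your hardware:

cat /sys/class/infiniband/mlx5_0/ports/1/gids/0

ibv_devinfo -v also dumps the full GID table per port if you need to pick a different entry.

--
Michael Green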
> On Dec 19, 2018, at 5:21 AM, Roman Penyaev <[email protected]> wrote:
>
> Well, I have been playing with the ceph rdma implementation for quite
> a while, and it has unsolved problems, so I would say the status is
> "not completely broken", but "you can run it at your own risk and
> smile":
>
> 1. On disconnect of a previously active (high write load) connection
> there is a race that can lead to an osd (or any receiver) crash:
>
> https://github.com/ceph/ceph/pull/25447
>
> 2. Recent qlogic hardware (qedr drivers) does not support
> IBV_EVENT_QP_LAST_WQE_REACHED, which is used in the ceph rdma
> implementation; the pull request from 1. also targets this
> incompatibility.
>
> 3. Under high write load with many connections there is a chance that
> the osd runs out of receive WRs, and the rdma connection (QP) on the
> sender side gets IBV_WC_RETRY_EXC_ERR and is disconnected. This is a
> fundamental design problem, which has to be fixed at the protocol
> level (e.g. propagate backpressure to senders).
>
> 4. Unfortunately, neither rdma nor any other zero-latency network can
> bring significant value, because the bottleneck is not the network.
> Please consider this for further reading regarding transport
> performance in ceph:
>
> https://www.spinics.net/lists/ceph-devel/msg43555.html
>
> The problems described above have quite a big impact on overall
> transport performance.
>
> --
> Roman
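Regarding Roman's point 3, for anyone following along: the receiver has to keep receive work requests posted so that incoming messages have buffers to land in; once it falls behind, the sender's retry budget is eventually exhausted and the QP goes into an error state with the retry-exceeded completion status Roman mentions. A rough sketch of the repost path is below (again, not Ceph's actual code; the function name is mine and buffer pool management is elided):

#include <infiniband/verbs.h>
#include <cstring>

// Repost one registered buffer to the receive queue. If the pool of
// registered buffers is empty, nothing can be posted -- the exhaustion
// scenario from point 3, which the sender eventually sees as a retry
// error on its side of the connection.
bool repost_recv_buffer(ibv_qp *qp, char *buf, uint32_t len,
                        ibv_mr *mr, uint64_t wr_id)
{
    ibv_sge sge;
    sge.addr   = reinterpret_cast<uintptr_t>(buf);
    sge.length = len;
    sge.lkey   = mr->lkey;   // buf must be registered via ibv_reg_mr()

    ibv_recv_wr wr;
    memset(&wr, 0, sizeof(wr));
    wr.wr_id   = wr_id;      // used to find the buffer on completion
    wr.sg_list = &sge;
    wr.num_sge = 1;

    ibv_recv_wr *bad_wr = nullptr;
    return ibv_post_recv(qp, &wr, &bad_wr) == 0;
}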
