Thanks for the insights Mohammad and Roman. Interesting read.

My interest in RDMA is purely from a testing perspective.

Still, I would be interested if somebody who has RDMA enabled and running could 
share their ceph.conf.

My RDMA-related entries are taken from the Mellanox blog here: 
https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide. 
They used Luminous and built it from source. I'm running a binary distribution 
of Mimic here.

ms_type = async+rdma
ms_cluster = async+rdma
ms_async_rdma_device_name = mlx5_0
ms_async_rdma_polling_us = 0
ms_async_rdma_local_gid = <node's_gid>
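
For reference, these are the other ms_async_rdma_* knobs I can see in the Mimic 
options table; listing them only as a sketch with placeholder values, not as a 
configuration I have verified to work:

ms_public_type = async+rdma
ms_async_rdma_port_num = 1
ms_async_rdma_roce_ver = 1
ms_async_rdma_buffer_size = 131072
ms_async_rdma_send_buffers = 1024
ms_async_rdma_receive_buffers = 32768

If anyone running RDMA could confirm which of these (and which values) they 
actually set, that would answer my question above.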

Or, if somebody with knowledge of the code could tell me when this 
"RDMAConnectedSocketImpl" error is printed, that might also be helpful.

2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush 
map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush 
map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush 
map has features 1009089991638532096, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140  0 mon.rio@-1(probing).osd e25981 crush 
map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:33.138 7f52b8548140  0 mon.rio@-1(probing) e5  my rank is now 
0 (was -1)
2018-12-19 21:45:33.141 7f529f3fe700 -1  RDMAConnectedSocketImpl activate 
failed to transition to RTR state: (113) No route to host
2018-12-19 21:45:33.142 7f529f3fe700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc:
 In function 'void RDMAConnectedSocketImpl::handle_connection()' thread 
7f529f3fe700 time 2018-12-19 21:45:33.141972
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc:
 224: FAILED assert(!r)
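
For what it's worth, my (possibly wrong) understanding is that the failing 
"activate ... transition to RTR state" step is the standard libibverbs queue-pair 
state change, and that errno 113 (EHOSTUNREACH) on RoCE usually points at an 
unreachable or mismatched GID (wrong local GID, GID index or RoCE version) rather 
than a plain TCP routing problem. A generic sketch of that step follows; this is 
not Ceph's actual code, and the helper name and parameters are made up for 
illustration:

#include <cerrno>
#include <cstdio>
#include <cstring>
#include <infiniband/verbs.h>

// Hypothetical helper: remote_qpn/remote_psn/remote_gid would normally come
// from the peer via the messenger's out-of-band (TCP) exchange.
static int transition_to_rtr(ibv_qp *qp, uint32_t remote_qpn, uint32_t remote_psn,
                             const ibv_gid &remote_gid, uint8_t port_num,
                             uint8_t sgid_index)
{
  ibv_qp_attr attr = {};
  attr.qp_state = IBV_QPS_RTR;               // receive side becomes ready
  attr.path_mtu = IBV_MTU_1024;
  attr.dest_qp_num = remote_qpn;
  attr.rq_psn = remote_psn;
  attr.max_dest_rd_atomic = 1;
  attr.min_rnr_timer = 12;
  attr.ah_attr.is_global = 1;                // RoCE always uses the GRH
  attr.ah_attr.grh.dgid = remote_gid;        // peer GID -- must be reachable
  attr.ah_attr.grh.sgid_index = sgid_index;  // selects local GID / RoCE version
  attr.ah_attr.grh.hop_limit = 1;
  attr.ah_attr.port_num = port_num;

  int r = ibv_modify_qp(qp, &attr,
                        IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
                        IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
                        IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER);
  if (r) {
    // a failure here is what would surface as "(113) No route to host"
    fprintf(stderr, "failed to transition to RTR state: (%d) %s\n",
            r, strerror(r));
    return -r;
  }
  return 0;
}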
--
Michael Green



> On Dec 19, 2018, at 5:21 AM, Roman Penyaev <[email protected]> wrote:
> 
> 
> Well, I have been playing with the ceph rdma implementation for quite
> a while and it has unsolved problems, thus I would say the status is
> not "completely broken", but "you can run it at your own risk
> and smile":
> 
> 1. On disconnect of a previously active (high write load) connection
>   there is a race that can lead to an osd (or any receiver) crash:
> 
>   https://github.com/ceph/ceph/pull/25447
> 
> 2. Recent qlogic hardware (qedr drivers) does not support
>   IBV_EVENT_QP_LAST_WQE_REACHED, which is used in the ceph rdma
>   implementation; the pull request from 1. also targets this
>   incompatibility.
> 
> 3. On high write load and many connections there is a chance
>   that the osd can run out of receive WRs and the rdma connection (QP)
>   on the sender side will get IBV_WC_RETRY_EXC_ERR and thus be disconnected.
>   This is a fundamental design problem, which has to be fixed on the
>   protocol level (e.g. propagate backpressure to senders).
> 
> 4. Unfortunately, neither rdma nor any other zero-latency network can
>   bring significant value, because the bottleneck is not the
>   network; please consider this for further reading regarding
>   transport performance in ceph:
> 
>   https://www.spinics.net/lists/ceph-devel/msg43555.html
> 
>   The problems described above have quite a big impact on overall
>   transport performance.
> 
> --
> Roman

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
