Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2019-02-09 Thread Vitaliy Filippov
Hi Roman, we recently discussed your tests and a simple idea came to mind: can you repeat your tests targeting latency instead of max throughput? I mean just use iodepth=1. What is the latency, and on what hardware? Well, I have been playing with the ceph rdma implementation for quite a while and it
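A latency-oriented run of this kind is typically done with fio's rbd engine; a minimal sketch follows, assuming a pool named "rbd" and a test image named "bench" (both names are placeholders, adjust for your cluster). The average and percentile "clat" numbers in the output are the per-op latency of interest.

    # single-threaded, queue depth 1, 4k random writes against an assumed image "bench" in pool "rbd"
    fio --name=lat-test --ioengine=rbd --pool=rbd --rbdname=bench \
        --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 \
        --direct=1 --runtime=60 --time_based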

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-21 Thread Michael Green
I was informed today that the Ceph environment I’ve been working on is no longer available. Unfortunately, this happened before I could try any of your suggestions, Roman. Thank you for all the attention and advice. -- Michael Green > On Dec 20, 2018, at 08:21, Roman Penyaev wrote: > >>

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-20 Thread Marc Roos
Thanks for posting this, Roman. -Original Message- From: Roman Penyaev [mailto:rpeny...@suse.de] Sent: 20 December 2018 14:21 To: Marc Roos Cc: green; mgebai; ceph-users Subject: Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host On 2018-12-19 22:01, Marc Roos

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-19 Thread Michael Green
Thanks, Roman. My RDMA is working correctly; I'm pretty sure of that for two reasons. (1) The E8 Storage agent running on all OSDs uses RDMA to communicate with our E8 Storage controller, and it's working correctly at the moment. The volumes are available and IO can be done at full line rate and
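As a side note, one way to confirm the RDMA/RoCE path independently of both Ceph and the E8 agent is rping from librdmacm-utils; the address below is just the one quoted elsewhere in this thread and stands in for whichever RoCE interface you are testing.

    # on the target node
    rping -s -a 192.168.2.13 -v
    # from another node: 10 verbose RDMA ping-pong iterations
    rping -c -a 192.168.2.13 -v -C 10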

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-19 Thread Marc Roos
I would be interested in learning about the performance increase it has compared to 10Gbit. I have the ConnectX-3 Pro but I am not using RDMA because support is not available by default. sockperf ping-pong -i 192.168.2.13 -p 5001 -m 16384 -t 10 --pps=max sockperf: Warmup stage (sending a few
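For completeness, that ping-pong client needs a matching sockperf server listening on the other node, something along these lines (same address and port as the quoted command):

    # on the target host
    sockperf server -i 192.168.2.13 -p 5001
    # then from the client, as quoted above
    sockperf ping-pong -i 192.168.2.13 -p 5001 -m 16384 -t 10 --pps=max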

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-19 Thread Michael Green
Thanks for the insights, Mohamad and Roman. Interesting read. My interest in RDMA is purely from a testing perspective. Still, I would be interested if somebody who has RDMA enabled and running could share their ceph.conf. My RDMA-related entries are taken from the Mellanox blog here
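For anyone comparing notes, the usual minimal set of RDMA entries looks roughly like the sketch below. Treat it as an assumption-laden example rather than a known-good config: the device name (mlx5_0) and the GID are specific to each host, and the exact option set varies between releases.

    [global]
    ms_type = async+rdma
    # RDMA device as listed by ibv_devinfo on this host (assumed name)
    ms_async_rdma_device_name = mlx5_0
    # for RoCE v2 the GID of the chosen port is usually needed as well,
    # e.g. as printed by Mellanox's show_gids script (example value for 192.168.2.13)
    ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:020d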

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-18 Thread Mohamad Gebai
Last I heard (read), the RDMA implementation is somewhat experimental. Search for "troubleshooting ceph rdma performance" on this mailing list for more info. (Adding Roman in CC, who has been working on this recently.) Mohamad On 12/18/18 11:42 AM, Michael Green wrote: > I don't know. >

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-18 Thread Michael Green
I don't know. The Ceph documentation for Mimic doesn't appear to go into much detail on RDMA in general, but it's still mentioned in the Ceph docs here and there. Some examples: Change log - http://docs.ceph.com/docs/master/releases/mimic/

Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-18 Thread Виталий Филиппов
Is RDMA officially supported? I'm asking because I recently tried to use DPDK and it seems it's broken... i.e. the code is there, but it does not compile until I fix the cmake scripts, and after fixing the build the OSDs just segfault and die after processing something like 40-50 incoming packets.

[ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-12 Thread Michael Green
Hello collective wisdom, ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable) here. I have a working cluster consisting of 3 monitor hosts, 64 OSD processes across 4 OSD hosts, plus 2 MDSs, plus 2 MGRs. All of that is consumed by 10 client nodes. Every host in