Hi Haomai,

Can you please point me to a running cluster with RDMA?

Regards

Gerhard W. Recher

net4sec UG (haftungsbeschränkt)
Leitenweg 6
86929 Penzing

+49 171 4802507
On 28.09.2017 at 04:21, Haomai Wang wrote:
> Previously we had an InfiniBand cluster; recently we deployed a RoCE
> cluster. Both are for users' test purposes.
>
> On Wed, Sep 27, 2017 at 11:38 PM, Gerhard W. Recher
> <gerhard.rec...@net4sec.com> wrote:
>> Haomai,
>>
>> I looked at your presentation, so I guess you already have a running
>> cluster with RDMA & Mellanox
>> (https://www.youtube.com/watch?v=Qb2SUWLdDCw).
>>
>> Is nobody out there running a cluster with RDMA?
>> Any help is appreciated!
>>
>> Gerhard W. Recher
>>
>> net4sec UG (haftungsbeschränkt)
>> Leitenweg 6
>> 86929 Penzing
>>
>> +49 171 4802507
>> On 27.09.2017 at 16:09, Haomai Wang wrote:
>>> https://community.mellanox.com/docs/DOC-2415
>>>
>>> On Wed, Sep 27, 2017 at 10:01 PM, Gerhard W. Recher
>>> <gerhard.rec...@net4sec.com> wrote:
>>>> How do I set the local gid option?
>>>>
>>>> I have no clue :)
>>>>
>>>> Gerhard W. Recher
>>>>
>>>> net4sec UG (haftungsbeschränkt)
>>>> Leitenweg 6
>>>> 86929 Penzing
>>>>
>>>> +49 171 4802507
>>>> On 27.09.2017 at 15:59, Haomai Wang wrote:
>>>>> Do you set the local gid option?
>>>>>
>>>>> On Wed, Sep 27, 2017 at 9:52 PM, Gerhard W. Recher
>>>>> <gerhard.rec...@net4sec.com> wrote:
>>>>>> Yep, RoCE ...
>>>>>>
>>>>>> I followed all the recommendations in the Mellanox papers ...
>>>>>>
>>>>>> /etc/security/limits.conf:
>>>>>>
>>>>>> * soft memlock unlimited
>>>>>> * hard memlock unlimited
>>>>>> root soft memlock unlimited
>>>>>> root hard memlock unlimited
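>>>>>>
>>>>>> A quick generic check that the memlock limit is really in effect (plain shell, nothing Mellanox-specific):
>>>>>>
>>>>>> ulimit -l                                      # should print "unlimited" in a fresh login shell
>>>>>> grep "locked memory" /proc/<osd-pid>/limits    # <osd-pid> is a placeholder for a running ceph-osd PID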
>>>>>>
>>>>>>
>>>>>> I also set the properties on the daemons (chapter 11 of
>>>>>> https://community.mellanox.com/docs/DOC-2721).
>>>>>>
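>>>>>> By "set properties on daemons" I mean a systemd override roughly along these lines (sketch only; the drop-in file name is just an example, same idea for the mon/mgr units):
>>>>>>
>>>>>> # /etc/systemd/system/ceph-osd@.service.d/memlock.conf
>>>>>> [Service]
>>>>>> LimitMEMLOCK=infinity
>>>>>>
>>>>>> followed by "systemctl daemon-reload" and a restart of the daemons.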
>>>>>>
>>>>>> The only problem: a gid parameter in ceph.conf is not workable in Proxmox, because
>>>>>> ceph.conf is the same file for all storage nodes:
>>>>>> root@pve01:/etc/ceph# ls -latr
>>>>>> total 16
>>>>>> lrwxrwxrwx   1 root root   18 Jun 21 19:35 ceph.conf -> /etc/pve/ceph.conf
>>>>>>
>>>>>> and each node has unique GIDs ...
>>>>>>
>>>>>>
>>>>>> ./showgids
>>>>>> DEV     PORT    INDEX   GID                                     IPv4                    VER     DEV
>>>>>> ---     ----    -----   ---                                     ------------            ---     ---
>>>>>> mlx4_0  1       0       fe80:0000:0000:0000:268a:07ff:fee2:6070                         v1      ens1
>>>>>> mlx4_0  1       1       fe80:0000:0000:0000:268a:07ff:fee2:6070                         v2      ens1
>>>>>> mlx4_0  1       2       0000:0000:0000:0000:0000:ffff:c0a8:dd8d 192.168.221.141         v1      vmbr0
>>>>>> mlx4_0  1       3       0000:0000:0000:0000:0000:ffff:c0a8:dd8d 192.168.221.141         v2      vmbr0
>>>>>> mlx4_0  2       0       fe80:0000:0000:0000:268a:07ff:fee2:6071                         v1      ens1d1
>>>>>> mlx4_0  2       1       fe80:0000:0000:0000:268a:07ff:fee2:6071                         v2      ens1d1
>>>>>> mlx4_0  2       2       0000:0000:0000:0000:0000:ffff:c0a8:648d 192.168.100.141         v1      ens1d1
>>>>>> mlx4_0  2       3       0000:0000:0000:0000:0000:ffff:c0a8:648d 192.168.100.141         v2      ens1d1
>>>>>> n_gids_found=8
>>>>>>
>>>>>> next node ... showgids
>>>>>> ./showgids
>>>>>> DEV     PORT    INDEX   GID                                     IPv4                    VER     DEV
>>>>>> ---     ----    -----   ---                                     ------------            ---     ---
>>>>>> mlx4_0  1       0       fe80:0000:0000:0000:268a:07ff:fef9:8730                         v1      ens1
>>>>>> mlx4_0  1       1       fe80:0000:0000:0000:268a:07ff:fef9:8730                         v2      ens1
>>>>>> mlx4_0  1       2       0000:0000:0000:0000:0000:ffff:c0a8:dd8e 192.168.221.142         v1      vmbr0
>>>>>> mlx4_0  1       3       0000:0000:0000:0000:0000:ffff:c0a8:dd8e 192.168.221.142         v2      vmbr0
>>>>>> mlx4_0  2       0       fe80:0000:0000:0000:268a:07ff:fef9:8731                         v1      ens1d1
>>>>>> mlx4_0  2       1       fe80:0000:0000:0000:268a:07ff:fef9:8731                         v2      ens1d1
>>>>>> mlx4_0  2       2       0000:0000:0000:0000:0000:ffff:c0a8:648e 192.168.100.142         v1      ens1d1
>>>>>> mlx4_0  2       3       0000:0000:0000:0000:0000:ffff:c0a8:648e 192.168.100.142         v2      ens1d1
>>>>>> n_gids_found=8
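>>>>>>
>>>>>> The only workaround I can think of would be per-daemon sections in the shared ceph.conf, since those are node-specific anyway. Untested sketch, assuming the "local gid" option is ms_async_rdma_local_gid and taking the RoCE v2 GIDs of the cluster network from the showgids output above; the OSD IDs are only examples:
>>>>>>
>>>>>> # hypothetical example: an OSD living on pve01
>>>>>> [osd.0]
>>>>>> ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:648d
>>>>>>
>>>>>> # hypothetical example: an OSD living on pve02
>>>>>> [osd.7]
>>>>>> ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:648e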
>>>>>>
>>>>>>
>>>>>>
>>>>>> ifconfig ens1d1
>>>>>> ens1d1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>>>>         inet 192.168.100.141  netmask 255.255.255.0  broadcast 192.168.100.255
>>>>>>         inet6 fe80::268a:7ff:fee2:6071  prefixlen 64  scopeid 0x20<link>
>>>>>>         ether 24:8a:07:e2:60:71  txqueuelen 1000  (Ethernet)
>>>>>>         RX packets 25450717  bytes 39981352146 (37.2 GiB)
>>>>>>         RX errors 0  dropped 77  overruns 77  frame 0
>>>>>>         TX packets 26554236  bytes 53419159091 (49.7 GiB)
>>>>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>>>>
>>>>>>
>>>>>>
>>>>>> Gerhard W. Recher
>>>>>>
>>>>>> net4sec UG (haftungsbeschränkt)
>>>>>> Leitenweg 6
>>>>>> 86929 Penzing
>>>>>>
>>>>>> +49 171 4802507
>>>>>> On 27.09.2017 at 14:50, Haomai Wang wrote:
>>>>>>> On Wed, Sep 27, 2017 at 8:33 PM, Gerhard W. Recher
>>>>>>> <gerhard.rec...@net4sec.com> wrote:
>>>>>>>> Hi Folks!
>>>>>>>>
>>>>>>>> I'm totally stuck.
>>>>>>>>
>>>>>>>> RDMA is running on my NICs; rping, udaddy etc. give positive results.
>>>>>>>>
>>>>>>>> The cluster consists of:
>>>>>>>> proxmox-ve: 5.0-23 (running kernel: 4.10.17-3-pve)
>>>>>>>> pve-manager: 5.0-32 (running version: 5.0-32/2560e073)
>>>>>>>>
>>>>>>>> System (4 nodes): Supermicro 2028U-TN24R4T+
>>>>>>>>
>>>>>>>> 2-port Mellanox ConnectX-3 Pro, 56 Gbit
>>>>>>>> 4-port Intel 10 GigE
>>>>>>>> Memory: 768 GBytes
>>>>>>>> CPU: dual Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
>>>>>>>>
>>>>>>>> Ceph: 28 OSDs
>>>>>>>> 24x Intel NVMe 2000 GB Intel SSD DC P3520, 2.5", PCIe 3.0 x4
>>>>>>>>  4x Intel NVMe 1.6 TB Intel SSD DC P3700, 2.5", U.2 PCIe 3.0
>>>>>>>>
>>>>>>>>
>>>>>>>> Ceph is running on BlueStore; enabling RDMA within Ceph (version
>>>>>>>> 12.2.0-pve1) leads to this crash:
>>>>>>>>
>>>>>>>>
>>>>>>>> ceph.conf:
>>>>>>>> [global]
>>>>>>>> ms_type=async+rdma
>>>>>>>> ms_cluster_type = async+rdma
>>>>>>>> ms_async_rdma_port_num=2
>>>>>>> I guess it should be 0. What's your result of "ibstat"?
>>>>>>>
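>>>>>>>
>>>>>>> As a generic check: "ibstat mlx4_0" lists both ports; the port that
>>>>>>> ms_async_rdma_port_num points at should show "State: Active" and
>>>>>>> "Link layer: Ethernet" for RoCE. If the configured port is not found in an
>>>>>>> active state, Device::binding_port() fails the assert shown in the log below.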
>>>>>>>> ms_async_rdma_device_name=mlx4_0
>>>>>>>> ...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Reboot --
>>>>>>>> Sep 26 18:56:10 pve02 systemd[1]: Started Ceph cluster manager daemon.
>>>>>>>> Sep 26 18:56:10 pve02 systemd[1]: Reached target ceph target allowing 
>>>>>>>> to start/stop all ceph-mgr@.service instances at once.
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 2017-09-26 18:56:10.427474 
>>>>>>>> 7f0e2137e700 -1 Infiniband binding_port  port not found
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 
>>>>>>>> /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: In 
>>>>>>>> function 'void Device::binding_port(CephContext*, int)' thread 
>>>>>>>> 7f0e2137e700 time 2017-09-26 18:56:10.427498
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 
>>>>>>>> /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: 
>>>>>>>> 144: FAILED assert(active_port)
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  ceph version 12.2.0 
>>>>>>>> (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc)
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  1: 
>>>>>>>> (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>>>>>>> const*)+0x102) [0x55e9dde4bd12]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  2: 
>>>>>>>> (Device::binding_port(CephContext*, int)+0x573) [0x55e9de1b2c33]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  3: (Infiniband::init()+0x15f) 
>>>>>>>> [0x55e9de1b8f1f]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  4: 
>>>>>>>> (RDMAWorker::connect(entity_addr_t const&, SocketOptions const&, 
>>>>>>>> ConnectedSocket*)+0x4c) [0x55e9ddf2329c]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  5: 
>>>>>>>> (AsyncConnection::_process_connection()+0x446) [0x55e9de1a6d86]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  6: 
>>>>>>>> (AsyncConnection::process()+0x7f8) [0x55e9de1ac328]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  7: 
>>>>>>>> (EventCenter::process_events(int, std::chrono::duration<unsigned long, 
>>>>>>>> std::ratio<1l, 1000000000l> >*)+0x1125) [0x55e9ddf198a5]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  8: (()+0x4c9288) 
>>>>>>>> [0x55e9ddf1d288]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  9: (()+0xb9e6f) [0x7f0e259d4e6f]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  10: (()+0x7494) [0x7f0e260d1494]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  11: (clone()+0x3f) 
>>>>>>>> [0x7f0e25149aff]
>>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  NOTE: a copy of the executable, 
>>>>>>>> or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>
>>>>>>>>
>>>>>>>> Any advice?
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Gerhard W. Recher
>>>>>>>>
>>>>>>>> net4sec UG (haftungsbeschränkt)
>>>>>>>> Leitenweg 6
>>>>>>>> 86929 Penzing
>>>>>>>>
>>>>>>>> +49 171 4802507
>>>>>>>>
>>>>>>>>
>>>>
>>
>>



_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
