Ah yes, 3x replicated with min_size 2.
My ceph.conf is pretty bare, but here it is in case it's relevant:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 172.25.42.0/24
fsid = f4971cca-e73c-46bc-bb05-4af61d419f6e
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
mon osd allow primary affinity = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
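
(These defaults can be confirmed per pool with something like the
following; 'testpool' is just a placeholder for the benchmark pool's
name:)

  ceph osd pool get testpool size      # should report: size: 3
  ceph osd pool get testpool min_size  # should report: min_size: 2
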
-----Original message-----
> From: Marc Roos <[email protected]>
> Sent: Thursday 6th September 2018 15:43
> To: ceph-users <[email protected]>; Menno Zonneveld <[email protected]>
> Subject: RE: [ceph-users] Rados performance inconsistencies, lower than
> expected performance
>
>
>
> Test pool is 3x replicated?
>
>
> -----Original Message-----
> From: Menno Zonneveld [mailto:[email protected]]
> Sent: Thursday 6 September 2018 15:29
> To: [email protected]
> Subject: [ceph-users] Rados performance inconsistencies, lower than
> expected performance
>
> I've set up a Ceph cluster to test things before going into production,
> but I've run into some performance issues that I cannot resolve or
> explain.
>
> Hardware in use in each storage machine (x3):
> - dual 10Gbit Solarflare Communications SFC9020 (Linux bond, mtu 9000)
> - dual 10Gbit EdgeSwitch 16-Port XG
> - LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 HBA
> - 3x Intel S4500 480GB SSD as OSDs
> - 2x SSD (RAID-1) boot/OS disks
> - 2x Intel(R) Xeon(R) CPU E5-2630
> - 128GB memory
>
> Software-wise I'm running Ceph 12.2.7-pve1, set up from Proxmox VE 5.2,
> on all nodes.
>
> Running the rados benchmark results in somewhat lower than expected
> performance unless Ceph enters the 'near-full' state. When the cluster
> is mostly empty, rados bench (180 write -b 4M -t 16) delivers about
> 330MB/s at 0.18s average latency, but when hitting the near-full state
> this goes up to a more expected 550MB/s at 0.11s.
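> 
> (For reference, the exact invocation was along these lines; the pool
> name 'testpool' is a placeholder:)
> 
>   rados bench -p testpool 180 write -b 4M -t 16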
>
> iostat on the storage machines shows the disks are hardly utilized
> unless the cluster hits near-full; CPU and network aren't maxed out
> either. I've also tried NIC bonding with just one switch, and without
> jumbo frames, but nothing seems to matter in this case.
>
> Is this expected behavior, or what can I try in order to pinpoint the
> bottleneck?
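> 
> (One thing I can still try, to separate the disks from the network: as
> far as I know 'ceph tell' can drive a single OSD's write path directly,
> bypassing the client and network side, while I watch utilization in
> iostat; 'sdc' is a placeholder for that OSD's device:)
> 
>   ceph tell osd.0 bench   # writes 1GB in 4MB blocks through osd.0
>   iostat -x 1 sdc         # watch %util on that OSD's device meanwhile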
>
> The expected performance is based on the benchmark results Proxmox
> released this year: with 3 nodes, 10Gbit networking and 4 OSDs per
> server they hit almost 800MB/s at 0.08s latency. They have more OSDs
> and somewhat different hardware, so I understand I won't hit the
> 800MB/s mark, but the difference between an empty and an almost-full
> cluster makes no sense to me; I'd expect it to be the other way around.
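> 
> (Sanity-checking the numbers: with 16 in-flight 4MB writes, average
> latency and throughput should roughly satisfy
> 
>   latency ≈ (threads × object size) / throughput
>           = (16 × 4MB) / 330MB/s ≈ 0.19s
> 
> which is close to what rados bench reports when the cluster is empty,
> and (16 × 4MB) / 550MB/s ≈ 0.12s when near-full, so the two figures
> are at least internally consistent.)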
>
> Thanks,
> Menno
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com