> On 14 January 2017 at 6:41, Christian Balzer <ch...@gol.com> wrote:
> 
> 
> Hello,
> 
> On Fri, 13 Jan 2017 13:18:35 -0500 Mohammed Naser wrote:
> 
> > These Intel SSDs are more than capable of handling the workload; in
> > addition, this cluster is used as an RBD backend for an OpenStack
> > cluster.
> > 
> I haven't tested the S3520s yet. Being the first 3D NAND offering from
> Intel, they are slightly slower than their predecessors in terms of BW
> and IOPS, but supposedly have a slightly lower latency, if the specs
> are to be believed.
> 
> Given the history of Intel DC S SSDs, I think it is safe to assume that
> they use the same or a similar controller setup as their predecessors,
> meaning a large powercap-backed cache which enables them to deal
> correctly and quickly with SYNC and DIRECT writes.
> 
> What would worry me slightly more (even at their 960GB size) is the
> endurance of 1 DWPD, which with journals inline comes down to 0.5, and
> with FS overhead and write amplification (it depends a lot on the type
> of operations) you're looking at something around 0.3 DWPD to base your
> expectations on.
> Mind, that still leaves you with about 9.6TB per day, which is a decent
> enough number, but it only equates to about 112MB/s.
> 
> Finally, most people start by looking at bandwidth/throughput, when
> ultimately they discover it was IOPS they needed first and foremost.
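To make Christian's endurance arithmetic concrete, here is a minimal
Python sketch. The 0.5 and ~0.3 derating factors are his estimates, not
measured values, and since the post does not spell out how the
cluster-wide 9.6TB/day figure is derived, only its conversion to MB/s is
shown:

    # The endurance derating chain from the paragraph above; the 0.5
    # (inline journal writes everything twice) and ~0.3 (FS overhead plus
    # write amplification) factors are estimates from the post.
    drive_gb = 960
    rated_dwpd = 1.0
    journal_factor = 0.5             # journal on the same device
    fs_wa_factor = 0.6               # implied by 0.3 / 0.5
    effective_dwpd = rated_dwpd * journal_factor * fs_wa_factor  # ~0.3
    per_osd_gb_day = drive_gb * effective_dwpd                   # ~288 GB/day

    # Converting the post's ~9.6 TB/day figure to a sustained rate:
    mb_s = 9.6e12 / 86400 / 1e6      # ~111 MB/s
    print(f"~{effective_dwpd:.1f} DWPD, {per_osd_gb_day:.0f} GB/day per OSD, "
          f"9.6 TB/day ~= {mb_s:.0f} MB/s")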
Yes! Bandwidth isn't what people usually need; they need IOPS and low
latency. I see a lot of clusters doing 10k ~ 20k IOPS with somewhere
around 1Gbit/s of traffic.

Wido

> 
> Christian
> 
> > Sent from my iPhone
> > 
> > > On Jan 13, 2017, at 1:08 PM, Somnath Roy <somnath....@sandisk.com> wrote:
> > > 
> > > Also, there has been a lot of discussion in the community about SSDs
> > > not being suitable for the Ceph write workload (with filestore), as
> > > some are not good at O_DIRECT/O_DSYNC kinds of writes. Hope your
> > > SSDs are tolerant of that.
> > > 
> > > -----Original Message-----
> > > From: Somnath Roy
> > > Sent: Friday, January 13, 2017 10:06 AM
> > > To: 'Mohammed Naser'; Wido den Hollander
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: RE: [ceph-users] All SSD cluster performance
> > > 
> > > << Both OSDs are pinned to two cores on the system
> > > Is there any reason you are pinning OSDs like that? I would say for
> > > an object workload there is no need to pin OSDs.
> > > With the configuration you mentioned, Ceph doing 4M object PUTs
> > > should be saturating your network first.
> > > 
> > > Have you run, say, a 4M object GET to see what BW you are getting?
> > > 
> > > Thanks & Regards
> > > Somnath
> > > 
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mohammed Naser
> > > Sent: Friday, January 13, 2017 9:51 AM
> > > To: Wido den Hollander
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] All SSD cluster performance
> > > 
> > > 
> > >> On Jan 13, 2017, at 12:41 PM, Wido den Hollander <w...@42on.com> wrote:
> > >> 
> > >> 
> > >>> On 13 January 2017 at 18:39, Mohammed Naser <mna...@vexxhost.com> wrote:
> > >>> 
> > >>> 
> > >>> 
> > >>>> On Jan 13, 2017, at 12:37 PM, Wido den Hollander <w...@42on.com> wrote:
> > >>>> 
> > >>>> 
> > >>>>> On 13 January 2017 at 18:18, Mohammed Naser <mna...@vexxhost.com> wrote:
> > >>>>> 
> > >>>>> 
> > >>>>> Hi everyone,
> > >>>>> 
> > >>>>> We have a deployment with 90 OSDs at the moment, all SSD, that is
> > >>>>> not quite hitting the performance it should in my opinion. A
> > >>>>> `rados bench` run gives something along these numbers:
> > >>>>> 
> > >>>>> Maintaining 16 concurrent writes of 4194304 bytes to objects of
> > >>>>> size 4194304 for up to 10 seconds or 0 objects
> > >>>>> Object prefix: benchmark_data_bench.vexxhost._30340
> > >>>>>  sec Cur ops  started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
> > >>>>>    0       0        0         0         0         0           -          0
> > >>>>>    1      16      158       142   568.513       568   0.0965336  0.0939971
> > >>>>>    2      16      287       271   542.191       516   0.0291494   0.107503
> > >>>>>    3      16      375       359    478.75       352   0.0892724   0.118463
> > >>>>>    4      16      477       461   461.042       408   0.0243493   0.126649
> > >>>>>    5      16      540       524   419.216       252    0.239123   0.132195
> > >>>>>    6      16      644       628    418.67       416    0.347606   0.146832
> > >>>>>    7      16      734       718   410.281       360   0.0534447   0.147413
> > >>>>>    8      16      811       795   397.487       308   0.0311927    0.15004
> > >>>>>    9      16      879       863   383.537       272   0.0894534   0.158513
> > >>>>>   10      16      980       964   385.578       404   0.0969865   0.162121
> > >>>>>   11       3      981       978   355.613        56    0.798949   0.171779
> > >>>>> Total time run:         11.063482
> > >>>>> Total writes made:      981
> > >>>>> Write size:             4194304
> > >>>>> Object size:            4194304
> > >>>>> Bandwidth (MB/sec):     354.68
> > >>>>> Stddev Bandwidth:       137.608
> > >>>>> Max bandwidth (MB/sec): 568
> > >>>>> Min bandwidth (MB/sec): 56
> > >>>>> Average IOPS:           88
> > >>>>> Stddev IOPS:            34
> > >>>>> Max IOPS:               142
> > >>>>> Min IOPS:               14
> > >>>>> Average Latency(s):     0.175273
> > >>>>> Stddev Latency(s):      0.294736
> > >>>>> Max latency(s):         1.97781
> > >>>>> Min latency(s):         0.0205769
> > >>>>> Cleaning up (deleting benchmark objects)
> > >>>>> Clean up completed and total clean up time: 3.895293
> > >>>>> 
> > >>>>> We've verified the network by running `iperf` across both the
> > >>>>> replication and public networks and it resulted in 9.8Gb/s (10G
> > >>>>> links for both). The machine that's running the benchmark doesn't
> > >>>>> even saturate its port. The SSDs are S3520 960GB drives which
> > >>>>> we've benchmarked and they can handle the load using fio/etc. At
> > >>>>> this point, not really sure where to look next... anyone running
> > >>>>> all-SSD clusters that might be able to share their experience?
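A quick sanity check on the bench output quoted above: with a fixed
number of writes in flight, throughput is bounded by per-operation
latency (Little's law), and the run's figures fit that almost exactly.
A minimal Python sketch using the numbers from the output:

    # Little's law applied to the rados bench figures quoted above:
    # sustained IOPS ~= writes in flight / average per-op latency.
    concurrency = 16            # the bench keeps 16 concurrent writes
    avg_latency_s = 0.175273    # "Average Latency(s)" from the output
    object_mib = 4              # 4194304-byte objects

    iops = concurrency / avg_latency_s   # ~91, vs. 88 IOPS measured
    mb_s = iops * object_mib             # ~365, vs. 354.68 MB/s measured
    print(f"predicted ~{iops:.0f} IOPS, ~{mb_s:.0f} MB/s")

Both predictions land within a few percent of the measured values, which
says this run was capped by per-operation latency, not by the 10G links.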
> > >>>> 
> > >>>> I suggest that you search a bit on the ceph-users list, since this
> > >>>> topic has been discussed multiple times in the past and even
> > >>>> recently.
> > >>>> 
> > >>>> Ceph isn't your average storage system and you have to keep that
> > >>>> in mind. Nothing is free in this world. Ceph provides excellent
> > >>>> consistency and distribution of data, but that also means you have
> > >>>> things like network and CPU latency.
> > >>>> 
> > >>>> However, I suggest you look up a few threads on this list which
> > >>>> have valuable tips.
> > >>>> 
> > >>>> Wido
> > >>> 
> > >>> Thanks for the reply. I've actually done quite a lot of research
> > >>> and went through many of the previous posts. While I agree 100%
> > >>> with your statement, I've found that other people with similar
> > >>> setups have been able to reach numbers that I cannot, which leads
> > >>> me to believe that there is actually an issue here. They have been
> > >>> able to max out at 1200 MB/s, which is the maximum of their
> > >>> benchmarking host. We'd like to reach that, and I think that given
> > >>> the specifications of the cluster, it can do it with no problems.
> > >> 
> > >> A few tips:
> > >> 
> > >> - Disable all logging in Ceph (debug_osd, debug_ms, debug_auth,
> > >>   etc.)
> > > 
> > > All logging is configured to default settings, should those be
> > > turned down?
> > > 
> > >> - Disable power saving on the CPUs
> > > 
> > > All disabled as well, everything running in `performance` mode.
> > > 
> > >> 
> > >> Can you also share how the 90 OSDs are distributed in the cluster
> > >> and what CPUs you have?
> > > 
> > > There are 45 machines with 2 OSDs each. The servers they're located
> > > on have, on average, 24-core ~3GHz Intel CPUs. Both OSDs are pinned
> > > to two cores on the system.
> > > 
> > >> 
> > >> Wido
> > >> 
> > >>> 
> > >>>>> 
> > >>>>> Thanks,
> > >>>>> Mohammed
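On the logging tip above: "disable all logging" in these threads usually
means zeroing the debug levels in ceph.conf. A sketch of the kind of
stanza meant follows; the set of debug_* subsystems worth touching
varies by release, so treat this as illustrative, not exhaustive:

    [global]
        debug_ms = 0/0
        debug_auth = 0/0
        debug_monc = 0/0

    [osd]
        debug_osd = 0/0
        debug_filestore = 0/0
        debug_journal = 0/0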
> -- 
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Rakuten Communications
> http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
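A closing note on the O_DIRECT/O_DSYNC caveat Somnath raised in the
thread: an fio run only predicts the (filestore) journal's behaviour if
it uses synchronous writes. Below is a minimal Python sketch of such a
probe; the path is a placeholder, and fio with --direct=1 --sync=1 at 4k
against the journal device is the more standard way to measure the same
thing:

    import os, time

    path = "/mnt/ssd-under-test/dsync-probe"  # hypothetical mount point
    count = 1000
    buf = b"\0" * 4096                        # 4k blocks, like the usual fio test

    # O_DSYNC makes every write() wait until the device reports the data
    # stable -- the pattern the filestore journal generates.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, buf)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

    print(f"{count / elapsed:.0f} sync-write IOPS, "
          f"{elapsed / count * 1000:.2f} ms average latency")

Drives with power-loss-protected caches (like the DC S series discussed
above) typically sustain thousands of these sync writes per second;
consumer drives often collapse to a few hundred or less.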