Here are the results:

dd of=ddbenchfile if=/dev/zero bs=8K count=1000000 oflag=dsync
8192000000 bytes (8.2 GB) copied, 266.873 s, 30.7 MB/s
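The dsync dd gives a rough floor, but for a more apples-to-apples
native-vs-Ceph comparison a synchronous small-block fio run is
probably the better tool. A minimal sketch -- the file and device
paths are placeholders, and note that writing to a raw RBD device
destroys whatever is on it:

# Native SSD: synchronous 8K writes to a file on the local FS
fio --name=ssd-sync8k --filename=./fiotest --size=1G --rw=write \
    --bs=8k --direct=1 --sync=1 --runtime=30 --time_based

# Same workload against a mapped RBD image (placeholder device path)
fio --name=rbd-sync8k --filename=/dev/rbd0 --rw=write \
    --bs=8k --direct=1 --sync=1 --runtime=30 --time_based

--sync=1 opens the target O_SYNC, which is roughly what oflag=dsync
does for dd, so the two sets of numbers should be comparable.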
On Tue, Sep 17, 2013 at 5:03 PM, Gregory Farnum <[email protected]> wrote:
> Try it with oflag=dsync instead? I'm curious what kind of variation
> these disks will provide.
>
> Anyway, you're not going to get the same kind of performance with
> RADOS on 8k sync IO that you will with a local FS. It needs to
> traverse the network and go through work queues in the daemon; your
> primary limiter here is probably the per-request latency that you're
> seeing (average ~30 ms, looking at the rados bench results). The good
> news is that means you should be able to scale out to a lot of
> clients, and if you don't force those 8k sync IOs (which RBD won't,
> unless the application asks for them itself using direct IO or
> frequent fsync or whatever) your performance will go way up.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> On Tue, Sep 17, 2013 at 1:47 PM, Jason Villalta <[email protected]> wrote:
> >
> > Here are the stats with direct IO.
> >
> > dd of=ddbenchfile if=/dev/zero bs=8K count=1000000 oflag=direct
> > 8192000000 bytes (8.2 GB) copied, 68.4789 s, 120 MB/s
> >
> > dd if=ddbenchfile of=/dev/null bs=8K
> > 8192000000 bytes (8.2 GB) copied, 19.7318 s, 415 MB/s
> >
> > These numbers are still much faster overall than with RADOS bench.
> > The replica count is set to 2. The journals are on the same disk as
> > the OSDs, but in separate partitions.
> >
> > I kept the block size the same, 8K.
> >
> > On Tue, Sep 17, 2013 at 11:37 AM, Campbell, Bill
> > <[email protected]> wrote:
> >>
> >> As Gregory mentioned, your 'dd' test looks to be reading from the
> >> cache (you are writing 8GB in and then reading that 8GB out, so
> >> the reads are all cached reads), and the performance is going to
> >> seem good. You can add 'oflag=direct' to your dd test to try to
> >> get a more accurate reading.
> >>
> >> RADOS performance, from what I've seen, is largely going to hinge
> >> on replica size and journal location. Are your journals on
> >> separate disks or on the same disk as the OSD? What is the replica
> >> size of your pool?
> >>
> >> ________________________________
> >> From: "Jason Villalta" <[email protected]>
> >> To: "Bill Campbell" <[email protected]>
> >> Cc: "Gregory Farnum" <[email protected]>, "ceph-users"
> >> <[email protected]>
> >> Sent: Tuesday, September 17, 2013 11:31:43 AM
> >> Subject: Re: [ceph-users] Ceph performance with 8K blocks.
> >>
> >> Thanks for your feedback; it is helpful.
> >>
> >> I may have been wrong about the default Windows block size. What
> >> would be the best tests to compare native performance of the SSD
> >> disks at 4K blocks vs Ceph performance with 4K blocks? It just
> >> seems there is a huge difference in the results.
> >>
> >> On Tue, Sep 17, 2013 at 10:56 AM, Campbell, Bill
> >> <[email protected]> wrote:
> >>>
> >>> Windows default (NTFS) is a 4k block. Are you changing the
> >>> allocation unit to 8k as a default for your configuration?
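A side note for anyone reproducing the Windows end of this: the
allocation unit of an existing NTFS volume shows up as "Bytes Per
Cluster" in fsutil output, and it is normally fixed at format time.
The drive letters below are just examples:

fsutil fsinfo ntfsinfo C:
format E: /A:8192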
> >>> ________________________________
> >>> From: "Gregory Farnum" <[email protected]>
> >>> To: "Jason Villalta" <[email protected]>
> >>> Cc: [email protected]
> >>> Sent: Tuesday, September 17, 2013 10:40:09 AM
> >>> Subject: Re: [ceph-users] Ceph performance with 8K blocks.
> >>>
> >>> Your 8k-block dd test is not nearly the same as your 8k-block
> >>> rados bench or SQL tests. Both rados bench and SQL require the
> >>> write to be committed to disk before moving on to the next one;
> >>> dd is simply writing into the page cache. So you're not going to
> >>> get 460 or even 273 MB/s with sync 8k writes regardless of your
> >>> settings.
> >>>
> >>> However, I think you should be able to tune your OSDs into
> >>> somewhat better numbers -- that rados bench is giving you ~300
> >>> IOPS on every OSD (with a small pipeline!), and an SSD-based
> >>> daemon should be going faster. What kind of logging are you
> >>> running with, and what configs have you set?
> >>>
> >>> Hopefully you can get Mark or Sam or somebody who's done some
> >>> performance tuning to offer some tips as well. :)
> >>> -Greg
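On Greg's "small pipeline" point: rados bench defaults to 16
concurrent operations, so one quick check of whether per-request
latency (rather than disk throughput) is the ceiling is to rerun the
8K write test with a deeper queue via -t, e.g.:

rados bench -b 8192 -t 64 -p pbench 30 write

If the aggregate bandwidth scales up with -t, the SSDs have headroom
and the 8K number is latency-bound rather than disk-bound.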
> >>> On Tuesday, September 17, 2013, Jason Villalta wrote:
> >>>>
> >>>> Hello all,
> >>>> I am new to the list.
> >>>>
> >>>> I have a single machine set up for testing Ceph. It has dual
> >>>> 6-core processors (12 cores total) and 128GB of RAM. I also have
> >>>> 3 Intel 520 240GB SSDs, with an OSD on each disk and the OSD and
> >>>> journal in separate partitions formatted with ext4.
> >>>>
> >>>> My goal here is to prove just how fast Ceph can go and what kind
> >>>> of performance to expect when using it as back-end storage for
> >>>> virtual machines, mostly Windows. I would also like to try to
> >>>> understand how it scales IO by removing one disk of the three
> >>>> and redoing the benchmark tests, but that is secondary. So far
> >>>> here are my results. I am aware this is all sequential; I just
> >>>> want to know how fast it can go.
> >>>>
> >>>> DD IO test of the SSD disks (I am testing 8K blocks since that
> >>>> is the default block size of Windows):
> >>>> dd of=ddbenchfile if=/dev/zero bs=8K count=1000000
> >>>> 8192000000 bytes (8.2 GB) copied, 17.7953 s, 460 MB/s
> >>>>
> >>>> dd if=ddbenchfile of=/dev/null bs=8K
> >>>> 8192000000 bytes (8.2 GB) copied, 2.94287 s, 2.8 GB/s
> >>>>
> >>>> RADOS bench test with 3 SSD disks and 4MB object size (default):
> >>>> rados --no-cleanup bench -p pbench 30 write
> >>>> Total writes made: 2061
> >>>> Write size: 4194304
> >>>> Bandwidth (MB/sec): 273.004
> >>>> Stddev Bandwidth: 67.5237
> >>>> Max bandwidth (MB/sec): 352
> >>>> Min bandwidth (MB/sec): 0
> >>>> Average Latency: 0.234199
> >>>> Stddev Latency: 0.130874
> >>>> Max latency: 0.867119
> >>>> Min latency: 0.039318
> >>>> -----
> >>>> rados bench -p pbench 30 seq
> >>>> Total reads made: 2061
> >>>> Read size: 4194304
> >>>> Bandwidth (MB/sec): 956.466
> >>>> Average Latency: 0.0666347
> >>>> Max latency: 0.208986
> >>>> Min latency: 0.011625
> >>>>
> >>>> This all looks like what I would expect from three disks. The
> >>>> problems appear to come with the 8K block/object size.
> >>>>
> >>>> RADOS bench test with 3 SSD disks and 8K object size (8K blocks):
> >>>> rados --no-cleanup bench -b 8192 -p pbench 30 write
> >>>> Total writes made: 13770
> >>>> Write size: 8192
> >>>> Bandwidth (MB/sec): 3.581
> >>>> Stddev Bandwidth: 1.04405
> >>>> Max bandwidth (MB/sec): 6.19531
> >>>> Min bandwidth (MB/sec): 0
> >>>> Average Latency: 0.0348977
> >>>> Stddev Latency: 0.0349212
> >>>> Max latency: 0.326429
> >>>> Min latency: 0.0019
> >>>> ------
> >>>> rados bench -b 8192 -p pbench 30 seq
> >>>> Total reads made: 13770
> >>>> Read size: 8192
> >>>> Bandwidth (MB/sec): 52.573
> >>>> Average Latency: 0.00237483
> >>>> Max latency: 0.006783
> >>>> Min latency: 0.000521
> >>>>
> >>>> So are these performance numbers correct, or is there something
> >>>> I missed in the testing procedure? The RADOS bench numbers with
> >>>> the 8K block size are the same ones we see when testing
> >>>> performance in a VM with SQLIO. Does anyone know of any
> >>>> configuration changes needed to get Ceph performance closer to
> >>>> native performance with 8K blocks?
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>> --
> >>>> Jason Villalta
> >>>> Co-founder
> >>>> 800.799.4407x1230 | www.RubixTechnology.com

--
Jason Villalta
Co-founder
800.799.4407x1230 | www.RubixTechnology.com
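P.S. For anyone who lands on this thread with the same question: the
knobs that usually come up in these tuning discussions are OSD logging
levels and journal/filestore behavior. A sketch of the sort of [osd]
section people experiment with -- the values here are illustrative
only, not recommendations, so check the documentation for your
release:

[osd]
    debug osd = 0/0
    debug ms = 0/0
    debug filestore = 0/0
    debug journal = 0/0
    osd op threads = 8
    filestore max sync interval = 10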
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
