First, on your comment: "we found that during times where the cache pool flushed to the storage pool client IO took a severe hit"
We found the same thing: http://blog.wadeit.io/ceph-cache-tier-performance-random-writes/ -- I don't claim this is a great write-up, and it's not what a lot of folks are interested in, but it is what I was after.

Great on your fio test. However, take a look at the response time: naturally it will increase after 4-5 concurrent writes, which is of course what you were saying and is correct. However, I think we can generally accept a slightly higher response time, and therefore iodepth > 1 is a more real-world test. Just my thoughts. You did the right thing, and tested well.

Some might not like it, but I like Sebastien's journal size calculation and it has served me well: http://slides.com/sebastienhan/ceph-performance-and-benchmarking#/24

Cheers
Wade

On Thu, Feb 4, 2016 at 7:24 AM Sascha Vogt <[email protected]> wrote:
> Hi,
>
> On 04.02.2016 at 12:59, Wade Holler wrote:
> > You referenced parallel writes for journal and data, which is the
> > default for btrfs but not XFS. Now you are mentioning multiple
> > parallel writes to the drive, which of course will occur.
> Ah, that is good to know. So if I want to create more "parallelism" I
> should use btrfs then. Thanks a lot, that's a very critical bit of
> information :)
>
> > Also, our Dell 400 GB NVMe drives do not top out around 5-7 sequential
> > writes as you mentioned. That would be 5-7 random writes from the
> > drive's perspective, and the NVMe drives can do many times that.
> Hm, I used the following fio bench from [1]:
>
> fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k
> --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
> --name=journal-test
>
> Our disks showed the following bandwidths (#<no> is the numjobs
> parameter):
>
> #1: write: io=1992.2MB, bw=33997KB/s, iops=8499
> #2: write: io=5621.6MB, bw=95940KB/s, iops=23984
> #3: write: io=8062.8MB, bw=137602KB/s, iops=34400
> #4: write: io=9114.1MB, bw=155545KB/s, iops=38886
> #5: write: io=8860.7MB, bw=151169KB/s, iops=37792
>
> Also for more jobs (tried up to 8) bandwidth stayed at around 150 MB/s
> and around 37k IOPS. So I figured that around 5 should be the sweet spot
> in terms of journals on a single disk.
>
> > I would park it at 5-6 partitions per NVMe, journal on the same disk.
> > Frequently I want more concurrent operations, rather than all-out
> > throughput.
> For journal on the same partition, should I limit the journal size? If
> yes, what should the limit be? Rather large or rather small?
>
> Greetings
> -Sascha-
>
> [1]
> http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
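To make the "sweet spot" reasoning explicit, here is a small sketch that computes the marginal IOPS gain per extra fio job, using the numbers reported above (the dict values are copied from those results; everything else is illustrative):

```python
# Total IOPS reported by the fio runs above, keyed by numjobs.
iops = {1: 8499, 2: 23984, 3: 34400, 4: 38886, 5: 37792}

# numjobs value with the highest total IOPS.
best = max(iops, key=iops.get)

for n, v in iops.items():
    if n == 1:
        print(f"numjobs={n}: {v} IOPS (baseline)")
    else:
        # Relative gain over the run with one fewer job.
        gain = v / iops[n - 1] - 1
        print(f"numjobs={n}: {v} IOPS ({gain:+.1%} vs {n - 1} jobs)")

print("peak total IOPS at numjobs =", best)
```

The marginal gain shrinks from roughly +182% (1 to 2 jobs) to +13% (3 to 4) and turns slightly negative at 5, which matches the observation that throughput plateaus around 4-5 concurrent writers.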
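On the journal size question: a minimal sketch of the sizing rule I linked above, assuming the usual Ceph guideline (journal size = 2 x expected throughput x filestore max sync interval); the function name and the example numbers are illustrative, not from Sebastien's slide verbatim:

```python
def journal_size_mb(throughput_mb_s, filestore_max_sync_interval_s=5):
    """Rule-of-thumb journal size in MB: twice the data the OSD can
    ingest between two filestore syncs, so the journal never fills
    before it is flushed."""
    return 2 * throughput_mb_s * filestore_max_sync_interval_s

# e.g. a journal partition absorbing ~150 MB/s with the default
# 5 s filestore max sync interval:
print(journal_size_mb(150))  # 1500 (MB)
```

In other words, rather small: the journal only needs to cover what accumulates between flushes, so a few GB per journal is typically plenty even on a fast NVMe.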
