Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
Thanks for pointing that out, since it is incorrect for (semi-)modern QEMUs. All configuration starts at the Ceph defaults, is overwritten by your ceph.conf, and is then further overwritten by any QEMU-specific override. I would recommend retesting with "cache=writeback" to see if that helps.
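
For reference, a QEMU invocation along these lines would pass the writeback hint (a sketch only; the pool/image names and client id are placeholders, not taken from this thread):

    qemu-system-x86_64 -m 1024 \
        -drive format=raw,file=rbd:rbd/test-image:id=admin,cache=writeback

Under libvirt, the equivalent is cache='writeback' on the disk's <driver> element.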

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
The VM I am testing against was created after the librbd upgrade. I have always had some confusion around this bit in the docs (http://docs.ceph.com/docs/jewel/rbd/qemu-rbd/#qemu-cache-options): “QEMU’s cache settings override Ceph’s default settings (i.e., settings that are not explicitly set

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
On Fri, Oct 21, 2016 at 1:15 PM, Pavan Rallabhandi wrote:
> The QEMU cache is none for all of the rbd drives

Hmm -- if you have QEMU cache disabled, I would expect it to disable the librbd cache. I have to ask, but did you (re)start/live-migrate these VMs you are
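
To confirm which cache mode a running guest actually got, one can inspect the live domain definition or the QEMU command line (a sketch; "my-vm" is a placeholder domain name):

    $ virsh dumpxml my-vm | grep "cache="
    $ ps -ef | grep [q]emu | tr ',' '\n' | grep cache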

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
Thanks for verifying at your end, Jason. It’s pretty weird that the difference is >~10X: with "rbd_cache_writethrough_until_flush = true" I see ~400 IOPS, vs. ~6000 IOPS with "rbd_cache_writethrough_until_flush = false". The QEMU cache is none for all of the rbd drives. On that
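
One way to rule out a configuration mismatch is to query the value librbd is actually running with via the client admin socket, assuming an "admin socket" path is configured in the [client] section of ceph.conf (the .asok path below is a placeholder):

    $ ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok \
          config get rbd_cache_writethrough_until_flush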

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
I just tested from the v10.2.3 git tag on my local machine and averaged 2912.54 4K writes / second with "rbd_cache_writethrough_until_flush = false" and averaged 3035.09 4K writes / second with "rbd_cache_writethrough_until_flush = true" (queue depth of 1 in both cases). I used new images between
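
A fio job of roughly this shape reproduces a 4K, queue-depth-1 write test against an image through librbd (an approximation only -- the exact test commands are not shown here, and the pool/image names are placeholders):

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=qd1-test
    invalidate=0
    rw=randwrite
    bs=4k
    runtime=60
    time_based

    [qd1-write]
    iodepth=1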

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
And to add, the host running the Cinder services has Hammer 0.94.9, but the rest of them, like Nova, are on Jewel 10.2.3. FWIW, the rbd info for one such image looks like this:

rbd image 'volume-f6ec45e2-b644-4b58-b6b5-b3a418c3c5b2':
        size 2048 MB in 512 objects
        order 22 (4096 kB

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
From my VMs that have Cinder-provisioned volumes, I tried dd / fio (like below) and found the IOPS to be low; even a sync before the runs didn’t help. The same runs with the option set to false yield better results. Both the clients and the cluster are running 10.2.3; perhaps the only difference
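
For illustration, an in-guest run of this shape would exercise the same path (a sketch only; not necessarily the commands referred to above, and /dev/vdb is a placeholder -- note both commands write to the raw device and are destructive):

    $ fio --name=qd1 --ioengine=libaio --filename=/dev/vdb \
          --rw=randwrite --bs=4k --iodepth=1 --direct=1 \
          --runtime=60 --time_based
    $ dd if=/dev/zero of=/dev/vdb bs=4k count=10000 oflag=direct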

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
It's in the build and has tests to verify that it is properly being triggered [1].

$ git tag --contains 5498377205523052476ed81aebb2c2e6973f67ef
v10.2.3

What are your tests that say otherwise?

[1] https://github.com/ceph/ceph/pull/10797/commits/5498377205523052476ed81aebb2c2e6973f67ef

On Fri,

[ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
I see the fix for the writeback cache not getting turned on after a flush has made it into Jewel 10.2.3 (http://tracker.ceph.com/issues/17080), but our testing says otherwise. The cache is still behaving as if it’s writethrough, though the setting is set to true. Wanted to check if it’s still broken
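
For context, the setting under discussion lives in the [client] section of ceph.conf; a minimal sketch:

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true

With the fix in place, librbd is expected to stay in writethrough mode only until the guest issues its first flush, and then switch to writeback.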