Thank you for your reply, answers below.
> On 23 Jun 2015, at 13:15, Christian Balzer <ch...@gol.com> wrote:
>
> Hello,
>
> On Tue, 23 Jun 2015 12:53:45 +0200 Jan Schermer wrote:
>
>> I use CFQ but I have just discovered it completely _kills_ writes when
>> also reading (doing backfill for example)
>>
> I've seen similar things, but for the record and so people can correctly
> reproduce things, please be specific.
>
> For starters, what version of Ceph?
>
0.67.12 dumpling (newest git). I know it’s ancient :-)

> CFQ with what kernel, with what filesystem, on what type of OSD (HDD, HDD
> with on disk journal, HDD with SSD journal)?
>
My test was done on the raw block device, not a filesystem, on an SSD. I tested several scenarios, but the simplest one is to run

fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio

and, at the same time,

fio --filename=/dev/sda --direct=1 --sync=1 --rw=randread --bs=32k --numjobs=1 --iodepth=8 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio

You will see the first fio’s IOPS drop to ~10. This of course depends on the drive, and the read job also saturates the SATA 2 link on my test machine (which might be the real cause). I am still testing various combinations; different drives have different thresholds (some only hit bottom with a 128k block size, which is larger than my average IO on these drives - not accounting for backfills). There is a point, though, where it just hits bottom and no amount of CFQ-tuning magic can help.

>> If I run a fio job for synchronous writes and at the same time run a fio
>> job for random reads, writes drop to 10 IOPS (oops!). Setting io
>> priority with ionice works nicely maintaining ~250 IOPS for writes while
>> throttling reads.
>>
> Setting the priority to what (level and type) on which process?
> The fio ones, the OSD ones?
>
ionice -c3 fio-for-read-test

This sets the class to idle. Setting the priority to 7 but leaving the class at best-effort helps, but not by much (10 -> 30 IOPS). (The exact invocations are sketched in the P.S. below.)

> Scrub and friends can really wreak havoc on one of my clusters which is 99%
> writes, same goes for the few times it has to do reads (VMs booting).
>
Scrubbing is fine on my cluster; backfilling onto new drives is what kills it - that’s what I’m investigating right now, and how I ran into this. So before I go scratching my head I thought I’d ask here - I’m probably not the first one to have this kind of problem :-)

Thanks
Jan

> Christian
>
>> I looked at osd_disk_thread_ioprio_class - for some reason the documentation
>> lists “idle”, “rt” and “be” as possible values, but in my case it only
>> accepts numbers (3 should be idle) - and it doesn’t seem to do anything
>> about the slow requests. Do I need to restart the OSD for it to take
>> effect? It actually looks like it made things even worse for me…
>>
>> Changing the scheduler to deadline improves the bottom line a lot for my
>> benchmark, but a large amount of reads can still drop that to 30 IOPS -
>> contrary to CFQ, which maintains a steady 250 IOPS for writes even under
>> read load.
>>
>> What would be the recommendation here? Did someone test this extensively
>> before?
>>
>> thanks
>>
>> Jan
>
>
> --
> Christian Balzer           Network/Systems Engineer
> ch...@gol.com              Global OnLine Japan/Fusion Communications
> http://www.gol.com/
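
P.S. A few command sketches for anyone reproducing this. First, the ionice variants on the read-side fio job (a sketch, assuming the read job is the one to deprioritize; ionice classes only have an effect under CFQ):

# idle class - this is what keeps the write job at ~250 IOPS
ionice -c3 fio --filename=/dev/sda --direct=1 --sync=1 --rw=randread --bs=32k --numjobs=1 --iodepth=8 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio

# best-effort class with the lowest priority (7) - helps, but only 10 -> 30 IOPS
ionice -c2 -n7 fio --filename=/dev/sda --direct=1 --sync=1 --rw=randread --bs=32k --numjobs=1 --iodepth=8 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio

# verify the class/priority of a running job
ionice -p $(pidof fio)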
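
For the osd_disk_thread_ioprio_class question, a minimal sketch of setting it at runtime via injectargs (assuming this works for the option in a dumpling build, and using the numeric form this build accepts; whether the disk thread picks it up without an OSD restart is exactly the open question above, and it only has an effect when the OSD’s data disk is on CFQ):

ceph tell osd.0 injectargs '--osd_disk_thread_ioprio_class 3 --osd_disk_thread_ioprio_priority 7'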
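
And the scheduler switch used when comparing deadline against CFQ (assuming sda is the device under test; this is per-device and does not survive a reboot):

cat /sys/block/sda/queue/scheduler              # prints e.g. noop deadline [cfq]
echo deadline > /sys/block/sda/queue/scheduler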