I've just looked through the Linux kernel source on GitHub, and it looks like
that readahead fix was introduced in 4.4, so it might be worth trying a
slightly newer kernel?


From: Mike Miller <[email protected]>
Sent: 21 Apr 2016 2:20 pm
To: [email protected]
Subject: Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

Hi Udo, 

Thanks. Just to make sure, I further increased the readahead:

$ sudo blockdev --getra /dev/rbd0 
1048576 

$ cat /sys/block/rbd0/queue/read_ahead_kb 
524288 

No difference here. The first value is in 512-byte sectors, the second in KB.
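
As a quick sanity check that the two values above describe the same setting, here is a minimal sketch of the unit conversion. The helper name `sectors_to_kb` is made up for illustration; it only assumes that `blockdev --getra` reports 512-byte sectors and that `read_ahead_kb` is in KB, as stated above.

```shell
# Hypothetical helper: convert a readahead value given in 512-byte sectors
# (as printed by `blockdev --getra`) to KB (the unit of queue/read_ahead_kb).
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

sectors_to_kb 1048576   # prints 524288
```

So 1048576 sectors and 524288 KB are the same 512 MB readahead window, just expressed in different units.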

The second read (after dropping the client cache) is somewhat faster
(10%-20%), but not by much.
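
For anyone who wants to repeat the comparison, this is a rough sketch of how a single sequential read could be timed. The function name `read_mb_s` and the file path in the usage comments are placeholders, and it assumes GNU `date` (for `%N`) and `awk` on the client; it is not the exact procedure used for the numbers above.

```shell
# Hypothetical helper: time one sequential read of a file and print MB/s.
read_mb_s() {
    file=$1
    bytes=$(wc -c < "$file")          # file size in bytes
    start=$(date +%s.%N)              # GNU date: seconds.nanoseconds
    cat "$file" > /dev/null           # the sequential read being timed
    end=$(date +%s.%N)
    awk -v b="$bytes" -v s="$start" -v e="$end" 'BEGIN {
        d = e - s
        if (d <= 0) d = 0.000001      # guard against a zero interval
        printf "%.1f\n", b / d / 1000000
    }'
}

# Usage sketch (placeholder path; dropping caches needs root):
#   sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
#   read_mb_s /mnt/rbd/bigfile    # first read
#   sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
#   read_mb_s /mnt/rbd/bigfile    # second read, OSD caches possibly warm
```

Dropping the client cache before each run means any speedup on the second read would come from the OSD side rather than from the client page cache.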

I also found this info 
http://tracker.ceph.com/issues/9192 

Maybe Ilya can help us; he probably knows best how this can be improved.

Thanks and cheers, 

Mike 


On 4/21/16 4:32 PM, Udo Lembke wrote: 
> Hi Mike, 
> 
> Am 21.04.2016 um 09:07 schrieb Mike Miller: 
>> Hi Nick and Udo, 
>> 
>> thanks, very helpful. I tweaked some of the config parameters along
>> the lines Udo suggested, but I still get only about 80 MB/s.
> This means you have gained a factor of 3 (roughly the value I see
> with a single thread on RBD too). Better than nothing.
> 
>> 
>> Kernel 4.3.4 is running on the client machine, with a generous readahead
>> configured:
>> 
>> $ sudo blockdev --getra /dev/rbd0 
>> 262144 
>> 
>> Still not more than about 80-90 MB/s. 
> There are two settings for read-ahead.
> Take a look here (and change it with echo):
> cat /sys/block/rbd0/queue/read_ahead_kb
>
> Perhaps there are slight differences?
> 
>> 
>> For writing, the parallelization is amazing and I see very impressive
>> speeds, but why is read performance so far behind? Why is it not
>> parallelized the same way writing is? Is this something coming up in
>> the jewel release? Or is it planned further down the road?
> If you read a big file and clear your cache ("echo 3 >
> /proc/sys/vm/drop_caches") on the client, is the second read very fast?
> I assume yes.
> In this case the data that was read is in the cache on the OSD nodes... so
> the tuning must be done there (and I'm very interested in improvements).
> 
> Udo 
_______________________________________________ 
ceph-users mailing list 
[email protected] 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 