Re: [ceph-users] Preconditioning an RBD image

Nick Fisk Thu, 23 Mar 2017 14:19:34 -0700

Hi Peter,


Interesting graph. Out of interest, when you use bcache, do you just leave
the journal collocated on the combined bcache device and rely on writeback
to provide journal performance, or do you still create a separate journal
partition on whatever SSD/NVMe you use, effectively giving triple write
overhead?

 

Nick

 

From: ceph-users [mailto:[email protected]] On Behalf Of
Peter Maloney
Sent: 22 March 2017 10:06
To: Alex Gorbachev <[email protected]>; ceph-users
<[email protected]>
Subject: Re: [ceph-users] Preconditioning an RBD image

 

Does iostat (e.g. iostat -xmy 1 /dev/sd[a-z]) show high %util or await
during these problems?
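
For anyone who wants to scan a captured iostat log rather than watch it
live, here is a minimal sketch of that check. The function name and both
thresholds (90 %util, 50 ms await) are assumptions for illustration, not
values from this thread. It reads column names from the "Device" header
line, since the await columns differ between sysstat versions.

```python
def busy_devices(iostat_text, util_limit=90.0, await_limit=50.0):
    """Return {device: {column: value}} for rows at or above either limit.

    iostat_text is the captured output of something like
    `iostat -xmy 1 /dev/sd[a-z]`. The limits are illustrative assumptions.
    """
    await_names = ("await", "r_await", "w_await")
    columns = None
    flagged = {}
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Header line: "Device:" on older sysstat, "Device" on newer ones.
        if fields[0].rstrip(":") == "Device":
            columns = fields[1:]
            continue
        # Skip anything that is not a device row matching the last header.
        if columns is None or len(fields) != len(columns) + 1:
            continue
        try:
            row = dict(zip(columns, (float(v) for v in fields[1:])))
        except ValueError:
            continue  # non-numeric row (or a locale using decimal commas)
        over = {}
        if row.get("%util", 0.0) >= util_limit:
            over["%util"] = row["%util"]
        for name in await_names:
            if row.get(name, 0.0) >= await_limit:
                over[name] = row[name]
        if over:
            # Keeps the most recent sample that exceeded a limit.
            flagged[fields[0]] = over
    return flagged
```

Usage would be along the lines of busy_devices(open("iostat.log").read()),
where iostat.log is a hypothetical file holding the saved output.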

Ceph filestore does a lot of metadata writing (directory splitting, xattrs,
leveldb, etc.), which means small sync writes that HDDs are bad at (100-300
iops) and SSDs are good at (a cheapo one would do 6k iops, and a not so
crazy DC/NVMe one 20-200k iops or more). So in theory these things are
mitigated by putting an SSD in front, for example bcache on your OSD device.
You could try something like that, at least to test.
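
To get a rough feel for the small sync write numbers above on a given
device, something like the following sketch can help. It is only an
approximation under stated assumptions: it appends 4 KiB blocks to a file
and calls fsync after each one, so it measures the filesystem plus the
device, not the device alone, and it is no substitute for a proper
benchmark tool. The function name, block size, and write count are all
chosen for illustration.

```python
import os
import time


def sync_write_iops(path, count=200, block_size=4096):
    """Time `count` small writes, each followed by fsync; return writes/sec.

    `path` should be a scratch file on the device under test. The file is
    created if missing and truncated if it already exists.
    """
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)  # force each write to stable storage before the next
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")
```

Run it once against a scratch file on the plain HDD and once against one
on the bcache-backed device, then compare the two results.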

I have tested with bcache in writeback mode and found hugely obvious
differences in iostat. For example, here's my before and after (the heavier
load around week 49-50 or so is due to converting, and the highest spikes
are the scrub infinite loop bug in 10.2.3):

http://www.brockmann-consult.de/ganglia/graph.php?cs=10%2F25%2F2016+10%3A27&ce=03%2F09%2F2017+17%3A26&z=xlarge&hreg[]=ceph.*&mreg[]=sd[c-z]_await&glegend=show&aggregate=1&x=100

But when you share a cache device, you get a single point of failure (and
bcache, like all software, can be assumed to have bugs too). I recommend
vanilla kernel 4.9 or later, which has many bcache fixes, or Ubuntu's 4.4
kernel, which has the specific fixes I checked for.

On 03/21/17 23:22, Alex Gorbachev wrote:

I wanted to share a recent experience in which a few RBD volumes, formatted
as XFS and exported via Ubuntu nfs-kernel-server, performed poorly and even
generated "out of space" warnings on a nearly empty filesystem. I tried a
variety of hacks and fixes to no effect, until things started magically
working just after some dd write testing.

 

The only explanation I can come up with is that preconditioning, or
thickening, the images with this benchmarking is what caused the
improvement.
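
The exact dd commands are not given in this thread, so the following is
only a stand-in sketch of what "preconditioning" by sequential writes could
look like: it writes zeros over the first total_bytes of an existing target
in 4 MiB blocks, then calls fsync. The function name, block size, and the
choice of zeros are assumptions. Whether zero-filled writes actually
allocate space on the backing image depends on the stack and should be
verified. Note that it overwrites whatever is at the target.

```python
import os


def precondition(path, total_bytes, block_size=4 * 1024 * 1024):
    """Sequentially write zeros over the first total_bytes of `path`.

    `path` must already exist (a scratch file on the mounted filesystem, or
    a block device holding no data you need). Returns the bytes written.
    """
    block = b"\0" * block_size
    written = 0
    # No O_CREAT on purpose: a mistyped device path fails instead of
    # silently creating a regular file.
    fd = os.open(path, os.O_WRONLY)
    try:
        while written < total_bytes:
            chunk = min(block_size, total_bytes - written)
            # os.write may write less than requested; the loop accounts for it.
            written += os.write(fd, block[:chunk])
        os.fsync(fd)  # make sure the data has reached the backing store
    finally:
        os.close(fd)
    return written
```

Pointed at a scratch file, this only fills that file. Pointed at a block
device, it destroys the filesystem on it, so it is only suitable for a
freshly created image that has not been formatted yet.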

 

Ceph is Hammer 0.94.7 running on Ubuntu 14.04, kernel 4.10 on OSD nodes and
4.4 on NFS nodes.

 

Regards,

Alex

Storcium

_______________________________________________
ceph-users mailing list
[email protected] <mailto:[email protected]> 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 

 

-- 
 
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: [email protected]
<mailto:[email protected]> 
Internet: http://www.brockmann-consult.de
--------------------------------------------
