On Tue, Jan 20, 2015 at 10:07 AM, James <[email protected]> wrote:
> Bill Kenworthy <billk <at> iinet.net.au> writes:
>
>> You can turn off COW and go single on btrfs to speed it up but bugs in
>> ceph and btrfs lose data real fast!
>
> Interesting idea, since I'll have raid1 underneath each node. I'll need to
> dig into this idea a bit more.
>

So, btrfs and ceph solve an overlapping set of problems in an
overlapping set of ways.  In general, adding data protection comes at
the cost of performance, and adding it at multiple layers costs more
still.  I think the right solution is going to depend on the
circumstances.

If ceph itself provided protection against bitrot I'd probably avoid
a COW filesystem entirely.  It wouldn't add any value, and COW
filesystems do have a performance cost.  If I had mirroring at the ceph
level I'd probably just run them on ext4 on lvm with no
mdadm/btrfs/whatever below that.  Availability is already ensured by
ceph - if you lose a drive then other nodes will pick up the load.  If
I didn't have robust mirroring at the ceph level then having mirroring
of some kind at the individual node level would improve availability.

On the other hand, ceph currently has some gaps, so having it on top
of zfs/btrfs could provide protection against bitrot.  However, right
now btrfs has no way to turn off COW while leaving checksumming
enabled: nodatacow turns off data checksums as well.  It would be nice
if you could leave the checksumming on.
Then if there was bitrot btrfs would just return an error when you
tried to read the file, and then ceph would handle it like any other
disk error and use a mirrored copy on another node.  The problem with
ceph+ext4 is that if there is bitrot neither layer will detect it.
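
To make that concrete, here is a rough sketch of the behavior I'm
describing.  It is a toy model, not how ceph or btrfs are actually
implemented: the Node class, ChecksumError, replicated_read and demo
are all names I made up for illustration, and CRC32 just stands in for
whatever checksum a real filesystem would use.

```python
import zlib


class ChecksumError(IOError):
    """Raised when a stored block no longer matches its checksum."""


class Node:
    """Hypothetical storage node that keeps a CRC32 next to each block."""

    def __init__(self, verify):
        self.verify = verify  # True ~ checksumming fs, False ~ ext4-like
        self.blocks = {}      # key -> (bytearray data, crc at write time)

    def write(self, key, data):
        self.blocks[key] = (bytearray(data), zlib.crc32(data))

    def read(self, key):
        data, crc = self.blocks[key]
        # A checksumming filesystem refuses to hand back rotten data.
        if self.verify and zlib.crc32(bytes(data)) != crc:
            raise ChecksumError(key)
        return bytes(data)


def replicated_read(nodes, key):
    """Return the first copy that reads cleanly, as a mirror layer would."""
    for node in nodes:
        try:
            return node.read(key)
        except ChecksumError:
            continue  # treat it like any other disk error, try the next copy
    raise IOError("no good copy of %r" % key)


def demo(verify):
    nodes = [Node(verify), Node(verify)]
    for node in nodes:
        node.write("obj", b"hello world")
    nodes[0].blocks["obj"][0][0] ^= 0xFF  # simulate bitrot in the first copy
    return replicated_read(nodes, "obj")
```

With verify=True the rotten copy raises an error and the read falls
through to the good copy on the second node.  With verify=False
nothing notices, and the corrupted bytes are returned as if they were
fine.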

Does btrfs+ceph really have a performance hit that is larger than
btrfs without ceph?  I fully expect it to be slower than ext4+ceph.
Btrfs in general performs fairly poorly right now - that is expected
to improve in the future, but I doubt that it will ever outperform
ext4 other than for specific operations that benefit from it (like
reflink copies).  It will always be faster to just overwrite one block
in the middle of a file than to write the block out to unallocated
space and update all the metadata.
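
A toy model of why that is, under heavy simplifying assumptions: a
file is a small tree whose leaves are data blocks, and a COW update has
to write a new copy of every node on the path from the changed leaf up
to the root.  The build and cow_update functions are hypothetical, and
real btrfs batches metadata updates per transaction, so the actual
overhead per write is lower than this suggests.

```python
def build(depth, fanout=2):
    """Build a full tree: inner nodes are tuples of children, leaves are bytes."""
    if depth == 0:
        return b"old"
    return tuple(build(depth - 1, fanout) for _ in range(fanout))


def cow_update(node, path, data):
    """Return (new_node, blocks_written) without modifying the old tree."""
    if not path:
        return data, 1  # the new data block goes to unallocated space
    i = path[0]
    child, written = cow_update(node[i], path[1:], data)
    # The parent must be rewritten too, since it now points at the new copy.
    new_node = node[:i] + (child,) + node[i + 1:]
    return new_node, written + 1


root = build(3)
new_root, written = cow_update(root, [0, 1, 0], b"new")
in_place_written = 1  # an in-place overwrite touches just the one data block
```

In this model the COW update writes four blocks (one data block plus
three levels of metadata) where the in-place overwrite writes one.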

-- 
Rich
