I'll test this by manually corrupting data on the ZFS filesystem and report 
back on how ZFS and Ceph interact when a file failure/corruption is detected, 
how the cluster recovers, and any manual steps required. 

As for compression, with lz4 the CPU impact is around 5-20% depending on load, 
type of I/O and I/O size, with little-to-no I/O performance impact; in some 
cases I/O performance actually increases. I'm currently seeing a compression 
ratio on the ZFS datasets of around 30-35% for data consisting of rbd-backed 
OpenStack KVM VMs. I have not tried any sort of dedupe, as it is memory 
intensive and I only had 24GB of RAM on each node. I'll grab some FIO 
benchmarks and report back.
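
For anyone comparing numbers: ZFS reports the `compressratio` property as a 
multiplier (e.g. 1.45x) rather than a percentage. Assuming the 30-35% above is 
read as space saved (that reading is an assumption), a rough conversion sketch 
looks like this. The function names are made up for illustration.

```python
def savings_to_ratio(savings: float) -> float:
    """Convert fractional space savings (0.30 == 30%) to an x-style ratio."""
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in [0, 1)")
    # ratio = logical size / physical size = 1 / (1 - savings)
    return 1.0 / (1.0 - savings)


def ratio_to_savings(ratio: float) -> float:
    """Convert an x-style ratio (1.45 == 1.45x) to fractional space savings."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


if __name__ == "__main__":
    for pct in (30, 35):
        print(f"{pct}% saved is about {savings_to_ratio(pct / 100):.2f}x")
```

Under that assumption, 30-35% saved would correspond to a ratio somewhere 
between roughly 1.4x and 1.5x.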

Cheers,



-----Original Message-----
From: Christian Balzer [mailto:[email protected]] 
Sent: October-30-14 4:12 AM
To: ceph-users
Cc: Michal Kozanecki
Subject: Re: [ceph-users] use ZFS for OSDs

On Wed, 29 Oct 2014 15:32:57 +0000 Michal Kozanecki wrote:

[snip]
> With Ceph handling the
> redundancy at the OSD level I saw no need for using ZFS mirroring or 
> zraid, instead if ZFS detects corruption instead of self-healing it 
> sends a read failure of the pg file to ceph, and then ceph's scrub 
> mechanisms should then repair/replace the pg file using a good replica 
> elsewhere on the cluster. ZFS + ceph are a beautiful bitrot fighting 
> match!
> 
Could you elaborate on that? 
AFAIK Ceph currently has no way to determine which of the replicas is "good". 
One such failed PG object will require you to do a manual repair after the 
scrub and hope that the two surviving replicas (assuming a size of 3) are 
identical. If not, start tossing a coin.
Ideally Ceph would have a way to know what happened (as in, it's a checksum 
failure and not a real I/O error) and do a rebuild of that object itself.
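
To make the coin-toss problem concrete, here is a minimal sketch of the 
comparison an operator is effectively doing by hand. It is not how Ceph's scrub 
or repair is implemented; the function name and the use of SHA-256 are 
assumptions made purely for illustration.

```python
import hashlib
from collections import Counter
from typing import Optional, Sequence


def pick_majority_replica(replicas: Sequence[bytes]) -> Optional[bytes]:
    """Return the content held by a strict majority of the readable replicas.

    Returns None when no strict majority exists, i.e. when nothing but a
    coin toss can decide which copy is "good".
    """
    if not replicas:
        return None
    digests = [hashlib.sha256(r).hexdigest() for r in replicas]
    digest, count = Counter(digests).most_common(1)[0]
    if count * 2 <= len(replicas):
        return None  # tie or no agreement between copies
    return replicas[digests.index(digest)]


if __name__ == "__main__":
    # Size 3, one replica unreadable: only two survivors are compared.
    print(pick_majority_replica([b"object-v1", b"object-v1"]))
    print(pick_majority_replica([b"object-v1", b"object-vX"]))
```

With two identical survivors the choice is clear; with two differing survivors 
the function returns None, which is the coin-toss case.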

On another note, have you done any tests using the ZFS compression?
I'm wondering what the performance impact and efficiency are.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
[email protected]           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
