On Tue, 23 Dec 2014 22:24:30 -0500, Rich Freeman wrote:

> On Tue, Dec 23, 2014 at 4:08 PM, Holger Hoffstätte
> <holger.hoffstae...@googlemail.com> wrote:
>> On Tue, 23 Dec 2014 21:54:00 +0100, Stefan G. Weichinger wrote:
>>
>>> In the other direction: what protects against these errors you
>>> mention?
>>
>> ceph scrub :)
>>
>>
> Are you sure about that?  I was under the impression that it just
> checked that everything was retrievable.  I'm not sure if it compares
> all the copies of everything to make sure that they match, and if they
> don't match I don't think that it has any way to know which one is
> right.  I believe an algorithm just picks one as the official version,
> and it may or may not be identical to the one that was originally
> stored.

There's light and deep scrub; the former does roughly what you described
(it only compares object metadata such as sizes and attributes), while
deep scrub actually reads and checksums the data. In case of a mismatch
it should resolve the conflict by quorum, i.e. majority vote among the
replicas. Whether that actually happens and/or works is another
matter. ;)
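
To make that concrete, here's roughly what quorum-based repair boils
down to, as a toy Python sketch of the idea (not Ceph's actual code;
the names are all mine):

import hashlib
from collections import Counter

def quorum_repair(replicas):
    """replicas: list of bytes, one per copy of the object."""
    digests = [hashlib.sha256(r).hexdigest() for r in replicas]
    winner, votes = Counter(digests).most_common(1)[0]
    # Majority wins -- but nothing here can tell whether the winning
    # copy is what the client originally wrote.
    return next(r for r, d in zip(replicas, digests) if d == winner), votes

With three replicas and a single flipped bit that's a clean 2-vs-1 vote;
with only two copies and no independently stored checksum, a mismatch is
basically a coin toss.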

Unfortunately a full point-in-time deep scrub and the resulting creation 
of checksums becomes more or less economically unviable as the amount of 
data grows; this really should be incremental. All distributed databases 
suffer from the same problem, and the better ones eventually adopted an 
incremental approach.

http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
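
By incremental I mean something along these lines (again just a toy
Python sketch of the scheduling idea, not how Ceph works; the names are
made up): remember per object when it was last verified and only
re-checksum a bounded, oldest-first slice per run, so the whole data set
is covered over a period instead of in one monster pass.

import hashlib
import time

def incremental_scrub(store, last_verified, run_budget):
    """store: dict name -> bytes; last_verified: dict name -> epoch secs;
    run_budget: max number of objects to checksum in this run."""
    due = sorted(store, key=lambda n: last_verified.get(n, 0.0))
    for name in due[:run_budget]:
        digest = hashlib.sha256(store[name]).hexdigest()
        # A real system would compare this against a checksum recorded
        # at write time, or against the other replicas.
        last_verified[name] = time.time()
        yield name, digest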

I know how btrfs scrub works, but it too (and in fact every storage
system) suffers from the problem of having to decide which copy is
"good"; they all just differ in the point in their timeline at which a
checksum is taken and considered valid. When we're talking about
preventing bitrot, just having another copy is usually enough.

On top of that btrfs will at least tell you which file is suspected, 
thanks to its wonderful backreferences.
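
The contrast with the quorum sketch above, once more as a toy Python
illustration (not btrfs code, and btrfs uses crc32c rather than zlib's
crc32): because the checksum was recorded at write time, a scrub knows
which mirror is bad instead of having to vote, and a block-to-file
backreference lets it name the affected file.

import zlib

def scrub_mirrored_block(copies, stored_csum, backrefs, block_no):
    """copies: bytes per mirror; stored_csum: checksum recorded at write
    time; backrefs: toy dict mapping block_no -> file path."""
    good = [c for c in copies if zlib.crc32(c) == stored_csum]
    path = backrefs.get(block_no, "?")
    if not good:
        print("unrecoverable csum error in block %d (file %s)" % (block_no, path))
        return None
    if len(good) < len(copies):
        print("csum error in block %d (file %s), repairing from a good copy"
              % (block_no, path))
    return good[0]  # the copy to rewrite any bad mirror(s) with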

-h

