Hi Loic,
(C)RS stands for Cauchy Reed-Solomon codes, which are based purely on parity
(XOR) operations, while standard Reed-Solomon codes need more multiplications
and are slower.

Regarding the checksumming: for comparison, the CRC32 code from libz runs
on an 8-core Xeon at ~730 MB/s for small block sizes, while the SSE4.2 CRC32C
checksum runs at ~2 GB/s.
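
A rough way to take a measurement of this kind is sketched below. It is only
an illustration: it times Python's zlib.crc32 (a wrapper around libz CRC32, not
the SSE4.2 CRC32C path), the function name crc32_throughput_mb_s and the
64 MiB sample size are made up for this example, and at small block sizes the
Python call overhead is part of what gets measured, so the numbers are not
comparable to the figures quoted above.

```python
import time
import zlib


def crc32_throughput_mb_s(block_size=4096, total_bytes=64 * 1024 * 1024):
    """Return the measured zlib.crc32 throughput in MB/s for one block size."""
    block = bytes(block_size)              # zero-filled test buffer
    n_blocks = total_bytes // block_size
    start = time.perf_counter()
    for _ in range(n_blocks):
        zlib.crc32(block)                  # one checksum per block
    elapsed = time.perf_counter() - start
    return (n_blocks * block_size) / elapsed / 1e6


if __name__ == "__main__":
    # Compare a small block size against larger ones.
    for size in (4096, 65536, 1024 * 1024):
        rate = crc32_throughput_mb_s(size)
        print(f"{size:>8} bytes/block: {rate:10.1f} MB/s")
```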

Cheers Andreas.
________________________________________
From: Loic Dachary [[email protected]]
Sent: 05 July 2013 23:23
To: Andreas Joachim Peters
Cc: [email protected]
Subject: Re: CEPH Erasure Encoding + OSD Scalability

Hi Andreas,

On 04/07/2013 23:01, Andreas Joachim Peters wrote:
> Hi Loic,
> thanks for the responses!
>
> Maybe this is useful for your erasure code discussion:
>
> As an example, in our RS implementation we chunk a data block of e.g. 4M into
> 4 data chunks of 1M. Then we create 2 parity chunks.
>
> Data & parity chunks are split into 4k blocks and these 4k blocks get a 
> CRC32C block checksum each (SSE4.2 CPU extension => MIT library or BTRFS). 
> This creates 0.1% volume overhead (4 bytes per 4096 bytes) - nothing compared 
> to the parity overhead ...
>
> You can now easily detect data corruption using the local checksums and avoid
> reading any parity information or doing (C)RS decoding if no corruption is
> detected. Moreover, CRC32C computation is distributed over several (in this
> case 4) machines, while (C)RS decoding would run on the single machine where
> you assemble a block ... and CRC32C is faster than (C)RS decoding (with
> SSE4.2) ...
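
The per-block checksumming described in the quoted text (4k blocks, one 4-byte
CRC32C each) could be sketched roughly as follows. This is a minimal pure-Python
illustration, not the implementation Andreas refers to: it uses a table-driven
software CRC32C instead of the SSE4.2 instruction, and the names crc32c,
block_checksums, corrupted_blocks and checksum_overhead are hypothetical.

```python
CASTAGNOLI_POLY = 0x82F63B78  # reflected form of the CRC32C polynomial 0x1EDC6F41
BLOCK_SIZE = 4096             # 4k blocks, as in the quoted description
CHECKSUM_SIZE = 4             # one 32-bit CRC32C per block


def _make_table():
    """Build the 256-entry lookup table for the bitwise-reflected CRC32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ CASTAGNOLI_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data):
    """Software CRC32C (Castagnoli) of a bytes-like object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(chunk, block_size=BLOCK_SIZE):
    """Return one CRC32C per block of the chunk."""
    return [crc32c(chunk[i:i + block_size])
            for i in range(0, len(chunk), block_size)]


def corrupted_blocks(chunk, checksums, block_size=BLOCK_SIZE):
    """Return the indices of blocks whose checksum no longer matches."""
    current = block_checksums(chunk, block_size)
    return [i for i, (now, stored) in enumerate(zip(current, checksums))
            if now != stored]


def checksum_overhead(block_size=BLOCK_SIZE, checksum_size=CHECKSUM_SIZE):
    """Fraction of extra volume used by the checksums."""
    return checksum_size / block_size


if __name__ == "__main__":
    chunk = bytes(range(256)) * 64          # a small 16 KiB stand-in for a 1M chunk
    sums = block_checksums(chunk)
    damaged = bytearray(chunk)
    damaged[5000] ^= 0xFF                   # flip one byte inside block 1
    print(corrupted_blocks(bytes(damaged), sums))
    print(f"overhead: {checksum_overhead():.4%}")
```

Only when corrupted_blocks reports something does a reader need to fetch parity
and run the (C)RS decoder; otherwise the data chunks can be used directly.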

What does (C)RS mean? (C)Reed-Solomon?

> In our case we write this checksum information separately from the original
> data ... while in a block-based storage like CEPH it would probably be
> inlined in the data chunk.
> If an OSD detects that it is running on BTRFS or ZFS, one could automatically
> disable the CRC32C code.

Nice. I did not know that was built-in :-)
https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#scrubbing

> (wouldn't CRC32C also be useful for normal CEPH block replication?)

I don't know the details of scrubbing, but it seems CRC is already used by deep
scrubbing:

https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L2731

Cheers

> As far as I know, with the RS codec we use, stripes can be missing (data = 0)
> in the decoding process, but you cannot inject corrupted stripes into the
> decoding process, so the block checksumming is important.
>
> Cheers Andreas.
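
The difference between a missing stripe and a corrupted one can be illustrated
with a toy example. Note the assumptions: this sketch uses a single XOR parity
chunk rather than the (C)RS codec discussed in the thread, zlib.crc32 stands in
for CRC32C, and the helper names xor_chunks, recover_missing and
mark_corrupted_as_missing are invented for the illustration.

```python
import zlib
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def recover_missing(chunks, parity):
    """Rebuild the single chunk marked None from the other chunks and the parity."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing chunk")
    present = [c for c in chunks if c is not None]
    rebuilt = list(chunks)
    rebuilt[missing[0]] = xor_chunks(present + [parity])
    return rebuilt


def mark_corrupted_as_missing(chunks, checksums):
    """Turn chunks that fail their checksum into erasures (None)."""
    return [c if zlib.crc32(c) == s else None for c, s in zip(chunks, checksums)]


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    parity = xor_chunks(data)
    sums = [zlib.crc32(c) for c in data]

    # A known-missing chunk is rebuilt correctly.
    print(recover_missing([data[0], None, data[2], data[3]], parity) == data)

    # A silently corrupted chunk fed into the decoder yields wrong output.
    bad = [data[0], b"BxBB", None, data[3]]
    print(recover_missing(bad, parity)[2] == data[2])

    # With checksums, the corrupted chunk becomes an erasure and is rebuilt.
    seen = [data[0], b"BxBB", data[2], data[3]]
    print(recover_missing(mark_corrupted_as_missing(seen, sums), parity) == data)
```

The checksum step is what converts an undetected corruption into an erasure the
decoder can handle, which is the point made in the quoted paragraph.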

--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.

