Hi Andreas,

Since it looks like we're going to use jerasure-1.2, we will be able to
try (C)RS using

https://github.com/tsuraan/Jerasure/blob/master/src/cauchy.c
https://github.com/tsuraan/Jerasure/blob/master/src/cauchy.h

Do you know of a better / faster implementation? Is there a tradeoff
between (C)RS and RS?

Cheers

On 06/07/2013 15:43, Andreas-Joachim Peters wrote:
> Hi Loic,
>
> (C)RS stands for the Cauchy Reed-Solomon codes, which are based on pure
> parity operations, while the standard Reed-Solomon codes need more
> multiplications and are slower.
>
> Considering the checksumming ... for comparison, the CRC32 code from libz
> runs on an 8-core Xeon at ~730 MB/s for small block sizes, while the
> SSE4.2 CRC32C checksum runs at ~2 GByte/s.
>
> Cheers Andreas.
>
> On Fri, Jul 5, 2013 at 11:23 PM, Loic Dachary <[email protected]> wrote:
>
> > Hi Andreas,
> >
> > On 04/07/2013 23:01, Andreas Joachim Peters wrote:
> > > Hi Loic,
> > > thanks for the responses!
> > >
> > > Maybe this is useful for your erasure code discussion:
> > >
> > > as an example, in our RS implementation we chunk a data block of
> > > e.g. 4M into 4 data chunks of 1M. Then we create 2 parity chunks.
> > >
> > > Data & parity chunks are split into 4k blocks and these 4k blocks
> > > get a CRC32C block checksum each (SSE4.2 CPU extension => MIT
> > > library or BTRFS). This creates 0.1% volume overhead (4 bytes per
> > > 4096 bytes) - nothing compared to the parity overhead ...
> > >
> > > You can now easily detect data corruption using the local checksums
> > > and avoid reading any parity information and (C)RS decoding if
> > > there is no corruption detected. Moreover, CRC32C computation is
> > > distributed over several (in this case 4) machines, while (C)RS
> > > decoding would run on a single machine where you assemble a block
> > > ... and CRC32C is faster than (C)RS decoding (with SSE4.2) ...
> >
> > What does (C)RS mean? (C)Reed-Solomon?
> >
> > > In our case we write this checksum information separately from the
> > > original data ... while in a block-based storage like CEPH it would
> > > probably be inlined in the data chunk.
> > > If an OSD detects that it runs on BTRFS or ZFS, one could
> > > automatically disable the CRC32C code.
> >
> > Nice. I did not know that was built-in :-)
> >
> > https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#scrubbing
> >
> > > (wouldn't CRC32C also be useful for normal CEPH block replication?)
> >
> > I don't know the details of scrubbing but it seems CRC is already
> > used by deep scrubbing
> >
> > https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L2731
> >
> > Cheers
> >
> > > As far as I know, with the RS CODEC we use you can miss stripes
> > > (data = 0) in the decoding process, but you cannot inject corrupted
> > > stripes into the decoding process, so the block checksumming is
> > > important.
> > >
> > > Cheers Andreas.
> >
> > --
> > Loïc Dachary, Artisan Logiciel Libre
> > All that is necessary for the triumph of evil is that good people do
> > nothing.

--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
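
For reference, the per-block checksumming described in the quoted 04/07
message (chunks split into 4k blocks, one CRC32C per block, 4 bytes per
4096 bytes) can be sketched roughly as below. This is only an
illustration in Python under stated assumptions, not the implementation
discussed in the thread: the names crc32c, block_checksums and
find_corrupt_blocks are made up for the example, and the CRC is a plain
table-driven software version rather than the SSE4.2 instruction.

```python
# Illustrative sketch only: names and layout are assumptions, not the
# code used by the RS implementation discussed in this thread.

CRC32C_POLY = 0x82F63B78  # Castagnoli polynomial, bit-reflected form
BLOCK_SIZE = 4096         # 4k blocks, as in the quoted description


def _make_table():
    """Build the 256-entry lookup table for byte-wise CRC32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Software CRC32C (slow; the SSE4.2 instruction does this in hardware)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(chunk: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one CRC32C per block of a data or parity chunk."""
    return [crc32c(chunk[off:off + block_size])
            for off in range(0, len(chunk), block_size)]


def find_corrupt_blocks(chunk: bytes, checksums: list,
                        block_size: int = BLOCK_SIZE) -> list:
    """Return indices of blocks whose CRC32C no longer matches.

    Assumes `checksums` was produced by block_checksums() on a chunk of
    the same length. An empty result means the chunk can be used as is,
    without reading parity or running the (C)RS decoder.
    """
    current = block_checksums(chunk, block_size)
    return [i for i, c in enumerate(current) if c != checksums[i]]
```

With 4 bytes of checksum per 4096-byte block the overhead is
4 / 4096, i.e. just under 0.1%, which matches the figure quoted above.
Only when find_corrupt_blocks() reports something would the affected
chunk be treated as missing and handed to the decoder.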
