Hi Mark,

Nice :-) I'm curious about how it's used. Is it computed every time an object 
is written to disk? Or is it part of the WRITE messages that are sent to the 
replicas?

Cheers

On 06/07/2013 17:28, Mark Nelson wrote:
> Hi Guys,
> 
> For what it's worth, we just added SSE 4.2 CRC32c for architectures that 
> support it:
> 
> https://github.com/ceph/ceph/commit/7c59288d9168ddef3b3dc570464ae9a1f180d18c#src/common/crc32c-intel.c
> 
> Mark
> 
> On 07/06/2013 08:45 AM, Andreas Joachim Peters wrote:
>> Hi Loic,
>> (C)RS stands for the Cauchy Reed-Solomon codes which are based on pure 
>> parity operations, while the standard Reed-Solomon codes need more 
>> multiplications and are slower.
>>
>> Regarding the checksumming: for comparison, the CRC32 code from libz 
>> runs on an 8-core Xeon at ~730 MB/s for small block sizes, while the SSE4.2 
>> CRC32C checksum runs at ~2 GB/s.
>>
>> Cheers Andreas.
>> ________________________________________
>> From: Loic Dachary [[email protected]]
>> Sent: 05 July 2013 23:23
>> To: Andreas Joachim Peters
>> Cc: [email protected]
>> Subject: Re: CEPH Erasure Encoding + OSD Scalability
>>
>> Hi Andreas,
>>
>> On 04/07/2013 23:01, Andreas Joachim Peters wrote:
>>> Hi Loic,
>>> thanks for the responses!
>>>
>>> Maybe this is useful for your erasure code discussion:
>>>
>>> As an example, in our RS implementation we chunk a data block of e.g. 4M 
>>> into 4 data chunks of 1M. Then we create 2 parity chunks.
>>>
>>> Data & parity chunks are split into 4k blocks and these 4k blocks get a 
>>> CRC32C block checksum each (SSE4.2 CPU extension => MIT library or BTRFS). 
>>> This creates about 0.1% volume overhead (4 bytes per 4096 bytes) - nothing 
>>> compared to the parity overhead ...
>>>
>>> You can now easily detect data corruption using the local checksums and 
>>> avoid reading any parity information and running (C)RS decoding if no 
>>> corruption is detected. Moreover, CRC32C computation is distributed over 
>>> several (in this case 4) machines while (C)RS decoding would run on a 
>>> single machine where you assemble a block ... and CRC32C is faster than 
>>> (C)RS decoding (with SSE4.2) ...
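
[Editor's sketch] The per-block checksum scheme described above could look 
roughly like the following. This is a minimal illustration, not the code 
Andreas refers to: it uses a slow, table-driven pure-Python CRC32C instead of 
the SSE4.2 instruction, and the names block_checksums and corrupt_blocks are 
hypothetical helpers introduced only for this example.

```python
BLOCK_SIZE = 4096  # 4k checksum blocks, as in the mail
CRC_BYTES = 4      # one CRC32C value is 32 bits


def _make_table():
    """Build the 256-entry lookup table for CRC32C (Castagnoli polynomial)."""
    poly = 0x82F63B78  # reflected form of 0x1EDC6F41
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Software CRC32C; the SSE4.2 instruction computes the same function."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(chunk: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hypothetical helper: one CRC32C per block of a data or parity chunk."""
    return [crc32c(chunk[off:off + block_size])
            for off in range(0, len(chunk), block_size)]


def corrupt_blocks(chunk: bytes, stored: list,
                   block_size: int = BLOCK_SIZE) -> list:
    """Hypothetical helper: indices of blocks whose checksum no longer matches."""
    fresh = block_checksums(chunk, block_size)
    return [i for i, (new, old) in enumerate(zip(fresh, stored)) if new != old]


# A 16 KiB stand-in for one 1M chunk, to keep the pure-Python loop fast.
chunk = bytes(range(256)) * 64
stored = block_checksums(chunk)      # computed once, at write time

damaged = bytearray(chunk)
damaged[5000] ^= 0x01                # flip one bit inside block 1

print(corrupt_blocks(chunk, stored))           # [] -> no decoding needed
print(corrupt_blocks(bytes(damaged), stored))  # [1] -> only now touch parity
print(f"checksum overhead: {CRC_BYTES / BLOCK_SIZE:.4%}")
```

Only when corrupt_blocks returns a non-empty list would the reader need to 
fetch parity chunks and run the (C)RS decoder; the common clean-read path 
costs just the checksum pass on each machine holding a chunk.
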
>>
>> What does (C)RS mean ? (C)Reed-Solomon ?
>>
>>> In our case we write this checksum information separately from the original 
>>> data ... while in a block-based storage like CEPH it would probably be 
>>> inlined in the data chunk.
>>> If an OSD detects that it is running on BTRFS or ZFS, one could 
>>> automatically disable the CRC32C code.
>>
>> Nice. I did not know that was built-in :-)
>> https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#scrubbing
>>
>>> (wouldn't CRC32C also be useful for normal CEPH block replication?)
>>
>> I don't know the details of scrubbing but it seems CRC is already used by 
>> deep scrubbing
>>
>> https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L2731
>>
>> Cheers
>>
>>> As far as I know, with the RS CODEC we use, stripes can be missing (data 
>>> = 0) in the decoding process, but you cannot inject corrupted stripes into 
>>> the decoding process, so the block checksumming is important.
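
[Editor's sketch] The difference between a missing stripe and a corrupted one 
can be shown with a much simpler code than the one discussed here. The example 
below uses a single XOR parity chunk as a stand-in, which is an assumption 
made for brevity and not the (C)RS codec from the mail; make_parity and 
recover_missing are hypothetical names.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(chunks: list) -> bytes:
    """Single parity chunk: the XOR of all data chunks."""
    return reduce(xor_bytes, chunks)


def recover_missing(chunks: list, parity: bytes) -> bytes:
    """Rebuild the one chunk marked as None from the others plus parity.

    The decoder trusts every chunk it is given; it has no way to notice
    that a supplied chunk is wrong.
    """
    if sum(c is None for c in chunks) != 1:
        raise ValueError("exactly one chunk must be marked missing")
    present = [c for c in chunks if c is not None]
    return reduce(xor_bytes, present, parity)


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data)

# Known erasure: chunk 1 is missing and flagged as such. Recovery is exact.
rebuilt = recover_missing([data[0], None, data[2]], parity)
print(rebuilt == data[1])   # True

# Silent corruption: chunk 0 has a damaged byte but is fed in as if valid.
poisoned = recover_missing([b"AAAX", None, data[2]], parity)
print(poisoned == data[1])  # False: the decoder returns wrong data, no error
```

A per-block checksum lets the reader turn the second case into the first: a 
block that fails its CRC is dropped and treated as missing before decoding.
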
>>>
>>> Cheers Andreas.
>>
>> -- 
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
