On Sun, Aug 18, 2013 at 2:27 PM, Branko Čibej <br...@wandisco.com> wrote:
> On 18.08.2013 14:25, Branko Čibej wrote:
> > On 18.08.2013 14:15, stef...@apache.org wrote:
> >> Author: stefan2
> >> Date: Sun Aug 18 12:15:01 2013
> >> New Revision: 1515088
> >>
> >> URL: http://svn.apache.org/r1515088
> >> Log:
> >> On the log-addressing branch: For low-overhead checksumming,
> >> add standard FNV-1a and a faster modified version of that to
> >> the list of checksum kinds supported with svn_checksum_t.
> >>
> >> We will use this new checksum to secure parts of FSFS (and
> >> later FSX) that are not directly covered by MD5/SHA1 content
> >> checksums. That will help to localize corruptions much quicker
> >> and more accurately but it will not eliminate the need to run
> >> a full content verification.
> > If you're using this for detecting corruption, rather than key
> > distribution, why not instead use a 64-bit or even 32-bit CRC? It should
> > be much faster than any kind of multiply-with-prime hash.
>

CRC happens to be slower than even standard FNV-1 (6 clk / byte
vs. 4 clk / byte) on recent machines (<10y). Since I want the
low-level verification to run on (linear read) disk speed of
>1GB/s, we need a checksum that is 2 clk/B or better. The
interleaved fnv1a_32x4 variant gets down to ~1 clk/B. CRC would
require a similar optimization and still be at least 50% slower
than the current code.

> Or you could even use the Adler-32 implementation that we already use in
> the xdelta code.
>

Adler-32 is relatively weak and we already use it implicitly for
our zlib encoded data. Using a different and stronger checksum
should add more verification strength than applying an existing
one twice.

-- Stefan^2.
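
For readers unfamiliar with the two variants discussed above, here is a rough Python sketch of standard 32-bit FNV-1a and of an interleaved four-lane version. The constants are the published FNV-1a offset basis and prime. Everything about the four-lane function is an assumption made for illustration: the name `fnv1a_32x4_sketch`, the byte-to-lane assignment, and the way the lanes are folded together at the end are not taken from r1515088 and may differ from the code on the branch. Python also shows only the data flow, not the speed. The gain from interleaving would come in compiled code, where the four multiplies are independent of each other and can overlap in the CPU pipeline.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # published FNV-1a 32-bit offset basis
FNV_PRIME_32 = 0x01000193         # published FNV 32-bit prime
MASK_32 = 0xFFFFFFFF


def fnv1a_32(data: bytes, h: int = FNV_OFFSET_BASIS_32) -> int:
    """Standard FNV-1a: XOR each byte into the state, then multiply."""
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & MASK_32
    return h


def fnv1a_32x4_sketch(data: bytes) -> int:
    """Hypothetical interleaved variant: four independent FNV-1a lanes.

    Byte i of every full 4-byte group goes to lane i % 4. The lane
    results plus any leftover bytes are then hashed once more with
    plain FNV-1a. This combining step is an assumption of the sketch.
    """
    lanes = [FNV_OFFSET_BASIS_32] * 4
    full = len(data) - len(data) % 4  # length covered by whole groups
    for i in range(0, full, 4):
        for lane in range(4):
            lanes[lane] = ((lanes[lane] ^ data[i + lane]) * FNV_PRIME_32) & MASK_32
    # Fold the four lane states and the 0..3 trailing bytes together.
    tail = b"".join(h.to_bytes(4, "big") for h in lanes) + data[full:]
    return fnv1a_32(tail)


if __name__ == "__main__":
    print(hex(fnv1a_32(b"foobar")))
    print(hex(fnv1a_32x4_sketch(b"some block of rev file data")))
```

Note that the two functions produce different values for the same input, so they would need to be treated as distinct checksum kinds. As a sanity check on the throughput target: at 2 clk/B, a core needs about 2 GHz to keep up with 1 GB/s.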