On Wed, Apr 17, 2013 at 11:08 PM, Andres Freund <and...@2ndquadrant.com> wrote: > On 2013-04-17 18:16:36 -0700, Daniel Farina wrote: >> The original paper is often shorthanded "Castagnoli 93", but it exists >> in the IEEE's sphere of influence and is hard to find a copy of. >> Luckily, a pretty interesting survey paper discussing some of the >> issues was written by Koopman in 2002 and is available: >> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.8323 As a >> pedagolgical note, it's pretty interesting and accessible piece of >> writing (for me, as someone who knows little of error >> detection/correction) and explains some of the engineering reasons >> that provoke such exercises. > > http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=231911&userType=inst > > There's also a koopman paper from 2004 thats interesting.
Having read the 2002 paper more, it seems that the current CRC32 doesn't have a whole lot going for it: CRC32C pretty much cleans its clock across the board (I don't understand detected Hamming Distance that seem greater than the information content of the message, e.g. HD 14 with 8 bit messages as seen in CRC32C: that's where CRC32 can "win"). CRC32C looks, all in all, the most flexible, because detection of Hamming Distance 4 spans from 5244-131072 bits (the upper range of which is a full 16KiB!) and there is superior Hamming Distance detection on shorter messages up until the point where it seems like the Hamming Distance able to be detected is larger than the message size itself (e.g. HM 13 on an 8 bit message). I'm not sure if this is an error in my understanding, or what. Also, larger runs (16KB) are better served by CRC32C: even the probably-best contender I can see (0xD419CC15) drops to Hamming Distance 2-detection right after 65505 bits. CRC32C has the biggest range at HD4, although Koopman 0xBA0DC66 comes close, gaining superior Hamming distance detection for 178-16360 bits (the upper end of this rnage is short of 2KiB by 3 bytes). All in all, there is no reason I can see to keep CRC32 at all, vs CRC32C on the basis of error detection alone, so putting aside all the business about instruction set architecture, I think a software CRC32C in a vacuum can be seen as a robustness improvement. There may be polynomials that are not CRC32 or CRC32C that one might view as having slightly better tradeoffs as seen in Table 1 of Koopman 2002, but it's kind of a stretch: being able to handle 8KB and 16KB as seen in CRC32C at HD4 as seen in CRC32C is awfully compelling to me. Koopman 0xBA0DC66B can admirably reach HD6 on a much larger range, up to 16360 bytes, which is every so shy of 2KiB. Castagnoli 0xD419CC15 can, short of 8KB by 31 bits can detect HD 5. Corrections welcome on my interpretations of Tbl 1. -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers