Re: [Toybox] crc64?

2018-02-05 Thread enh
i think the world has mostly moved on to SHA already anyway, at this
level of the stack. things like CRC are just for hardware these days.
(Android recently removed a software CRC from adb.)

On Sun, Feb 4, 2018 at 7:35 PM, Rob Landley wrote:
> On 02/04/2018 07:54 PM, Rob Landley wrote:
>> But the largest distribution in the wild seems to be the "jones" one
>> used by redis:
>>
>> https://raw.githubusercontent.com/antirez/redis/88c1d9550d198fd7df426b19ea67e9c51c92a811/src/crc64.c
>>
>> Those are also the only three listed here:
>>
>> https://users.ece.cmu.edu/~koopman/crc/crc64.html
>
> The "Jones" variant was introduced by this paper:
>
> http://www0.cs.ucl.ac.uk/staff/D.Jones/crcnote.pdf
>
> Meanwhile, the xz variant is from Appendix B of a 1992 publication the
> European Computer Manufacturers Association put out about 48-track
> magnetic tape cartridges, which seems to have been completely ignored
> until xz picked it up.
>
> http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-182.pdf
>
> Looking for more analysis on why xz chose an appendix of a 25-year-old
> standard for magnetic tape storage, I found this:
>
> http://www.nongnu.org/lzip/xz_inadequate.html
>
> Which doesn't _directly_ address the issue but really doesn't give me
> confidence in xz's design decisions.
>
> (What _everybody_ seems to agree on is the ISO version is actively stupid.)
>
> Anyway, happy to have more info from somebody with actual domain
> expertise...
>
> Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] crc64?

2018-02-04 Thread Rob Landley
On 02/04/2018 07:54 PM, Rob Landley wrote:
> But the largest distribution in the wild seems to be the "jones" one
> used by redis:
> 
> https://raw.githubusercontent.com/antirez/redis/88c1d9550d198fd7df426b19ea67e9c51c92a811/src/crc64.c
> 
> Those are also the only three listed here:
> 
> https://users.ece.cmu.edu/~koopman/crc/crc64.html

The "Jones" variant was introduced by this paper:

http://www0.cs.ucl.ac.uk/staff/D.Jones/crcnote.pdf

Meanwhile, the xz variant is from Appendix B of a 1992 publication the
European Computer Manufacturers Association put out about 48-track
magnetic tape cartridges, which seems to have been completely ignored
until xz picked it up.

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-182.pdf

Looking for more analysis on why xz chose an appendix of a 25-year-old
standard for magnetic tape storage, I found this:

http://www.nongnu.org/lzip/xz_inadequate.html

Which doesn't _directly_ address the issue but really doesn't give me
confidence in xz's design decisions.

(What _everybody_ seems to agree on is the ISO version is actively stupid.)

Anyway, happy to have more info from somebody with actual domain
expertise...

Rob


[Toybox] crc64?

2018-02-04 Thread Rob Landley
One of the things I've been pondering is adding crc64. Except... there's
no agreement what crc64 should _be_? Wikipedia[citation needed] seems to
recommend the ECMA polynomial (0x42F0E1EBA9EA3693 used by xz?):

https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Polynomial_representations_of_cyclic_redundancy_checks

Because the ISO one is "not strong for hashing". (Its polynomial, 0x1B,
has only four bits set.)

But the largest distribution in the wild seems to be the "jones" one
used by redis:

https://raw.githubusercontent.com/antirez/redis/88c1d9550d198fd7df426b19ea67e9c51c92a811/src/crc64.c

Those are also the only three listed here:

https://users.ece.cmu.edu/~koopman/crc/crc64.html
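
For reference, the variants named above can be written as parameter sets for one generic bitwise CRC routine. This is only a minimal sketch, not code from toybox, xz, or redis: the names reflect, crc64 and VARIANTS are made up for illustration, and the parameters (polynomial, start value, bit reflection, final XOR) are the commonly published ones for each variant. The ECMA-182 polynomial shows up twice because xz, as far as I can tell, runs it bit-reflected with inverted start and end values, while the standard's appendix describes it unreflected. The "iso" entry assumes the flavor implemented by Go's hash/crc64.

```python
def reflect(value, width):
    """Reverse the order of the low `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

# name: (polynomial, start value, reflect input/output, final XOR)
# Parameter sets as commonly published; treat them as assumptions.
ALL_ONES = 0xFFFFFFFFFFFFFFFF
VARIANTS = {
    "ecma-182": (0x42F0E1EBA9EA3693, 0, False, 0),
    "xz":       (0x42F0E1EBA9EA3693, ALL_ONES, True, ALL_ONES),
    "jones":    (0xAD93D23594C935A9, 0, True, 0),
    "iso":      (0x000000000000001B, ALL_ONES, True, ALL_ONES),
}

def crc64(data, variant):
    """Bit-at-a-time CRC-64 over bytes, slow but easy to follow."""
    poly, reg, reflected, xorout = VARIANTS[variant]
    top = 1 << 63
    for byte in data:
        if reflected:
            byte = reflect(byte, 8)
        reg ^= byte << 56
        for _ in range(8):
            if reg & top:
                reg = ((reg << 1) ^ poly) & ALL_ONES
            else:
                reg = (reg << 1) & ALL_ONES
    if reflected:
        reg = reflect(reg, 64)
    return reg ^ xorout

if __name__ == "__main__":
    for name in VARIANTS:
        print(name, format(crc64(b"123456789", name), "016x"))
```

Same input, four different answers, which is the whole "no agreement what crc64 should be" problem in a nutshell.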

Meanwhile, I recently added the "crc32" command line from ubuntu, which
is basically cksum -HLNP, i.e. output in hex, don't include length,
little endian, with neither pre-inversion nor post-inversion.

And yes, that means the default output of "cksum" and the default output
of "crc32" are _way_ different, despite being the same basic algorithm.
(Note, I added the cksum command line options in toybox, the ubuntu one
doesn't have 'em. I can teach cksum a -6 option to do 64 bit too, but
one thing I _didn't_ make configurable is the polynomial seed, because
the world at large seems to agree on that one these days...)
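
To make the cksum versus crc32 difference concrete, here is a rough sketch under stated assumptions. posix_cksum follows the POSIX description of cksum as I understand it (most significant bit first, zero start value, the length appended low byte first, final inversion), and Python's zlib.crc32 stands in for the crc32 command, which is an assumption on my part. The helper names are made up, and none of this models toybox's actual flags or code.

```python
import zlib

POLY = 0x04C11DB7  # the one 32-bit polynomial both commands share

def crc32_msb_first(data, reg=0):
    """Feed bytes through the CRC register most significant bit first."""
    for byte in data:
        reg ^= byte << 24
        for _ in range(8):
            if reg & 0x80000000:
                reg = ((reg << 1) ^ POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
    return reg

def posix_cksum(data):
    """cksum-style CRC: data, then the length (low byte first), then invert."""
    reg = crc32_msb_first(data)
    length = len(data)
    length_bytes = bytearray()
    while length:
        length_bytes.append(length & 0xFF)
        length >>= 8
    reg = crc32_msb_first(bytes(length_bytes), reg)
    return reg ^ 0xFFFFFFFF

if __name__ == "__main__":
    data = b"123456789"
    print("cksum style:", posix_cksum(data))               # decimal, like cksum
    print("crc32 style:", format(zlib.crc32(data), "08x"))  # hex, like crc32
```

Same polynomial in both cases; the bit order, start value, and appended length are what make the outputs look unrelated.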

Is there any sort of consensus emerging for crc64? Should I just have
crc64 do a 64 bit version of what crc32 does? Is anything actually
using this, or is crc32 basically enough to find real-world data
corruption in transmissions?
https://barrgroup.com/Embedded-Systems/How-To/CRC-Math-Theory describes
how it basically catches all the transmission errors hardware's likely
to produce: crc32 is guaranteed to catch any one, two, or three bit
error, any odd number of bits in error, and any error burst as wide as
the checksum itself. For the rest it basically has a 1 in 4 billion
chance of _not_ noticing it.
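
The burst guarantee is easy to poke at empirically. The sketch below is a spot check rather than a proof, and it assumes zlib.crc32 is the crc32 in question. The function name count_missed_bursts is made up. It only tries two patterns per burst (every bit flipped, and just the two end bits flipped) at every position in a short message, so it is far from exhaustive.

```python
import zlib

def count_missed_bursts(message, max_width=32):
    """Count sampled burst errors (up to max_width bits) that crc32 misses."""
    good = zlib.crc32(message)
    nbits = len(message) * 8
    value = int.from_bytes(message, "big")
    missed = 0
    for width in range(1, max_width + 1):
        solid = (1 << width) - 1       # every bit in the burst flipped
        ends = 1 | (1 << (width - 1))  # only the first and last bit flipped
        for shift in range(nbits - width + 1):
            for pattern in (solid, ends):
                damaged = value ^ (pattern << shift)
                damaged_bytes = damaged.to_bytes(len(message), "big")
                if zlib.crc32(damaged_bytes) == good:
                    missed += 1
    return missed

if __name__ == "__main__":
    print(count_missed_bursts(b"toybox crc32 burst check"))
```

A nonzero count here would mean a burst no wider than the checksum slipped through, which the math says cannot happen.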

Anyone have an opinion here?

Rob