Re: Problems with sum in textutils

Nick Lawes Sat, 27 Oct 2001 14:44:05 -0700

On Saturday 27 Oct 2001 17:39, Jim Meyering wrote:
> Thanks for the report, but that's not a bug in the newer version.
>
> In 2.0f, I applied this patch:
>
>   2000-06-22  Bruno Haible  <[EMAIL PROTECTED]>
>
>           * src/sum.c (sysv_sum_file): Avoid overflowing 32-bit
>           accumulator on files larger than 256 MB.
>
> The only problem is that the above comment is inaccurate.
> (I've just fixed it.)
> In reality, the problem with overflow using the old version
> could happen with files as `small' as 16843010 bytes.
> That's floor ((2^32) / 255 + 1).
>
> To demonstrate, remember that the first number in the output of
> `sum -s' is the sum of all bytes modulo 0xffff (aka 65535).
>
> So, consider a file that is a sequence of one less than
> that magic number of 0xff bytes.  We can compute the first
> number in sum -s output using bc:
>
>   $ echo '(16843009 * 255) % 65535' |bc
>   0
>
> Do the same, but with one more byte:
>
>   $ echo '(16843010 * 255) % 65535' |bc
>   255
>
> Looks fine, right?
> But what happens when we simulate 32-bit two's complement
> arithmetic, which makes us reduce the product modulo 2^32:
>
>   $ echo '((16843010 * 255) % (2^32)) % 65535' |bc
>   254
>
> You see we have a different number.
> And that is the bug in the old version of GNU sum.
> Depending on the width of a long, it would output different results.
> The new version uses the code you include below to reduce the
> sum modulo 0xffff, so the problem with overflow cannot arise.
>
> Demonstrate that sum works as described above:
>
>   $ perl -e 'while (1) {print chr(255) x 300}' |head --bytes=16843010 |sum -s
>   255 32897
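
The bc arithmetic quoted above can also be reproduced in C. This is only a
sketch of that arithmetic, not the code in src/sum.c: the helper names
(reduce, sum32, sum64) are made up for illustration, and the input is
assumed to be n bytes that are all 0xff, as in the example above.

```c
#include <stdint.h>

/* First number of `sum -s' as described above: byte sum modulo 0xffff.
   (Hypothetical helper, not taken from sum.c.) */
static unsigned reduce(uint64_t s)
{
    return (unsigned)(s % 0xffff);
}

/* Sum of n bytes of value 0xff held in a 32-bit accumulator.
   The conversion to uint32_t wraps modulo 2^32, like the old code
   on a platform where long is 32 bits wide. */
static uint32_t sum32(uint64_t n)
{
    return (uint32_t)(n * 255u);
}

/* The same sum held in a 64-bit accumulator; no wrap at these sizes. */
static uint64_t sum64(uint64_t n)
{
    return n * 255u;
}
```

With these, reduce(sum32(16843009)) and reduce(sum64(16843009)) are both 0,
while reduce(sum32(16843010)) is 254 and reduce(sum64(16843010)) is 255,
matching the bc results above.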
>
> "nick lawes" <[EMAIL PROTECTED]> wrote:
> > I've been looking into a problem that has surfaced on our systems,
> > and it turns out that the problem is in the gnu 'sum' utility as
> > shipped with RedHat 7.1.
> >
> > I realise that they have annoyingly shipped an alpha version of
> > textutils, but as the problem will become official when this
> > version gets released, I felt I should point it out.
> >
> > The problem is the addition of the line:
> >
> >       /* Reduce checksum mod 0xffff, to avoid overflow.  */
> >       checksum = (checksum & 0xffff) + (checksum >> 16);
> >
> > Adding (checksum >> 16) makes the number returned for large files
> > (e.g. 38MB) incompatible with earlier gnu sums and with system V
> > sum that it claims to be compatible with.
> >
> > I can get around the problem for now by using an older version of
> > sum, but this problem will no doubt bite many people when it's
> > released...


Jim, thanks for the reply.

I would have thought that the overflow would be avoided using

checksum = (checksum & 0xffff);

without adding (checksum >> 16).

I've done a sum with several other systems using sysV sum, and they ALL
agree on the value; it's the new GNU sum that is different. So I
suspect that the others must be doing the above.
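
For comparison, here are the two reductions side by side in C. This is a
minimal sketch: the function names fold16 and mask16 are made up for
illustration, and a 32-bit unsigned checksum is assumed. It shows only how
the two expressions differ, not which one any other System V sum uses.

```c
#include <stdint.h>

/* The reduction from the quoted line: add the high 16 bits to the low
   16 bits.  Because 2^16 is congruent to 1 modulo 0xffff, the result
   is congruent to the input modulo 0xffff (65535). */
static uint32_t fold16(uint32_t checksum)
{
    return (checksum & 0xffff) + (checksum >> 16);
}

/* The alternative suggested above: keep only the low 16 bits.
   This is the input modulo 0x10000 (65536), a different modulus. */
static uint32_t mask16(uint32_t checksum)
{
    return checksum & 0xffff;
}
```

For a checksum of 0x10000, fold16 gives 1 and mask16 gives 0, so the two
start to disagree as soon as the byte sum no longer fits in 16 bits.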

I'm not sure on your definition of a "bug", but this new behaviour has 
the unfortunate side effect of making the new sum unusable, and only 
comparable to other versions of itself. I suppose it could be argued 
that all the others are wrong, but at least they all agree.

I guess that I will either have to stick to the older version of GNU 
sum, or just knock up my own.

Regards

/nick

-- 
Nick Lawes | SilverPlatter Information | mailto:[EMAIL PROTECTED]

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils
