On Saturday 27 Oct 2001 17:39, Jim Meyering wrote:
> Thanks for the report, but that's not a bug in the newer version.
>
> In 2.0f, I applied this patch:
>
>   2000-06-22  Bruno Haible  <[EMAIL PROTECTED]>
>
>     * src/sum.c (sysv_sum_file): Avoid overflowing 32-bit
>     accumulator on files larger than 256 MB.
>
> The only problem is that the above comment is inaccurate.
> (I've just fixed it.)
> In reality, the problem with overflow using the old version
> could happen with files as `small' as 16843010 bytes.
> That's floor ((2^32) / 255 + 1).
>
> To demonstrate, remember that the first number in the output of
> `sum -s' is the sum of all bytes modulo 0xffff (aka 65535).
>
> So, consider a file that is a sequence of one less than
> that magic number of 0xff bytes. We can compute the first
> number in sum -s output using bc:
>
>   $ echo '(16843009 * 255) % 65535' |bc
>   0
>
> Do the same, but with one more byte:
>
>   $ echo '(16843010 * 255) % 65535' |bc
>   255
>
> Looks fine, right?
> But what happens when we simulate 32-bit two's complement
> arithmetic, which makes us reduce the product modulo 2^32:
>
>   $ echo '((16843010 * 255) % (2^32)) % 65535' |bc
>   254
>
> You see we have a different number.
> And that is the bug in the old version of GNU sum.
> Depending on the width of a long, it would output different results.
> The new version uses the code you include below to reduce the
> sum modulo 0xffff, so the problem with overflow cannot arise.
>
> Demonstrate that sum works as described above:
>
>   $ perl -e 'while (1) {print chr(255) x 300}' |head --bytes=16843010 |sum -s
>   255 32897
>
> "nick lawes" <[EMAIL PROTECTED]> wrote:
> > I've been looking into a problem that has surfaced on our systems,
> > and it turns out that the problem is in the gnu 'sum' utility as
> > shipped with RedHat 7.1.
> >
> > I realise that they have annoyingly shipped an alpha version of
> > textutils, but as the problem will become official when this
> > version gets released, I felt I should point it out.
> >
> > The problem is the addition of the line:
> >
> >   /* Reduce checksum mod 0xffff, to avoid overflow. */
> >   checksum = (checksum & 0xffff) + (checksum >> 16);
> >
> > Adding (checksum >> 16) makes the number returned for large files
> > (e.g. 38MB) incompatible with earlier gnu sums and with system V
> > sum that it claims to be compatible with.
> >
> > I can get around the problem for now by using an older version of
> > sum, but this problem will no doubt bite many people when it's
> > released...
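
For anyone without bc to hand, the three calculations quoted above can be replayed in Python. This is only a sketch of the arithmetic, not of the sum source: the function names are made up for illustration, and the 32-bit accumulator is simulated with a mask because Python integers never overflow.

```python
MASK32 = 0xFFFFFFFF  # assumption: models a 32-bit accumulator by masking


def exact_mod(n_bytes, byte=0xFF):
    """Sum of n_bytes identical bytes, reduced modulo 0xffff without overflow."""
    return (n_bytes * byte) % 0xFFFF


def wrapped_mod(n_bytes, byte=0xFF):
    """Same sum, but wrapped to 32 bits before the modulo 0xffff reduction."""
    return ((n_bytes * byte) & MASK32) % 0xFFFF


if __name__ == "__main__":
    for n in (16843009, 16843010):
        print(n, exact_mod(n), wrapped_mod(n))
```

The two functions agree for 16843009 bytes of 0xff and differ by one for 16843010, which is the first length at which the running total no longer fits in 32 bits.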
Jim, thanks for the reply.

I would have thought that the overflow would be avoided using

  checksum = (checksum & 0xffff);

without adding (checksum >> 16).

I've done a sum with several other systems using sysV sum, and they
ALL agree on the value; it's the new GNU sum that is different. So I
suspect that the others must be doing the above.

I'm not sure of your definition of a "bug", but this new behaviour has
the unfortunate side effect of making the new sum unusable, and only
comparable to other versions of itself. I suppose it could be argued
that all the others are wrong, but at least they all agree.

I guess that I will either have to stick to the older version of GNU
sum, or just knock up my own.

Regards
/nick

-- 
Nick Lawes | SilverPlatter Information | mailto:[EMAIL PROTECTED]

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils
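
The two reductions discussed in this thread differ in a way that is easy to check numerically. The sketch below is in Python with hypothetical helper names, and it illustrates only the arithmetic; it makes no claim about what any particular sum implementation does. Masking with 0xffff reduces modulo 65536, so it is unaffected by 32-bit wraparound, whereas adding (checksum >> 16) back in reduces modulo 65535.

```python
def mask_only(total):
    """Keep the low 16 bits: equivalent to total modulo 65536."""
    return total & 0xFFFF


def fold_once(total):
    """The quoted line: add the high part back into the low 16 bits."""
    return (total & 0xFFFF) + (total >> 16)


def fold_fully(total):
    """Repeat the fold until the value fits in 16 bits.

    The result is congruent to total modulo 65535 (65535 itself stands for 0).
    """
    while total > 0xFFFF:
        total = fold_once(total)
    return total


if __name__ == "__main__":
    total = 16843010 * 255  # the byte sum from the example in this thread
    print(mask_only(total), mask_only(total & 0xFFFFFFFF), fold_fully(total))
```

For that total, masking gives 254 whether or not the sum first wrapped at 32 bits, while repeated folding of the unwrapped total gives 255.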