Your message dated Thu, 7 Jun 2012 13:59:27 +0200
with message-id <[email protected]>
and subject line Re: Bug#676377: bzip2: bunzip2 sometimes generates corrupt data
via a pipe
has caused the Debian Bug report #676377,
regarding bzip2: bunzip2 sometimes generates corrupt data via a pipe
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
676377: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676377
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: bzip2
Version: 1.0.5-6+squeeze1
Severity: grave
Justification: causes non-serious data loss

I have a Perl script that calls bunzip2 on a .bz2 file and does
some md5 checking on the output. But although the .bz2 file hasn't
changed, the md5 is sometimes wrong (this is not reproducible in
general; the failure occurs only from time to time). In case of
failure, the uncompressed data are stored in a file, so that
I can compare the correct data and the corrupt data:

$ ll r5899-*
-rw-r--r-- 1 vlefevre users 43883166 2012-06-06 16:24:46 r5899-right
-rw-r--r-- 1 vlefevre users 43883166 2012-06-06 16:08:19 r5899-wrong
$ cmp -l r5899-right r5899-wrong
37484200  20 220

Only one byte differs...
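
To make the cmp output above easier to read: "cmp -l" prints the
1-based byte offset in decimal and the two differing byte values in
octal. The following is a minimal sketch (the helper names
parse_cmp_line and differing_bits are hypothetical, chosen only for
this illustration) that converts the reported line to hexadecimal and
lists which bits differ. It only describes the difference; it does not
by itself identify the cause.

```python
def parse_cmp_line(line):
    """Parse one line of `cmp -l` output.

    The first field is the 1-based byte offset (decimal); the next two
    fields are the differing byte values (octal).
    """
    offset, left, right = line.split()
    return int(offset), int(left, 8), int(right, 8)


def differing_bits(left, right):
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = left ^ right
    return [bit for bit in range(8) if diff & (1 << bit)]


# The line reported by `cmp -l r5899-right r5899-wrong` above.
offset, right_byte, wrong_byte = parse_cmp_line("37484200  20 220")
bits = differing_bits(right_byte, wrong_byte)
print(f"offset {offset}: 0x{right_byte:02x} vs 0x{wrong_byte:02x}, "
      f"differing bits: {bits}")
# prints: offset 37484200: 0x10 vs 0x90, differing bits: [7]
```

Octal 20 is 0x10 and octal 220 is 0x90, so the two bytes differ in a
single bit (bit 7).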

If need be, I can provide all the files (but that's 150 MB).

I'm not sure that the bug is in bzip2; it could also be in Perl
or in the kernel, or a hardware problem, but I haven't noticed
any failures other than this md5 check.

I'll try to do other tests, such as running the same test in a
loop on several machines (with the same installation), and possibly
doing the same test with xz.

-- System Information:
Debian Release: 6.0.5
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64-server (SMP w/16 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages bzip2 depends on:
ii  libbz2-1.0              1.0.5-6+squeeze1 high-quality block-sorting file co
ii  libc6                   2.11.3-3         Embedded GNU C Library: Shared lib

bzip2 recommends no packages.

Versions of packages bzip2 suggests:
pn  bzip2-doc                     <none>     (no description available)

-- no debconf information



--- End Message ---
--- Begin Message ---
On 2012-06-07 11:54:53 +0100, Steven Chamberlain wrote:
> On 07/06/12 11:20, Vincent Lefevre wrote:
> > Actually these are only MD5 errors (from my own checks), no
> > "bunzip2: Data integrity error when decompressing" errors.
> 
> Couldn't you pipe the uncompressed data directly into your integrity
> checking program?  Without using bzip2/bunzip2 or any other compressors
> at all?

Yes, I've modified my script to support that too (the file is just
opened and checked). And I get the same kind of failures. So, I'm
closing the bug since it is not related to bzip2.
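
The direct check described above (opening the file and hashing it,
with no decompressor involved) could look roughly like the sketch
below. This is not the actual script, which is written in Perl and is
not shown in this thread; the function names md5_of_file and
count_mismatches and the chunk size are assumptions made for
illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_mismatches(path, expected_md5, iterations):
    """Hash the same file repeatedly and count digests that differ from expected_md5."""
    mismatches = 0
    for _ in range(iterations):
        if md5_of_file(path) != expected_md5:
            mismatches += 1
    return mismatches
```

For example, count_mismatches("mpfr-0-8207", known_good_md5, 1000)
would return the number of failed checks out of 1000 reads, where
known_good_md5 is a placeholder for a digest obtained on a trusted
machine. A nonzero result for an unchanged file points away from the
decompressor, since none is involved.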

It could be a hardware error or some software error on the machine
(e.g. if some file has been corrupted during the installation, or
a bug in perl that corrupts the memory in some specific case).

> > And "bunzip2 -t mpfr-0-8207.bz2" in a loop never fails.
> 
> If the .bz2 file is too small (<<80GiB), this test may not be reliable
> (it would likely be read from disk only once, if at all, then sit in
> memory cache for all test iterations).

The failure occurs on the 156 MiB uncompressed file, which is much
smaller than 80 GiB:

-r-------- 1 vlefevre users 162926084 2012-05-09 11:23:09 mpfr-0-8207

And this is in /tmp. Since similar failures also occur for files
under NFS, I don't think the file system is the cause.

-- 
Vincent Lefèvre <[email protected]> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


--- End Message ---
