Hello,

This bug is much more generic.

From a theoretical point of view, for every plain (uncompressed) file
there exists an *infinite* number of bz2-compressed files that correctly
decompress to that plain file.
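
To make this concrete, here is a minimal sketch in Python (my choice for
illustration; it uses the standard bz2 module, not the bzip2 command-line
tool). It relies on two facts: the block size is recorded in the stream
header, and Python's bz2.decompress() accepts concatenated streams,
including empty ones. The payload and the padded() helper are made up
for the example.

```python
import bz2

# Hypothetical payload; any bytes would do.
data = b"example payload " * 64

# Same input, two block-size settings. The stream headers already differ
# ("BZh1" vs "BZh9"), so the compressed files cannot be identical.
small_blocks = bz2.compress(data, compresslevel=1)
large_blocks = bz2.compress(data, compresslevel=9)

# A valid bzip2 stream that decompresses to zero bytes.
empty_stream = bz2.compress(b"")


def padded(stream, n):
    """Return `stream` followed by n empty bzip2 streams (illustrative helper)."""
    return stream + empty_stream * n


# Every extra empty stream gives yet another distinct compressed file.
variants = [small_blocks, large_blocks]
variants += [padded(large_blocks, n) for n in range(1, 4)]

# All variants are pairwise different as byte strings...
assert len(set(variants)) == len(variants)
# ...yet every one of them decompresses to the same plain data.
assert all(bz2.decompress(v) == data for v in variants)
```

Since n can be any non-negative integer, this alone yields an unbounded
family of valid compressed files for a single plain file.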

In practice there are a number of different compressors that can
create different compressed files. These include lbzip2 and pbzip2,
which may become even more popular as the number of CPU cores increases
rapidly.

Even the newest version of unmodified upstream (or Debian) bzip2 can
produce different compressed files with the same block size. Basically
it's because bzip2 internally uses shellsort and quicksort, which
aren't stable sorting algorithms. Block-sorting can therefore produce
different results under different circumstances. If anyone cares, I can
provide a proof-of-concept and/or explain why that happens.

The same reasoning applies to gzip-compressed files.
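
As a small illustration for gzip, the sketch below (Python again, standard
gzip module, assuming Python 3.8 or later for the mtime keyword) shows one
of the simplest sources of variation: the timestamp stored in the gzip
header. This is only one of several possible causes, picked because it is
easy to reproduce; the payload is made up.

```python
import gzip

# Hypothetical payload; any bytes would do.
data = b"example payload " * 64

# Identical input and settings, except for the timestamp written into
# the gzip header (the MTIME field).
first = gzip.compress(data, mtime=0)
second = gzip.compress(data, mtime=1)

# The two compressed files differ byte-for-byte...
assert first != second
# ...but both decompress to the same plain data.
assert gzip.decompress(first) == data
assert gzip.decompress(second) == data
```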

IMO this bug should be merged with #563651, renamed to something like
"does not support tarballs compressed with alternative compressors"
and tagged wontfix (unless there is a sane solution, which I can't
think of right now).

Mikołaj

PS. I know the internals of bzip2 *really* well. I am open to
discussion about any possible solutions.


