> Linux. After all there comes the file contents. BZip2 compresses the
> whole archive. At the beginning I thought to reach a much better
> compression rate than before but gzip is not so bad as I thought.

Bzip2 is indeed good, but being based on the Burrows-Wheeler transform, it
probably performs worse than current 'state of the art' PPMD compressors. It
would be really interesting to see the effect of your patch with a PPM
compressor. A while ago I did a test on several different compressors on the
source archive:

Packer  total size  compression  improvement from gz  loss from best  time (s)
tar     25764.5Mb   0            -2.42284             0.786226        0
gzip    7527.25Mb   0.707845     0                    0.268287        41010.3
zzip    5987.13Mb   0.767621     0.204605             0.0800635       441934
szip    6501.28Mb   0.747666     0.1363               0.152816        37523.5
bzip2   6479.37Mb   0.748516     0.139211             0.149951        43998.6
PPMd2   8631.81Mb   0.664973     -0.146742            0.36192         36657.1
PPMd3   7199.65Mb   0.72056      0.0435215            0.234993        39515.3
PPMd4   6549.61Mb   0.74579      0.129879             0.159067        41225.7
PPMd5   6223.04Mb   0.758465     0.173265             0.114937        43005
PPMd6   6027.49Mb   0.766055     0.199245             0.0862218       44711.3
PPMd7   5892.44Mb   0.771297     0.217185             0.0652795       46276.5
PPMd8   5796.15Mb   0.775034     0.229978             0.0497504       47621.4
PPMd9   5731.01Mb   0.777562     0.238631             0.0389513       48880.5
PPMd10  5688.98Mb   0.779193     0.244215             0.0318506       50777.1
PPMd11  5661.65Mb   0.780254     0.247846             0.0271773       50990.6
PPMd12  5638.57Mb   0.78115      0.250912             0.0231944       51868.9
PPMd13  5625.2Mb    0.781669     0.252688             0.0208738       52697.2
PPMd14  5613.63Mb   0.782118     0.254225             0.0188553       53441
PPMd15  5609.34Mb   0.782285     0.254795             0.0181046       54141.2
PPMd16  5605.05Mb   0.782451     0.255366             0.0173525       54776.9

bzip2               = bzip2 -9
PPMdx               = ppmd -o x -m 220
compression         = relative compression (1 - size / tar size)
improvement from gz = relative improvement from gzip compression
loss from best      = relative size difference compared to choosing the
                      individually best compressor for each package
time (s)            = total compression time
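
To make the column definitions concrete, here is a small Python sketch of how
the three relative figures seem to be derived from the total sizes. The
formulas are inferred from the numbers in the table rather than taken from the
original test script, and the function names are made up for illustration. The
"best" total (each package packed with whichever compressor did best on it) is
not listed in the table, so it is back-computed from the tar row here, which
is an assumption.

```python
# Sketch only: formulas inferred from the table, names are hypothetical.

def compression(size, tar_size):
    """Fraction of the uncompressed (tar) size that was removed."""
    return 1.0 - size / tar_size

def improvement_from_gz(size, gzip_size):
    """Relative size saving compared to the gzip total (negative = worse)."""
    return 1.0 - size / gzip_size

def loss_from_best(size, best_size):
    """Relative excess over the per-package-best total."""
    return 1.0 - best_size / size

def best_total(per_package_sizes):
    """Sum, over packages, of the smallest size any compressor achieved.

    per_package_sizes maps package name -> {compressor name: size}.
    """
    return sum(min(sizes.values()) for sizes in per_package_sizes.values())

# Totals in Mb, copied from the table.
TAR, GZIP, BZIP2 = 25764.5, 7527.25, 6479.37

# Assumption: the per-package-best total is not given, so recover it from
# the tar row's "loss from best" value (0.786226).
BEST = TAR * (1.0 - 0.786226)

if __name__ == "__main__":
    print("bzip2 compression        ", compression(BZIP2, TAR))
    print("bzip2 improvement from gz", improvement_from_gz(BZIP2, GZIP))
    print("bzip2 loss from best     ", loss_from_best(BZIP2, BEST))
```

With these definitions the bzip2 and gzip rows come out the same as in the
table to within rounding, which suggests the reading of the columns is right.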
 
PPM compressors were too slow to be practical until recently[0]. PPMd is a
demonstration program based on that article, and I even got as far as creating
a demo package[1]. A more usable program that contains the PPMd code is
7zip[2], but since it was originally written for Windows I'm not sure about
the current state of the port, if there even is one now (comments, Radim?).
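
For anyone who wants to repeat this kind of size and time comparison on their
own data, a minimal Python sketch follows. It is not the script used for the
table above. It only covers the two compressors that ship with Python's
standard library (zlib for gzip-style deflate, and bz2); PPMd has no standard
library binding and is left out. The names COMPRESSORS and benchmark are made
up for this example.

```python
import bz2
import time
import zlib

# Hypothetical registry: label -> function taking bytes and returning bytes.
COMPRESSORS = {
    "deflate -9 (zlib)": lambda data: zlib.compress(data, 9),
    "bzip2 -9": lambda data: bz2.compress(data, 9),
}

def benchmark(data):
    """Return {label: (packed size, relative compression, seconds)}."""
    results = {}
    for label, pack in COMPRESSORS.items():
        start = time.perf_counter()
        packed = pack(data)
        elapsed = time.perf_counter() - start
        ratio = 1.0 - len(packed) / len(data)
        results[label] = (len(packed), ratio, elapsed)
    return results

if __name__ == "__main__":
    # Stand-in input; replace with the contents of a real source tarball.
    sample = b"debian source package contents\n" * 5000
    for label, (size, ratio, secs) in benchmark(sample).items():
        print(f"{label:20s} {size:10d} bytes  {ratio:.6f}  {secs:.4f} s")
```

Timings from a run like this depend heavily on the machine and the input, so
they are only comparable within one run.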

0. http://DataCompression.info/Miscellaneous/PPMII_DCC02.pdf
1. oxtan.campus.luth.se/debian/ppmd
2. http://www.7-zip.org/
-- 
Magnus Ekdahl 0739-287181 [EMAIL PROTECTED] [EMAIL PROTECTED]
public key available at http://oxtan.campus.luth.se/magnus.public
ftp://ftp.se.debian.org/debian-non-US/pool/non-US/main/d/debian-keyring/
Key fingerprint = 18DE CB62 8A86 374E 824E  09ED 1987 4B18 1213 79F6

