On Mon, 1 Oct 2012 22:16:53 -0700 Tim Kientzle <t...@kientzle.com> wrote:
> There are a few different parallel command-line compressors and decompressors
> in ports; experiment a lot (with large files being read from and/or written
> to disk) and see what the real effect is. In particular, some decompression
> algorithms are actually faster than memcpy() when run on a single processor.
> Parallelizing such algorithms is not likely to help much in the real world.
>
> The two popular algorithms I would expect to benefit most are bzip2
> compression and lzma compression (targeting xz or lzip format). For
> decompression, bzip2 is block-oriented so fits SMP pretty naturally. Other
> popular algorithms are stream-oriented and less amenable to parallelization.
>
> Take a careful look at pbzip2, which is a parallelized bzip2/bunzip2
> implementation that's already under a BSD license. You should be able to get
> a lot of ideas about how to implement a parallel compression algorithm.
> Better yet, you might be able to reuse a lot of the existing pbzip2 code.
>
> Mark Adler's pigz is also worth studying. It's also license-friendly, and is
> built on top of regular zlib, which is a nice technique when it's feasible.

Just a small note: There's a parallel implementation of xz called "pixz".
It's built atop liblzma and libarchive and is under a BSD-style license.
See: https://github.com/vasi/pixz

Maybe it's possible to reuse most of the code.

--
Homepage: www.yamagi.org
XMPP: yam...@yamagi.org
GnuPG/GPG: 0xEFBCCBCB
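
For readers who want to see the block-oriented idea from the quoted text in concrete form, here is a minimal sketch in Python. It is not taken from pbzip2, pigz or pixz. The function name `parallel_bz2_compress`, the chunk size and the worker count are all assumptions made for illustration. Each chunk is compressed on its own as an independent bzip2 stream, and the streams are concatenated in order. Threads are used only for brevity; a process pool could be swapped in.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size chosen for illustration; real tools pick their own.
CHUNK_SIZE = 900 * 1000


def parallel_bz2_compress(data: bytes,
                          chunk_size: int = CHUNK_SIZE,
                          workers: int = 4) -> bytes:
    """Compress `data` as a sequence of independent bzip2 streams.

    Every chunk is compressed separately, so chunks can be handled by
    different workers. The output is the in-order concatenation of the
    per-chunk streams.
    """
    # Split the input into fixed-size chunks; keep one empty chunk for
    # empty input so the result is still a valid bzip2 stream.
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() yields results in input order, which keeps the
        # concatenated streams in the right sequence.
        return b"".join(pool.map(bz2.compress, chunks))
```

The output can be read back with `bz2.decompress()`, which accepts several concatenated streams (Python 3.3 and later). Splitting the input this way usually costs some compression ratio, since each chunk is compressed without knowledge of the others.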