On 2013-10-21 Richard W.M. Jones wrote:
> Here is a parallel implementation of xzcat:
> 
> http://git.annexia.org/?p=pxzcat.git;a=tree
> 
> Some test results:
> 
>   4 cores:  xzcat: 23.8 s  pxzcat: 8.1 s   speed up: 2.9
>   8 cores:  xzcat: 26.8 s  pxzcat: 10.5 s  speed up: 2.55
> 
> I just wrote this as a quick hack in a couple of hours, so while it
> may be of interest it's not a long term solution.  (It would be better
> to get the xzcat -T flag working).

Sounds nice!

Threaded decoding should be included in liblzma, but it will have to
wait until after 5.2.0. In liblzma it will work for streamed
decompression, but that also means using quite a bit of memory.

> (2) I have not tested it with multi-stream files, but it should work
> with them.

I tested two-stream files with and without stream padding, and neither
worked with pxzcat. Commands to create the files:

    echo foobar | xz --block-size=3 > test1.xz
    echo bazqux | xz --block-size=4 >> test1.xz

    echo foobar | xz --block-size=3 > test2.xz
    dd if=/dev/zero bs=100 count=1 >> test2.xz
    echo bazqux | xz --block-size=4 >> test2.xz

I didn't investigate why it doesn't work, sorry.
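
For reference, here is a minimal sketch of what a decoder has to do
with such files. It is written in Python with the standard lzma module,
not taken from pxzcat, and it is not a diagnosis of the pxzcat problem.
The function name decompress_multistream is made up for illustration.
The idea: decode one stream, skip any stream padding (null bytes, which
the .xz format requires to come in multiples of four), then look for
the next stream.

```python
import lzma


def decompress_multistream(data: bytes) -> bytes:
    """Decode concatenated .xz streams, allowing stream padding between them."""
    out = []
    pos = 0
    while pos < len(data):
        # Skip stream padding (null bytes). This sketch is lenient and
        # also tolerates padding before the first stream.
        pad_start = pos
        while pos < len(data) and data[pos] == 0:
            pos += 1
        if (pos - pad_start) % 4 != 0:
            raise lzma.LZMAError("stream padding is not a multiple of 4 bytes")
        if pos == len(data):
            break

        # Decode exactly one stream; bytes after it land in unused_data.
        dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        out.append(dec.decompress(data[pos:]))
        if not dec.eof:
            raise lzma.LZMAError("input ended in the middle of a stream")
        pos = len(data) - len(dec.unused_data)
    return b"".join(out)
```

Fed the contents of test1.xz or test2.xz as created above, this should
return the two lines "foobar" and "bazqux".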

> Notes on performance:
> 
> - Scalability is not too bad on my laptop (4 core machine above) but
> much worse on a theoretically higher performing machine with SSDs (8
> core machine above).  I don't really understand why that is.

A few wild guesses:

  - Eight physical cores, or eight threads (hyperthreading)?

  - If all cores share the same L3 cache and memory controller, maybe
    memory access becomes a bottleneck.

  - Maybe scattered I/O has something to do with it. Testing with the
    write calls commented out might give some hints.
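
The last guess could be checked roughly along these lines. This is a
single-threaded Python sketch using the standard lzma module, not
pxzcat itself, so it only shows how much of the wall-clock time goes to
writing the output at all; it does not reproduce pxzcat's scattered
write pattern. The name timed_decompress is hypothetical.

```python
import lzma
import time


def timed_decompress(src, dst=None, chunk=1 << 20):
    """Decompress src; write to dst if given, else discard the output.

    Returns (elapsed_seconds, uncompressed_bytes).
    """
    start = time.perf_counter()
    total = 0
    out = open(dst, "wb") if dst is not None else None
    try:
        with lzma.open(src, "rb") as f:
            while True:
                buf = f.read(chunk)
                if not buf:
                    break
                total += len(buf)
                if out is not None:
                    out.write(buf)
    finally:
        if out is not None:
            out.close()
    return time.perf_counter() - start, total
```

Running it once with dst=None and once with a real output path on the
same input gives two timings to compare. If they are close, the writes
are probably not the bottleneck.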

> - For reasons I don't understand, both regular xzcat and pxzcat cause
> the output file to be flushed to disk after the program exits.  This
> causes any program which consumes the output of the file to slow down.

I have no idea. I see you committed something that seems to be related
to this after your email. From a quick reading I don't understand it
well; it seems to work around some issue with ftruncate() on ext4.

xz doesn't use ftruncate() though, so if xz has the same problem, the
cause cannot be ftruncate(). If sparseness is the problem, test with
--no-sparse, although with very sparse files that creates a different
performance problem, of course.
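
To see whether an output file actually ended up sparse, one can compare
its apparent size with the space allocated for it. A small Python
sketch follows. The helper names are made up, and it assumes st_blocks
counts 512-byte units, which holds on Linux but is not guaranteed
elsewhere.

```python
import os


def allocation(path):
    """Return (apparent_size, allocated_bytes) for path."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units (true on Linux).
    return st.st_size, st.st_blocks * 512


def looks_sparse(path):
    """True if fewer bytes are allocated than the file's apparent size."""
    size, allocated = allocation(path)
    return allocated < size
```

Some filesystems store very small files inline or compress data, so a
True result is a hint rather than proof that the file has holes.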

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode
