On Mon, Feb 16, 2004 at 07:33:11AM +1100, Ben Elliston wrote:
> gzip has a nice property that:
>         cat A B | gzip > foo.gz
> 
> is functionally equivalent to:
>         (gzip < A && gzip < B) > bar.gz
> 
> The best way to parallelise your compression work would be to divide
> your workload into N pieces, where N is the number of machines you
> have.  Use split(1) to break the input into N pieces and use each host
> to gzip one chunk.  At the end, "cat" the result together again.

Exactly. I have about 30000 pieces; my problem is that I don't know how
to get distcc to pipe them through gzip(*) on the remote boxes.
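For reference, the split/compress/concatenate scheme Ben describes can be sketched as a plain shell script. This is only a local sketch under assumed file names (A, B, chunk.*); the compression jobs run as local background processes here, where in practice each gzip would be dispatched to a remote host:

```shell
#!/bin/sh
# Sketch of the split/compress/concatenate approach (file names are made up).
set -e

printf 'part one\n' > A
printf 'part two\n' > B

# The property Ben mentions: compressing the concatenation...
cat A B | gzip > foo.gz
# ...is functionally equivalent to concatenating separate gzip streams:
(gzip < A && gzip < B) > bar.gz
gunzip -c foo.gz > foo.out
gunzip -c bar.gz > bar.out
cmp foo.out bar.out && echo "streams are equivalent"

# Parallel version: split the input, compress each piece (here in the
# background; remote hosts would run these in the real setup), then cat.
cat A B > big
split -b 4 big chunk.          # break input into fixed-size pieces
for f in chunk.*; do gzip "$f" & done
wait
cat chunk.*.gz > big.gz        # the concatenation is itself a valid gzip file
gunzip -c big.gz | cmp - big && echo "parallel result matches"
```

The ordering works because split(1) names its pieces alphabetically (chunk.aa, chunk.ab, ...), so the `chunk.*.gz` glob reassembles them in the original order.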

Christian Leber

(*) in fact it's output from 7z, which takes about 15x the time; for
gzip alone this would not be worthwhile, but the result is
decompressible with normal gzip

-- 
  "Omnis enim res, quae dando non deficit, dum habetur et non datur,
   nondum habetur, quomodo habenda est."       (Aurelius Augustinus)
  ("For a thing which is not diminished by being given away, so long as
   it is possessed and not given, is not yet possessed as it ought to be.")
  Translation: <http://gnuhh.org/work/fsf-europe/augustinus.html>
__ 
distcc mailing list            http://distcc.samba.org/
To unsubscribe or change options: 
http://lists.samba.org/mailman/listinfo/distcc