On Mon, Dec 14, 2009 at 5:25 AM, Florent Daigniere
<nextgens at freenetproject.org> wrote:
> Attempting to compress the file with the same compression algorithm is likely
> to be fruitless, yes... I had a patch somewhere which was trying to
> use file extensions to make educated guesses... but it never got merged
> because of conflicts (saces was working on metadata) and lack of interest
> on my side.
>
> Anyway, how do you determine if a file is already compressed or not without
> actually compressing it? Did you do the maths? In most cases, even though the
> data is already compressed, it still makes sense (walltime-wise) to recompress
> it with another algorithm before sending it over the (slow) wire.

I think by looking at the filetype you can make an educated guess.
Also, if the file is larger than 1MB, there is a good chance that it's
already been compressed.
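A minimal sketch of that heuristic in Java (the extension list and the 1MB
threshold are illustrative guesses, not taken from the node's actual code):

```java
import java.util.Set;

public class CompressionGuess {
    // Illustrative set of extensions whose formats are already compressed.
    private static final Set<String> COMPRESSED_EXTENSIONS = Set.of(
            "zip", "gz", "bz2", "7z", "jpg", "jpeg", "png",
            "mp3", "ogg", "mp4", "avi", "mkv");

    private static final long SIZE_THRESHOLD = 1024L * 1024; // 1 MB

    /** Educated guess: true if compressing the file is probably fruitless. */
    public static boolean probablyCompressed(String filename, long sizeBytes) {
        int dot = filename.lastIndexOf('.');
        String ext = (dot >= 0) ? filename.substring(dot + 1).toLowerCase() : "";
        return COMPRESSED_EXTENSIONS.contains(ext) || sizeBytes > SIZE_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(probablyCompressed("movie.mp4", 700L * 1024 * 1024)); // true
        System.out.println(probablyCompressed("notes.txt", 4 * 1024));           // false
    }
}
```

It's only a guess either way: a huge uncompressed tarball trips the size check,
and a renamed file defeats the extension check.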

I don't think you'll gain anything by re-compressing an already
compressed file unless the original compression mechanism was really
dumb.

> Iirc the node uses GZIP,BZIP2 and LZMA and inserts the smallest resulting
> file. At some point I even wanted to implement other algorithms like LZO and
> PAQ8P. After all, all we are talking about here is wasting some niced CPU
> cycles to earn both insert and download time!

Yeah, but with a 1GB file this compression takes a *long* time, and for
the vast majority of 1GB files it will be completely fruitless, because
they'll already be compressed.
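The strategy Florent describes, compress with each available codec and keep the
smallest output, can be sketched like this. The JDK's built-in gzip and deflate
streams stand in for the node's actual GZIP/BZIP2/LZMA set; the `Codec`
interface and the "none" sentinel are my own illustrative names:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

public class BestCodec {
    /** A compressor is anything that can wrap an output stream. */
    interface Codec {
        OutputStream wrap(OutputStream out) throws IOException;
    }

    /**
     * Compress the data with every codec and return the name of the one
     * producing the smallest output, or "none" if nothing beats the raw size.
     */
    public static String smallest(byte[] data, Map<String, Codec> codecs)
            throws IOException {
        String best = "none";
        long bestSize = data.length;
        for (Map.Entry<String, Codec> e : codecs.entrySet()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (OutputStream out = e.getValue().wrap(buf)) {
                out.write(data);
            }
            if (buf.size() < bestSize) {
                bestSize = buf.size();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Codec> codecs = new LinkedHashMap<>();
        codecs.put("gzip", GZIPOutputStream::new);
        codecs.put("deflate", DeflaterOutputStream::new);

        byte[] text = "hello world ".repeat(1000).getBytes();
        System.out.println(smallest(text, codecs)); // some codec wins easily
    }
}
```

This also shows where the time goes: already-compressed input still pays the
full cost of every codec, only to end up at "none".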

Ian.

-- 
Ian Clarke
CEO, Uprizer Labs
Email: ian at uprizer.com
Ph: +1 512 422 3588
