Re: [freenet-dev] To zip or not to zip

[EMAIL PROTECTED] Fri, 01 Aug 2003 05:46:01 -0700

>> if it's the compressed hash, then
>> + the freenet protocol would know where to look for the data in the
>> freenet, grab it, detect its precompression, extract the original data and
>> return the data just like it went into the insertion tool
>> - 3rd party tools would have to emulate the node's compression step,
>> killing portability and future compatibility, if they want to calculate
>> the CHK hash by themselves
>
>Is there any particular reason why the tool would do that on its own? The 
>node already has to do it regardless (am I correct in thinking that?), so 
>doing it twice is wasteful.


adding a hypothetical FCP parameter like "tryToCompress=false" would force the node not
to compress the data, so tools could compress it themselves using faster and/or native
routines/hardware.
IIRC FUQID uses its *own*, faster FEC splitfile routines, which avoid the localhost
network loopback, instead of FRED's FCP FEC functions. let's hope they are compatible
(and that the implementations don't diverge at some time in the future).
so the insertion compression discussed here might be done by some weird insertion tool
itself, bypassing the node's implementation, if the tool author sees some gain in this
behaviour.
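
a rough sketch of what such a tool-side step might look like, in python. everything
here is an assumption: "tryToCompress" is only the hypothetical flag from above, the
request is a plain dict and not a real FCP message, and prepare_insert is a made-up
helper name.

```python
import zlib


def prepare_insert(data: bytes, level: int = 6) -> dict:
    """Compress in the tool and describe the insert as a plain dict.

    Hypothetical: "tryToCompress" is not an existing FCP field, and this
    dict is not the FCP wire format.
    """
    compressed = zlib.compress(data, level)
    # keep the compressed form only if it actually saves space
    if len(compressed) < len(data):
        payload, precompressed = compressed, True
    else:
        payload, precompressed = data, False
    return {
        "payload": payload,
        "precompressed": precompressed,
        # the tool has already decided, so ask the node not to try again
        "tryToCompress": False,
    }
```

the tool only keeps the compressed form when it is smaller, so data that does not
compress well goes in unchanged.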

also, 3rd party *nodes* will have to reimplement at least the decompression side to
allow the user to use freenet.
this might not be a big problem if we use a widespread algorithm like standard zip,
which has a library at hand in nearly every computer language. we should pick an
algorithm that is available for the major programming languages like c, c++, java,
delphi, python, .... i myself have not looked for bzip2/... support in these languages,
but i'll bet the chances are better that standard zip can be found in a lib.
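
as one data point for the "library at hand" argument: python ships both zlib (raw
deflate streams) and zipfile (zip archives) in its standard library. a minimal
round-trip sketch follows; the helper names and the sample file name are made up for
illustration.

```python
import io
import zipfile
import zlib


def deflate_roundtrip(data: bytes) -> bytes:
    """Compress with zlib (DEFLATE) and decompress again."""
    return zlib.decompress(zlib.compress(data))


def zip_roundtrip(name: str, data: bytes) -> bytes:
    """Store data in an in-memory zip archive and read it back."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)
    buf.seek(0)
    with zipfile.ZipFile(buf) as archive:
        return archive.read(name)


if __name__ == "__main__":
    payload = b"some freesite content " * 100
    assert deflate_roundtrip(payload) == payload
    assert zip_roundtrip("index.html", payload) == payload
```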

another topic is the hash of the compressed data.
even if we can agree on the compression algorithm, the *implementation* of this
algorithm might differ between languages and even language versions.
as a result, compressing the *same data* can produce *different archives*, which will
of course produce different hash keys for the compressed data (try to zip a file with
winzip, then with winrar: they're different. or imagine using a different compression
level...!)
so inserting a piece of data produces a hash of the compressed data which is
effectively not predictable and not consistent.
this means you have to reinsert your data from the same node as you always do, because
if the hash of the inserted compressed data differs between nodes (or tools), you get
no key collision, resulting in wasted store space and wasted time.
maybe the insertion tool will remember an already inserted file (like FIW does) and
will not reinsert the data but reuse the previous hash. this will lead to different
compression algorithms being used for one single freesite: some files were inserted
using algorithm A, and a re-insertion with a node or tool that produces different
hashes will upload the new files with algorithm B. the resulting mapfile for the site
will then contain some files compressed with A and some with B.
so if you're unlucky and use something (node/tool) that is incompatible with B, you
only get a crippled site or crippled data (splitfile blocks can theoretically be
inserted compressed, if they compress well)...... if any data at all....
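
the hash problem described above is easy to reproduce with one library and two
compression levels. the sketch below uses SHA-1 over the raw bytes as a stand-in for
the key hash; that is an assumption for illustration, it does not reproduce how a CHK
is actually computed, and the helper names are made up.

```python
import hashlib
import zlib


def key_hash(blob: bytes) -> str:
    """Stand-in for a content hash key: plain SHA-1 of the bytes (assumption)."""
    return hashlib.sha1(blob).hexdigest()


def compressed_key(data: bytes, level: int) -> str:
    """Hash of the data after zlib compression at the given level."""
    return key_hash(zlib.compress(data, level))


data = b"the same file, inserted twice " * 200

fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# both streams decompress to exactly the same original data ...
assert zlib.decompress(fast) == zlib.decompress(best) == data
# ... but the compressed bytes differ, so a hash over them differs too
assert fast != best
assert compressed_key(data, 1) != compressed_key(data, 9)
# the hash of the uncompressed data does not depend on who compressed it
assert key_hash(zlib.decompress(fast)) == key_hash(data)
```

a hash over the uncompressed data stays the same no matter which tool or level did the
compression, while a hash over the compressed data is only stable if every inserter
produces byte-identical output.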


just my 2 eurocent, maybe i'm worrying too much...




_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
