On Tue, Feb 04, 2003 at 05:18:04PM +0000, Gordan Bobic wrote:
> Hi!
> 
> Matthew, I hope you're reading this because it is primarily aimed at you. 
> :-)
> 
> Has compression of files in Freenet been considered?
> 
> I am not talking about compression of the traffic between the nodes (which 
> would be also beneficial for security, because any entropy in the content 
> would be removed before encryption).
That's fortunate, because all the data is encrypted before we get a
chance to compress it at the traffic level, and encrypted data does not
compress (the files that the nodes cache are encrypted with keys known
only to a client who knows the URI of the file).
> 
> I'm talking about the support for transparent file (de)compression. What 
> do I mean?
> 
> Well, all mainstream browsers support gzip- and compress-encoded content. 
> They send the "Accept-Encoding: gzip, compress, deflate" header. Then, 
> if the server knows how to compress the data in one of the acceptable 
> formats, it will send the content back compressed, and set the 
> "Content-Encoding: gzip" header (or compress/deflate, as appropriate).
> 
> This means that the files can be transparently decompressed by the browser 
> without any particular overhead, and a huge potential saving in file 
> storage, bandwidth, and possibly reliability (if smaller files get lost 
> less easily).
Not huge. It would only apply to text formats. Images are almost
universally already compressed.
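
To illustrate the point, here is a minimal Java sketch (not taken from
Fred). It gzips a block of repetitive text and a block of pseudo-random
bytes of the same size. The random bytes are only a stand-in for
already-compressed data such as JPEG images, which is an assumption made
for illustration, and the class name CompressionGainSketch is
hypothetical. Real text usually compresses less dramatically than this
repeated sentence.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of Freenet.
class CompressionGainSketch {

    /** Returns the number of bytes the data occupies after gzip. */
    static int gzippedSize(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.size();
    }

    /** Builds 'size' bytes of highly repetitive ASCII text. */
    static byte[] sampleText(int size) {
        byte[] sentence = "Freenet stores files in a distributed cache. "
                .getBytes(StandardCharsets.US_ASCII);
        byte[] text = new byte[size];
        for (int i = 0; i < size; i++) {
            text[i] = sentence[i % sentence.length];
        }
        return text;
    }

    /** Builds 'size' pseudo-random bytes, standing in for compressed images. */
    static byte[] sampleNoise(int size) {
        byte[] noise = new byte[size];
        new Random(42).nextBytes(noise);
        return noise;
    }

    public static void main(String[] args) throws IOException {
        int size = 64 * 1024;
        System.out.println("text:  " + size + " -> " + gzippedSize(sampleText(size)));
        System.out.println("noise: " + size + " -> " + gzippedSize(sampleNoise(size)));
    }
}
```

The text block should shrink to a small fraction of its size, while the
random block should come out no smaller than it went in.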
> 
> What I am proposing would require three possible changes to Fred, one of 
> which would be optional.
> 
> 1) At insertion, instead of just setting the MIME type, allow for a box 
> stating the "encoding" method. That way, a plain text file can be 
> compressed by the user using, say, gzip. They then upload it and set the 
> encoding=gzip option. This could be stored in the file headers somehow, 
> along with the MIME type.
> 
> Assuming we get this far, we now have a compressed file in the 
> distributed file cache.
Yeah, there is something or other in the Dublin Core-based Info metadata 
we could use.
> 
> 2) When the browser requests the compressed text file mentioned above, it 
> will issue the HTTP GET request to fproxy, and as always, send its 
> Accept-Encoding headers. Now, here are two options:
> 
> 2.1) We care about supporting browsers that don't support gzip-compressed 
> pages. Therefore, there is a requirement for a gzip decompressor in 
> fproxy, so that it can uncompress the document for the browser that 
> doesn't support the standard compression method. Fproxy decompresses the 
> gzipped document, omits the Content-Encoding header, and passes back the 
> plain text file.
This is probably not difficult. Java provides methods to manipulate
ZIP/JAR files, which we already use in the Distribution Servlet. I don't
know whether it supports dealing with raw gzip, but it is the same codec
as for ZIP files.
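
For what it is worth, the standard java.util.zip package does include
GZIPInputStream and GZIPOutputStream alongside the ZIP classes. A rough
sketch of the decision in 2.1, assuming the stored file is gzipped and
the browser's Accept-Encoding value is available as a string, could look
like the following. The class and method names are hypothetical and are
not fproxy's actual code. The header check is deliberately naive: it
ignores q-values such as "gzip;q=0".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of fproxy.
class GzipServeSketch {

    /** Compresses data with gzip, as an insertion tool might before upload. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    /** Decompresses gzip data for a browser that cannot handle it. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    /** Naive check for "gzip" in an Accept-Encoding value; ignores q-values. */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String token : acceptEncoding.split(",")) {
            String name = token.trim();
            int semi = name.indexOf(';');
            if (semi >= 0) {
                name = name.substring(0, semi).trim();
            }
            if (name.equalsIgnoreCase("gzip")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Picks the response body: the stored bytes unchanged (to be sent with
     * "Content-Encoding: gzip") or the decompressed bytes (sent without it).
     */
    static byte[] bodyFor(byte[] storedGzip, String acceptEncoding) throws IOException {
        return acceptsGzip(acceptEncoding) ? storedGzip : gunzip(storedGzip);
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = gzip("plain text page".getBytes(StandardCharsets.UTF_8));
        byte[] forOldBrowser = bodyFor(stored, null);
        System.out.println(new String(forOldBrowser, StandardCharsets.UTF_8));
    }
}
```

With this shape, option 2.2 is just the branch that returns the stored
bytes untouched; supporting older browsers only adds the gunzip path.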
> 
> 2.2) We DON'T care about browsers that don't support the gzip encoding. 
> This is probably safe, because all commonly used browsers support this. In 
> this case, the fproxy modifications would be much smaller. All it would 
> have to do is look up the compression encoding on the file in the headers 
> once it has downloaded it to the local node, and if set to "gzip", it 
> would only have to pass back the standard headers, add the 
> "Content-Encoding: gzip" header, and pass back the compressed file. It 
> would only have to look up the file encoding, and set a header 
> accordingly.
> 
> The benefits seem pretty obvious, unless I am missing something. It 
> removes the entropy from the files before they are ever inserted and 
> encrypted, it reduces the storage requirements, and it reduces the 
> bandwidth requirements.
> 
> Am I re-discovering the wheel here? Or does this sound reasonable? Would 
> this be a potential feature for the next version? I am not sure how 
> flexible the headers are on the files in Freenet, so I am not sure what 
> this will do for backward compatibility of the protocol (other than 
> making the file come out as garbage on older nodes).
It's possible and will be implemented eventually. Possibly only in the
form of ZIP-manifest files.
> 
> Obviously, changes to the insertion tools would be required, too...
> 
> Regards.
> 
> Gordan

-- 
Matthew Toseland
[EMAIL PROTECTED]
Full time freenet hacker.
http://freenetproject.org/
Freenet Distribution Node (temporary) at http://amphibian.dyndns.org:8889/A8dz8aYheps/
ICTHUS.
