On Tue, Feb 04, 2003 at 05:18:04PM +0000, Gordan Bobic wrote:
> Hi!
>
> Matthew, I hope you're reading this because it is primarily aimed at you.
> :-)
>
> Has compression of files in Freenet been considered?
>
> I am not talking about compression of the traffic between the nodes (which
> would be also beneficial for security, because any entropy in the content
> would be removed before encryption).

Fortunate, because all the data is encrypted before we get a chance to
compress it at the traffic level (the files themselves that the nodes
cache are encrypted with keys known only by a client who knows the URI
of the file).

>
> I'm talking about the support for transparent file (de)compression. What
> do I mean?
>
> Well, all main-stream browsers support gzip and compress encoded content.
> They pass the "Accept-Encoding: gzip, compress, deflate" parameter. Then,
> if the server knows how to compress the data in one of the acceptable
> formats, it will send the content back compressed, and set the
> "Content-Encoding: gzip" header (or compress/deflate, as appropriate).
>
> This means that the files can be transparently decompressed by the browser
> without any particular overhead, and a huge potential saving in file
> storage, bandwidth, and possibly reliability (if smaller files get lost
> less easily).

Not huge. It would only apply to text formats. Images are almost
universally already compressed.

>
> What I am proposing would require three possible changes to Fred, one of
> which would be optional.
>
> 1) At insertion, instead of just setting the MIME type, allow for a box
> stating the "encoding" method. That way, a plain text file can be
> compressed by the user using, say, gzip. They then upload it and set the
> encoding=gzip option. This could be stored in the file headers somehow,
> along with the MIME type.
>
> Assuming we get this far, we now have a compressed file in the
> distributed file cache.

Yeah, there is something or other in the Dublin Core-based Info metadata
we could use.

>
> 2) When the browser requests the compressed text file mentioned above, it
> will issue the HTTP GET request to fproxy, and as always, send its
> Accept-Encoding headers. Now, here are two options:
>
> 2.1) We care about supporting browsers that don't support gzip compressed
> pages. Therefore, there is a requirement for a gzip decompressor in
> fproxy, so that it can uncompress the document for the browser that
> doesn't support the standard compression method. Fproxy decompresses the
> gzipped document, omits the Content-Encoding header, and passes back the
> plain text file.

This is probably not difficult. Java provides methods to manipulate
ZIP/JAR files, which we already use in the Distribution Servlet. I don't
know whether it supports dealing with raw gzip, but it is the same codec
as for ZIP files.

>
> 2.2) We DON'T care about browsers that don't support the gzip encoding.
> This is probably safe, because all commonly used browsers support this. In
> this case, the fproxy modifications would be much smaller. All it would
> have to do is look up the compression encoding on the file in the headers
> once it has downloaded it to the local node, and if set to "gzip", it
> would only have to pass back the standard headers, add the
> "Content-Encoding: gzip" header, and pass back the compressed file. It
> would only have to look up the file encoding, and set a header
> accordingly.
>
> The benefits seem pretty obvious, unless I am missing something. It
> removes the entropy from the files before they are ever inserted and
> encrypted, it reduces the storage requirements, and it reduces the
> bandwidth requirements.
>
> Am I re-discovering the wheel here? Or does this sound reasonable? Would
> this be a potential feature for the next version?
> I am not sure how flexible the headers are on the files in Freenet, so
> I am not sure what this will do for backward compatibility of the
> protocol (other than making the file come out as garbage on older nodes).

It's possible and will be implemented eventually. Possibly only in the
form of ZIP-manifest files.

>
> Obviously, changes to the insertion tools would be required, too...
>
> Regards.
>
> Gordan
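
For illustration only, here is a minimal sketch of the fproxy-side decision described in 2.1 and 2.2 above. It is not Fred code: the class name GzipEncodingSketch, the Response holder, and the idea that the stored encoding arrives as a plain string are all assumptions made for the example. The one library fact it relies on is that the standard java.util.zip package includes GZIPInputStream and GZIPOutputStream for raw gzip streams, alongside the ZIP classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; none of these names exist in Fred or fproxy.
class GzipEncodingSketch {

    // What would be sent to the browser. contentEncoding == null means
    // "omit the Content-Encoding header".
    static final class Response {
        final String contentEncoding;
        final byte[] body;

        Response(String contentEncoding, byte[] body) {
            this.contentEncoding = contentEncoding;
            this.body = body;
        }
    }

    // Insert-side step: compress the plain file before it is inserted.
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Option 2.1: decompress on behalf of a browser that cannot.
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Simplified Accept-Encoding check: looks for a "gzip" token and
    // ignores q-values (so "gzip;q=0" is wrongly treated as accepted).
    static boolean clientAcceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String item : acceptEncoding.split(",")) {
            String coding = item.split(";")[0].trim();
            if (coding.equalsIgnoreCase("gzip")) {
                return true;
            }
        }
        return false;
    }

    // storedEncoding stands in for whatever the file metadata would record.
    static Response prepare(byte[] stored, String storedEncoding,
                            String acceptEncoding) {
        if (!"gzip".equalsIgnoreCase(storedEncoding)) {
            return new Response(null, stored);        // not compressed
        }
        if (clientAcceptsGzip(acceptEncoding)) {
            return new Response("gzip", stored);      // 2.2: pass through
        }
        return new Response(null, gunzip(stored));    // 2.1: decompress
    }

    public static void main(String[] args) {
        byte[] stored = gzip("some plain text, some plain text"
                .getBytes(StandardCharsets.UTF_8));
        Response a = prepare(stored, "gzip", "gzip, compress, deflate");
        Response b = prepare(stored, "gzip", null);
        System.out.println(a.contentEncoding + " " + a.body.length);
        System.out.println(b.contentEncoding + " "
                + new String(b.body, StandardCharsets.UTF_8));
    }
}
```

Supporting only 2.2 would amount to dropping the gunzip branch in prepare; the rest is a header lookup, which matches the "much smaller" change described in the quoted text.
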
--
Matthew Toseland
[EMAIL PROTECTED][EMAIL PROTECTED]
Full time freenet hacker.
http://freenetproject.org/
Freenet Distribution Node (temporary) at
http://amphibian.dyndns.org:8889/A8dz8aYheps/
ICTHUS.
