Nathan Bullock:
> I have a plucker document with about 1600 html pages,
> they range from 1k - 37k. The vast majority are
> between 2k - 10k.
> Various compression techniques:
> 1. Total Raw Bytes 8Mb.
> 2. Total gzipped 3.1Mb (Each file individually
>    compressed).
> 3. Total tar gzipped 2.3Mb

The reason for the extra reduction is that there is some redundancy
between files. For instance, they probably have similar headers and
footers.

You could get a similar reduction by using a custom dictionary. Then
you would only need to parse this dictionary (once) plus the desired
record, instead of everything-up-to-the-record.

The zlib spec does allow for a custom dictionary, but (last I checked)
this didn't seem to be implemented in the standard open source zlib.[1]
The dictionary itself is "application-specific".

We would also have to decide whether to use one (or several?)
plucker-custom dictionaries, a per-pdb dictionary stored under a
special magic record number, or both.

[1] http://www.gzip.org/zlib/ suggests that it is there as of 1.1.3
(which we use), but has been improved since then.

-jJ
_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev
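To make the custom-dictionary idea concrete, here is a minimal sketch using the preset-dictionary support in Python's zlib module (the zdict argument, available since Python 3.3). The dictionary contents, the sample page, and the helper names compress_record / decompress_record are all made up for illustration; a real dictionary would be built from whatever markup the records actually share, and nothing here reflects how Plucker stores records today.

```python
import zlib

# Hypothetical shared dictionary: boilerplate markup that most pages repeat.
# In practice this would be derived from the real headers and footers.
SHARED_DICT = (
    b'<html><head><title></title></head><body>'
    b'<p><a href=""></a></p>'
    b'</body></html>'
)


def compress_record(data: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    """Compress one record on its own, primed with the shared dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_record(blob: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    """Decompress one record; needs only the dictionary plus this record."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    # Made-up sample page that shares its markup with the dictionary.
    page = (b'<html><head><title>Page 1</title></head><body>'
            b'<p><a href="p2.html">next</a></p></body></html>')
    plain = zlib.compress(page, 9)
    primed = compress_record(page)
    assert decompress_record(primed) == page
    print(len(page), len(plain), len(primed))
```

Each record stays independently readable, which is the point: the reader loads the dictionary once and then inflates only the record it wants. The same dictionary bytes must be supplied on both sides, so the open question of where the dictionary lives (built into the viewer, or in a magic record per pdb) still has to be settled.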
