On Dec 10, 6:26 pm, Tatu Saloranta <[email protected]> wrote:
> On Thu, Dec 10, 2009 at 1:34 PM, Sam Van Oort <[email protected]> wrote:
> > Hi,
> > I can add this to a list of extensions to the LZF code. It's not a
> > bad idea to have a version which is fully binary compatible, which can
> > be used as a compatibility option in the future.
>
> I think it'd be nice to be compatible, especially if a separate
> library was carved out.

I'm talking with Mr. Mueller about this, but it could take some time to work out. We'll see where it goes.
> Also: using chunk identifiers like command-line tools has one nice
> benefit; that is, you can just sequence blocks without restrictions as
> there is no initial header (which can be a downside in some cases too).

Could you explain a little more? I haven't looked at the C LZF code, to avoid possible legal problems. I can do that now that my Java LZF extensions appear complete.

> What kind of improvements are there? Better hashing? I assume changes
> to format wouldn't be needed?

No changes to format -- just a system that stores more hashes and checks for the best of several candidate back-references. There is also a variant that hashes *all* bytes rather than just the literals and the last couple from each back-reference, but that version is still too slow to be useful.

As it stands now, it looks like these won't make it into the H2 codebase, because Mr. Mueller wants to keep the code pared down, but here's a teaser in case anyone is interested in pre-release versions.

Benchmarks (Intel Core 2 Duo T5270 @ 1.4 GHz, single core only), file Parser.java, all speeds in MB/s:

Compressor   Compress rate   Compression ratio   Expand rate (old)   Expand rate (new)
FASTEST      97.5            0.2704              488.3               556.1
FAST         70.1            0.2526              --                  599.5
NORMAL       48.5            0.2413              --                  669.3
BETTER       25.2            0.2324              --                  693.8

"Fastest" corresponds to the current (optimized) compressor; "Fast" is roughly equivalent in speed to the older un-optimized version. Note that the difference between ratios of 0.27 and 0.232 is about a 15% decrease in file size, and the expansion rate grows with compression. Yes, the compression rate drops significantly, but for data that is read frequently and written infrequently, it's worth it. Expect speeds a little over double on a more modern system (say a 2.6 GHz Core 2 Duo), so even the slowest option still compresses at around 50 MB/s.

> Yes, many other projects have expressed interest in using a pluggable
> codec.

Something in the same style as the existing Inflater/Deflater and GZip/Zip Input/Output streams?
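For anyone curious what "stores more hashes and checks for the best of several candidate back-references" might look like: here is a rough, hypothetical sketch (class and method names are mine, not from the pre-release code) of keeping several recent positions per hash bucket and picking the longest match among them, instead of the single-slot table the basic LZF compressor uses.

```java
// Hypothetical sketch, not the actual pre-release code: a hash table that
// remembers several candidate positions per bucket, so the compressor can
// pick the longest of several possible back-references.
final class MultiCandidateMatcher {
    private static final int HASH_BITS = 14;
    private static final int CANDIDATES = 4;               // positions remembered per bucket
    private final int[][] table = new int[1 << HASH_BITS][CANDIDATES];
    private final int[] count = new int[1 << HASH_BITS];

    // Hash the 3 bytes at position i; caller ensures i + 2 is in bounds.
    private static int hash(byte[] in, int i) {
        int h = (in[i] & 0xFF) | ((in[i + 1] & 0xFF) << 8) | ((in[i + 2] & 0xFF) << 16);
        return (h * 0x9E3779B1) >>> (32 - HASH_BITS);      // keep the top HASH_BITS bits
    }

    /** Returns the position of the longest match for in[pos..end), or -1 if none. */
    int bestMatch(byte[] in, int pos, int end) {
        int h = hash(in, pos);
        int bestPos = -1;
        int bestLen = 2;                                   // require at least 3 matching bytes
        int n = Math.min(count[h], CANDIDATES);
        for (int c = 0; c < n; c++) {
            int cand = table[h][c];
            int len = 0;
            while (pos + len < end && in[cand + len] == in[pos + len]) len++;
            if (len > bestLen) { bestLen = len; bestPos = cand; }
        }
        table[h][count[h] % CANDIDATES] = pos;             // remember the current position too
        count[h]++;
        return bestPos;
    }
}
```

The single-slot table of plain LZF would overwrite the older position on every collision; keeping a small ring of candidates is the usual way to trade compression speed for a better ratio, which matches the FASTEST-to-BETTER spread in the table above.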
I think I can do that easily. Doesn't surprise me to see interest, since there's so little support for LZF in the Java libraries.

Cheers,
Sam Van Oort

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.
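P.S. To make the "same style as GZip/Zip Input/Output streams" idea concrete, here is a minimal sketch of what such a stream wrapper could look like. The class name is hypothetical, and to keep the sketch self-contained it only emits uncompressed (type 0) chunks in the "ZV"-magic chunk layout that the C lzf command-line tool uses, as I understand it; a real implementation would plug the compressor in and emit type 1 chunks when compression helps.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of an LZF stream wrapper in the style of GZIPOutputStream.
// Chunk layout (as I understand the C lzf CLI format): 'Z', 'V', type byte,
// big-endian 16-bit length(s), data. Only uncompressed (type 0) chunks are
// written here; a real codec would compress and emit type 1 chunks instead.
class LZFChunkedOutputStream extends FilterOutputStream {
    private static final int MAX_CHUNK = 0xFFFF;  // chunk lengths are 16-bit
    private final byte[] buf = new byte[MAX_CHUNK];
    private int used;

    LZFChunkedOutputStream(OutputStream out) { super(out); }

    @Override public void write(int b) throws IOException {
        buf[used++] = (byte) b;
        if (used == MAX_CHUNK) flushChunk();      // chunk is full, emit it
    }

    @Override public void flush() throws IOException {
        flushChunk();
        out.flush();
    }

    private void flushChunk() throws IOException {
        if (used == 0) return;
        out.write('Z'); out.write('V'); out.write(0);   // type 0 = uncompressed
        out.write(used >>> 8); out.write(used & 0xFF);  // big-endian length
        out.write(buf, 0, used);
        used = 0;
    }
}
```

Because there is no stream-level header, chunks written this way can simply be concatenated, which is the sequencing benefit mentioned earlier in the thread.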
