* Thus wrote Jan Fabry ([EMAIL PROTECTED]):
> ...
> I have created some sort of compression. It assigns a number to each
> word, which says how many of the first letters it has in common with
> the previous word. It's only one hexadecimal character, so it is in
> the range 0-15. I don't know the official name, but I saw it in ispell
> dictionary files. I call it 'prefix compression', but this name
> already stands for another type of compression.
>
> An example to make this clear:
>
> _
> abs
> acos
> ...
>
> becomes
>
> 0_
> 0abs
> 1cos
> ...
>
> If someone knows a script that does this kind of compression, or would
> like to write it, you're welcome.
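The scheme described above is straightforward to script. A minimal Python sketch (the function name `compress` is mine, not from the thread): each word in a sorted list is replaced by one hex digit giving the length of the prefix it shares with the previous word, followed by the rest of the word.

```python
def compress(words):
    """Front-code a sorted word list: emit one hex digit (length of the
    prefix shared with the previous word, capped at 15) plus the suffix."""
    out = []
    prev = ""
    for word in words:
        # Count leading characters shared with the previous word.
        n = 0
        while n < min(len(word), len(prev), 15) and word[n] == prev[n]:
            n += 1
        out.append(f"{n:x}{word[n:]}")
        prev = word
    return out

print(compress(["_", "abs", "acos"]))  # → ['0_', '0abs', '1cos']
```

This reproduces the example from the quoted mail: '_' and 'abs' share nothing with their predecessors (digit 0), while 'acos' shares one letter with 'abs' (digit 1, suffix 'cos').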
Done :)

http://zirzow.dyndns.org/prefix_worddic/

> You can find an example with the function names from '_' till
> 'cyrus_unbind' (with manual compression!) on
>
> [ http://lumumba.luc.ac.be/cheezy/misc/php/prefix_compression.html ]
>
> This gives a compression from 3588 bytes to 2261 bytes (that would
> reduce the full lists from 42kB to 25-30kB, so the complete file would
> be about 45kB (or 40kB if we remove all whitespace)). The
> 'uncompression code' is really short, and it doesn't give a noticeable
> speed difference.

I hope you know how to search the compressed data; it baffles me right
now.

Curt
--
"I used to think I was indecisive, but now I'm not so sure."