* Thus wrote Jan Fabry ([EMAIL PROTECTED]):
> ...
> I have created some sort of compression. It assigns a number to each 
> word, saying how many of its first letters it has in common with the 
> previous word. It's only one hexadecimal character, so it is in the 
> range 0-15. I don't know the official name, but I saw it in ispell 
> dictionary files. I call it 'prefix compression', but that name already 
> stands for another type of compression.
> 
> An example to make this clear:
> 
> _
> abs
> acos
>
> ... 
> becomes
> 
> 0_
> 0abs
> 1cos
> ... 
> If someone knows a script that does this kind of compression, or would 
> like to write it, you're welcome.

Done :)

http://zirzow.dyndns.org/prefix_worddic/
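For reference, here is a minimal sketch of the scheme described above, in
Python (the function names are mine, not from the script at the URL): each
output line starts with one hex digit giving how many leading characters
are shared with the previous word, capped at 15, followed by the rest of
the word.

```python
def compress(words):
    """Prefix-compress a sorted word list: one hex digit for the length
    of the prefix shared with the previous word (max 15), then the
    remaining suffix."""
    out = []
    prev = ""
    for word in words:
        n = 0
        while n < min(len(prev), len(word), 15) and prev[n] == word[n]:
            n += 1
        out.append(f"{n:x}{word[n:]}")
        prev = word
    return out

def decompress(lines):
    """Invert compress(): rebuild each word from the shared-prefix count."""
    words = []
    prev = ""
    for line in lines:
        n = int(line[0], 16)  # hex digit -> number of chars to reuse
        word = prev[:n] + line[1:]
        words.append(word)
        prev = word
    return words
```

Running compress() on the example list ['_', 'abs', 'acos'] yields
['0_', '0abs', '1cos'], matching the example above.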


> 
> You can find an example with the function names from '_' till 
> 'cyrus_unbind' (with manual compression!) on
> 
> [ http://lumumba.luc.ac.be/cheezy/misc/php/prefix_compression.html ]
> 
> This gives a compression from 3588 bytes to 2261 bytes (that would 
> reduce the full lists from 42kB to 25-30kB, so the complete file would 
> be about 45kB (or 40kB if we remove all whitespace)). The 'uncompression 
> code' is really short, and it doesn't make a noticeable speed difference.
 
I hope you know how to search the compressed data; it baffles me
right now.
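One straightforward approach (my own sketch, in Python, not from the
script above): scan the compressed lines in order, reconstructing each
word as you go, and compare against the target. Since the source list is
sorted, the scan can stop as soon as a reconstructed word sorts past the
target.

```python
def search(lines, target):
    """Linear search over a prefix-compressed, sorted word list.
    Reconstructs each word on the fly; stops early once the current
    word sorts after the target, since the list is sorted."""
    prev = ""
    for line in lines:
        n = int(line[0], 16)          # shared-prefix length as hex digit
        word = prev[:n] + line[1:]    # rebuild the full word
        if word == target:
            return True
        if word > target:
            return False              # sorted list: target cannot follow
        prev = word
    return False
```

For example, search(['0_', '0abs', '1cos'], 'acos') finds the word,
while search(['0_', '0abs', '1cos'], 'abc') bails out early.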

Curt
-- 
"I used to think I was indecisive, but now I'm not so sure."
