Yu-Xi Lim: Thank you for your comments, and sorry for my last cryptic answer.
>I think Bearophile isn't refering to compression of the dictionary, but the >predictive algorithms used by modern data compressors. However, I think he's >over-complicating the issue. It is *not* a data compression problem, imho.< I agree that my solution is probably too much complex for most purposes (using a compressor simpler than PAQ is probably better for most of such purposes), but I think it is a data compression problem, because compressing data essentially means predicting the next bit, and this program has to predict what's the most probable letter that the user wanted to add. See Dasher too at the end of this post. >While predictive input is desired, the PAQ algorithm utilizes multiple >"contexts" (the novel contribution of the paper mentioned below). This is >intended for general purpose data compressors which work on a variety of data, >such as uncompressed graphics and audio, text, or other binary data. There is >however, only one context in this case.< PAQ8 manages many contexts. Some of them are fit for digital audio/images, or Jpeg, etc. Such contexts can be removed (disabled) from the PAQ source code, it's not difficult. But PAQ8 contains many (more than one) contexts just for textual data, and you can keep such contexts, because they improve text compression, so they improve the prediction. (Removing those contexts improves speed and even more it reduces memory used). For this program I think you can keep the following ones: Order n, Sparse, Text, Formatted text, Fixed record length, Context gap, Indirect. If you are short on memory you can probably remove some of them. If you use the keyboard to input specific kinds of data, you may add a context for them too. >A more advanced system (beyond regular T9 and comparable to Motorola's iTap) >may consider the context of the word. So typing followed 2255#466 would make >"call home" the most likely word.< A good compressor (a PPM can be okay too) can do this too, its contexts can be many chars long (but you need lot of memory, probably too much for a telephone of today). >https://www.cs.fit.edu/Projects/tech_reports/cs-2005-16.pdf Note that this document doesn't explain the new versions, that contain a new good idea. Code for the last version: http://cs.fit.edu/~mmahoney/compression/paq8h.zip You can be interested in a similar project, that uses a PPM: http://www.inference.phy.cam.ac.uk/djw30/dasher/ Using an instrumented PAQ this Dasher can be improved a little (speed of the algorithm isn't important, because you are compressing few bits/minute. The dictionaries created by the PAQ can be even frozen, in some cases, so they can be read from disk/flash at the start of the program. Bye, strong bear hugs, bearophile -- http://mail.python.org/mailman/listinfo/python-list