From the GNU Aspell author Kevin Atkinson <kevina at gnu.org>, concerning Aspell and UTF-8:

Starting with version 0.60, Aspell fully supports spell checking documents in UTF-8 or any other encoding that Aspell supports. The fact that Aspell is still 8-bit internally can be made completely transparent to the end user. This means that Aspell can now support any language that has no more than 220 distinct characters, including different capitalizations and accents, _even if_ there is no existing 8-bit encoding that supports the language. All one has to do is create a new character data file, which is a fairly simple task. The internal encoding never has to be seen by the end user, including the word list author, since not even the word list has to be in the same encoding that Aspell uses.

GNU Aspell 0.50 supported Unicode to some extent; however, word lists still had to be in an 8-bit character set. Furthermore, spell checking documents in an encoding different from the internal encoding was problematic.

Full UTF-8 support was added with 0.51-20040219. The next snapshot, 0.51-20040227, fixed a few bugs, while the latest, 0.60-20040317, uses a new, simpler format for the character data files.

Aspell snapshots can be downloaded from ftp://alpha.gnu.org/gnu/aspell/.

Markus

-- 
Markus Kuhn, Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
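
To illustrate the idea in the announcement above (an 8-bit internal encoding hidden behind UTF-8 input and output), here is a minimal Python sketch. It is not Aspell's code and does not reproduce its character data file format; the class CharTable, the constant MAX_SLOTS, and the starting code 0x20 are hypothetical names and choices made up for this example. The limit of 220 is simply the figure quoted in the announcement.

```python
# Hypothetical illustration only: NOT Aspell's implementation or file format.
MAX_SLOTS = 220  # distinct-character limit quoted in the announcement


class CharTable:
    """Maps a language's distinct characters to single-byte internal codes."""

    def __init__(self, alphabet):
        chars = sorted(set(alphabet))
        if len(chars) > MAX_SLOTS:
            raise ValueError("too many distinct characters for an 8-bit table")
        # Assumption: codes are handed out consecutively starting at 0x20.
        self.to_byte = {ch: 0x20 + i for i, ch in enumerate(chars)}
        self.to_char = {b: ch for ch, b in self.to_byte.items()}

    def encode(self, utf8_bytes):
        """UTF-8 input -> one internal byte per character."""
        text = utf8_bytes.decode("utf-8")
        return bytes(self.to_byte[ch] for ch in text)

    def decode(self, internal):
        """Internal bytes -> UTF-8 output."""
        return "".join(self.to_char[b] for b in internal).encode("utf-8")


# Round trip: the caller only ever sees UTF-8.
table = CharTable("abcdefghijklmnopqrstuvwxyzäöüß")
word = "größe".encode("utf-8")
internal = table.encode(word)
assert len(internal) == 5            # one byte per character internally
assert table.decode(internal) == word
```

The point of the sketch is only that the table, not the caller, knows the internal byte values, so a language without an existing 8-bit encoding needs nothing more than a new table.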
