Pauliuc George wrote:

> a few huge documents. And I was quite impressed by the
> results of aspell. If aspell knew the misspelled word than
> that word is in most cases the first option. Never seen
> such results with other spell checkers. Probably is because

I agree that aspell gives the best suggestions. But ispell/MySpell is better on some other issues. I wish Aspell's "suggestion engine" could be integrated into ispell/MySpell as a component.

> Anyway, here's one issue: in Romanian we use the dash (-)
> about the same way English uses the apostrophe (').

Swedish has a similar issue with the colon (:) and dash (-), and in some places with digits. For example, English "3rd" = Swedish "3:e", while "3:c" would be bad spelling. Also, 743:e is good but 7A3:e is bad, so it would be nice to have regular expressions as part of the dictionary: every string matching "[0-9]*3:e" would then be a good word.

> The final issue (even more twisted as the ones above ;-):
> because of badly implemented Romanian char support many
> documents are made without the diacritics. So, instead of î
> we have i and so on. This is a very particular case for a
> spell checker (I don't know any other language with such an
> issue) - to add the diacritics.

For Swedish, leaving out the diacritics was common in the 1980s, but the habit has gone away. People don't accept it, and always make fun of those who leave out diacritics, so it is OK for a spelling program to report these cases as errors.

For German, it is different. There is a long tradition of rewriting German ä as ae, ö as oe, and ü as ue. People still use this on many German mailing lists, even though their keyboards and mailing software should support the diacritics. The rewriting is not reversible: writing "Poesie" as "Pösie" would be wrong. A "generous" dictionary would have to contain Götter, Goetter, and Poesie, but not Pösie.
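
To make the two points above concrete, here is a minimal Python sketch. It is an illustration only, not how aspell or ispell/MySpell work; the names is_ordinal_3e, generous and naive_reverse are made up for this example. It assumes the Swedish pattern must match a whole token, and it only maps lowercase ä, ö and ü.

```python
import re

# Swedish ordinals written with digits: the pattern quoted above,
# applied to the whole token (an assumption of this sketch).
ORDINAL_3E = re.compile(r"[0-9]*3:e")

def is_ordinal_3e(token):
    """True if the token is digits ending in 3, followed by ':e'."""
    return ORDINAL_3E.fullmatch(token) is not None

# German umlaut rewriting, lowercase only.
UMLAUT_TO_DIGRAPH = str.maketrans({"ö": "oe", "ä": "ae", "ü": "ue"})

def generous(strict):
    """Strict words plus their umlaut-free spellings."""
    return set(strict) | {w.translate(UMLAUT_TO_DIGRAPH) for w in strict}

def naive_reverse(word):
    """Blindly turn digraphs back into umlauts; included to show it is wrong."""
    return word.replace("oe", "ö").replace("ae", "ä").replace("ue", "ü")

for token in ("3:e", "743:e", "7A3:e", "3:c"):
    print(token, is_ordinal_3e(token))

print(sorted(generous({"Götter", "Poesie"})))  # ['Goetter', 'Götter', 'Poesie']
print(naive_reverse("Poesie"))                 # Pösie, which is not a word
```

The forward direction (ö to oe) is safe to apply to every strict word, while the reverse direction needs the dictionary to decide, which is why the generous list is built from the strict one and not the other way around.
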
Such a generous dictionary could be generated automatically by software from a "strict" dictionary that only contains "Götter" and "Poesie", like this:

    ( cat strict ; sed 's/ö/oe/g;s/ä/ae/g;s/ü/ue/g' strict ) | sort -u > generous

-- 
Lars Aronsson ([EMAIL PROTECTED])
Aronsson Datateknik - http://aronsson.se/

_______________________________________________
Aspell-devel mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/aspell-devel