Olof Sjobergh wrote:
> On Wed, Jan 28, 2009 at 2:05 PM, Helge Hafting <helge.haft...@hist.no> wrote:
>> The obvious fix is to store the dictionary in such a format that
>> conversions won't be necessary. Not sure why utf16 is being used,
>> utf8 is more compact and works so well for everything else in linux.
>
> Yes, the obvious fix is to change the dictionary format. However, it's
> not as simple as you might think.
>
> The dictionary today is stored in utf8, not utf16. But the dictionary
> lookup tries to match words not exactly the same as the input word,
> for example e should also match é, è and ë. To do this, every

I see. This is done to avoid needing a few extra keys for accents and
umlauts? Won't that create problems for languages where two words
differ only in accents? In Norwegian, there are many such pairs.
Examples: for/fôr, tå/ta, dør/dor, ...

Helge Hafting

_______________________________________________
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community
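
To make the concern above concrete, here is a rough sketch of accent-tolerant lookup (e also matching é, è and ë) and of how it makes word pairs collide. It is only an illustration in Python, not the keyboard's actual code: the names fold_accents and build_index are made up for this example, and it assumes the folding is done by Unicode canonical decomposition (NFD) followed by dropping combining marks, which nothing in this thread confirms.

```python
import unicodedata


def fold_accents(word: str) -> str:
    """Return word with combining marks removed (hypothetical helper).

    The word is first decomposed (NFD), so that e.g. 'é' becomes
    'e' + COMBINING ACUTE ACCENT, then every combining mark is dropped.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(words):
    """Group dictionary words by their accent-folded key (hypothetical helper)."""
    index = {}
    for w in words:
        index.setdefault(fold_accents(w), []).append(w)
    return index


# The Norwegian pairs mentioned in the mail.
index = build_index(["for", "fôr", "ta", "tå", "dor", "dør"])

for key, candidates in index.items():
    print(key, "->", candidates)
```

Under this assumption, for/fôr end up under one key and ta/tå under another, so typing the plain word offers both candidates and the lookup cannot tell them apart on its own. Note that ø has no canonical decomposition in Unicode, so this particular folding keeps dør and dor distinct; matching o against ø would need an explicit mapping table on top.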