On Thu, 11 Jul 2002 [EMAIL PROTECTED] wrote:

> ... It would be much better, though, if aspell's
> algorithms were oriented toward the kinds of mistakes OCR engines make
> rather than the kinds made by human typists.

Aspell's algorithms are not really tuned for the kinds of mistakes typists make. Rather, they are tuned for the kinds of mistakes humans (especially me) tend to make when trying to spell a word. The typo analysis in Aspell biases the results slightly, but it generally doesn't make a huge difference.

> I can see how you might do this
> by working with the translation tables for the phonetic code, the keyboard
> files, etc.

You will probably get better results by turning the soundslike analysis off altogether. Modifying the keyboard file will also help. However, the best results will probably come from modifying the weights in TypoEditDistanceWeights, found in util/typo_editdist.hh. Doing so will require modifying the code a bit. The code that fills in the weights can be found in SuggestParms::fill_distance_lookup in lib/suggest.cc. Most of the code should be self-explanatory. You really need to understand how edit distance works in order to know what to modify. The comments in the util/*editdist* files should give you enough information for that.

---
http://kevin.atkinson.dhs.org
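
For anyone unfamiliar with weighted edit distance, here is a rough sketch of the general idea behind tuning substitution weights for OCR-style confusions. This is not Aspell's code: OcrWeights, weighted_edit_distance, the cost values, and the list of look-alike pairs are all hypothetical and chosen purely for illustration. The real structure to look at is TypoEditDistanceWeights in util/typo_editdist.hh.

```cpp
// Illustrative sketch only: none of these names or values come from Aspell.
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cost table. The numbers are arbitrary assumptions.
struct OcrWeights {
    int insert_cost = 100;     // target has a character the word lacks
    int remove_cost = 100;     // word has a character the target lacks
    int replace_cost = 100;    // ordinary substitution
    int confusable_cost = 30;  // substitution between look-alike glyphs

    // Cost of turning character a into character b.
    int substitute(char a, char b) const {
        if (a == b) return 0;
        // A tiny, made-up list of glyph pairs an OCR engine might confuse.
        static const char* const pairs[] = {"l1", "O0", "S5", "ce", "hb"};
        for (const char* p : pairs) {
            if ((a == p[0] && b == p[1]) || (a == p[1] && b == p[0]))
                return confusable_cost;
        }
        return replace_cost;
    }
};

// Classic dynamic-programming edit distance with per-operation weights.
// d[i][j] is the cheapest way to turn the first i characters of word
// into the first j characters of target.
int weighted_edit_distance(const std::string& word,
                           const std::string& target,
                           const OcrWeights& w = OcrWeights()) {
    const std::size_t n = word.size();
    const std::size_t m = target.size();
    std::vector<std::vector<int>> d(n + 1, std::vector<int>(m + 1, 0));

    for (std::size_t i = 1; i <= n; ++i) d[i][0] = d[i - 1][0] + w.remove_cost;
    for (std::size_t j = 1; j <= m; ++j) d[0][j] = d[0][j - 1] + w.insert_cost;

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            d[i][j] = std::min({
                d[i - 1][j] + w.remove_cost,
                d[i][j - 1] + w.insert_cost,
                d[i - 1][j - 1] + w.substitute(word[i - 1], target[j - 1])});
        }
    }
    return d[n][m];
}
```

With these made-up weights, "he1lo" scores closer to "hello" than "hexlo" does, because '1' for 'l' is treated as a cheap look-alike substitution. Lowering the cost of the confusions your OCR engine actually makes is the same kind of change you would be making to the weights in Aspell.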