I've just started playing with aspell, and I'd like to use it to improve OCR text automatically. I know the output won't be perfect; I just want it to be better than the input. I've experimented a bit and found that replacing each word aspell doesn't find in its dictionary with aspell's first suggestion gives a reasonably good result. It would be much better, though, if aspell's algorithms were oriented toward the kinds of mistakes OCR engines make rather than the kinds made by human typists. I can see how you might do this by working with the translation tables for the phonetic code, the keyboard files, etc. Before I go any further, I thought I'd ask whether anyone else has already gone down this route. Does anyone have files to share, advice on how to proceed, or warnings to turn back now?
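For concreteness, the first-suggestion approach can be sketched roughly as below. This is only a minimal illustration, assuming aspell is on the PATH and is run in its ispell-compatible pipe mode (`aspell -a`); the function names (`parse_pipe_output`, `apply_fixes`, `correct`) and the word-matching regex are my own placeholders, not part of aspell.

```python
import re
import subprocess


def parse_pipe_output(output):
    """Map each flagged word to aspell's first suggestion.

    In pipe mode a line such as
        & tbe 3 1: the, be, tube
    means "tbe" was not found and lists suggestions. Lines starting
    with '#' flag a word that has no suggestions; they are skipped.
    """
    fixes = {}
    for line in output.splitlines():
        if line.startswith("& "):
            head, _, suggestions = line.partition(": ")
            word = head.split()[1]
            fixes[word] = suggestions.split(", ")[0]
    return fixes


def apply_fixes(text, fixes):
    """Replace every whole-word occurrence of a flagged word."""
    def swap(match):
        return fixes.get(match.group(0), match.group(0))

    # Assumed notion of a "word": runs of letters, optionally joined
    # by apostrophes. aspell's own tokenizer may differ.
    return re.sub(r"[^\W\d_]+(?:'[^\W\d_]+)*", swap, text)


def correct(text, lang="en"):
    """Run text through `aspell -a` and apply first suggestions."""
    # Prefix each line with '^' so aspell never reads it as a command.
    piped = "".join("^" + line + "\n" for line in text.splitlines())
    result = subprocess.run(
        ["aspell", "-a", "--lang=" + lang],
        input=piped, capture_output=True, text=True, check=True,
    )
    return apply_fixes(text, parse_pipe_output(result.stdout))
```

Calling `correct(ocr_text)` returns the text with each unknown word swapped for its top suggestion. Because the ranking is tuned for typing and phonetic errors, typical OCR confusions are not necessarily ranked first, which is the limitation I'd like to get around.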
Peter Binkley
Digital Initiatives Technology Librarian
email: [EMAIL PROTECTED]
phone: (780) 492-3743
fax: (780) 492-9243
post: Cameron Library 4-30, University of Alberta, Edmonton, Alberta, Canada T6G 2J8