Hi Michael, Thanks for the tip.
However, changing the defines to 3 or 5, and changing the pruner from 229 to 200, had no effect on the result.

Deeper into the code I go then... :)

J

On Apr 10, 5:45 pm, Michael Reimer <[email protected]> wrote:
> I butted my head against this sort of thing at first too, and the
> answer was that Tesseract doesn't seem to trust the dictionary very
> much by default. There's a FAQ entry on how to change
> that: http://code.google.com/p/tesseract-ocr/wiki/FAQ (How to increase the
> trust in/strength of the dictionary?).
>
> On Apr 10, 10:33 am, Jon <[email protected]> wrote:
>
> > Err... my bad, the parameter should indeed be 1 character.
>
> > On Apr 10, 5:24 pm, Jon <[email protected]> wrote:
>
> > > I may be wrong, but I think /dict/dawg.cpp line 144 doesn't seem to
> > > consider UTF-8 (parameter 3 is a single byte), and thus fails on my
> > > Hebrew word.
> > > I'm still looking into it, it's the first time I'm looking at the
> > > code.
>
> > > More to come.
>
> > > On Apr 6, 11:57 am, paulfeakins <[email protected]> wrote:
>
> > > > Hi Jon, I also get the feeling my dictionary files are being ignored,
> > > > but I don't know what's causing it as yet...

