Hi Michael,

Thanks for the tip.

However, changing the defines to 3 or 5, and changing the pruner from
229 to 200 had no effect on the result.

Deeper into the code I go then... :)

J

On Apr 10, 5:45 pm, Michael Reimer <[email protected]> wrote:
> I butted my head against this sort of thing at first too, and the
> answer was that Tesseract doesn't seem to trust the dictionary very
> much by default.  There's a FAQ entry on how to change that:
> http://code.google.com/p/tesseract-ocr/wiki/FAQ (How to increase the
> trust in/strength of the dictionary?).
>
> On Apr 10, 10:33 am, Jon <[email protected]> wrote:
>
> > Err... my bad, the parameter should indeed be 1 character.
>
> > On Apr 10, 5:24 pm, Jon <[email protected]> wrote:
>
> > > I may be wrong, but I think /dict/dawg.cpp line 144 doesn't seem to
> > > consider UTF-8 (parameter 3 is a single byte), and thus fails on my
> > > Hebrew word.
> > > I'm still looking into it, it's the first time I'm looking at the
> > > code.
>
> > > More to come.
>
> > > On Apr 6, 11:57 am, paulfeakins <[email protected]> wrote:
>
> > > > Hi Jon, I also get the feeling my dictionary files are being ignored,
> > > > but I don't know what's causing it as yet...