On 2 October 2010 07:20, tt <[email protected]> wrote:
> Regarding the situations where one has no control over the original
> scan (i.e., where no amount of resizing would help), one could also
> train for those letter combinations which 'inevitably' come out joined
> after the box mapping phase.

Tesseract 3.01 will have a new mode where you can train using pages
boxed by word instead of by character. The caveat is that, to use it,
you must first have existing language data trained by character.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
