Jimmy,

Thanks for the good news that Tesseract 3.01 will let us train using pages boxed by word instead of by character, with the help of traineddata from Tesseract 3.0. This may help the Kannada project.

With Best Regards,
-sriranga(78yrsold)
On Sat, Oct 2, 2010 at 5:28 PM, Jimmy O'Regan <[email protected]> wrote:

> On 2 October 2010 07:20, tt <[email protected]> wrote:
> > Regarding the situations where one has no control over the original
> > scan (i.e., where no amount of resizing'd help), one could also train
> > for those letters' combinations which 'inevitably' come out joined
> > after the box mapping phase.
>
> Tesseract 3.01 will have a new mode where you can train using pages
> boxed by word instead of by character. The caveat is that, to use it,
> you must first have existing language data trained by character.
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected]<tesseract-ocr%[email protected]>.
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

