What about multi-language recognition, i.e. recognising text in Lang1 that contains inclusions of text in Lang2, Lang3, etc.? Right now, for this to work one has to train "pseudo-languages", a technique with obvious drawbacks.
If such functionality is being worked on, would you please also consider adding weights to the languages in such multi-language sets? That would be helpful in cases of similar-looking glyphs in different languages.

On Oct 2, 6:29 am, "Jimmy O'Regan" <[email protected]> wrote: ...

