I searched this group but found only old posts about multiple languages (regarding 2.0), and I'm looking forward to the new features in 3.01...
I am assuming it's still impossible, even in Tesseract 3.01, to recognize a mixture of languages (distinct alphabets) within a single scan. If my assumption is correct, then the next best thing could be to combine multiple traineddata files into one superset. But is that even feasible? Are there any other solutions for multilingual (multi-alphabet) documents? (ABBYY does it, so why can't we? :-))

TIA

