I should add that there will be Tesseract support again; it's just not part of the default build right now.
In fact, an easy way to train a new language is to use Tesseract to bootstrap the new line recognizer. Any language that Tesseract supports can be trained that way, and the OCRopus recognizer can give you better performance than the Tesseract recognizer even if it was initially trained on Tesseract output.

Tom

On Jun 8, 6:29 pm, Gabriel <[email protected]> wrote:
> I have downloaded and compiled the latest sources
> from http://mercurial.iupr.org
> with scons.
> My OS is Ubuntu 9.04.
>
> I used OCRopus to recognize French text with a sample page in PNG
> format.
> ocropus page page01.png
>
> Before I did
> export tesslanguage=fra
> to be sure that Tesseract had the right language.
>
> The result was nice but accented characters are not recognized.
>
> But when I try directly using Tesseract (with the same page in TIFF
> format) it seems OK with accented characters.
>
> Do I need special dictionaries for OCRopus?
>
> Gabriel
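
As a rough sketch of the bootstrapping idea from the reply (Tesseract output used as initial transcriptions for training a line recognizer), something like the following could generate the labels. This is only an illustration under assumptions: the function names, the directory of pre-segmented line images, and the PNG input are all hypothetical, and the OCRopus training step itself is not shown because its interface is not described in this thread. The only external call is the standard `tesseract <image> <outputbase> -l <lang>` command line; older Tesseract builds may need TIFF input instead of PNG.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, lang="fra"):
    """Build the Tesseract CLI call that writes <image stem>.txt next to the image."""
    image = Path(image)
    output_base = image.with_suffix("")  # Tesseract appends ".txt" itself
    return ["tesseract", str(image), str(output_base), "-l", lang]


def bootstrap_transcriptions(line_dir, lang="fra"):
    """Label every line image in line_dir (hypothetical layout) with Tesseract.

    Returns (image, transcription file) pairs that a line-recognizer
    trainer could consume; the training step is deliberately left out.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    pairs = []
    for image in sorted(Path(line_dir).glob("*.png")):
        # check=True stops on the first image Tesseract fails to process
        subprocess.run(tesseract_command(image, lang), check=True)
        pairs.append((image, image.with_suffix(".txt")))
    return pairs
```

Calling `bootstrap_transcriptions("lines", lang="fra")` would leave one `.txt` file beside each line image. Those transcriptions will contain Tesseract's own mistakes, so correcting a sample by hand before training is probably worthwhile.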
