Karen,

My OCR performance has improved thanks to your advice and to specifying the dictionary correctly. One of the problems was that when I changed the dictionary, Matterhorn didn't load it until I deleted the location where Matterhorn stores its data and restarted it. I think there must be a better way to reload the dictionaries, but this works for me, and doing it this way is not a problem in my case.

Regards,
Miguel

2012/4/9 Miguel Del Agua <[email protected]>:
> Thanks for your suggestions, I'll try as soon as possible.
>
> 2012/4/5 Tobias Wunden <[email protected]>:
>> Hi Matjaz,
>>
>> Is this something you could potentially take a look at together with Karen?
>> It seems that her suggestions have quite an impact on OCR performance. I've
>> just opened a ticket, MH-8701
>>
>> Thanks,
>> Tobias
>>
>> On 03.04.2012, at 19:51, Karen Dolan <[email protected]> wrote:
>>
>>> Miguel,
>>>
>>> Matterhorn trunk (from a couple weeks ago) was configured to pull down
>>> Leptonica 1.66 and Tesseract 3.00.
>>> I went and retrieved Leptonica 1.67 and Tesseract 3.01 directly, along with
>>> the latest Tesseract English dictionary (Reference:
>>> http://code.google.com/p/tesseract-ocr/wiki/ReadMe).
>>>
>>> The text extraction is now much better than it was a few months ago.
>>>
>>> Good luck!
>>> Karen
>>>
>>>
>>> On 4/3/2012 11:29 AM, Miguel Del Agua wrote:
>>>> Hi,
>>>>
>>>> I just installed version 1.3 and it seems to work correctly, but the OCR
>>>> performance is quite poor. I've tried to install a new dictionary as
>>>> described in the wiki, but the performance is still bad. So I would like
>>>> to know if it's possible to improve text recognition either by
>>>> changing some parameters of OCRopus or by improving the dictionary
>>>> in some way.
>>>>
>>>> Thanks in advance.
_______________________________________________
Matterhorn mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn

To unsubscribe please email
[email protected]
_______________________________________________
