Re: [Opencast Matterhorn] How to improve OCR performance

Tobias Wunden Thu, 05 Apr 2012 03:08:16 -0700

Hi Matjaz,

Is this something you could potentially take a look at together with Karen? It 
seems that her suggestions have quite an impact on OCR performance. I've just 
opened a ticket, MH-8701.


Thanks,
Tobias

On 03.04.2012, at 19:51, Karen Dolan <[email protected]> wrote:

> Miguel,
> 
> Matterhorn trunk (from a couple weeks ago) was configured to pull down 
> Leptonica 1.66 and Tesseract 3.00.
> I went and retrieved Leptonica 1.67 and Tesseract 3.01 directly, along with 
> the latest Tesseract English dictionary (Reference: 
> http://code.google.com/p/tesseract-ocr/wiki/ReadMe).
> 
> The text extraction is now much better than it was a few months ago.
> 
> Good luck!
> Karen
> 
> 
> 
> On 4/3/2012 11:29 AM, Miguel Del Agua wrote:
>> Hi,
>> 
>> I just installed version 1.3 and it seems to work correctly, but the OCR
>> performance is quite poor. I've tried installing a new dictionary as
>> described in the wiki, but the performance is still bad. So I would like
>> to know if it's possible to improve text recognition either by
>> changing some parameters of OCRopus or improving in some way the
>> dictionary.
>> 
>> Thanks in advance.
>> _______________________________________________
>> Matterhorn mailing list
>> [email protected]
>> http://lists.opencastproject.org/mailman/listinfo/matterhorn
>> 
>> 
>> To unsubscribe please email
>> [email protected]
>> _______________________________________________
> 