Karen,

My OCR performance has improved thanks to your advice and also
by specifying the dictionary correctly. One of the problems was that
when I changed the dictionary, Matterhorn didn't load it until I
deleted the location where Matterhorn stores its data and restarted
it. I think there must be a better way to reload the dictionaries,
but this works for me, and doing it this way is not a problem in my case.

Regards,
Miguel


2012/4/9 Miguel Del Agua <[email protected]>:
> Thanks for your suggestions, I'll try as soon as possible.
>
> 2012/4/5 Tobias Wunden <[email protected]>:
>> Hi Matjaz,
>>
>> Is this something you could potentially take a look at together with Karen? 
>> It seems that her suggestions have quite an impact on OCR performance. I've
>> just opened a ticket, MH-8701.
>>
>> Thanks,
>> Tobias
>>
>> On 03.04.2012, at 19:51, Karen Dolan <[email protected]> wrote:
>>
>>> Miguel,
>>>
>>> Matterhorn trunk (from a couple weeks ago) was configured to pull down 
>>> Leptonica 1.66 and Tesseract 3.00.
>>> I went and retrieved Leptonica 1.67 and Tesseract 3.01 directly, along with 
>>> the latest Tesseract English dictionary (Reference: 
>>> http://code.google.com/p/tesseract-ocr/wiki/ReadMe).
>>>
>>> The text extraction is now much better than it was a few months ago.
>>>
>>> Good luck!
>>> Karen
>>>
>>>
>>>
>>> On 4/3/2012 11:29 AM, Miguel Del Agua wrote:
>>>> Hi,
>>>>
>>>> I just installed version 1.3 and it seems to work correctly, but the OCR
>>>> performance is quite poor. I've tried to install a new dictionary as
>>>> described in the wiki, but the performance is still bad. So I would like
>>>> to know if it's possible to improve text recognition, either by
>>>> changing some parameters of OCRopus or by improving the dictionary
>>>> in some way.
>>>>
>>>> Thanks in advance.
>>>> _______________________________________________
>>>> Matterhorn mailing list
>>>> [email protected]
>>>> http://lists.opencastproject.org/mailman/listinfo/matterhorn
>>>>
>>>>
>>>> To unsubscribe please email
>>>> [email protected]
>>>> _______________________________________________
>>>
