Re: [Matterhorn-users] How to improve OCR performance

Tobias Wunden Thu, 12 Apr 2012 12:35:10 -0700

Hi Ruben,

would you mind sharing some details around the bugs you found and the 
improvements you are about to suggest? Maybe attach a patch to an open ticket?


Thanks,
Tobias

On 12.04.2012, at 12:40, Rubén Pérez <[email protected]> wrote:

> Dear all,
> 
> We are currently struggling with the text extraction, too, and we are seeing 
> that Matterhorn is a little anglo-centric and does not handle words with 
> characters outside the [a-zA-Z_0-9] range. We are working on some fixes 
> (partially thanks to Karen's advice, thanks!), but some of them involve 
> changing Java code and revisiting design decisions that can be regarded as 
> bugs. We want to test this thoroughly and perhaps we'll submit the changes 
> for the 1.4 version, since this wouldn't be a new feature but a correction 
> of something that is already in. 
> 
> Best regards
> 
> 2012/4/12 Miguel Del Agua <[email protected]>
> Thank you very much, but in my case the captures seem to be OK. Anyway,
> the problem was due to the versions of some third-party tools, and also
> to incorrect dictionary loading. More info:
> http://opencast.3480289.n2.nabble.com/How-to-improve-OCR-performance-tp7433198p7458735.html
> 
> Regards,
> 
> Miguel
> 
> 
> 2012/4/5 費納德費納德 <[email protected]>:
> > Hello Miguel,
> >
> > Take a look at the captures the workflow gets from the video. In my case I
> > got gray-pattern captures in 90% of the cases, so the OCR was able to
> > recognize almost nothing. I solved it by reinstalling ffmpeg and all the
> > dependent packages. Now the OCR works almost perfectly. But I have an issue
> > with the ffmpeg version: with recordings longer than 5 min I get errors
> > during the video and audio mux. (With version 1.2 I didn't get these errors,
> > only with 1.3. Maybe I installed something in a different way.)
> >
> > So I am not sure if you have this problem with the OCR but it is possible.
> >
> >
> > Regards,
> >
> > Fernando Hernández Esguevillas.
> >
> > PS: If you have any questions about how to install the latest version of
> > ffmpeg, let me know and I'll send you a link, although the information is
> > easy to find on Google. Regards.
> >
> > On 4 April 2012 at 00:15, Miguel Del Agua <[email protected]>
> > wrote:
> >>
> >> Hi,
> >>
> >> I just installed version 1.3 and it seems to work correctly, but the OCR
> >> performance is quite poor. I've tried to install a new dictionary as
> >> described in the wiki, but the performance is still bad. So I would like
> >> to know if it's possible to improve text recognition either by
> >> changing some parameters of OCRopus or by improving the dictionary in
> >> some way.
> >>
> >> Thanks in advance.
> >> _______________________________________________
> >> Matterhorn-users mailing list
> >> [email protected]
> >> http://lists.opencastproject.org/mailman/listinfo/matterhorn-users

