Re: Improving Tika OCR

Thamme Gowda Wed, 19 Apr 2017 10:13:26 -0700

Hi Kranthi,

Thanks for updating us.
I believe that in the long run both models may co-exist (Tesseract
for flatbed-scanner images with ideal lighting conditions, VGG models
for natural images taken by cellphone/digital cameras with odd
orientations and lighting conditions).


I agree with you: we can offer VGG OCR as an optional REST API and let
users accept its license if they want to use it. Thanks Luis for the
feedback :-)

Keep up the good work and keep this email thread updated with your findings.

Thanks,
TG

*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!

On Wed, Apr 19, 2017 at 6:12 AM, Kranthi Kiran G V <
[email protected]> wrote:

> Hello community,
> I have successfully tested Tesseract 4.0 on various images of different
> sizes, orientations and lighting
> conditions. In the next few days I will publish the results on a blog
> for you to have a look at.
>
> Although I'm able to reliably measure the clock time, accuracy, etc., I
> have not been able to come up with a method
> to reliably measure the memory consumed. Any pointers on this from the
> developer community would be
> appreciated.
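
On the memory question above, one possible approach on Unix-like systems is to
run the OCR engine as a child process and read its peak resident set size from
the operating system's resource accounting once it exits. The sketch below is
only an illustration under that assumption: the helper name measure_command is
hypothetical, it has not been run against Tesseract here, and it does not
cover grandchild processes or Windows.

```python
import os
import subprocess
import sys
import time


def measure_command(argv):
    """Run argv as a child process.

    Returns (exit_code, wall_seconds, peak_rss_kib). Unix only, because it
    relies on os.wait4 to obtain per-child resource usage.
    """
    start = time.perf_counter()
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # wait4 reaps this specific child and returns its rusage record.
    _, status, usage = os.wait4(proc.pid, 0)
    elapsed = time.perf_counter() - start

    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        code = -os.WTERMSIG(status)  # killed by a signal
    # Tell Popen the child was already reaped so it does not wait again.
    proc.returncode = code

    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    peak = usage.ru_maxrss
    if sys.platform == "darwin":
        peak //= 1024
    return code, elapsed, peak
```

A call such as measure_command(["tesseract", "page.png", "stdout"]) would then
give the exit code, wall-clock time and peak memory for one image, where
page.png is a placeholder file name. Peak RSS is a high-water mark, so it says
nothing about average usage over the run.
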
>
> The VGG group has released two models
> <http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models>. I'm not able
> to test either of them yet because they are not compatible with
> the MatConvNet release I have; I use a recent version of MATLAB. For now, I am
> trying to get around it by updating
> parts of the code. I'm also contacting the maintainers of the repository to
> help me address the issues.
> I'm hopeful that I can get them to run.
>
> Addressing Luis' concern, we won't be building VGG's models into Tika's
> source. We would only be helping
> the user deploy a REST API to which Tika's OCR subsystem passes the images
> and from which it retrieves the recognized text
> in the form of a string.
>
> Thank you,
> Kranthi Kiran GV,
> CS 3/4 Undergrad,
> NIT Warangal
>
> On Tue, Apr 18, 2017 at 8:43 AM, Kranthi Kiran G V <
> [email protected]> wrote:
>
> > Hello Luis,
> > Yes, Tesseract 4.0 is not yet a stable release. The VGG group's model has a
> > 3-clause BSD license.
> >
> > I see it as a long-term effort that would give the Tika community
> > near state-of-the-art OCR.
> >
> > This is an investigation to see whether this direction is worth pursuing.
> > Thanks for expressing your views.
> >
> > Thank you,
> > Kranthi Kiran GV
> >
> > On Apr 18, 2017 2:44 AM, "Luís Filipe Nassif" <[email protected]> wrote:
> >
> > Hi Kranthi,
> >
> > That is an interesting comparison! But I think Tesseract 4.0 is still
> > alpha? And do you know the VGG software license?
> >
> > Best,
> > Luis
> >
> > On Apr 17, 2017 8:46 AM, "Kranthi Kiran G V" <
> > [email protected]> wrote:
> >
> > Hello Tim Allison,
> >
> > I am currently working on improving Tika's OCR capabilities.
> > Following a suggestion from Thamme Gowda (@thammegowda
> > <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thammegowda>),
> > I started comparing Tesseract 4.0's neural network
> > <https://github.com/tesseract-ocr/tesseract/wiki/NeuralNetsInTesseract4.00>
> > subsystem with the Visual Geometry Group's (VGG) models
> > <http://www.robots.ox.ac.uk/~vgg/research/text/>.
> >
> > It would be great if you could provide the dataset for testing the OCR
> > that you mentioned in one of the issues.
> >
> > I will compare their running time, accuracy, memory consumption
> > and invariance to lighting, orientation, etc. I will then
> > integrate the appropriate models into Tika's OCR.
> >
> > Thank you,
> > Kranthi Kiran GV,
> > CS 3/4 Undergrad,
> > NIT Warangal
> >
> >
> >
>
