Re: Improving Tika OCR

Thamme Gowda Fri, 21 Apr 2017 09:43:58 -0700

Thanks, Kranthi.

Keep us informed about how it goes.


Cheers,
TG

On Thu, Apr 20, 2017 at 1:01 PM, Kranthi Kiran G V <
[email protected]> wrote:

> Hello Thamme,
>
> Agreed. Looking at the paper[1], it seems to me that tesseract and VGG
> models can co-exist
> in Tika to serve all kinds of input images.
>
> I am able to run one of the models, Deep Features for Text Spotting[2], by
> disabling the GPU.
> However, it doesn't generate any text, only features. The initial
> assumption that the MATLAB version was causing the issue is thus proven
> wrong.
> The problem lies with the MatConvNet that is bundled with the models. It
> is a very old version that doesn't even resemble the current structure.
> I'm having trouble building it on my system for the other model,
> Synthetic Data and Artificial Neural Networks for Natural Scene Text
> Recognition[1].
> Note that both of them are supplied with custom versions of MatConvNet.
>
> Nevertheless, we can make the system work with the latest version of
> MatConvNet by rebuilding the network layer by layer from the MAT
> file[3]. I would like to hear your views on whether or not I should
> attempt it.
>
>
> Thank you,
> Kranthi Kiran GV,
> CS 3/4 Undergrad,
> NIT Warangal
>
>
>
> [1] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/jaderberg14c.pdf
> [2] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14/jaderberg14.pdf.pdf
> [3] https://github.com/vlfeat/matconvnet/issues/239
>
> On Wed, Apr 19, 2017 at 10:42 PM, Thamme Gowda <[email protected]>
> wrote:
>
>> Hi Kranthi,
>>
>> Thanks for updating us.
>> I believe that in the long run both models may co-exist
>> (tesseract for flat-bed scanner images with perfect lighting conditions,
>> VGG models for natural images taken by cellphone/digital cameras with weird
>> orientations and lighting conditions).
>>
>> I agree with you, we can offer VGG OCR as an optional REST API and let
>> users agree to its license if they want to use it. Thanks Luis for the
>> feedback :-)
>>
>> Keep up the good work and keep this email thread updated with your
>> findings.
>>
>> Thanks,
>> TG
>>
>> *--*
>> *Thamme Gowda*
>> TG | @thammegowda <https://twitter.com/thammegowda>
>> ~Sent via somebody's Webmail server!
>>
>> On Wed, Apr 19, 2017 at 6:12 AM, Kranthi Kiran G V <
>> [email protected]> wrote:
>>
>>> Hello community,
>>> I have successfully tested Tesseract 4.0 on various images of different
>>> sizes, orientations and lighting conditions. In the next few days I will
>>> publish the results on a blog for you to have a look at.
>>>
>>> Although I'm able to reliably measure the clock time, accuracy, etc, I am
>>> not able to come up with a method
>>> to reliably measure the memory consumed. Any pointers on this from the
>>> developer community would be
>>> appreciated.
>>>
>>> The VGG group has released two models
>>> <http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models>. I'm not
>>> able to test either of them as of now because the bundled MatConvNet
>>> is not compatible with recent versions of MATLAB, which is what I use.
>>> For now, I am trying to get around it by updating parts of the code.
>>> I'm also contacting the maintainers of the repository to help me
>>> address the issues.
>>> I'm hopeful I can get them running.
>>>
>>> Addressing Luis' concern, we won't be building VGG's models into Tika's
>>> source. We would only be helping the user deploy a REST API to which
>>> Tika's OCR subsystem passes the images and from which it retrieves the
>>> information in the form of a string.
>>>
>>> Thank you,
>>> Kranthi Kiran GV,
>>> CS 3/4 Undergrad,
>>> NIT Warangal
>>>
>>> On Tue, Apr 18, 2017 at 8:43 AM, Kranthi Kiran G V <
>>> [email protected]> wrote:
>>>
>>> > Hello Luis,
>>> > Yes, tesseract 4.0 is not yet a stable release. VGG group's model has a
>>> > 3-clause BSD license.
>>> >
>>> > I see it as a long-term effort that would give the Tika community
>>> > near state-of-the-art OCR.
>>> >
>>> > This is an investigation to see whether this direction is worth pursuing.
>>> > Thanks for expressing your views.
>>> >
>>> > Thank you,
>>> > Kranthi Kiran GV
>>> >
>>> > On Apr 18, 2017 2:44 AM, "Luís Filipe Nassif" <[email protected]> wrote:
>>> >
>>> > Hi Kranthi,
>>> >
>>> > That is an interesting comparison! But I think Tesseract 4.0 is still
>>> > alpha? And do you know the VGG software license?
>>> >
>>> > Best,
>>> > Luis
>>> >
>>> > On Apr 17, 2017 8:46 AM, "Kranthi Kiran G V" <
>>> > [email protected]> wrote:
>>> >
>>> > Hello Tim Allison,
>>> >
>>> > I am currently working on improving Tika's OCR capabilities.
>>> > After a suggestion from Thamme Gowda (@thammegowda
>>> > <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thammegowda>),
>>> > I started working on a comparison of Tesseract 4.0's neural network
>>> > <https://github.com/tesseract-ocr/tesseract/wiki/NeuralNetsInTesseract4.00>
>>> > subsystem and the Visual Geometry Group's (VGG) models
>>> > <http://www.robots.ox.ac.uk/~vgg/research/text/>.
>>> >
>>> > It would be great if you could provide the dataset for testing the OCR
>>> > that you mentioned in one of the issues.
>>> >
>>> > I will compare their running time for evaluation, accuracy, memory
>>> > consumption, and invariance to lighting, orientation, etc., and then
>>> > integrate the appropriate models into Tika's OCR.
>>> >
>>> > Thank you,
>>> > Kranthi Kiran GV,
>>> > CS 3/4 Undergrad,
>>> > NIT Warangal
>>> >
>>> >
>>> >
>>>
>>
>>
>
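On the question raised earlier in the thread about reliably measuring the memory consumed by an OCR run: one possible approach, sketched below under several assumptions, is to launch the OCR engine as a child process and read its peak resident set size from the operating system once it exits. The sketch uses Python's standard resource module, which is Unix-only. The helper name run_and_measure and the input file sample.png are hypothetical, and this is not necessarily how the measurements in this thread were (or would be) taken.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd as a child process; return (wall_seconds, peak_rss_kib)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall_seconds = time.perf_counter() - start

    # ru_maxrss is the largest peak RSS among all children waited for so far,
    # so take one measurement per Python process to avoid masking.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes.
    peak_kib = peak / 1024 if sys.platform == "darwin" else peak
    return wall_seconds, peak_kib


if __name__ == "__main__":
    # Hypothetical invocation: "sample.png" is a placeholder input image.
    wall, peak_kib = run_and_measure(["tesseract", "sample.png", "stdout"])
    print("wall: %.2f s, peak RSS: %.1f MiB" % (wall, peak_kib / 1024))
```

Peak RSS is only one notion of "memory consumed"; it ignores swapped-out pages and counts shared libraries, so comparing engines this way is best treated as a rough indicator.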

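The REST arrangement described in the thread (Tika's OCR subsystem sends an image to a separately deployed service and gets a string back) can be illustrated with a small round trip. This is only a sketch under stated assumptions: the /ocr path, the StubOcrHandler class and the ocr_via_rest function are hypothetical names, the stub server does no real OCR and merely reports how many bytes it received, and Tika itself is written in Java, so nothing here reflects actual Tika code.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubOcrHandler(BaseHTTPRequestHandler):
    """Stand-in for the OCR service: reports the byte count, no real OCR."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        body = ("stub: received %d bytes" % len(image_bytes)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def ocr_via_rest(url, image_bytes):
    """POST raw image bytes to the service and return its reply as a string."""
    req = urllib.request.Request(
        url, data=image_bytes, method="POST",
        headers={"Content-Type": "application/octet-stream"})
    # Bypass any proxy settings from the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=10) as resp:
        return resp.read().decode("utf-8")


def demo(image_bytes):
    """Start the stub on a free local port, send one request, shut down."""
    server = HTTPServer(("127.0.0.1", 0), StubOcrHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/ocr" % server.server_address[1]
        return ocr_via_rest(url, image_bytes)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo(b"not a real image"))
```

Keeping the model behind an HTTP boundary like this is what lets the licensed VGG code stay outside Tika's source tree: the caller only depends on the request and response format.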