Thanks, Kranthi. Keep us informed about how it goes.
Cheers,
TG

On Thu, Apr 20, 2017 at 1:01 PM, Kranthi Kiran G V <[email protected]> wrote:

> Hello Thamme,
>
> Agreed. Looking at the paper[1], it seems to me that tesseract and VGG
> models can co-exist in Tika to serve all kinds of input images.
>
> I am able to run one of the models, Deep Features for Text Spotting[2], by
> disabling the GPU. It doesn't generate any text, however; it generates only
> features. The initial assumption that the MATLAB version was causing the
> issue is thus proven wrong.
> The problem lies with the MatConvNet that is bundled with the models. It
> is a very old version which doesn't even resemble the current structure.
> I'm having problems building it on my system for the other model, Synthetic
> Data and Artificial Neural Networks for Natural Scene Text Recognition[1].
> Note that both of them are supplied with custom versions of MatConvNet.
>
> Nevertheless, we can build the system to use the latest version of
> MatConvNet by building it layer by layer, looking at the MAT file[3]. I want
> to hear your views on whether or not I should attempt it.
>
> Thank you,
> Kranthi Kiran GV,
> CS 3/4 Undergrad,
> NIT Warangal
>
> [1] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/jaderberg14c.pdf
> [2] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14/jaderberg14.pdf.pdf
> [3] https://github.com/vlfeat/matconvnet/issues/239
>
> On Wed, Apr 19, 2017 at 10:42 PM, Thamme Gowda <[email protected]> wrote:
>
>> Hi Kranthi,
>>
>> Thanks for updating us.
>> I believe in the long run both of these two models may co-exist
>> (tesseract for flatbed scanner images with perfect lighting conditions,
>> VGG models for natural images taken by cellphone/digital cameras with weird
>> orientations and lighting conditions).
>>
>> I agree with you, we can make VGG OCR an optional REST API and allow
>> users to agree to its license if they want to use it.
>> Thanks Luis for the feedback :-)
>>
>> Keep up the good work and keep this email thread updated with your
>> findings.
>>
>> Thanks,
>> TG
>>
>> *--*
>> *Thamme Gowda*
>> TG | @thammegowda <https://twitter.com/thammegowda>
>> ~Sent via somebody's Webmail server!
>>
>> On Wed, Apr 19, 2017 at 6:12 AM, Kranthi Kiran G V <[email protected]> wrote:
>>
>>> Hello community,
>>> I have successfully tested Tesseract 4.0 on various images of different
>>> sizes, orientations and lighting conditions. In the next few days I will
>>> publish the results on a blog for you to have a look at.
>>>
>>> Although I'm able to reliably measure the clock time, accuracy, etc., I am
>>> not able to come up with a method to reliably measure the memory consumed.
>>> Any pointers on this from the developer community would be appreciated.
>>>
>>> The VGG group has released two models
>>> <http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models>. I'm not able
>>> to test any of them as of now due to the lack of backward compatibility
>>> with the MatConvNet used. I use a recent version of MATLAB. As of now, I am
>>> trying to get around it by updating parts of the code. I'm also contacting
>>> the maintainers of the repository to help me address the issues.
>>> I'm hopeful I can get them running.
>>>
>>> Addressing Luis' concern, we won't be building VGG's models into Tika's
>>> source. We would only be helping the user deploy a REST API to which Tika's
>>> OCR subsystem passes the images and from which it retrieves the information
>>> in the form of a string.
>>>
>>> Thank you,
>>> Kranthi Kiran GV,
>>> CS 3/4 Undergrad,
>>> NIT Warangal
>>>
>>> On Tue, Apr 18, 2017 at 8:43 AM, Kranthi Kiran G V <[email protected]> wrote:
>>>
>>> > Hello Luis,
>>> > Yes, tesseract 4.0 is not yet a stable release. VGG group's model has a
>>> > 3-clause BSD license.
>>> >
>>> > I see it as a long-term effort which would help the Tika community
>>> > experience near state-of-the-art OCR.
>>> >
>>> > This is an investigation to see if we can try out this direction.
>>> > Thanks for expressing your views.
>>> >
>>> > Thank you,
>>> > Kranthi Kiran GV
>>> >
>>> > On Apr 18, 2017 2:44 AM, "Luís Filipe Nassif" <[email protected]> wrote:
>>> >
>>> > Hi Kranthi,
>>> >
>>> > That is an interesting comparison! But I think Tesseract 4.0 is still
>>> > alpha? And do you know the VGG software license?
>>> >
>>> > Best,
>>> > Luis
>>> >
>>> > On Apr 17, 2017 8:46 AM, "Kranthi Kiran G V" <[email protected]> wrote:
>>> >
>>> > Hello Tim Allison,
>>> >
>>> > I am currently working on improving Tika's OCR capabilities.
>>> > Following a suggestion from Thamme Gowda (@thammegowda
>>> > <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=thammegowda>),
>>> > I started working on a comparison of Tesseract 4.0's neural network
>>> > subsystem
>>> > <https://github.com/tesseract-ocr/tesseract/wiki/NeuralNetsInTesseract4.00>
>>> > and the Visual Geometry Group's (VGG) models
>>> > <http://www.robots.ox.ac.uk/~vgg/research/text/>.
>>> >
>>> > It would be great if you could provide the dataset to test the OCR, as
>>> > you mentioned in one of the issues.
>>> >
>>> > I will be comparing their running time for evaluation, accuracy, memory
>>> > consumed and invariance to lighting, orientation, etc. And then I will be
>>> > integrating the appropriate models into Tika's OCR.
>>> >
>>> > Thank you,
>>> > Kranthi Kiran GV,
>>> > CS 3/4 Undergrad,
>>> > NIT Warangal
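
On the question raised in the thread about reliably measuring the memory
consumed by an OCR run: one possible approach, sketched here and not taken from
the thread, is to run the OCR engine as a child process and read the peak
resident set size that the operating system records for it. The sketch assumes
a Unix-like system (Python's resource module is not available on Windows), and
the helper name measure_command and the image path in the comment are
hypothetical.

```python
import resource
import subprocess
import sys
import time


def measure_command(cmd):
    """Run cmd to completion and return (wall_seconds, peak_rss).

    peak_rss is ru_maxrss for waited-for child processes: kilobytes on
    Linux, bytes on macOS. It is a high-water mark over ALL children this
    Python process has waited for, so use a fresh Python process per
    measurement to avoid reporting an earlier, larger child.
    """
    start = time.perf_counter()
    # Discard the child's output; only time and memory are of interest here.
    subprocess.run(
        cmd,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    elapsed = time.perf_counter() - start
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_rss


# Hypothetical usage, assuming the tesseract CLI is on PATH and
# sample.png exists:
#   elapsed, peak = measure_command(["tesseract", "sample.png", "stdout"])
#   print(f"{elapsed:.2f} s, peak RSS {peak}")
```

Peak RSS covers only the memory of the child process itself, so a model served
from a separate process (such as a REST service) would have to be measured on
the server side instead.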
