Hi Sibi! Did you manage to understand how Tesseract works? I have installed the debugging tools, but I cannot follow the step-by-step process that Tesseract applies to each input. I want to understand how it works so that I can train Tesseract on new fonts efficiently.
On Sunday, October 12, 2014 at 12:58:29 AM UTC-4, sibi kanagaraj wrote:
>
> Hi Zdenko,
>
> You nailed it.
>
> Here is what I did initially.
>
> Following the instructions at
> https://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging
> I:
> 1. Downloaded the 2 jar files
> 2. Created a new folder in tesseract-ocr, which was under /usr/share
> 3. So I now have /usr/share/tesseract-ocr/java
> 4. Changed into the java directory
> 5. Ran the command make ScrollView.jar
> //
> sibi@Sibi:/usr/share/tesseract-ocr/java$ ls
> piccolo2d-core-3.0.jar piccolo2d-extras-3.0.jar
> sibi@Sibi:/usr/share/tesseract-ocr/java$ make ScrollView.jar
> make: *** No rule to make target `ScrollView.jar'. Stop.
> //
>
> Later on I cloned the source:
> https://code.google.com/p/tesseract-ocr/source/checkout
>
> That should probably help, as I see a java folder inside tesseract-ocr.
>
> I have another question, which is probably outside the scope of this
> one, so I will create another thread and link it if necessary.
>
> On Thursday, August 28, 2014 3:13:46 AM UTC+5:30, zdenop wrote:
>>
>> Yes, you will need to download the tesseract source code and configure it:
>> ./autogen.sh && ./configure
>> then "make ScrollView.jar" should work for you.
>>
>> Zdenko
>>
>> On Wed, Aug 27, 2014 at 4:28 PM, sibi kanagaraj wrote:
>>
>>> Hello Zdenko,
>>>
>>> Sorry for my late (very late) response. Initially I was working with
>>> Fedora 19; now I have switched to Ubuntu.
>>>
>>> After the switch:
>>>
>>> 1. I installed Tesseract using
>>>
>>> sudo apt-get install tesseract-ocr
>>>
>>> 2. Then I downloaded two jar files (piccolo2d-core-3.0.jar and
>>> piccolo2d-extras-3.0.jar). Using Nautilus, I moved them to
>>> tesseract/java.
>>>
>>> 3. After that I changed into the java directory
>>>
>>> 4. Then I ran make ScrollView.jar
>>>
>>> 5. It gave me this error:
>>>
>>> make: *** No rule to make target `ScrollView.jar'. Stop.
>>>
>>> *Extra information:*
>>> In the tesseract.spec file, I see:
>>>
>>> Name: tesseract
>>> Version: 3.00
>>> Release: 1%{?dist}
>>> Summary: Raw Open source OCR Engine
>>>
>>> If I need to download tesseract-ocr-3.02.02.tar.gz
>>> <https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.02.02.tar.gz&can=2&q=>
>>> and build it along with leptonica, I am ready to do that.
>>>
>>> On Sunday, August 3, 2014 1:53:41 AM UTC+5:30, zdenop wrote:
>>>
>>>> You need to provide more information... What version of tesseract do
>>>> you use? How did you configure tesseract? etc.
>>>>
>>>> Zdenko
>>>>
>>>> On Thu, Jul 31, 2014 at 8:33 AM, sibi kanagaraj wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is regarding the debugging process.
>>>>>
>>>>> I have followed the steps given here.
>>>>>
>>>>> ///
>>>>> "
>>>>> Building and installing
>>>>> *On Linux:*
>>>>>
>>>>> - Copy piccolo2d-core-3.0.jar and piccolo2d-extras-3.0.jar to
>>>>> tesseract/java.
>>>>> - cd java
>>>>> - make ScrollView.jar
>>>>> - Set the SCROLLVIEW_PATH environment variable to point to your
>>>>> java directory containing all 3 jar files."
>>>>>
>>>>> ///
>>>>>
>>>>> The problem I am facing is this:
>>>>>
>>>>> ////////////////////////////////////////////////////////////////////
>>>>> [root@localhost java]# make ScrollView.jar
>>>>> make: *** No rule to make target `ScrollView.jar'. Stop
>>>>> ////////////////////////////////////////////////////////////////////
>>>>> What am I supposed to do at this point to resolve it?
>>>>>
>>>>> - Sibi
>>>>>
>>>>> On Friday, July 25, 2014 8:17:53 PM UTC+5:30, sibi kanagaraj wrote:
>>>>>>
>>>>>> Hi Zdenko,
>>>>>>
>>>>>> Thank you for the reply. I will check them and post back the
>>>>>> results and questions. I am testing the engine for Tamil.
>>>>>> If I am able to see how each module works, it would be a great help
>>>>>> to me in removing ambiguities and working on it.
>>>>>>
>>>>>> Once again, thank you for the wonderful links.
>>>>>>
>>>>>> -Sibi
>>>>>>
>>>>>> On Friday, July 25, 2014 12:45:46 PM UTC+5:30, zdenop wrote:
>>>>>>>
>>>>>>> Have a look at the ViewerDebugging wiki [1], Dmitri Silaev's blog [2], and
>>>>>>> maybe the slides from the tutorial on Tesseract presented at DAS2014 [3].
>>>>>>>
>>>>>>> [1] https://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging
>>>>>>> [2] http://rdaemons.blogspot.sk/2012/06/tesseract-ocr-interactive-debugging.html
>>>>>>> [3] https://drive.google.com/file/d/0B7l10Bj_LprhbUlIUFlCdGtDYkE/edit?usp=sharing
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>> On Fri, Jul 25, 2014 at 7:08 AM, sibi kanagaraj wrote:
>>>>>>>
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> I would like to see how tesseract works: say, how line
>>>>>>>> segmentation happens, how word recognition and classification happen,
>>>>>>>> etc. How am I supposed to "see" it? The "tesseract" command with input
>>>>>>>> and output files gives me output, but I need to see step-by-step
>>>>>>>> execution; in short, a debugging process with various watch points.
>>>>>>>> Please let me know. I am more eager to learn the engine than to get
>>>>>>>> the stand-alone output.
>>>>>>>>
>>>>>>>> -Sibi
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/0c2fc40f-d85c-4f3c-9dba-1769b7680084%40googlegroups.com
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
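For anyone who hits the same "No rule to make target" error, here is a minimal shell sketch of the checks implied by the thread. It assumes the error means the java directory has no Makefile (as in a hand-made folder under /usr/share that holds only the two piccolo2d jars), and that a source tree configured with ./autogen.sh && ./configure does have one. The function names check_scrollview_dir and viewer_ready are hypothetical helpers written for this note; they are not part of Tesseract.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not Tesseract tooling.

# Print one word describing whether "make ScrollView.jar" can work in $1.
check_scrollview_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing-dir"
        return 1
    fi
    # Assumption: without a Makefile, make has no rule for ScrollView.jar,
    # which matches the error quoted in the thread.
    if [ ! -f "$dir/Makefile" ]; then
        echo "no-makefile"
        return 1
    fi
    # The wiki steps quoted in the thread require both piccolo2d jars here.
    for jar in piccolo2d-core-3.0.jar piccolo2d-extras-3.0.jar; do
        if [ ! -f "$dir/$jar" ]; then
            echo "missing-$jar"
            return 1
        fi
    done
    echo "ready"
}

# Succeed only when all 3 jar files exist in $1, i.e. when
# SCROLLVIEW_PATH can safely point at that directory.
viewer_ready() {
    dir=$1
    for jar in ScrollView.jar piccolo2d-core-3.0.jar piccolo2d-extras-3.0.jar; do
        [ -f "$dir/$jar" ] || return 1
    done
    return 0
}

# Possible usage from the top of a tesseract source checkout:
#   ./autogen.sh && ./configure
#   check_scrollview_dir java && (cd java && make ScrollView.jar)
#   viewer_ready java && export SCROLLVIEW_PATH="$PWD/java"
```

If check_scrollview_dir prints "no-makefile", the directory is most likely not part of a configured source tree, which is the situation Zdenko's reply addresses.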

