On May 22, 3:53 pm, sri1683 <[email protected]> wrote:
> Hi Taha,
>
> Thanks for the suggestion.
> I used 6 TIFF images for training.
> That's what led me to think that the traineddata file should be
> bigger.
>
> On May 22, 3:35 pm, Taha Alasli <[email protected]> wrote:
> > I think the size of the traineddata file depends on the TIFF/box
> > files you used.
> >
> > On 22 May 2012 12:27, sri1683 <[email protected]> wrote:
> > > Thanks a lot.
> > > That was very helpful.
> > > I could create the traineddata file.
> > >
> > > I am training Tesseract 3.00, not 2.00 as described in the link
> > > you gave me.
> > >
> > > However, I am getting a blank text file when I run OCR after
> > > training.
> > > Another interesting point is that the traineddata file for
> > > Kannada is smaller than the one for English.
> > > Since English has far fewer characters than Kannada, I assumed
> > > the Kannada traineddata file would be larger.
> > > Please help me understand this.
> > >
> > > I also had a problem with the font properties, as I am unable to
> > > find the actual font details for this language.
> > >
> > > On May 21, 5:46 am, Dakshika Jayathilaka <[email protected]> wrote:
> > > > Try this:
> > > > http://vannait.blogspot.com/2009/06/how-to-train-tesseract-ocr.html
> > > >
> > > > On Friday, May 18, 2012 8:14:24 PM UTC+5:30, sri1683 wrote:
> > > > > Hi,
> > > > >
> > > > > This is my first time posting a question. I don't exactly know
> > > > > how to do it, but I will still try my luck. Please correct me
> > > > > if I am not clear in what I say.
> > > > > I have seen Tesseract-OCR working for the English language
> > > > > and was fascinated by it. Now I want to train it to read
> > > > > Kannada text, but I do not know how to do it. Has anyone tried
> > > > > this before? If so, please help me, as I don't know how to go
> > > > > about training the OCR.
> > > > >
> > > > > Thanks in advance.
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "tesseract-ocr" group.
> > > To post to this group, send email to [email protected]
> > > To unsubscribe from this group, send email to
> > > [email protected]
> > > For more options, visit this group at
> > > http://groups.google.com/group/tesseract-ocr?hl=en
Hi,

Some of us working on Kannada language support have produced a traineddata file that is fairly good, though of course the output still needs post-processing. I can send it by mail to anyone interested.

Thanks,
MNS Rao

