Hello Mr. Rao,

Could you please send me the traineddata file? I am stuck.
I am also stuck on another problem: Tesseract reads the text, but the output the engine writes to the text file is a run of special characters. I am unable to get the Tesseract engine to write the output in Kannada when viewed in Notepad. As far as I can see, the output is being written on the basis of shape, but it needs to be written in Kannada. That is the point where I am stuck. Please help.

On Sun, May 27, 2012 at 9:51 PM, mns_rao <[email protected]> wrote:
>
> On May 22, 3:53 pm, sri1683 <[email protected]> wrote:
> > Hi Taha,
> >
> > Thanks for the suggestion.
> > I used 6 tif images for training.
> > That is what led me to think that the traineddata file should be
> > bigger.
> >
> > On May 22, 3:35 pm, Taha Alasli <[email protected]> wrote:
> >
> > > I think the size of the traineddata file depends on the tiff/box
> > > files you used.
> > >
> > > On 22 May 2012 12:27, sri1683 <[email protected]> wrote:
> > >
> > > > Thanks a lot.
> > > > That was very helpful.
> > > > I could create the traineddata file.
> > > >
> > > > I am training Tesseract 3.00, not 2.00 as mentioned in the link
> > > > you gave me.
> > > >
> > > > However, I am getting a blank text file after training.
> > > > One more interesting point is that the traineddata file for
> > > > Kannada is smaller than the one for English. Since English has
> > > > far fewer characters than Kannada, I assumed that the Kannada
> > > > traineddata file should be larger.
> > > > Please help me understand this.
> > > >
> > > > I also faced a problem with the font properties, as I am unable
> > > > to find the actual font details for this language.
> > > >
> > > > On May 21, 5:46 am, Dakshika Jayathilaka <[email protected]> wrote:
> > > > > Try this:
> > > > > http://vannait.blogspot.com/2009/06/how-to-train-tesseract-ocr.html
> > > > >
> > > > > On Friday, May 18, 2012 8:14:24 PM UTC+5:30, sri1683 wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > This is my first time posting a question. I don't know exactly
> > > > > > how to do it, but I will still try my luck. Please correct me
> > > > > > if I am not clear in what I say.
> > > > > > I have seen Tesseract-OCR working for the English language and
> > > > > > was fascinated by it. Now I want to train it to read Kannada
> > > > > > text, but I do not know how to do it. Has anyone tried this
> > > > > > before? If so, please help me, as I don't know how to go about
> > > > > > training the OCR.
> > > > > >
> > > > > > Thanks in advance.
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To post to this group, send email to [email protected]
> > > > To unsubscribe from this group, send email to
> > > > [email protected]
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/tesseract-ocr?hl=en
>
> Hi,
> Some of us working on a Kannada language file have produced a
> traineddata file that is fairly good, though of course it needs
> post-processing. We can send it by mail to anyone interested.
> Thanks,
> MNS Rao

--
Regards,
Sri

