On Sunday, January 20, 2013 12:14:09 PM UTC-6, [email protected] wrote:
>
> Hi,
>
> I am working on this too. Although accuracy is high, I am facing two
> problems.
>
> 1) Tesseract is not able to recognize two letters that run into each
> other, like capital "T" and "h", or capital "F" and "i".
>
According to the Training Wiki, "if the pair is common, put both characters at the start of the line, leaving the bounding box to represent them both".

> 2) Tesseract is skipping some lines in between.
>
> I have attached the box file, training image file, traineddata file, image, and
> its output.
>
> It would be very helpful if I could get any suggestions here.
>
> Thanks in advance.
>
> On Friday, January 18, 2013 9:32:35 AM UTC+5:30, Quan Nguyen wrote:
>>
>> Boxes look overlapping. You may want to space them out a bit more.
>>
>> On Thursday, January 17, 2013 10:33:13 AM UTC-6, Tauqeer baig wrote:
>>>
>>> I am trying to train data for the fonts Dartangnon-ITC and Rage Italics but
>>> am not getting accurate output. It's not even 50% accurate. I have attached
>>> image files (tif), the traineddata file, and box files for this.
>>>
>>> What steps will reproduce the problem?
>>> 1. Train data for Rage Italics
>>> 2. Run tesseract 3.02 to produce output
>>>
>>> What is the expected output? What do you see instead?
>>> The output data generated is not accurate.
>>>
>>> What version of the product are you using? On what operating system?
>>> tesseract 3.02. I am using it on Windows XP.
>>>
>>> Please provide any additional information below.
>>> I tried to train tesseract several times but unfortunately I never got
>>> accurate output.
>>> I am training data for the Rage Italics and Dartangnon fonts. I am
>>> attaching my traineddata file along with the file
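
For anyone who wants to try the Training Wiki suggestion on an existing box file, here is a rough Python sketch. It assumes the usual 3.0x box line layout of `glyph left bottom right top page`, with coordinates measured from the bottom-left corner of the image. The helper names (`parse_box_line`, `merge_touching_pairs`, and so on) and the `COMMON_PAIRS` set are made up for illustration, and treating "touching" as "the two rectangles intersect" is my own assumption, not something the wiki prescribes. It only rewrites box lines; it does not touch the rest of the training pipeline.

```python
from dataclasses import dataclass


@dataclass
class Box:
    glyph: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    # Split from the right so the glyph field is whatever precedes the 5 numbers.
    glyph, left, bottom, right, top, page = line.strip().rsplit(None, 5)
    return Box(glyph, int(left), int(bottom), int(right), int(top), int(page))


def format_box(box: Box) -> str:
    return f"{box.glyph} {box.left} {box.bottom} {box.right} {box.top} {box.page}"


def overlaps(a: Box, b: Box) -> bool:
    # Assumption: two boxes "touch" when their rectangles intersect on the same page.
    return (a.page == b.page
            and a.left < b.right and b.left < a.right
            and a.bottom < b.top and b.bottom < a.top)


def merge_boxes(a: Box, b: Box) -> Box:
    # Both characters go at the start of the line; one box covers them both.
    return Box(a.glyph + b.glyph,
               min(a.left, b.left), min(a.bottom, b.bottom),
               max(a.right, b.right), max(a.top, b.top),
               a.page)


# Hypothetical list of pairs to merge; adjust it for the font being trained.
COMMON_PAIRS = {"Th", "Fi"}


def merge_touching_pairs(boxes, pairs=COMMON_PAIRS):
    merged = []
    i = 0
    while i < len(boxes):
        nxt = boxes[i + 1] if i + 1 < len(boxes) else None
        if nxt and boxes[i].glyph + nxt.glyph in pairs and overlaps(boxes[i], nxt):
            merged.append(merge_boxes(boxes[i], nxt))
            i += 2  # skip the second half of the pair
        else:
            merged.append(boxes[i])
            i += 1
    return merged


if __name__ == "__main__":
    sample = ["T 10 20 30 50 0", "h 28 20 45 52 0", "e 47 20 60 40 0"]
    for box in merge_touching_pairs([parse_box_line(s) for s in sample]):
        print(format_box(box))
```

With the made-up sample lines above, the overlapping "T" and "h" boxes collapse into a single `Th` line and the "e" box is left alone. The same `overlaps` check can also be run over every neighbouring pair to list boxes that overlap but are not meant to be merged, which are the ones to space out by hand.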

