Hi! Thanks for the reply. I found the hOCR command-line option, which outputs the coordinates of the words. But which config file in tessdata/configs for Tesseract version 3.02 should I modify to get character confidence output? Thanks!
On Tuesday, November 10, 2015 at 11:05:34 PM UTC+8, Daniel Kraft wrote:
>
> Hi!
>
> On 2015-11-10 14:04, Chang Alden wrote:
> > I think resizing the bitmap might not be the best way to solve the
> > spacing problem. Are there other methods I can try out?
>
> I'm by no means a tesseract or OCR expert, but I've been experiencing
> spacing issues myself (in my case, with columns of numbers).
>
> For me, it worked very well to rotate failing images a few degrees. My
> data contains checksums, so that I can determine automatically if a
> recognition was correct or not; applying various rotations until it
> succeeds works very well to improve my recognition rate significantly
> (including for spaces). Not sure if that's an option for you, though.
>
> Yours,
> Daniel
>
> --
> http://www.domob.eu/
> OpenPGP: 1142 850E 6DFF 65BA 63D6 88A8 B249 2AC4 A733 0737
> Namecoin: id/domob -> https://nameid.org/?name=domob
> --
> Done: Arc-Bar-Cav-Hea-Kni-Ran-Rog-Sam-Tou-Val-Wiz
> To go: Mon-Pri
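
For reference, the rotate-and-retry idea Daniel describes could be sketched roughly as below. This is only an illustration, not his actual code: the function name `recognize_with_rotations`, the default angle list, and the three callables (`rotate`, `recognize`, `is_valid`) are all assumptions made up for the example. It only helps when there is some automatic check, such as a checksum, that can tell a good result from a bad one.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Img = TypeVar("Img")


def recognize_with_rotations(
    image: Img,
    rotate: Callable[[Img, float], Img],
    recognize: Callable[[Img], str],
    is_valid: Callable[[str], bool],
    angles: Iterable[float] = (0, 1, -1, 2, -2, 3, -3),
) -> Optional[Tuple[float, str]]:
    """Try OCR at several small rotations and return the first valid result.

    All parameters are hypothetical hooks supplied by the caller:
      rotate(image, degrees) -> rotated copy of the image
      recognize(image)       -> recognized text
      is_valid(text)         -> True if the text passes the check (e.g. a checksum)
    Returns (angle, text) for the first angle that validates, or None.
    """
    for angle in angles:
        # Angle 0 means "use the image as-is", so skip the rotation call.
        candidate = image if angle == 0 else rotate(image, angle)
        text = recognize(candidate)
        if is_valid(text):
            return angle, text
    return None
```

One possible wiring, assuming Pillow and pytesseract are installed, would be `rotate=lambda img, deg: img.rotate(deg, expand=True)` and `recognize=pytesseract.image_to_string`, with `is_valid` being whatever checksum test fits the data.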

