Re: [tesseract-ocr] Trained font - always one letter wrong

2018-05-01 Thread dave . hardy
Training doesn't work. If I use the characters "ä, ö, ü" (which I need) in my training text, text2image says "WARNING: illegal UTF8 encountered" and then creates an incorrect box/tif pair. This does not seem to depend on my font, because the same thing happens with Arial. Can you help me to avoid
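
One possible cause of this warning, offered here only as an assumption since the message does not say how the training text was saved, is that the file is stored in a single-byte encoding such as Windows-1252 instead of UTF-8. The sketch below checks whether a byte string is valid UTF-8 and re-encodes it otherwise. The helper name to_utf8 and the cp1252 fallback are hypothetical choices for illustration, not part of Tesseract.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode it.

    The fallback encoding is an assumption; use whatever encoding the
    training text was actually saved in.
    """
    try:
        raw.decode("utf-8")  # raises if the bytes are not valid UTF-8
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


# Umlauts saved as Windows-1252 are single bytes that are not valid UTF-8.
sample = "ä, ö, ü".encode("cp1252")
fixed = to_utf8(sample)
print(fixed.decode("utf-8"))
```

The same function can be applied to the bytes of a training text file (read with open(path, "rb")) before the file is passed to text2image.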

Re: [tesseract-ocr] Do I need to call Init before every rectangle?

2018-05-01 Thread Ben Rogall
Thank you shree. That did solve the issue. I wonder why this is the default behavior. On Tuesday, May 1, 2018 at 3:30:06 AM UTC-5, shree wrote: > > See >

Re: [tesseract-ocr] Do I need to call Init before every rectangle?

2018-05-01 Thread ShreeDevi Kumar
See https://github.com/tesseract-ocr/tesseract/wiki/FAQ#there-are-inconsistent-results-from-tesseract-when-the-same-tessbaseapi-object-is-used-for-decoding-multiple-images On Tue 1 May, 2018, 12:53 PM Ben Rogall, wrote: > > I am using the baseapi to OCR a large number of

[tesseract-ocr] Do I need to call Init before every rectangle?

2018-05-01 Thread Ben Rogall
I am using the baseapi to OCR a large number of small text images, most of which just have a few digits. If I call End() and Init() after every image, the results are basically perfect. If I just delete the char string and go on to the next image the results are much worse, with extra