[tesseract-ocr] Word coordinate for single lines.

2018-06-15 Thread ahka . anyreader
Dear All, In the project I am currently working on, I have a pure text line cropped from a document image. As a next step, I need to recognize the text and, at the same time, get the word coordinates. To get those coordinates I am passing the hocr parameters to the
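
For reference, word coordinates can be read out of hOCR output once it exists. The snippet below is only a minimal sketch, not the poster's method: it assumes the hOCR was produced separately (for example with something like `tesseract line.png out --psm 7 hocr`, where `line.png` and `out` are placeholder names), and it assumes each word appears as a `span` of class `ocrx_word` whose `title` attribute carries `bbox x0 y0 x1 y1`. The helper name `words_from_hocr` and the sample markup are made up for illustration.

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collects (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span currently open, if any
        self._text = []     # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            # The title looks like "bbox 10 12 90 40; x_wconf 96".
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            if m:
                self._bbox = tuple(int(v) for v in m.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def words_from_hocr(hocr):
    """Hypothetical helper: return a list of (word, bbox) from an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr)
    return parser.words


# Hand-written sample resembling hOCR for a single text line (illustrative only).
SAMPLE_HOCR = """
<span class='ocr_line' title='bbox 10 12 300 40'>
  <span class='ocrx_word' id='word_1_1' title='bbox 10 12 90 40; x_wconf 96'>Hello</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 100 12 180 40; x_wconf 93'>world</span>
</span>
"""

if __name__ == "__main__":
    for word, bbox in words_from_hocr(SAMPLE_HOCR):
        print(word, bbox)
```

The coordinates are in pixels relative to the image that was passed to the OCR engine, so for a cropped line they would need to be offset by the crop position to map back onto the full page.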

Re: [tesseract-ocr] Tesseract 4 training related issue

2018-06-15 Thread pranaya mhatre
Yes, I am using images and box files. I tried box files both with spaces and without spaces. But when I trained Tesseract using box files with spaces, it generates spaces in only some of the test images, not all of them, and it also sometimes prints digits between spaces.

Re: [tesseract-ocr] Tesseract 4 training related issue

2018-06-15 Thread ShreeDevi Kumar
Are you using images and box files? Does your box file have boxes for spaces between words? ShreeDevi भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

[tesseract-ocr] Tesseract 4 training related issue

2018-06-15 Thread pranaya mhatre
Hi, I have trained Tesseract 4 many times on images by fine-tuning the English model, but after training Tesseract won't put a space between two words. How should I resolve this spacing problem? And how should I train Tesseract to detect text boxes properly for italic fonts? Thank you