Did you solve this problem? I also have documents with a similar layout that contain handwritten digits.
On Tuesday, June 6, 2017 at 2:13:47 PM UTC+5:30, Mrinmoy Nath wrote:
> Hi,
>
> I am trying to extract each word from a .png image (converted from PDF
> documents), using Python 2.7 and the tesseract-3.05 APIs.
> For a few of the documents, instead of drawing a bounding box around
> each word, Tesseract draws one around a larger area and misses some of
> the words.
> I am using 1111.png as input; the output is in 1111_op.png.
> Could you please help me understand what the reason could be?
>
> Regards,
> Mrinmoy
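For anyone trying to reproduce the setup described in the quoted message, here is a minimal sketch of word-level box extraction. It assumes the pytesseract wrapper and Pillow are installed, which the original post does not confirm (it only says "tesseract-3.05 APIs"). The helper names `word_boxes` and `extract_words` are hypothetical. The idea is to keep only the rows Tesseract reports at word level, since page, block, paragraph, and line rows describe larger areas and could be mistaken for oversized word boxes.

```python
def word_boxes(data, min_conf=0):
    """Keep only word-level rows from an image_to_data() dict.

    Returns a list of (text, left, top, width, height) tuples.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Level 5 is a word in Tesseract's TSV output; levels 1-4 are
        # page, block, paragraph, and line, which span larger regions.
        if int(data["level"][i]) != 5:
            continue
        # Skip whitespace-only entries and low-confidence words.
        if not text.strip() or float(data["conf"][i]) < min_conf:
            continue
        boxes.append((text, data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i]))
    return boxes


def extract_words(image_path, config=""):
    # Imported lazily so word_boxes() can be used without these packages.
    # Assumption: pytesseract and Pillow are available.
    from PIL import Image
    import pytesseract

    # `config` is passed to Tesseract as-is, for example a page
    # segmentation mode flag (its spelling differs between versions).
    data = pytesseract.image_to_data(
        Image.open(image_path),
        config=config,
        output_type=pytesseract.Output.DICT,
    )
    return word_boxes(data)
```

Calling `extract_words("1111.png")` would return one tuple per recognised word. If the boxes are still too large after filtering by level, the page segmentation is a likely place to look next, though that is a guess and not something the thread confirms.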

