Hi Tom,

Is there any update on the underlined text problem?

Regards
Guna

On Monday, March 7, 2016 at 6:08:03 AM UTC+5:30, Gunasekaran Velu wrote:
>
> Hi
>
> The image I sent earlier was one I created myself in Paint.
>
> Now I have attached underlined text from the real document (cropped from
> the full image because of confidential data).
>
> In this case, when I run OCR, the underlined text is completely skipped by
> Tesseract.
>
> Kindly update on the same.
>
> Regards
> Guna
>
> On Saturday, March 5, 2016 at 10:42:18 PM UTC+5:30, Tom Morris wrote:
>>
>> On Saturday, March 5, 2016 at 5:11:55 AM UTC-5, Gunasekaran Velu wrote:
>>>
>>> >tesseract.exe Underline.png Underline -l eng -psm 1
>>>
>>> Result: This is underline word @
>>>
>>> Is it possible to do OCR recognition for an underlined text/word in the
>>> image? Or does some image processing need to be applied to the image?
>>>
>>> Attached sample image.
>>>
>>
>> Tesseract knows how to recognize underlined text, as you can see from
>> the fact that it got "underline" correct in your example. For some reason
>> it's getting confused by the underlined word "test", perhaps because it's
>> at the end of the line?
>>
>> It could potentially represent a bug, but I'd try to recreate it with a
>> less artificial example. Of course, pre-processing would improve the
>> situation, and removing underlines shouldn't be that hard to do.
>>
>> Tom
>>
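
For anyone trying the pre-processing route Tom mentions, here is a minimal sketch of one way underlines might be removed before OCR. It is not something from this thread or from Tesseract itself. It assumes the page has already been binarized into rows of 0/1 values (1 = ink), and the function name and the `min_run` threshold are hypothetical. The idea is that an underline shows up as a very long unbroken horizontal run of ink, while strokes inside letters are short.

```python
def remove_long_horizontal_runs(image, min_run):
    """Return a copy of a binary image (1 = ink, 0 = background) in which
    every horizontal run of ink at least `min_run` pixels long is erased.

    `image` is a list of rows, each row a list of 0/1 ints.
    `min_run` is an assumed threshold; it should be wider than any
    horizontal stroke inside a single character.
    """
    cleaned = [row[:] for row in image]  # leave the input untouched
    for row in cleaned:
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 1:
                start = x
                # Walk to the end of this run of ink pixels.
                while x < width and row[x] == 1:
                    x += 1
                # Erase the run only if it is long enough to be a line.
                if x - start >= min_run:
                    row[start:x] = [0] * (x - start)
            else:
                x += 1
    return cleaned


# Tiny made-up example: two rows of "text" strokes, a gap, then an underline.
sample = [
    [1, 0, 1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
cleaned = remove_long_horizontal_runs(sample, min_run=6)
```

On a real scan the underline is usually several pixels thick and may touch descenders (g, p, y), so a simple row scan like this can clip parts of letters. An image library's morphological opening with a wide, flat kernel is the more common way to do the same thing, and the threshold would need tuning for the scan resolution.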

