[tesseract-ocr] Re: unable to recognize numbers within box using tesseract in C#

blekas Thu, 09 Feb 2017 04:01:53 -0800

Hello and thank you for the useful suggestion.

Would you happen to know the reason why numbers printed within boxes cannot 
be parsed and are ignored?


I am working on scenarios where numbers within closed boxes are very 
common, and removing the horizontal lines has various side effects on other 
pieces of text in my images.

Is there a reason for this, and is there perhaps another way to make 
Tesseract detect numbers printed within boxes (for example, by passing a 
parameter)?

Thank you in advance for your answer.




On Saturday, August 27, 2016 at 7:34:10 PM UTC+3, Quan Nguyen wrote:
>
> Deskewing, grayscaling, removing lines, and binarizing produced this image:
>
>
> <https://lh3.googleusercontent.com/-k4IAE2W2W7M/V8HAYJhIP5I/AAAAAAAAAqg/C85uxC7JDOMikMfAX_whlGB8UBU2Y1BiACLcB/s1600/Capture4.PNG>
>
> and OCRed text:
>
> l4|0|0l2|1l1>°l0|7l
>
> So if you could remove the vertical lines, it would improve further.
>
> On Saturday, August 27, 2016 at 10:29:52 AM UTC-5, shripad shirsat wrote:
>>
>>
>> I am having trouble recognizing numbers from a PDF that are printed 
>> within boxes. I am using Tesseract in C# for my project. Could someone 
>> kindly help me with a clue, a hint, or a snippet showing how to approach 
>> a solution? Please find the attached PDF.
>>
>
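
For readers following the thread: the "remove lines" step quoted above can be 
illustrated with a small run-length filter. This is only a minimal sketch 
under stated assumptions, not the preprocessing that was actually used in the 
thread. It assumes the image is already deskewed and binarized, represents it 
as a list of rows (1 = ink, 0 = background), and erases every horizontal or 
vertical run of ink that is at least min_len pixels long. The names 
long_run_indices, remove_long_runs, from_rows, to_rows, and BOX are 
hypothetical helpers made up for this example. It is written in plain Python 
rather than C# to keep it self-contained.

```python
def long_run_indices(line, min_len):
    """Yield the indices of every ink pixel that belongs to a run of at
    least min_len consecutive ink pixels in a 1-D sequence."""
    start = None
    for i, v in enumerate(list(line) + [0]):  # sentinel closes a trailing run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= min_len:
                yield from range(start, i)
            start = None


def remove_long_runs(img, min_len):
    """Return a copy of a binary image with long horizontal and vertical
    ink runs erased. Runs are measured on the original image, so erasing
    a horizontal line does not shorten the vertical lines that cross it."""
    h = len(img)
    w = len(img[0]) if h else 0
    out = [row[:] for row in img]
    for y in range(h):  # horizontal box edges
        for x in long_run_indices(img[y], min_len):
            out[y][x] = 0
    for x in range(w):  # vertical box edges
        column = [img[y][x] for y in range(h)]
        for y in long_run_indices(column, min_len):
            out[y][x] = 0
    return out


def from_rows(rows):
    """Convert '#'/'.' strings into a 0/1 pixel grid."""
    return [[1 if c == "#" else 0 for c in row] for row in rows]


def to_rows(img):
    """Convert a 0/1 pixel grid back into '#'/'.' strings."""
    return ["".join("#" if v else "." for v in row) for row in img]


# A toy digit "1" drawn inside a closed box (hypothetical test data).
BOX = [
    "#########",
    "#.......#",
    "#..##...#",
    "#...#...#",
    "#...#...#",
    "#..###..#",
    "#.......#",
    "#########",
]

if __name__ == "__main__":
    for row in to_rows(remove_long_runs(from_rows(BOX), min_len=6)):
        print(row)
```

The threshold matters: min_len has to be larger than the tallest and widest 
stroke of any glyph, otherwise parts of the digits are erased together with 
the box edges. That is one plausible source of the side effects on other text 
mentioned earlier in this message, so restricting the filter to the regions 
that actually contain boxes may help.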

