For sure, best of luck!

art

From: [email protected] On Behalf Of Leo Bergolth
Sent: Friday, November 18, 2016 12:31 PM
To: tesseract-ocr <[email protected]>
Subject: Re: [tesseract-ocr] Re: Unable to detect single digit cells in an invoice

On Thursday, 17 November 2016 at 17:39:28 UTC+1, Art Rhyno wrote:
Your example has such good contrast that you might consider using the colors 
to identify single characters. I have attached a quick sample of what I mean. I 
used OpenCV and defer greatly to the blog post I reference at the top of the 
script, but the idea would be to try to catch single characters using OpenCV's 
inRange function. I would run tesseract on the image first, then weed out the 
blobs whose coordinates fall inside what tesseract has already detected, so 
that only the missed blobs go on for further processing. I would then use 
single character mode on what's left. Feel free to ping me if you are 
interested in this approach.
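
The "weed out blobs" step could look roughly like the sketch below. It is not the attached script: it assumes the color-based blobs and the tesseract detections have already been reduced to (left, top, right, bottom) pixel boxes, and the names overlap_area, undetected_blobs and min_cover are made up for illustration. Only the boxes it returns would then be cropped and passed to tesseract in single character mode.

```python
from typing import List, Tuple

# A box is (left, top, right, bottom) in pixel coordinates (assumed layout).
Box = Tuple[int, int, int, int]


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two boxes, 0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0


def undetected_blobs(blobs: List[Box], detected: List[Box],
                     min_cover: float = 0.5) -> List[Box]:
    """Keep blobs that tesseract has apparently missed.

    A blob is dropped when a single detected box covers at least
    `min_cover` of its area (hypothetical threshold, tune per document).
    """
    leftover = []
    for blob in blobs:
        area = (blob[2] - blob[0]) * (blob[3] - blob[1])
        if area <= 0:
            continue  # skip degenerate boxes
        covered = max((overlap_area(blob, d) for d in detected), default=0)
        if covered / area < min_cover:
            leftover.append(blob)
    return leftover


if __name__ == "__main__":
    color_blobs = [(10, 10, 20, 30), (100, 10, 110, 30)]
    tesseract_boxes = [(8, 8, 60, 32)]
    print(undetected_blobs(color_blobs, tesseract_boxes))
```

Here the first blob lies inside a box tesseract already reported, so only the second one is left for a single character pass.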

Thanks for your suggestion! Looks like a neat way to circumvent the problem.
However, I'd prefer to find the reason why tesseract rejects those blobs first.
(See my other post.)
Maybe this can be fixed in tesseract, once I know some background... :-)

Cheers,
--leo
--
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/d16bb097-f4a7-4deb-a5bd-fa1545e25c33%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.