[tesseract-ocr] Strategy for Sparse Text

CK Mon, 16 Apr 2018 00:33:51 -0700

Hello,

Using tesseract, I am trying to read hexadecimal numbers (10 characters 
long) from video screenshots.  My detection rate is very low: few of the 
numbers are recognized correctly.


The screenshots (1280x720 pixels) may or may not have text other than the 
hexadecimal number.  Really, it doesn't matter if that text is output or 
not.  The hexadecimal number can be located anywhere in the image.

Targeted text is always:

Hexadecimal characters (0-9, uppercase A-F)
10 characters long
Same font (Open Sans Bold)
Same size (x-height 11 pixels, but always uppercase)
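
Because the target is so tightly constrained, one option is to let tesseract output everything it finds and then keep only tokens that fit the pattern. The following is a minimal sketch of that idea, not something taken from the tesseract documentation; the names `HEX_ID` and `find_hex_ids` and the sample text are hypothetical, and it assumes the OCR output is already available as a plain string.

```python
import re

# Exactly 10 uppercase hex characters, not embedded in a longer
# alphanumeric run (so a 12-character token does not yield a false hit).
HEX_ID = re.compile(r"(?<![0-9A-Za-z])[0-9A-F]{10}(?![0-9A-Za-z])")


def find_hex_ids(ocr_text):
    """Return every 10-character uppercase hexadecimal token in OCR output."""
    return HEX_ID.findall(ocr_text)


# Made-up OCR output: other on-screen text plus one valid target.
sample = "LIVE 12:04\n3FA9C01B7E\nchannel 7 ABCDEF012345"
print(find_hex_ids(sample))
```

A filter like this only discards unrelated text; it cannot repair characters that tesseract misread.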

This is what I've tried:

tesseract list.txt out -c tessedit_char_whitelist=0123456789ABCDEF


I have also tried disabling the dictionaries.
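
For reference, the command above could also be combined with a page segmentation mode. The sketch below assembles the same whitelist invocation with `--psm 11` added, which tesseract describes as its sparse-text mode. This is an assumption about what might help, not something I have confirmed on these images; older tesseract releases spell the flag `-psm`, and the helper names `build_tesseract_args` and `run_tesseract` are hypothetical.

```python
import subprocess

HEX_CHARS = "0123456789ABCDEF"


def build_tesseract_args(image_list, output_base, psm=11):
    """Build the tesseract command line: hex whitelist plus a segmentation mode.

    psm=11 is an assumed choice (sparse text), not a tested recommendation.
    """
    return [
        "tesseract", image_list, output_base,
        "--psm", str(psm),
        "-c", "tessedit_char_whitelist=" + HEX_CHARS,
    ]


def run_tesseract(image_list, output_base):
    """Run tesseract; requires the tesseract binary to be on PATH."""
    subprocess.run(build_tesseract_args(image_list, output_base), check=True)


print(" ".join(build_tesseract_args("list.txt", "out")))
```

Calling `run_tesseract("list.txt", "out")` would write the results to `out.txt`, the same output file as the original command.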

Is there any way training could help me locate that text more reliably?  
Basically, could I force tesseract to look for only one size and one font?  

Thanks




