Hi
I am a new tesseract user, and I'm really struggling to get it to produce 
any kind of sensible results, especially with numerical text. I have some 
text that looks like this:
[image: example_input.jpg]
I've read the documentation, and looked through the parameter list, and I 
added the following to the command line:
--psm 6
-c preserve_interword_spaces=1
-c textord_dotmatrix_gap=6
-c classify_bln_numeric_mode=1
-c rej_alphas_in_number_perm=1
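
For anyone trying to reproduce this, here is a minimal sketch of how the options above combine into a single tesseract call. It assumes the image is saved as example_input.jpg and that the text goes to stdout; the helper name build_tesseract_command is made up for illustration and is not part of tesseract.

```python
import os
import shutil
import subprocess

# Options exactly as listed in the post; values are passed as strings.
CONFIG_VARS = {
    "preserve_interword_spaces": "1",
    "textord_dotmatrix_gap": "6",
    "classify_bln_numeric_mode": "1",
    "rej_alphas_in_number_perm": "1",
}


def build_tesseract_command(image_path, output_base="stdout", psm=6):
    """Build the argv list for: tesseract IMAGE OUTPUTBASE --psm N -c var=value ...

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["tesseract", image_path, output_base, "--psm", str(psm)]
    for name, value in CONFIG_VARS.items():
        cmd += ["-c", f"{name}={value}"]
    return cmd


if __name__ == "__main__":
    image = "example_input.jpg"  # assumed file name, taken from the attachment
    cmd = build_tesseract_command(image)
    print(" ".join(cmd))
    # Only run tesseract if it is installed and the image is actually present.
    if shutil.which("tesseract") and os.path.exists(image):
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout)
```

Building the command as a list avoids shell quoting problems and makes it easy to switch individual -c options on and off while experimenting.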

But I just get garbage out:

Oo -250 6 3a
190 & So
190 6 -100
1 $1290 6 ~140
1 $130 6 ~150

I've tried all sorts of additional image processing to try to improve the 
look of the text, but none of it helps. In fact, this is the best output 
I've seen. It's usually worse. I'm really hoping someone who has worked with 
dot-matrix input can offer some magic incantation to make tesseract come to 
its senses. Thanks.
