[tesseract-ocr] Tesseract low accurate single character recognition

Mosn Sat, 02 Jan 2016 09:54:06 -0800

Hello, I've been going through the posts in this group for almost 3 days 
but cannot find any fix for my issue. I need to recognize a single set of 
characters with Tesseract, but the accuracy is very poor, even after image 
processing and configuring Tesseract properly.


To start, this is my original image:  http://i.imgur.com/MwTswFA.jpg

I preprocess the image (invert colors, resize, and sharpen) to get this: 
  http://i.imgur.com/Pl6OVE3.png
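
To make the three steps concrete, here is a rough sketch in plain Python 
on a grayscale image stored as a list of rows. It is not the actual iOS 
code; the nearest-neighbour resize, the scale factor and the 3x3 sharpen 
kernel are all assumptions chosen only for illustration.

```python
# Grayscale image: a list of rows, each pixel an int in 0..255.
# All function names here are hypothetical helpers for illustration.

def invert(img):
    """Swap dark and light so the glyph polarity is flipped."""
    return [[255 - p for p in row] for row in img]


def upscale(img, factor):
    """Nearest-neighbour resize: repeat each pixel `factor` times per axis."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def sharpen(img):
    """Apply a 3x3 sharpen kernel (centre 5, four neighbours -1).

    Pixels outside the image are taken from the nearest edge pixel,
    and results are clamped to 0..255.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            v = (5 * px(y, x)
                 - px(y - 1, x) - px(y + 1, x)
                 - px(y, x - 1) - px(y, x + 1))
            row.append(min(max(v, 0), 255))
        out.append(row)
    return out


def preprocess(img, factor=2):
    """Invert, resize, then sharpen, in the order described above."""
    return sharpen(upscale(invert(img), factor))
```

A real pipeline would normally use an image library for this; the point 
here is only the order of operations and what each step does to the pixels.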

I pass this to Tesseract and most of the time I get back "W". This also 
happens with the A and other characters. The confidence values returned 
for the M image are: (64.64%) 'W', (57.08%) 'M'.

I also tried to fine-tune the Tesseract settings and did the following: 

1. limited the characters with char_whitelist 
2. disabled the dictionaries 
3. set the page segmentation mode to 10 to process a single character 
4. also modified language_model_penalty_non_dict_word and other 
settings related to it. 
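
For reference, the four settings above correspond roughly to the following 
command-line invocation, sketched in Python as a small helper that only 
builds the argument list. This is an approximation for the tesseract CLI, 
not the iOS wrapper: the helper name, the whitelist and the penalty value 
are placeholders, and load_system_dawg / load_freq_dawg are assumed to be 
how the dictionaries were disabled. Older 3.x builds spell the page 
segmentation flag "-psm" rather than "--psm".

```python
def build_tesseract_command(image_path, whitelist, non_dict_penalty=0.0,
                            output_base="stdout"):
    """Return the argv list for a single-character tesseract run.

    Hypothetical helper: it only assembles arguments and does not
    execute anything.
    """
    return [
        "tesseract", image_path, output_base,
        # 3. page segmentation mode 10 = treat the image as one character
        "--psm", "10",
        # 1. restrict the output to the allowed characters
        "-c", f"tessedit_char_whitelist={whitelist}",
        # 2. disable the word dictionaries (assumed variables)
        "-c", "load_system_dawg=0",
        "-c", "load_freq_dawg=0",
        # 4. penalty for non-dictionary words (placeholder value)
        "-c", f"language_model_penalty_non_dict_word={non_dict_penalty}",
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell.
    print(" ".join(build_tesseract_command(
        "m.png", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")))
```

The list can be passed to subprocess.run() on a machine where the 
tesseract binary is installed.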


But none of this helped with the issue. I still cannot recognize a simple M. 
I might be able to do font training, but I don't think font training can 
help with an upside-down "W" issue. 

I would really appreciate any help on this. 
I am using Tesseract OCR iOS. 

cheers, 
Mo 

