Hi,

I am trying to use Tesseract together with the UI test automation 
software Sikuli. Our use cases look like "click on the button whose text 
is 'abc'", so accuracy on short words is the main requirement.

I tried to extract the text from the attached image fast_col_picker.png. 
The text obtained was "Fan column Pitker" instead of "Fast Column 
Picker".
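
To make the use case concrete, here is a minimal sketch of matching an OCR 
result against the labels a test expects to find. It does not fix the 
recognition itself. The function name match_label, the label list and the 
0.8 cutoff are assumptions for illustration only; difflib is from the 
Python standard library.

```python
import difflib

def match_label(ocr_text, expected_labels, cutoff=0.8):
    """Return the expected label closest to the OCR text, or None.

    The comparison is case-insensitive; cutoff is an arbitrary
    similarity threshold between 0 and 1 (hypothetical value).
    """
    by_lower = {label.lower(): label for label in expected_labels}
    hits = difflib.get_close_matches(
        ocr_text.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[hits[0]] if hits else None

# Example: the misread text from above against a few known labels.
labels = ["Fast Column Picker", "OK", "Cancel"]
best = match_label("Fan column Pitker", labels)
```

With these inputs the misread text still resolves to the intended label, 
but that only helps when the set of expected labels is known in advance.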

I tried different psm values, but that didn't help.
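
A sweep over psm values can be scripted roughly as follows. This is a 
sketch under assumptions: the tesseract binary is on PATH, it accepts 
"stdout" as the output base, and the flag is spelled --psm (older 3.x 
builds use -psm). The helper names build_command and sweep are 
hypothetical.

```python
import subprocess

# Page segmentation modes that suit a short UI label:
# 6 = single uniform block of text, 7 = single text line, 8 = single word.
PSM_VALUES = (6, 7, 8)

def build_command(image_path, psm, psm_flag="--psm"):
    """Build a tesseract command that prints the recognised text to stdout."""
    return ["tesseract", image_path, "stdout", psm_flag, str(psm)]

def sweep(image_path, psm_values=PSM_VALUES, psm_flag="--psm"):
    """Run tesseract once per psm value and collect the text for comparison."""
    results = {}
    for psm in psm_values:
        proc = subprocess.run(
            build_command(image_path, psm, psm_flag),
            capture_output=True,
            text=True,
        )
        results[psm] = proc.stdout.strip()
    return results
```

Calling sweep("fast_col_picker.png") returns a dict keyed by psm value, 
which makes it easy to compare the outputs side by side.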

I enabled 'tessedit_write_images true' in the config, and the resulting 
tessinput.tif (attached) shows an image that is close to the original.

I also tried the following config, with eng.user-words containing only 
the 3 words 'Fast', 'Column' and 'Picker', but it made the output less 
accurate. (Is it correct to expect the output to contain only words from 
the user-words file?)

load_system_dawg     F
load_freq_dawg       F
user_words_suffix    user-words
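
For anyone reproducing this, the two files can be generated with a small 
script. This is a sketch under assumptions: the config name ui_words and 
the helper write_ocr_files are hypothetical, and the layout assumes that 
Tesseract looks up configs under tessdata/configs/ and the word list as 
tessdata/<lang>.<suffix>.

```python
from pathlib import Path

# The three config lines from this message.
CONFIG_LINES = [
    "load_system_dawg     F",
    "load_freq_dawg       F",
    "user_words_suffix    user-words",
]
# One word per line in the user-words file.
USER_WORDS = ["Fast", "Column", "Picker"]

def write_ocr_files(tessdata_dir, config_name="ui_words", lang="eng"):
    """Write the config and the user-words list under tessdata_dir."""
    tessdata = Path(tessdata_dir)
    config_path = tessdata / "configs" / config_name
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text("\n".join(CONFIG_LINES) + "\n")
    words_path = tessdata / f"{lang}.user-words"
    words_path.write_text("\n".join(USER_WORDS) + "\n")
    return config_path, words_path
```

The config name is then passed as the last argument on the command line, 
for example: tesseract fast_col_picker.png out ui_words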


Since I have to read this text from an application window in order to 
act on it, post-processing the images is not an option. 

I would appreciate any pointers in this regard.

Thanks.

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en


<<attachment: fast_col_picker.PNG>>

<<attachment: tessinput.tif>>
