Hi, I am trying to use Tesseract together with the UI test automation software Sikuli. In this scenario, we have use cases such as: click on a button whose text is 'abc'. Hence, accuracy on short words is the main requirement.
I tried to extract the text from the attached image fast_col_picker.png. The text obtained was "Fan column Pitker" instead of "Fast Column Picker". I tried different psm values, which didn't help. I enabled 'tessedit_write_images true' in the config, and the resulting tessinput.tif (attached) shows an image that is close to the original one.

I also tried the following config, with eng.user-words containing only the 3 words 'Fast', 'Column' and 'Picker', but it made the output less accurate. (Is it correct to expect the output to contain only words from the user-words file?)

    load_system_dawg F
    load_freq_dawg F
    user_words_suffix user-words

Since I have to read this text from an application window to act on it, post-processing the images is not an option. I would appreciate any pointers in this regard.

Thanks.
<<attachment: fast_col_picker.PNG>>
<<attachment: tessinput.tif>>
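
For anyone trying to reproduce this, the setup described above (a three-word eng.user-words file plus a config that turns off the system and frequency dictionaries) could be recreated roughly as follows. This is only a sketch under assumptions: the config name "ui_words", the helper names, the tessdata location, and the choice of psm 7 (single text line) are all hypothetical and not taken from the original run. Older Tesseract 3.x releases take the page segmentation mode as -psm, newer ones as --psm, so the flag is left as a parameter.

```python
import tempfile
from pathlib import Path


def write_ocr_setup(tessdata_dir, words, lang="eng", config_name="ui_words"):
    """Write <lang>.user-words and a config file under tessdata_dir.

    config_name is a hypothetical name chosen for this sketch.
    """
    tessdata = Path(tessdata_dir)
    configs = tessdata / "configs"
    configs.mkdir(parents=True, exist_ok=True)

    # One word per line; the file suffix must match user_words_suffix below.
    words_path = tessdata / f"{lang}.user-words"
    words_path.write_text("\n".join(words) + "\n")

    # The three settings quoted in the post, one per line.
    config_path = configs / config_name
    config_path.write_text(
        "load_system_dawg F\n"
        "load_freq_dawg F\n"
        "user_words_suffix user-words\n"
    )
    return words_path, config_path


def build_command(image, output_base, config_name="ui_words",
                  lang="eng", psm=7, psm_flag="-psm"):
    """Build the tesseract argument list; config names go last."""
    return ["tesseract", str(image), str(output_base),
            "-l", lang, psm_flag, str(psm), config_name]


if __name__ == "__main__":
    # Write into a scratch directory rather than a real tessdata folder.
    scratch = tempfile.mkdtemp()
    write_ocr_setup(scratch, ["Fast", "Column", "Picker"])
    print(" ".join(build_command("fast_col_picker.png", "out")))
```

The command is only built, not executed, so the script can be checked without Tesseract installed. To actually run it, the files would have to be written into the tessdata directory that the installed Tesseract reads.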

