[tesseract-ocr] image_to_string and image_to_data results are not the same

Alan Kong Sun, 11 Feb 2018 04:26:20 -0800

Hi everyone, 

I am new to tesseract-ocr and have been using it from Python through the 
pytesseract wrapper.


With pytesseract I can call two functions: 1) image_to_string, which 
returns the recognized characters as a Python string, and 2) image_to_data, 
which returns the recognized text plus verbose information, including the 
bounding-box coordinates and the confidence of each prediction.

I used both functions and expected them to return the same text, but the 
results differ a lot. My guess is that image_to_data uses -psm 0 as a 
hard-coded default and that this parameter cannot be changed, whereas with 
image_to_string I can set -psm 6, which returns fairly reasonable results.
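
For reference, a minimal sketch of how the two outputs could be compared under the same page segmentation mode. It assumes a pytesseract version whose image_to_data accepts the same config argument as image_to_string and supports output_type=Output.DICT. The helper names words_to_text and compare, and the file name in the usage note, are hypothetical. The flag is spelled -psm in Tesseract 3.x and --psm in 4.x, so adjust the config string to the installed version.

```python
from collections import defaultdict


def words_to_text(data, min_conf=0):
    """Rebuild line-by-line text from an image_to_data-style dict.

    `data` is assumed to have the column names of Tesseract's TSV output
    (block_num, par_num, line_num, conf, text), each mapped to a list.
    """
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        if not str(word).strip():
            continue  # structural rows (page/block/line) carry no text
        if float(data["conf"][i]) < min_conf:
            continue  # drop low-confidence words
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append(str(word))
    return "\n".join(" ".join(words) for _, words in sorted(lines.items()))


def compare(image_path, config="--psm 6"):
    """Run both calls with the SAME config so the results are comparable."""
    # Imported here so the helper above stays usable without these packages.
    import pytesseract
    from PIL import Image

    image = Image.open(image_path)
    text = pytesseract.image_to_string(image, config=config)
    data = pytesseract.image_to_data(
        image, config=config, output_type=pytesseract.Output.DICT
    )
    return text, words_to_text(data)
```

Calling compare("page.png") returns both strings; if they still differ after passing the same config to both calls, the difference is not caused by the page segmentation mode.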

Cheers,

Alan

