I have an image here: http://dl.dropbox.com/u/1531272/pg1-CROP-OCR.jpg
When I run this image through Tesseract, it outputs three words:

05
04571
6

I have adjusted tosp_table_xht_sp_ratio to no avail. I cannot
understand why the 6 is not included in the 04571 word. Looking at the
characters that are returned, the height of the 1 is 69px and the gap
to the next character, the 6, is 12px. Even with the default
tosp_table_xht_sp_ratio of 0.33, I would expect a gap to need to be
about 69 * 0.33 = 23px before it counts as a space, so the 12px gap
should keep the 6 in the same word.
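
The arithmetic above can be written out as a small Python sketch. It
only models the assumption made in this post, that the space threshold
is the character height multiplied by tosp_table_xht_sp_ratio. That may
not be how Tesseract actually decides word breaks (the parameter name
suggests x-height rather than full character height, and other settings
may be involved). The function name is hypothetical and is not part of
Tesseract.

```python
def expected_space_threshold(char_height_px, ratio=0.33):
    """Hypothetical helper: the minimum gap in pixels that this post
    assumes would be treated as a space between words."""
    return char_height_px * ratio


height_of_1 = 69  # px, measured height of the "1" (from the post)
gap_to_6 = 12     # px, measured gap between the "1" and the "6" (from the post)

threshold = expected_space_threshold(height_of_1)

# Under the assumed model, a gap smaller than the threshold is not a word break.
same_word = gap_to_6 < threshold

print(f"threshold = {threshold:.2f}px, gap = {gap_to_6}px, "
      f"same word expected: {same_word}")
```

Under this model the threshold is about 23px and the 12px gap is well
below it, which is why I expected 045716 to come back as one word.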

Can anyone offer a view into this that helps me understand why the 6
is not read as part of the 045716 word?

Thanks
