Anyone? In case the length of my posts scared off a few readers, here's a more condensed version:
Having box coordinates for every recognized character in the final result would let one extend the recognition process, either by redoing the recognition (with alternative image pre-processing) or by doing a number of creative things such as automated image extraction, or maybe even some other clever layout recognition. You could use the box info to determine line spacing (and/or font size), indentation, and many other things. (Though I realize the latest version already has indentation recognition.)

Where do I even begin to look in the code to see if I can get a printout of the coordinates?

Thanks again
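P.S. To make the box idea a bit more concrete, here is a rough sketch of the kind of post-processing I have in mind. It assumes the coordinates can be obtained as plain text records of the form `symbol left bottom right top`, with the origin at the bottom-left of the image, much like the box files used for training. The function names (`parse_boxes`, `group_into_lines`, `layout_stats`) are made up for illustration and are not part of Tesseract, and the line grouping is a deliberately naive heuristic.

```python
from statistics import median


def parse_boxes(text):
    """Parse 'symbol left bottom right top [page]' records, one per line.

    Assumed input format; any trailing page-number field is ignored.
    """
    boxes = []
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append((parts[0], left, bottom, right, top))
    return boxes


def group_into_lines(boxes):
    """Group character boxes into text lines.

    A box joins an existing line when its vertical centre falls inside
    that line's current vertical extent; otherwise it starts a new line.
    """
    lines = []
    # Origin is bottom-left, so a larger 'top' means higher on the page.
    for box in sorted(boxes, key=lambda b: -b[4]):
        _, _, bottom, _, top = box
        centre = (bottom + top) / 2
        for line in lines:
            if line["bottom"] <= centre <= line["top"]:
                line["boxes"].append(box)
                line["bottom"] = min(line["bottom"], bottom)
                line["top"] = max(line["top"], top)
                break
        else:
            lines.append({"bottom": bottom, "top": top, "boxes": [box]})
    return lines


def layout_stats(boxes):
    """Estimate indentation, line height and line spacing from boxes."""
    lines = group_into_lines(boxes)
    lines.sort(key=lambda l: -l["top"])  # top of page first

    # Left edge of the first character on each line, top to bottom.
    indents = [min(b[1] for b in l["boxes"]) for l in lines]
    # Line height as a crude stand-in for font size.
    heights = [l["top"] - l["bottom"] for l in lines]
    # Approximate each baseline by the median box bottom on that line.
    baselines = [median(b[2] for b in l["boxes"]) for l in lines]
    spacings = [a - b for a, b in zip(baselines, baselines[1:])]

    return {
        "line_count": len(lines),
        "indents": indents,
        "line_height": median(heights) if heights else None,
        "line_spacing": median(spacings) if spacings else None,
    }


if __name__ == "__main__":
    sample = (
        "T 100 200 120 230 0\n"
        "h 122 200 138 230 0\n"
        "e 140 200 155 220 0\n"
        "a 60 150 75 170 0\n"
        "b 77 150 92 180 0\n"
    )
    print(layout_stats(parse_boxes(sample)))
```

With real pages the grouping step would need to cope with skew, descenders and multiple columns, so treat this only as a starting point.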

