Yes. Look up the definition of TesseractExtractResults: it returns the boxes for every character it recognized, with blank characters (ASCII 32) between words or lines. You have to map each blank to a space or a newline yourself, based on the X and Y coordinates of the boxes before and after it. A "word" is then the set of character boxes between two delimiters, and if you want a box around the whole word, you can build one from the min/max X and Y values of those character boxes. Disclaimer: there may be an API I don't know about that returns word boxes ready-made.
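To make the grouping concrete, here is a minimal Python sketch. It does not call Tesseract at all: CharBox, WordBox, group_words and delimiter_kind are hypothetical names made up for illustration, and the sketch assumes you have already copied the per-character results into a list, with coordinates measured from the top-left corner (Y growing downward). If your build reports boxes from the bottom-left, flip the Y comparisons.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharBox:
    """One recognized character and its box (hypothetical container)."""
    char: str
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class WordBox:
    """A word and the box enclosing all of its characters."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def merge(chars: List[CharBox]) -> WordBox:
    # The word box is just the min/max of the character boxes.
    return WordBox(
        text="".join(c.char for c in chars),
        left=min(c.left for c in chars),
        top=min(c.top for c in chars),
        right=max(c.right for c in chars),
        bottom=max(c.bottom for c in chars),
    )


def group_words(boxes: List[CharBox]) -> List[WordBox]:
    """Split the character stream on blanks (ASCII 32) and merge each run."""
    words: List[WordBox] = []
    current: List[CharBox] = []
    for box in boxes:
        if box.char == " ":
            if current:
                words.append(merge(current))
                current = []
        else:
            current.append(box)
    if current:
        words.append(merge(current))
    return words


def delimiter_kind(prev: CharBox, nxt: CharBox) -> str:
    """Guess whether the blank between prev and nxt is a space or a newline.

    Heuristic (an assumption, not Tesseract's own rule): same line if the
    two boxes overlap vertically and nxt does not start left of prev.
    """
    overlaps_vertically = nxt.top < prev.bottom and nxt.bottom > prev.top
    if overlaps_vertically and nxt.left >= prev.left:
        return " "
    return "\n"
```

For example, feeding it the characters of "Hi" on one line and "yo" on the next, separated by a single blank, gives two WordBox entries, and delimiter_kind on the boxes either side of the blank returns a newline. The heuristic is deliberately crude; skewed scans or multi-column pages would need something smarter.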
A word of caution: Tesseract's space detection is so-so and, IMHO, is wrong about 5-10% of the time.

Patrick

On Jan 6, 12:37 pm, jdevelop <[email protected]> wrote:
> Hello, all!
>
> Can somebody please advice - is it possible to get the coordinates and
> bounding boxes of words, recognized by tesseract? If so - can somebody
> please point me to where I should learn more about it?
>
> Ideally, the output (or API callback) should contain the word itself,
> the [X,Y] of upper-left corner and [X,Y] of bottom-right one.
>
> Thank you all in advance!
>
> --
> Eugene

