Yes. Look up the definition of TesseractExtractResults: it returns a
box for every character it recognized, with a blank character
(ASCII 32) between words and between lines. You have to map each blank
to a space or a newline yourself, based on the X and Y coordinates of
the boxes before and after it. A "word" is then the set of character
boxes between two delimiters, and if you want a box around the entire
word, you can build one from the min/max X and Y values of those
character boxes. Disclaimer: there may be an API I don't know of that
returns the word boxes ready-made.
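
To make the grouping concrete, here is a rough sketch in Python. It does
not call Tesseract at all: CharBox, group_words and classify_delimiter
are names I made up for illustration, and the input is assumed to be a
list of character boxes you have already copied out of the results. I am
also assuming left-to-right text and pixel coordinates where X grows to
the right; adjust if your boxes use a different convention.

```python
from typing import List, NamedTuple, Tuple

class CharBox(NamedTuple):
    """Hypothetical container for one recognized character and its box."""
    char: str
    left: int
    top: int
    right: int
    bottom: int

# (text, left, top, right, bottom) for one whole word
WordBox = Tuple[str, int, int, int, int]

def merge(chars: List[CharBox]) -> WordBox:
    """Build one word box from the min/max of its character boxes."""
    return (
        "".join(c.char for c in chars),
        min(c.left for c in chars),
        min(c.top for c in chars),
        max(c.right for c in chars),
        max(c.bottom for c in chars),
    )

def group_words(boxes: List[CharBox]) -> List[WordBox]:
    """Split the character stream on blank (ASCII 32) delimiters."""
    words: List[WordBox] = []
    current: List[CharBox] = []
    for box in boxes:
        if box.char == " ":
            if current:  # ignore repeated or leading blanks
                words.append(merge(current))
                current = []
        else:
            current.append(box)
    if current:
        words.append(merge(current))
    return words

def classify_delimiter(prev: CharBox, nxt: CharBox) -> str:
    """Guess whether a blank between two boxes is a space or a newline.

    Assumed heuristic: if the next character starts to the left of where
    the previous one ended, the text wrapped to a new line.
    """
    return "\n" if nxt.left < prev.right else " "
```

The newline test is deliberately crude; comparing the Y ranges of the
two boxes as well would likely be more robust on skewed or multi-column
pages.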

A word of caution: Tesseract's space detection is so-so. In my
experience it is wrong about 5-10% of the time.

Patrick

On Jan 6, 12:37 pm, jdevelop <[email protected]> wrote:
> Hello, all!
>
> Can somebody please advise - is it possible to get the coordinates and
> bounding boxes of words, recognized by tesseract? If so - can somebody
> please point me to where I should learn more about it?
>
> Ideally, the output (or API callback) should contain the word itself,
> the [X,Y] of upper-left corner and [X,Y] of bottom-right one.
>
> Thank you all in advance!
>
> --
> Eugene