Hi all,
a while ago, I wrote myself a hack to tesseract to insert
a blank line before every new paragraph.  I did that by
checking the x-position of the first word in every line
with respect to the left side of the current block in
baseapi.cpp:TessBaseAPI::GetUTF8Text().  This code worked
well enough when I wrote it for svn release 319, but
when I updated to a newer source release (525), I found
that it no longer works.  The code gets the left
side of the current block via:

page_res_it.block()->block->bounding_box().left()

That used to be set to the x-position of the current block
of text, but now I find that the bounding_box just encompasses
the entire image.  So, my question is: is the bounding box
of the current block no longer automatically updated?  Do
I have to enable something in the configs to get the bounding
box computed properly again?

I never tried my hack with any of the source releases between
319 and 525, so I don't know when the behaviour changed.
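
For reference, the idea behind the hack can be sketched roughly as
follows.  This is a simplified, standalone illustration and not the
actual patch: TextLine, JoinWithParagraphBreaks and indent_threshold
are hypothetical names, and nothing here calls into tesseract.  It
only shows the indentation test against the block's left edge.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one recognized text line (not a tesseract type).
struct TextLine {
    int first_word_left;  // x-position of the first word's left edge, in pixels
    std::string text;     // recognized text of the line, without a newline
};

// Joins the lines of one block into a single string.  A blank line is
// inserted before any line (other than the first) whose first word starts
// more than indent_threshold pixels to the right of block_left, on the
// assumption that such an indent marks the start of a new paragraph.
std::string JoinWithParagraphBreaks(const std::vector<TextLine>& lines,
                                    int block_left,
                                    int indent_threshold) {
    std::string out;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const bool indented =
            lines[i].first_word_left - block_left > indent_threshold;
        if (indented && i > 0) {
            out += '\n';  // blank line before the new paragraph
        }
        out += lines[i].text;
        out += '\n';
    }
    return out;
}
```

In this sketch, if block_left is the left edge of the whole image
rather than of the text block, almost every line exceeds the
threshold and gets treated as a paragraph start, which would be
consistent with the hack no longer working.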

Cheers,
Rob Komar
