These are the four calls I make to get the list of scanned
characters and their locations in the image:
MyTessBaseAPI::CopyImageToTesseract(grayScaleImageData, 1, grayScaleWidth, 0, 0, grayScaleWidth, grayScaleHeight);
block_list = MyTessBaseAPI::FindLinesCreateBlockList();
page_res = MyTessBaseAPI::Recognize(block_list, nil);
text = MyTessBaseAPI::TesseractToBoxText(page_res, 0, grayScaleHeight);
This works fine and returns a text string containing each character
followed by 4 coordinates representing its bounding rectangle, with each
set separated by '\n', e.g. "H 2 20 15 20\ni 28 20 41 20\n" (for
"Hi"). However, the returned characters are just that, without any
indication of spaces or line breaks in the scanned text. Surely there
is a Tesseract API that also returns those. Can anyone help?
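To make the string format concrete, here is a minimal sketch of splitting such a box string into per-character records. CharBox and parseBoxText are hypothetical names for illustration and are not part of Tesseract. The four numbers are stored in the order they appear, and the sketch makes no assumption about which corner or axis each one refers to.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of box text; not a Tesseract type.
struct CharBox {
    std::string ch;      // the recognized character (may be multi-byte UTF-8)
    int x0, y0, x1, y1;  // the four coordinates, in the order they appear
};

// Parse a string such as "H 2 20 15 20\ni 28 20 41 20\n" into one
// CharBox per line. Lines that do not match
// "<char> <int> <int> <int> <int>" are skipped.
std::vector<CharBox> parseBoxText(const std::string& boxText) {
    std::vector<CharBox> boxes;
    std::istringstream lines(boxText);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        CharBox box;
        if (fields >> box.ch >> box.x0 >> box.y0 >> box.x1 >> box.y1) {
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Calling parseBoxText on the "Hi" example above yields two records, one for "H" and one for "i". Nothing in them marks a space or a line break, which is exactly the missing information.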
By the way, the reason I use TesseractToBoxText above is that, for some
reason, this call fails:
MyTessBaseAPI::TesseractExtractResult(&text, &lengths, &costs, &x0, &y0, &x1, &y1, page_res);
It returns the correct text, with spaces and line breaks, but not the
bounding rectangles. If someone knows why, that would be very helpful
as well.
Thanks!