Hello,

I have recently been working on detecting the font size of text with
Tesseract on the Android platform. I found this post
http://pastebin.com/0dV84hBa and modified the GetHOCRText(int page_number)
function in "baseapi.cpp" as follows:

    const char *font_name;
    bool bold, italic, underlined, monospace, serif, smallcaps;
    int pointsize, font_id;
    const char *word = res_it->GetUTF8Text(RIL_WORD);
    if (word != 0) {
        font_name = res_it->WordFontAttributes(&bold, &italic, &underlined,
                                               &monospace, &serif, &smallcaps,
                                               &pointsize, &font_id);

        hocr_str += " !!!word: ";
        hocr_str += word;
        hocr_str += " !!!font_name: ";
        hocr_str += font_name;
        hocr_str += " !!!bold: ";
        hocr_str += bold;
        hocr_str += " !!!pointsize: ";
        hocr_str += pointsize;
    }
    delete[] word;

However, only the word text and the font name (const char *) are displayed
correctly, while "pointsize" (int) and "bold" (bool) both come out as
unreadable garbage characters (each rendered as a square).
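
One possible explanation, offered as a guess rather than a confirmed
diagnosis: if hocr_str's += operator has an overload that takes a single
char (std::string does, and Tesseract's STRING class is assumed here to
behave the same way), then appending an int or a bool converts the value to
one character with that code. A pointsize of 12, for example, becomes
control character 12, which a viewer would show as a square. The sketch
below uses std::string and a hypothetical helper, format_font_info, to show
the values converted to text before being appended:

    ```cpp
    #include <cstdio>
    #include <string>

    // Hypothetical helper (not part of Tesseract): builds the debug text
    // from values such as those returned by WordFontAttributes().
    std::string format_font_info(const char* font_name, bool bold, int pointsize) {
        std::string out;

        out += " !!!font_name: ";
        // Guard against a null font name in case no font info is available.
        out += (font_name != 0) ? font_name : "(unknown)";

        out += " !!!bold: ";
        // Append a readable word instead of the raw bool (char code 0 or 1).
        out += bold ? "true" : "false";

        // Format the int as decimal digits instead of appending it as a char.
        char buf[32];
        std::snprintf(buf, sizeof(buf), "%d", pointsize);
        out += " !!!pointsize: ";
        out += buf;

        return out;
    }
    ```

With this helper, format_font_info("Arial", true, 12) gives
" !!!font_name: Arial !!!bold: true !!!pointsize: 12". Tesseract 3.x's
STRING class also appears to provide add_str_int(const char *, int), which
may do the same job directly on hocr_str; that is worth checking against
your own source tree.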

Has anyone encountered this before?
Thanks

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en