Thanks for the quick response, but I already know about those APIs - let me try to explain with an example.
Let's say that ResultIterator says that it found the word "hello" in the image at position (100, 100), and TessResultIteratorWordFontAttributes says it's in font "Arial" with a height of 16. In my Windows application, I can construct a 16-high Arial font and draw the word "hello" at (100, 100), and I am doing a good job of showing the user the OCR output.

But now let's say that ResultIterator continues and says that it found the word "goodbye" in the image at position (100, 300), and TessResultIteratorWordFontAttributes says it's in font "DejaVu Sans" with a height of 16. If I tell Windows to construct a font named "DejaVu Sans", Windows won't have any idea what that is, and it will substitute some other font from its list. When I then have my Windows application draw the word "goodbye" at (100, 300), it's highly likely that the character widths in the font Windows picked are very different from the character widths in the actual DejaVu Sans font, so the word "goodbye" will take up the wrong amount of space, and I'll end up either with lots of white space or (more often) with words that run over each other.

Does that make more sense?

Thanks,
Chris

On Friday, September 20, 2013 5:39:07 PM UTC-6, Quan Nguyen wrote:
>
> You'll need to access the Tesseract API for such information, specifically,
> ResultIterator and ResultIteratorWordFontAttributes. Check out the API
> Example <http://code.google.com/p/tesseract-ocr/wiki/APIExample> page.
>
> Quan
>
>
> On Friday, September 20, 2013 3:42:14 PM UTC-5, [email protected] wrote:
>>
>> I would like to show the user the OCR output in my Windows application in
>> a graphical form (the OCR'd characters, in the specified font, in the right
>> location), in order to do that I need to pick a font to draw the OCR output
>> text in, and it seems like I have two choices -
>> 1) Map the Tesseract font to something Windows can understand
>> 2) Use the actual Tesseract font
>>
>> For #1, Tesseract uses a lot of fonts that I've got on my Windows box
>> (Times New Roman, Arial, etc.) but then it also comes up with some I don't
>> have (Century Schoolbook). Is there a way to enumerate all the names of
>> the fonts that Tesseract might return? I can then decide whether it's
>> easier to find Windows equivalent for all the fonts, or to download fonts
>> (if they are free and have nice licensing).
>>
>> For #2, it's not enough to just display the selected portion of the
>> source image, that doesn't tell the user anything. I would need a way to
>> ask Tesseract, "what is the glyph for an uppercase G in an Arial font of
>> height 34". Does that exist?
>>
>> Thanks,
>> Chris
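
To make the width problem above concrete, here is a minimal sketch of what the option #1 workaround might look like, written in Python for brevity. Everything in it is an assumption for illustration: the substitution table entries are placeholder guesses and not a list Tesseract provides, `pick_font` and `horizontal_scale` are hypothetical helper names, and the rendered width would have to come from measuring the text in the substitute font (on Windows, something like GDI's `GetTextExtentPoint32`).

```python
# Hypothetical substitution table: reported font name -> font assumed to be
# installed on the Windows box. These pairs are illustrative guesses only.
SUBSTITUTES = {
    "DejaVu Sans": "Verdana",
    "Century Schoolbook": "Georgia",
}

DEFAULT_FONT = "Arial"  # assumed last-resort font


def pick_font(reported_name, installed):
    """Return the reported font if installed, else a mapped substitute,
    else the default font."""
    if reported_name in installed:
        return reported_name
    substitute = SUBSTITUTES.get(reported_name)
    if substitute in installed:
        return substitute
    return DEFAULT_FONT


def horizontal_scale(box_width, rendered_width):
    """Factor by which to stretch or squeeze the drawn word so that it
    spans the bounding box reported by the OCR engine."""
    if rendered_width <= 0:
        raise ValueError("rendered_width must be positive")
    return box_width / rendered_width


if __name__ == "__main__":
    installed = {"Arial", "Verdana", "Times New Roman"}
    font = pick_font("DejaVu Sans", installed)
    # Made-up numbers: the OCR box is 70 px wide, but the substitute
    # font renders "goodbye" 84 px wide.
    scale = horizontal_scale(70, 84)
    print(font, round(scale, 3))
```

Scaling each word to its OCR bounding box would keep words from running over each other, but the glyph shapes would still be those of the substitute font, so it only addresses the spacing half of the problem.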

