Thanks for the reply, Albert. I think I'll stop looking for a silver bullet and create a strategy that covers my set of glyphs (maybe the PDF solution will work).
I thought Unicode did specify what a character looks like (on a basic level), and that fonts were then responsible for their own interpretation (which can be completely off). For example, "WingDings" is vastly different from what Unicode shows in its PDF renderings. I assumed that the characters drawn in those Unicode files were a basic rendition of what each character should look like.

Do you have any experience creating fonts? I might create one. It doesn't have to be pretty; it just needs to help the user accomplish their task of comparing text extracted from the UI with text extracted from the model.

