Well, Tesseract 2.0 has support for Unicode, but the OCR results can often be hard to read because many fonts have no glyphs for the characters.
Typically in text editors (including Notepad++, UltraEdit, MS Word, Notepad, etc.), a character the font cannot display is shown as a plain box, which is not readable. To verify your results, especially while training, you need to check how accurate the output is. So if you are using such characters and don't have a font that displays them correctly, this webapp will help you see which character the OCR recognized... unless you know off the top of your head which hex value matches which character.
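
If you would rather check the code points locally, here is a minimal Python sketch of the same idea. It is not the webapp's code; the function name `describe_chars` and the sample string are made up for illustration, and it assumes the OCR output has already been read in as Unicode text.

```python
import unicodedata


def describe_chars(text):
    """Return (char, 'U+XXXX', Unicode name) for each non-ASCII character."""
    rows = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII displays fine in any font, so skip it.
            continue
        # Fall back to a placeholder for code points that have no name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((ch, f"U+{ord(ch):04X}", name))
    return rows


if __name__ == "__main__":
    # Hypothetical OCR output, written with escapes so the source
    # stays readable even without a matching font.
    sample = "\u0915 \u0916"
    for _ch, code, name in describe_chars(sample):
        print(code, name)
```

To run it on a real result, read the output text file with `open(path, encoding="utf-8")` (assuming the file is UTF-8) and pass its contents to `describe_chars`.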

