Well, Tesseract 2.0 supports Unicode, but the OCR results are often
hard to read because many fonts have no glyphs for the recognized
characters.

Most text editors (Notepad++, UltraEdit, MS Word, Notepad, etc.)
display a character they cannot render as a simple box, which is
unreadable.
But to verify your results, especially while training, you need to see
which characters were actually recognized so you can check how
accurate the output is.

So, if you are working with characters that your fonts can't display,
this webapp will help you find out which character the OCR
recognized... unless you know off the top of your head which hex value
matches which character.
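
If you would rather check locally, here is a rough sketch of the same
idea in Python. It is not how the webapp works, just an illustration:
it lists the code point and the official Unicode name of each
character, so you can identify it even when it shows up as a box. The
function name describe_chars and the sample string are made up for
this example.

```python
import unicodedata


def describe_chars(text):
    """Return a (char, code point, Unicode name) tuple for each non-whitespace character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        code_point = "U+%04X" % ord(ch)
        # Fall back to a placeholder for characters without an assigned name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((ch, code_point, name))
    return rows


if __name__ == "__main__":
    # Hypothetical OCR output: two Devanagari characters.
    sample = "\u0915\u093e"
    for ch, code_point, name in describe_chars(sample):
        print(code_point, name)
```

To run it on real output, read the text file Tesseract wrote as UTF-8
and pass its contents to describe_chars.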