How will this help with Tesseract OCR? If it does, could you please outline the step-by-step procedure you followed?
On Fri, May 1, 2009 at 7:55 PM, Rob H. <[email protected]> wrote:
>
> I've been training OCR to recognize many characters spread throughout
> unicode definition.
> I found this handy webapp to be invaluable in understanding what are
> some of the "unprintable" unicode characters.
>
> I can copy/paste the character into the top left text area and hit
> convert.
> I am mainly interested in the "UTF-16 code units" text area on the
> lower right side of the page, since these are the codes I'm using with
> Tesseract.
> http://rishida.net/scripts/uniview/conversion.php
>
> If I don't recognize the UTF-16 (which is less frequent now that I've
> stared at them so much), then I can click the "View in Uniview" which
> is above the top left text area. This will pop-up another web page
> which 99% of the time gives me a printable view of the unicode
> character.
>
> Hope it helps!
>
>
> PS: Does anyone know of a single font which is capable of drawing ALL
> unicode characters defined by unicode.org? Currently, I'm using MS
> Arial Unicode which does a halfway decent job, but it isn't complete.
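
For anyone who would rather do the same lookup offline, here is a rough Python sketch of what the converter page does for a pasted character: it lists the code point, the Unicode name, and the UTF-16 code units. The function names utf16_code_units and describe are made up for this example, and how the resulting codes are fed into Tesseract training is not covered here, since the message above does not say.

```python
import unicodedata


def utf16_code_units(text):
    """Return the UTF-16 code units of text as 4-digit uppercase hex strings."""
    # Big-endian without a BOM, so each 2-byte pair reads as one code unit.
    data = text.encode("utf-16-be")
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


def describe(text):
    """Return (code point, Unicode name, UTF-16 code units) for each character."""
    rows = []
    for ch in text:
        # Some code points (e.g. control characters) have no name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append(("U+%04X" % ord(ch), name, utf16_code_units(ch)))
    return rows


if __name__ == "__main__":
    # U+200B is invisible when pasted, which is the kind of case described above.
    for code_point, name, units in describe("A\u200b"):
        print(code_point, name, " ".join(units))
```

Characters outside the Basic Multilingual Plane come back as two code units (a surrogate pair), which matches what the "UTF-16 code units" box on the converter page shows.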

