ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ U+24BA U+24BB U+24C1 U+24C1 U+24CA U+24C5 U+24C8 U+24C8 U+24C9 U+24CA U+000A
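For anyone who wants to reproduce the listing above without the web app, here is a minimal sketch using Python's standard `unicodedata` module. The helper names `describe` and `hex_list` are made up for illustration; they are not part of Tesseract or the UniView tool. The last step assumes that compatibility normalization (NFKC) is an acceptable way to map the circled letters to plain Latin capitals.

```python
import unicodedata

text = "ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ"

def describe(s):
    """Return (code point, character name) pairs for each character in s."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in s]

def hex_list(s):
    """Return space-separated hex code points, the form used in the UniView URL."""
    return " ".join(f"{ord(ch):04X}" for ch in s)

for code_point, name in describe(text):
    print(code_point, name)

print(hex_list(text))  # 24BA 24BB 24C1 24C1 24CA 24C5 24C8 24C8 24C9 24CA

# Circled letters carry a compatibility decomposition to the plain letter,
# so NFKC normalization folds them to ordinary Latin capitals.
plain = unicodedata.normalize("NFKC", text)
print(plain)  # EFLLUPSSTU
```

Whether folding to plain letters is appropriate depends on the use case: for the ASME annotations discussed in this thread the circle is presumably meaningful, so this is only useful for display fallback or searching, not as a replacement for the original character.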
E F L L U P S S T U

http://rishida.net/scripts/uniview/conversion.php

000A [control]
0020 SPACE
0045 E LATIN CAPITAL LETTER E
0020 SPACE
0046 F LATIN CAPITAL LETTER F
0020 SPACE
004C L LATIN CAPITAL LETTER L
0020 SPACE
004C L LATIN CAPITAL LETTER L
0020 SPACE
0055 U LATIN CAPITAL LETTER U
0020 SPACE
0050 P LATIN CAPITAL LETTER P
0020 SPACE
0053 S LATIN CAPITAL LETTER S
0020 SPACE
0053 S LATIN CAPITAL LETTER S
0020 SPACE
0054 T LATIN CAPITAL LETTER T
0020 SPACE
0055 U LATIN CAPITAL LETTER U
000A [control]
000A [control]

On Mon, May 4, 2009 at 6:46 PM, Rob H. wrote:
>
> Copy and paste the following text into the basic Notepad application.
> It will show up as "little boxes".
> There's a good chance that your web browser doesn't have a
> Unicode-enabled font, so most of the following characters will display
> as garbage.
>
> The following characters are: circled E, circled F, circled L, circled
> L, circled U, circled P, circled S, circled S, circled T, circled U
>
> ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ
>
> Or you can copy/paste those into the web app and view them:
> http://rishida.net/scripts/uniview/uniview.php?codepoints=24BA 24BB 24C1 24C1 24CA 24C5 24C8 24C8 24C9 24CA
>
> On May 3, 5:35 am, 74yrs old wrote:
> > Thanks. Very good idea. Will you please upload a sample of the
> > "little box"?
> >
> > On Sun, May 3, 2009 at 9:21 AM, Rob H. wrote:
> >
> > > I'm training Tess to recognize letters/numbers/symbols/etc. used for
> > > geometrical tolerancing and annotations (ASME Standard Y14.5).
> > > A lot of the characters used in the ASME standard come from all
> > > over the Unicode tables (although the characters/words are from the
> > > English language).
> > >
> > > This is part of a data validation project and I'm using OCR as part of
> > > the process.
> > > Since OCR is not 100% accurate, some of the validation will need to be
> > > done by hand (hopefully as little as possible).
> > > If the person checking the annotation sees a "little box" (i.e. an
> > > unprintable character) then it will slow down their job.
> > > For the moment, I check unprintable characters using the web app which
> > > I posted above.
> > > Once this goes into production, there will be a font (purchased or
> > > home-brewed) which can correctly draw all the
> > > letters/numbers/symbols/etc.
> > >
> > > On May 2, 7:04 am, 74yrs old wrote:
> > > > Hi Rob,
> > > > I know about conversion.php, which I have been using for a long
> > > > time for the Kannada project.
> > > > Will you kindly explain your experiment step by step, with a
> > > > sample if any? I want to have hands-on experience. BTW, which
> > > > language were you training?
> > > > Regards,
> > > > sriranga (76yrs old)
> > > >
> > > > On Sat, May 2, 2009 at 6:37 AM, Rob H. wrote:
> > > >
> > > > > Also, I got this e-mail from someone named Albert
> > > > > =========
> > > > > Hi Rob,
> > > > >
> > > > > Reply to your "ps"....
> > > > >
> > > > > That doesn't make any sense to me. You are asking for a set of
> > > > > glyphs that can represent every Unicode character in existence.
> > > > > Not only would such a file be *HUGE* in size, but I can't see it
> > > > > as serving any purpose to anyone (other than you, I guess)...
> > > > >
> > > > > So you should stop looking for it.
> > > > >
> > > > > -
> > > > > Albert
> > > > > =========
> > > > >
> > > > > Arial Unicode covers ~50K of the ~140K characters defined at
> > > > > unicode.org. This font file is 22 MB.
> > > > > Wouldn't a complete Unicode font be around 70 MB?
> > > > >
> > > > > If you need a general text viewer which can legibly show documents
> > > > > that contain any number of the valid ~140K characters,
> > > > > then a complete font would be useful.
> > > > >
> > > > > Great advice Albert...*roll eyes*... "stop looking"... how about
> > > > > something a little more constructive?
> > > > > Maybe you know a strategy of mixing fonts to enable an
> > > > > application to view all the possible Unicode characters?
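On the "mixing fonts" question at the bottom of the thread: one common approach is font fallback, where the application keeps an ordered list of fonts and draws each character with the first font that covers it. The sketch below only illustrates the selection logic. The font names and coverage sets are hypothetical placeholders, not real fonts; in a real application the coverage would have to be read from each font file's character map.

```python
from itertools import groupby

# Hypothetical fallback chain: (font name, set of covered code points).
# These names and ranges are placeholders for illustration only.
FONTS = [
    ("BaseLatin", set(range(0x20, 0x7F))),
    ("EnclosedAlphanumerics", set(range(0x2460, 0x2500))),
]

def pick_font(ch, fonts=FONTS):
    """Return the first font in the chain that covers ch, or None."""
    code_point = ord(ch)
    for name, coverage in fonts:
        if code_point in coverage:
            return name
    return None  # no font covers it: this would render as a "little box"

def split_runs(text, fonts=FONTS):
    """Split text into consecutive runs that can share one font."""
    return [
        (font, "".join(chars))
        for font, chars in groupby(text, key=lambda ch: pick_font(ch, fonts))
    ]

for font, run in split_runs("ⒺⒻ ok"):
    print(font, repr(run))
```

Runs that come back with `None` are exactly the characters a reviewer would see as boxes, so the same function could be used to flag them before the text reaches a person.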

