Re: Great tool for working with unicode

74yrs old Mon, 04 May 2009 10:24:11 -0700

ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ
U+24BA U+24BB U+24C1 U+24C1 U+24CA U+24C5 U+24C8 U+24C8 U+24C9 U+24CA U+000A


E F L L U P S S T U
http://rishida.net/scripts/uniview/conversion.php

  000A     [control]
  0020     SPACE
  0045  E  LATIN CAPITAL LETTER E
  0020     SPACE
  0046  F  LATIN CAPITAL LETTER F
  0020     SPACE
  004C  L  LATIN CAPITAL LETTER L
  0020     SPACE
  004C  L  LATIN CAPITAL LETTER L
  0020     SPACE
  0055  U  LATIN CAPITAL LETTER U
  0020     SPACE
  0050  P  LATIN CAPITAL LETTER P
  0020     SPACE
  0053  S  LATIN CAPITAL LETTER S
  0020     SPACE
  0053  S  LATIN CAPITAL LETTER S
  0020     SPACE
  0054  T  LATIN CAPITAL LETTER T
  0020     SPACE
  0055  U  LATIN CAPITAL LETTER U
  000A     [control]
  000A     [control]
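
The same kind of listing can be produced locally. Below is a minimal sketch in Python using only the standard unicodedata module. The helper name describe and the "[control]" placeholder are illustrative assumptions, not part of the uniview web app or of Tesseract. It lists the code point and name of each character, then uses NFKC compatibility normalization to map the circled letters back to plain Latin capitals.

```python
import unicodedata


def describe(text):
    """Return a (code point, character, name) tuple for each character.

    'describe' is a hypothetical helper name chosen for this sketch.
    """
    rows = []
    for ch in text:
        # Characters with no name in the database (for example the
        # control character U+000A) get a placeholder instead.
        name = unicodedata.name(ch, "[control]")
        rows.append((f"U+{ord(ch):04X}", ch, name))
    return rows


circled = "ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ"

for code, ch, name in describe(circled):
    print(code, ch, name)

# NFKC applies compatibility decompositions, which turn each circled
# letter into the corresponding plain letter.
plain = unicodedata.normalize("NFKC", circled)
print(plain)
```

This only reports what the Unicode character database says about each character. Whether a given font can actually draw the glyph is a separate question that this sketch does not answer.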



On Mon, May 4, 2009 at 6:46 PM, Rob H. <[email protected]> wrote:

>
> Copy and paste the following text into the basic notepad application.
> It will show up as "little boxes".
> There's a good chance that your web browser doesn't have a unicode
> enabled font, so most of the following characters will display as
> garbage.
>
> The following characters are: circled E, circled F, circled L, circled
> L, circled U, circled P, circled S, circled S, circled T, circled U
>
> ⒺⒻⓁⓁⓊⓅⓈⓈⓉⓊ
>
> Or you can copy/paste those into the web app and view them:
> http://rishida.net/scripts/uniview/uniview.php?codepoints=24BA 24BB
> 24C1 24C1 24CA 24C5 24C8 24C8 24C9 24CA
>
>
> On May 3, 5:35 am, 74yrs old <[email protected]> wrote:
> > Thanks. Very good idea. Will you please upload a sample of the "little box"?
> >
> >
> >
> > On Sun, May 3, 2009 at 9:21 AM, Rob H. <[email protected]> wrote:
> >
> > > I'm training Tess to recognize letters/numbers/symbols/etc. used for
> > > geometrical tolerancing and annotations (ASME Standard Y14.5)
> > > A lot of the characters used in the ASME standard are coming from all
> > > over the unicode tables (although the characters/words are from the
> > > English language).
> >
> > > This is part of a data validation project and I'm using OCR as part of
> > > the process.
> > > Since OCR is not 100% accurate, some of the validation will need to be
> > > done by hand (hopefully as little as possible).
> > > If the person checking the annotation sees a "little box" (i.e. an
> > > unprintable character) then it will slow down their job.
> > > For the moment, I check unprintable characters using the webapp which
> > > I posted above.
> > > Once this goes into production, there will be a font (purchased or
> > > home-brewed) which can correctly draw all the letters/numbers/symbols/etc.
> >
> > > On May 2, 7:04 am, 74yrs old <[email protected]> wrote:
> > > > Hi Rob,
> > > > I know about conversion.php, which I have been using for a long time
> > > > for the Kannada project.
> > > > Will you kindly explain your experiment step by step, with a sample
> > > > if any? I wanted to have hands-on experience. BTW, which language
> > > > were you training?
> > > > Regards,
> > > > sriranga(76yrs old)
> >
> > > > On Sat, May 2, 2009 at 6:37 AM, Rob H. <[email protected]> wrote:
> >
> > > > > Also, I got this e-mail from someone named Albert
> > > > > =========
> > > > > Hi Rob,
> >
> > > > > Reply to your "ps"....
> >
> > > > > That doesn't make any sense to me.  You are asking for a set of
> > > > > glyphs that can represent every Unicode character in existence.  Not
> > > > > only would such a file be *HUGE* in size, but I can't see it as
> > > > > serving any purpose to anyone (other than you, I guess)...
> >
> > > > > So you should stop looking for it.
> >
> > > > > -
> > > > > Albert
> > > > > =========
> >
> > > > > Arial Unicode covers ~50K of the ~140K characters defined at
> > > > > unicode.org. This font file is 22 MB.
> > > > > Wouldn't a complete Unicode font be around 70 MB?
> >
> > > > > If you need a general text viewer which can legibly show documents
> > > > > that contain any number of the valid ~140K characters,
> > > > > then a complete font would be useful.
> >
> > > > > Great advice Albert...*roll eyes*... "stop looking"... how about
> > > > > something a little more constructive?
> > > > > Maybe you know a strategy of mixing fonts to enable an application
> > > > > to view all the possible Unicode characters?
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/tesseract-ocr?hl=en
-~----------~----~----~----~------~----~------~--~---