One of the Dublin papers talks about how this is done in ICU:
http://www.unicode.org/iuc/iuc21/a347.html

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Geoffrey Waigh" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, April 21, 2002 03:28
Subject: Re: unidata is big


> > I would just like to know if someone could give me a tip on how to
> > structure all the Unicode information in memory?
> >
> > All the UNIDATA does contain quite a bit of information and I can't see
> > any obvious method which is memory-efficient and gives fast access.
>
> a) you see if there is a Unicode-friendly library you can use that already
> does this for you.
>
> b) you write a program to parse the file and extract what your application
> needs. With clever data encoding you can pack most of the fields of
> UNIDATA into a very tight space.  Long ago in the Unicode conference
> proceedings somebody illustrated how they used trie structures to
> efficiently build the lookup tables - the boring parts of the encoding
> space have shorter branches than the areas where every codepoint is
> different from its neighbour.
>
> Geoffrey
>
>
>
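
To make option (b) above concrete, here is a minimal Python sketch of the
simplest form of the trie idea: a two-stage table in which identical
256-entry blocks are stored only once, so the "boring" stretches of the
code space cost almost nothing. It is only an illustration under stated
assumptions, not ICU's actual data structure. The function names, the
block size, and the choice of the General_Category field are all
hypothetical, and the sample lines merely follow the UnicodeData.txt
layout (semicolon-separated fields, with <..., First>/<..., Last> pairs
marking ranges).

```python
# Sketch only: a two-stage lookup table for one property (General_Category).
# All names here (parse_categories, build_two_stage, lookup) are hypothetical.

BLOCK_SHIFT = 8                  # assumed block size: 256 code points
BLOCK_SIZE = 1 << BLOCK_SHIFT
MAX_CP = 0x110000                # one past the last Unicode code point

# A few lines in UnicodeData.txt format, standing in for the real file.
SAMPLE = """\
0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FA5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
"""

def parse_categories(text):
    """Yield (first, last, category) for each entry or First/Last range."""
    range_start = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp, name, cat = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            range_start = cp
        elif name.endswith(", Last>"):
            yield range_start, cp, cat
            range_start = None
        else:
            yield cp, cp, cat

def build_two_stage(ranges, default="Cn"):
    """Return (stage1, blocks, values); identical blocks are shared."""
    values = [default]            # small table of distinct property values
    value_index = {default: 0}
    flat = bytearray(MAX_CP)      # temporary: one value index per code point
    for first, last, cat in ranges:
        if cat not in value_index:
            value_index[cat] = len(values)
            values.append(cat)
        flat[first:last + 1] = bytes([value_index[cat]]) * (last - first + 1)

    stage1, blocks, seen = [], [], {}
    for start in range(0, MAX_CP, BLOCK_SIZE):
        block = bytes(flat[start:start + BLOCK_SIZE])
        if block not in seen:     # store each distinct block only once
            seen[block] = len(blocks)
            blocks.append(block)
        stage1.append(seen[block])
    return stage1, blocks, values

def lookup(tables, cp):
    """Two array indexings plus one small-table lookup per code point."""
    stage1, blocks, values = tables
    block = blocks[stage1[cp >> BLOCK_SHIFT]]
    return values[block[cp & (BLOCK_SIZE - 1)]]

tables = build_two_stage(parse_categories(SAMPLE))
print(lookup(tables, 0x41), lookup(tables, 0x5000), lookup(tables, 0x10FFFF))
```

With the full data file the same approach would presumably be repeated
per property, or several small fields would be packed into one integer
per code point before the blocks are deduplicated.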

