> I would just like to know if someone could give me a tip on how to
> structure all the unicode-information in memory?
>
> All the UNIDATA does contain quite a bit of information and I can't see
> any obvious method of which is memory-efficient and gives fast access.

a) you see if there is a Unicode-friendly library you can use that
   already does this for you.

b) you write a program to parse the file and extract what your
   application needs.

With clever data encoding you can pack most of the fields of UNIDATA
into a very tight space. Long ago in the Unicode conference proceedings
somebody illustrated how they used trie structures to efficiently build
the lookup tables - the boring parts of the encoding space have shorter
branches than the areas where every codepoint is different from its
neighbour.

Geoffrey
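
P.S. To make the trie idea concrete, here is a minimal sketch of a
two-stage lookup table, which is a simple fixed-depth form of a trie. It
is not necessarily what the conference paper did. It assumes Python and
takes the property data from the standard unicodedata module rather than
parsing the UNIDATA file directly. The names build_two_stage_table,
lookup and BLOCK_SHIFT are made up for illustration, and the 256-codepoint
block size is an arbitrary choice.

```python
import unicodedata

BLOCK_SHIFT = 8                  # 256 codepoints per block (arbitrary choice)
BLOCK_SIZE = 1 << BLOCK_SHIFT
BLOCK_MASK = BLOCK_SIZE - 1
MAX_CP = 0x110000                # one past the last Unicode codepoint


def build_two_stage_table(prop):
    """Build (index, blocks, values) for a per-character property function.

    Identical blocks are stored only once, so long uniform runs
    (unassigned planes, large CJK ranges) cost one shared block.
    Assumes fewer than 256 distinct property values.
    """
    values = []      # distinct property values, referenced by small ints
    value_ids = {}
    blocks = []      # distinct blocks, each a bytes object of value ids
    block_ids = {}
    index = []       # one entry per 256-codepoint range -> block number
    for start in range(0, MAX_CP, BLOCK_SIZE):
        ids = []
        for cp in range(start, start + BLOCK_SIZE):
            value = prop(chr(cp))
            if value not in value_ids:
                value_ids[value] = len(values)
                values.append(value)
            ids.append(value_ids[value])
        block = bytes(ids)
        if block not in block_ids:
            block_ids[block] = len(blocks)
            blocks.append(block)
        index.append(block_ids[block])
    return index, blocks, values


def lookup(tables, cp):
    """Two array reads: pick the block, then the slot inside it."""
    index, blocks, values = tables
    return values[blocks[index[cp >> BLOCK_SHIFT]][cp & BLOCK_MASK]]


# Example: general category for every codepoint.
CATEGORY_TABLES = build_two_stage_table(unicodedata.category)
```

A lookup is then lookup(CATEGORY_TABLES, ord("A")). Comparing
len(CATEGORY_TABLES[1]) with len(CATEGORY_TABLES[0]) shows how many
blocks are actually stored versus how many ranges point at them.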

