Re: [CODE4LIB] Handling non-Unicode characters (was: Unicode persistence)

2010-05-03 Thread Jakob Voss
Hi Stuart, These have been included because they are in widespread use in a current written culture. The problems I personally have are down to characters used by a single publisher in a handful of books more than a hundred years ago. Such characters are explicitly excluded from Unicode. In

Re: [CODE4LIB] Handling non-Unicode characters (was: Unicode persistence)

2010-05-03 Thread Jonathan Rochkind
Hmm, you could in theory assign code points in Unicode's Private Use Area to the characters you need, and then have your application replace those code points with small images at rendering/display time. This seems as clean a solution as you are likely to find. Your TEI solution still requires chars-as-images
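
A minimal sketch of the approach described in this message, assuming HTML output: characters stored as Private Use Area code points (U+E000 to U+F8FF) are swapped for small inline images when the text is rendered. The mapping table, image paths, alt text, and the function name render_html are hypothetical, chosen only for illustration; nothing here comes from the thread itself.

```python
import html
import re

# Hypothetical mapping from Private Use Area code points to glyph images.
# The code points, file paths and alt texts are illustrative assumptions.
PUA_GLYPHS = {
    "\uE000": ("glyphs/e000.png", "publisher glyph 1"),
    "\uE001": ("glyphs/e001.png", "publisher glyph 2"),
}

# Matches any character in the Basic Multilingual Plane Private Use Area.
PUA_PATTERN = re.compile("[\uE000-\uF8FF]")


def render_html(text):
    """Escape text for HTML and replace mapped PUA characters with <img> tags."""
    parts = []
    pos = 0
    for match in PUA_PATTERN.finditer(text):
        # Ordinary text between PUA characters is escaped as usual.
        parts.append(html.escape(text[pos:match.start()]))
        glyph = PUA_GLYPHS.get(match.group())
        if glyph is None:
            # Unmapped PUA character: emit U+FFFD rather than guess at a glyph.
            parts.append("\uFFFD")
        else:
            src, alt = glyph
            parts.append(
                '<img class="pua-glyph" src="%s" alt="%s">'
                % (html.escape(src, quote=True), html.escape(alt, quote=True))
            )
        pos = match.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)
```

Keeping the stored text as plain Unicode means searching and indexing still work on the underlying code points; only the display layer needs to know about the images. The mapping table would have to travel with the data, since Private Use Area assignments mean nothing outside the application that defines them.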

Re: [CODE4LIB] Handling non-Unicode characters (was: Unicode persistence)

2010-05-02 Thread stuart yeates
Jakob Voss wrote: Eric Hellman wrote: May I just add here that, of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله (God willing) Stuart Yeates wrote: Sadly, yes, I agree with you on this. Do