--- Mike Nordell <[EMAIL PROTECTED]> wrote:
> Martin Sevior wrote:
>
> > 2. 16 bit unsigned char => 32 bit unsigned char to allow 100% unicode
> > compliance.
>
> I take it you mean UTF-32 compliance here, meaning AbiWord is to only use
> UTF-32 (unicode.org) or UCS-4 (ISO/IEC 10646-1:2000)?

I've already discussed this with dom a couple of times. We need to
support the full 32-bit Unicode character set, but we shouldn't use
UTF-32 to do it: characters outside the 16-bit range are very rare, so
spending four bytes on every character would waste vast amounts of
memory. Instead we should switch to UTF-8 internally for everything.
This is the right answer for several reasons, all of which have been
covered in depth on various Unicode mailing lists and should be easy to
find with Google.

> May I ask exactly _how_ we're supposed to display
> e.g. UCS-4 on Win32? :-)

That's a different issue. Win32 can display 32-bit Unicode characters
using UTF-16, that is, UCS-2 with surrogates. Our string classes will
handle the conversion like any other encoding conversion. Some versions
of Windows may require a registry change or DLL update before characters
beyond 16 bits will display. Without it, strings won't get stomped, but
they will show the occasional "unknown character" glyph.

> You don't happen to know of a freely available
> 32-bit TrueType font (less than 12 GB in size)? :-)

This is another separate issue. Nothing says a document has to use a
single font for multiple languages.

Andrew Dunbar.

=====
http://linguaphile.sourceforge.net http://www.abisource.com
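
For what it's worth, the surrogate conversion and the memory argument
can be sketched in a few lines. This is Python rather than AbiWord's
C++, purely to keep it short, and the names to_utf16_units and
encoded_sizes are made up for illustration; they are not AbiWord
string-class APIs. Treat it as a rough sketch, not as how the string
classes will actually do it.

```python
def to_utf16_units(code_point):
    """Split one Unicode code point into its UTF-16 code units.

    Hypothetical helper for illustration only. Code points below
    0x10000 fit in one 16-bit unit; anything above is split into a
    high/low surrogate pair.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the range UTF-16 can represent")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not characters")
    if code_point < 0x10000:
        return [code_point]
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return [high, low]


def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32.

    The "-le" codec variants are used so that no byte-order mark is
    counted in the totals.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    # U+1D11E (musical G clef) lies above the 16-bit range.
    print([hex(u) for u in to_utf16_units(0x1D11E)])
    # Plain ASCII text: UTF-32 spends four bytes on every character.
    print(encoded_sizes("AbiWord"))
```

For mostly-ASCII text, UTF-8 needs one byte per character where UTF-32
needs four, which is the memory argument above. A character above the
16-bit range costs four bytes in any of the three encodings.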
