Re: going past the bmp

Owen Taylor Thu, 28 Nov 2002 12:58:40 -0800

[EMAIL PROTECTED] writes:

> > from the beginning with this in mind.  I think most Linux programs don't
> > suffer from what used to be a common misunderstanding(hope it's not any
> > more) that Unicode is a 16bit character set unless they rely on old
> ....
> 
>   What I wrote above is true in principle, but in reality, it appears
> to be a different story. I've just made a file with three non-BMP
> characters(U+10331, U+10332, U+10333. Gothic characters) with Yudit
> (Yudit works well for this purpose. You can use 'Uxxxxyyyy' to enter
> non-BMP characters with 'Unicode' keymap selected.) and read in that
> file from gedit (with its font set to CODE2001). gedit doesn't show
> anything although judging from the movement of the cursor, it certainly
> knows there are characters where nothing is displayed.


The path to adding full beyond-the-BMP support to Pango is 
pretty straightforward. (I'm a little surprised that it doesn't
sort of work now for TrueType fonts, but I haven't tested
it at all.)

 - Extend the property tables in GLib to cover all of Unicode-3.2
   (requires some table format revisions, the only hard part)
 - Upgrade Pango to the latest FriBidi version, which I believe
   covers the non-BMP portions of Unicode.
 - Extend the hex-square drawing code for unknown characters
   to cover 6-digit hex-rectangles.
 - Test (there may be FreeType problems or...)
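
To make the hex-rectangle item concrete, here is a rough sketch of
the digit layout only. It is not Pango's actual drawing code; the
name hex_box_rows and the two-row layout (2x2 digits for the BMP,
3x2 beyond it) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from Pango: split the hex form of a
 * code point into two rows of digits for an "unknown character" box.
 * BMP code points (<= 0xFFFF) give two rows of 2 digits; anything
 * above gives two rows of 3 digits. Returns the digits per row. */
int hex_box_rows(uint32_t cp, char rows[2][4])
{
    char digits[7];
    int per_row;

    assert(cp <= 0x10FFFF);   /* caller must stay inside Unicode */

    per_row = (cp <= 0xFFFF) ? 2 : 3;

    /* Zero-padded upper-case hex, 4 or 6 digits wide. */
    snprintf(digits, sizeof digits, "%0*X", per_row * 2, (unsigned int)cp);

    /* First half goes to the top row, second half to the bottom row. */
    memcpy(rows[0], digits, (size_t)per_row);
    rows[0][per_row] = '\0';
    memcpy(rows[1], digits + per_row, (size_t)per_row);
    rows[1][per_row] = '\0';

    return per_row;
}
```

For U+10331 this yields the rows "010" and "331"; for U+0041 it
yields "00" and "41". Actual rendering of those digits is left out.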

I believe it's maybe a week of work total. Unfortunately, it is still
reasonably far from the top of my queue; there is not a lot of
practical demand for it. It would be a neat demo of the comprehensive
use of UTF-8, though.
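
As a small illustration of why UTF-8 needs nothing special past the
BMP, here is a minimal standalone encoder. The name utf8_encode is
made up for this sketch; in real code GLib's g_unichar_to_utf8 would
be the natural choice. The Gothic characters from the quoted test
file simply come out as four-byte sequences.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out.
 * Returns the number of bytes written, or 0 for surrogates and for
 * values above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;             /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* Everything beyond the BMP takes exactly four bytes. */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode(0x10331, buf) writes the bytes F0 90 8C B1 and
returns 4.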

Regards,
                                        Owen

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
