Hi!

JFYI: They (M$) did it. There is support for surrogates at the GUI level
(TextOut, ExtTextOut) in Windows 2000.

--
-=AV=-

-----Original Message-----
From: Jean-Marc Desperrier <[EMAIL PROTECTED]>
Newsgroups: netscape.public.mozilla.i18n
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date: 29 May 2001 20:25
Subject: Support for surrogates (plane 1) under Windows 2000 and Mozilla


I have very recently read some information about the support for
surrogates and plane 1 characters under Windows 2000.

I'm not sure whether the Mozilla developers are aware of the level of
support currently available, so I am writing this message. This may lead
to an RFE bug.

There is no support for surrogates and plane 1 characters in Windows
95/98 or Windows NT 4, but there is in Windows 2000. The status of
Windows Me is not described in the documents I have seen.

The following link provides some Microsoft documentation about this:
http://msdn.microsoft.com/library/psdk/winbase/unicode_192r.htm

James Kass has created a sample TTF font (Code2001) that includes Old
Persian Cuneiform, Deseret, Tengwar, Cirth, Old Italic, and Gothic code
points in plane 1, together with the Format 12 cmap subtable that
Windows 2000 requires in order to recognise them.

Here is the page:
http://home.att.net/~jameskass/code2001.htm

You must first modify an entry in the registry to enable surrogate
support in Windows 2000, and then restart.

After doing that and installing the Code2001 font, I was able to see the
plane 1 text in the file plane1.txt that James Kass provides, under both
Wordpad and Notepad, both when the file format is UTF-16 and when it is
UTF-8 (with an initial BOM identifier).

I have not been able to display the text correctly with IE 5.0 or IE
5.5, despite making the registry change described in the Microsoft
documentation to tell IE that it should use the Code2001 font to display
this text. _But_ the display shows that the surrogates are recognised
correctly: the last line of Etruscan text, which has only 6 Etruscan
characters, displays 6 boxes under IE and not 12, as would be the case
if surrogates were not recognised.
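
To illustrate the 6-versus-12 count, here is a minimal Python sketch of
how plane 1 code points map to UTF-16 surrogate pairs. The six code
points (U+10300 to U+10305, from the Old Italic block) are picked only
for illustration and are an assumption; the real last line of
plane1.txt may contain different characters.

```python
# Six Old Italic code points, chosen for illustration only (assumption:
# not necessarily the characters on the last line of plane1.txt).
SAMPLE = "".join(chr(cp) for cp in range(0x10300, 0x10306))


def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def utf16_code_units(text):
    """Count the 16-bit code units in the UTF-16 encoding of text (no BOM)."""
    return len(text.encode("utf-16-le")) // 2


print(len(SAMPLE))                # number of characters (code points)
print(utf16_code_units(SAMPLE))   # number of UTF-16 code units
print([hex(u) for u in to_surrogate_pair(0x10300)])
```

A renderer that pairs surrogates sees one glyph slot per code point,
while one that treats each 16-bit unit as a separate character sees
twice as many, which would match the boxes and question marks counted
in these tests.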

Therefore I'm not sure if there's really something not working, or
whether I simply haven't made the required changes correctly.

When opening plane1.txt in Mozilla (release 2001052504), twelve question
marks are displayed on the last line of plane1.txt, instead of the 6
there would be if surrogates were identified.

I tried converting plane1.txt to UTF-8, changing it to HTML, and adding
a font tag for Code2001, but that changes nothing.
The last line still displays 12 question marks (the text being displayed
is only 6 UTF-8 characters, duly identified as such in UTF-8, so there
should be no risk of surrogate pairs going unidentified as in
UTF-16/UCS-2).
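
As a rough check of why UTF-8 should not have this problem, the Python
sketch below encodes six plane 1 characters as UTF-8 and reads the
sequence lengths back from the lead bytes. The code points (U+10300 to
U+10305) and the helper name utf8_sequence_lengths are assumptions made
for illustration, not the actual content of plane1.txt.

```python
# Six Old Italic code points, chosen for illustration only (assumption:
# not necessarily the characters on the last line of plane1.txt).
SAMPLE = "".join(chr(cp) for cp in range(0x10300, 0x10306))


def utf8_sequence_lengths(data):
    """Return the byte length of each UTF-8 sequence, read from lead bytes."""
    lengths = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n = 1                     # 0xxxxxxx: ASCII
        elif lead >> 5 == 0b110:
            n = 2                     # 110xxxxx
        elif lead >> 4 == 0b1110:
            n = 3                     # 1110xxxx
        elif lead >> 3 == 0b11110:
            n = 4                     # 11110xxx: supplementary planes
        else:
            raise ValueError("unexpected byte at offset %d" % i)
        lengths.append(n)
        i += n
    return lengths


encoded = SAMPLE.encode("utf-8")
print(encoded[:4].hex())               # bytes of the first character
print(utf8_sequence_lengths(encoded))  # one entry per character
```

Each plane 1 character is a single four-byte sequence in UTF-8, so a
decoder gets 6 code points directly; any split into 12 would have to
happen later, for example if the text is converted to UTF-16 internally
before display.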

But after converting the file to HTML 4 entities, I got a different
result: Mozilla displays some very strange, ugly-looking small graphics
instead of question marks.
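
For reference, here is a small Python sketch of what such a conversion
could look like, assuming "HTML 4 entities" means decimal numeric
character references (one reference per code point). The helper name
to_ncr and the sample code points are hypothetical, and this is not
necessarily how the attached files were produced.

```python
import html

# Six Old Italic code points, chosen for illustration only (assumption:
# not necessarily the characters on the last line of plane1.txt).
SAMPLE = "".join(chr(cp) for cp in range(0x10300, 0x10306))


def to_ncr(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return "".join(
        "&#%d;" % ord(ch) if ord(ch) > 0x7F else ch
        for ch in text
    )


references = to_ncr(SAMPLE)
print(references)
# Round trip through the standard library's HTML unescaping.
print(html.unescape(references) == SAMPLE)
```

A numeric character reference names the code point itself, not its
UTF-16 code units, so each plane 1 character becomes exactly one
reference.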

According to the Microsoft documentation above, the standard text
display functions (TextOut, ExtTextOut) will recognize surrogates and
display them correctly.

Therefore the result is not very successful with Mozilla.

It's a pity because, since I can force Mozilla to use Code2001 when
displaying Unicode text, I could avoid the problem I have with IE, which
is not able to detect that Code2001 is the font to use to display these
characters.

I attach to this message the HTML files I created for the tests with
Mozilla, as James Kass only makes a UTF-16 text file available for
testing (I believe binaries are allowed on this newsgroup?).

There's the original text file, and conversions of his text file to
UTF-8, UTF-8 HTML, and HTML with HTML4 entities.

plane1.zip
