On 14/08/2003 11:44, Jim Allan wrote:

Peter Kirk posted:

The documentation is great, but I have had some problems copying text
from it (with Acrobat Reader 5), in particular with text in small
capitals e.g. Unicode character names. For example, I get the following
from p.44:

The sequence of Unicode characters U+0061 “a” 
   + U+0308 “!”  + U+0075 “u”  

 unambiguously encodes “äu” not “aü”.


This came out perfectly on my Windows 98 system as browsed by me in the Unicode list archives through Mozilla 1.3 and also after I pasted it into the Mozilla Compose window as quoted text.

The characters, small capital or others, are displayed with no problems.


Jim Allan





What seems to be happening in Windows 2000 is that the text on the clipboard is made up of PUA character codes U+F7XX, where XX seems to be the corresponding ASCII code. For example, small caps "LATIN" comes out as F76C F761 F774 F769 F76E. At some point Windows 98 simply strips off the F7 prefixes, giving you the correct text. But Windows 2000, which is Unicode-based, keeps the full PUA code points. In my Mozilla 1.4 these are rendered as strange combinations of base characters with combining marks: "LATIN" appears on my screen (in Mozilla mail and when browsing the archives with Mozilla) as N diaeresis M macron o vertical-line-below n macron o acute dot-below. When I browse the archives in IE6 or paste the text into Word, I get square boxes.
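
To make the suspected mapping concrete, here is a rough Python sketch of the "strip off the F7" step. It only illustrates the pattern described above and is not what Windows 98 or Acrobat actually does internally. The function name strip_pua_prefix is made up for this example, and limiting the range to U+F720..U+F77E (printable ASCII) is my assumption.

```python
def strip_pua_prefix(text: str) -> str:
    """Map each code point U+F7XX back to U+00XX; leave everything else alone.

    Assumption: only XX in the printable ASCII range (0x20..0x7E) is remapped.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xF720 <= cp <= 0xF77E:
            out.append(chr(cp & 0xFF))  # drop the F7 high byte
        else:
            out.append(ch)
    return "".join(out)


# The five code points reported above for small caps "LATIN".
clipboard = "\uf76c\uf761\uf774\uf769\uf76e"
print(strip_pua_prefix(clipboard))  # prints "latin"
```

Note that the low bytes 6C 61 74 69 6E are the lowercase ASCII letters, so a plain strip gives "latin" rather than "LATIN".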

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




