From: Werner LEMBERG <[EMAIL PROTECTED]>
> > My test, the copy&paste <0xFFFF is OK but fail when >0xFFFF.
> >
> > And have extra `a, a3, b, c, d, e or ff'. But I have no idea
> > of those code.
>
> Fixed. Thanks for the report. Please test again.
I also use the nobmp2.tex file to test the CJK ExtB characters.
When I copy/paste the second line of the first paragraph from the
start of the line to the colon, I get all the characters very
nicely:
CJK Unied Ideographs Extension B 的罕用字。例如:
(I suppose the missing "fi" in what should be "Unified" is a bug in
Evince, which doesn't recognize this ligature.)
Copying and pasting the characters after the colon up to the end of
the second line gives this in gnome-terminal, on a single line:
\ud840\udc21\ud840\udc22\ud840\udc23\ud840\udc24\ud840\udc25\ud840
\udc3b\ud840\udc3c\ud840\udc3d
And this in Emacs22:
����������������
����������������
���������������
When I copy the *complete* second line, from the very first character
to the last one, I get it correctly in Emacs22:
CJK Unied Ideographs Extension B 的罕用字。例如:������������������������������������������������
When I paste the third line completely, I get this in Emacs22 in one
line:
����������������
���������������
���������������
����������������
����������������
���������������
����������������
���������������
���������������
����������������
����������������
���������������
������
And the fourth line in Emacs22:
������. . . . . . 等等。
When I paste the complete first paragraph, I get it correctly in
Emacs22 (except for the A in \LaTeX and the "fi" ligature ;-):
A
這是關於 L TEX CJK 擴增 Unicode 碼位至 U+10FFFF 的測試,這些包括了
CJK Unied Ideographs Extension B 的罕用字。例如:������������������������������������������������
������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������
������. . . . . . 等等。
And this in gnome-terminal in one line:
A
\u9019\u662f\u95dc\u65bc L TEX CJK \u64f4\u589e Unicode
\u78bc\u4f4d\u81f3 U+10FFFF
\u7684\u6e2c\u8a66\uff0c\u9019\u4e9b\u5305\u62ec\u4e86CJK Unied
Ideographs Extension B\u7684\u7f55\u7528\u5b57\u3002\u4f8b\u5982
\uff1a\ud840\udc21\ud840\udc22\ud840\udc23\ud840\udc24\ud840\udc25
\ud840\udc3b\ud840\udc3c\ud840\udc3d\ud840\udc3e\ud840\udc41\ud840
\udc4d\ud840\udc5b\ud840\udc5c\ud840\udc58\ud840\udc5e\ud840\udc60
\ud840\udc81\ud840\udc83\ud840\udc77\ud840\udc85\ud840\udc95\ud840
\udc90\ud840\udc8f\ud840\udcc3\ud840\udcc2\ud840\udcd9\ud840\udcc8
\ud840\udcc7\ud840\udcec\ud840\udce6\ud840\udced\ud840\udcfb\ud840
\udcfa\ud840\udd23\ud840\udd37\ud840\udd60\ud840\udd86\ud840\uddc1
\ud840\uddc0\ud840\uddc2\ud840\ude3f\ud840\ude3e. . . . . . \u7b49\u7b49\u3002
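The \ud840\udcXX sequences in the gnome-terminal output are UTF-16 surrogate pairs. As a minimal sketch (the function name decode_escapes is made up for illustration and is not part of any tool mentioned here), the following Python combines each high/low pair back into a single code point, so one can check that the pasted escapes really land in the Extension B range starting at U+20000. It only looks at the literal \uXXXX escapes and ignores any other text or line breaks in between.

```python
import re

# Matches a literal backslash-u escape followed by four hex digits.
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def decode_escapes(text):
    """Return the code points encoded by the \\uXXXX escapes found in text.

    A high surrogate (D800-DBFF) followed by a low surrogate (DC00-DFFF)
    is combined into one supplementary-plane code point.
    """
    units = [int(h, 16) for h in ESCAPE.findall(text)]
    code_points = []
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            low = units[i + 1]
            code_points.append(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            code_points.append(u)
            i += 1
    return code_points

pasted = r"\ud840\udc21\ud840\udc22\ud840\udc23"
print([f"U+{cp:04X}" for cp in decode_escapes(pasted)])
# prints ['U+20021', 'U+20022', 'U+20023']
```

So the terminal is receiving the right characters, just spelled out as escaped UTF-16 code units instead of being rendered.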
Since there are (at least) two instances where this works, I suppose
that all the other bugs are due to either Evince, Gnome-Terminal or
the Gnome font mechanism.
Also, when I select CJK ExtB characters with the mouse, it's as if
each character consists of two half-width glyphs, which I don't
experience with the characters in the Unicode BMP.
An evince bug as well, I suppose.
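One possibly related fact, offered only as a guess at the cause: in UTF-16 a BMP character takes one 16-bit code unit, while an Extension B character takes two. Whether Evince's selection code actually works per code unit is an assumption I have not verified. A small Python sketch (the helper utf16_units is a made-up name) shows the difference:

```python
def utf16_units(ch):
    """Number of 16-bit code units needed to encode ch in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2

# U+9019 is in the BMP; U+20021 is in CJK Unified Ideographs Extension B.
for ch in ("\u9019", "\U00020021"):
    print(f"U+{ord(ch):04X}: {utf16_units(ch)} UTF-16 code unit(s)")
# prints:
# U+9019: 1 UTF-16 code unit(s)
# U+20021: 2 UTF-16 code unit(s)
```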
I guess I'll try to find out which of the three is to blame, and
report a nasty bug to them. ;)
Thanks for the fix.
Danai SAE-HAN
韓達耐
--
題目:《泊船瓜洲》
作者:王安石(1021-1086)
京口瓜洲一水間,鐘山只隔數重山。
春風又綠江南岸,明月何時照我還。
_______________________________________________
Cjk maillist - [email protected]
http://lists.ffii.org/mailman/listinfo/cjk