On 10/24/2014 9:01 AM, Michel Suignard wrote:
I know for a fact (because I did it and just verified) that the font used for
those codes uses the real UCS code. The conversion happens in the PDF embedding
magic. I could look into it, but I have no easy way to debug the Adobe Distiller
path here. Apparently, when you get off the beaten path for new characters,
the preservation of code points in copy-and-paste operations is not bulletproof.
And this is presumably true in general, and the code substitutions would
then be "random", meaning that they do not establish an alternate
encoding for exchange purposes. That is different from releasing
ASCII-hacked or PUA fonts directly, because they do establish alternate
encodings and documents in them can be exchanged if viewed with the same
fonts.
A./
Michel
-----Original Message-----
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Jukka K. Korpela
Sent: Friday, October 24, 2014 4:51 AM
To: unicode@unicode.org
Subject: Re: Code charts and code points
2014-10-24 11:17, "Martin J. Dürst" wrote:
The code charts are published as PDFs. In general, text in PDFs can be
copypasted elsewhere. Is there something in place that makes sure that
"wrong" Unicode encodings for glyphs published in code charts don't
leak elsewhere?
It seems that there isn’t. Whether this is serious is a different issue.
I tested with the arbitrarily chosen Ornamental Dingbats block, with the chart
http://www.unicode.org/charts/PDF/Unicode-7.0/U70-1F780.pdf
Opening it in Adobe Reader XI on Win 7, I was able to select the characters
with the mouse and copy and paste them to a text editor, BabelPad. It shows
most of them as just boxes, identified with the correct Unicode numbers; this
is the expected behavior when the editor has no suitable font at its disposal.
But instead of U+1F67C VERY HEAVY SOLIDUS and U+1F67D VERY HEAVY REVERSE
SOLIDUS, it shows “/” and “\”, identified as U+002F SOLIDUS and U+005C REVERSE
SOLIDUS.
So apparently the font designer had placed the glyphs as assigned to SOLIDUS
and REVERSE SOLIDUS, which is understandable. But this means that when the
characters in the code charts are copied and pasted, or otherwise accessed at
the character level, they are wrong characters.
I think it is imaginable that someone wants to copy a block of characters from
the code charts, as a handy way of getting them for inspection, e.g. for
testing how some particular software renders them using some particular
font(s). I would expect some confusion then if some of the characters you got
were the wrong characters (code points).
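A check of this kind could be scripted. The following is only a minimal sketch, assuming the pasted text and the expected characters are both available as Python strings; the helper names `describe` and `mismatches` are made up for illustration, and the sample "pasted" value simply mimics the substitution described above rather than being captured from a real PDF.

```python
import unicodedata


def describe(text):
    """Return (code point label, character name) pairs for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def mismatches(pasted, expected):
    """Return (index, expected code point, pasted code point) for each difference.

    Only the common prefix is compared, since zip() stops at the shorter string.
    """
    return [
        (i, f"U+{ord(e):04X}", f"U+{ord(p):04X}")
        for i, (p, e) in enumerate(zip(pasted, expected))
        if p != e
    ]


# Hypothetical paste result: the two heavy solidus characters arrive as ASCII.
expected = "\U0001F67C\U0001F67D"
pasted = "/\\"

for row in mismatches(pasted, expected):
    print(row)

for row in describe(pasted):
    print(row)
```

The names reported by `describe` depend on the Unicode version built into the Python in use, so characters newer than that version would show up as `<unnamed>`.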
Yucca
_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode