2010/9/20 Li Yanrui (李延瑞) <[email protected]>:
> 2010/9/10 Taco Hoekwater <[email protected]>:
>> On 09/10/2010 02:32 PM, Li Yanrui (李延瑞) wrote:
>>>
>>> 2010/9/10 Li Yanrui (李延瑞)<[email protected]>:
>>>>
>>>> Hi all,
>>>>
>>>> For the pdf file which is generated from the following example, two
>>>> Chinese characters can not be copied correctly when I use the simsun.ttc
>>>> font and use *Adobe Reader* to view it. The copied text is displayed as
>>>> the wrong unicode text such as "".
>>
>> It looks like the reader is ignoring the ToUnicode entry in the
>> bad case. No idea why, though.
>>
>
> My friend sent a mail to Ken Lunde, who works for Adobe Systems
> Incorporated. He replied:
>
> [quote]
> I forwarded your email to the Acrobat team for investigation.
> When I copied the body text from the first file, and all of the text
> from the second file, the code points are PUA, specifically Unicode
> Plane 16. The heading text of the first file appears to be encoded
> correctly. I am guessing that this is a PDF producer issue.
> I will keep you posted about what the Acrobat team discovers.
> [/quote]
>
> In the above, the first file is the tex file of the pdf file which is
> the attachment for my previous mail; the second one is that
> attachment.
>

Recently Ken Lunde replied again:

> Gu Hua forwarded your email to the Acrobat team, which investigated this issue.
>
> The evidence points toward a malformed or poorly-made ToUnicode table.
> It would be appropriate for Adobe Reader to ignore such a ToUnicode
> table.
>
> It would be useful to know how the ToUnicode table is made by this PDF
> producer. I know that when making a static ToUnicode mapping resource,
> which uses CMap resource syntax, it is very easy to make an invalid
> one, specifically because CID ranges, when expressed as two-byte
> values, cannot cross first-byte boundaries. Also, UTF-32 (and UTF-8)
> cannot be used directly, and such values must be converted to UTF-16.
> One or both of these issues, if not handled correctly, can result in
> an invalid ToUnicode table.
>

In fact, text set in fonts such as simsun.ttc cannot be copied correctly,
whereas text set in fonts such as AdobeSongStd-Light.otf can. So the
problem seems to be related to the font loading script of MkIV.

--
Best regards,

Li Yanrui (李延瑞)
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : [email protected] / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________
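
The two constraints Ken Lunde describes can be illustrated with a small Python sketch. This is only an illustration under assumptions: the helper names (split_cid_range, to_utf16_hex, bfchar_line) are hypothetical and are not taken from MkIV or from any PDF library, and nothing here claims to show how MkIV actually builds its ToUnicode table. It only shows how a two-byte CID range can be split so that it never crosses a first-byte boundary, and how a code point outside the BMP is written in UTF-16 rather than UTF-32.

```python
# Hypothetical helpers for illustration; not part of ConTeXt MkIV or any PDF library.

def split_cid_range(start, end):
    """Split [start, end] so that no piece crosses a first-byte boundary."""
    if not (0 <= start <= end <= 0xFFFF):
        raise ValueError("expected 0 <= start <= end <= 0xFFFF")
    pieces = []
    while start <= end:
        # start | 0xFF is the last two-byte code with the same first byte as start.
        stop = min(end, start | 0xFF)
        pieces.append((start, stop))
        start = stop + 1
    return pieces


def to_utf16_hex(codepoint):
    """Return the UTF-16BE encoding of a code point as upper-case hex.

    Code points above U+FFFF come out as a surrogate pair (8 hex digits).
    """
    return chr(codepoint).encode("utf-16-be").hex().upper()


def bfchar_line(cid, codepoint):
    """Format one entry in the style of a beginbfchar ... endbfchar block."""
    return "<{:04X}> <{}>".format(cid, to_utf16_hex(codepoint))


if __name__ == "__main__":
    # A range that would cross the 0x01 / 0x02 first-byte boundary.
    for lo, hi in split_cid_range(0x01F0, 0x0210):
        print("<{:04X}> <{:04X}>".format(lo, hi))
    # The CIDs below are made up; U+5B8B is in the BMP, U+20000 is not.
    print(bfchar_line(0x0102, 0x5B8B))
    print(bfchar_line(0x0103, 0x20000))
```

With these helpers, the range 01F0..0210 is emitted as two ranges (01F0..01FF and 0200..0210), and U+20000 is written as the surrogate pair D840 DC00 instead of the UTF-32 value 00020000.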
