The cause is a set of "gaps" in the PDFDocEncoding specification that the implementation missed. I'll create an issue later.

Tilman

On 10.07.2017 at 19:22, Andrea Vacondio wrote:
Hi, we came across a case where we are basically cloning outline items
and the original outline title is a UTF-16BE encoded text string
containing the value 00A0 (non-breaking space). We later use the string to
assign the title of a new outline item, and the A0 is recognised as a € sign.
Here is a simple test:

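    // UTF-16BE text string: BOM (FEFF), "Chapter", then 00A0
    COSString victim = COSString
            .parseHex("FEFF004300680061007000740065007200A0");
    PDOutlineItem node = new PDOutlineItem();
    // the title is re-encoded when it is stored in the node dictionary
    node.setTitle(victim.getString());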

If you look at the node dictionary you'll see that the title value is
Chapter€
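
The symptom can be reproduced without PDFBox. The following is only a minimal sketch, not the PDFBox implementation: the class TextStringEncodingSketch and all of its methods are hypothetical names, and the table is reduced to the single entry that matters here, namely that byte A0 in PDFDocEncoding is the euro sign (all other bytes are treated as identity, which the real table does not do). A "naive" rule that assumes every character below 256 is its own byte writes U+00A0 as A0, which then reads back as €. A "strict" rule that only accepts characters some byte actually decodes to falls back to UTF-16BE with a BOM instead.

```java
import java.nio.charset.StandardCharsets;

public class TextStringEncodingSketch {

    // Simplified model: only the A0 -> U+20AC (euro) entry is reproduced.
    // Every other byte is treated as identity, unlike the real table.
    static char decode(int code) {
        return code == 0xA0 ? '\u20AC' : (char) code;
    }

    // Returns the byte code for c, or -1 if no byte decodes back to c.
    static int strictCodeFor(char c) {
        for (int code = 0; code < 256; code++) {
            if (decode(code) == c) {
                return code;
            }
        }
        return -1;
    }

    // Naive rule that reproduces the symptom: any char below 256 is its own code.
    static int naiveCodeFor(char c) {
        return c < 256 ? c : -1;
    }

    // Encodes as single bytes when every char has a code, else UTF-16BE with BOM.
    static byte[] encode(String text, boolean strict) {
        byte[] single = new byte[text.length()];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int code = strict ? strictCodeFor(c) : naiveCodeFor(c);
            if (code < 0) {
                byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
                byte[] out = new byte[utf16.length + 2];
                out[0] = (byte) 0xFE;
                out[1] = (byte) 0xFF;
                System.arraycopy(utf16, 0, out, 2, utf16.length);
                return out;
            }
            single[i] = (byte) code;
        }
        return single;
    }

    // Reads the bytes back: UTF-16BE if a BOM is present, else the single-byte table.
    static String decodeBytes(byte[] bytes) {
        if (bytes.length >= 2 && (bytes[0] & 0xFF) == 0xFE && (bytes[1] & 0xFF) == 0xFF) {
            return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(decode(b & 0xFF));
        }
        return sb.toString();
    }
}
```

With the title from the test above, encode("Chapter\u00A0", false) followed by decodeBytes gives the string ending in €, while encode("Chapter\u00A0", true) keeps the non-breaking space because the string is written as UTF-16BE.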



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
