Hi,

I am having a problem when attempting to output a string containing Unicode 
characters.  If the character fits in a single byte (e.g., the Registered 
Trademark symbol, U+00AE), it is output correctly.  However, if the character 
needs two bytes (e.g., the Trademark symbol (TM), U+2122), the string is 
generated as UTF-16BE as expected, but in the output file the BOM bytes FE 
and FF and the bytes 21 and 22 are each drawn as separate single-byte 
characters.
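
To make the byte layout concrete, here is a minimal sketch in plain Java (no 
PDFBox involved).  It assumes the string is encoded as ISO-8859-1 when every 
character fits in one byte, and as big-endian UTF-16 with a leading BOM 
otherwise, which matches what I see in the generated file.  The class 
UnicodeBytesDemo and its helper methods are hypothetical names used only for 
illustration; they are not part of PDFBox.

```java
import java.nio.charset.StandardCharsets;

class UnicodeBytesDemo
{
    // Assumption for illustration: one byte per character when possible,
    // otherwise big-endian UTF-16 preceded by the FE FF byte order mark.
    static byte[] encode(String text)
    {
        boolean needsUtf16 = false;
        for (char c : text.toCharArray())
        {
            if (c > 0xFF)
            {
                needsUtf16 = true;
            }
        }
        if (!needsUtf16)
        {
            return text.getBytes(StandardCharsets.ISO_8859_1);
        }
        // When encoding, Java's UTF-16 charset writes a big-endian BOM first.
        return text.getBytes(StandardCharsets.UTF_16);
    }

    // Format bytes as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
        {
            if (sb.length() > 0)
            {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(toHex(encode("\u00AE"))); // prints "AE"
        System.out.println(toHex(encode("\u2122"))); // prints "FE FF 21 22"
    }
}
```

If a viewer reads those four bytes one at a time, it would show four separate 
glyphs instead of a single Trademark symbol, which is the symptom described 
above.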

Is there something that I need to initialize to properly handle the UTF-16 
characters (the most likely solution)?  Is it a bug in PDFBox?  Is it a quirk 
in Reader X (least likely since I have seen the TM character being displayed 
correctly in other documents)?

Any help and pointers on how to deal with this problem will be greatly 
appreciated.

I am using PDFBox 1.8.2 and Adobe Reader X (Version 10.1.8).  Here is a 
simple program that demonstrates the problem:

package example;
import java.io.*;
import org.apache.pdfbox.exceptions.*;
import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.pdmodel.edit.*;
import org.apache.pdfbox.pdmodel.font.*;
public class PDFUnicodeExample
{
    public static void main(String[] args)
    {
        PDDocument document = null;
        try
        {
            document = new PDDocument();
            PDPage page = new PDPage();
            document.addPage(page);
            PDPageContentStream cs = new PDPageContentStream(document, page);
            PDFont font = PDType1Font.HELVETICA;

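            // Case 1: U+00AE fits in a single byte and is displayed correctly.
            cs.beginText();
            cs.setFont(font, 16.0f);
            cs.moveTextPositionByAmount(100, 700);
            cs.drawString("Reg TM \u00AE ");
            cs.endText();

            // Case 2: U+2122 needs two bytes; the BOM (FE FF) and the bytes
            // 21 22 show up as separate single-byte characters.
            cs.beginText();
            cs.setFont(font, 16.0f);
            cs.moveTextPositionByAmount(100, 680);
            cs.drawString("TM \u2122 ");
            cs.endText();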
            cs.close();

            document.save("Unicode Example.pdf");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
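        catch (COSVisitorException e)
        {
            e.printStackTrace();
        }
        finally
        {
            // Release the document's resources even if saving failed.
            if (document != null)
            {
                try
                {
                    document.close();
                }
                catch (IOException e)
                {
                    e.printStackTrace();
                }
            }
        }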
    }
}

Best regards,
--Glenn Karcher


