The PDF reference doesn't define any encoding, just bytes. In a German 
locale Acrobat will encode as cp1252, and in a Japanese one maybe SJIS. iText 
uses 8859-1 because it could use anything. getISOBytes() is used to quickly 
convert chars that are known to be in range; it's not a generic converter. 
For that there are other methods in Java and in PdfEncodings.
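
A minimal sketch of that distinction, using only the standard Java charset API (the PdfEncodings methods are not shown here). The class name IsoBytesSketch and the helper castBytes() are made up for illustration; castBytes() just repeats the cast loop from getISOBytes().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class IsoBytesSketch {
    // Illustrative copy of the cast-based loop; not the iText method itself.
    static byte[] castBytes(String text) {
        byte[] b = new byte[text.length()];
        for (int k = 0; k < b.length; ++k)
            b[k] = (byte) text.charAt(k);
        return b;
    }

    public static void main(String[] args) {
        String inRange = "Stra\u00dfe";   // every char is <= U+00FF
        String outOfRange = "\u2345";     // not representable in ISO-8859-1

        // In range: the cast and the real ISO-8859-1 encoder agree.
        byte[] encoded = inRange.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(castBytes(inRange), encoded)); // true

        // Out of range: the cast keeps the low byte (0x45), while the
        // encoder substitutes its replacement byte '?' (0x3F).
        System.out.println(castBytes(outOfRange)[0]);                              // 69
        System.out.println(outOfRange.getBytes(StandardCharsets.ISO_8859_1)[0]);   // 63
    }
}
```

So the two only differ for input outside the 0..255 range, which is the case the quick conversion is not meant for.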

Paulo

----- Original Message ----- 
From: "Michael Bell" <[email protected]>
To: <[email protected]>
Sent: Saturday, August 29, 2009 7:46 AM
Subject: [iText-questions] PDF Encryption


1. The javadocs don't specify the character encoding expected by the 
byte[] signature of PdfEncryptor.encrypt. Of course a byte array without a 
character encoding makes no sense. What are the acceptable character 
encodings (see #2)? This should be corrected.
2. Cracking open the code, I see that the String method calls getISOBytes():

public static void encrypt(PdfReader reader, OutputStream os, int type,
        String userPassword, String ownerPassword, int permissions)
        throws DocumentException, IOException {
    PdfStamper stamper = new PdfStamper(reader, os);
    stamper.setEncryption(type, userPassword, ownerPassword, permissions);
    stamper.close();
}

which leads to (PdfStamper class):

public void setEncryption(boolean strength, String userPassword,
        String ownerPassword, int permissions) throws DocumentException {
    setEncryption(DocWriter.getISOBytes(userPassword),
            DocWriter.getISOBytes(ownerPassword), permissions, strength);
}
...


Here is the source of getISOBytes()


    /** Converts a <CODE>String</CODE> into a <CODE>Byte</CODE> array
     * according to the ISO-8859-1 codepage.
     * @param text the text to be converted
     * @return the conversion result
     */
    public static final byte[] getISOBytes(String text)
    {
        if (text == null)
            return null;
        int len = text.length();
        byte b[] = new byte[len];
        for (int k = 0; k < len; ++k)
            b[k] = (byte)text.charAt(k);
        return b;
    }


a. The javadoc comment is wrong. This does not guarantee an ISO-8859-1 
conversion, it merely tries to convert to a SINGLE byte representation of 
the characters. That MIGHT be ISO-8859-1. It might not be - there are tons 
of single byte character sets.
b. If you feed it characters above U+00FF, it keeps only the low byte of 
each 16-bit char and drops the high byte (as a side effect it will of course 
corrupt surrogate pairs horribly):

public static void main(String[] args) {
    String a = "\u2345";
    System.out.println(getISOBytes(a).length); // prints 1
    System.out.println(convertToHex(getISOBytes(a))); // prints x45
}
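
A self-contained version of that test might look like the sketch below. The class name GetIsoBytesDemo and the toHex() helper are made up here (toHex() stands in for convertToHex(), which is not shown in this message); getISOBytes() is copied from the source quoted above.

```java
public class GetIsoBytesDemo {
    // Same loop as the getISOBytes() source quoted above.
    static byte[] getISOBytes(String text) {
        if (text == null)
            return null;
        int len = text.length();
        byte[] b = new byte[len];
        for (int k = 0; k < len; ++k)
            b[k] = (byte) text.charAt(k);
        return b;
    }

    // Hypothetical stand-in for convertToHex(): two lowercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    public static void main(String[] args) {
        // A char inside 0..255 survives unchanged.
        System.out.println(toHex(getISOBytes("\u00e9")));   // e9

        // A char above U+00FF loses its high byte: 0x2345 becomes 0x45.
        System.out.println(toHex(getISOBytes("\u2345")));   // 45

        // U+1D11E is stored as the surrogate pair D834 DD1E;
        // only the low byte of each half is kept.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(toHex(getISOBytes(clef)));       // 341e
    }
}
```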




Am I missing something? I'm concerned not just in the context of a password, 
but because the iText source code has getISOBytes() all over the place 
(perhaps correctly so in some or most cases; I've not checked, I just did a 
quick search).


