Christian Appl created PDFBOX-5249:
--------------------------------------

             Summary: AES128 PK decryption failure for documents with exposed 
metadata.
                 Key: PDFBOX-5249
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5249
             Project: PDFBox
          Issue Type: Improvement
          Components: Parsing, Writing
    Affects Versions: 2.0.24
            Reporter: Christian Appl
         Attachments: image-2021-07-29-15-28-15-341.png, 
image-2021-07-29-15-30-35-928.png, image-2021-07-29-15-31-14-097.png

*TL;DR:*
 I attempted to decrypt an AES128-PK secured document with exposed metadata and 
encountered an exception, even though the decryption should have worked.
 I tried to fix the cause and, along the way, tried to improve the handling of 
"EncryptMetadata" during AES en-/decryption.

*Initial Issue:*
 When creating a PDF document (for example using Adobe Document Creator) and 
selecting AES128 Public Key ("Certificate Security") encryption, one can choose 
to expose the metadata by excluding it from encryption.
 (Among other things, this sets "EncryptMetadata" to false in the encryption 
dictionary.)

Attempting to decrypt such a document with PDFBox fails with an error 
indicating that the given material (private key/certificate) could not be used 
to identify a recipient of the document. This exception occurs even if the 
material can verifiably decrypt the document in other tools.

*Debugging:*
 With AES128 (V4) encryption, exposing the metadata changes the actual 
encryption key. As documented in the PDF 1.7 specification (ISO 32000-1), 
section "7.6.4.3 Public-Key Encryption Algorithm", 4 bytes with the value 0xFF 
shall be added to the input of the SHA-1 message digest to indicate the exposed 
metadata. (The digest is then truncated to the given key length.)
As far as I could see, PublicKeySecurityHandler#prepareForDecryption() did not 
include those bytes in the message digest. 
I therefore assumed that the given decryption material was failing for that 
reason, as the document's strings and streams would most likely be encrypted 
with a key whose derivation included those bytes.
Hence I altered the method as follows:
!image-2021-07-29-15-30-35-928.png!
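
For readers without access to the screenshot, the derivation described above can be sketched roughly as follows. This is a standalone illustration using only the JDK, not the attached patch and not PDFBox code. The class name, method name and parameters are hypothetical, and the sketch assumes the digest input is the decrypted seed, followed by the recipient blobs, followed by the optional 0xFF bytes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only (not part of PDFBox).
final class PublicKeyKeyDerivationSketch {

    private PublicKeyKeyDerivationSketch() {
    }

    /**
     * SHA-1 over the seed, every recipient blob and, only when metadata is
     * left unencrypted, four 0xFF bytes. The digest is truncated to the key
     * length (16 bytes for AES128).
     */
    static byte[] deriveKey(byte[] seed, List<byte[]> recipientBlobs,
                            boolean encryptMetadata, int keyLengthBytes) {
        if (keyLengthBytes < 1 || keyLengthBytes > 20) {
            throw new IllegalArgumentException("SHA-1 yields at most 20 bytes");
        }
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        sha1.update(seed);
        for (byte[] blob : recipientBlobs) {
            sha1.update(blob);
        }
        if (!encryptMetadata) {
            // Marker for exposed metadata; omitting it yields a different key.
            sha1.update(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF});
        }
        return Arrays.copyOf(sha1.digest(), keyLengthBytes);
    }
}
```

With the same seed and recipients, the two values of `encryptMetadata` produce different keys, which would be consistent with the recipient lookup failing when the marker bytes are left out.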

Also: The PDFBox implementation assumes (rightly so, and according to the 
specification) that the document's encryption dictionary holds an 
"EncryptMetadata" entry that easily identifies whether metadata is exposed in a 
given document.
But the document I produced instead contained that entry as part of the 
"DefaultCryptFilter":
!image-2021-07-29-15-28-15-341.png!
I was unable to find this behaviour documented in the reference manual, but 
decided that I would prefer it to be supported anyway, which resulted in 
further changes:
!image-2021-07-29-15-31-14-097.png!
I succeeded: the document could now be decrypted using the given material.
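
The fallback lookup described above could look roughly like the following sketch. It models PDF dictionaries as plain `java.util.Map` instances instead of PDFBox's COS classes, so every name in it is hypothetical. It also assumes that an entry in the encryption dictionary takes precedence over one in the crypt filter, and that a missing entry means `true`. The actual patch may decide these points differently.

```java
import java.util.Map;

// Hypothetical stand-in for the dictionary lookup (not PDFBox code).
final class EncryptMetadataLookupSketch {

    private EncryptMetadataLookupSketch() {
    }

    /**
     * Reads "EncryptMetadata" from the encryption dictionary and falls back
     * to CF -> DefaultCryptFilter -> EncryptMetadata if it is absent there.
     */
    static boolean isEncryptMetadata(Map<String, Object> encryptDict) {
        Object direct = encryptDict.get("EncryptMetadata");
        if (direct instanceof Boolean) {
            return (Boolean) direct;
        }
        Object cryptFilters = encryptDict.get("CF");
        if (cryptFilters instanceof Map) {
            Object filter = ((Map<?, ?>) cryptFilters).get("DefaultCryptFilter");
            if (filter instanceof Map) {
                Object nested = ((Map<?, ?>) filter).get("EncryptMetadata");
                if (nested instanceof Boolean) {
                    return (Boolean) nested;
                }
            }
        }
        // Assumed default when the entry is absent everywhere: encrypted.
        return true;
    }
}
```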

*Follow-up Questions:*
While examining the classes "StandardSecurityHandler" and 
"PublicKeySecurityHandler" I found comments mentioning or solving some issues 
concerning the "EncryptMetadata" flag and its consequences for other 
structures.
The question was whether I could implement that feature for V4 and V5 
encryption. I did so, resulting in the patch you will find in the 
attachments.

*Caveat:*
I'm aware that some of these changes could be problematic for reasons that I 
cannot see yet (especially those made to "COSBase"(!) and 
"PDCryptFilterDictionary"). Those changes are naive and were made with the aim 
of reaching the given goal as directly and as fast as possible.
I would treat this patch as a mere suggestion rather than something that can 
or should be merged directly and without inspection.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
