Christian Appl created PDFBOX-5249:
--------------------------------------
Summary: AES128 PK decryption failure for documents with exposed
metadata.
Key: PDFBOX-5249
URL: https://issues.apache.org/jira/browse/PDFBOX-5249
Project: PDFBox
Issue Type: Improvement
Components: Parsing, Writing
Affects Versions: 2.0.24
Reporter: Christian Appl
Attachments: image-2021-07-29-15-28-15-341.png,
image-2021-07-29-15-30-35-928.png, image-2021-07-29-15-31-14-097.png
*TL;DR:*
I attempted to decrypt an AES128-PK secured document with exposed metadata and
encountered an exception, even though the decryption should have worked.
I tried to fix the cause and, along the way, to improve the handling of
"EncryptMetadata" during AES en-/decryption.
*Initial Issue:*
When creating a PDF document (for example using Adobe Document Creator) and
selecting AES128 Public Key ("Certificate Security") encryption, one can choose
to expose the metadata by excluding it from encryption (by setting the
"EncryptMetadata" entry of the encryption dictionary to false, and so forth).
Attempting to decrypt such a document with PDFBox fails with an error
indicating that the given material (private key/certificate) could not be used
to identify a recipient of that document. This exception occurs even if the
material is verifiably usable to decrypt the document in other tools.
*Debugging:*
AES128 (V4) encryption changes the actual encryption key when metadata is
exposed. As documented in the PDF 1.7 reference manual, "7.6.4.3 Public-Key
Encryption Algorithm", the data from which such an encryption key is derived
shall include 4 bytes with the value 0xFF, indicating the exposed metadata.
(These bytes are included in the SHA-1 message digest, which is then truncated
to the given key length.)
As far as I could see, PublicKeySecurityHandler#prepareForDecryption() did not
include those bytes in the message digest.
I therefore had to assume that the given decryption material was failing for
that reason, as the document's strings and streams would most likely have been
encrypted with a key that included those bytes.
Hence I altered the method as follows:
!image-2021-07-29-15-30-35-928.png!
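For readers without access to the screenshot, the key derivation described
above can be sketched roughly as follows. This is only a minimal, standalone
illustration of the digest computation as I understand it from the
specification, not the attached patch and not PDFBox code. The class name, the
method name and the parameters are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only (not part of PDFBox).
final class PublicKeyDerivationSketch {

    /**
     * SHA-1 over the seed, then each recipient entry in array order, then
     * (only if metadata is left unencrypted) four 0xFF bytes. The digest is
     * truncated to keyLengthBits / 8 bytes.
     */
    static byte[] deriveKey(byte[] seed, List<byte[]> recipients,
                            boolean encryptMetadata, int keyLengthBits) {
        if (keyLengthBits <= 0 || keyLengthBits % 8 != 0 || keyLengthBits > 160) {
            // A SHA-1 digest is 20 bytes, so at most 160 bits can be taken from it.
            throw new IllegalArgumentException("unsupported key length: " + keyLengthBits);
        }
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        sha1.update(seed);
        for (byte[] recipient : recipients) {
            sha1.update(recipient);
        }
        if (!encryptMetadata) {
            // Marker for exposed (plaintext) metadata.
            sha1.update(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF});
        }
        return Arrays.copyOf(sha1.digest(), keyLengthBits / 8);
    }
}
```

With identical seed and recipients, the keys derived for encryptMetadata = true
and encryptMetadata = false differ, which would explain why a digest computed
without the marker bytes cannot decrypt such a document.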
Also: the PDFBox implementation assumes (rightly, and according to the
specification) that the document's encryption dictionary holds an
"EncryptMetadata" entry, which identifies whether metadata is exposed in a
given document.
However, the document I produced instead contained this entry as part of the
"DefaultCryptFilter":
!image-2021-07-29-15-28-15-341.png!
I was unable to find this behaviour documented in the reference manual, but
decided that I would prefer it to be supported anyway, which resulted in
further changes:
!image-2021-07-29-15-31-14-097.png!
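The lookup I have in mind can be sketched as follows. This is a simplified
illustration that uses plain maps in place of the PDFBox dictionary classes; it
is not the attached patch. The class and method names are hypothetical, and
the order of precedence (encryption dictionary first, then the
"DefaultCryptFilter") is my own assumption. A missing entry is treated as true.

```java
import java.util.Map;

// Hypothetical helper, for illustration only (not part of PDFBox).
final class EncryptMetadataLookupSketch {

    /**
     * Resolves "EncryptMetadata" from the encryption dictionary and, if it is
     * absent there, from the default crypt filter dictionary (may be null).
     */
    static boolean isEncryptMetadata(Map<String, Object> encryptionDict,
                                     Map<String, Object> defaultCryptFilter) {
        Object value = encryptionDict.get("EncryptMetadata");
        if (!(value instanceof Boolean) && defaultCryptFilter != null) {
            // Fallback: some producers store the flag in the crypt filter instead.
            value = defaultCryptFilter.get("EncryptMetadata");
        }
        // Absent or non-boolean entry: metadata is treated as encrypted.
        return value instanceof Boolean ? (Boolean) value : true;
    }
}
```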
I succeeded: the document could now be decrypted using the given material.
*Follow-up Questions:*
While examining the classes "StandardSecurityHandler" and
"PublicKeySecurityHandler" I found comments mentioning or solving some issues
concerning the "EncryptMetadata" flag and its consequences for other
structures.
The question was whether I could implement that feature for V4 and V5
encryption. I did so, resulting in the patch you will find in the
attachments.
*Caveat:*
I'm aware that some of these changes could be problematic for reasons that I
cannot yet see (especially those made to "COSBase"(!) and
"PDCryptFilterDictionary"). The changes are naive and were made with the aim of
reaching the given goal as directly and quickly as possible.
This patch should therefore be seen as a suggestion rather than something that
can or should be merged directly and without inspection.