[ 
https://issues.apache.org/jira/browse/TIKA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802344#comment-17802344
 ] 

Ross Johnson commented on TIKA-4175:
------------------------------------

Hi Tim, I shared a file via email.

In case it is helpful, I also investigated how and when Acrobat immediately 
shows the encrypted payload PDF instead of the wrapper document. I was 
disappointed to learn about the */Root* -> */Collection* -> */D* dictionary 
property (described in Table 155 of the PDF spec), which may contain the name 
of a file in the *EmbeddedFiles* name tree that the viewer is supposed to show 
as the initial document instead of the actual initial document. Removing or 
changing the name of this *D* property in my sample file causes Acrobat to 
show only the single "Please use Acrobat" page of the wrapper document.
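
As a rough illustration of that lookup, here is a minimal sketch in plain Java. 
It does not use the PDFBox or Tika APIs: the catalog is modelled as nested 
maps, and the class name InitialDocumentSketch, the method initialDocument, 
and the list of embedded file names are all hypothetical stand-ins for 
whatever the real parser exposes.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Tika or PDFBox. Dictionaries are modelled
// as plain maps keyed by PDF name without the leading slash (an assumption).
final class InitialDocumentSketch {

    /**
     * Follows /Root -> /Collection -> /D and returns the embedded file name
     * that a viewer would show first, if D is set and names an existing entry.
     */
    static Optional<String> initialDocument(Map<String, Object> root,
                                            List<String> embeddedFileNames) {
        Object collection = root.get("Collection");
        if (!(collection instanceof Map)) {
            return Optional.empty(); // no /Collection dictionary at all
        }
        Object d = ((Map<?, ?>) collection).get("D");
        if (!(d instanceof String)) {
            return Optional.empty(); // /D missing or not a string
        }
        String name = (String) d;
        // D only matters if it matches a name in the EmbeddedFiles name tree.
        return embeddedFileNames.contains(name) ? Optional.of(name) : Optional.empty();
    }
}
```

With *D* present and matching an embedded file name, the method returns that 
name. With *D* removed or pointing at a name that is not in the tree, it 
returns empty, which corresponds to the case where Acrobat fell back to the 
wrapper page in my sample file.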

> Additional IRM-protected PDFs should throw EncryptedDocumentException
> ---------------------------------------------------------------------
>
>                 Key: TIKA-4175
>                 URL: https://issues.apache.org/jira/browse/TIKA-4175
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Ross Johnson
>            Priority: Major
>         Attachments: image-2023-12-20-17-06-29-791.png, 
> image-2023-12-20-17-12-09-946.png
>
>
> I've come across some PDFs that use an Adobe IRM scheme, similar to 
> TIKA-4082, where a wrapper PDF contains an IRM-protected embedded PDF. These 
> wrapper PDFs do not currently throw because the structure is a bit different 
> from what PDFParser#checkEncryptedPayload() currently looks for.
> As best I can tell, this form of IRM was implemented by Adobe but is 
> licensed to third parties, who can then market it as their own form of PDF 
> protection. The documents I've seen are from an IRM product from Interlinks, 
> but there are likely very similarly protected PDFs from other products.
> Opening the wrapper PDF in Adobe Reader / Acrobat prompts for server 
> authentication (shown below). Opening it in other viewers shows the wrapper 
> splash page, which says that the viewer is not secure and to use Adobe 
> Reader. !image-2023-12-20-17-06-29-791.png!
> The wrapper PDFs I've seen use PDF version 1.4 and have a somewhat generic 
> /EmbeddedFiles dictionary:  !image-2023-12-20-17-12-09-946.png!
> The encrypted PDF payloads I've seen have a somewhat interesting /Encrypt 
> dictionary with a Filter value of "Adobe.APS".
> {code:java}
> <<
>   /EDCData (...base64 string...)
>   /CF <<
>     /DefaultCryptFilter<</CFM/AESV3/Length 256>>
>   >>
>   /PDRLLic (...base64 string...)
>   /R 65537
>   /StmF /DefaultCryptFilter
>   /Filter /Adobe.APS
>   /EncryptMetadata true
>   /V 5
>   /StrF /DefaultCryptFilter
>   /PDRLPol (...base64 string...)
>   /SubFilter /adobe.pdrl.v0
> >>
> {code}
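
For what it's worth, a check for this kind of payload could key off the 
Filter value shown in the dictionary above. The following is only a sketch in 
plain Java, with the /Encrypt dictionary modelled as a map keyed by name 
without the leading slash. The class AdobeApsSketch and the method 
looksLikeAdobeAps are hypothetical and are not what 
PDFParser#checkEncryptedPayload() currently does.

```java
import java.util.Map;

// Hypothetical helper, not part of Tika or PDFBox.
final class AdobeApsSketch {

    /**
     * Returns true when the (modelled) /Encrypt dictionary declares the
     * "Adobe.APS" filter seen in the IRM-protected payloads described above.
     */
    static boolean looksLikeAdobeAps(Map<String, Object> encrypt) {
        if (encrypt == null) {
            return false; // document has no /Encrypt dictionary
        }
        return "Adobe.APS".equals(encrypt.get("Filter"));
    }
}
```

A caller that gets true back here could then decide to throw 
EncryptedDocumentException, as the issue title suggests.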



--
This message was sent by Atlassian Jira
(v8.20.10#820010)