[ 
https://issues.apache.org/jira/browse/PDFBOX-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324002#comment-17324002
 ] 

Tim Allison edited comment on PDFBOX-5166 at 4/16/21, 6:07 PM:
---------------------------------------------------------------

Extraction only, yes...for our purposes on Tika, we wouldn't have any need to 
add or modify.  I'm ok with Tilman's example code for now, but I worry that 
we'll likely come across some required special handling that it would be better 
to have in PDFBox.  

This isn't high priority, and I don't see a need to backport to 2.x.

Separate topic...I'm wondering now if there are other annotation types that 
might conceal embedded files?


> Implement RichMedia annotation
> ------------------------------
>
>                 Key: PDFBOX-5166
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5166
>             Project: PDFBox
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: testFlashInPDF.pdf
>
>
> See TIKA-3359.  The attached file has an embedded Flash/swf file.  Tika is not 
> currently extracting the embedded file.
> In the debugger, I can see the Annotation as a PDAnnotationUnknown.  In the 
> COSDictionary, I can see the subtype is "RichMedia".  If someone has the 
> time, it'd be great to implement this so that we can extract more attachments 
> in Tika...  Obv, others may find use too. :D
> Many thanks to Tyler Thorsted for the test file and many thanks to 
> @terminalboredom and @beet_keeper.
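
For readers unfamiliar with where a RichMedia annotation keeps its payload, 
here is a minimal, library-free sketch of the dictionary walk. It is not 
PDFBox API and not Tilman's example code: plain Maps and Lists stand in for 
COS dictionaries and arrays, byte[] stands in for a stream, and the names 
RichMediaSketch, isRichMedia and extractAssets are hypothetical. The key 
names (RichMediaContent, Assets, Names, Kids, EF, F) reflect my reading of 
Adobe's RichMedia extension to ISO 32000 and should be checked against the 
spec before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model: Map = PDF dictionary, List = PDF array, byte[] = stream. */
final class RichMediaSketch {

    // Guard against malformed or cyclic /Kids chains (arbitrary limit, an assumption).
    private static final int MAX_DEPTH = 32;

    /** True when the annotation dictionary's /Subtype is RichMedia. */
    static boolean isRichMedia(Map<String, Object> annotation) {
        return "RichMedia".equals(annotation.get("Subtype"));
    }

    /** Returns asset name -> embedded bytes found under /RichMediaContent /Assets. */
    static Map<String, byte[]> extractAssets(Map<String, Object> annotation) {
        Map<String, byte[]> out = new LinkedHashMap<>();
        if (!isRichMedia(annotation)) {
            return out;
        }
        Map<String, Object> content = asDict(annotation.get("RichMediaContent"));
        if (content != null) {
            walkNameTree(asDict(content.get("Assets")), out, 0);
        }
        return out;
    }

    /** Walks a name-tree node: /Names holds [name, filespec, ...], /Kids holds child nodes. */
    private static void walkNameTree(Map<String, Object> node, Map<String, byte[]> out, int depth) {
        if (node == null || depth > MAX_DEPTH) {
            return;
        }
        Object names = node.get("Names");
        if (names instanceof List) {
            List<?> pairs = (List<?>) names;
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                Object name = pairs.get(i);
                Map<String, Object> fileSpec = asDict(pairs.get(i + 1));
                // The file specification's /EF dictionary points at the embedded stream via /F.
                Map<String, Object> ef = fileSpec == null ? null : asDict(fileSpec.get("EF"));
                Object stream = ef == null ? null : ef.get("F");
                if (name instanceof String && stream instanceof byte[]) {
                    out.put((String) name, (byte[]) stream);
                }
            }
        }
        Object kids = node.get("Kids");
        if (kids instanceof List) {
            for (Object kid : (List<?>) kids) {
                walkNameTree(asDict(kid), out, depth + 1);
            }
        }
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asDict(Object o) {
        return o instanceof Map ? (Map<String, Object>) o : null;
    }
}
```

In real code the same walk would go through the annotation's COSDictionary 
rather than a Map. Any special handling beyond this (for example, assets 
referenced only from Configurations or Views) is not covered here.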



