[ https://issues.apache.org/jira/browse/PDFBOX-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324002#comment-17324002 ]
Tim Allison edited comment on PDFBOX-5166 at 4/16/21, 6:07 PM:
---------------------------------------------------------------
Extraction only, yes...for our purposes on Tika, we wouldn't have any need to
add or modify. I'm ok with Tilman's example code for now, but I worry that
we'll likely come across some required special handling that it would be better
to have in PDFBox.
This isn't high priority, and I don't see a need to backport to 2.x.
Separate topic...I'm wondering now if there are other annotation types that
might conceal embedded files?
> Implement RichMedia annotation
> ------------------------------
>
> Key: PDFBOX-5166
> URL: https://issues.apache.org/jira/browse/PDFBOX-5166
> Project: PDFBox
> Issue Type: New Feature
> Reporter: Tim Allison
> Priority: Minor
> Attachments: testFlashInPDF.pdf
>
>
> See TIKA-3359. The attached file has an embedded Flash/swf file. Tika is not
> currently extracting the embedded file.
> In the debugger, I can see the Annotation as a PDAnnotationUnknown. In the
> COSDictionary, I can see the subtype is "RichMedia". If someone has the
> time, it'd be great to implement this so that we can extract more attachments
> in Tika... Obv, others may find use too. :D
> Many thanks to Tyler Thorsted for the test file and many thanks to
> @terminalboredom and @beet_keeper.