[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984 ]
Jeremy Anderson commented on TIKA-1268:
---------------------------------------
This fix will break once PDFBox 2.0.0 is released and Tika upgrades to it. I may
open a new TIKA issue at some point to track the 2.0.0 upgrade, with a patch if I
implement one rather than just commenting out this code. (I'm currently building
Tika, PDFBox, and POI from daily snapshots.)
See: PDFBOX-1893.
Essentially, the org.apache.pdfbox.pdmodel.graphics.xobject package was removed
and the logic from its classes was refactored across various other classes. This
TIKA fix relies heavily on classes from that package.
> Extract images from PDF documents
> ---------------------------------
>
> Key: TIKA-1268
> URL: https://issues.apache.org/jira/browse/TIKA-1268
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Jukka Zitting
> Assignee: Jukka Zitting
> Fix For: 1.6
>
>
> It would be nice if images within PDF documents could be extracted much like
> embedded attachments are now being handled.