[ https://issues.apache.org/jira/browse/TIKA-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-1297.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.6
> Images not being extracted from PDFs
> ------------------------------------
>
> Key: TIKA-1297
> URL: https://issues.apache.org/jira/browse/TIKA-1297
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.5
> Reporter: James Baker
> Fix For: 1.6
>
>
> Images embedded within PDF documents are not being extracted by Tika. I have
> tested this via the command line (where the -z option fails to extract any
> images), and by inspecting the XHTML version of the PDF produced by Tika
> (where the image tags are not included in the output).
> The images are extractable by PDFBox, so Tika should be able to extract them
> and include them in the XHTML output.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)