[ 
https://issues.apache.org/jira/browse/TIKA-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145369#comment-14145369
 ] 

Tim Allison commented on TIKA-1396:
-----------------------------------

Thank you for attaching a test file!  I'll take a look tomorrow.  Is the 
problem specific to this file, with image extraction working on your other 
PDFs, or does it fail on every PDF you have tried?

Also, have you, by chance, tried PDFBox's ExtractImages?  If not, I'll give 
that a go tomorrow...

Thank you, again.

> Embedded images in PDF documents
> --------------------------------
>
>                 Key: TIKA-1396
>                 URL: https://issues.apache.org/jira/browse/TIKA-1396
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>         Environment: *OS:* 
> Ubuntu 14.04.1 LTS
> *KERNEL:*
> 3.13.0-33-generic 
> gcc version 4.8.2
> *JAVA:*
> java version "1.8.0_11"
> Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)
>            Reporter: Damiano
>            Priority: Critical
>             Fix For: 1.6
>
>         Attachments: tika_images.pdf
>
>
> Hello!
> I just found a problem with PDF documents that have embedded images.
> Doing:
> java -jar tika-app-1.5.jar --extract tika.pdf
> Tika cannot find the image.
> Is this a PDF-related problem? When I do the same operation with a DOC 
> document, Tika finds the image correctly.


