[jira] [Commented] (TIKA-1268) Extract images from PDF documents

Jeremy Anderson (JIRA) Tue, 29 Apr 2014 17:00:52 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984
 ]


Jeremy Anderson commented on TIKA-1268:
---------------------------------------

This fix will break once PDFBox 2.0.0 is released and Tika upgrades to it.  I 
may open a new TIKA issue at some point to track the 2.0.0 upgrade, with a 
patch if I implement one rather than commenting out this code.  (I'm currently 
building tika, pdfbox, and poi using daily snapshots.)

See: PDFBOX-1893.

Essentially, the org.apache.pdfbox.pdmodel.graphics.xobject package was removed 
and the logic from its classes was refactored across various other classes.  
This TIKA fix relies heavily on classes from that package.
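
As a rough illustration of the breakage (a sketch, not part of the TIKA-1268 
patch), the check below probes the classpath by reflection to see which image 
API is available.  The class name PdfBoxImageApiProbe and its methods are 
hypothetical.  The two PDFBox class names are assumptions based on the 1.8.x 
line and the 2.0.0 release; a given daily snapshot may differ.

```java
public class PdfBoxImageApiProbe {
    // Image class from the xobject package (PDFBox 1.8.x); assumed name.
    static final String LEGACY_IMAGE_CLASS =
            "org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage";
    // Replacement image class (PDFBox 2.0.0 release); assumed name.
    static final String NEW_IMAGE_CLASS =
            "org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject";

    // Returns true if the named class can be found, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false,
                    PdfBoxImageApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Reports "xobject", "image", or "none" depending on what is on the classpath.
    static String detectImageApi() {
        if (isPresent(LEGACY_IMAGE_CLASS)) {
            return "xobject";
        }
        if (isPresent(NEW_IMAGE_CLASS)) {
            return "image";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println("PDFBox image API: " + detectImageApi());
    }
}
```

A probe like this only detects which API is present; code compiled against the 
xobject classes would still need to be ported to run on 2.0.0.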

> Extract images from PDF documents
> ---------------------------------
>
>                 Key: TIKA-1268
>                 URL: https://issues.apache.org/jira/browse/TIKA-1268
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 1.6
>
>
> It would be nice if images within PDF documents could be extracted much like 
> embedded attachments are now being handled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
