[ 
https://issues.apache.org/jira/browse/TIKA-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350521#comment-17350521
 ] 

Tim Allison commented on TIKA-3416:
-----------------------------------

This overlaps somewhat with TIKA-3348 but is distinct.

> Extract logical images from PDFs
> --------------------------------
>
>                 Key: TIKA-3416
>                 URL: https://issues.apache.org/jira/browse/TIKA-3416
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Major
>
> PDFs, bless their hearts, can store a logical image as hundreds or thousands
> of subimages that, when rendered, look like one image.
> We currently have the option to let the user render the page and run OCR on 
> that rendered image, or the user can extract inline images.  There has to be 
> a happier medium, and the user should get back the rendering in, e.g., the 
> /unpack endpoint (see TIKA-3348).
> It would be handy for some use cases to do the geometry to find bounding 
> boxes for image components and then render those bounding boxes so that a 
> human gets a "logical image" <hand_waving>most of the time</hand_waving>.
> There would have to be some heuristics for when to give up and just render 
> the whole page, but I think we could do something that performs well enough.
> More importantly, I'm sure this is a solved problem...any recommendations for
> efficient algorithms for this?
> What do you think?
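
One simple way to do the geometry described above, offered only as a rough sketch and not as anything Tika currently implements: treat each subimage's bounding box as a rectangle, repeatedly union any two rectangles that overlap or sit within a small gap of each other, and fall back to rendering the whole page if too many regions remain. The class name, the method names, the `gap` tolerance, and the `maxRegions` cutoff are all hypothetical; collecting the boxes from the PDF itself is assumed to happen elsewhere.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Tika or PDFBox.
final class LogicalImageBoxes {

    // Merge boxes that overlap or lie within `gap` units of each other.
    // Naive repeated pairwise merging: fine for a sketch, but quadratic or
    // worse, so a spatial index would be needed for thousands of tiles.
    static List<Rectangle2D> merge(List<Rectangle2D> boxes, double gap) {
        List<Rectangle2D> merged = new ArrayList<>();
        for (Rectangle2D b : boxes) {
            merged.add((Rectangle2D) b.clone()); // leave the caller's boxes untouched
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < merged.size() && !changed; i++) {
                for (int j = i + 1; j < merged.size(); j++) {
                    if (near(merged.get(i), merged.get(j), gap)) {
                        // Replace box i with the union and drop box j,
                        // then restart the scan because box i has grown.
                        merged.set(i, merged.get(i).createUnion(merged.get(j)));
                        merged.remove(j);
                        changed = true;
                        break;
                    }
                }
            }
        }
        return merged;
    }

    // True if the two boxes intersect once each is padded by `gap`.
    private static boolean near(Rectangle2D a, Rectangle2D b, double gap) {
        return a.getMinX() - gap <= b.getMaxX() && b.getMinX() - gap <= a.getMaxX()
            && a.getMinY() - gap <= b.getMaxY() && b.getMinY() - gap <= a.getMaxY();
    }

    // Assumed give-up heuristic: too many separate regions means the merge
    // did not find a small set of logical images, so render the whole page.
    static boolean shouldRenderWholePage(List<Rectangle2D> merged, int maxRegions) {
        return merged.size() > maxRegions;
    }
}
```

Each rectangle returned by `merge` would then be a candidate region to render as one "logical image"; the right `gap` and `maxRegions` values would have to be tuned against real PDFs.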



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
