[jira] [Updated] (TIKA-3844) Improve extraction of PDF subset info

Tim Allison (Jira) Thu, 01 Sep 2022 03:24:30 -0700


     [ 
https://issues.apache.org/jira/browse/TIKA-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tim Allison updated TIKA-3844:
------------------------------
    Description: 
We already extract the PDF/A part and conformance level. We should add
extraction for PDF/VT, PDF/UA, and PDF/X.

We should also finally remove the bad hack from 1.x that appended the PDF/A
conformance level to the file type.

I'd like to thank Peter Wyatt for everything that was right about this 
improvement.  The other stuff is all mine. :D

  was:
We already extract the PDF/A part and conformance level. We should add
extraction for PDF/VT, PDF/UA, and PDF/X.

We should also finally remove the bad hack from 1.x that appended the PDF/A
conformance level to the file type.


> Improve extraction of PDF subset info
> -------------------------------------
>
>                 Key: TIKA-3844
>                 URL: https://issues.apache.org/jira/browse/TIKA-3844
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 2.4.2
>
>
> We already extract the PDF/A part and conformance level. We should add
> extraction for PDF/VT, PDF/UA, and PDF/X.
> We should also finally remove the bad hack from 1.x that appended the
> PDF/A conformance level to the file type.
> I'd like to thank Peter Wyatt for everything that was right about this 
> improvement.  The other stuff is all mine. :D



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
