Tim Allison created TIKA-4010:
---------------------------------

             Summary: Add boolean metadata element for isLinearized for PDFs
                 Key: TIKA-4010
                 URL: https://issues.apache.org/jira/browse/TIKA-4010
             Project: Tika
          Issue Type: New Feature
            Reporter: Tim Allison


Other tools, such as pdfinfo, report whether or not a PDF is linearized.  We 
should do that as well.

In PDFBox 3.x, we can simply call {{.getLinearizedDictionary()}} on the 
COSDocument.  In 2.x, I tried to port that underlying code with no success -- 
the Linearized dictionary was not being parsed as a dictionary.

I don't think this is high priority.  I'm happy enough waiting for 3.x.  
However, if there's a straightforward way to do this with 2.x, let's do that.
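
One possible 2.x-independent fallback, offered only as a rough sketch: the PDF 
specification's linearization annex requires the linearization parameter 
dictionary to be the first object in the file and to sit entirely within the 
first 1024 bytes.  Under that assumption, a byte-level sniff can look for 
/Linearized in the first object without going through the PDFBox parser at all.  
The class and method names below (LinearizedSniffer, looksLinearized) are 
hypothetical and are not PDFBox or Tika API.  This is a heuristic: it does not 
validate the dictionary or confirm that the file is still validly linearized 
after incremental updates.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of PDFBox or Tika.
final class LinearizedSniffer {

    // Assumption: the linearization dictionary lies within the first 1024 bytes.
    private static final int HEADER_WINDOW = 1024;

    // Returns true if the first object in the header window mentions /Linearized.
    public static boolean looksLinearized(byte[] head) {
        int len = Math.min(head.length, HEADER_WINDOW);
        // ISO-8859-1 maps every byte to one char, so binary bytes cannot break decoding.
        String s = new String(head, 0, len, StandardCharsets.ISO_8859_1);

        // Start of the first indirect object, e.g. "1 0 obj".
        int firstObj = s.indexOf("obj");
        if (firstObj < 0) {
            return false;
        }
        // End of that object; if it is cut off by the window, scan to the end.
        int endObj = s.indexOf("endobj", firstObj + 3);
        String firstObject = endObj < 0
                ? s.substring(firstObj)
                : s.substring(firstObj, endObj);
        return firstObject.contains("/Linearized");
    }

    // Convenience overload that reads only the header window from disk (Java 11+).
    public static boolean looksLinearized(Path pdf) throws IOException {
        try (InputStream in = Files.newInputStream(pdf)) {
            return looksLinearized(in.readNBytes(HEADER_WINDOW));
        }
    }
}
```

The boolean result could then be written to the proposed metadata element.  
Restricting the check to the first object avoids false positives from a 
/Linearized key that merely appears in some later object.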



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
