[jira] [Commented] (PDFBOX-1792) Different metadata with NonSequentialPDFParser

JIRA Mon, 09 Mar 2015 11:36:13 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353351#comment-14353351
 ]


Andreas Lehmkühler commented on PDFBOX-1792:
--------------------------------------------

Hmm, that was my first thought, too. But the new schema class has to define the 
structure and the datatypes of the newly supported schema. Maybe it's just me, 
but I didn't know how to do that without investigating a lot. So, any help is 
welcome. :-)

> Different metadata with NonSequentialPDFParser
> ----------------------------------------------
>
>                 Key: PDFBOX-1792
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1792
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.3
>            Reporter: Tim Allison
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: PDFBOX-1792.tar.gz, testPDF_acroForm2.pdf
>
>
> The traditional parser is able to extract metadata from a test document from 
> TIKA-738.  The NonSequentialPDFParser is not able to extract metadata from 
> that file.  Another file from the Tika test suite has metadata that can be 
> extracted by the NonSequentialPDFParser but not by the traditional parser.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

