[jira] [Commented] (PDFBOX-1792) Different metadata with NonSequentialPDFParser

JIRA Sun, 08 Mar 2015 07:28:27 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352060#comment-14352060
 ]


Andreas Lehmkühler commented on PDFBOX-1792:
--------------------------------------------

The testcase is in place again as Thomas reverted his changes some time ago.

The metadata of the attached PDF can't be extracted as it contains the 
unsupported namespace "http://ns.adobe.com/xfa/promoted-desc/".
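
To check which namespaces an XMP packet actually declares, one option is to walk the packet with the JDK's own DOM parser and collect the xmlns declarations. The following is only a minimal sketch using standard Java APIs, not PDFBox or xmpbox code. The class name XmpNamespaceLister, the sample packet, and the prefix "desc" are assumptions made for illustration and are not taken from the attached file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: lists the namespace URIs declared in an XMP packet.
public class XmpNamespaceLister {

    // Parses the packet and returns every URI bound by an xmlns or xmlns:prefix attribute.
    static Set<String> declaredNamespaces(String xmpPacket) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            byte[] bytes = xmpPacket.getBytes(StandardCharsets.UTF_8);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
            Set<String> uris = new TreeSet<>();
            collect(doc.getDocumentElement(), uris);
            return uris;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a well-formed XMP packet", e);
        }
    }

    // Recursively visits elements and records namespace declarations found on each one.
    private static void collect(Element element, Set<String> uris) {
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            String name = attr.getNodeName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                uris.add(attr.getNodeValue());
            }
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                collect((Element) children.item(i), uris);
            }
        }
    }

    public static void main(String[] args) {
        // Made-up sample packet; the "desc" prefix is an assumption.
        String sample = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
                + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
                + "<rdf:Description rdf:about=\"\""
                + " xmlns:desc=\"http://ns.adobe.com/xfa/promoted-desc/\"/>"
                + "</rdf:RDF></x:xmpmeta>";
        for (String uri : declaredNamespaces(sample)) {
            System.out.println(uri);
        }
    }
}
```

Comparing the returned set against the namespaces a given XMP parser supports would show which declaration causes the extraction to fail.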

[~msahyoun] I can't find any detailed information about that namespace. It 
seems to be related to Adobe LiveCycle. Can you shed some light on this?

> Different metadata with NonSequentialPDFParser
> ----------------------------------------------
>
>                 Key: PDFBOX-1792
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1792
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.3
>            Reporter: Tim Allison
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: PDFBOX-1792.tar.gz, testPDF_acroForm2.pdf
>
>
> The traditional parser is able to extract metadata from a test document from 
> TIKA-738.  The NonSequentialPDFParser is not able to extract metadata from 
> that file.  Another file from the Tika test suite has metadata that can be 
> extracted by the NonSequentialPDFParser but not by the classic parser.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)