[jira] [Commented] (TIKA-1203) Some metadata not extracted from PDF files when NonSequentialPDFParser is used

Tyler Palsulich (JIRA) Mon, 16 Mar 2015 12:37:08 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363760#comment-14363760
 ]


Tyler Palsulich commented on TIKA-1203:
---------------------------------------

The two types of PDF parsers still produce different metadata. The comment above
about dates referred to [a commented-out
section|https://github.com/apache/tika/blob/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java#L103-L105]
 of PDFParserTest.java, which says dates can't be reliably tested. But now that
TIKA-451 has been resolved, I think we can uncomment those two lines.

> Some metadata not extracted from PDF files when NonSequentialPDFParser is used
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-1203
>                 URL: https://issues.apache.org/jira/browse/TIKA-1203
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Tim Allison
>            Priority: Minor
>
> While working on TIKA-1201, I noticed that metadata was not being extracted 
> from the testAnnotations.pdf file when the NonSequentialPDFParser was being 
> used.  I opened PDFBOX-1792.  This TIKA issue is a placeholder.  When 
> PDFBOX-1792 is fixed, we can stop skipping "testAnnotations.pdf" in 
> PDFParserTest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
