[ 
https://issues.apache.org/jira/browse/TIKA-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362488#comment-14362488
 ] 

Tyler Palsulich commented on TIKA-1161:
---------------------------------------

With Tika 1.8-SNAPSHOT, I'm seeing the following metadata for the attached PDF. Note that no date field is extracted at all:
{code}
Author: Luke Bo'sher
Content-Length: 576757
Content-Type: application/pdf
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.pdf.PDFParser
access_permission:assemble_document: true
access_permission:can_modify: true
access_permission:can_print: true
access_permission:can_print_degraded: true
access_permission:extract_content: true
access_permission:extract_for_accessibility: true
access_permission:fill_in_form: true
access_permission:modify_annotations: true
creator: Luke Bo'sher
dc:creator: Luke Bo'sher
dc:format: application/pdf; version=1.3
dc:title: Microsoft Word - WorkChoices Submission.doc
meta:author: Luke Bo'sher
pdf:PDFVersion: 1.3
pdf:encrypted: false
producer: Mac OS X 10.4.7 Quartz PDFContext
resourceName: WF_16_Youth_Coalition.pdf
title: Microsoft Word - WorkChoices Submission.doc
xmp:CreatorTool: Word
xmpTPg:NPages: 20
{code}
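
To double-check that a dump like the one above really contains no date-like key, a small script along these lines could help. It is only a sketch: `DATE_HINTS`, `parse_dump` and `date_like_keys` are names made up for illustration, and the substring heuristic is an assumption, not Tika's actual list of date properties. The sample is a subset of the dump above.

```python
# Assumed heuristic: substrings that suggest a date-related metadata key.
# This is NOT Tika's official list of date properties.
DATE_HINTS = ("date", "created", "modified")

# Subset of the metadata dump from this comment.
SAMPLE = """\
Author: Luke Bo'sher
Content-Type: application/pdf
access_permission:can_modify: true
access_permission:modify_annotations: true
creator: Luke Bo'sher
dc:creator: Luke Bo'sher
dc:format: application/pdf; version=1.3
dc:title: Microsoft Word - WorkChoices Submission.doc
pdf:PDFVersion: 1.3
producer: Mac OS X 10.4.7 Quartz PDFContext
xmp:CreatorTool: Word
xmpTPg:NPages: 20
"""


def parse_dump(dump):
    """Split 'key: value' lines into (key, value) pairs.

    The split happens at the first ': ' (colon plus space), so keys that
    contain a bare colon, such as 'dc:title', stay intact.
    """
    pairs = []
    for line in dump.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines without a 'key: value' shape
            pairs.append((key.strip(), value.strip()))
    return pairs


def date_like_keys(dump):
    """Return the keys whose lowercased name contains one of DATE_HINTS."""
    return [
        key
        for key, _ in parse_dump(dump)
        if any(hint in key.lower() for hint in DATE_HINTS)
    ]


if __name__ == "__main__":
    print(date_like_keys(SAMPLE))  # prints []
```

Keys such as `creator` or `access_permission:can_modify` do not match, because the hints are `created` and `modified` rather than their stems.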

> Dates incorrectly extracted from PDF
> ------------------------------------
>
>                 Key: TIKA-1161
>                 URL: https://issues.apache.org/jira/browse/TIKA-1161
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>         Environment: Windows 7 64bit, JDK 1.7
>            Reporter: Nicolas Guillaumin
>            Priority: Minor
>              Labels: pdf
>         Attachments: WF_16_Youth_Coalition.pdf
>
>
> Tika incorrectly extracts the date from the attached PDF as 
> 5034-09-24T14:03:00Z, whereas the actual date in the PDF appears to be 
> 2007-03-01 10:58:57 according to Foxit Reader.
> Interestingly, PDFBox 1.8.2 also extracts the correct date (when using 
> the PDFDebugger tool).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
