Isabelle Giguere created TIKA-2666:
--------------------------------------
Summary: Document last printed in the year 27321
Key: TIKA-2666
URL: https://issues.apache.org/jira/browse/TIKA-2666
Project: Tika
Issue Type: Bug
Affects Versions: 1.17
Reporter: Isabelle Giguere
Attachments: Genetic_Factors_and_the_Directionality_of.ppt,
PPT_lastPrinted_00.png, tika-app-1.17.metadata.txt
Tika extracts a strange last-printed date from the attached PowerPoint (97-2003)
file. In the attached screenshot, PPT_lastPrinted_00.png, the last-printed date
was set to 00:00.
But when Tika extracts metadata from this document, the last-printed date is in
the year 27321:
Last-Printed: 27321-01-23T08:20:12Z
meta:print-date: 27321-01-23T08:20:12Z
The attached metadata was obtained using Tika 1.17.
This invalid date is causing issues further down in processing. We can probably
filter it out for now, but I do wonder how 00:00 turns into
27321-01-23T08:20:12Z.
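As a rough idea of the interim filtering mentioned above, here is a minimal sketch in plain Java. It does not use the Tika API: the metadata is modelled as a simple map, and the class name PrintDateFilter, the method names, and the 1900-2100 year bounds are all assumptions made for illustration. The year is read with a regular expression instead of a date parser, since a five-digit year without a sign may not be accepted by strict ISO-8601 parsers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: drops print dates with an implausible year.
final class PrintDateFilter {
    // Assumed plausibility bounds; adjust to whatever downstream processing accepts.
    static final int MIN_YEAR = 1900;
    static final int MAX_YEAR = 2100;

    // Captures the (optionally signed) year of a value shaped like yyyy-MM-ddT...
    private static final Pattern ISO_YEAR =
            Pattern.compile("^([+-]?\\d{4,9})-\\d{2}-\\d{2}T.*");

    // Returns true only if the value looks like an ISO date-time with a year in bounds.
    static boolean isPlausible(String value) {
        if (value == null) {
            return false;
        }
        Matcher m = ISO_YEAR.matcher(value.trim());
        if (!m.matches()) {
            return false;
        }
        int year = Integer.parseInt(m.group(1));
        return year >= MIN_YEAR && year <= MAX_YEAR;
    }

    // Returns a copy of the metadata without implausible values for the two
    // keys seen in this report; all other entries are kept unchanged.
    static Map<String, String> dropImplausiblePrintDates(Map<String, String> metadata) {
        Map<String, String> cleaned = new LinkedHashMap<>(metadata);
        for (String key : new String[] {"Last-Printed", "meta:print-date"}) {
            String value = cleaned.get(key);
            if (value != null && !isPlausible(value)) {
                cleaned.remove(key);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(isPlausible("27321-01-23T08:20:12Z")); // false
        System.out.println(isPlausible("2018-06-01T12:00:00Z"));  // true
    }
}
```

This only hides the symptom. It does not explain how 00:00 becomes a date in the year 27321.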