[ 
https://issues.apache.org/jira/browse/PDFBOX-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-465.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.3
         Assignee: Andreas Lehmkühler

Solved after solving PDFBOX-1633

> invalid date formats 
> ---------------------
>
>                 Key: PDFBOX-465
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-465
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Sean Bridges
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: SimpleDateParsingTest.java
>
>
> This is with the latest from svn, Revision: 773978
> From a sample of 13304 pdf documents generated in a very wide variety of 
> ways, I got 94 invalid date formats,
> It seems that all of these have the stack trace of,
> Caused by: java.io.IOException: Error converting date:Friday, July 11, 2008
>       at 
> org.apache.pdfbox.util.DateConverter.toCalendar(DateConverter.java:240)
>       at 
> org.apache.pdfbox.util.DateConverter.toCalendar(DateConverter.java:120)
>       at org.apache.pdfbox.cos.COSDictionary.getDate(COSDictionary.java:783)
>       at 
> org.apache.pdfbox.pdmodel.PDDocumentInformation.getCreationDate(PDDocumentInformation.java:218)
>       at 
> message_analyzer.extractor.PDFExtractor.getContent(PDFExtractor.java:50)
> Some examples of invalid dates are,
> 20070430193647+713'00'
> Tue Aug 21 10:35:22 2007
> Tuesday, November 04, 2008
> 200712172:2:3 
> Unknown
> 20090319 200122
> 9:47 5/12/2008
> i don't think there is any hope of parsing all these date formats.  If would 
> be nice if this was not a fatal error, and the parser could continue without 
> a creation date. 
> Is the policy of pdfbox to be as forgiving as possible when reading pdf 
> documents?  Maybe toCalendar should return a new Calendar() if parsing fails, 
> rather than throwing.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to