[ https://issues.apache.org/jira/browse/PDFBOX-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217451#comment-14217451 ]

Ralf Hauser commented on PDFBOX-1674:
-------------------------------------

The patch below makes the error message more precise:

Index: preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java
===================================================================
--- preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java	(revision 1637680)
+++ preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java	(working copy)
@@ -58,7 +58,7 @@
         PDFAIdentificationSchema id = metadata.getPDFIdentificationSchema();
         if (id == null)
         {
-            ve.add(new ValidationError(ERROR_METADATA_PDFA_ID_MISSING));
+            ve.add(new ValidationError(ERROR_METADATA_PDFA_ID_MISSING, "PDFAIdentificationSchema is null"));
             return ve;
         }





7.11 : Error on MetaData, PDFAIdentificationSchema is null
        at org.apache.pdfbox.preflight.metadata.PDFAIdentificationValidation.validatePDFAIdentifer(PDFAIdentificationValidation.java:61)
        at org.apache.pdfbox.preflight.process.MetadataValidationProcess.validate(MetadataValidationProcess.java:87)
        at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:73)
        at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:88)
        at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:168)
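
To illustrate what the extra constructor argument buys, here is a minimal, self-contained sketch of how a generic text and a caller-supplied detail could combine into the first line of the trace above. SketchError and DetailMessageSketch are hypothetical stand-ins written for this note, not the PDFBox ValidationError class, and the "code : text, detail" layout is only inferred from that one line.

```java
public class DetailMessageSketch {
    /** Hypothetical stand-in for a validation error; NOT the PDFBox class. */
    public static final class SketchError {
        final String code;
        final String details;

        public SketchError(String code, String genericText, String extra) {
            this.code = code;
            // Append the caller's detail to the generic text when one is given.
            this.details = (extra == null) ? genericText : genericText + ", " + extra;
        }

        @Override
        public String toString() {
            return code + " : " + details;
        }
    }

    public static void main(String[] args) {
        // Without a detail, only the generic text for the code is shown.
        System.out.println(new SketchError("7.11", "Error on MetaData", null));
        // With a detail, the reader can tell which check failed.
        System.out.println(new SketchError("7.11", "Error on MetaData",
                "PDFAIdentificationSchema is null"));
    }
}
```

The second line printed matches the first line of the trace; the first shows what the message would look like under the same assumed layout if no detail were passed.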

> Preflight doesn't correctly parse PDF if obj identifier not followed by line 
> terminator
> ---------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-1674
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1674
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 2.0.0
>         Environment: Win 7
>            Reporter: Johan van der Knijff
>            Assignee: Eric Leleu
>            Priority: Minor
>             Fix For: 1.8.3, 2.0.0
>
>
> For some test files on the Adobe Acrobat Engineering website, Preflight 
> output looks like this:
> <preflight name="Disney-Flash.pdf">
>   <executionTimeMS>210</executionTimeMS>
>   <isValid type="">false</isValid>
>   <errors count="3">
>     <error count="1">
>       <code>1.0</code>
>       <details>Syntax error, Expected pattern 'obj but missed at character 'o'</details>
>     </error>
>     <error count="1">
>       <code>1.2.1</code>
>       <details>Body Syntax error, Expected pattern 'obj but missed at character 'o'</details>
>     </error>
>     <error count="1">
>       <code>1.2.1</code>
>       <details>Body Syntax error, Single space expected</details>
>     </error>
>   </errors>
> </preflight>
> Which suggests that Preflight doesn't correctly parse the objects. This is 
> confirmed by a look at some of the offending PDFs in a hex editor, which 
> reveals that the object identifiers in them are not terminated by any EOL 
> character(s). AFAIK this is allowed in both PDF and PDF/A-1. More details + 
> links to test files here ('Multimedia' table and below):
> http://www.openplanetsfoundation.org/blogs/2013-07-25-identification-pdf-preservation-risks-sequel
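
For the parsing problem quoted above, a tolerant check of an object header might look like the following sketch. It assumes the reporter's reading is correct that no EOL is required after the "obj" keyword, so the keyword may be followed directly by a delimiter such as "<<". ObjHeaderSketch and parseHeader are hypothetical names; this is not how the PDFBox or Preflight parser is implemented.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ObjHeaderSketch {
    // Object number, generation number, then the keyword "obj".
    // The lookahead accepts whitespace, a PDF delimiter, or end of input
    // after "obj"; it deliberately does NOT insist on a line terminator.
    private static final Pattern HEADER = Pattern.compile(
            "(\\d{1,10})[\\x00\\t\\n\\f\\r ]+(\\d{1,10})[\\x00\\t\\n\\f\\r ]+obj"
            + "(?=[\\x00\\t\\n\\f\\r ()<>\\[\\]{}/%]|\\z)");

    /** Returns {objectNumber, generation, offsetAfterKeyword}, or null if no header. */
    public static long[] parseHeader(String s) {
        Matcher m = HEADER.matcher(s);
        if (!m.lookingAt()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)), Long.parseLong(m.group(2)), m.end()
        };
    }

    public static void main(String[] args) {
        // Keyword followed directly by a dictionary, with no EOL in between.
        System.out.println(Arrays.toString(parseHeader("12 0 obj<</Type/Catalog>>")));
        // Conventional layout with a line feed after the keyword.
        System.out.println(Arrays.toString(parseHeader("12 0 obj\n<</Type/Catalog>>")));
        // "objx" is a different token, so this is rejected (prints null).
        System.out.println(Arrays.toString(parseHeader("12 0 objx")));
    }
}
```

Both of the first two inputs are accepted with the same object and generation numbers; only the third is rejected, because the character after "obj" is neither whitespace nor a delimiter.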



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
