[ https://issues.apache.org/jira/browse/PDFBOX-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217451#comment-14217451 ]
Ralf Hauser commented on PDFBOX-1674:
-------------------------------------
A more precise error message, with patch:
Index: preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java
===================================================================
--- preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java (revision 1637680)
+++ preflight/src/main/java/org/apache/pdfbox/preflight/metadata/PDFAIdentificationValidation.java (working copy)
@@ -58,7 +58,7 @@
         PDFAIdentificationSchema id = metadata.getPDFIdentificationSchema();
         if (id == null)
         {
-            ve.add(new ValidationError(ERROR_METADATA_PDFA_ID_MISSING));
+            ve.add(new ValidationError(ERROR_METADATA_PDFA_ID_MISSING, "PDFAIdentificationSchema is null"));
             return ve;
         }
7.11 : Error on MetaData, PDFAIdentificationSchema is null
    at org.apache.pdfbox.preflight.metadata.PDFAIdentificationValidation.validatePDFAIdentifer(PDFAIdentificationValidation.java:61)
    at org.apache.pdfbox.preflight.process.MetadataValidationProcess.validate(MetadataValidationProcess.java:87)
    at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:73)
    at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:88)
    at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:168)
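
The effect of the extra constructor argument can be sketched with a small stand-in class. SketchValidationError is hypothetical and is not the PDFBox ValidationError; the "code : base message, detail" layout and the "Error on MetaData" base text are assumptions inferred from the output line above.

```java
// Hypothetical stand-in, NOT the PDFBox ValidationError class.
final class SketchValidationError
{
    private final String code;
    private final String details;

    // One-argument form: only the generic message for the error family.
    SketchValidationError(String code)
    {
        this(code, null);
    }

    // Two-argument form: the caller's detail is appended to the generic message.
    SketchValidationError(String code, String extra)
    {
        this.code = code;
        // Assumption: codes starting with "7" belong to the metadata family.
        String base = code.startsWith("7") ? "Error on MetaData" : "Unknown error";
        this.details = (extra == null) ? base : base + ", " + extra;
    }

    @Override
    public String toString()
    {
        return code + " : " + details;
    }
}
```

With this stand-in, the one-argument form yields only the generic family message, while the two-argument form also names the concrete cause.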
> Preflight doesn't correctly parse PDF if obj identifier not followed by line terminator
> ---------------------------------------------------------------------------------------
>
> Key: PDFBOX-1674
> URL: https://issues.apache.org/jira/browse/PDFBOX-1674
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 2.0.0
> Environment: Win 7
> Reporter: Johan van der Knijff
> Assignee: Eric Leleu
> Priority: Minor
> Fix For: 1.8.3, 2.0.0
>
>
> For some test files on the Adobe Acrobat Engineering website, Preflight
> output looks like this:
> <preflight name="Disney-Flash.pdf">
>   <executionTimeMS>210</executionTimeMS>
>   <isValid type="">false</isValid>
>   <errors count="3">
>     <error count="1">
>       <code>1.0</code>
>       <details>Syntax error, Expected pattern 'obj but missed at character 'o'</details>
>     </error>
>     <error count="1">
>       <code>1.2.1</code>
>       <details>Body Syntax error, Expected pattern 'obj but missed at character 'o'</details>
>     </error>
>     <error count="1">
>       <code>1.2.1</code>
>       <details>Body Syntax error, Single space expected</details>
>     </error>
>   </errors>
> </preflight>
> This suggests that Preflight doesn't parse the objects correctly. A look at
> some of the offending PDFs in a hex editor confirms this: the object
> identifiers in them are not terminated by any EOL character(s). AFAIK this
> is allowed in both PDF and PDF/A-1. More details and links to test files
> are here (see the 'Multimedia' table and below):
> http://www.openplanetsfoundation.org/blogs/2013-07-25-identification-pdf-preservation-risks-sequel
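
A minimal sketch of how a parser could report what actually follows the obj keyword, instead of a generic "Expected pattern" error. ObjKeywordSketch, Follower and classify are hypothetical names, not PDFBox APIs; the whitespace and delimiter sets are the ones PDF syntax defines. The sketch only classifies the byte; whether a given case should be accepted is left to the caller.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration, not part of PDFBox.
final class ObjKeywordSketch
{
    /** What follows the "obj" keyword in an object header. */
    enum Follower { EOL, OTHER_WHITESPACE, DELIMITER, END_OF_DATA, UNEXPECTED }

    /** Classifies the byte right after the "obj" keyword starting at keywordPos. */
    static Follower classify(byte[] data, int keywordPos)
    {
        if (keywordPos < 0 || keywordPos + 3 > data.length
                || data[keywordPos] != 'o'
                || data[keywordPos + 1] != 'b'
                || data[keywordPos + 2] != 'j')
        {
            throw new IllegalArgumentException("no 'obj' keyword at offset " + keywordPos);
        }
        int next = keywordPos + 3;
        if (next == data.length)
        {
            return Follower.END_OF_DATA;
        }
        int b = data[next] & 0xFF;
        if (b == '\r' || b == '\n')
        {
            return Follower.EOL;
        }
        // Remaining PDF white-space characters: NUL, TAB, FORM FEED, SPACE.
        if (b == 0 || b == '\t' || b == '\f' || b == ' ')
        {
            return Follower.OTHER_WHITESPACE;
        }
        // PDF delimiter characters.
        if ("()<>[]{}/%".indexOf(b) >= 0)
        {
            return Follower.DELIMITER;
        }
        return Follower.UNEXPECTED;
    }

    public static void main(String[] args)
    {
        byte[] header = "1 0 obj<</Type/Catalog>>".getBytes(StandardCharsets.US_ASCII);
        // The "obj" keyword starts at offset 4 in this header.
        System.out.println(classify(header, 4));
    }
}
```

For the header in main, the byte after obj is '<', so the result is DELIMITER; a header such as "1 0 obj" followed by a line feed yields EOL.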