[
https://issues.apache.org/jira/browse/PDFBOX-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217487#comment-14217487
]
Tilman Hausherr commented on PDFBOX-1674:
-----------------------------------------
PDF-Tools also reports many problems with that file:
Disney-Flash.pdf does not conform to PDF/A.
Validating file "Disney-Flash.pdf" for conformance level pdfa-1b
The separator after an 'obj' must be an EOL. (2)
The required XMP property 'pdfaid:part' is missing.
The required XMP property 'pdfaid:conformance' is missing.
The key Subtype has a value Screen which is prohibited.
The key F is required but missing.
A device-specific color space (DeviceGray) without an appropriate output intent
is used.
The value of the key F is 'Hidden' but must be 'Not Hidden'.
The appearance dictionary doesn't contain an entry.
The dictionary must not contain the key 'AA'.
The key S has a value JavaScript which is prohibited.
The dictionary must not contain the key 'A'.
A device-specific color space (DeviceRGB) without an appropriate output intent
is used.
The font Helvetica must be embedded.
The font Times-Roman must be embedded.
The font Verdana must be embedded.
The font Arial,Bold must be embedded.
The font Verdana,Bold must be embedded.
The font Arial must be embedded.
The document does not conform to the requested standard.
The file format (header, trailer, objects, xref, streams) is corrupted.
The document contains device-specific color spaces.
The document contains fonts without embedded font programs or encoding
information (CMAPs).
The document contains unknown annotation types.
The document contains hidden, invisible, non-viewable or non-printable
annotations.
The document contains annotations or form fields with ambigous or without
appropriate appearances.
The document contains actions types other than for navigation (Launch,
JavaScript, ResetForm, etc.).
The document's meta data is either missing or inconsistent or corrupt.
Done.
> Preflight doesn't correctly parse PDF if obj identifier not followed by line
> terminator
> ---------------------------------------------------------------------------------------
>
> Key: PDFBOX-1674
> URL: https://issues.apache.org/jira/browse/PDFBOX-1674
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 2.0.0
> Environment: Win 7
> Reporter: Johan van der Knijff
> Assignee: Eric Leleu
> Priority: Minor
> Fix For: 1.8.3, 2.0.0
>
>
> For some test files on the Adobe Acrobat Engineering website, Preflight
> output looks like this:
> <preflight name="Disney-Flash.pdf">
> <executionTimeMS>210</executionTimeMS>
> <isValid type="">false</isValid>
> <errors count="3">
> <error count="1">
> <code>1.0</code>
> <details>Syntax error, Expected pattern 'obj but missed at character
> 'o'</details>
> </error>
> <error count="1">
> <code>1.2.1</code>
> <details>Body Syntax error, Expected pattern 'obj but missed at
> character 'o'</details>
> </error>
> <error count="1">
> <code>1.2.1</code>
> <details>Body Syntax error, Single space expected</details>
> </error>
> </errors>
> </preflight>
> This suggests that Preflight doesn't correctly parse the objects. This is
> confirmed by a look at some of the offending PDFs in a hex editor, which
> reveals that the object identifiers in them are not terminated by any EOL
> character(s). AFAIK this is allowed in both PDF and PDF/A-1. More details +
> links to test files here ('Multimedia' table and below):
> http://www.openplanetsfoundation.org/blogs/2013-07-25-identification-pdf-preservation-risks-sequel
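
To see which object headers in a file trigger this, something like the following rough sketch might help. It is not PDFBox or Preflight code: the class name ObjEolCheck and the method offsetsWithoutEol are made up for illustration, and it only uses the JDK. It naively scans the raw bytes for "<number> <generation> obj" and reports the offsets where the keyword is not immediately followed by CR or LF. Because it does not parse the file, it may also report matches inside stream data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of PDFBox.
final class ObjEolCheck {
    // "<object number> <generation number> obj", single spaces assumed.
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+ \\d+ obj");

    // Returns the byte offsets of object headers whose 'obj' keyword
    // is not directly followed by an EOL character (CR or LF).
    static List<Integer> offsetsWithoutEol(byte[] pdf) {
        // ISO-8859-1 maps each byte to one char, so char index == byte offset.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<Integer> offsets = new ArrayList<>();
        Matcher m = OBJ_HEADER.matcher(text);
        while (m.find()) {
            int next = m.end();
            boolean eolFollows = next < text.length()
                    && (text.charAt(next) == '\r' || text.charAt(next) == '\n');
            if (!eolFollows) {
                offsets.add(m.start());
            }
        }
        return offsets;
    }
}
```

For a real file the bytes could be read with java.nio.file.Files.readAllBytes and passed in; an empty result would mean the scan found no object header lacking an EOL after 'obj'.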