[ https://issues.apache.org/jira/browse/TIKA-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexandre Madurell updated TIKA-1232:
-------------------------------------
Attachment: Sample 10.x.pdf
Sample 9.x.pdf
Sample 8.x.pdf
Sample 7.x.pdf
Sample 6.x.pdf
Sample 5.x.pdf
Sample 4.x.pdf
Here they are:
Sample 4.x.pdf (PDF Version 1.3)
Sample 5.x.pdf (PDF Version 1.4)
Sample 6.x.pdf (PDF Version 1.5)
Sample 7.x.pdf (PDF Version 1.6)
Sample 8.x.pdf (PDF Version 1.7)
Sample 9.x.pdf (PDF Version 1.7 Adobe Extension Level 3)
Sample 10.x.pdf (PDF Version 1.7 Adobe Extension Level 8)
Sample 11.x.pdf is coming up next.
> Add PDF version to PDFParser output
> -----------------------------------
>
> Key: TIKA-1232
> URL: https://issues.apache.org/jira/browse/TIKA-1232
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.5
> Environment: JDK6
> Reporter: William Palmer
> Assignee: Tim Allison
> Priority: Minor
> Attachments: Sample 10.x.pdf, Sample 4.x.pdf, Sample 5.x.pdf,
> Sample 6.x.pdf, Sample 7.x.pdf, Sample 8.x.pdf, Sample 9.x.pdf,
> TIKA-1232v1.patch, TIKA-1232v2.patch, pdfversion.patch
>
>
> I'd like to identify the PDF version of files. This is not currently reported
> by the PDFParser, although the information is available via PDFBox. I have
> attached a patch that adds the format version to the Metadata object.
> However, I am not familiar enough with the Tika source to know whether an
> alternative metadata key should be used or this new one added.
> Comments welcome.
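
For illustration only, here is a minimal sketch of where the base version number lives in a file. It is not the attached patch and does not use PDFBox or Tika: it simply reads the "%PDF-M.m" header from the start of the stream with the Java standard library. The class name PdfHeaderVersion and the method detect are hypothetical names made up for this example, and Java 7 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or PDFBox.
public class PdfHeaderVersion {

    // A PDF file starts with a header of the form "%PDF-1.4".
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    /** Returns the header version such as "1.4", or null if no header is found. */
    public static String detect(InputStream in) throws IOException {
        // Read at most the first 1024 bytes; the header is expected near the start.
        byte[] buf = new byte[1024];
        int total = 0;
        int n;
        while (total < buf.length
                && (n = in.read(buf, total, buf.length - total)) > 0) {
            total += n;
        }
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
        String head = new String(buf, 0, total, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(new ByteArrayInputStream(sample))); // prints 1.4
    }
}
```

Note that the header carries only the base version. The Adobe Extension Level is recorded in the document catalog, so a header check like this would report 1.7 for both the 9.x and 10.x samples, and a Version entry in the catalog can override the header value. That is one reason to take the value from the parsed document, as the issue proposes, rather than from the raw header.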