Add Tika parsers for PDF and TTF
--------------------------------

                 Key: PDFBOX-1132
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1132
             Project: PDFBox
          Issue Type: New Feature
          Components: FontBox, Parsing
            Reporter: Jukka Zitting

The PDF and TTF parsers in Apache Tika depend more on improvements in PDFBox than on those in Tika, so it would make more sense for that code to live in Apache PDFBox. Having the code inside PDFBox would allow tighter integration with PDFBox internals and would avoid the need to wait for an official PDFBox release before new features can be used in the PDF and TTF parsers.

To do this, I'd migrate the PDF and TTF parser classes, along with their test cases and test files, from Tika to the PDFBox and FontBox components. We'd add an optional dependency on tika-core to these components, so people who don't use or need Tika wouldn't be affected.

I'll attach a patch with the proposed changes.