[ https://issues.apache.org/jira/browse/SOLR-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steven Rowe resolved SOLR-2550.
-------------------------------

    Resolution: Fixed
    Fix Version/s: 3.1
    Assignee: Steven Rowe

Solr Cell upgraded to Tika 0.8, which included PDFbox 1.1.0, in the Solr 3.1 release.

> Apache Solr needs an updated TIKA version in its extraction libraries
> ---------------------------------------------------------------------
>
>                 Key: SOLR-2550
>                 URL: https://issues.apache.org/jira/browse/SOLR-2550
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 1.4.1
>            Reporter: Surendranadh Puranam
>            Assignee: Steven Rowe
>            Priority: Critical
>              Labels: extraction, indexing, pdf, secure
>             Fix For: 3.1, 1.4.2
>
>
> There are issues with some PDF documents when it gets indexed (extracted?).
> There is an issue being fixed by PDFBOX in the version PDFBox 1.1.0. But
> Apache solr 1.4.1 doesn't have the latest version of these jars which is
> causing these failures. We have tika-pareser0.4 in this solr 1.4.1
> distribution which has to be updated to 0.9 version.
> Reference for the issue and the solution :
> https://issues.apache.org/jira/browse/PDFBOX-617