Solrj doesn't know if PDF was actually parsed by Tika
-----------------------------------------------------

Key: SOLR-1847
URL: https://issues.apache.org/jira/browse/SOLR-1847
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.5
Environment: TOMCAT 6.0.24, SOLR 1.5Dev, Solrj1.5Dev Tika
Reporter: elsadek

When posting PDF files using SolrJ, the only response we get from Solr is the server response status; we never learn whether the PDF was actually parsed. Checking the log, I found that Tika failed on some PDF files, either because of the nature of their content (text present only as images) or because the files are corrupted:

25 mars 2010 14:54:07 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EI
25 mars 2010 14:54:02 org.apache.pdfbox.filter.FlateFilter decode
GRAVE: Stop reading corrupt stream

The question is: how can I catch these kinds of exceptions through SolrJ?
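
The manual log check described in this report could be automated roughly as follows. This is only a sketch of a server-side workaround; it does not answer the SolrJ question. It assumes the log uses the default two-line java.util.logging layout (a header line with date, logger class, and method, followed by a "LEVEL: message" line), and that "GRAVE" is the French-locale spelling of SEVERE. The class name PdfBoxLogScan and the method pdfBoxFailures are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PdfBoxLogScan {

    // Levels treated as failures. "GRAVE" is assumed to be the
    // French-locale rendering of SEVERE, as seen in the report.
    private static boolean isFailureLevel(String level) {
        return level.equals("SEVERE") || level.equals("GRAVE");
    }

    /**
     * Walks the log as (header, body) line pairs and returns the messages
     * of failure-level records whose header names a PDFBox class.
     */
    public static List<String> pdfBoxFailures(List<String> logLines) {
        List<String> failures = new ArrayList<>();
        for (int i = 0; i + 1 < logLines.size(); i++) {
            String header = logLines.get(i);
            String body = logLines.get(i + 1);
            int colon = body.indexOf(": ");
            // Skip lines that are not a PDFBox record header, or whose
            // following line has no "LEVEL: message" shape.
            if (!header.contains(" org.apache.pdfbox.") || colon < 0) {
                continue;
            }
            String level = body.substring(0, colon);
            if (isFailureLevel(level)) {
                failures.add(body.substring(colon + 2));
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // The two records quoted in the report.
        List<String> log = Arrays.asList(
            "25 mars 2010 14:54:07 org.apache.pdfbox.util.PDFStreamEngine processOperator",
            "INFO: unsupported/disabled operation: EI",
            "25 mars 2010 14:54:02 org.apache.pdfbox.filter.FlateFilter decode",
            "GRAVE: Stop reading corrupt stream");
        System.out.println(pdfBoxFailures(log));
    }
}
```

Run against the two records quoted above, the INFO record is ignored and only the corrupt-stream message is reported. The scan cannot tell which uploaded file triggered a record, so it is a diagnostic aid rather than a per-request result.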