On 12/08/15 02:07, Justin wrote:
---tika-config.xml---
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.mail.RFC822Parser"/>
    <parser class="org.apache.tika.parser.mbox.MboxParser"/>
    <parser class="org.apache.tika.parser.mbox.OutlookPSTParser"/>
    <parser class="org.apache.tika.parser.microsoft.JackcessParser"/>
    <parser class="org.apache.tika.parser.microsoft.OldExcelParser"/>
    <parser class="org.apache.tika.parser.microsoft.OfficeParser"/>
    <parser class="org.apache.tika.parser.microsoft.TNEFParser"/>
    <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser"/>
  <detectors>

I do not get anything back from BodyContentHandler when parsing a PST file whereas I do when I use TikaConfig.getDefaultConfig() instead. Am I missing something?

Your config file looks invalid: you need to close the <parsers> tag with a </parsers> before you move on to the detectors.
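
As a rough sketch only, assuming the rest of the file is otherwise fine, the parsers part would look something like this with the closing tag added. The parser class names are copied from your message; the contents of your detectors section weren't in the quoted text, so I've left them as a comment rather than guess at them:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.mail.RFC822Parser"/>
    <parser class="org.apache.tika.parser.mbox.MboxParser"/>
    <parser class="org.apache.tika.parser.mbox.OutlookPSTParser"/>
    <parser class="org.apache.tika.parser.microsoft.JackcessParser"/>
    <parser class="org.apache.tika.parser.microsoft.OldExcelParser"/>
    <parser class="org.apache.tika.parser.microsoft.OfficeParser"/>
    <parser class="org.apache.tika.parser.microsoft.TNEFParser"/>
    <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser"/>
  </parsers>
  <!-- Your <detectors> section goes here, after the closing tag above.
       Its contents were not in the quoted text, so they are left out. -->
</properties>
```
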
I'd also suggest you try some of the things listed on the Troubleshooting page, to check that you really have the parsers you expected:
http://wiki.apache.org/tika/Troubleshooting%20Tika

Nick
