When I add a document (say, a PDF file) to Sling via WebDAV, the org.apache.tika.parser.AutoDetectParser.parse method is called first. Below is a snippet of code from the parse method.

********************************
// Automatically detect the MIME type of the document
MediaType type = detector.detect(tis, metadata);
********************************

In the snippet above, parse tries to get the content type by passing 'metadata' to the detector. I verified that 'metadata' is correctly set to 'Content-Type=application/pdf', but the detector.detect method returns "application/octet-stream".

I found that 'detector' is an instance of org.apache.tika.detect.CompositeDetector and that its "detectors" list is initialized to an *empty* list, so "application/octet-stream" is returned as the default value. Because the returned type is always "application/octet-stream", none of the Tika parsers is called; org.apache.tika.parser.EmptyParser is invoked instead.
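
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It does not use Tika's classes: SimpleDetector, detectWithComposite and CONTENT_TYPE_HINT are hypothetical stand-ins, and the loop is only a simplified model (an assumption, not Tika's source) of a composite detector that starts from a default type and lets its child detectors override it.

```java
import java.util.List;
import java.util.Map;

class EmptyCompositeSketch {
    // Default type reported when no child detector offers anything better.
    static final String OCTET_STREAM = "application/octet-stream";

    // Hypothetical stand-in for a detector: maps metadata to a media type.
    interface SimpleDetector {
        String detect(Map<String, String> metadata);
    }

    // Hypothetical child detector that trusts the Content-Type metadata hint.
    static final SimpleDetector CONTENT_TYPE_HINT =
            metadata -> metadata.getOrDefault("Content-Type", OCTET_STREAM);

    // Simplified composite: start from the default and let each child
    // override it with a non-default answer. With no children, the loop
    // body never runs and the default is returned unchanged.
    static String detectWithComposite(List<SimpleDetector> detectors,
                                      Map<String, String> metadata) {
        String type = OCTET_STREAM;
        for (SimpleDetector detector : detectors) {
            String detected = detector.detect(metadata);
            if (!OCTET_STREAM.equals(detected)) {
                type = detected;
            }
        }
        return type;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = Map.of("Content-Type", "application/pdf");

        // Empty child list: the metadata hint is never consulted.
        System.out.println(detectWithComposite(List.of(), metadata));

        // One child that reads the hint: the PDF type is reported.
        System.out.println(detectWithComposite(List.of(CONTENT_TYPE_HINT), metadata));
    }
}
```

In this model the metadata is correct in both calls. Only the contents of the child list change the result, which matches what I see: the Content-Type hint is set, but nothing is registered to read it.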
