On Sun, 15 Jan 2012, Public Network Services wrote:
I am using Tika 0.9 to detect various types of files and formats, but not
getting the expected behavior.

I'd suggest you try a recent nightly build and see if that helps. We've done quite a bit of detection work since 0.9.

- For various application files (e.g., images or MS-Office files) the
 detected type  is the generic "application/octet-stream", as opposed to the
 specific MIME type for the application.

For office file formats to be detected properly, you'll also need the tika-parsers jar (plus its dependencies) on your classpath, so that the extra detectors are present.
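
If you're not sure whether the parsers jar is actually being picked up at runtime, a rough check along these lines may help. It's only a sketch: ClasspathCheck is a made-up helper (not part of Tika), and the detector class name is my assumption about where the OLE2 detector lives in the parsers jar, so adjust it for the version you're running.

```java
// Hypothetical helper, not part of Tika: reports whether classes can be loaded.
class ClasspathCheck {

    // Returns true if the named class is loadable, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // Facade class from tika-core.
            "org.apache.tika.Tika",
            // ASSUMPTION: OLE2 container detector location in the parsers jar;
            // the package may differ between Tika versions.
            "org.apache.tika.parser.microsoft.POIFSContainerDetector",
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath as your application. If the core class shows up but the detector doesn't, that would point at a missing or misplaced parsers jar rather than at detection itself.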

The detection is made via a simple call to

new Tika().detect(inputStream);

It's worth double-checking with the tika-app jar and the --detect flag. That will let you verify whether a detection problem is really a Tika one, or a problem with your setup (e.g. missing jars).

Nick
