Hi,

On Mon, Feb 18, 2013 at 4:46 PM, Matthew Taylor <[email protected]> wrote:
> Thanks for the response. Unfortunately, when I tried that, it returned an
> empty string. The same thing happened when I tried parser.parse() and used
> BodyContentHandler.toString().
>
> The input stream says that data is available, however, before it is passed
> into Tika. Any other ideas?

Perhaps the stream simply can't be parsed by Tika? Have you tried

    java -jar tika-app-1.3.jar --text < /path/to/file

on the document?

Alternatively, if you're running Tika in an OSGi environment like
Sling, do you have just tika-core deployed (AFAIUI that's the default
with Sling)? The core bundle doesn't contain any parser components, so
it won't be able to extract text from any documents. Deploying
tika-bundle along with core should fix that.
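
If you want a quick check outside OSGi, here is a minimal sketch that lists the
ServiceLoader registration files for org.apache.tika.parser.Parser visible on
the classpath. The class and method names (ParserRegistrationCheck,
findServiceFiles) are made up for illustration and are not part of Tika. It
assumes a plain classpath setup; inside an OSGi container class loading works
differently, so treat the result as a hint only. As far as I know, no entries
means no parser implementations are registered.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tika: it only inspects the classpath.
public class ParserRegistrationCheck {

    // Name of the service file that java.util.ServiceLoader consults for Tika parsers.
    static final String TIKA_PARSER_SERVICE = "org.apache.tika.parser.Parser";

    // Returns every META-INF/services/<serviceName> resource the loader can see.
    static List<URL> findServiceFiles(ClassLoader loader, String serviceName) {
        try {
            return Collections.list(loader.getResources("META-INF/services/" + serviceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ParserRegistrationCheck.class.getClassLoader();
        List<URL> files = findServiceFiles(loader, TIKA_PARSER_SERVICE);
        if (files.isEmpty()) {
            // Assumption: with only tika-core available, nothing is registered here.
            System.out.println("No parser registrations found on the classpath.");
        } else {
            for (URL url : files) {
                System.out.println("Parser registrations: " + url);
            }
        }
    }
}
```

Run it with the same classpath as your application. If it reports no
registrations, that would be consistent with only tika-core being present.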

BR,

Jukka Zitting
