I have some very simple code to call Tika:

        Parser parser = new AutoDetectParser();
        ContentHandler contentHandler = new BodyContentHandler(writer);
        ParseContext parseContext = new ParseContext();
        Metadata metadata = new Metadata();
        parser.parse(input, contentHandler, metadata, parseContext);

It has been working fine on many inputs, but I get no text in the
content handler when I feed it a file in the Shift-JIS encoding.

The metadata comes back with a content type of application/octet-stream.
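
To rule out the file itself, one possible sanity check (independent of Tika) is to confirm that the raw bytes decode strictly as Shift-JIS. The sketch below uses only the JDK; the class and method names are made up for illustration, and it assumes the JDK in use ships the Shift_JIS charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of Tika.
public class ShiftJisCheck {

    /**
     * Returns true if the bytes decode as Shift_JIS with no malformed
     * or unmappable sequences, false otherwise.
     */
    public static boolean decodesAsShiftJis(byte[] bytes) {
        try {
            Charset.forName("Shift_JIS")
                    .newDecoder()
                    // Report errors instead of silently substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

If this returns true for the file's bytes, the input is at least well-formed Shift-JIS, which would point toward detection rather than the data.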

I thought I'd better write here before opening a JIRA, in case I'm
missing something trivial.
