Folks,

I was trying to upgrade to Tika 1.0 and found I could break tika-app with some MSG files :-( I have a Windows (Outlook) .msg file with an attached PDF which parses in tika-app 0.7, 0.9 and 0.10,
but in tika-app 1.0 I get a stack trace.

<error>
Apache Tika was unable to parse the document
at \\....XYZ.msg
The full exception stack trace is included below:

org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@57284c88
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:320)
    at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:279)
    at org.apache.tika.gui.ParsingTransferHandler.importFiles(ParsingTransferHandler.java:94)
    at org.apache.tika.gui.ParsingTransferHandler.importData(ParsingTransferHandler.java:77)
[rest of stack trace removed]
</error>
I note that it says ...microsoft.OfficeParser, so I'm guessing it is falling over in the message itself.
Is there anything I could do to configure the app?
Every version of tika-app is started with a trivial command similar to C:\dev\tools\Tika\1.0\tika-app-1.0.jar -g
and I then drag and drop the file onto it.
Interestingly enough, running it from the command line gives what looks like good output for all the switches -m, -t, -x and -h.
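
Since the TikaException above only wraps the underlying RuntimeException, the real failure would sit at the bottom of the cause chain. A minimal sketch of walking that chain in plain Java follows; the class name RootCause and the sample exceptions are hypothetical stand-ins, and nothing here calls Tika itself.

```java
public class RootCause {

    /** Follows getCause() until no further cause is reported and returns the innermost throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for a wrapped parser failure; this is not real Tika output.
        Exception inner = new IllegalStateException("hypothetical inner failure");
        Exception outer = new Exception("Unexpected RuntimeException from parser", inner);

        Throwable root = rootCause(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

The same helper could presumably be applied to whatever exception a parse call throws, so that the innermost cause is reported instead of the wrapper.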

-Paul
