Tika network server does not print anything in response to, for example, Word
documents
---------------------------------------------------------------------------------------
Key: TIKA-709
URL: https://issues.apache.org/jira/browse/TIKA-709
Project: Tika
Issue Type: Bug
Components: cli
Affects Versions: 0.9
Environment: Debian Linux Sid
Reporter: Vitaliy Filippov
When trying to use Tika Server (java -jar tika-app-0.9.jar -t -p PORT) to parse
Microsoft Word DOC/DOCX files, the server reads the file and then does nothing
more. It simply hangs, probably blocked on a socket read. This does not happen
with, for example, HTML documents. I don't know the mechanics of this bug, but
the following change definitely fixes the issue:
Change
type.process(socket.getInputStream(), output);
to
type.process(new CloseShieldInputStream(socket.getInputStream()), output);
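
One plausible explanation, offered as an assumption rather than a confirmed
diagnosis: the Javadoc for Socket.getInputStream() states that closing the
returned stream closes the associated socket, so a parser that closes its input
when it finishes would take the socket down before any output is written. The
sketch below illustrates what a close shield does using only the JDK.
NonClosingInputStream, TrackingInputStream and parseAndClose are hypothetical
names made up for this illustration; they are not Tika's classes, and the
in-memory stream merely stands in for socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseShieldDemo {

    /** Hypothetical stand-in for a close shield: forwards reads, swallows close(). */
    static class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Intentionally empty: the wrapped stream stays open.
        }
    }

    /** Records whether close() reached it; stands in for socket.getInputStream(). */
    static class TrackingInputStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInputStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical parser that consumes its input and then closes it. */
    static void parseAndClose(InputStream in) throws IOException {
        while (in.read() != -1) {
            // Discard the bytes; only the close() behaviour matters here.
        }
        in.close();
    }

    /** Returns true if the parser's close() reached the underlying stream. */
    public static boolean closedAfterParse(boolean shielded) throws IOException {
        TrackingInputStream source = new TrackingInputStream(new byte[] {1, 2, 3});
        parseAndClose(shielded ? new NonClosingInputStream(source) : source);
        return source.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unshielded closed: " + closedAfterParse(false));
        System.out.println("shielded closed:   " + closedAfterParse(true));
    }
}
```

Without the wrapper the parser's close() reaches the underlying stream; with it,
the underlying stream stays open and the caller decides when to close it, which
is the behaviour the proposed change relies on.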