Hi

By default, the CXF JAX-RS MessageBodyWriter that deals with InputStream closes it as soon as the copy is complete. This can be disabled, but it would indeed be simpler to avoid using try-with-resources. I can fix it...

FYI, re your test code, you can do response.readEntity(String.class)

Cheers, Sergey
On 02/06/17 11:27, Haris Osmanagic wrote:
Hi everyone!

I am using Tika Server, and I have run into something odd when extracting text and requesting a plain text response. Tests can be found here: https://github.com/hariso/tika/commit/2a0dc37a4427070360c7ebe147712d9c873a4e7b

*Version used*: 1.15
*File used*: Any I tried (MS Word, DOCX, PDF)
*Method used*: Multipart upload, using Accept: text/plain

*Expected result*: extracted text
*Actual result*: extracted text PLUS an error saying

<ns1:XMLFault xmlns:ns1="http://cxf.apache.org/bindings/xformat"><ns1:faultstring xmlns:ns1="http://cxf.apache.org/bindings/xformat">java.io.IOException: Stream Closed</ns1:faultstring></ns1:XMLFault>

Looking at the code, it seems like the method used for producing text is using try-with-resources <https://github.com/hariso/tika/blob/2a0dc37a4427070360c7ebe147712d9c873a4e7b/tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java#L408-L411>, and the used input stream has already been closed. The method used for producing XML doesn't do it <https://github.com/hariso/tika/blob/2a0dc37a4427070360c7ebe147712d9c873a4e7b/tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java#L476>.
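
As a rough illustration of the failure mode described above, here is a minimal, self-contained sketch. It is not Tika or CXF code: the class StreamClosedSketch and its methods are hypothetical names, and a BufferedInputStream stands in for the uploaded document stream. It only shows that once a handler closes a stream via try-with-resources, any later access to that same stream by another party fails with an IOException, whereas a handler that leaves closing to the stream's owner does not have this problem.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamClosedSketch {

    // Hypothetical handler that takes ownership of the stream:
    // try-with-resources closes it as soon as the block exits.
    static int countBytesAndClose(InputStream in) throws IOException {
        try (InputStream owned = in) {
            int n = 0;
            while (owned.read() != -1) {
                n++;
            }
            return n;
        }
    }

    // Hypothetical handler that reads the stream but leaves closing to the caller.
    static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Stands in for whoever touches the stream after the handler has run.
    // Returns true if that later access throws an IOException.
    static boolean failsAfterwards(InputStream in) {
        try {
            in.read();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Sample stream; BufferedInputStream throws IOException when read after close().
    static InputStream sample() {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        return new BufferedInputStream(new ByteArrayInputStream(data));
    }

    public static void main(String[] args) throws IOException {
        InputStream a = sample();
        countBytesAndClose(a);
        System.out.println("closed by handler, later read fails: " + failsAfterwards(a));

        InputStream b = sample();
        countBytes(b);
        System.out.println("left open, later read fails: " + failsAfterwards(b));
        b.close(); // the owner closes the stream once everyone is done with it
    }
}
```

Whether the handler or the surrounding framework should own the stream is a design choice; the sketch only shows that both sides cannot assume ownership at once.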

In my use case, the parsed text is processed in an additional step, where using XML/HTML is not really desired, hence I cannot use it as a workaround (at least not now).

Any help or comments are appreciated!

Haris

