@Tim

I'm not sure. I went to
<https://issues.apache.org/jira/secure/Signup!default.jspa> and registration
completes, but no email arrives. I also tried resetting my password, but
again no email. :S

On Fri, Jun 2, 2017 at 5:04 PM Allison, Timothy B. <[email protected]>
wrote:

> You already have!  :)
>
>
>
> >I am not able to sign up for Apache's JIRA
>
>
>
> What went wrong?  That’s the best way to let us know that you’ve actually
> found a problem, which you did, unit test and all!
>
>
>
> *From:* Haris Osmanagic [mailto:[email protected]]
> *Sent:* Friday, June 2, 2017 10:56 AM
> *To:* [email protected]; [email protected]
>
>
> *Subject:* Re: "Stream closed" error when extracting text using Tika
> Server
>
>
>
> Thanks everyone for feedback!
>
>
>
> I am not able to sign up for Apache's JIRA, so I couldn't open the ticket
> myself, sorry for that. Can I help in some other way?
>
>
>
> On Fri, Jun 2, 2017 at 3:18 PM Allison, Timothy B. <[email protected]>
> wrote:
>
> I opened TIKA-2384 for this.  Let’s move discussion there.
>
>
>
> *From:* Luís Filipe Nassif [mailto:[email protected]]
> *Sent:* Friday, June 2, 2017 9:00 AM
> *To:* [email protected]
> *Subject:* RE: "Stream closed" error when extracting text using Tika
> Server
>
>
>
> I think resources should be closed where they are opened, as in the
> parser.parse() API contract, no?
>
>
>
> Luis
>
>
>
> On Jun 2, 2017 8:27 AM, "Allison, Timothy B." <[email protected]>
> wrote:
>
> Haris is correct.
>
> The static "parse()" closes the InputStream, so we shouldn't wrap the call
> to parse in an autoclosing try:
>
> try(InputStream is = xyz) {
>         TikaResource.parse(...)
> }
>
> Once I remove the autoclosing try, the test passes.
>
>
> -----Original Message-----
> From: Sergey Beryozkin [mailto:[email protected]]
> Sent: Friday, June 2, 2017 7:20 AM
> To: [email protected]
> Subject: Re: "Stream closed" error when extracting text using Tika Server
>
> Hi Tim, sorry, I'm not sure now what I was planning to fix :-), I've
> looked at the source again and it is not a case of InputStream returned
> directly from the method...
> try/catch will most likely work better, though maybe it would hide some
> issue to do with some of the parsers closing the stream early somewhere...
>
> Thanks, Sergey
> On 02/06/17 12:13, Allison, Timothy B. wrote:
> > Thank you for sharing this with us.
> >
> > Oddly, I’m able to reproduce this with our 2pic.docx test file, but
> > not with our “test_recursive_embedded.docx”.
> >
> > Please open a ticket on our JIRA.
> >
> > *From:*Haris Osmanagic [mailto:[email protected]]
> > *Sent:* Friday, June 2, 2017 6:28 AM
> > *To:* [email protected]
> > *Subject:* "Stream closed" error when extracting text using Tika
> > Server
> >
> > Hi everyone!
> >
> > I am using Tika Server, and I have faced a weird thing when extracting
> > text and requiring a plain text response. Tests can be found here:
> > https://github.com/hariso/tika/commit/2a0dc37a4427070360c7ebe147712d9c873a4e7b
> >
> > *Version used*: 1.15
> >
> > *File used*: Any I tried (MS Word, DOCX, PDF)
> >
> > *Method used*: Multipart upload, using Accept: text/plain
> >
> > *Expected result*: extracted text
> >
> > *Actual result*: extracted text PLUS an error saying
> >
> > <ns1:XMLFault
> > xmlns:ns1="http://cxf.apache.org/bindings/xformat"><ns1:faultstring
> > xmlns:ns1="http://cxf.apache.org/bindings/xformat">java.io.IOException:
> > Stream Closed</ns1:faultstring></ns1:XMLFault>
> >
> > Looking at the code, it seems like the method used for producing text
> > is using try-with-resources
> > <https://github.com/hariso/tika/blob/2a0dc37a4427070360c7ebe147712d9c873a4e7b/tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java#L408-L411>,
> > and the used input stream has already been
> > closed. The method used for producing XML doesn't do it
> > <https://github.com/hariso/tika/blob/2a0dc37a4427070360c7ebe147712d9c873a4e7b/tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java#L476>.
> >
> > In my use case, the parsed text is processed in an additional step, where
> > using XML/HTML is not really desired, hence I cannot use it as a
> > workaround (at least not now).
> >
> > Any help or comments are appreciated!
> >
> > Haris
> >
>
>
> --
> Sergey Beryozkin
>
> Talend Community Coders
> http://coders.talend.com/
>
>

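For readers following the thread, the double-close pattern described above
(a parse method that closes the stream it receives, called from inside a
try-with-resources block) can be sketched roughly as follows. This is only
an illustration and not the Tika Server code: StrictStream, parseAndClose,
wrapped and unwrapped are hypothetical names, and the assumption that the
underlying stream rejects a second close() is made purely to reproduce the
symptom. Most JDK streams tolerate a repeated close().

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class DoubleCloseSketch {

    /** Hypothetical stream that throws if close() is called a second time. */
    static class StrictStream extends FilterInputStream {
        private boolean closed = false;

        StrictStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                throw new IOException("Stream Closed");
            }
            closed = true;
            super.close();
        }
    }

    /** Stand-in for a parse method that closes the stream it is handed. */
    static int parseAndClose(InputStream is) throws IOException {
        try {
            int count = 0;
            while (is.read() != -1) {
                count++;
            }
            return count;
        } finally {
            is.close(); // first close, done by the callee
        }
    }

    /** Problem pattern: the callee closes, then try-with-resources closes again. */
    static String wrapped(byte[] data) {
        try (InputStream is = new StrictStream(new ByteArrayInputStream(data))) {
            return "read " + parseAndClose(is);
        } catch (IOException e) {
            // The implicit second close() fails after the body already finished.
            return "error: " + e.getMessage();
        }
    }

    /** Pattern after the fix: only the callee closes the stream. */
    static String unwrapped(byte[] data) {
        try {
            InputStream is = new StrictStream(new ByteArrayInputStream(data));
            return "read " + parseAndClose(is);
        } catch (IOException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(wrapped(data));
        System.out.println(unwrapped(data));
    }
}
```

In this sketch the wrapped variant reads all the data successfully and only
fails afterwards, when the implicit close runs. That is at least consistent
with the report of extracted text followed by an error.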