On Mon, 1 Mar 2021, Tim Allison wrote:
> detectors should return the stream reset to the beginning.
I agree - needs to be ready for the parser to then process
> Parsers, IIRC, should return the stream fully(?) read but not closed.
Not always - if the parser wanted a File then it may not have to
On Mon, 1 Mar 2021, Peter Kronenberg wrote:
But the issue is that different parsers return the stream in different
states. Sometimes the stream is all used up (although not closed). And
other times, the stream has been reset to the beginning where it can be
reused. Is this expected behavior?
That’s not what I’m seeing. The AudioParser returns the stream at the
beginning. Maybe it’s because there was nothing to parse. It just returns
metadata. But the MP4Parser returns the stream fully consumed, even though,
again, it only returns metadata.
Since right now, I’m dealing with aud
On Fri, 26 Feb 2021, Peter Kronenberg wrote:
For most audio files, using the AudioParser, the buffer is still at the
beginning. Even though there is no text extraction, I would think that
Tika still needs to read through the stream. The MP3Parser consumes the
stream, but the MP4Parser does not
detectors should return the stream reset to the beginning.
Parsers, IIRC, should return the stream fully(?) read but not closed.
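
To make the detector half of that contract concrete, here is a minimal sketch using only java.io (Java 9+). It does not use Tika's own API: looksLikeId3 is a hypothetical stand-in for a detector, and the sample bytes are made up. It peeks at the first bytes under mark() and always calls reset(), so the caller gets the stream back at the beginning.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class DetectorContractSketch {

    // Hypothetical detector stand-in (not a Tika class): checks for an
    // "ID3" prefix, then rewinds the stream to where it started.
    static boolean looksLikeId3(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(3); // remember the current position; we read at most 3 bytes
        try {
            byte[] head = new byte[3];
            int n = in.readNBytes(head, 0, 3);
            return n == 3 && head[0] == 'I' && head[1] == 'D' && head[2] == '3';
        } finally {
            in.reset(); // hand the stream back at the beginning
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data, not a real audio file.
        byte[] data = "ID3 fake audio bytes".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));

        boolean id3 = looksLikeId3(in);
        // After detection the full content is still available to a parser.
        int remaining = in.readAllBytes().length;
        System.out.println("id3=" + id3 + ", bytes still readable=" + remaining);
    }
}
```

The BufferedInputStream wrapper is what provides mark/reset support here; a raw stream that does not support it would have to be wrapped first.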
On Mon, Mar 1, 2021 at 10:29 AM Tim Allison wrote:
Reusing streams after parsing hasn't been something I've done before...
This is not expected behavior. Parsers should all behave the same.
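
Until parsers do behave the same, a caller who needs the content a second time can avoid depending on the post-parse stream state at all. The following is only a sketch under assumptions, again with plain java.io (Java 9+) rather than Tika: bufferOnce and drain are hypothetical helpers, with drain standing in for any consumer that may or may not read the stream to the end. It holds the whole input in memory, so for large files re-opening the file for each pass is likely the better choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

public class FreshStreamSketch {

    // Hypothetical helper: read the source once, then hand out an
    // independent stream over the same bytes on every call.
    static Supplier<InputStream> bufferOnce(InputStream source) throws IOException {
        byte[] bytes = source.readAllBytes();
        return () -> new ByteArrayInputStream(bytes);
    }

    // Hypothetical consumer standing in for a parser: reads to the end
    // and reports how many bytes it saw.
    static long drain(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        InputStream original = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        Supplier<InputStream> streams = bufferOnce(original);

        // Each pass gets its own stream, so it does not matter whether
        // the previous consumer left its stream consumed or rewound.
        long first = drain(streams.get());
        long second = drain(streams.get());
        System.out.println(first + " bytes, then " + second + " bytes");
    }
}
```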
On Mon, Mar 1, 2021 at 10:24 AM Peter Kronenberg wrote:
After more testing, it seems that it has nothing to do with TikaInputStream. I
just passed in a BufferedInputStream to the parsers. I see that the first
thing the AutoDetectParser does is to convert it to a TikaInputStream. So
maybe TIS is being leveraged at a lower level, but there is no reason