Gary,

Sorry for the delay. I'm just back at the keyboard after some time away.

Here is an example stack trace from a gzip stream.  We saw similar messages
for some bzip2 and xz streams.

Caused by: java.io.IOException: Garbage after a valid .gz stream
        at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.init(GzipCompressorInputStream.java:240)
        at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:391)
        at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
        at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:292)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
        at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)
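
To make the "silently stop reading" behavior concrete, here is a minimal
sketch of a caller-side workaround, not an existing Commons Compress option.
GarbageTolerantInputStream and GarbageAfterDataStream are hypothetical names;
the second is only a stand-in for a compressor stream so the sketch runs
without the library. Matching on the message prefix is an assumption based on
the text in the trace above and would break if the wording changes.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical wrapper (not part of Commons Compress): reports end of stream
 * instead of propagating a "Garbage after a valid ..." IOException.
 */
class GarbageTolerantInputStream extends FilterInputStream {

    // Assumption: prefix copied from the exception message in the stack trace.
    private static final String GARBAGE_PREFIX = "Garbage after a valid";

    // Once trailing garbage has been seen, every later read reports EOF.
    private boolean stopped;

    GarbageTolerantInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (stopped) {
            return -1;
        }
        try {
            return in.read(b, off, len);
        } catch (IOException e) {
            String msg = e.getMessage();
            if (msg != null && msg.startsWith(GARBAGE_PREFIX)) {
                stopped = true; // treat trailing garbage as a clean end of stream
                return -1;
            }
            throw e; // any other I/O failure still propagates
        }
    }
}

/** Stand-in for a compressor stream: yields the payload, then throws. */
class GarbageAfterDataStream extends InputStream {
    private final InputStream data;
    private final String message;

    GarbageAfterDataStream(byte[] payload, String message) {
        this.data = new ByteArrayInputStream(payload);
        this.message = message;
    }

    @Override
    public int read() throws IOException {
        int b = data.read();
        if (b == -1) {
            throw new IOException(message);
        }
        return b;
    }
}
```

In real use the wrapper would presumably sit directly around the compressor
stream, for example new GzipCompressorInputStream(raw, true), and inside any
BufferedInputStream. One caveat: bytes decoded during the same read call that
throws may never be returned to the caller, depending on how the underlying
stream is implemented, so the tail of the last member could still be lost.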

Thank you!

On 2023/07/29 14:49:23 Gary Gregory wrote:
> Hi Tim,
> 
> Do you have a stack trace? Maybe this is an option we can add...
> 
> Gary
> 
> On Wed, Jul 26, 2023, 3:22 PM Tim Allison <talli...@apache.org> wrote:
> 
> > We recently had a request to change our default behavior to turn on
> > processing multiple/concatenated compressor streams for gzip, bzip2, etc.
> > When we made this change and compared the updated results with our previous
> > results, we lost quite a few attachments because of the "garbage after a
> > valid x" exception and because of how we're buffering/digesting the stream.
> >
> > Is there any way to turn on extraction of concatenated compressor streams,
> > but have it silently stop reading instead of throwing a garbage exception?
> >
> > Thank you!
> >
> > Best,
> >
> >         Tim
> >
> >
> > [0] https://issues.apache.org/jira/browse/TIKA-4048
> >
> 

