Gary, I'm sorry for my delay. I'm just back to the keyboard from some time away.
This is an example from the gz stream. We had similar messages from some bzip2 and xz.

Caused by: java.io.IOException: Garbage after a valid .gz stream
    at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.init(GzipCompressorInputStream.java:240)
    at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:391)
    at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)
    at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
    at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:292)
    at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
    at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)

Thank you!

On 2023/07/29 14:49:23 Gary Gregory wrote:
> Hi Tim,
>
> Do you have a stack trace? Maybe this is an option we can add...
>
> Gary
>
> On Wed, Jul 26, 2023, 3:22 PM Tim Allison <talli...@apache.org> wrote:
>
> > We recently had a request to change our default behavior to turn on
> > processing multiple/concatenated compressor streams for gzip, bzip2, etc.
> > When we made this change and compared the updated results with our previous
> > results, we lost quite a few attachments because of the "garbage after a
> > valid x" exception and because of how we're buffering/digesting the stream.
> >
> > Is there any way to turn on extraction of concatenated compressor streams,
> > but have it silently stop reading instead of throwing a garbage exception?
> >
> > Thank you!
> >
> > Best,
> >
> > Tim
> >
> >
> > [0] https://issues.apache.org/jira/browse/TIKA-4048
> >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org
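
For illustration only, here is a rough caller-side sketch of the "silently stop reading" behavior asked about in the thread. It is not an existing Commons Compress option. The class name LenientTailInputStream and the helpers in LenientTailDemo are hypothetical. The idea is to wrap the decompressing stream (for example a GzipCompressorInputStream built with decompressConcatenated set to true) and treat an IOException whose message starts with "Garbage after a valid" as end of stream, but only after some data has already been read. The prefix is taken from the trace above; that the bzip2 and xz messages share it is an assumption. To keep the sketch runnable with the JDK alone, the demo uses a stand-in stream that throws the same message instead of a real compressor stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical wrapper: reports EOF instead of failing on trailing garbage. */
class LenientTailInputStream extends FilterInputStream {
    // Message prefix copied from the stack trace; matching on text is brittle.
    private static final String GARBAGE_PREFIX = "Garbage after a valid";
    private boolean sawData = false; // true once at least one byte was delivered
    private boolean ended = false;   // true once EOF was reported to the caller

    LenientTailInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n;
        do {
            n = read(one, 0, 1);
        } while (n == 0);
        return n == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (ended) {
            return -1;
        }
        try {
            int n = super.read(b, off, len);
            if (n > 0) {
                sawData = true;
            } else if (n == -1) {
                ended = true;
            }
            return n;
        } catch (IOException e) {
            String msg = e.getMessage();
            // Swallow only the trailing-garbage case, and only after real data.
            if (sawData && msg != null && msg.startsWith(GARBAGE_PREFIX)) {
                ended = true;
                return -1;
            }
            throw e;
        }
    }
}

class LenientTailDemo {
    /** Stand-in for a compressor stream: yields payload, then throws with the given message. */
    static InputStream failingAfter(byte[] payload, String message) {
        return new InputStream() {
            private int pos = 0;

            @Override
            public int read() throws IOException {
                if (pos < payload.length) {
                    return payload[pos++] & 0xFF;
                }
                throw new IOException(message);
            }
        };
    }

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        InputStream in = new LenientTailInputStream(
                failingAfter(payload, "Garbage after a valid .gz stream"));
        System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
    }
}
```

With a wrapper like this, everything decompressed before the garbage is kept and the trailing bytes are dropped without an error, so a downstream digest would cover only the valid members. Any other IOException, or a garbage error before any data has been read, still propagates to the caller.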