All,

Ok, I looked into it. Those jar files are seriously messed up. Any self-respecting unzipper would be well within its rights to reject them as invalid. As it turns out, my patch to unzip is doing exactly what it’s supposed to. Something that processed those jar files has a bug.

In each of those .jar files there are several entries that are duplicated in a screwy way.

First, those entries each have an exactly duplicated central directory entry, with the same file name, pointing to the same local header offset in the jar file. You can see this even using an unpatched unzip, asking it to unzip the jar file. It will extract a bunch of files, and then ask you if you’d like to replace a file it just unzipped when it encounters the duplicated central directory entry. The fact that two central directory entries are pointing to the same local header is what is, rightly, setting off the zip bomb detection in the patched unzip.

Second, the other screwy thing is that there are a bunch of vestigial blocks of data in those jar files that are not referred to by any central directory entry. I presume that those are the data for those same files that were intended to be pointed to by the duplicated central directory entries, but were orphaned when the offsets to the local headers were not changed for those entries. Of course, even if the offsets had been changed so that all the data in the jar file were actually used, it would still be invalid due to having the same names appear more than once.

The bug that resulted in these jar files might be even more serious if the duplicated entries are pointing to a previous version of those entries, and what was intended was to update those entries to what is now in the orphaned vestigial regions. Hopefully that’s not the case.

In summary, those jar files are grossly invalid zip files for three reasons:

1. The same name appears twice,
2. Two central directory entries point to the same local header (setting off zip bomb detection), and
3. There is an orphaned chunk of the file that is not referred to by any central directory header.

As one example, all three of those things each occur 115 times in asm-5.0.3-sources.jar, out of 261 original central directory entries.
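
For anyone who wants to check their own jar files for the first two problems, here is a rough sketch in Python. It is not what unzip does internally. The function name find_duplicate_entries is made up for illustration, and the sketch assumes that Python's zipfile module returns one ZipInfo per central directory entry from infolist(), repeats included, which is my understanding of how it behaves. It does not look for the orphaned regions (problem 3).

```python
import sys
import zipfile
from collections import Counter


def find_duplicate_entries(jar):
    """Return (duplicate_names, shared_offsets) for a zip/jar file.

    `jar` may be a path or a seekable binary file object.
    Hypothetical helper, for illustration only.
    """
    with zipfile.ZipFile(jar) as zf:
        # Assumption: infolist() has one ZipInfo per central directory
        # entry, so a repeated entry appears more than once here.
        entries = zf.infolist()

    # Problem 1: the same name listed more than once.
    name_counts = Counter(e.filename for e in entries)
    # Problem 2: more than one entry pointing at the same local header.
    offset_counts = Counter(e.header_offset for e in entries)

    duplicate_names = {n: c for n, c in name_counts.items() if c > 1}
    shared_offsets = {o: c for o, c in offset_counts.items() if c > 1}
    return duplicate_names, shared_offsets


if __name__ == "__main__":
    for path in sys.argv[1:]:
        names, offsets = find_duplicate_entries(path)
        print(f"{path}: {len(names)} duplicated names, "
              f"{len(offsets)} shared local header offsets")
```

Saved under some name of your choosing and given the jar files as arguments, it prints one line of counts per file. I have not run it against the jar files discussed here.
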
Were it to be fixed, asm-5.0.3-sources.jar would have 146 unique entries (261 minus the 115 duplicates), and would be quite a bit smaller. So in addition to being invalid, the file as it stands is also unfortunately inefficient, since compression is kinda the point of the zip format.

Mark

> On Jul 12, 2019, at 8:23 PM, Adler, Mark <mad...@alumni.caltech.edu> wrote:
>
> Ben,
>
> Ah, no, I did not test the jar files. I just did, and indeed I am seeing the
> reported zip bomb detections.
>
> Thanks. I’ll look into it.
>
> Mark
>
>
>> On Jul 12, 2019, at 3:22 PM, Ben Caradoc-Davies <b...@transient.nz> wrote:
>>
>> On 13/07/2019 04:32, Adler, Mark wrote:
>>> I downloaded the four false-positive zip files from the bugreport page, and
>>> none of them showed a zip bomb error (or any other error).
>>
>> Mark,
>>
>> the zip bomb error is seen when unzipping the 17 jar files contained within
>> the four zip files. Did you test these inner jar files? I used (in bash):
>>
>> $ for f in *.jar; do echo $f; unzip -tq $f; done
>>
>> The outer zip files are there because many email filters block all email
>> with jar attachments, and Debian BTS is email-based.
>>
>> It would also be nice if unzip reported the filename when rejecting a
>> suspected zip bomb, as it does when reporting "No errors detected".
>>
>> Kind regards,
>>
>> --
>> Ben Caradoc-Davies <b...@transient.nz>
>> Director
>> Transient Software Limited <https://transient.nz/>
>> New Zealand