On Fri, 22 Dec 2023 07:55:24 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:
>> ZipInputStream.readEnd currently assumes a Zip64 data descriptor if the
>> number of compressed or uncompressed bytes read from the inflater is larger
>> than the Zip64 magic value.
>>
>> While the ZIP format mandates that the data descriptor `SHOULD be stored in
>> ZIP64 format (as 8 byte values) when a file's size exceeds 0xFFFFFFFF`, it
>> also states that `ZIP64 format MAY be used regardless of the size of a
>> file`. For such small entries, the above assumption does not hold.
>>
>> This PR augments ZipInputStream.readEnd to also assume 8-byte sizes if the
>> ZipEntry includes a Zip64 extra information field. This brings
>> ZipInputStream into alignment with the APPNOTE format spec:
>>
>>     When extracting, if the zip64 extended information extra
>>     field is present for the file the compressed and
>>     uncompressed sizes will be 8 byte values.
>>
>> While small Zip64 files with 8-byte data descriptors are not commonly found
>> in the wild, it is possible to create one using the Info-ZIP command line
>> `-fd` flag:
>>
>> `echo hello | zip -fd > hello.zip`
>>
>> The PR also adds a test verifying that such a small Zip64 file can be parsed
>> by ZipInputStream.
>
> Eirik Bjørsnøs has updated the pull request with a new target base due to a
> merge or a rebase. The pull request now contains 33 commits:
>
> - Merge branch 'master' into data-descriptor
> - Extract ZIP64_BLOCK_SIZE_OFFSET as a constant
> - A Zip64 extra field used in a LOC header must include both the
>   uncompressed and compressed size fields, and does not include local header
>   offset or disk start number fields. Consequently, a valid LOC Zip64 block must
>   always be 16 bytes long.
> - Document better the zip command and options used to generate the test
>   vector ZIP
> - Fix spelling of "presence"
> - Add a @bug reference in the test
> - Use the term "block size" when referring to the size of a Zip64 extra
>   field data block
> - Update comment to reflect that a Zip64 extended field in a LOC header has
>   only two valid block sizes
> - Convert test from testNG to JUnit
> - Fix the check that the size of an extra field block size must not grow
>   past the total extra field length
> - ... and 23 more: https://git.openjdk.org/jdk/compare/e2042421...ddff130f

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 692:

> 690:     private static boolean isZip64ExtBlockSizeValid(int blockSize) {
> 691:         // Uncompressed and compressed size fields are 8 bytes each
> 692:         return blockSize == 16;

I'm not following this check. As far as I can see, the `blockSize` being passed
to this method is the size of the zip64 extra entry, and as per the spec:

    4.5.3 -Zip64 Extended Information Extra Field (0x0001):

    The following is the layout of the zip64 extended
    information "extra" block. ....

            Value                   Size      Description
            -----                   ----      -----------
    (ZIP64) 0x0001                  2 bytes   Tag for this "extra" block type
            Size                    2 bytes   Size of this "extra" block
            Original Size           8 bytes   Original uncompressed file size
            Compressed Size         8 bytes   Size of compressed data
            Relative Header Offset  8 bytes   Offset of local header record
            Disk Start Number       4 bytes   Number of the disk on which
                                              this file starts

So shouldn't it be 8 + 8 + 8 + 4 = 28 bytes and not 16 bytes? Did I
misunderstand the code or the spec?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1444786479
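
To make the two candidate sizes concrete, here is a minimal sketch of the arithmetic and of scanning an extra field for the Zip64 tag. It is not the code under review: the class `Zip64ExtraSketch` and the methods `hasValidLocZip64Block` and `zip64Block` are hypothetical names introduced for illustration. It assumes the rule stated in the commit list quoted above, namely that a Zip64 block in a LOC header carries only the two 8-byte size fields, while 28 bytes is the size of a block carrying all four fields from the 4.5.3 layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Zip64ExtraSketch {
    // Header ID of the Zip64 extended information extra block (APPNOTE 4.5.3)
    static final int ZIP64_TAG = 0x0001;
    // Assumption taken from the quoted commit message: a LOC block holds
    // only the uncompressed and compressed sizes, 8 bytes each
    static final int LOC_ZIP64_BLOCK_SIZE = 8 + 8;
    // All four fields of the 4.5.3 layout: two sizes, header offset, disk number
    static final int MAX_ZIP64_BLOCK_SIZE = 8 + 8 + 8 + 4;

    /** Returns true if the extra field holds a Zip64 block of the LOC size. */
    static boolean hasValidLocZip64Block(byte[] extra) {
        if (extra == null) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        // Each block starts with a 2-byte tag and a 2-byte block size
        while (buf.remaining() >= 4) {
            int tag = Short.toUnsignedInt(buf.getShort());
            int blockSize = Short.toUnsignedInt(buf.getShort());
            if (blockSize > buf.remaining()) {
                return false; // block would run past the end of the extra field
            }
            if (tag == ZIP64_TAG) {
                return blockSize == LOC_ZIP64_BLOCK_SIZE;
            }
            buf.position(buf.position() + blockSize); // skip unrelated block
        }
        return false;
    }

    /** Builds a zero-filled Zip64 block with the given declared block size. */
    static byte[] zip64Block(int blockSize) {
        ByteBuffer buf = ByteBuffer.allocate(4 + blockSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) ZIP64_TAG).putShort((short) blockSize);
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(hasValidLocZip64Block(zip64Block(LOC_ZIP64_BLOCK_SIZE)));
        System.out.println(hasValidLocZip64Block(zip64Block(MAX_ZIP64_BLOCK_SIZE)));
    }
}
```

Under that assumption, the 16-byte block is accepted and the 28-byte block is rejected, which is the behavior the question is asking about.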