On Mon, 23 Mar 2026 15:04:58 GMT, Eirik Bjørsnøs <[email protected]> wrote:
>> Please review this PR which improves validation of unmappable characters in >> names in the `ZipFileSystem` and `ZipFileOutputStream` APIs. >> >> Currently, `ZipFileSystem::getPath` and `ZipFileOutputStream:putNextEntry` >> both throw `IllegalArgumentException` when rejecting a path or entry name >> which cannot be encoded with the given charset. >> >> This PR fixes `ZipFileSystem::getPath` to instead throw >> `InvalidPathException` as specified. Similarly, >> `ZipOutputStream::putNextEntry` is updated to throw `ZipException` a >> specified. >> >> Related, `ZipOutputStream::putNextEntry` is updated to reject unmappable >> ZipEntry comments in a similar fashion. >> >> This change effectively means that `ZipOutputStream` now encodes names and >> comments twice, once in `putNextEntry` and second time in `writeCEN` when >> the stream is closed. An alternative would be to capture the encoded byte >> arrays in the `XEntry`, however this would increase retained heap memory for >> large number of entries. >> >> During work on this PR, I noticed that ZipFS ZipCoder.toString and >> ZipCoder.getBytes implementations differ from those in java.util.zip. This >> PR aligns ZipFS with java.util.zip. >> >> New tests are added in the ZipFS and ZipFileOutputStream area to verify that >> these APIs throw exceptions according to their specifications when faced >> with unmappable characters. > > Eirik Bjørsnøs has updated the pull request incrementally with one additional > commit since the last revision: > > Avoid encoding name into bytes twice in putNextEntry, then again in writeLOC > I agree and have pushed > [c8130ea](https://github.com/openjdk/jdk/commit/c8130ea751fa008c74bf3442fa97a8bd156c3983) > which capture the encoded byte array in putNextEntry and passes that along > to writeLOC such that we avoid introducing a double encoding here. > > Yes, comments will be encoded twice now, but I agree with you that's less of > a concern. Thanks for this, I think the implementation change is okay now. I assume you'll create a CSR to track the behavior change (ZipException now thrown for cases where IAE was previously thrown). I don't think the compatibility risk is a concern, and of course the IAE was not specified. ------------- PR Comment: https://git.openjdk.org/jdk/pull/30319#issuecomment-4117432667
