On Sun, 22 Mar 2026 19:21:59 GMT, Alan Bateman <[email protected]> wrote:
> Do I read it correct that the entry name and comment will be encoded twice?

Prior to this change, the entry name would be encoded twice, once in writeLOC and once in writeCEN. The comment is only found in the CEN record, hence it would only be encoded in writeCEN.

> I wonder if it should be pushed down to writeLOC.

This was discussed with Lance in an earlier comment in this PR. My response there was that this validation needs to happen in `putNextEntry`, and must happen before the XEntry is added to the xentries list (a Vector, actually!). Otherwise, the finish / close of the ZipOutputStream will fail during writeCEN when an unmappable comment is encoded. Failing during close is bad usability.

If we want to reduce the number of encodings, I see the following options:

1: Capture the encoded byte array after validation in `putNextEntry`, then pass it as a parameter to writeLOC. This way, writeLOC does not have to re-encode. Note that this trick does not work for comments, since they are not output in the LOC.

2: Encode names and comments once during validation, store the byte arrays in the XEntry, then output them in writeLOC and writeCEN. The advantage is that we only encode once; the disadvantage is that we increase retained memory, probably noticeably for a large number of entries with longish names. Entry comments are probably rare in practice.

I think we could do 1 without any worries. 2 I'm more sceptical about.

What do you think?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30319#issuecomment-4106852709
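
For illustration only, option 1 might look roughly like the standalone sketch below. It is not the code in the PR: the class `EncodeOnceSketch`, the helper `encodeStrict`, and the simplified `putNextEntry` / `writeLOC` signatures are all hypothetical stand-ins, and it assumes that validation means a strict encode which reports unmappable characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical stand-in for the relevant part of ZipOutputStream.
final class EncodeOnceSketch {

    // Strictly encode s, failing fast on malformed or unmappable input.
    static byte[] encodeStrict(String s, Charset cs) {
        try {
            ByteBuffer bb = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(s));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("cannot encode: " + s, e);
        }
    }

    // Simplified putNextEntry: validate the name and comment up front,
    // then hand the already-encoded name to writeLOC.
    static byte[] putNextEntry(String name, String comment, Charset cs) {
        byte[] nameBytes = encodeStrict(name, cs);
        if (comment != null) {
            // Validation only: the comment is not part of the LOC,
            // so these bytes are discarded in this variant.
            encodeStrict(comment, cs);
        }
        return writeLOC(nameBytes);
    }

    // Simplified writeLOC: receives the encoded name instead of re-encoding.
    // It returns the bytes it would write so the sketch can be exercised.
    static byte[] writeLOC(byte[] nameBytes) {
        return nameBytes;
    }
}
```

In this shape an unmappable name or comment is rejected in `putNextEntry`, before anything is recorded for the CEN. The name is encoded once for validation and the LOC, and would still be encoded again when the CEN is written.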
