On Sun, 23 Mar 2025 10:45:28 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:
>> Hello Jaikiran, >> >> Before this change, two ZipFile instances opened using different (non-UTF-8) >> charsets would have equal keys, and thus be backed by the same Source and >> ZipCoder instance, whichever ZipFile constructed first would "win". >> >> This seems like a separate bug, independent of the concurrency concerns >> described JDK-8347712. >> >> For the benefit of future maintainers, I think this independent bug should >> be described in a separate JBS issue. >> >> The bug could be solved in a separate PR, however I feel it's also ok to fix >> it in this PR, since moving the ZipCoder instance to ZipFile seems to >> incidentally solve the issue as well. >> >> WDYT? > >> This seems like a separate bug, independent of the concurrency concerns >> described JDK-8347712. > > Consider the following test, which when added to `ZipCoding.java` fails in > master but succeeds in this PR: > > > /** > * Verifies that opening a ZipFile with an incorrect charset does not > * prevent a later opening of the same file using the correct charset. > * > * This may happen if the two ZipFile instances have identical > * ZipFile.Source.Key, thus mapping to the same Source instance. > * > * @throws IOException if an unexpected I/O error occurs > */ > @Test > public void differentNonUTF8Charsets() throws IOException { > // Using ISO-8859-15 and ISO-8859-1 here, since they are similar > // encodings with different characters for some bytes > > // The byte 0xA4 is "Euro sign" in 8859-15, "Currency sign (generic)" in > 8859-1 > String euroSign = "\u20AC"; > String currencySign = "\u00A4"; > > // Create a ZIP, encoded in ISO-8859-15 > byte[] zip = createZIP("ISO-8859-15", euroSign, euroSign); > Path f = Path.of("zf-non-utf.zip"); > Files.write(f, zip); > > // Open a ZipFile instance using (incorrect) encoding ISO-8859-1 > // Then open a ZipFile instance using (correct) encoding ISO-8859-15 > try (ZipFile incorrect = new ZipFile(f.toFile(), > StandardCharsets.ISO_8859_1); > ZipFile correct = new ZipFile(f.toFile(), > Charset.forName("ISO-8859-15"))) { > > // Correct encoding should resolve euro sign > assertNotNull(correct.getEntry(euroSign)); > // Correct encoding should not resolve currency sign > assertNull(correct.getEntry(currencySign)); > > // Incorrect encoding should resolve currency sign > assertNotNull(incorrect.getEntry(currencySign)); > // Incorrect encoding should not resolve euro sign > assertNull(incorrect.getEntry(euroSign)); > } > } Hello Eirik, > Before this change, two ZipFile instances opened using different (non-UTF-8) > charsets would have equal keys, and thus be backed by the same Source and > ZipCoder instance, whichever ZipFile constructed first would "win". > This seems like a separate bug, independent of the concurrency concerns > described JDK-8347712. > For the benefit of future maintainers, I think this independent bug should be > described in a separate JBS issue. That's right - when fixing 8347712 I saw this issue with `Key` class and addressed it in this PR. You have a good point that having a JBS issue specifically for this will be useful. I have now created https://bugs.openjdk.org/browse/JDK-8355975 and linked that issue in this current PR. What helped was your test case example that uses `ISO-8859-15` and `ISO-8859-1` charsets to trigger this issue. I wasn't aware of that difference between `ISO-8859-15` and `ISO-8859-1` charsets. Thank you, I've now added that test in this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23986#discussion_r2068528308