On Sun, 23 Mar 2025 10:45:28 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:

>> Hello Jaikiran,
>> 
>> Before this change, two ZipFile instances opened using different (non-UTF-8) 
>> charsets would have equal keys, and thus be backed by the same Source and 
>> ZipCoder instance, whichever ZipFile constructed first would "win".
>> 
>> This seems like a separate bug,  independent of the concurrency concerns 
>> described  JDK-8347712.
>> 
>> For the benefit of future maintainers, I think this independent bug should 
>> be described in a separate JBS issue.
>> 
>> The bug could be solved in a separate PR, however I feel it's also ok to fix 
>> it in this PR, since moving the ZipCoder instance to ZipFile seems to 
>> incidentally solve the issue as well.
>> 
>> WDYT?
>
>> This seems like a separate bug, independent of the concurrency concerns 
>> described JDK-8347712.
> 
> Consider the following test, which when added to `ZipCoding.java` fails in 
> master but succeeds in this PR:
> 
> 
> /**
>  * Verifies that opening a ZipFile with an incorrect charset does not
>  * prevent a later opening of the same file using the correct charset.
>  *
>  * This may happen if the two ZipFile instances have identical
>  * ZipFile.Source.Key, thus mapping to the same Source instance.
>  *
>  * @throws IOException if an unexpected I/O error occurs
>  */
> @Test
> public void differentNonUTF8Charsets() throws IOException {
>     // Using ISO-8859-15 and ISO-8859-1 here, since they are similar
>     // encodings with different characters for some bytes
> 
>     // The byte 0xA4 is "Euro sign" in 8859-15, "Currency sign (generic)" in 
> 8859-1
>     String euroSign = "\u20AC";
>     String currencySign = "\u00A4";
> 
>     // Create a ZIP, encoded in ISO-8859-15
>     byte[] zip = createZIP("ISO-8859-15", euroSign, euroSign);
>     Path f = Path.of("zf-non-utf.zip");
>     Files.write(f, zip);
> 
>     // Open a ZipFile instance using (incorrect) encoding ISO-8859-1
>     // Then open a ZipFile instance using (correct) encoding ISO-8859-15
>     try (ZipFile incorrect = new ZipFile(f.toFile(), 
> StandardCharsets.ISO_8859_1);
>          ZipFile correct = new ZipFile(f.toFile(), 
> Charset.forName("ISO-8859-15"))) {
> 
>         // Correct encoding should resolve euro sign
>         assertNotNull(correct.getEntry(euroSign));
>         // Correct encoding should not resolve currency sign
>         assertNull(correct.getEntry(currencySign));
> 
>         // Incorrect encoding should resolve currency sign
>         assertNotNull(incorrect.getEntry(currencySign));
>         // Incorrect encoding should not resolve euro sign
>         assertNull(incorrect.getEntry(euroSign));
>     }
> }

Hello Eirik,

> Before this change, two ZipFile instances opened using different (non-UTF-8) 
> charsets would have equal keys, and thus be backed by the same Source and 
> ZipCoder instance, whichever ZipFile constructed first would "win".
> This seems like a separate bug, independent of the concurrency concerns 
> described JDK-8347712.
> For the benefit of future maintainers, I think this independent bug should be 
> described in a separate JBS issue.

That's right - when fixing 8347712 I saw this issue with `Key` class and 
addressed it in this PR. You have a good point that having a JBS issue 
specifically for this will be useful. I have now created 
https://bugs.openjdk.org/browse/JDK-8355975 and linked that issue in this 
current PR. What helped was your test case example that uses `ISO-8859-15` and 
`ISO-8859-1` charsets to trigger this issue. I wasn't aware of that difference 
between `ISO-8859-15` and `ISO-8859-1` charsets. Thank you, I've now added that 
test in this PR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23986#discussion_r2068528308

Reply via email to