Hi, new webrev based on discussions here and offline:
http://cr.openjdk.java.net/~redestad/8243254/open.01/ - creates a new testng test TestZipFileEncodings, which derives from TestZipFile rather than repurposing that existing test. There is definitely some overlap, but since TestZipFile is only run manually and this new test runs fine (and quickly) in our testing I think that's fine - only implements the optimization for UTF-8, removing the concept of ASCII-compatible ZipCoders. - a few other cleanups Testing: tier1+2 Thanks! /Claes On 2020-04-21 15:52, Claes Redestad wrote:
Hi, ZipFile.getEntry does optimizations to check for directory entries by adding a '/' to the encoded byte array. JDK-8242959 improved on this optimization, but a question was raised Jason Zaugg on whether the optimization is always valid[1]. Turns out it isn't, but only for non-ASCII compatible charsets. While JarFiles are always assumed to be UTF-8-encoded or compatible, ZipFiles allow arbitrary encoding. E.g., UTF-16 would encode '/' (2F) as either 2F 00 or 00 2F, which means the hash code would differ and a directory "foo/" potentially not be found when looking up "foo". Further complications arise when/if the directory name ends with a code point that might be encoded so that the final byte is 2F, e.g. \u012F. This patch only enables this low-level optimization when the charset encoding used is known to be ASCII compatible in the sense that 2F will be encoded as single-byte 2F. UTF-8 is compatible in this sense - and since this is the charset exclusively used by JarFile these changes have little to no effect on startup performance in the cases we've been looking at. Webrev: http://cr.openjdk.java.net/~redestad/8243254/open.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8243254 I've partially repurposed ZipFile/TestZipFile to test some such corner cases. Mainly expanded it to test ZipFiles created with various non- standard encodings, but also slimmed it down so that it can actually run quickly and reliably - as well as enabled it in regular testing. The updated test fails both before and after JDK-8242959, but passes with the proposed changes. Testing: tier1+2 Thanks! /Claes [1] https://twitter.com/retronym/status/1252134723431747584?s=20
