On Mon, 30 Jan 2023 10:32:38 GMT, Eirik Bjorsnos <[email protected]> wrote:
> After finding a hash match, getEntryPos needs to compare the lookup name up
> to the encoded entry name in the CEN. This comparison is done by decoding the
> entry name into a String. The names can then be compared using the String
> API. This decoding step adds a significat cost to this method.
>
> This PR suggest to update the string comparison such that in the common case
> where both the lookup name and the entry name are encoded in
> ASCII-compatible UTF-8, decoding can be avoided and the byte arrays can
> instead be compared direcly.
>
> ZipCoder is updated with a new method to compare a string with an encoded
> byte array range. The default implementation decodes to string (like the
> current code), while the UTF-8 implementation uses
> JavaLangAccess.getBytesNoRepl to get the bytes. Both methods thes uses
> Arrays.mismatch for comparison with or without matching trailing slashes.
>
> Additionally, this PR suggest to make the following updates to getEntryPos:
>
> - The try/catch for IAE is redundant and can be safely removed. (initCEN
> already checks this and will throws IAE for invalid UTF-8). This seems to
> give a 3-4% speedup on micros)
> - A new test InvalidBytesInEntryNameOrComment is a added to verify that
> initCEN does in fact reject invalid UTF-8 in CEN file names and comments. (I
> found no existing test coverage for this)
> - The recursion when looking for "name/" matches is replaced with iteration.
> We keep track of any "name/" match and return it at the end of the search. (I
> feel this is easier to follow and it also gives a ~30% speedup for addSlash
> lookups with no regression on regular lookups)
>
> (My though is that including these additional updates in this PR might reduce
> reviewer overhead given that it touches the exact same code. I might be wrong
> on this, please advise :)
>
> I'm seeing a ~17% saving on the micro ZipFileGetEntry.getEntryHit (modified
> to use xalan.jar):
>
> Baseline:
>
> Benchmark (size) Mode Cnt Score Error
> Units
> ZipFileGetEntry.getEntryHit 512 avgt 15 74.941 ± 1.004
> ns/op
> ZipFileGetEntry.getEntryHit 1024 avgt 15 84.943 ± 1.320
> ns/op
> ZipFileGetEntry.getEntryHitUncached 512 avgt 15 120.371 ± 2.386
> ns/op
> ZipFileGetEntry.getEntryHitUncached 1024 avgt 15 126.128 ± 1.075
> ns/op
> ZipFileGetEntry.getEntryMiss 512 avgt 15 23.818 ± 0.838
> ns/op
> ZipFileGetEntry.getEntryMiss 1024 avgt 15 29.762 ± 5.998
> ns/op
> ZipFileGetEntry.getEntryMissUncached 512 avgt 15 59.405 ± 0.545
> ns/op
> ZipFileGetEntry.getEntryMissUncached 1024 avgt 15 71.840 ± 2.455
> ns/op
> ZipFileGetEntry.getEntrySlash 512 avgt 15 135.621 ± 4.341
> ns/op
> ZipFileGetEntry.getEntrySlash 1024 avgt 15 134.190 ± 2.141
> ns/op
>
>
>
> PR:
>
>
> Benchmark (size) Mode Cnt Score Error
> Units
> ZipFileGetEntry.getEntryHit 512 avgt 15 62.267 ± 1.329
> ns/op
> ZipFileGetEntry.getEntryHit 1024 avgt 15 72.916 ± 2.428
> ns/op
> ZipFileGetEntry.getEntryHitUncached 512 avgt 15 101.630 ± 1.154
> ns/op
> ZipFileGetEntry.getEntryHitUncached 1024 avgt 15 113.161 ± 0.502
> ns/op
> ZipFileGetEntry.getEntryMiss 512 avgt 15 23.003 ± 1.191
> ns/op
> ZipFileGetEntry.getEntryMiss 1024 avgt 15 23.236 ± 1.114
> ns/op
> ZipFileGetEntry.getEntryMissUncached 512 avgt 15 56.781 ± 1.505
> ns/op
> ZipFileGetEntry.getEntryMissUncached 1024 avgt 15 67.767 ± 1.963
> ns/op
> ZipFileGetEntry.getEntrySlash 512 avgt 15 73.745 ± 2.717
> ns/op
> ZipFileGetEntry.getEntrySlash 1024 avgt 15 75.784 ± 1.051
> ns/op
>
>
> To assess the impact on startup/warmup, I made a main method class which
> measures the total time of calling ZipFile.getEntry for all entries in the
> 109 JAR file dependenies of spring-petclinic. The shows a nice improvement
> (time in micros):
>
>
> Percentile Baseline Patch
> 50 % 23155 21149
> 75 % 23598 21454
> 90 % 23989 21691
> 95 % 24238 21973
> 99 % 25270 22446
> STDEV 792 549
> Count 500 500
Filed JDK-8301873 for this, update PR title when you're ready.
src/java.base/share/classes/java/lang/System.java line 2668:
> 2666: @Override
> 2667: public int mismatchUTF8(String str, byte[] b, int
> fromIndex, int toIndex) {
> 2668: byte[] encoded = str.isLatin1() ? str.value() :
> str.getBytes(UTF_8.INSTANCE);
I think this is incorrect: latin-1 characters above codepoint 127 (non-ascii)
would be represented by 2 bytes in UTF-8. What you want here is probably
`str.isAscii() ? ...`. The ASCII check will have to look at the bytes, so will
incur a minor penalty.
Good news is that you should already be able to do this with what's already
exposed via `JLA.getBytesNoRepl(str, StandardCharsets.UTF_8)`, so no need for
more shared secrets.
src/java.base/share/classes/java/lang/System.java line 2671:
> 2669: if (false) {
> 2670: // Arrays.mismatch without the range checks (~5%
> faster micro getEntryHit)
> 2671: int aLength = encoded.length;
Part of the difference you're seeing is due to knowing that you'll be matching
the entire length of the first array (`encoded, 0, encoded.length`).
As an experiment I added `Arrays.mismatch(byte[], byte[], int, int)` to
mismatch the entire range of the first array argument vs the range of the
second and can spot an improvement in affected micros:
Benchmark (size) Mode Cnt Score Error
Units
ArraysMismatch.Char.differentSubrangeMatches 90 avgt 10 12.165 ± 0.074
ns/op # mismatch(a, aFrom, aTo, b, bFrom, bTo)
ArraysMismatch.Char.subrangeMatches 90 avgt 10 10.748 ± 0.006
ns/op # mismatch(a, b, bFrom, bTo)
This might be something we can solve in the JITs without having to add new
methods to `java.util.Arrays` to deal as efficiently as possible with the case
when we're matching against the entirety of one of the arrays.
-------------
PR: https://git.openjdk.org/jdk/pull/12290