On Fri, 15 Sep 2023 18:04:29 GMT, 温绍锦 <d...@openjdk.org> wrote:
> @cl4es's PR #15591 discussed the advantages of not using a lookup table.
>
> However, when the input is a byte[], a lookup table can improve performance.
>
> For HexFormat#formatHex(Appendable, byte[]) and HexFormat#formatHex(byte[]),
> the longer the byte[], the larger the gain from the table lookup.

The performance test results are as follows:

## 0. script

    bash configure
    make images
    sh make/devkit/createJMHBundle.sh
    bash configure --with-jmh=build/jmh/jars
    make test TEST="micro:java.util.HexFormatBench.*"

## 1. [aliyun_ecs_c8i.xlarge](https://help.aliyun.com/document_detail/25378.html#c8i)
* cpu : intel xeon sapphire rapids (x64)
* os : debian linux

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.768 ± 0.034  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  2.796 ± 0.042  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.800 ± 0.032  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  2.781 ± 0.018  us/op
-HexFormatBench.formatLower             512  avgt   15  0.544 ± 0.002  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.548 ± 0.004  us/op
-HexFormatBench.formatUpper             512  avgt   15  0.546 ± 0.007  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.550 ± 0.005  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  3.364 ± 0.015  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  3.770 ± 0.017  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  4.990 ± 0.018  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  3.466 ± 0.017  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.415 ± 0.005  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  0.422 ± 0.005  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.413 ± 0.005  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.423 ± 0.004  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.163 ± 0.001  us/op  (+1598.16)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.161 ± 0.001  us/op  (+1636.65)
+HexFormatBench.appenderUpper           512  avgt   15  0.251 ± 0.023  us/op  (+1015.54)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.266 ± 0.001  us/op  (+945.49)
+HexFormatBench.formatLower             512  avgt   15  0.275 ± 0.001  us/op  (+97.82)
+HexFormatBench.formatLowerCached       512  avgt   15  0.277 ± 0.001  us/op  (+97.84)
+HexFormatBench.formatUpper             512  avgt   15  0.285 ± 0.001  us/op  (+91.58)
+HexFormatBench.formatUpperCached       512  avgt   15  0.285 ± 0.001  us/op  (+92.99)
+HexFormatBench.toHexDigitsByte         512  avgt   15  3.554 ± 0.028  us/op  (-5.35)
+HexFormatBench.toHexDigitsInt          512  avgt   15  3.910 ± 0.015  us/op  (-3.59)
+HexFormatBench.toHexDigitsLong         512  avgt   15  5.288 ± 0.018  us/op  (-5.64)
+HexFormatBench.toHexDigitsShort        512  avgt   15  3.637 ± 0.012  us/op  (-4.71)
+HexFormatBench.toHexLower              512  avgt   15  0.445 ± 0.001  us/op  (-6.75)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.442 ± 0.001  us/op  (-4.53)
+HexFormatBench.toHexUpper              512  avgt   15  0.445 ± 0.001  us/op  (-7.20)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.441 ± 0.001  us/op  (-4.09)

## 2. [aliyun_ecs_c8y.xlarge](https://help.aliyun.com/document_detail/25378.html#c8y)
* cpu : aliyun yitian 710 (aarch64)
* os : debian linux

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.857 ± 0.791  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  2.832 ± 0.758  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.360 ± 0.010  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  2.361 ± 0.013  us/op
-HexFormatBench.formatLower             512  avgt   15  0.947 ± 0.406  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.616 ± 0.002  us/op
-HexFormatBench.formatUpper             512  avgt   15  1.212 ± 0.411  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.616 ± 0.001  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  5.844 ± 0.264  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  7.392 ± 0.207  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  8.068 ± 0.303  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  6.214 ± 0.266  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.926 ± 0.003  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  1.000 ± 0.005  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.927 ± 0.002  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.999 ± 0.020  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.356 ± 0.001  us/op  (+702.53)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.356 ± 0.001  us/op  (+695.51)
+HexFormatBench.appenderUpper           512  avgt   15  0.304 ± 0.001  us/op  (+676.32)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.304 ± 0.001  us/op  (+676.65)
+HexFormatBench.formatLower             512  avgt   15  0.461 ± 0.001  us/op  (+105.43)
+HexFormatBench.formatLowerCached       512  avgt   15  0.485 ± 0.001  us/op  (+27.02)
+HexFormatBench.formatUpper             512  avgt   15  0.644 ± 0.003  us/op  (+88.20)
+HexFormatBench.formatUpperCached       512  avgt   15  0.595 ± 0.003  us/op  (+3.53)
+HexFormatBench.toHexDigitsByte         512  avgt   15  5.804 ± 0.237  us/op  (+0.69)
+HexFormatBench.toHexDigitsInt          512  avgt   15  7.209 ± 0.212  us/op  (+2.54)
+HexFormatBench.toHexDigitsLong         512  avgt   15  8.301 ± 0.422  us/op  (-2.81)
+HexFormatBench.toHexDigitsShort        512  avgt   15  5.908 ± 0.255  us/op  (+5.18)
+HexFormatBench.toHexLower              512  avgt   15  0.494 ± 0.001  us/op  (+87.45)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.494 ± 0.001  us/op  (+102.43)
+HexFormatBench.toHexUpper              512  avgt   15  0.494 ± 0.001  us/op  (+87.66)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.493 ± 0.001  us/op  (+102.64)

## 3. MacBook Pro M1 Pro

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.867 ± 0.035  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  1.656 ± 0.875  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.813 ± 0.085  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  1.575 ± 0.901  us/op
-HexFormatBench.formatLower             512  avgt   15  0.385 ± 0.001  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.385 ± 0.002  us/op
-HexFormatBench.formatUpper             512  avgt   15  0.385 ± 0.001  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.384 ± 0.001  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  1.688 ± 0.009  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  2.991 ± 0.015  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  3.719 ± 0.081  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  1.868 ± 0.010  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.321 ± 0.001  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  0.322 ± 0.001  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.324 ± 0.001  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.325 ± 0.001  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.212 ± 0.003  us/op  (+1252.36)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.211 ± 0.001  us/op  (+684.84)
+HexFormatBench.appenderUpper           512  avgt   15  0.199 ± 0.002  us/op  (+1313.57)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.198 ± 0.001  us/op  (+695.46)
+HexFormatBench.formatLower             512  avgt   15  0.221 ± 0.001  us/op  (+74.21)
+HexFormatBench.formatLowerCached       512  avgt   15  0.192 ± 0.001  us/op  (+100.53)
+HexFormatBench.formatUpper             512  avgt   15  0.317 ± 0.002  us/op  (+21.46)
+HexFormatBench.formatUpperCached       512  avgt   15  0.348 ± 0.003  us/op  (+10.35)
+HexFormatBench.toHexDigitsByte         512  avgt   15  1.715 ± 0.011  us/op  (-1.58)
+HexFormatBench.toHexDigitsInt          512  avgt   15  2.261 ± 0.012  us/op  (+32.29)
+HexFormatBench.toHexDigitsLong         512  avgt   15  3.776 ± 0.023  us/op  (-1.51)
+HexFormatBench.toHexDigitsShort        512  avgt   15  1.862 ± 0.011  us/op  (+0.33)
+HexFormatBench.toHexLower              512  avgt   15  0.289 ± 0.004  us/op  (+11.08)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.294 ± 0.002  us/op  (+9.53)
+HexFormatBench.toHexUpper              512  avgt   15  0.288 ± 0.001  us/op  (+12.50)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.295 ± 0.001  us/op  (+10.17)

I added internal methods to StringBuilder for performance optimization. I saw
that the implementation of JEP 430 (String Templates) does something similar:

    class AbstractStringBuilder {
        long mix(long lengthCoder) { }
        long prepend(long lengthCoder, byte[] buffer) {}
        // ...
    }

However, a StringBuilder.appendHex method would have broader use cases and
could be considered as a public API. Should I submit a separate PR to add
these methods?

    class AbstractStringBuilder {
        public void appendHex(byte[] bytes) {}
        public void appendHex(byte[] bytes, boolean ucase) {}
        public void appendHex(byte[] bytes, int fromIndex, int toIndex) {}
        public void appendHex(byte[] bytes, int fromIndex, int toIndex, boolean ucase) {}
    }

Regarding lookup-table performance, I think the table pays off when the byte[]
is longer than 8. In real-world use the byte[] is very likely to be longer
than that. The number 8 is only a rough guess; the threshold could just as
well be 12 or 16.

HexDecimal#DIGITS is a 512-byte table. For a table of that size that is used
repeatedly in a loop, I think the table lookup is worthwhile.

I removed the newly added AbstractStringBuilder.appendHex method. This makes
the change smaller, and the performance improvement is similar. The new
performance test results are as follows:

## 1. [aliyun_ecs_c8i.xlarge](https://help.aliyun.com/document_detail/25378.html#c8i)
* cpu : intel xeon sapphire rapids (x64)
* os : debian linux

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.768 ± 0.034  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  2.796 ± 0.042  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.800 ± 0.032  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  2.781 ± 0.018  us/op
-HexFormatBench.formatLower             512  avgt   15  0.544 ± 0.002  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.548 ± 0.004  us/op
-HexFormatBench.formatUpper             512  avgt   15  0.546 ± 0.007  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.550 ± 0.005  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  3.364 ± 0.015  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  3.770 ± 0.017  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  4.990 ± 0.018  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  3.466 ± 0.017  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.415 ± 0.005  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  0.422 ± 0.005  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.413 ± 0.005  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.423 ± 0.004  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.211 ± 0.002  us/op  (+1211.85)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.210 ± 0.004  us/op  (+1231.43)
+HexFormatBench.appenderUpper           512  avgt   15  0.289 ± 0.002  us/op  (+868.86)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.296 ± 0.019  us/op  (+839.53)
+HexFormatBench.formatLower             512  avgt   15  0.265 ± 0.001  us/op  (+105.29)
+HexFormatBench.formatLowerCached       512  avgt   15  0.267 ± 0.002  us/op  (+105.25)
+HexFormatBench.formatUpper             512  avgt   15  0.274 ± 0.002  us/op  (+99.28)
+HexFormatBench.formatUpperCached       512  avgt   15  0.286 ± 0.019  us/op  (+92.31)
+HexFormatBench.toHexDigitsByte         512  avgt   15  3.351 ± 0.011  us/op  (+0.39)
+HexFormatBench.toHexDigitsInt          512  avgt   15  3.708 ± 0.011  us/op  (+1.68)
+HexFormatBench.toHexDigitsLong         512  avgt   15  5.051 ± 0.014  us/op  (-1.21)
+HexFormatBench.toHexDigitsShort        512  avgt   15  3.456 ± 0.012  us/op  (+0.29)
+HexFormatBench.toHexLower              512  avgt   15  0.445 ± 0.001  us/op  (-6.75)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.441 ± 0.001  us/op  (-4.31)
+HexFormatBench.toHexUpper              512  avgt   15  0.444 ± 0.001  us/op  (-6.99)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.441 ± 0.001  us/op  (-4.09)

## 2. [aliyun_ecs_c8y.xlarge](https://help.aliyun.com/document_detail/25378.html#c8y)
* cpu : aliyun yitian 710 (aarch64)
* os : debian linux

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.857 ± 0.791  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  2.832 ± 0.758  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.360 ± 0.010  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  2.361 ± 0.013  us/op
-HexFormatBench.formatLower             512  avgt   15  0.947 ± 0.406  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.616 ± 0.002  us/op
-HexFormatBench.formatUpper             512  avgt   15  1.212 ± 0.411  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.616 ± 0.001  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  5.844 ± 0.264  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  7.392 ± 0.207  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  8.068 ± 0.303  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  6.214 ± 0.266  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.926 ± 0.003  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  1.000 ± 0.005  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.927 ± 0.002  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.999 ± 0.020  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.343 ± 0.001  us/op  (+732.95)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.345 ± 0.001  us/op  (+720.87)
+HexFormatBench.appenderUpper           512  avgt   15  0.352 ± 0.002  us/op  (+570.46)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.349 ± 0.001  us/op  (+576.51)
+HexFormatBench.formatLower             512  avgt   15  0.464 ± 0.001  us/op  (+104.10)
+HexFormatBench.formatLowerCached       512  avgt   15  0.484 ± 0.002  us/op  (+27.28)
+HexFormatBench.formatUpper             512  avgt   15  0.650 ± 0.001  us/op  (+86.47)
+HexFormatBench.formatUpperCached       512  avgt   15  0.598 ± 0.001  us/op  (+3.02)
+HexFormatBench.toHexDigitsByte         512  avgt   15  5.591 ± 0.058  us/op  (+4.53)
+HexFormatBench.toHexDigitsInt          512  avgt   15  7.080 ± 0.114  us/op  (+4.41)
+HexFormatBench.toHexDigitsLong         512  avgt   15  7.754 ± 0.040  us/op  (+4.05)
+HexFormatBench.toHexDigitsShort        512  avgt   15  5.779 ± 0.076  us/op  (+7.53)
+HexFormatBench.toHexLower              512  avgt   15  0.494 ± 0.001  us/op  (+87.45)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.493 ± 0.001  us/op  (+102.84)
+HexFormatBench.toHexUpper              512  avgt   15  0.494 ± 0.001  us/op  (+87.66)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.493 ± 0.001  us/op  (+102.64)

## 3. MacBook Pro M1 Pro

-Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (baseline)
-HexFormatBench.appenderLower           512  avgt   15  2.867 ± 0.035  us/op
-HexFormatBench.appenderLowerCached     512  avgt   15  1.656 ± 0.875  us/op
-HexFormatBench.appenderUpper           512  avgt   15  2.813 ± 0.085  us/op
-HexFormatBench.appenderUpperCached     512  avgt   15  1.575 ± 0.901  us/op
-HexFormatBench.formatLower             512  avgt   15  0.385 ± 0.001  us/op
-HexFormatBench.formatLowerCached       512  avgt   15  0.385 ± 0.002  us/op
-HexFormatBench.formatUpper             512  avgt   15  0.385 ± 0.001  us/op
-HexFormatBench.formatUpperCached       512  avgt   15  0.384 ± 0.001  us/op
-HexFormatBench.toHexDigitsByte         512  avgt   15  1.688 ± 0.009  us/op
-HexFormatBench.toHexDigitsInt          512  avgt   15  2.991 ± 0.015  us/op
-HexFormatBench.toHexDigitsLong         512  avgt   15  3.719 ± 0.081  us/op
-HexFormatBench.toHexDigitsShort        512  avgt   15  1.868 ± 0.010  us/op
-HexFormatBench.toHexLower              512  avgt   15  0.321 ± 0.001  us/op
-HexFormatBench.toHexLowerCached        512  avgt   15  0.322 ± 0.001  us/op
-HexFormatBench.toHexUpper              512  avgt   15  0.324 ± 0.001  us/op
-HexFormatBench.toHexUpperCached        512  avgt   15  0.325 ± 0.001  us/op

+Benchmark                           (size)  Mode  Cnt  Score   Error  Units  (optimized)
+HexFormatBench.appenderLower           512  avgt   15  0.207 ± 0.001  us/op  (+1285.03)
+HexFormatBench.appenderLowerCached     512  avgt   15  0.206 ± 0.001  us/op  (+703.89)
+HexFormatBench.appenderUpper           512  avgt   15  0.225 ± 0.001  us/op  (+1150.23)
+HexFormatBench.appenderUpperCached     512  avgt   15  0.225 ± 0.001  us/op  (+600.00)
+HexFormatBench.formatLower             512  avgt   15  0.211 ± 0.003  us/op  (+82.47)
+HexFormatBench.formatLowerCached       512  avgt   15  0.186 ± 0.001  us/op  (+106.99)
+HexFormatBench.formatUpper             512  avgt   15  0.312 ± 0.001  us/op  (+23.40)
+HexFormatBench.formatUpperCached       512  avgt   15  0.344 ± 0.001  us/op  (+11.63)
+HexFormatBench.toHexDigitsByte         512  avgt   15  1.718 ± 0.054  us/op  (-1.75)
+HexFormatBench.toHexDigitsInt          512  avgt   15  2.255 ± 0.010  us/op  (+32.64)
+HexFormatBench.toHexDigitsLong         512  avgt   15  3.764 ± 0.005  us/op  (-1.20)
+HexFormatBench.toHexDigitsShort        512  avgt   15  1.858 ± 0.008  us/op  (+0.54)
+HexFormatBench.toHexLower              512  avgt   15  0.289 ± 0.004  us/op  (+11.08)
+HexFormatBench.toHexLowerCached        512  avgt   15  0.295 ± 0.001  us/op  (+9.16)
+HexFormatBench.toHexUpper              512  avgt   15  0.288 ± 0.001  us/op  (+12.50)
+HexFormatBench.toHexUpperCached        512  avgt   15  0.297 ± 0.005  us/op  (+9.43)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15768#issuecomment-1721723317
PR Comment: https://git.openjdk.org/jdk/pull/15768#issuecomment-1721944547
PR Comment: https://git.openjdk.org/jdk/pull/15768#issuecomment-1722180550
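
For readers following the lookup-table discussion in the comments above, here is a minimal sketch of the general technique: a 256-entry short[] table (512 bytes) that maps each byte value to its two hex digits, so the loop does one table load per input byte. The class name HexLookupSketch, the toHex method, and the table layout are assumptions made for illustration; this is not the code from the PR or the JDK's internal implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: HexLookupSketch and toHex are hypothetical names.
final class HexLookupSketch {
    // 256 shorts = 512 bytes. Entry i packs the two lowercase hex digits
    // of byte value i: high digit in the upper 8 bits, low digit in the lower 8.
    private static final short[] DIGITS = new short[256];

    static {
        String hex = "0123456789abcdef";
        for (int i = 0; i < 256; i++) {
            DIGITS[i] = (short) ((hex.charAt(i >> 4) << 8) | hex.charAt(i & 0xF));
        }
    }

    // Converts each input byte to two ASCII hex digits with one table load.
    static String toHex(byte[] bytes) {
        byte[] out = new byte[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            short pair = DIGITS[bytes[i] & 0xFF];
            out[i * 2] = (byte) (pair >> 8);   // high digit
            out[i * 2 + 1] = (byte) pair;      // low digit
        }
        return new String(out, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // prints "007fcafe"
        System.out.println(toHex(new byte[] {0x00, 0x7f, (byte) 0xca, (byte) 0xfe}));
    }
}
```

Whether a table like this beats computing the digits arithmetically depends on input length and on the table staying in cache, which is the trade-off the comments above discuss.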