JDK-8248188 added an IntrinsicCandidate and an API for Base64 decoding.
On aarch64, Base64 decoding can be sped up with ld4/tbl/tbx/st3; the basic idea
is described at
http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
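
For readers unfamiliar with the approach, here is a minimal scalar sketch in Java of the work done per group of four input characters. It is not the stub code from this patch. The class name, the table layout and the null-on-error convention are assumptions made for illustration only. Roughly, the four loads correspond to the de-interleaving done by ld4, the table lookups to tbl/tbx, and the three stores to st3, with the SIMD version handling many groups at once.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class ScalarBase64Sketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Lookup table: byte value -> 6-bit value, or -1 for an illegal character.
    private static final int[] DECODE = new int[256];
    static {
        Arrays.fill(DECODE, -1);
        for (int i = 0; i < ALPHABET.length(); i++) {
            DECODE[ALPHABET.charAt(i)] = i;
        }
    }

    // Decodes unpadded input whose length is a multiple of 4.
    // Returns null if an illegal character is found (hypothetical convention).
    static byte[] decode(byte[] src) {
        if (src.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        byte[] dst = new byte[src.length / 4 * 3];
        for (int s = 0, d = 0; s < src.length; s += 4, d += 3) {
            // Four input characters, one table lookup each.
            int a = DECODE[src[s] & 0xff];
            int b = DECODE[src[s + 1] & 0xff];
            int c = DECODE[src[s + 2] & 0xff];
            int e = DECODE[src[s + 3] & 0xff];
            if ((a | b | c | e) < 0) {
                return null; // at least one lookup returned -1
            }
            // Pack 4 x 6 bits into 3 output bytes.
            dst[d]     = (byte) ((a << 2) | (b >> 4));
            dst[d + 1] = (byte) ((b << 4) | (c >> 2));
            dst[d + 2] = (byte) ((c << 6) | e);
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] in = "SGVsbG8h".getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = decode(in);
        // Cross-check against the JDK decoder.
        System.out.println(Arrays.equals(out, Base64.getDecoder().decode(in)));
    }
}
```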

The patch passed jtreg tier1-3 tests with a linux-aarch64-server-fastdebug build.
The tests in `test/jdk/java/util/Base64/` and
`compiler/intrinsics/base64/TestBase64.java` were also run specifically to
check the correctness of the implementation.

There can be illegal characters at the start of the input if the data is MIME
encoded.
SIMD would bring no benefit in this case, so for now the stub uses non-SIMD
instructions for MIME-encoded data.
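
To illustrate the difference at the API level, the sketch below uses only the public `java.util.Base64` decoders. The class and helper names are hypothetical. The MIME decoder skips characters outside the Base64 alphabet, such as a leading line separator, while the basic decoder rejects them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class MimeDecodeSketch {
    // Decodes with the MIME decoder, which ignores non-alphabet characters.
    static String decodeMime(String s) {
        return new String(Base64.getMimeDecoder().decode(s), StandardCharsets.UTF_8);
    }

    // Returns true if the basic decoder rejects the input.
    static boolean basicRejects(String s) {
        try {
            Base64.getDecoder().decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Input starts with an illegal character pair and has a line break inside.
        String input = "\r\nSGVs\r\nbG8h";
        System.out.println(decodeMime(input));
        System.out.println(basicRejects(input));
    }
}
```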

A JMH micro, Base64Decode.java, is added for performance testing.
Across different input lengths (upper-bounded by the parameter `maxNumBytes` in
the JMH micro),
we see ~2.5x improvement with long inputs and no regression with short
inputs for raw Base64 decoding, and a minor improvement (~10.95%) for MIME on
Kunpeng916.
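
As a rough cross-check, the ratios can be recomputed from individual rows of the results listed in this mail. The sketch below only uses the scores reported for the 50000-byte rows. The class and method names are hypothetical, and other input sizes give somewhat different ratios.

```java
public class SpeedupCheck {
    // Speedup factor: baseline time divided by intrinsic time.
    static double speedup(double baselineNs, double intrinsicNs) {
        return baselineNs / intrinsicNs;
    }

    // Relative reduction in time per operation, in percent.
    static double timeReductionPercent(double baselineNs, double intrinsicNs) {
        return (1.0 - intrinsicNs / baselineNs) * 100.0;
    }

    public static void main(String[] args) {
        // testBase64Decode, maxNumBytes = 50000: default vs. intrinsic (ns/op).
        System.out.printf("raw speedup: %.2fx%n", speedup(50046.586, 22718.726));
        // testBase64MIMEDecode, maxNumBytes = 50000: default vs. intrinsic (ns/op).
        System.out.printf("MIME time reduction: %.2f%%%n",
                timeReductionPercent(355698.369, 317113.667));
    }
}
```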

The Base64Decode.java JMH micro-benchmark results:

# Kunpeng916, intrinsic
Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±     0.609  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±     1.650  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±     0.931  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±     1.687  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±     9.217  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±     1.667  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±     1.751  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±     1.178  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±     0.584  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±     1.050  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±    45.172  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±   192.070  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±   0.759  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op

# Kunpeng916, default
Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±   0.571  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±   0.505  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±   1.452  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±   1.243  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±   1.188  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±   0.572  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±   0.177  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±   0.572  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±   1.559  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±   0.813  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±  24.669  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ± 523.155  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±   0.238  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±   0.205  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±   0.610  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±   4.732  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±   0.177  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±   4.572  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±   0.301  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±   0.389  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±   2.519  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±   9.281  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ± 157.920  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ± 216.542  ns/op

-------------

Commit messages:
 - 8256245: AArch64: Implement Base64 decoding intrinsic

Changes: https://git.openjdk.java.net/jdk/pull/3228/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8256245
  Stats: 410 lines in 3 files changed: 410 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3228.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228

PR: https://git.openjdk.java.net/jdk/pull/3228
