adriacabeza opened a new pull request, #2485:
URL: https://github.com/apache/fory/pull/2485
## What does this PR do?
Currently, primitive arrays are serialized by copying the data buffer directly
using `sun.misc.Unsafe`. After this PR, if the corresponding config options are
enabled, we use SIMD operations to check whether all values in the array fit
in a narrower type and, if so, compress them. For example:
```
int[] → byte[] when all values ∈ [-128, 127]
  Size goes from 4B/elem → 1B/elem (≈ 75% smaller).
int[] → short[] when all values ∈ [-32768, 32767] (but some fall outside [-128, 127])
  4B/elem → 2B/elem (≈ 50% smaller).
long[] → int[] when all values ∈ [Integer.MIN_VALUE, Integer.MAX_VALUE]
  8B/elem → 4B/elem (≈ 50% smaller).
```
See the Benchmark section for details on the performance impact.
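
As an illustration of the narrowing rules above, here is a minimal scalar
sketch in Java. It is not the PR's SIMD implementation: the class
`IntArrayNarrowing`, the `Width` enum, and the helper method names are
hypothetical and exist only for this example.

```java
import java.util.Arrays;

public class IntArrayNarrowing {
    /** Hypothetical tag for the narrowest element type that holds every value. */
    public enum Width { BYTE, SHORT, INT }

    /** Scans the array once, tracking min and max, then picks the narrowest width. */
    public static Width narrowestWidth(int[] values) {
        // Starting at 0 is safe: 0 fits in every width, so it never changes the result.
        int min = 0;
        int max = 0;
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (min >= Byte.MIN_VALUE && max <= Byte.MAX_VALUE) {
            return Width.BYTE;
        }
        if (min >= Short.MIN_VALUE && max <= Short.MAX_VALUE) {
            return Width.SHORT;
        }
        return Width.INT;
    }

    /** Narrows to 1 byte per element; only lossless when narrowestWidth is BYTE. */
    public static byte[] toBytes(int[] values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    /** Widens back to int; the implicit byte-to-int conversion sign-extends. */
    public static int[] fromBytes(byte[] packed) {
        int[] out = new int[packed.length];
        for (int i = 0; i < packed.length; i++) {
            out[i] = packed[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] small = {-128, 0, 42, 127};
        System.out.println(narrowestWidth(small));                      // BYTE
        System.out.println(narrowestWidth(new int[] {-129, 5, 32767})); // SHORT
        System.out.println(narrowestWidth(new int[] {1, 40000}));       // INT
        // Round trip: 4 bytes per element become 1 byte per element and back.
        System.out.println(Arrays.equals(small, fromBytes(toBytes(small)))); // true
    }
}
```

The min/max scan is only one way to phrase the check; it was chosen here
because a single pass yields both the byte and the short decision.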
## Related issues
## Does this PR introduce any user-facing change?
Yes. It adds the `withIntArrayCompressed` and `withLongArrayCompressed` options
to the Fory builder. This is a breaking change for previously compressed
arrays.
- [x] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?
## Benchmark
See `src/main/java/org/apache/fory/benchmark/ArrayCompressionSuite.java` for
the benchmark code. To run it, go to the `/benchmark` directory and:
1. Compile the benchmark
```
javac -cp "../fory-core/target/classes:$(mvn dependency:build-classpath -q -Dmdep.outputFile=/dev/stdout -Pjmh)" \
  -d target/classes \
  src/main/java/org/apache/fory/benchmark/ArrayCompressionSuite.java
```
2. Run benchmark
The benchmarks performed in `ArrayCompressionSuite` show that:
- Compression has negligible cost: deserialization and serialization
throughput are almost identical between compressed and uncompressed arrays.
- Massive SIMD advantage: determining the compression type with SIMD is orders
of magnitude faster than with the scalar check. The SIMD variant stays at
roughly 4.7 × 10^8 ops/s at every array size, while scalar throughput drops
from about 2.6–2.8 × 10^7 ops/s at 100 elements to about 280–300 ops/s at
10,000,000 elements.
- Memory savings: compression (e.g., int[] → byte[] gives 75% smaller arrays)
delivers significant space reductions with no meaningful performance
trade-off.
```
Benchmark                                             (arraySize)   Mode  Cnt         Score          Error  Units
# deserialize
ArrayCompressionSuite.deserializeCompressedIntArray           100  thrpt    3  39337995.386 ±  5360760.131  ops/s
ArrayCompressionSuite.deserializeNormalIntArray               100  thrpt    3  38295765.689 ±  1245305.149  ops/s
ArrayCompressionSuite.deserializeCompressedIntArray          1000  thrpt    3   5551302.526 ±   137012.772  ops/s
ArrayCompressionSuite.deserializeNormalIntArray              1000  thrpt    3   5437271.349 ±   607905.314  ops/s
ArrayCompressionSuite.deserializeCompressedIntArray         10000  thrpt    3    646854.523 ±   162819.346  ops/s
ArrayCompressionSuite.deserializeNormalIntArray             10000  thrpt    3    622439.409 ±   156114.438  ops/s
ArrayCompressionSuite.deserializeCompressedIntArray        100000  thrpt    3     72575.767 ±     1516.885  ops/s
ArrayCompressionSuite.deserializeNormalIntArray            100000  thrpt    3     71451.258 ±     8428.635  ops/s
ArrayCompressionSuite.deserializeCompressedIntArray       1000000  thrpt    3      7333.304 ±     5231.423  ops/s
ArrayCompressionSuite.deserializeNormalIntArray           1000000  thrpt    3      6763.321 ±     6402.086  ops/s
ArrayCompressionSuite.deserializeCompressedIntArray      10000000  thrpt    3       617.388 ±       20.088  ops/s
ArrayCompressionSuite.deserializeNormalIntArray          10000000  thrpt    3       615.323 ±        4.986  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray          100  thrpt    3  25497542.958 ±   409763.148  ops/s
ArrayCompressionSuite.deserializeNormalLongArray              100  thrpt    3  25509274.001 ±   709963.309  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray         1000  thrpt    3    955670.779 ±  1113862.949  ops/s
ArrayCompressionSuite.deserializeNormalLongArray             1000  thrpt    3    923666.161 ±  1873719.259  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray        10000  thrpt    3    358903.335 ±    73728.335  ops/s
ArrayCompressionSuite.deserializeNormalLongArray            10000  thrpt    3    356203.511 ±    60149.652  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray       100000  thrpt    3     36963.012 ±      316.502  ops/s
ArrayCompressionSuite.deserializeNormalLongArray           100000  thrpt    3     41915.010 ±    26587.942  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray      1000000  thrpt    3      3783.469 ±     1564.800  ops/s
ArrayCompressionSuite.deserializeNormalLongArray          1000000  thrpt    3      3834.763 ±      762.397  ops/s
ArrayCompressionSuite.deserializeCompressedLongArray     10000000  thrpt    3       323.378 ±       35.506  ops/s
ArrayCompressionSuite.deserializeNormalLongArray         10000000  thrpt    3       332.769 ±       20.246  ops/s
```
```
Benchmark                                           (arraySize)   Mode  Cnt         Score          Error  Units
# serialize compression
ArrayCompressionSuite.serializeCompressedIntArray           100  thrpt    3  37687667.146 ±   649284.080  ops/s
ArrayCompressionSuite.serializeNormalIntArray               100  thrpt    3  30217468.757 ± 10126179.430  ops/s
ArrayCompressionSuite.serializeCompressedIntArray          1000  thrpt    3  11754790.584 ±  6487913.872  ops/s
ArrayCompressionSuite.serializeNormalIntArray              1000  thrpt    3  11455357.054 ±  1020061.625  ops/s
ArrayCompressionSuite.serializeCompressedIntArray         10000  thrpt    3   1214061.700 ±   438135.395  ops/s
ArrayCompressionSuite.serializeNormalIntArray             10000  thrpt    3   1149324.112 ±    64148.031  ops/s
ArrayCompressionSuite.serializeCompressedIntArray        100000  thrpt    3    130141.077 ±    51758.369  ops/s
ArrayCompressionSuite.serializeNormalIntArray            100000  thrpt    3    125215.129 ±    21495.372  ops/s
ArrayCompressionSuite.serializeCompressedIntArray       1000000  thrpt    3      8399.914 ±     1440.360  ops/s
ArrayCompressionSuite.serializeNormalIntArray           1000000  thrpt    3     10310.840 ±    18645.102  ops/s
ArrayCompressionSuite.serializeCompressedIntArray      10000000  thrpt    3       753.918 ±       97.084  ops/s
ArrayCompressionSuite.serializeNormalIntArray          10000000  thrpt    3       764.189 ±       79.020  ops/s
ArrayCompressionSuite.serializeCompressedLongArray          100  thrpt    3  39784594.200 ±  2699436.390  ops/s
ArrayCompressionSuite.serializeNormalLongArray              100  thrpt    3  40270803.004 ±   864536.346  ops/s
ArrayCompressionSuite.serializeCompressedLongArray         1000  thrpt    3   6477672.225 ±   686396.963  ops/s
ArrayCompressionSuite.serializeNormalLongArray             1000  thrpt    3   6545549.418 ±  1019871.296  ops/s
ArrayCompressionSuite.serializeCompressedLongArray        10000  thrpt    3    617183.974 ±    55659.620  ops/s
ArrayCompressionSuite.serializeNormalLongArray            10000  thrpt    3    635196.999 ±   149026.050  ops/s
ArrayCompressionSuite.serializeCompressedLongArray       100000  thrpt    3     53462.482 ±    35638.133  ops/s
ArrayCompressionSuite.serializeNormalLongArray           100000  thrpt    3     52004.432 ±    23252.382  ops/s
ArrayCompressionSuite.serializeCompressedLongArray      1000000  thrpt    3      6026.037 ±    16709.968  ops/s
ArrayCompressionSuite.serializeNormalLongArray          1000000  thrpt    3      5552.551 ±      466.382  ops/s
ArrayCompressionSuite.serializeCompressedLongArray     10000000  thrpt    3       429.205 ±      105.545  ops/s
ArrayCompressionSuite.serializeNormalLongArray         10000000  thrpt    3       429.779 ±       45.038  ops/s
```
```
Benchmark                                                    (arraySize)   Mode  Cnt          Score          Error  Units
# determine compression type
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD           100  thrpt    3  466402307.389 ±  2555334.745  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar              100  thrpt    3   26427884.309 ±   962450.255  ops/s
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD          1000  thrpt    3  466744123.120 ± 12228234.474  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar             1000  thrpt    3    2757905.535 ±     6773.933  ops/s
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD         10000  thrpt    3  467446373.509 ±  2354677.669  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar            10000  thrpt    3     281539.120 ±    32260.205  ops/s
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD        100000  thrpt    3  467496144.240 ±  5141955.783  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar           100000  thrpt    3      28332.792 ±     1172.081  ops/s
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD       1000000  thrpt    3  457251328.679 ± 67039627.791  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar          1000000  thrpt    3       2814.572 ±       83.414  ops/s
ArrayCompressionSuite.determineIntArrayCompressionTypeSIMD      10000000  thrpt    3  465654341.115 ± 15778486.010  ops/s
ArrayCompressionSuite.determineIntCompressionTypeScalar         10000000  thrpt    3        280.622 ±        4.857  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD          100  thrpt    3  473362303.009 ± 19310541.495  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar             100  thrpt    3   28186064.314 ±   504616.686  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD         1000  thrpt    3  467564131.215 ± 76322855.052  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar            1000  thrpt    3    2918411.197 ±    44589.191  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD        10000  thrpt    3  472648299.254 ± 21619910.950  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar           10000  thrpt    3     300527.267 ±      956.579  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD       100000  thrpt    3  473998295.136 ± 11659405.795  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar          100000  thrpt    3      30140.257 ±       25.836  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD      1000000  thrpt    3  474356048.064 ± 28979532.447  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar         1000000  thrpt    3       2972.073 ±       66.983  ops/s
ArrayCompressionSuite.determineLongArrayCompressionTypeSIMD     10000000  thrpt    3  471544347.771 ± 99677479.793  ops/s
ArrayCompressionSuite.determineLongCompressionTypeScalar        10000000  thrpt    3        298.815 ±        0.287  ops/s
```
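
For readers unfamiliar with the check being timed in the "determine
compression type" runs, the `long[]` → `int[]` decision can be sketched in
plain Java as follows. This is an illustration under assumptions, not the code
from the PR: `LongArrayRangeCheck`, `fitsInInt`, and `payloadBytes` are
hypothetical names, and the loop is scalar (no Vector API), written without a
per-element branch.

```java
public class LongArrayRangeCheck {
    /**
     * Returns true when every element survives a round trip through int.
     * (int) v truncates to 32 bits; widening back to long sign-extends, so the
     * XOR is zero exactly when v lies in [Integer.MIN_VALUE, Integer.MAX_VALUE].
     */
    public static boolean fitsInInt(long[] values) {
        long mismatch = 0;
        for (long v : values) {
            mismatch |= v ^ (int) v; // accumulate any differing bits
        }
        return mismatch == 0;
    }

    /** Element payload size in bytes, ignoring any header the format may add. */
    public static long payloadBytes(int length, boolean narrowed) {
        return (long) length * (narrowed ? Integer.BYTES : Long.BYTES);
    }

    public static void main(String[] args) {
        long[] narrow = {Integer.MIN_VALUE, -1L, 0L, Integer.MAX_VALUE};
        long[] wide = {0L, Integer.MAX_VALUE + 1L};
        System.out.println(fitsInInt(narrow)); // true
        System.out.println(fitsInInt(wide));   // false
        // 8B/elem versus 4B/elem for the same element count.
        System.out.println(payloadBytes(1000, false)); // 8000
        System.out.println(payloadBytes(1000, true));  // 4000
    }
}
```

Accumulating mismatches with OR instead of returning early keeps the loop body
uniform; an early-exit variant would be an equally valid choice when
out-of-range values are expected near the start of the array.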