bryanck opened a new pull request, #5168: URL: https://github.com/apache/iceberg/pull/5168
The vectorized reader benchmarks showed that the Iceberg Parquet vectorized reader falls behind the one in Spark when reading decimal types. Profiling the code revealed a bottleneck in an Arrow method that pads the byte buffer when setting a value in the `DecimalVector`, specifically [this operation](https://github.com/apache/arrow/blob/fb6f200278ecdf65394fc293de8d35edfcda8bde/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java#L241). Runs of [this benchmark](https://gist.github.com/anonymous/1c5ceca58222430d7dfb85bc5cc0e6a1) showed that calling `Unsafe.setMemory()` can be slower than Java array operations. Results of a run are [here](https://gist.github.com/bryanck/9e293b8b10d7f29810ec8fdbb0582ae8).

This PR adds a workaround that pads the byte buffer before calling `setBigEndian()`, so that `Unsafe.setMemory()` is not called.

Here are the results of a run of the `VectorizedReadDictionaryEncodedFlatParquetDataBenchmark` benchmark without this change:

```
Benchmark                                                                                  Mode  Cnt   Score    Error  Units
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   2.016 ±  0.069   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   2.083 ±  0.076   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  14.451 ±  0.273   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5   6.886 ±  0.163   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   2.058 ±  0.108   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   1.731 ±  0.117   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   1.905 ±  0.016   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   2.436 ±  0.178   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   2.975 ±  0.053   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   2.461 ±  0.951   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   2.713 ±  0.075   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   2.321 ±  0.953   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   3.154 ±  0.062   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   4.567 ±  1.864   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   2.674 ±  0.085   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   2.634 ±  0.089   s/op
```

Here are the results of a run with this change:

```
Benchmark                                                                                  Mode  Cnt   Score    Error  Units
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   2.339 ±  1.092   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   2.204 ±  0.085   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5   8.501 ±  0.129   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5   7.130 ±  0.111   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   2.677 ±  0.083   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   2.251 ±  0.142   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   2.616 ±  0.090   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   2.438 ±  0.074   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   2.620 ±  0.171   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   2.242 ±  0.140   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   2.679 ±  0.084   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   2.504 ±  0.173   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   3.804 ±  0.215   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   4.864 ±  0.163   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   2.544 ±  0.086   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   2.524 ±  0.193   s/op
```
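
For readers who want a picture of the padding workaround described in this PR, here is a minimal sketch in plain Java. It is not the code from the PR: the class name `DecimalPadding`, the method `padBigEndian`, and the 16-byte target width (the size of a 128-bit decimal) are assumptions made for illustration. The idea is to sign-extend the big-endian unscaled value into a full-width array using ordinary array operations before the value is handed to the vector.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not taken from the PR.
final class DecimalPadding {

    private DecimalPadding() {}

    /**
     * Sign-extends a big-endian two's-complement value to exactly {@code width} bytes.
     * Returns the input array unchanged when it already has the requested width.
     */
    static byte[] padBigEndian(byte[] bigEndian, int width) {
        if (bigEndian.length == width) {
            return bigEndian;
        }
        if (bigEndian.length > width) {
            throw new IllegalArgumentException(
                "Value is " + bigEndian.length + " bytes, wider than " + width);
        }

        byte[] padded = new byte[width]; // zero-filled, correct for non-negative values
        int offset = width - bigEndian.length;

        // A negative value has its top bit set, so the leading bytes must be 0xFF.
        if (bigEndian.length > 0 && bigEndian[0] < 0) {
            Arrays.fill(padded, 0, offset, (byte) 0xFF);
        }

        // Copy the original bytes into the low-order end of the padded array.
        System.arraycopy(bigEndian, 0, padded, offset, bigEndian.length);
        return padded;
    }

    public static void main(String[] args) {
        byte[] unscaled = BigInteger.valueOf(-12345).toByteArray();
        byte[] padded = padBigEndian(unscaled, 16); // 16 bytes assumed for a 128-bit decimal
        System.out.println(unscaled.length + " bytes in, " + padded.length + " bytes out");
        System.out.println("value preserved: " + new BigInteger(padded));
    }
}
```

A caller would pass the padded array to `setBigEndian()` in place of the shorter one. Per the description above, supplying a full-width buffer is what keeps the `Unsafe.setMemory()` padding path from being taken.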
