Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/23401 )
Change subject: IMPALA-14386: Add benchmarks for Byte Stream Split encoding ...................................................................... Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc File be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@139 PS2, Line 139: byte_addition As far as I can see we only use it with byte vectors of size 4 and 8. I don't think it's worth the additional complexity to add this function instead of using uint32_t and uint64_t. Also, I wouldn't worry much about overflow, it's not a big problem if it occurs once in a dataset. http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@205 PS2, Line 205: Benchmark Data Simulator Functions This could be a bit hard to understand... Could you reformulate it? http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@218 PS2, Line 218: void testOutputCorrectness( Could add a comment why this is needed and operator== is not enough: is it because better reporting of the index where the vectors differ? http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@239 PS2, Line 239: void BSSRun_DecodeBatch(int batch_size, void* d) { BSSRun_DecodeBatch and BSSComp_DecodeBatch are almost the same, the only difference is the decoder. They could be factorised into a common function template that takes the decoder as a parameter (the type of the type template). This is good because if the functions change, they will automatically be in sync. The same could be done with the other runtime - compile time function pairs. -- To view, visit http://gerrit.cloudera.org:8080/23401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I708af625348b0643aa3f37525b8a6e74f0c47057 Gerrit-Change-Number: 23401 Gerrit-PatchSet: 2 Gerrit-Owner: Gabriella Gyorgyevics <ggyorgyev...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Comment-Date: Wed, 10 Sep 2025 12:04:35 +0000 Gerrit-HasComments: Yes