Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23401 )

Change subject: IMPALA-14386: Add benchmarks for Byte Stream Split encoding
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc
File be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@139
PS2, Line 139: byte_addition
As far as I can see we only use it with byte vectors of size 4 and 8. I don't 
think it's worth the additional complexity to add this function instead of 
using uint32_t and uint64_t.

Also, I wouldn't worry much about overflow, it's not a big problem if it occurs 
once in a dataset.


http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@205
PS2, Line 205: Benchmark Data Simulator Functions
This could be a bit hard to understand... Could you reformulate it?


http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@218
PS2, Line 218: void testOutputCorrectness(
Could add a comment why this is needed and operator== is not enough: is it 
because better reporting of the index where the vectors differ?


http://gerrit.cloudera.org:8080/#/c/23401/2/be/src/benchmarks/parquet-byte-stream-split-decoder-benchmark.cc@239
PS2, Line 239: void BSSRun_DecodeBatch(int batch_size, void* d) {
BSSRun_DecodeBatch and BSSComp_DecodeBatch are almost the same, the only 
difference is the decoder. They could be factorised into a common function 
template that takes the decoder as a parameter (the type of the type template).

This is good because if the functions change, they will automatically be in 
sync.

The same could be done with the other runtime - compile time function pairs.



--
To view, visit http://gerrit.cloudera.org:8080/23401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I708af625348b0643aa3f37525b8a6e74f0c47057
Gerrit-Change-Number: 23401
Gerrit-PatchSet: 2
Gerrit-Owner: Gabriella Gyorgyevics <ggyorgyev...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Sep 2025 12:04:35 +0000
Gerrit-HasComments: Yes

Reply via email to