mapleFU commented on code in PR #15124:
URL: https://github.com/apache/arrow/pull/15124#discussion_r1060511173
##########
cpp/src/parquet/encoding.cc:
##########
@@ -2479,7 +2481,18 @@ class DeltaBitPackDecoder : public DecoderImpl, virtual public TypedDecoder<DTyp
if (ARROW_PREDICT_FALSE(values_current_mini_block_ == 0)) {
if (ARROW_PREDICT_FALSE(!block_initialized_)) {
buffer[i++] = last_value_;
- if (ARROW_PREDICT_FALSE(i == max_values)) break;
+ if (ARROW_PREDICT_FALSE(i == max_values)) {
+ // When block is uninitialized and i reaches max_values we have two
+ // different possibilities:
+ // 1. i == total_value_count_, which means that the page may have only one
+ // value and we should not initialize any block.
+ // 2. i != total_value_count_ which means that user just read the first value
+ // in the page, so we should initialize the incoming block.
+ if (i != static_cast<int>(total_value_count_)) {
+ InitBlock();
+ }
Review Comment:
By the way, should we add `ARROW_PREDICT_FALSE` here? @pitrou
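For reference, the two cases in the comment above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: `ToyDeltaDecoder`, `ReadAll`, `values_remaining_`, `next_delta_`, and `init_block_calls()` are hypothetical names made up for illustration, and the deltas are kept in a plain vector rather than bit-packed mini blocks, so it is not the real `DeltaBitPackDecoder`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model (hypothetical, not Arrow code): the first value lives in the
// page header, every later value is the previous value plus a stored delta.
class ToyDeltaDecoder {
 public:
  ToyDeltaDecoder(int64_t first_value, std::vector<int64_t> deltas)
      : last_value_(first_value),
        deltas_(std::move(deltas)),
        total_value_count_(static_cast<int>(deltas_.size()) + 1),
        values_remaining_(total_value_count_) {}

  // Decodes up to max_values values into buffer; returns the number decoded.
  int Decode(int64_t* buffer, int max_values) {
    max_values = std::min(max_values, values_remaining_);
    int i = 0;
    while (i < max_values) {
      if (!block_initialized_) {
        buffer[i++] = last_value_;
        if (i == max_values) {
          // Case 1: the page holds a single value, so there is no block.
          // Case 2: the caller asked for one value only; initialize the
          // block now so the next call does not emit the first value again.
          if (i != total_value_count_) InitBlock();
          break;
        }
        InitBlock();
        continue;
      }
      assert(next_delta_ < deltas_.size());
      last_value_ += deltas_[next_delta_++];
      buffer[i++] = last_value_;
    }
    values_remaining_ -= i;
    return i;
  }

  int init_block_calls() const { return init_block_calls_; }

 private:
  void InitBlock() {
    block_initialized_ = true;
    ++init_block_calls_;
  }

  int64_t last_value_;
  std::vector<int64_t> deltas_;
  int total_value_count_;
  int values_remaining_;
  std::size_t next_delta_ = 0;
  bool block_initialized_ = false;
  int init_block_calls_ = 0;
};

// Reads the whole page in batches of `batch` values (batch must be > 0).
std::vector<int64_t> ReadAll(ToyDeltaDecoder& decoder, int batch) {
  std::vector<int64_t> out;
  std::vector<int64_t> buffer(static_cast<std::size_t>(batch));
  while (true) {
    const int n = decoder.Decode(buffer.data(), batch);
    if (n == 0) break;
    out.insert(out.end(), buffer.begin(), buffer.begin() + n);
  }
  return out;
}
```

In this model, reading with a batch size of 1 is the case that would emit the first value twice if the inner `InitBlock()` call were missing, because the second `Decode` call would still see an uninitialized block.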
##########
cpp/src/parquet/encoding_test.cc:
##########
@@ -1324,6 +1324,29 @@ class TestDeltaBitPackEncoding : public TestEncodingBase<Type> {
CheckRoundtripSpaced(valid_bits, valid_bits_offset);
}
+ void ExecuteSteps(int nvalues, int repeats, int read_batch) {
Review Comment:
That's fine, but most tests currently don't use it, so I'd like to keep it
simple here. If we later want to test `batch size` for other encodings, we can
move it to `CheckRoundtrip`.
##########
cpp/src/parquet/encoding.cc:
##########
@@ -2479,7 +2481,18 @@ class DeltaBitPackDecoder : public DecoderImpl, virtual public TypedDecoder<DTyp
if (ARROW_PREDICT_FALSE(values_current_mini_block_ == 0)) {
if (ARROW_PREDICT_FALSE(!block_initialized_)) {
buffer[i++] = last_value_;
- if (ARROW_PREDICT_FALSE(i == max_values)) break;
+ if (ARROW_PREDICT_FALSE(i == max_values)) {
Review Comment:
Nice catch, I'll add it.