mapleFU commented on PR #15241: URL: https://github.com/apache/arrow/pull/15241#issuecomment-1377686245
Well, I found that rok's testing data already covers that case: And assume the page has 65 values and 32 values per-mini-block: ``` total_value_count_ == 1, call InitBlock current_num_values = 0, read block one, change to 32. current_num_values = 32, read block one, change to 64. current_num_values = 64, read block one, change to 96. ``` When changing to `InitBlock(is_init_page)`, I meet another trickey problem: ```c++ while (i < max_values) { ... decode values from miniblocks } total_values_remaining_ -= max_values; ``` So, `InitBlock()` should have a counter, or it should decrease `total_values_remaining_` per loop. I think there are some fixing: 1. current my implemention is ok here, because it maintains a `current miniblock counter` 2. The change like below: ```c++ while (i < max_values) { ... decode values from miniblocks, when i increase, decrease remaining ... or passing i into InitBlock } CHECK_EQ(i, max_values); ``` @pitrou what's your optionion here? Maintaining a `current value counter` or `passing i into InitBlock`, like `InitBlock(int prefix_counter)`, or decrease `remaining counter` every time we inc? (Personally I'd like to call `InitBlock(int prefix_counter)`, but I'd like to know what's your opinion) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org