pitrou commented on code in PR #39234:
URL: https://github.com/apache/arrow/pull/39234#discussion_r1434103894
##########
cpp/src/arrow/compute/light_array.cc:
##########
@@ -395,8 +395,12 @@ int ExecBatchBuilder::NumRowsToSkip(const std::shared_ptr<ArrayData>& column,
 --num_rows_left;
 int row_id_removed = row_ids[num_rows_left];
 const uint32_t* offsets =
-    reinterpret_cast<const uint32_t*>(column->buffers[1]->data());
+    reinterpret_cast<const uint32_t*>(column->buffers[1]->data()) + column->offset;
 num_bytes_skipped += offsets[row_id_removed + 1] - offsets[row_id_removed];
+// Skip consecutive rows with the same id
Review Comment:
I don't understand what `row_ids` is or why this is needed.
Would you like to update the docstring for `NumRowsToSkip` to make the
semantics more understandable?
Also, why is `row_ids` ignored for fixed-width columns?