lidavidm commented on a change in pull request #11041:
URL: https://github.com/apache/arrow/pull/11041#discussion_r700412732
##########
File path: cpp/src/arrow/compute/exec/key_encode.cc
##########
@@ -427,11 +427,19 @@ void KeyEncoder::EncoderInteger::Decode(uint32_t start_row, uint32_t num_rows,
   row_base += offset_within_row;
   uint8_t* col_base = col_prep.mutable_data(1);
   switch (col_prep.metadata().fixed_length) {
-    case 1:
+    case 1: {
       for (uint32_t i = 0; i < num_rows; ++i) {
         col_base[i] = row_base[i * row_size];
       }
+      // For booleans, we pack 8 bytes at a time, and the buffer we're
+      // writing to here may not be fully initialized - so make sure a
+      // multiple of 8 bytes are initialized to avoid Valgrind errors. The
+      // temp buffer is sized to num_rows uint32_t values, so there's more
+      // than enough space here.
Review comment:
In that case, since the temp buffer is reused quite a bit, could we just
initialize the underlying buffer on allocation? That should be a fixed,
one-time setup cost: one large buffer (TempVectorStack) is allocated when the
grouper is created, and slices of it (TempVectorHolder) are then handed out as
scratch space.
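
To make the suggestion concrete, here is a minimal sketch of the idea. It does not use the actual TempVectorStack/TempVectorHolder interface: `ScratchStack`, `Push`, `Pop`, and `PackBytesToBits` are hypothetical names for illustration only. The backing buffer is zero-filled once at construction, and every slice is rounded up to a multiple of 8 bytes, so a packing loop that reads 8 bytes at a time never touches uninitialized memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stack-style scratch allocator. This is NOT
// Arrow's TempVectorStack API, only an illustration of the proposal.
class ScratchStack {
 public:
  // Zero-fill the whole backing buffer once, at creation time.
  explicit ScratchStack(std::size_t capacity) : buffer_(capacity, 0), top_(0) {}

  // Hand out the next `size` bytes as scratch space, rounded up to a
  // multiple of 8 so 8-byte-at-a-time readers stay inside the slice.
  uint8_t* Push(std::size_t size) {
    const std::size_t padded = RoundUp8(size);
    assert(top_ + padded <= buffer_.size());
    uint8_t* out = buffer_.data() + top_;
    top_ += padded;
    return out;
  }

  // Release the most recently pushed slice (calls must mirror Push order).
  void Pop(std::size_t size) {
    const std::size_t padded = RoundUp8(size);
    assert(padded <= top_);
    top_ -= padded;
  }

 private:
  static std::size_t RoundUp8(std::size_t n) { return (n + 7) / 8 * 8; }

  std::vector<uint8_t> buffer_;
  std::size_t top_;
};

// Pack one byte per value into one bit per value, consuming 8 input bytes
// per output byte. Reads up to the next multiple of 8 past num_values.
void PackBytesToBits(const uint8_t* bytes, std::size_t num_values,
                     uint8_t* bits) {
  for (std::size_t i = 0; i < num_values; i += 8) {
    uint8_t packed = 0;
    for (std::size_t j = 0; j < 8; ++j) {
      packed |= static_cast<uint8_t>((bytes[i + j] != 0) << j);
    }
    bits[i / 8] = packed;
  }
}
```

With this layout, the padding bytes of a reused slice may hold stale values from an earlier use, but they are always initialized, which is what Valgrind checks. Callers would still need to ignore bits past `num_values` in the last packed byte.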