alamb commented on a change in pull request #8303:
URL: https://github.com/apache/arrow/pull/8303#discussion_r496930511
##########
File path: rust/arrow/src/compute/kernels/filter.rs
##########
@@ -353,15 +353,19 @@ impl FilterContext {
// foreach bit in batch:
if (filter_batch & self.filter_mask[j]) != 0 {
let data_index = (i * 64) + j;
- values.push(input_array.value(data_index));
+ if input_array.is_null(data_index) {
Review comment:
This is the same pattern as in the handling for primitive arrays:
https://github.com/apache/arrow/pull/8303/files#diff-d7b0b7cde1850e8744ceda458c6dea81R294-L298
##########
File path: rust/arrow/src/compute/kernels/filter.rs
##########
@@ -373,7 +377,11 @@ impl FilterContext {
// foreach bit in batch:
if (filter_batch & self.filter_mask[j]) != 0 {
let data_index = (i * 64) + j;
- values.push(input_array.value(data_index));
+ if input_array.is_null(data_index) {
Review comment:
Likewise, this special case appears to miss the null check too
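
For illustration, here is a minimal, self-contained sketch of why the null check matters in this loop. It does not use the arrow crate: `ToyStringArray`, `filter_strings`, and the `1u64 << j` mask are hypothetical stand-ins (the last one is an assumption about what `self.filter_mask[j]` holds). The point is only that selected null slots must come out as `None` rather than as whatever `value()` happens to return for them.

```rust
/// Hypothetical stand-in for a nullable string column (not the arrow API).
struct ToyStringArray<'a> {
    values: Vec<&'a str>,
    valid: Vec<bool>,
}

impl<'a> ToyStringArray<'a> {
    fn is_null(&self, i: usize) -> bool {
        !self.valid[i]
    }

    fn value(&self, i: usize) -> &'a str {
        self.values[i]
    }
}

/// Walk the filter in 64-bit batches and keep every selected slot,
/// preserving nulls as `None`.
fn filter_strings<'a>(input: &ToyStringArray<'a>, filter: &[u64]) -> Vec<Option<&'a str>> {
    let mut out = Vec::new();
    for (i, &filter_batch) in filter.iter().enumerate() {
        for j in 0..64 {
            // Assumption: bit j of the batch selects slot i * 64 + j.
            if (filter_batch & (1u64 << j)) != 0 {
                let data_index = (i * 64) + j;
                if input.is_null(data_index) {
                    out.push(None);
                } else {
                    out.push(Some(input.value(data_index)));
                }
            }
        }
    }
    out
}

fn main() {
    let input = ToyStringArray {
        values: vec!["a", "", "c"],
        valid: vec![true, false, true],
    };
    // Bits 0 and 1 are set, so slots 0 and 1 are kept; slot 1 is null.
    let kept = filter_strings(&input, &[0b011]);
    println!("{:?}", kept);
}
```

Without the `is_null` branch, the null in slot 1 would be emitted as an ordinary empty string and the output would lose its validity information.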
##########
File path: rust/arrow/src/compute/kernels/filter.rs
##########
@@ -353,15 +353,19 @@ impl FilterContext {
// foreach bit in batch:
if (filter_batch & self.filter_mask[j]) != 0 {
let data_index = (i * 64) + j;
- values.push(input_array.value(data_index));
+ if input_array.is_null(data_index) {
+ values.push(None)
+ } else {
+ values.push(Some(input_array.value(data_index)))
+ }
}
}
}
Ok(Arc::new(BinaryArray::from(values)))
}
DataType::Utf8 => {
let input_array =
array.as_any().downcast_ref::<StringArray>().unwrap();
- let mut values: Vec<&str> = Vec::with_capacity(self.filtered_count);
+ let mut values: Vec<Option<&str>> = Vec::with_capacity(self.filtered_count);
Review comment:
Note that using an `Option` is likely to increase the temporary storage
requirements a bit.
It would likely be possible to avoid this temporary allocation entirely if we
used the lower level `ArrayBuilder::with_bit_buffer`.
I chose to follow the style of the rest of this module, though I would love
opinions on whether to perf check / optimize this (maybe a follow-on JIRA
ticket is enough?)
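
As a rough sketch of the alternative, the filter could write directly into the output buffers (concatenated bytes, offsets, and a validity buffer) and skip the intermediate `Vec<Option<&str>>`. The code below is stdlib-only and hypothetical: `ToyColumn` and `filter_into_buffers` are illustrative names, not arrow APIs, and the validity buffer is a `Vec<bool>` rather than a packed bitmap to keep it short. As a side note, on current Rust compilers `Option<&str>` is the same size as `&str` thanks to the null-pointer niche, so the extra cost is mostly the temporary vector itself rather than the `Option` wrapper.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a nullable string column (not the arrow API).
struct ToyColumn<'a> {
    values: Vec<&'a str>,
    valid: Vec<bool>,
}

/// Filter straight into output buffers: concatenated string bytes,
/// offsets into those bytes, and one validity flag per output slot.
fn filter_into_buffers(input: &ToyColumn, keep: &[bool]) -> (Vec<u8>, Vec<i32>, Vec<bool>) {
    let mut data: Vec<u8> = Vec::new();
    let mut offsets: Vec<i32> = vec![0];
    let mut validity: Vec<bool> = Vec::new();

    for (i, &selected) in keep.iter().enumerate() {
        if !selected {
            continue;
        }
        // Null slots contribute no bytes, only a repeated offset.
        if input.valid[i] {
            data.extend_from_slice(input.values[i].as_bytes());
        }
        offsets.push(data.len() as i32);
        validity.push(input.valid[i]);
    }
    (data, offsets, validity)
}

fn main() {
    println!(
        "&str: {} bytes, Option<&str>: {} bytes",
        size_of::<&str>(),
        size_of::<Option<&str>>()
    );

    let input = ToyColumn {
        values: vec!["a", "", "ccc", "dd"],
        valid: vec![true, false, true, true],
    };
    let (data, offsets, validity) = filter_into_buffers(&input, &[true, true, false, true]);
    println!("{:?} {:?} {:?}", data, offsets, validity);
}
```

Whether this is actually faster than the `Vec<Option<&str>>` version would need a benchmark; the sketch only shows that the null handling does not require the temporary vector.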