yjshen commented on a change in pull request #2068:
URL: https://github.com/apache/arrow-datafusion/pull/2068#discussion_r835741197
##########
File path: datafusion-physical-expr/src/physical_expr.rs
##########
@@ -38,4 +43,74 @@ pub trait PhysicalExpr: Send + Sync + Display + Debug {
fn nullable(&self, input_schema: &Schema) -> Result<bool>;
/// Evaluate an expression against a RecordBatch
fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue>;
+ /// Evaluate an expression against a RecordBatch with validity array
+ fn evaluate_selection(
+ &self,
+ batch: &RecordBatch,
+ selection: &BooleanArray,
+ ) -> Result<ColumnarValue> {
+ let mut indices = vec![];
+ for (i, b) in selection.iter().enumerate() {
+ if let Some(true) = b {
+ indices.push(i as u64);
+ }
+ }
+ let indices = UInt64Array::from_iter_values(indices);
Review comment:
No, I think the divide kernel works correctly, since it only deals with
valid indices.
Are you suggesting I should create a new RecordBatch by masking the current
batch instead of using the take-then-scatter approach? Should I create bitmaps
from the existing ones for each array with the help of `arrow::bit_util`, or am
I missing something handy?
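To make the take-then-scatter idea concrete, here is a minimal sketch in plain Rust. It is a toy model under stated assumptions, not the code in this PR: columns are modeled as `Vec<Option<i64>>` rather than Arrow arrays, and `selected_indices`, `take`, `scatter`, `divide`, and `evaluate_selection_model` are hypothetical helpers written for illustration, not Arrow or DataFusion APIs.

```rust
/// Collect the positions where the selection mask is `Some(true)`,
/// mirroring the index-building loop in the diff above.
fn selected_indices(selection: &[Option<bool>]) -> Vec<usize> {
    selection
        .iter()
        .enumerate()
        .filter_map(|(i, b)| if *b == Some(true) { Some(i) } else { None })
        .collect()
}

/// "Take": gather the rows at `indices` into a compact column.
fn take(column: &[Option<i64>], indices: &[usize]) -> Vec<Option<i64>> {
    indices.iter().map(|&i| column[i]).collect()
}

/// "Scatter": write the compact results back to their original positions,
/// leaving every unselected row as null (`None`).
fn scatter(results: &[Option<i64>], indices: &[usize], len: usize) -> Vec<Option<i64>> {
    let mut out = vec![None; len];
    for (&i, r) in indices.iter().zip(results) {
        out[i] = *r;
    }
    out
}

/// Hypothetical stand-in for an expression: null-aware integer division.
/// In this toy model a zero divisor yields `None`; a real kernel may
/// return an error instead.
fn divide(a: &[Option<i64>], b: &[Option<i64>]) -> Vec<Option<i64>> {
    a.iter()
        .zip(b)
        .map(|(x, y)| match (x, y) {
            (Some(x), Some(y)) => x.checked_div(*y),
            _ => None,
        })
        .collect()
}

/// Evaluate `a / b` only on the selected rows, then restore the original length.
fn evaluate_selection_model(
    a: &[Option<i64>],
    b: &[Option<i64>],
    selection: &[Option<bool>],
) -> Vec<Option<i64>> {
    let indices = selected_indices(selection);
    let compact = divide(&take(a, &indices), &take(b, &indices));
    scatter(&compact, &indices, selection.len())
}
```

In this model the expression never sees the unselected rows (including a row with a zero divisor), which is the property the take-then-scatter approach relies on. The trade-off is the extra gather and scatter copies compared with masking the batch in place.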