zanmato1984 commented on code in PR #42106:
URL: https://github.com/apache/arrow/pull/42106#discussion_r1635806122


##########
cpp/src/arrow/compute/expression.cc:
##########
@@ -732,6 +776,10 @@ Result<Datum> ExecuteScalarExpression(const Expression& 
expr, const ExecBatch& i
         "ExecuteScalarExpression cannot Execute non-scalar expression ", 
expr.ToString());
   }
 
+  if (!expr.selection_vector_aware() && input.selection_vector) {

Review Comment:
   This actually implies a strategy (1) that is: if any subexpression 
(including the top-level expression itself) is not selection vector aware, we 
do the gather/scatter work only once for the top-level expression. This way all 
the subexpressions can work on a (gathered) continuous input.
   
   An alternative strategy (2) could be: every expression that is not selection 
vector aware (not counting its subexpressions), does the gather/scatter itself. 
This way every expression itself maintains the input/output shape same as the 
original input batch.
   
   Given that for a long time (possibly forever), there will be much less 
selection vector aware kernels than the other, it's probably worth it to use 1 
instead of 2, as the gather/scatter only happen once for the whole expression.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to