amol- commented on code in PR #40971:
URL: https://github.com/apache/arrow/pull/40971#discussion_r1561756130


##########
cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc:
##########
@@ -1039,10 +1039,17 @@ class FilterMetaFunction : public MetaFunction {
 
     if (args[0].kind() == Datum::RECORD_BATCH) {
       auto values_batch = args[0].record_batch();
-      ARROW_ASSIGN_OR_RAISE(
-          std::shared_ptr<RecordBatch> out_batch,
-          FilterRecordBatch(*args[0].record_batch(), args[1], options, ctx));
-      return Datum(out_batch);
+      if (args[1].kind() == Datum::ARRAY) {
+        ARROW_ASSIGN_OR_RAISE(std::shared_ptr<RecordBatch> out_batch,
+                              FilterRecordBatch(*values_batch, args[1], options, ctx));
+        return Datum(out_batch);
+      } else {

Review Comment:
   I'm not sure why we would want filtering a RecordBatch to return a Table: 
the outcome of filtering a RecordBatch will never be chunked, even when the 
filter itself is chunked.
   
   Also, given that the filter is generally no longer than the filtered array, 
we can probably take for granted that it will always be possible to concatenate 
the filter when it's a chunked array. This can be benchmarked, but if I had to 
guess, I'd say that concatenating the chunks of the filter has a good chance of 
being faster than FilterTable.
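
   To illustrate the idea, here is a minimal sketch of "concatenate the chunked 
filter, then filter the batch once". It is a toy model using only the standard 
library, not Arrow code: `Column`, `Batch`, `FilterChunk`, `ChunkedFilter` and 
all three functions are hypothetical stand-ins made up for this example, and it 
ignores nulls and non-integer columns.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT Arrow types: a "batch" is a set of equal-length
// integer columns and a "chunked filter" is a list of boolean chunks.
using Column = std::vector<std::int64_t>;
using Batch = std::vector<Column>;
using FilterChunk = std::vector<bool>;
using ChunkedFilter = std::vector<FilterChunk>;

// Flatten the chunked filter into one contiguous mask.
FilterChunk ConcatenateFilter(const ChunkedFilter& chunks) {
  std::size_t total = 0;
  for (const auto& chunk : chunks) total += chunk.size();

  FilterChunk mask;
  mask.reserve(total);
  for (const auto& chunk : chunks) {
    mask.insert(mask.end(), chunk.begin(), chunk.end());
  }
  return mask;
}

// Apply a contiguous mask to every column. The result is a single batch with
// one contiguous column per input column, so nothing is chunked.
Batch FilterBatch(const Batch& batch, const FilterChunk& mask) {
  Batch out;
  out.reserve(batch.size());
  for (const auto& column : batch) {
    if (column.size() != mask.size()) {
      throw std::invalid_argument("filter length must match column length");
    }
    Column kept;
    for (std::size_t i = 0; i < column.size(); ++i) {
      if (mask[i]) kept.push_back(column[i]);
    }
    out.push_back(std::move(kept));
  }
  return out;
}

// Chunked-filter case: concatenate first, then reuse the single-mask path.
Batch FilterBatchWithChunkedFilter(const Batch& batch,
                                   const ChunkedFilter& filter) {
  return FilterBatch(batch, ConcatenateFilter(filter));
}
```

   In this model the chunked-filter case costs one extra copy of the mask and 
then goes through the same path as a plain array filter, so the output type 
stays a single batch regardless of how the filter was chunked.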



