jorisvandenbossche commented on code in PR #40971:
URL: https://github.com/apache/arrow/pull/40971#discussion_r1563070055
##########
cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc:
##########
@@ -1039,10 +1039,17 @@ class FilterMetaFunction : public MetaFunction {
if (args[0].kind() == Datum::RECORD_BATCH) {
auto values_batch = args[0].record_batch();
- ARROW_ASSIGN_OR_RAISE(
- std::shared_ptr<RecordBatch> out_batch,
- FilterRecordBatch(*args[0].record_batch(), args[1], options, ctx));
- return Datum(out_batch);
+ if (args[1].kind() == Datum::ARRAY) {
+ ARROW_ASSIGN_OR_RAISE(std::shared_ptr<RecordBatch> out_batch,
+ FilterRecordBatch(*values_batch, args[1], options, ctx));
+ return Datum(out_batch);
+ } else {
Review Comment:
> I'm not sure why we would want filtering a RecordBatch to return a
> Table; the outcome of filtering a RecordBatch will never be chunked, even when
> the filter itself was chunked.

I was only saying that because that is what would happen if we used the same
approach as `FilterTable` does (which I thought you were suggesting, but on
rereading, you only said to check the filter type the way `FilterTable` does).

But it's a good point that we could also just concatenate the filter. In terms
of type stability and predictability, it's probably nicer to have
`RecordBatch.filter()` always return a RecordBatch.
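
To make the "concatenate the filter" idea concrete, here is a minimal standalone sketch. It is a toy model, not the Arrow API: `ChunkedMask`, `ConcatenateMask`, and `FilterColumn` are hypothetical names, and plain `std::vector`s stand in for a chunked boolean filter and a single record batch column. In Arrow itself, the flattening step would presumably go through something like `arrow::Concatenate` on the filter's chunks before calling the existing array-filter path.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-ins (assumptions, not Arrow types): a "chunked" boolean filter is
// a list of chunks, and a "record batch" column is one contiguous vector.
using MaskChunk = std::vector<bool>;
using ChunkedMask = std::vector<MaskChunk>;

// Flatten the chunked mask into one contiguous mask, preserving chunk order.
std::vector<bool> ConcatenateMask(const ChunkedMask& chunks) {
  std::size_t total = 0;
  for (const auto& chunk : chunks) total += chunk.size();

  std::vector<bool> flat;
  flat.reserve(total);
  for (const auto& chunk : chunks) {
    flat.insert(flat.end(), chunk.begin(), chunk.end());
  }
  assert(flat.size() == total);
  return flat;
}

// Filter a contiguous column with a possibly chunked mask. The result is
// always contiguous, mirroring "RecordBatch in, RecordBatch out".
std::vector<int> FilterColumn(const std::vector<int>& values,
                              const ChunkedMask& mask) {
  const std::vector<bool> flat = ConcatenateMask(mask);
  if (flat.size() != values.size()) {
    throw std::invalid_argument("filter length must match column length");
  }

  std::vector<int> out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (flat[i]) out.push_back(values[i]);
  }
  return out;
}
```

The point of the sketch is only the shape of the result: the chunk boundaries of the filter disappear before filtering, so the output type does not depend on how the filter happened to be chunked.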