amol- commented on code in PR #40971:
URL: https://github.com/apache/arrow/pull/40971#discussion_r1561144362
##########
cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc:
##########
@@ -1039,10 +1039,17 @@ class FilterMetaFunction : public MetaFunction {
if (args[0].kind() == Datum::RECORD_BATCH) {
auto values_batch = args[0].record_batch();
- ARROW_ASSIGN_OR_RAISE(
- std::shared_ptr<RecordBatch> out_batch,
- FilterRecordBatch(*args[0].record_batch(), args[1], options, ctx));
- return Datum(out_batch);
+ if (args[1].kind() == Datum::ARRAY) {
+ ARROW_ASSIGN_OR_RAISE(std::shared_ptr<RecordBatch> out_batch,
+ FilterRecordBatch(*values_batch, args[1], options, ctx));
+ return Datum(out_batch);
+ } else {
Review Comment:
Wouldn't it make more sense to change `FilterRecordBatch` itself to handle
ChunkedArrays?
The problem seems to be that `FilterRecordBatch` assumes the filter is always
an array (
https://github.com/apache/arrow/blob/6e1b62509b5e0ba69cad630519e3757688a1e0c1/cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc#L931
).
It shouldn't be too hard to check the filter's kind and handle it the way
`FilterTable` does:
https://github.com/apache/arrow/blob/6e1b62509b5e0ba69cad630519e3757688a1e0c1/cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc#L962-L971
In the end, you could write the code to always work with multiple chunks and
treat a single array as the case where there is just one chunk.
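
To make the "single array is just the one-chunk case" idea concrete, here is a
rough, library-free sketch. It deliberately does not use Arrow types: `Column`,
`FilterArray`, `FilterChunks`, `ToChunks`, and `FilterColumn` are all
hypothetical stand-ins built on the standard library, so this shows the shape
of the approach only, not what `FilterRecordBatch` would actually look like.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <variant>
#include <vector>

// Toy stand-ins, NOT Arrow types: a "column" is a vector of ints, a filter
// "array" is a vector of bools, and a "chunked" filter is a list of those.
using Column = std::vector<int>;
using FilterArray = std::vector<bool>;
using FilterChunks = std::vector<FilterArray>;
using Filter = std::variant<FilterArray, FilterChunks>;

// Normalize either filter shape to a list of chunks; a single array simply
// becomes the one-chunk case.
FilterChunks ToChunks(const Filter& filter) {
  if (const auto* single = std::get_if<FilterArray>(&filter)) {
    return FilterChunks{*single};
  }
  return std::get<FilterChunks>(filter);
}

// Apply the filter chunk by chunk, tracking the running offset into the
// column so chunk boundaries do not matter to the caller.
Column FilterColumn(const Column& values, const Filter& filter) {
  const FilterChunks chunks = ToChunks(filter);

  std::size_t total = 0;
  for (const auto& chunk : chunks) total += chunk.size();
  if (total != values.size()) {
    throw std::invalid_argument("filter length must match column length");
  }

  Column out;
  std::size_t offset = 0;
  for (const auto& chunk : chunks) {
    for (std::size_t i = 0; i < chunk.size(); ++i) {
      if (chunk[i]) out.push_back(values[offset + i]);
    }
    offset += chunk.size();
  }
  assert(offset == values.size());  // every row was visited exactly once
  return out;
}
```

With this shape, the caller no longer needs a separate branch per filter kind:
both a plain filter and a chunked one go through the same loop, and only
`ToChunks` knows about the difference.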