alamb commented on PR #6154:
URL: https://github.com/apache/arrow-rs/pull/6154#issuecomment-2256821863
My concern with this approach (and with any of the "gc while filtering"
approaches) is that it will work well for some use cases but impose
unavoidable overhead on others.
I think we need an API that allows callers to specify how they want the GC
to be done.
What I have talked about with @XiangpengHao and a few others is making some
sort of stateful filter API. Among other things, such an API would give us a
place to put options for filtering.
So maybe it would look like:
```rust
// create a new filterer
let mut filterer = ArrayFilterer::new()
    // specify that GC should look for empty buffers
    .with_gc_strategy(GcStrategy::FindEmptyBuffers);
// feed in arrays and their filters (BooleanArray)
filterer.push(array1, filter1)?;
filterer.push(array2, filter2)?;
// Produce the final output array
let filtered_array = filterer.build();
```
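To make the idea a bit more concrete, here is a rough, dependency-free sketch of how such a stateful filterer could behave. It is only an illustration under stated assumptions: `ToyViewArray` is a made-up stand-in for a view-style array (plain `Vec`s instead of Arrow buffers), a `&[bool]` slice stands in for `BooleanArray`, and `ArrayFilterer`, `GcStrategy::None`, and the method signatures are hypothetical, not existing arrow-rs APIs.

```rust
/// Hypothetical GC policy for the sketch; not an arrow-rs type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum GcStrategy {
    /// Keep every input buffer, whether or not a view still references it.
    None,
    /// Drop buffers that no surviving view references.
    FindEmptyBuffers,
}

/// Toy stand-in for a view array: each view is (buffer index, offset, length).
#[derive(Debug, Default)]
struct ToyViewArray {
    buffers: Vec<Vec<u8>>,
    views: Vec<(usize, usize, usize)>,
}

impl ToyViewArray {
    /// Bytes of row `i`, resolved through its view.
    fn value(&self, i: usize) -> &[u8] {
        let (b, off, len) = self.views[i];
        &self.buffers[b][off..off + len]
    }
}

/// Hypothetical stateful filterer: options live here, output accumulates here.
#[derive(Debug)]
struct ArrayFilterer {
    strategy: GcStrategy,
    out: ToyViewArray,
}

impl ArrayFilterer {
    fn new() -> Self {
        Self { strategy: GcStrategy::None, out: ToyViewArray::default() }
    }

    fn with_gc_strategy(mut self, strategy: GcStrategy) -> Self {
        self.strategy = strategy;
        self
    }

    /// Append the rows of `array` whose `filter` entry is true.
    fn push(&mut self, array: &ToyViewArray, filter: &[bool]) -> Result<(), String> {
        if filter.len() != array.views.len() {
            return Err(format!(
                "filter has {} entries but array has {} rows",
                filter.len(),
                array.views.len()
            ));
        }
        // The input's buffers are appended after the ones already held,
        // so its buffer indices shift by `base`.
        let base = self.out.buffers.len();
        self.out.buffers.extend(array.buffers.iter().cloned());
        for (&(b, off, len), &keep) in array.views.iter().zip(filter) {
            if keep {
                self.out.views.push((base + b, off, len));
            }
        }
        Ok(())
    }

    /// Produce the output, applying the configured GC strategy once at the end.
    fn build(mut self) -> ToyViewArray {
        if self.strategy == GcStrategy::FindEmptyBuffers {
            // Mark every buffer that a surviving view still points into.
            let mut used = vec![false; self.out.buffers.len()];
            for &(b, _, _) in &self.out.views {
                used[b] = true;
            }
            // Keep only the marked buffers and record old -> new indices.
            let mut remap = vec![usize::MAX; used.len()];
            let mut kept = Vec::new();
            for (old, buf) in self.out.buffers.drain(..).enumerate() {
                if used[old] {
                    remap[old] = kept.len();
                    kept.push(buf);
                }
            }
            self.out.buffers = kept;
            // Point the views at the compacted buffer list.
            for view in &mut self.out.views {
                view.0 = remap[view.0];
            }
        }
        self.out
    }
}
```

In this sketch the caller picks the strategy up front, and callers that do not want any GC pay nothing beyond the filter itself (`GcStrategy::None` skips the scan entirely). Doing the GC once in `build()` rather than on every `push` is just one possible choice for the sketch.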