[ 
https://issues.apache.org/jira/browse/ARROW-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771388#comment-16771388
 ] 

Nicolas Trinquier commented on ARROW-4605:
------------------------------------------

[~andygrove] did you mean to move the functions as they are? They both do some 
kind of filtering on the data, so what do you think of re-implementing them as 
one more generic function? Something along these lines:

{code:java}
// Keeps the elements of `a` whose index satisfies `predicate`.
fn filter(a: &Array, predicate: impl Fn(usize) -> bool) -> Result<ArrayRef> {
...

  for i in 0..a.len() {
    if predicate(i) {
      builder.append_value(a.value(i))?;
    }
  }
...
}
{code}
 

Predicates would look like this:
{code:java}
let limit_predicate = |index| index < limit_value;
// assuming filter_bools is a BooleanArray
let filter_predicate = |index| filter_bools.value(index);
{code}
I do not know (a) whether this pattern is idiomatic Rust, or (b) whether the 
abstraction is worth it: in the case of limit we would still allocate a buffer 
for the full size and iterate through all the elements, whereas a dedicated 
implementation could save space and return early.
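
To make the trade-off concrete, here is a minimal sketch on plain slices 
rather than Arrow arrays. The names `filter_by_index` and `limit` are 
hypothetical and only illustrate the idea; they are not existing Arrow or 
DataFusion functions.

```rust
/// Generic index-based filter over a slice (standing in for an Arrow array).
/// It reserves room for the full input and visits every element.
fn filter_by_index<T: Clone>(values: &[T], predicate: impl Fn(usize) -> bool) -> Vec<T> {
    let mut out = Vec::with_capacity(values.len());
    for (i, v) in values.iter().enumerate() {
        if predicate(i) {
            out.push(v.clone());
        }
    }
    out
}

/// Dedicated limit: copies only the first `n` elements (or fewer if the
/// input is shorter) and never looks at the rest.
fn limit<T: Clone>(values: &[T], n: usize) -> Vec<T> {
    let end = n.min(values.len());
    values[..end].to_vec()
}
```

Expressing limit as `filter_by_index(&data, |i| i < n)` gives the same result 
as `limit(&data, n)`, but only the dedicated version avoids the full-size 
allocation and the full scan.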

 

> [Rust] Move filter and limit code from DataFusion into compute module
> ---------------------------------------------------------------------
>
>                 Key: ARROW-4605
>                 URL: https://issues.apache.org/jira/browse/ARROW-4605
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust
>    Affects Versions: 0.12.0
>            Reporter: Andy Grove
>            Priority: Major
>             Fix For: 0.13.0
>
>
> FilterRelation and the new LimitRelation (in ARROW-4464) contain code for 
> filtering and limiting arrays that could now be pushed down into the compute 
> module.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
