Andrew Lamb created ARROW-11182:
-----------------------------------

             Summary: [Rust] [DataFusion] Improve performance of IN list 
function
                 Key: ARROW-11182
                 URL: https://issues.apache.org/jira/browse/ARROW-11182
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Andrew Lamb


The initial implementation of IN and NOT IN followed the "functional first, and 
then fast" approach.

There are several potential performance improvements for the IN and NOT IN 
implementation in DataFusion, such as optimizing for large lists (using a hash 
table rather than repeated comparisons) and short-circuiting results.
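
As a rough illustration of the hash table idea, here is a minimal sketch in 
plain Rust. It is not the DataFusion implementation: the function names 
in_list_scan and in_list_hashed are hypothetical, the inputs are plain i64 
slices rather than Arrow arrays, and NULL handling is ignored entirely. The 
first function does repeated comparisons (stopping at the first match), the 
second builds a HashSet once and probes it per value.

```rust
use std::collections::HashSet;

/// Hypothetical baseline: for each value, compare against every list item.
/// `any` stops at the first match, so a hit short-circuits the inner loop.
/// Cost is O(values * list) comparisons in the worst case.
fn in_list_scan(values: &[i64], list: &[i64]) -> Vec<bool> {
    values
        .iter()
        .map(|v| list.iter().any(|item| item == v))
        .collect()
}

/// Hypothetical hash-based variant: build the set once, then do one
/// expected O(1) lookup per value. Cost is O(values + list) on average.
fn in_list_hashed(values: &[i64], list: &[i64]) -> Vec<bool> {
    let set: HashSet<i64> = list.iter().copied().collect();
    values.iter().map(|v| set.contains(v)).collect()
}
```

Both functions should return the same result for the same inputs. For very 
short lists the linear scan may still be faster, since it avoids hashing and 
building the set, so a real implementation would likely pick a strategy based 
on list length. With NULLs ignored, NOT IN is the element-wise negation of the 
result.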

There are a bunch of good ideas in the comments on this PR: 
https://github.com/apache/arrow/pull/9038/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
