Andrew Lamb created ARROW-11182:
-----------------------------------
Summary: [Rust] [DataFusion] Improve performance of IN list function
Key: ARROW-11182
URL: https://issues.apache.org/jira/browse/ARROW-11182
Project: Apache Arrow
Issue Type: Improvement
Reporter: Andrew Lamb
The initial implementation of IN and NOT IN followed the "functional first, and
then fast" approach.
There are several potential performance improvements for the IN and NOT IN
implementation in DataFusion, such as optimizing for large lists (using a hash
table rather than repeated comparisons) and short-circuiting results.
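To illustrate the hash table idea, here is a minimal sketch using only the Rust
standard library. It is not the DataFusion implementation: the function names
in_list_linear and in_list_hashed are hypothetical, plain i64 slices stand in
for Arrow arrays, and NULL handling is left out entirely.

```rust
use std::collections::HashSet;

/// Baseline (hypothetical helper): compare every value against every list
/// element. Cost is O(values * list); `any` stops at the first match.
fn in_list_linear(values: &[i64], list: &[i64], negated: bool) -> Vec<bool> {
    values
        .iter()
        .map(|v| list.iter().any(|x| x == v) != negated)
        .collect()
}

/// Hash-based variant (hypothetical helper): build the set once, then do one
/// expected O(1) lookup per value. Cost is O(values + list) on average.
fn in_list_hashed(values: &[i64], list: &[i64], negated: bool) -> Vec<bool> {
    // Assumption: the IN list holds non-null literals of a hashable type.
    let set: HashSet<i64> = list.iter().copied().collect();
    values
        .iter()
        // `!= negated` flips the result for NOT IN.
        .map(|v| set.contains(v) != negated)
        .collect()
}
```

Both functions return the same mask for the same input; passing negated = true
gives the NOT IN result. For short lists the linear scan may well be faster,
since building the set has its own cost, so a real implementation would
probably pick a strategy based on list length.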
There are a bunch of good ideas in the comments on this PR:
https://github.com/apache/arrow/pull/9038/files