[ https://issues.apache.org/jira/browse/ARROW-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney reassigned ARROW-5760:
-----------------------------------

    Assignee: Wes McKinney  (was: Ben Kietzman)

> [C++] Optimize Take and Filter
> ------------------------------
>
>                 Key: ARROW-5760
>                 URL: https://issues.apache.org/jira/browse/ARROW-5760
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Ben Kietzman
>            Assignee: Wes McKinney
>            Priority: Major
>             Fix For: 1.0.0
>
>
> There is some question of whether these kernels allocate optimally. For
> example, when Filtering or Taking strings it might be more efficient to pass
> over the filter/indices twice: first to determine how much character storage
> will be needed, then again to copy into the allocated memory:
> https://github.com/apache/arrow/pull/4531#discussion_r297160457
>
> Additionally, these kernels could probably make good use of scatter/gather
> SIMD instructions.
>
> Furthermore, Filter's bitmap is currently lazily expanded into the indices of
> elements to be appended to the output array. It would probably be more
> efficient to expand to indices in batches, then gather using an index batch.
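
The two ideas in the issue description (a two-pass Filter for strings, and expanding the filter into index batches before gathering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Arrow's implementation: `StringColumn`, `FilterTwoPass`, and `GatherBatched` are hypothetical names, the filter is a plain `std::vector<bool>` rather than an Arrow bitmap, and nulls are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: value i occupies
// data[offsets[i], offsets[i + 1]). Not an Arrow type.
struct StringColumn {
  std::vector<int32_t> offsets{0};
  std::string data;

  std::size_t size() const { return offsets.size() - 1; }

  void Append(const std::string& s) {
    data += s;
    offsets.push_back(static_cast<int32_t>(data.size()));
  }
};

// Two-pass filter: pass 1 sizes the output, pass 2 copies into memory that
// was reserved once, so neither buffer has to grow during the copy.
StringColumn FilterTwoPass(const StringColumn& in, const std::vector<bool>& keep) {
  std::size_t n_out = 0;
  std::size_t n_chars = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (keep[i]) {
      ++n_out;
      n_chars += static_cast<std::size_t>(in.offsets[i + 1] - in.offsets[i]);
    }
  }

  StringColumn out;
  out.offsets.reserve(n_out + 1);
  out.data.reserve(n_chars);

  for (std::size_t i = 0; i < in.size(); ++i) {
    if (keep[i]) {
      const auto begin = static_cast<std::size_t>(in.offsets[i]);
      const auto len = static_cast<std::size_t>(in.offsets[i + 1] - in.offsets[i]);
      out.data.append(in.data, begin, len);
      out.offsets.push_back(static_cast<int32_t>(out.data.size()));
    }
  }
  return out;
}

// Batched expansion: turn the filter into a small batch of indices, then
// gather the whole batch at once instead of appending element by element.
std::vector<int32_t> GatherBatched(const std::vector<int32_t>& values,
                                   const std::vector<bool>& keep,
                                   std::size_t batch_size = 1024) {
  std::vector<int32_t> out;
  std::vector<std::size_t> batch;
  batch.reserve(batch_size);

  for (std::size_t i = 0; i < values.size(); ++i) {
    if (keep[i]) batch.push_back(i);
    const bool last = (i + 1 == values.size());
    if (batch.size() == batch_size || (last && !batch.empty())) {
      // Gather step: a tight loop over indices, the part that could map
      // onto gather instructions for fixed-width types.
      for (std::size_t idx : batch) out.push_back(values[idx]);
      batch.clear();
    }
  }
  return out;
}
```

Whether the extra pass over the filter pays off depends on how costly buffer reallocation is relative to re-reading the filter, which is the open question the issue raises.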