[
https://issues.apache.org/jira/browse/ARROW-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine Pitrou resolved ARROW-5760.
-----------------------------------
Resolution: Fixed
Issue resolved by pull request 7382
[https://github.com/apache/arrow/pull/7382]
> [C++] Optimize Take implementation
> ----------------------------------
>
> Key: ARROW-5760
> URL: https://issues.apache.org/jira/browse/ARROW-5760
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Ben Kietzman
> Assignee: Wes McKinney
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> It is unclear whether these kernels allocate optimally. For example, when
> filtering or taking strings, it might be more efficient to pass over the
> filter/indices twice, first to determine how much character storage will be
> needed, then again to copy into the preallocated memory:
> https://github.com/apache/arrow/pull/4531#discussion_r297160457
> Additionally, these kernels could probably make good use of scatter/gather
> SIMD instructions.
> Furthermore, Filter's bitmap is currently lazily expanded into the indices of
> elements to be appended to the output array. It would probably be more
> efficient to expand to indices in batches, then gather using an index batch.
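
The two ideas in the description above (a two-pass Take over strings, and
expanding a filter bitmap into index batches before gathering) can be
illustrated with a minimal standalone sketch. This is not the code from the
pull request and does not use the Arrow API: StringColumn, TakeTwoPass,
NextIndexBatch and FilterInBatches are hypothetical names invented for this
example, nulls are ignored, and indices are assumed to be in bounds.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: `offsets` has size() + 1 entries
// and `data` holds the concatenated characters. Nulls are not modelled.
struct StringColumn {
  std::vector<int32_t> offsets{0};
  std::string data;

  std::size_t size() const { return offsets.size() - 1; }

  void Append(const std::string& s) {
    data += s;
    offsets.push_back(static_cast<int32_t>(data.size()));
  }

  std::string Get(std::size_t i) const {
    return data.substr(offsets[i], offsets[i + 1] - offsets[i]);
  }
};

// Two-pass take: pass 1 sums the lengths of the selected strings, pass 2
// copies them into character storage that was reserved once up front.
StringColumn TakeTwoPass(const StringColumn& in,
                         const std::vector<int32_t>& indices) {
  std::size_t total = 0;
  for (int32_t i : indices) total += in.offsets[i + 1] - in.offsets[i];

  StringColumn out;
  out.offsets.reserve(indices.size() + 1);
  out.data.reserve(total);
  for (int32_t i : indices) {
    out.data.append(in.data, in.offsets[i], in.offsets[i + 1] - in.offsets[i]);
    out.offsets.push_back(static_cast<int32_t>(out.data.size()));
  }
  return out;
}

// Scans an LSB-ordered bitmap from *position and collects up to `batch_size`
// indices of set bits. *position is advanced past every bit that was scanned.
std::vector<int32_t> NextIndexBatch(const std::vector<uint8_t>& bitmap,
                                    std::size_t length, std::size_t* position,
                                    std::size_t batch_size) {
  std::vector<int32_t> batch;
  batch.reserve(batch_size);
  while (*position < length && batch.size() < batch_size) {
    const std::size_t i = (*position)++;
    if ((bitmap[i / 8] >> (i % 8)) & 1) batch.push_back(static_cast<int32_t>(i));
  }
  return batch;
}

// Filter expressed as "expand a batch of indices, then gather with it".
std::vector<int32_t> FilterInBatches(const std::vector<int32_t>& values,
                                     const std::vector<uint8_t>& bitmap,
                                     std::size_t batch_size) {
  if (batch_size == 0) batch_size = 1;  // guard so the scan always advances
  std::vector<int32_t> out;
  std::size_t position = 0;
  while (position < values.size()) {
    const std::vector<int32_t> batch =
        NextIndexBatch(bitmap, values.size(), &position, batch_size);
    for (int32_t i : batch) out.push_back(values[i]);  // gather step
  }
  return out;
}
```

In this sketch the gather loop in FilterInBatches is the part that a SIMD
gather instruction could replace, since each batch is a dense array of
indices. Whether either change pays off in practice depends on the data and
would need benchmarking.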