[ https://issues.apache.org/jira/browse/ARROW-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522228#comment-17522228 ]
Antoine Pitrou commented on ARROW-16177:
----------------------------------------
ThreadIndexer could trivially use a {{std::map}} instead of a
{{std::unordered_map}}. Then, successful lookups do not need to lock the mutex,
since {{std::map}} doesn't invalidate existing references on insertion. There's
still the {{std::map}} lookup cost, but that should be reasonably cheap given
the map is bound to be small and the key type is trivial.
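
The reference-stability property mentioned above can be checked with a small single-threaded sketch. The helper name {{ReferenceSurvivesInsertions}} is hypothetical and not part of Arrow. The sketch only illustrates that a reference into a {{std::map}} stays valid across later insertions; it does not by itself establish that an unlocked lookup running concurrently with an insertion is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper (not an Arrow API): takes a reference to one element,
// performs `extra_inserts` further insertions, and reports whether the
// reference still points at the same, unchanged element.
bool ReferenceSurvivesInsertions(std::size_t extra_inserts) {
  std::map<int, std::size_t> index_by_key;
  const bool inserted = index_by_key.emplace(0, 0).second;
  assert(inserted);
  (void)inserted;  // avoid an unused-variable warning when NDEBUG is set

  std::size_t& first = index_by_key.at(0);
  const std::size_t* address_before = &first;

  // std::map is node-based: inserting new keys never moves existing nodes.
  for (std::size_t i = 1; i <= extra_inserts; ++i) {
    index_by_key.emplace(static_cast<int>(i), i);
  }

  return address_before == &index_by_key.at(0) && first == 0;
}
```

Calling {{ReferenceSurvivesInsertions(1000)}} is expected to return true on a conforming standard library.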
It may also help to add ThreadIndexer-based microbenchmarks (for both 1 and N
threads perhaps).
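
As a rough idea of what such a microbenchmark might look like, here is a minimal sketch using only the standard library. {{MutexThreadIndexer}} is a hypothetical stand-in that locks a mutex and does a map lookup on every call, as the issue describes; it is not the actual Arrow class. {{BenchmarkLookups}} and {{LookupBenchResult}} are likewise made-up names. The timing includes thread start-up and join, so a real benchmark would more likely use a proper benchmarking framework.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ThreadIndexer: every call takes the mutex and
// performs a map lookup keyed on the calling thread's id.
class MutexThreadIndexer {
 public:
  std::size_t operator()() {
    const std::thread::id id = std::this_thread::get_id();
    std::lock_guard<std::mutex> lock(mutex_);
    // Inserts only if the id is absent; a new thread gets the next free index.
    return id_to_index_.try_emplace(id, id_to_index_.size()).first->second;
  }

  std::size_t NumThreadsSeen() {
    std::lock_guard<std::mutex> lock(mutex_);
    return id_to_index_.size();
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::thread::id, std::size_t> id_to_index_;
};

struct LookupBenchResult {
  double ns_per_lookup;      // wall-clock time divided by total lookups
  std::size_t threads_seen;  // distinct thread ids registered
  std::size_t checksum;      // sum of returned indices, keeps the loop live
};

// Runs `lookups_per_thread` lookups on each of `num_threads` threads.
LookupBenchResult BenchmarkLookups(int num_threads, int lookups_per_thread) {
  assert(num_threads > 0 && lookups_per_thread > 0);
  MutexThreadIndexer indexer;
  std::vector<std::size_t> sums(static_cast<std::size_t>(num_threads), 0);
  std::vector<std::thread> workers;
  workers.reserve(static_cast<std::size_t>(num_threads));

  const auto start = std::chrono::steady_clock::now();
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&indexer, &sums, t, lookups_per_thread] {
      std::size_t sum = 0;
      for (int i = 0; i < lookups_per_thread; ++i) sum += indexer();
      sums[static_cast<std::size_t>(t)] = sum;  // each thread owns one slot
    });
  }
  for (std::thread& worker : workers) worker.join();
  const std::chrono::duration<double, std::nano> elapsed =
      std::chrono::steady_clock::now() - start;

  std::size_t checksum = 0;
  for (std::size_t s : sums) checksum += s;
  const double total_lookups =
      static_cast<double>(num_threads) * lookups_per_thread;
  return {elapsed.count() / total_lookups, indexer.NumThreadsSeen(), checksum};
}
```

Comparing {{BenchmarkLookups(1, n).ns_per_lookup}} against {{BenchmarkLookups(N, n).ns_per_lookup}} would give a first impression of how much the mutex costs under contention.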
> [C++] Replace ThreadIndexer with a more performant thread-local implementation
> ------------------------------------------------------------------------------
>
> Key: ARROW-16177
> URL: https://issues.apache.org/jira/browse/ARROW-16177
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Weston Pace
> Priority: Major
>
> Many of the ExecNode operations use thread-local state (mostly temporary
> buffers) to avoid per-batch allocation. Currently we use ThreadIndexer,
> but getting the thread-local state requires locking a mutex and performing
> a map lookup. For some of the critical sections we are developing in
> hash-join, this has become a bottleneck. Ideally, we can replace this with
> something that relies on thread-local state instead of a map shared across
> all threads.
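
One possible shape for the thread-local approach described in the issue is sketched below. It assumes a single process-wide counter and a {{thread_local}} cache of the claimed index; the names {{ThreadLocalIndex}} and {{IndexSeenByNewThread}} are hypothetical, and this is not the implementation that was adopted in Arrow.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical replacement for a mutex-protected map: each thread claims the
// next index on its first call and caches it in a thread_local variable.
// Later calls on the same thread read the cached value without any locking.
inline std::size_t ThreadLocalIndex() {
  static std::atomic<std::size_t> next_index{0};
  thread_local const std::size_t index =
      next_index.fetch_add(1, std::memory_order_relaxed);
  return index;
}

// Demo helper: returns the index observed by a freshly started thread.
inline std::size_t IndexSeenByNewThread() {
  std::size_t seen = 0;
  std::thread worker([&seen] {
    seen = ThreadLocalIndex();
    // A second call on the same thread returns the cached value.
    assert(seen == ThreadLocalIndex());
  });
  worker.join();  // join() makes the write to `seen` visible here
  return seen;
}
```

In this sketch, indices are global to the process and are never reclaimed when a thread exits, so anything sized by thread index would need a separate bound on how many threads can ever call it.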