[ 
https://issues.apache.org/jira/browse/ARROW-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522228#comment-17522228
 ] 

Antoine Pitrou commented on ARROW-16177:
----------------------------------------

ThreadIndexer could trivially use a {{std::map}} instead of a 
{{std::unordered_map}}. Then, successful lookups do not need to lock the mutex, 
since {{std::map}} doesn't invalidate existing references on insertion. There's 
still the {{std::map}} lookup cost, but that should be reasonably cheap given 
the map is bound to be small and the key type is trivial.

It may also help to add ThreadIndexer-based microbenchmarks (for both 1 and N 
threads perhaps).
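
As a rough starting point, such a microbenchmark could look like the sketch below. {{MapThreadIndexer}} and {{NanosPerLookup}} are hypothetical names made up for illustration, not Arrow's actual code, and the stand-in takes the mutex on every call so that it measures the fully locked baseline. A real benchmark would more likely use the project's existing benchmark harness rather than hand-rolled timing.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for ThreadIndexer (an assumption, not Arrow's code):
// maps each calling thread's id to a small dense index.
class MapThreadIndexer {
 public:
  std::size_t operator()() {
    const std::thread::id id = std::this_thread::get_id();
    // Every call takes the lock, so this is the fully locked baseline.
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = indices_.find(id);
    if (it == indices_.end()) {
      // First call from this thread: assign the next free index.
      it = indices_.emplace(id, indices_.size()).first;
    }
    return it->second;
  }

 private:
  std::mutex mutex_;
  std::map<std::thread::id, std::size_t> indices_;
};

// Runs `num_threads` threads that each call the indexer `calls_per_thread`
// times and returns the average wall-clock nanoseconds per call.
// Thread startup and join time are included, so use a large call count.
double NanosPerLookup(std::size_t num_threads, std::size_t calls_per_thread) {
  MapThreadIndexer indexer;
  std::vector<std::size_t> sinks(num_threads, 0);
  std::vector<std::thread> threads;
  threads.reserve(num_threads);
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t t = 0; t < num_threads; ++t) {
    threads.emplace_back([&indexer, &sinks, t, calls_per_thread] {
      std::size_t sum = 0;
      for (std::size_t i = 0; i < calls_per_thread; ++i) sum += indexer();
      sinks[t] = sum;  // store the result so the loop has an observable effect
    });
  }
  for (auto& th : threads) th.join();
  const auto elapsed = std::chrono::steady_clock::now() - start;
  const double ns = std::chrono::duration<double, std::nano>(elapsed).count();
  return ns / static_cast<double>(num_threads * calls_per_thread);
}
```

Comparing {{NanosPerLookup(1, n)}} against {{NanosPerLookup(N, n)}} for the same {{n}} would give a first idea of how much the lock costs under contention.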

> [C++] Replace ThreadIndexer with a more performant thread-local implementation
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-16177
>                 URL: https://issues.apache.org/jira/browse/ARROW-16177
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> Many of the ExecNode operations use thread local state (mostly temporary 
> buffers) to avoid per-batch allocation.  Currently we are using ThreadIndexer, 
> but getting the thread local state requires locking a mutex and a map lookup.  
> For some of the critical sections we are developing in hash-join this has 
> become a bottleneck.  Ideally we can replace this with something that relies 
> on thread local state instead of a map shared across all threads. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
