[ 
https://issues.apache.org/jira/browse/ARROW-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352944#comment-17352944
 ] 

Weston Pace commented on ARROW-10117:
-------------------------------------

One thing worth noting as my work on this progresses is that how tasks are 
scheduled (ARROW-12903) is having a greater and greater impact on 
performance.  As an example, in my latest benchmarks, a tiny reference 
workload runs at ~2 million tasks per second.  With 8 threads and poor 
scheduling we end up with ~620k tasks per second (similar to the baseline 
performance).  With 8 threads and ideal scheduling we end up with 7-8 million 
tasks per second.

It's possible we can find ways to better handle the worst-case scenario, but 
that's not something I'm going to tackle as part of this PR.

> [C++] Implement work-stealing scheduler / multiple queues in ThreadPool
> -----------------------------------------------------------------------
>
>                 Key: ARROW-10117
>                 URL: https://issues.apache.org/jira/browse/ARROW-10117
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Weston Pace
>            Priority: Major
>
> This involves a change from a single task queue shared amongst all threads to 
> a per-thread task queue and the ability for idle threads to take tasks from 
> other threads' queues (work stealing). 
> As part of this, the task submission API would need to evolve in some way 
> so that tasks related to a particular workload end up in the same task 
> queue.
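
For readers unfamiliar with the idea in the description above, here is a 
minimal sketch of per-thread queues with work stealing. It is not Arrow's 
ThreadPool and not the design from the PR: the names WorkStealingPool, 
Submit, queue_hint and RunSkewedWorkload are hypothetical, each queue is a 
plain mutex-guarded deque, and idle workers spin with yield() instead of 
sleeping. The owner takes work from the back of its own queue and a thief 
takes from the front of a victim's queue, which is a common convention and 
an assumption here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is not Arrow's ThreadPool API.
class WorkStealingPool {
 public:
  using Task = std::function<void()>;

  explicit WorkStealingPool(std::size_t num_threads)
      : queues_(num_threads), pending_(0) {
    assert(num_threads > 0);
  }

  // Tasks submitted with the same queue_hint land in the same queue, so
  // related work stays together unless another thread steals it.
  void Submit(std::size_t queue_hint, Task task) {
    pending_.fetch_add(1);  // count first so workers never see a stale zero
    Queue& q = queues_[queue_hint % queues_.size()];
    std::lock_guard<std::mutex> lock(q.mutex);
    q.tasks.push_back(std::move(task));
  }

  // Starts one worker per queue and returns once every task has finished.
  void Run() {
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < queues_.size(); ++i) {
      workers.emplace_back([this, i] { WorkerLoop(i); });
    }
    for (auto& w : workers) w.join();
  }

 private:
  struct Queue {
    std::mutex mutex;
    std::deque<Task> tasks;
  };

  // Removes one task from queue `index`; returns false if it was empty.
  bool TryTake(std::size_t index, bool from_back, Task* out) {
    Queue& q = queues_[index];
    std::lock_guard<std::mutex> lock(q.mutex);
    if (q.tasks.empty()) return false;
    if (from_back) {
      *out = std::move(q.tasks.back());
      q.tasks.pop_back();
    } else {
      *out = std::move(q.tasks.front());
      q.tasks.pop_front();
    }
    return true;
  }

  void WorkerLoop(std::size_t self) {
    while (pending_.load() > 0) {
      Task task;
      // Prefer the worker's own queue (newest task first).
      bool found = TryTake(self, /*from_back=*/true, &task);
      // Otherwise try to steal the oldest task from each other queue.
      for (std::size_t k = 1; !found && k < queues_.size(); ++k) {
        found = TryTake((self + k) % queues_.size(), /*from_back=*/false, &task);
      }
      if (!found) {
        std::this_thread::yield();  // nothing to do right now
        continue;
      }
      task();
      pending_.fetch_sub(1);
    }
  }

  std::vector<Queue> queues_;
  std::atomic<std::size_t> pending_;
};

// Puts every task in queue 0 so the other workers can only make progress by
// stealing; returns the number of tasks that actually ran.
std::size_t RunSkewedWorkload(std::size_t num_threads, std::size_t num_tasks) {
  WorkStealingPool pool(num_threads);
  std::atomic<std::size_t> executed{0};
  for (std::size_t i = 0; i < num_tasks; ++i) {
    pool.Submit(/*queue_hint=*/0, [&executed] { executed.fetch_add(1); });
  }
  pool.Run();
  return executed.load();
}
```

Calling RunSkewedWorkload(4, 1000) runs all 1000 tasks even though they were 
all submitted to a single queue. How many of them are stolen varies from run 
to run, so the sketch does not report it.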



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
