[ 
https://issues.apache.org/jira/browse/ARROW-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-8667:
----------------------------------
    Fix Version/s: 8.0.0
                       (was: 7.0.0)

> [C++] Add multi-consumer Scheduler API to sit one layer above ThreadPool
> ------------------------------------------------------------------------
>
>                 Key: ARROW-8667
>                 URL: https://issues.apache.org/jira/browse/ARROW-8667
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Weston Pace
>            Priority: Major
>             Fix For: 8.0.0
>
>
> I believe we should define an abstraction to allow for custom resource 
> allocation strategies (round robin, even time, etc.) to be devised for 
> situations where there are different thread pool consumers that are working 
> independently of each other.
> Consider the classic nested parallelism scenario:
> * Task A in thread 1 may issue N subtasks that run in parallel
> * Task B in thread 2 may issue K subtasks
> With our current ThreadPool abstraction, it is easy to conceive of scenarios 
> where Task A and Task B trample each other.
> One approach to remedy this problem is to have an API like so:
> {code}
> // Inform the scheduler that you want to submit tasks that are "your tasks"
> Consumer* consumer = scheduler->NewConsumer();
> for (...) {
>   Future<T> fut = scheduler->Submit(consumer, DoWork, ...);
> }
> // Wait for all of this consumer's tasks to finish.
> consumer->Finish();
> {code}
> The idea is that the scheduler would maintain separate task queues for each 
> consumer and e.g. track consumer-specific metrics of interest to determine 
> how tasks are allocated.
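
The per-consumer queue idea above can be sketched as follows. This is only a minimal, single-threaded model under stated assumptions, not Arrow's API: `ToyScheduler`, `ConsumerId`, `RunOne`, `Drain`, and `RoundRobinDemo` are hypothetical names, tasks run inline instead of on worker threads, and round-robin is used as the simplest possible dispatch policy. It omits `Future<T>` and `Finish()` entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, single-threaded model of a scheduler with one queue per consumer.
class ToyScheduler {
 public:
  using Task = std::function<void()>;
  using ConsumerId = std::size_t;

  // Register a new consumer and return its handle.
  ConsumerId NewConsumer() {
    queues_.emplace_back();
    return queues_.size() - 1;
  }

  // Enqueue a task on the queue owned by `consumer`.
  void Submit(ConsumerId consumer, Task task) {
    assert(consumer < queues_.size());
    queues_[consumer].push_back(std::move(task));
  }

  // Run one task, visiting consumers in round-robin order and skipping
  // consumers whose queues are empty. Returns false if nothing is pending.
  bool RunOne() {
    for (std::size_t i = 0; i < queues_.size(); ++i) {
      ConsumerId c = (next_ + i) % queues_.size();
      if (queues_[c].empty()) continue;
      Task task = std::move(queues_[c].front());
      queues_[c].pop_front();
      next_ = (c + 1) % queues_.size();  // resume after this consumer next time
      task();
      return true;
    }
    return false;
  }

  // Run until every queue is empty.
  void Drain() {
    while (RunOne()) {
    }
  }

 private:
  std::vector<std::deque<Task>> queues_;
  std::size_t next_ = 0;
};

// Usage sketch: consumer A submits three tasks, consumer B submits two.
// Returns the order in which the tasks ran.
std::vector<char> RoundRobinDemo() {
  ToyScheduler scheduler;
  std::vector<char> order;
  auto a = scheduler.NewConsumer();
  auto b = scheduler.NewConsumer();
  for (int i = 0; i < 3; ++i) scheduler.Submit(a, [&order] { order.push_back('A'); });
  for (int i = 0; i < 2; ++i) scheduler.Submit(b, [&order] { order.push_back('B'); });
  scheduler.Drain();
  return order;
}
```

In this model the tasks interleave as A, B, A, B, A even though all of A's tasks were submitted first, which is the property a single shared FIFO queue would not give.
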
> The scheduler could have different logic to control tasks being assigned to 
> worker threads:
> * Round-robin
> * Even-time allocation (run fewer tasks from consumers with "slow" tasks and 
> more tasks from consumers with "fast" tasks, though there are some nuances 
> here, such as not starving a consumer that has been running many "slow" 
> tasks when a "fast" consumer shows up)
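
One possible reading of the even-time policy, including the starvation nuance, is sketched below. Everything here is an assumption made for illustration: `EvenTimeScheduler` and `EvenTimeDemo` are hypothetical names, tasks are represented only by a caller-supplied cost rather than measured run time, and the rule of starting a new consumer at the smallest usage among busy consumers is just one way to keep a late "fast" arrival from monopolizing the workers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical even-time policy: always run the pending consumer that has
// used the least time so far. Costs are supplied by the caller instead of
// being measured, to keep the sketch deterministic.
class EvenTimeScheduler {
 public:
  using ConsumerId = std::size_t;

  // New consumers start at the smallest usage among consumers with pending
  // work, so a late arrival does not get exclusive access while it catches up.
  ConsumerId NewConsumer() {
    std::uint64_t start = 0;
    bool found = false;
    for (const auto& c : consumers_) {
      if (c.pending.empty()) continue;
      start = found ? std::min(start, c.used) : c.used;
      found = true;
    }
    consumers_.push_back(Consumer{start, {}});
    return consumers_.size() - 1;
  }

  // Enqueue a task, represented only by its cost in arbitrary time units.
  void Submit(ConsumerId id, std::uint64_t cost) {
    assert(id < consumers_.size());
    consumers_[id].pending.push_back(cost);
  }

  // Run one task from the least-used pending consumer (ties go to the lowest
  // id). Returns false if nothing is pending; otherwise stores the chosen id.
  bool RunOne(ConsumerId* ran) {
    Consumer* best = nullptr;
    for (auto& c : consumers_) {
      if (c.pending.empty()) continue;
      if (best == nullptr || c.used < best->used) best = &c;
    }
    if (best == nullptr) return false;
    best->used += best->pending.front();
    best->pending.pop_front();
    *ran = static_cast<ConsumerId>(best - consumers_.data());
    return true;
  }

 private:
  struct Consumer {
    std::uint64_t used;                 // total cost charged so far
    std::deque<std::uint64_t> pending;  // costs of queued tasks
  };
  std::vector<Consumer> consumers_;
};

// Usage sketch: a "slow" consumer (cost 10 per task) runs alone first, then a
// "fast" consumer (cost 1 per task) arrives. Returns the ids in run order.
std::vector<std::size_t> EvenTimeDemo() {
  EvenTimeScheduler scheduler;
  std::vector<std::size_t> order;
  std::size_t ran = 0;
  auto slow = scheduler.NewConsumer();
  for (int i = 0; i < 3; ++i) scheduler.Submit(slow, 10);
  scheduler.RunOne(&ran);  // slow runs alone; its usage becomes 10
  order.push_back(ran);
  auto fast = scheduler.NewConsumer();  // starts at usage 10, not 0
  for (int i = 0; i < 12; ++i) scheduler.Submit(fast, 1);
  while (scheduler.RunOne(&ran)) order.push_back(ran);
  return order;
}
```

With these made-up costs, the slow consumer still gets its second task right after the fast consumer arrives, and from then on roughly ten fast tasks run for each slow one. A real implementation would have to measure task durations and handle concurrency, which this sketch does not attempt.
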



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
