[ https://issues.apache.org/jira/browse/ARROW-9707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201762#comment-17201762 ]
Adam Lippai commented on ARROW-9707:
------------------------------------

Speaking of thread pools, my last suggestion was neither accepted nor declined:
[https://docs.google.com/document/d/1_wc6diy3YrRgEIhVIGzrO5AK8yhwfjWlmKtGnvbsrrY/edit#heading=h.4f9v08zbs2z3]

The use case might be different, but Bevy recently created a Rust task scheduler that uses context-specific thread pools:
[https://bevyengine.org/news/bevy-0-2/]

My understanding is that Bevy and the linked Google Doc (DAG vs. 2-3 staged pipeline vs. generic thread pool) take a similar approach, and I find it the most generic (but acceptable) solution.

> [Rust] [DataFusion] Re-implement threading model
> ------------------------------------------------
>
>                 Key: ARROW-9707
>                 URL: https://issues.apache.org/jira/browse/ARROW-9707
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: image-2020-09-24-22-46-46-959.png
>
>
> The current threading model is very simple and does not scale. We currently
> use 1-2 dedicated threads per partition and they all run simultaneously,
> which is a huge problem if you have more partitions than logical or physical
> cores.
> This task is to re-implement the threading model so that query execution uses
> a fixed (configurable) number of threads. Work will be broken down into
> stages and tasks and each in-process executor (running on a dedicated thread)
> will process its queue of tasks.
> This process will be driven by a scheduler.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
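
For illustration only, the model described in the issue (a fixed, configurable number of executor threads, each draining its own task queue, with a scheduler handing out work) might be sketched roughly as below. This is not DataFusion code. `Task`, `Scheduler`, `submit`, `shutdown`, and `run_partitions` are hypothetical names, round-robin dispatch is an assumption, and only the Rust standard library is used.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical unit of work, e.g. one stage of a query over one partition.
type Task = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical scheduler owning a fixed number of executor threads,
/// each with its own task queue (an mpsc channel).
struct Scheduler {
    queues: Vec<Sender<Task>>,
    workers: Vec<JoinHandle<()>>,
    next: usize,
}

impl Scheduler {
    /// Spawn `num_threads` executors; the thread count no longer
    /// depends on the number of partitions.
    fn new(num_threads: usize) -> Self {
        assert!(num_threads > 0, "need at least one executor thread");
        let mut queues = Vec::with_capacity(num_threads);
        let mut workers = Vec::with_capacity(num_threads);
        for _ in 0..num_threads {
            let (tx, rx) = channel::<Task>();
            queues.push(tx);
            workers.push(thread::spawn(move || {
                // Each executor processes its own queue until the
                // sending side is dropped.
                for task in rx {
                    task();
                }
            }));
        }
        Scheduler { queues, workers, next: 0 }
    }

    /// Assign a task to an executor queue (round-robin is an assumption;
    /// a real scheduler could weigh stages, locality, or queue length).
    fn submit<F>(&mut self, f: F)
    where
        F: FnOnce() + Send + 'static,
    {
        let i = self.next;
        self.next = (self.next + 1) % self.queues.len();
        self.queues[i]
            .send(Box::new(f))
            .expect("executor thread exited early");
    }

    /// Close all queues and wait for the executors to finish.
    fn shutdown(self) {
        let Scheduler { queues, workers, .. } = self;
        drop(queues);
        for worker in workers {
            worker.join().expect("executor thread panicked");
        }
    }
}

/// Run one trivial task per partition on `num_threads` executors and
/// return how many tasks completed.
fn run_partitions(num_threads: usize, num_partitions: usize) -> usize {
    let done = Arc::new(AtomicUsize::new(0));
    let mut scheduler = Scheduler::new(num_threads);
    for _ in 0..num_partitions {
        let done = Arc::clone(&done);
        scheduler.submit(move || {
            done.fetch_add(1, Ordering::SeqCst);
        });
    }
    scheduler.shutdown();
    done.load(Ordering::SeqCst)
}
```

Calling `run_partitions(4, 64)` would process 64 partition tasks on 4 threads rather than spawning 1-2 threads per partition. The context-specific pools mentioned in the comment could be approximated by creating one such scheduler per kind of work, though that split is not shown here.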