jon-chuang commented on issue #1221: URL: https://github.com/apache/arrow-datafusion/issues/1221#issuecomment-962762902
Here is the proposed design:

- When an executor initially connects to the scheduler, it also tells the scheduler how many task slots it has. The amount of memory per task, as per #587, could also be negotiated here.
- As long as the executor is alive, the scheduler tries to send tasks to it, prioritizing executors with the most slots available.

Just wondering if the cardinality estimates/execution cost model could be used for more intelligent scheduling.

Also wondering whether each task runs single-threaded or can exploit more cores on the system, and if so, whether tasks share a common threadpool or the system's cores are sharded so that each of the executor's n slots gets 1/n of the cores.
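The slot-based selection above could be sketched roughly as follows. This is a minimal illustration of the idea, not Ballista's actual scheduler code; the struct and function names (`ExecutorState`, `pick_executor`) are hypothetical:

```rust
// Hypothetical sketch of slot-based executor selection: each executor
// reports its total task slots at registration, and the scheduler
// assigns new tasks to the executor with the most free slots.

#[derive(Debug)]
struct ExecutorState {
    id: String,
    total_slots: usize, // reported when the executor first connects
    used_slots: usize,  // tasks currently running on this executor
}

impl ExecutorState {
    fn available(&self) -> usize {
        self.total_slots.saturating_sub(self.used_slots)
    }
}

/// Pick the executor with the most free task slots, if any has capacity.
fn pick_executor(executors: &mut [ExecutorState]) -> Option<&mut ExecutorState> {
    executors
        .iter_mut()
        .filter(|e| e.available() > 0)
        .max_by_key(|e| e.available())
}

fn main() {
    let mut executors = vec![
        ExecutorState { id: "exec-1".into(), total_slots: 4, used_slots: 3 },
        ExecutorState { id: "exec-2".into(), total_slots: 8, used_slots: 2 },
    ];
    // exec-2 has 6 free slots vs exec-1's 1, so it receives the task.
    if let Some(e) = pick_executor(&mut executors) {
        e.used_slots += 1; // assign one task
        println!("assigned task to {} ({} slots left)", e.id, e.available());
    }
}
```

A real scheduler would also need to handle executor heartbeats and slot release on task completion, which this sketch omits.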
