tustvold commented on issue #4295: URL: https://github.com/apache/arrow-datafusion/issues/4295#issuecomment-1327202615
> Anyway the concept of partition seems to sit pretty deep in codebase, I saw that it is passed through hierarchy of ExecutionPlan's execute(...).

The scheduler I started work on preserved the concept of partitions, but did not rely on them for work distribution, or at least wouldn't have if I had actually finished it :sweat_smile:

> Any changes in regards to existing pull model

Yes, the hope was to gradually change to a push model for operators where it is possible.

> Will scheduler contain a DAG that would replace hierarchy based on children() from ExecutionPlan

See https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/src/scheduler/pipeline/mod.rs#L27

> I wonder how fairness of sharing resources would be approached, because from what I have heard HyperDB processes single query at the time, that achieves ideal fairness with morsels

IMO fairness is better handled at a higher level, e.g. with separate query pools or even separate query processes. The scheduler should focus on throughput at the expense of fairness; if nothing else, fairly multiplexing queries is a recipe to blow your memory budget.
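
To make the pull versus push distinction above concrete, here is a minimal sketch of the same filter written both ways. Every name in it (`PullOperator`, `PushOperator`, `VecSource`, `PullFilter`, `PushFilter`, `Collect`, `run_pull`, `run_push`) is hypothetical and made up for illustration; none of it is DataFusion's API or the pipeline trait linked above, and `Batch` is only a stand-in for a real record batch.

```rust
// Hypothetical illustration only: these traits do not exist in DataFusion.

/// Stand-in for a record batch: a plain vector of values.
type Batch = Vec<i64>;

/// Pull model: the consumer asks its input for the next batch.
trait PullOperator {
    fn next_batch(&mut self) -> Option<Batch>;
}

/// Push model: something external hands a batch to the operator,
/// which forwards its output to whatever sits downstream.
trait PushOperator {
    fn push(&mut self, batch: Batch);
    /// Signals that no more input will arrive; no-op by default.
    fn finish(&mut self) {}
}

/// Pull-side source that yields pre-built batches.
struct VecSource {
    batches: std::vec::IntoIter<Batch>,
}

impl PullOperator for VecSource {
    fn next_batch(&mut self) -> Option<Batch> {
        self.batches.next()
    }
}

/// Pull-based filter: it drives its own input from inside `next_batch`.
struct PullFilter<I: PullOperator> {
    input: I,
    min: i64,
}

impl<I: PullOperator> PullOperator for PullFilter<I> {
    fn next_batch(&mut self) -> Option<Batch> {
        let min = self.min;
        self.input
            .next_batch()
            .map(|b| b.into_iter().filter(|v| *v >= min).collect())
    }
}

/// Push-based filter: it has no reference to where batches come from.
struct PushFilter<S: PushOperator> {
    sink: S,
    min: i64,
}

impl<S: PushOperator> PushOperator for PushFilter<S> {
    fn push(&mut self, batch: Batch) {
        let min = self.min;
        self.sink
            .push(batch.into_iter().filter(|v| *v >= min).collect());
    }
    fn finish(&mut self) {
        self.sink.finish();
    }
}

/// Push-side sink that accumulates every row it receives.
#[derive(Default)]
struct Collect {
    rows: Vec<i64>,
}

impl PushOperator for Collect {
    fn push(&mut self, batch: Batch) {
        self.rows.extend(batch);
    }
}

/// Pull: the loop at the root of the plan drags data up through the operators.
fn run_pull(batches: Vec<Batch>, min: i64) -> Vec<i64> {
    let mut op = PullFilter { input: VecSource { batches: batches.into_iter() }, min };
    let mut out = Vec::new();
    while let Some(b) = op.next_batch() {
        out.extend(b);
    }
    out
}

/// Push: an outer loop (the role a scheduler would play) feeds the operator.
fn run_push(batches: Vec<Batch>, min: i64) -> Vec<i64> {
    let mut op = PushFilter { sink: Collect::default(), min };
    for b in batches {
        op.push(b);
    }
    op.finish();
    op.sink.rows
}
```

Both `run_pull` and `run_push` produce the same rows for the same input. What differs is who owns the loop: in the push variant the caller decides when each batch is handed to the operator, which is the kind of control a scheduler needs if it is to distribute work without relying on partitions.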
