alamb commented on issue #16490: URL: https://github.com/apache/datafusion/issues/16490#issuecomment-2993505936
> Making the partitioning strategies more load aware somehow might still help though to try to avoid congestion in partitions that for whatever reason are going a bit slower than the others.

This might be interesting for RoundRobin partitioning, but it is not clear to me how it would work for hash partitioning.

> I will say that I have no understanding yet at this point how the hash based partitioning percolates through the pipeline. Is that something that's essentially a local decision for the repartition operator or does that have consequences further down the line in parent operators as well?

I think it has consequences farther down (for example, the last phase of a two-phase grouping relies on the fact that data is partitioned into non-overlapping partitions for correctness).
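
To illustrate why the final phase of a two-phase grouping depends on non-overlapping partitions, here is a minimal Python sketch. It is not DataFusion code: every function name is hypothetical, and a simple count per key stands in for a real aggregate. Each final aggregator only sees its own partition, so the result is correct only if all partial states for a given key are routed to the same partition.

```python
from collections import Counter

def partial_aggregate(rows):
    """Phase 1: count rows per key within one input partition."""
    return Counter(rows)

def hash_repartition(partials, n):
    """Route every (key, partial_count) pair to partition hash(key) % n."""
    out = [[] for _ in range(n)]
    for partial in partials:
        for key, count in partial.items():
            out[hash(key) % n].append((key, count))
    return out

def round_robin_repartition(partials, n):
    """Route pairs to partitions in turn, ignoring the key (for contrast)."""
    out = [[] for _ in range(n)]
    i = 0
    for partial in partials:
        for key, count in partial.items():
            out[i % n].append((key, count))
            i += 1
    return out

def final_aggregate(partition):
    """Phase 2: merge partial counts, seeing only this one partition."""
    merged = Counter()
    for key, count in partition:
        merged[key] += count
    return list(merged.items())

def run(input_partitions, repartition, n=2):
    """Run both phases and concatenate the output of all final partitions."""
    partials = [partial_aggregate(p) for p in input_partitions]
    output = []
    for partition in repartition(partials, n):
        output.extend(final_aggregate(partition))
    return sorted(output)

inputs = [["a", "b", "a"], ["a", "b"], ["b", "a"]]
hashed = run(inputs, hash_repartition)
round_robin = run(inputs, round_robin_repartition)

print(hashed)       # one row per key
print(round_robin)  # keys can appear more than once, each with a partial count
```

In this sketch the hash-partitioned run produces exactly one row per key, while the round-robin run can emit the same key from several final partitions. That suggests why a load-aware strategy is easier to picture for RoundRobin than for hash partitioning: moving a key to a less busy partition would break the guarantee the final phase relies on.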