Github user scwf commented on the issue:

https://github.com/apache/spark/pull/16633

To be clear, we currently have these issues:

1. Local limit computes all partitions: it launches many tasks, even though a small number of tasks may actually be enough.
2. Global limit has a single-partition issue: it shuffles all the data into one partition, so if the limit number is very large, this becomes a performance bottleneck.

It would be ideal to combine the global limit and local limit into one stage and avoid the shuffle, but for now I cannot find a good solution (one with no performance regression) that does this without changing Spark core/scheduler. Your solution tries to do that, but as I suggested, there are cases where performance may be worse. @wzhfy's idea only resolves the single-partition issue: it still shuffles, and it still runs the local limit on all partitions, but it does not regress performance in those cases compared with the current code path.

> Another issue is, how do you make sure you create a uniform distribution of the result of local limit. Each local limit can produce different number of rows.

It uses a special partitioner to do this. Like `row_number` in SQL, the partitioner gives each row a uniformly distributed partition id, so each reduce task handles a very similar number of rows.
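To illustrate the idea (this is only a minimal sketch in plain Python, not Spark's actual partitioner code): if each row gets a running row number and the partition id is that number modulo the partition count, the reduce partitions end up nearly equal in size even when the local limits emit very different row counts. The function name `assign_partitions` and the toy data are hypothetical.

```python
# Sketch (not Spark's implementation): round-robin partition-id assignment,
# like SQL's row_number modulo the number of reduce partitions.
def assign_partitions(local_limit_outputs, num_partitions):
    """local_limit_outputs: list of row lists, one per map-side local limit."""
    partitions = [[] for _ in range(num_partitions)]
    row_number = 0
    for rows in local_limit_outputs:
        for row in rows:
            # Each row's partition id depends only on its global row number,
            # so rows spread evenly regardless of per-task output sizes.
            partitions[row_number % num_partitions].append(row)
            row_number += 1
    return partitions

# Map tasks produced uneven local-limit outputs: 5, 1, and 3 rows.
outputs = [[1, 2, 3, 4, 5], [6], [7, 8, 9]]
parts = assign_partitions(outputs, 3)
sizes = [len(p) for p in parts]  # partition sizes differ by at most one row
```

In real Spark this assignment would have to happen on the map side before the shuffle, but the balancing property is the same: no single reduce task receives all the rows.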