GitHub user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/7334#discussion_r34313535
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -108,48 +105,24 @@ case class Union(children: Seq[SparkPlan]) extends SparkPlan {
 /**
  * :: DeveloperApi ::
- * Take the first limit elements. Note that the implementation is different depending on whether
- * this is a terminal operator or not. If it is terminal and is invoked using executeCollect,
- * this operator uses something similar to Spark's take method on the Spark driver. If it is not
- * terminal or is invoked using execute, we first take the limit on each partition, and then
- * repartition all the data to a single partition to compute the global limit.
+ * Take the first `limit` elements from each partition.
  */
 @DeveloperApi
-case class Limit(limit: Int, child: SparkPlan)
+case class PartitionLocalLimit(limit: Int, child: SparkPlan)
   extends UnaryNode {
-  // TODO: Implement a partition local limit, and use a strategy to generate the proper limit plan:
-  //       partition local limit -> exchange into one partition -> partition local limit again
-
-  /** We must copy rows when sort based shuffle is on */
-  private def sortBasedShuffleOn = SparkEnv.get.shuffleManager.isInstanceOf[SortShuffleManager]
-
   override def output: Seq[Attribute] = child.output
-  override def outputPartitioning: Partitioning = SinglePartition
   override def executeCollect(): Array[Row] = child.executeTake(limit)
--- End diff --
Whoops, forgot to remove this. This should probably be split into a separate operator in order to handle the cases where Limit is the final operator.
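To make that concrete, here is a minimal sketch of what the terminal operator might look like, using only the API visible in the diff above. `GlobalLimit` is a hypothetical name (not code from this PR), and the actual SparkPlan signatures at this commit may differ:

// Hypothetical sibling of PartitionLocalLimit for the terminal case; this
// sketch assumes it lives in org.apache.spark.sql.execution, where
// SparkPlan and UnaryNode are already in scope.
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.Attribute
import org.apache.spark.sql.catalyst.plans.physical.{Partitioning, SinglePartition}

case class GlobalLimit(limit: Int, child: SparkPlan) extends UnaryNode {

  override def output: Seq[Attribute] = child.output

  // A global limit yields at most `limit` rows, so a single output
  // partition is the natural partitioning to advertise.
  override def outputPartitioning: Partitioning = SinglePartition

  // Terminal case: when the limit is the final operator and is invoked via
  // executeCollect, short-circuit through executeTake, which fetches
  // partitions incrementally on the driver (like RDD.take).
  override def executeCollect(): Array[Row] = child.executeTake(limit)
}

For the non-terminal case, the planner would then compose the plan the way the deleted TODO describes: PartitionLocalLimit(n) -> exchange into one partition -> PartitionLocalLimit(n), computing the global limit without ever shuffling more than n rows per input partition.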