amaliujia commented on code in PR #35975:
URL: https://github.com/apache/spark/pull/35975#discussion_r852413603
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala:
##########
@@ -37,15 +37,27 @@ trait LimitExec extends UnaryExecNode {
}
/**
- * Take the first `limit` elements and collect them to a single partition.
+ * Take the first `limit` + `offset` elements and collect them to a single partition and then to
+ * drop the first `offset` elements.
*
* This operator will be used when a logical `Limit` operation is the final operator in an
* logical plan, which happens when the user is collecting results back to the driver.
*/
-case class CollectLimitExec(limit: Int, child: SparkPlan) extends LimitExec {
+case class CollectLimitExec(limit: Int, offset: Int, child: SparkPlan) extends LimitExec {
override def output: Seq[Attribute] = child.output
override def outputPartitioning: Partitioning = SinglePartition
- override def executeCollect(): Array[InternalRow] = child.executeTake(limit)
+ override def executeCollect(): Array[InternalRow] = {
+ // Because CollectLimitExec collect all the output of child to a single partition, so we need
+ // collect the first `limit` + `offset` elements and then to drop the first `offset` elements.
+ // For example: limit is 1 and offset is 2 and the child output two partition.
+ // The first partition output [1, 2] and the Second partition output [3, 4, 5].
+ // Then [1, 2, 3] will be taken and output [3].
+ if (offset > 0) {
+ child.executeTake(limit + offset).drop(offset)
+ } else {
Review Comment:
So the assumption here is that when offset is not > 0, offset is not set?
Would using Option (with None) be better, so that the three cases are distinct:
1. offset is set and legal: Some(value).
2. offset is set but not legal: we won't get here, since it should be rejected in
the analyzer.
3. offset is not set (OFFSET is not used with LIMIT): None.
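
A minimal sketch of the Option-based shape, written against plain Scala
collections rather than the real SparkPlan API. `OffsetSketch` and
`takeWithOffset` are hypothetical names used only for illustration, and the
sketch assumes illegal offsets have already been rejected upstream:

```scala
// Hypothetical stand-in for the executeCollect logic; not Spark's API.
// `rows` plays the role of the child's output, already in collection order.
object OffsetSketch {
  def takeWithOffset[A](rows: Seq[A], limit: Int, offset: Option[Int]): Seq[A] =
    offset match {
      // OFFSET was given (assumed already validated, e.g. by the analyzer):
      // take limit + offset rows, then drop the first `offset` of them.
      case Some(n) => rows.take(limit + n).drop(n)
      // No OFFSET clause at all: behave like a plain LIMIT.
      case None => rows.take(limit)
    }
}
```

With the example from the code comment (limit 1, offset 2, child output
[1, 2] and [3, 4, 5]), `OffsetSketch.takeWithOffset(Seq(1, 2, 3, 4, 5), 1, Some(2))`
keeps only the element 3, and passing `None` keeps only the element 1.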
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]