dtenedor commented on code in PR #35975:
URL: https://github.com/apache/spark/pull/35975#discussion_r852434711


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala:
##########
@@ -85,22 +85,48 @@ abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
       case ReturnAnswer(rootPlan) => rootPlan match {
         case Limit(IntegerLiteral(limit), Sort(order, true, child))
             if limit < conf.topKSortFallbackThreshold =>
-          TakeOrderedAndProjectExec(limit, order, child.output, 
planLater(child)) :: Nil
+          TakeOrderedAndProjectExec(limit, 0, order, child.output, 
planLater(child)) :: Nil
         case Limit(IntegerLiteral(limit), Project(projectList, Sort(order, 
true, child)))
             if limit < conf.topKSortFallbackThreshold =>
-          TakeOrderedAndProjectExec(limit, order, projectList, 
planLater(child)) :: Nil
+          TakeOrderedAndProjectExec(limit, 0, order, projectList, 
planLater(child)) :: Nil
         case Limit(IntegerLiteral(limit), child) =>
-          CollectLimitExec(limit, planLater(child)) :: Nil
+          CollectLimitExec(limit, 0, planLater(child)) :: Nil
+        case GlobalLimitAndOffset(IntegerLiteral(limit), 
IntegerLiteral(offset),
+          Sort(order, true, child)) if limit + offset < 
conf.topKSortFallbackThreshold =>
+          TakeOrderedAndProjectExec(limit, offset, order, child.output, 
planLater(child)) :: Nil
+        case GlobalLimitAndOffset(IntegerLiteral(limit), 
IntegerLiteral(offset),
+          Project(projectList, Sort(order, true, child)))
+            if limit + offset < conf.topKSortFallbackThreshold =>
+          TakeOrderedAndProjectExec(limit, offset, order, projectList, 
planLater(child)) :: Nil
+        case GlobalLimitAndOffset(IntegerLiteral(limit), 
IntegerLiteral(offset), child) =>
+          CollectLimitExec(limit, offset, planLater(child)) :: Nil
         case Tail(IntegerLiteral(limit), child) =>
           CollectTailExec(limit, planLater(child)) :: Nil
         case other => planLater(other) :: Nil
       }
       case Limit(IntegerLiteral(limit), Sort(order, true, child))
           if limit < conf.topKSortFallbackThreshold =>
-        TakeOrderedAndProjectExec(limit, order, child.output, 
planLater(child)) :: Nil
+        TakeOrderedAndProjectExec(limit, 0, order, child.output, 
planLater(child)) :: Nil

Review Comment:
   It seems like there is some duplication between this and the cases under the 
`ReturnAnswer` block in L86-L105. Could we deduplicate them int once place? It 
could help us make sure the numbers are correct when adding the limit and 
offset together and comparing against the `conf.topKSortFallbackThreshold`. 



##########
sql/core/src/test/resources/sql-tests/inputs/postgreSQL/limit.sql:
##########
@@ -12,25 +12,24 @@ SELECT '' AS five, unique1, unique2, stringu1
 SELECT '' AS two, unique1, unique2, stringu1
                FROM onek WHERE unique1 > 60 AND unique1 < 63
                ORDER BY unique1 LIMIT 5;
--- [SPARK-28330] ANSI SQL: Top-level <result offset clause> in <query 
expression>
--- SELECT '' AS three, unique1, unique2, stringu1
---             FROM onek WHERE unique1 > 100
---             ORDER BY unique1 LIMIT 3 OFFSET 20;
--- SELECT '' AS zero, unique1, unique2, stringu1
---             FROM onek WHERE unique1 < 50
---             ORDER BY unique1 DESC LIMIT 8 OFFSET 99;
--- SELECT '' AS eleven, unique1, unique2, stringu1
---             FROM onek WHERE unique1 < 50
---             ORDER BY unique1 DESC LIMIT 20 OFFSET 39;
+SELECT '' AS three, unique1, unique2, stringu1
+               FROM onek WHERE unique1 > 100
+               ORDER BY unique1 LIMIT 3 OFFSET 20;

Review Comment:
   Could we also have a few test cases with ORDER BY and LIMIT + OFFSET, where 
the sum of the limit and offset are just below and just above the 
`conf.topKSortFallbackThreshold`? If we can include EXPLAIN output in the 
tests, we can show that the top-K sort gets planned as intended.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to