andylam-db commented on code in PR #47375:
URL: https://github.com/apache/spark/pull/47375#discussion_r1683497951
##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/querytest/GeneratedSubquerySuite.scala:
##########
@@ -126,33 +126,48 @@ class GeneratedSubquerySuite extends
DockerJDBCIntegrationSuite with QueryGenera
case _ => None
}
- // For the OrderBy, consider whether or not the result of the subquery is
required to be sorted.
- // This is to maintain test determinism. This is affected by whether the
subquery has a limit
- // clause.
- val requiresLimitOne = isScalarSubquery && (operatorInSubquery match {
+ // For situations that require exactly one row as output, we force the
+ // subquery to have a limit of 1 and no offset value (in case it would
+ // otherwise output an empty result set).
+ val requiresExactlyOneRowOutput = isScalarSubquery && (operatorInSubquery
match {
case a: Aggregate => a.groupingExpressions.nonEmpty
- case l: Limit => l.limitValue > 1
case _ => true
})
- val orderByClause = if (requiresLimitOne ||
operatorInSubquery.isInstanceOf[Limit]) {
+ val requireNoOffsetInCorrelatedSubquery = correlationConditions.nonEmpty
Review Comment:
Move this variable down to when it's actually needed? Also, could you add a
comment on why an offset cannot exist in a correlated subquery (for now)?
##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/querytest/GeneratedSubquerySuite.scala:
##########
@@ -339,7 +354,11 @@ class GeneratedSubquerySuite extends
DockerJDBCIntegrationSuite with QueryGenera
val aggregates = combinations.map {
case (af, groupBy) => Aggregate(Seq(af), if (groupBy)
Seq(groupByColumn) else Seq())
}
- val subqueryOperators = Seq(Limit(1), Limit(10)) ++ aggregates
+ val limitValues = Seq(0, 1, 10)
+ val offsetValues = Seq(0, 1, 10)
+ val limitAndOffsetOperators = limitValues.flatMap(limit =>
offsetValues.map(offset =>
+ LimitAndOffset(limit, offset))).filter(lo => !(lo.limitValue == 0 &&
lo.offsetValue == 0))
Review Comment:
Indent this better to make it more readable.
##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/querytest/GeneratedSubquerySuite.scala:
##########
@@ -339,7 +354,11 @@ class GeneratedSubquerySuite extends
DockerJDBCIntegrationSuite with QueryGenera
val aggregates = combinations.map {
case (af, groupBy) => Aggregate(Seq(af), if (groupBy)
Seq(groupByColumn) else Seq())
}
- val subqueryOperators = Seq(Limit(1), Limit(10)) ++ aggregates
+ val limitValues = Seq(0, 1, 10)
+ val offsetValues = Seq(0, 1, 10)
Review Comment:
Can we define this as a function, to be consistent with the rest of the file?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]