[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r643626088



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -1373,6 +1386,28 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-35585: Support propagate empty relation through project/filter") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+  SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+  val (plan1, adaptivePlan1) = runAdaptiveAndVerifyResult(
+"SELECT key FROM testData WHERE key = 0 ORDER BY key, value")
+  assert(findTopLevelSort(plan1).size == 1)
+  assert(findTopLevelSort(adaptivePlan1).isEmpty)
+
+  val (plan2, adaptivePlan2) = runAdaptiveAndVerifyResult(
+"""
+  |SELECT t1.key, count(*) FROM testData t1

Review comment:
   ok done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r643607037



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -1373,6 +1386,28 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-35585: Support propagate empty relation through project/filter") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+  SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+  val (plan1, adaptivePlan1) = runAdaptiveAndVerifyResult(
+"SELECT key FROM testData WHERE key = 0 ORDER BY key, value")
+  assert(findTopLevelSort(plan1).size == 1)
+  assert(findTopLevelSort(adaptivePlan1).isEmpty)
+
+  val (plan2, adaptivePlan2) = runAdaptiveAndVerifyResult(
+"""
+  |SELECT t1.key, count(*) FROM testData t1

Review comment:
   If `limit` is not the top-level node, we will introduce extra shuffle 
node for the `SinglePartition`. It will make plan complex.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r643606101



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -1373,6 +1386,28 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-35585: Support propagate empty relation through project/filter") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+  SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+  val (plan1, adaptivePlan1) = runAdaptiveAndVerifyResult(
+"SELECT key FROM testData WHERE key = 0 ORDER BY key, value")
+  assert(findTopLevelSort(plan1).size == 1)
+  assert(findTopLevelSort(adaptivePlan1).isEmpty)

Review comment:
   Updated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r643215852



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -1373,6 +1380,31 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-35585: Support propagate empty relation through project/filter") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+  SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+  Seq(
+"""
+ |SELECT t1.key, count(*) FROM testData t1
+ | JOIN (SELECT * FROM testData2 WHERE b = 0) t2 ON t1.key = t2.a

Review comment:
   yeah we can use sort instead. Updated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r643138003



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
##
@@ -63,3 +70,8 @@ class AQEOptimizer(conf: SQLConf) extends 
RuleExecutor[LogicalPlan] {
   LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique(plan))
   }
 }
+
+/**
+ * ConvertToLocalRelation at AQE optimizer side.
+ */
+object AQEConvertToLocalRelation extends ConvertToLocalRelationBase

Review comment:
   Seems the issue won't happen since we clone the plan step-by-step ( i.e. 
analyzer/optimizer/physical ) and the ineffective tag won't exist after clone.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r642845898



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
##
@@ -63,3 +70,8 @@ class AQEOptimizer(conf: SQLConf) extends 
RuleExecutor[LogicalPlan] {
   LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique(plan))
   }
 }
+
+/**
+ * ConvertToLocalRelation at AQE optimizer side.
+ */
+object AQEConvertToLocalRelation extends ConvertToLocalRelationBase

Review comment:
   It should also do this. `UpdateAttributeNullability ` updated to 
`resolveOperatorsUpWithPruning` just now...  see the merged PR 
[#32686](https://github.com/apache/spark/pull/32686).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r642833622



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
##
@@ -63,3 +70,8 @@ class AQEOptimizer(conf: SQLConf) extends 
RuleExecutor[LogicalPlan] {
   LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique(plan))
   }
 }
+
+/**
+ * ConvertToLocalRelation at AQE optimizer side.
+ */
+object AQEConvertToLocalRelation extends ConvertToLocalRelationBase

Review comment:
   avoid this rule marked as ineffective during normal optimizer




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-06-01 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r642832851



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -1373,6 +1380,31 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-35585: Support propagate empty relation through project/filter") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+  SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+  Seq(
+"""
+ |SELECT t1.key, count(*) FROM testData t1
+ | JOIN (SELECT * FROM testData2 WHERE b = 0) t2 ON t1.key = t2.a
+ | GROUP BY t1.key
+ |""".stripMargin,
+"""
+  |SELECT t1.key, count(*) FROM testData t1
+  | JOIN (SELECT * FROM testData2 WHERE b = 0) t2 ON t1.key = t2.a
+  | WHERE t1.key > rand()

Review comment:
   test filter case
   ```
   Aggregate
 Project
   Filter
 Join
   ShuffleStage
   ShuffleStage
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #32724: [SPARK-35585][SQL] Support propagate empty relation through project/filter

2021-05-31 Thread GitBox


ulysses-you commented on a change in pull request #32724:
URL: https://github.com/apache/spark/pull/32724#discussion_r642797088



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -1711,7 +1711,7 @@ object DecimalAggregates extends Rule[LogicalPlan] {
  * Converts local operations (i.e. ones that don't require data exchange) on 
`LocalRelation` to
  * another `LocalRelation`.
  */
-object ConvertToLocalRelation extends Rule[LogicalPlan] {
+trait ConvertToLocalRelationBase extends Rule[LogicalPlan] {

Review comment:
   this is for isolation between normal and AQE optimizer




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org