[GitHub] [spark] cloud-fan commented on a change in pull request #33958: [SPARK-36718][SQL] Only collapse projects if we don't duplicate expensive expressions

GitBox Wed, 15 Sep 2021 08:50:44 -0700


cloud-fan commented on a change in pull request #33958:
URL: https://github.com/apache/spark/pull/33958#discussion_r709322487




##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/planning/ScanOperationSuite.scala
##########
@@ -73,17 +80,20 @@ class ScanOperationSuite extends SparkFunSuite {
 
   test("Filter which has the same non-deterministic expression with its child Project") {
     val filter1 = Filter(EqualTo(colR, Literal(1)), Project(Seq(colA, aliasR), relation))
-    assert(ScanOperation.unapply(filter1).isEmpty)
+    filter1 match {
+      case ScanOperation(projects, filters, _: Filter) =>
+        assert(projects.size === 2)
+        assert(filters.isEmpty)
+      case _ => assert(false)
+    }
   }
 
   test("Deterministic filter with a child Project with a non-deterministic expression") {
     val filter2 = Filter(EqualTo(colA, Literal(1)), Project(Seq(colA, aliasR), relation))
     filter2 match {
-      case ScanOperation(projects, filters, _: LocalRelation) =>
+      case ScanOperation(projects, filters, _: Filter) =>

Review comment:
       @viirya this is a real behavior change. After the pattern match, the
caller side puts the Filter first and then the Project, which means the Filter
is pushed through the Project. According to the optimizer rule, this pushdown
should not happen if the Project has nondeterministic expressions, so the
match now stops at the Filter instead of reaching the LocalRelation.
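
To make the point above concrete, here is a minimal, self-contained sketch. It
is a toy model and not Catalyst's actual `ScanOperation`: `PushdownSketch`,
`NamedExpr`, `ToyScanOperation`, and the simplified `Plan` classes are all
hypothetical names made up for illustration. It assumes the extractor should
treat a Filter as the leaf whenever its child Project contains a
nondeterministic expression, so the caller never rebuilds the plan with the
Filter below the Project.

```scala
object PushdownSketch {
  // Hypothetical stand-in for a named Catalyst expression.
  final case class NamedExpr(name: String, deterministic: Boolean)

  // Hypothetical, heavily simplified logical plans.
  sealed trait Plan { def output: Seq[String] }
  final case class Relation(output: Seq[String]) extends Plan
  final case class Project(exprs: Seq[NamedExpr], child: Plan) extends Plan {
    def output: Seq[String] = exprs.map(_.name)
  }
  final case class Filter(column: String, child: Plan) extends Plan {
    def output: Seq[String] = child.output
  }

  object ToyScanOperation {
    // Returns (projected columns, collected filters, leaf plan).
    def unapply(plan: Plan): Option[(Seq[String], Seq[Filter], Plan)] = plan match {
      // Collecting this Filter would amount to pushing it through a
      // nondeterministic Project, so stop and treat the Filter as the leaf.
      case f @ Filter(_, p: Project) if p.exprs.exists(!_.deterministic) =>
        Some((f.output, Seq.empty, f))
      // Otherwise the Filter can be collected on top of whatever is below it.
      case f @ Filter(_, child) =>
        unapply(child).map { case (projects, filters, leaf) =>
          (projects, f +: filters, leaf)
        }
      // A Project replaces the projected columns and keeps collected filters.
      case p @ Project(_, child) =>
        unapply(child).map { case (_, filters, leaf) => (p.output, filters, leaf) }
      case leaf =>
        Some((leaf.output, Seq.empty, leaf))
    }
  }

  def main(args: Array[String]): Unit = {
    val relation = Relation(Seq("a", "b"))
    val colA = NamedExpr("a", deterministic = true)
    val aliasR = NamedExpr("r", deterministic = false) // stands in for an alias of rand()

    // Mirrors the shape of the updated test: the leaf is the Filter itself.
    val filter2 = Filter("a", Project(Seq(colA, aliasR), relation))
    filter2 match {
      case ToyScanOperation(projects, filters, _: Filter) =>
        assert(projects.size == 2)
        assert(filters.isEmpty)
      case _ => assert(false)
    }
  }
}
```

With a fully deterministic Project, the same toy extractor collects the Filter
and returns the relation as the leaf, which corresponds to the pushdown being
allowed.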




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



