Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15417#discussion_r85295977
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala ---
    @@ -1016,6 +1016,8 @@ class FilterPushdownSuite extends PlanTest {
         val correctAnswer = x.where("x.a".attr === 5).join(y.where("y.a".attr === 5),
             condition = Some("x.a".attr === Rand(10) && "y.b".attr === 5))
     
    -    comparePlans(Optimize.execute(originalQuery.analyze), correctAnswer.analyze)
    --- End diff --
    
    We have to support `PullOutNondeterministic` on the `Join` operator. Since
    `Join.condition` is the only place where non-deterministic expressions can
    appear, we can rewrite the plan as follows:
    1. Pull out the non-deterministic part of `condition`. For example,
    `a = b && rand() > 0 && a > 1` has the deterministic part `a = b`, and the
    rest is non-deterministic.
    2. To handle the case where multiple non-deterministic expressions exist in
    `condition`, we have to split the conjunctive predicates. For every
    predicate that is non-deterministic, we insert a new `Join` operator like
    the following (`j` represents the original `Join` operator):
    ```
    Join(
      j,
      LocalRelation(predicate.collect { case n: Nondeterministic => n }),
      Inner,
      Some(rewrite(predicate))
    )
    ```
    Since this adds many `Join` operators, I suspect the performance will be
    pretty bad. Do you have any advice?
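
    For reference, the split described in step 1 can be sketched with a toy
    expression model. This is only an illustration: `CondSplitSketch`, `Expr`,
    `RandExpr`, `splitConjuncts` and `splitCondition` are hypothetical names,
    not Catalyst classes or Spark APIs. It assumes that "the rest" means every
    conjunct from the first non-deterministic one onward, which is what
    `span` produces.

    ```scala
    // Toy expression model for illustration only; these are NOT Spark's Catalyst classes.
    object CondSplitSketch {
      sealed trait Expr { def deterministic: Boolean }
      final case class Attr(name: String) extends Expr { val deterministic = true }
      final case class Lit(value: Int) extends Expr { val deterministic = true }
      // Stand-in for a non-deterministic expression such as rand().
      final case class RandExpr(seed: Long) extends Expr { val deterministic = false }
      final case class Eq(l: Expr, r: Expr) extends Expr {
        def deterministic: Boolean = l.deterministic && r.deterministic
      }
      final case class Gt(l: Expr, r: Expr) extends Expr {
        def deterministic: Boolean = l.deterministic && r.deterministic
      }
      final case class And(l: Expr, r: Expr) extends Expr {
        def deterministic: Boolean = l.deterministic && r.deterministic
      }

      // Flatten nested And nodes into a left-to-right list of conjuncts.
      def splitConjuncts(e: Expr): List[Expr] = e match {
        case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
        case other     => List(other)
      }

      // Assumption: keep the leading deterministic conjuncts as the deterministic
      // part; everything from the first non-deterministic conjunct onward is "the rest".
      def splitCondition(cond: Expr): (List[Expr], List[Expr]) =
        splitConjuncts(cond).span(_.deterministic)

      // a = b && rand() > 0 && a > 1
      val example: Expr =
        And(And(Eq(Attr("a"), Attr("b")), Gt(RandExpr(10L), Lit(0))), Gt(Attr("a"), Lit(1)))
    }
    ```

    With `span`, `a > 1` stays in the second group because it follows
    `rand() > 0`. Using `partition(_.deterministic)` instead would move it into
    the deterministic group; which one is appropriate depends on whether the
    evaluation order of the conjuncts has to be preserved.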

