[GitHub] spark pull request: [SPARK-4226][SQL] Support Correlated Sub-queri...

davies Sun, 17 Apr 2016 23:23:47 -0700

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12306#discussion_r60009754
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1447,3 +1450,133 @@ object EmbedSerializerInFilter extends Rule[LogicalPlan] {
           }
       }
     }
    +
    +/**
    + * This rule rewrites predicate sub-queries into left semi/anti joins. The following predicates
    + * are supported:
    + * a. EXISTS/NOT EXISTS will be rewritten as semi/anti join, unresolved conditions in Filter
    + *    will be pulled out as the join conditions.
    + * b. IN/NOT IN will be rewritten as semi/anti join, unresolved conditions in the Filter will
    + *    be pulled out as join conditions, value = selected column will also be used as join
    + *    condition.
    + */
    +object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
    +  /**
    +   * Pull out all correlated predicates from a given sub-query. This method removes the correlated
    +   * predicates from sub-query [[Filter]]s and adds the references of these predicates to
    +   * all intermediate [[Project]] clauses (if they are missing) in order to be able to evaluate the
    +   * predicates in the join condition.
    +   *
    +   * This method returns the rewritten sub-query and the combined (AND) extracted predicate.
    +   */
    +  private def pullOutCorrelatedPredicates(
    +      subquery: LogicalPlan,
    +      query: LogicalPlan): (LogicalPlan, Option[Expression]) = {
    +    val references: Set[Expression] = query.output.toSet
    --- End diff --
    
    It's better to use AttributeSet or ExpressionSet here instead of Set[Expression].
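
    To illustrate the point, here is a minimal sketch of why a plain Scala Set can
    miss a match that an id-keyed set finds. The types below (ExprId, Attr, AttrSet)
    are hypothetical stand-ins written for this example, not Catalyst's classes; the
    only assumption taken from Catalyst is that AttributeSet checks membership by
    expression id rather than by standard equality.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
final case class ExprId(id: Long)

// An attribute reference: identity is the exprId; name and qualifier are cosmetic here.
final case class Attr(name: String, exprId: ExprId, qualifier: Option[String] = None)

// A set keyed by exprId, mimicking the idea behind an id-based attribute set.
final class AttrSet private (private val ids: Set[ExprId]) {
  def contains(a: Attr): Boolean = ids.contains(a.exprId)
}

object AttrSet {
  def apply(attrs: Iterable[Attr]): AttrSet = new AttrSet(attrs.map(_.exprId).toSet)
}

object SetDemo {
  // Output attributes of a hypothetical outer query.
  val outer: Seq[Attr] = Seq(Attr("a", ExprId(1)), Attr("b", ExprId(2)))

  // The same attribute (same exprId), but carrying a different qualifier.
  val sameAttrDifferentQualifier: Attr = Attr("a", ExprId(1), qualifier = Some("t"))

  // Structural equality: every field must match, including the qualifier.
  val plainSet: Set[Attr] = outer.toSet

  // Id-based membership: only the exprId matters.
  val byId: AttrSet = AttrSet(outer)

  def main(args: Array[String]): Unit = {
    println(plainSet.contains(sameAttrDifferentQualifier)) // structural lookup misses
    println(byId.contains(sameAttrDifferentQualifier))     // id-based lookup hits
  }
}
```

    In this toy model the structural lookup misses the attribute because the
    qualifier differs, while the id-based lookup finds it. Whether a given
    cosmetic difference matters in the real code depends on how Catalyst
    defines equality for the expressions involved.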



