[GitHub] spark pull request #22857: [SPARK-25860][SQL] Replace Literal(null, _) with ...

dongjoon-hyun Sun, 28 Oct 2018 11:54:20 -0700

Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22857#discussion_r228760200
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -736,3 +736,65 @@ object CombineConcats extends Rule[LogicalPlan] {
           flattenConcats(concat)
       }
     }
    +
    +/**
    + * A rule that replaces `Literal(null, _)` with `FalseLiteral` for further optimizations.
    + *
    + * For example, `Filter(Literal(null, _))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * As a result, many unnecessary computations can be removed in the query optimization phase.
    + *
    + * Similarly, the same logic can be applied to conditions in [[Join]], predicates in [[If]],
    + * conditions in [[CaseWhen]].
    --- End diff --
    
    The examples are good, but we have to be clearer about the scope of this optimizer rule.
    For now, this PR touches not only predicates in WHERE but also some expressions in SELECT.
    It is also unclear how it applies to predicates after aggregation, such as HAVING. Could you
    clearly enumerate the targets in this documentation, @aokolnychyi?
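
The scope concern above can be illustrated with a small sketch. This is a toy model, not Spark's Catalyst code: `Tri`, `filter`, `project`, and `nullToFalse` are hypothetical names introduced only for illustration, and SQL NULL is modeled as `None`. It suggests why rewriting null to false is harmless where only "is the result true?" matters (a WHERE-style filter) but changes the visible result when the value itself is returned (a SELECT-style projection).

```scala
// Toy model only (assumption): these are not Spark's Catalyst classes.
object NullVsFalseSketch {
  // Three-valued boolean: Some(true), Some(false), or None for SQL NULL.
  type Tri = Option[Boolean]

  // WHERE-style filter: a row is kept only when the predicate is exactly true,
  // so NULL and false both drop the row.
  def filter[A](rows: Seq[A])(pred: A => Tri): Seq[A] =
    rows.filter(r => pred(r).contains(true))

  // SELECT-style projection: the value itself is returned to the caller,
  // so NULL and false are distinguishable.
  def project[A](rows: Seq[A])(expr: A => Tri): Seq[Tri] =
    rows.map(expr)

  // Hypothetical rewrite: replace a NULL result with false.
  def nullToFalse[A](expr: A => Tri): A => Tri =
    a => Some(expr(a).getOrElse(false))

  def main(args: Array[String]): Unit = {
    val rows = Seq(1, 2, 3)
    // Mirrors the shape If(cond, FalseLiteral, Literal(null)) from the diff.
    val expr: Int => Tri = i => if (i > 1) Some(false) else None

    // In a filter, the rewrite does not change which rows survive.
    println(filter(rows)(expr) == filter(rows)(nullToFalse(expr)))

    // In a projection, the rewrite changes the output values.
    println(project(rows)(expr) == project(rows)(nullToFalse(expr)))
  }
}
```

In this model a HAVING clause behaves like `filter` applied to already-aggregated rows, which is why enumerating the exact targets in the rule's documentation matters; whether the PR treats HAVING this way is not stated in the thread.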




