Ryan Blue created SPARK-17995:

             Summary: Use new attributes for columns from outer joins
                 Key: SPARK-17995
                 URL: https://issues.apache.org/jira/browse/SPARK-17995
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.0.0, 1.6.2, 2.1.0
            Reporter: Ryan Blue

Plans involving outer joins use the same attribute reference (by exprId) to 
refer to columns above the join and below the join. This is a false 
equivalence that leads to bugs like SPARK-16181, in which attributes were 
incorrectly replaced by the optimizer. A column on the null-supplying side has 
a different schema above the outer join, because its values may be null there. 
The fix for that issue, [PR 
#13884](https://github.com/apache/spark/pull/13884), has a TODO comment from 
[~cloud_fan] to fix this by using different attributes instead of 
special-casing outer joins in rules. This issue tracks that improvement.
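
To make the false equivalence concrete, here is a minimal toy model in Python. It is not Catalyst code: `Attribute`, `new_attribute`, `outer_join_output_shared`, `outer_join_output_fresh`, and `same_column` are hypothetical names invented for this sketch, and it assumes only what is described above, namely that a rule comparing by exprId treats the column below the join and the nullable column above it as the same.

```python
from dataclasses import dataclass, replace
from itertools import count

# Hypothetical stand-in for an expression-ID generator.
_next_id = count(1)


@dataclass(frozen=True)
class Attribute:
    """Toy column reference: a name, an expression ID, and a nullability flag."""
    name: str
    expr_id: int
    nullable: bool


def new_attribute(name, nullable):
    """Create an attribute with a fresh expression ID."""
    return Attribute(name, next(_next_id), nullable)


def outer_join_output_shared(left, right):
    """Left outer join output that reuses the right side's expression IDs.

    Only the nullability flag changes, which models the behavior
    this issue describes.
    """
    return left + [replace(a, nullable=True) for a in right]


def outer_join_output_fresh(left, right):
    """Left outer join output that mints new attributes for the right side.

    This models the improvement the issue proposes.
    """
    return left + [new_attribute(a.name, True) for a in right]


def same_column(a, b):
    """How a rule that matches purely by expression ID decides equivalence."""
    return a.expr_id == b.expr_id


left = [new_attribute("id", nullable=False)]
right = [new_attribute("value", nullable=False)]
below = right[0]

# Shared IDs: the rule sees one column, yet the schemas disagree.
above_shared = outer_join_output_shared(left, right)[1]
print(same_column(below, above_shared), below.nullable, above_shared.nullable)

# Fresh IDs: the rule can no longer confuse the two columns.
above_fresh = outer_join_output_fresh(left, right)[1]
print(same_column(below, above_fresh))
```

With shared IDs, any rule that substitutes one attribute for the other can silently drop the nullability introduced by the join, so each such rule needs an outer-join special case. With fresh IDs the two columns are simply different attributes, and no special case is needed.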
