GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/20270

    [SPARK-23079] Fix query constraints propagation with aliases

    ## What changes were proposed in this pull request?
    
    Previously, PR #19201 fixed the problem of non-converging constraints.
    
    After that, PR #19149 improved the loop so that constraints are inferred only once.
    
    So the problem of non-converging constraints is gone.
    
    However, with the current code, the case below fails.
    
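    ```
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.functions.lit
    import spark.implicits._  // needed for the $"id" column syntax

    spark.range(5).write.saveAsTable("t")
    val t = spark.read.table("t")
    // left has columns (id, xid = id + 1); right has the single column xid (renamed from id)
    val left = t.withColumn("xid", $"id" + lit(1)).as("x")
    val right = t.withColumnRenamed("id", "xid").as("y")
    val df = left.join(right, "xid").filter("id = 3").toDF()
    checkAnswer(df, Row(4, 3))  // checkAnswer is the helper from Spark's QueryTest
    ```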
    
    It fails because `aliasMap` replaces all the aliased children. See the
    test case in this PR for details.
    
    This PR removes the now-useless code for preventing non-converging
    constraints. It also infers constraints with `EqualNullSafe`.
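    
    As a rough illustration of what inferring constraints across an equality
    means, here is a minimal, self-contained sketch. It is a toy model and not
    Catalyst's implementation: the `Expr` types, `InferSketch`, `substitute`
    and `inferOnce` are hypothetical names made up for this example. It assumes
    that a constraint on one side of `a = b` or `a <=> b` can be copied to the
    other side, and it runs a single pass, in the spirit of inferring only once.
    
    ```scala
    // Toy expression model (hypothetical; these are NOT Catalyst's classes).
    sealed trait Expr
    case class Attr(name: String) extends Expr
    case class Lit(value: Long) extends Expr
    case class GreaterThan(left: Expr, right: Expr) extends Expr
    case class EqualTo(left: Expr, right: Expr) extends Expr
    case class EqualNullSafe(left: Expr, right: Expr) extends Expr

    object InferSketch {
      // Replace every occurrence of `from` inside `e` with `to`.
      def substitute(e: Expr, from: Expr, to: Expr): Expr =
        if (e == from) to
        else e match {
          case GreaterThan(l, r)   => GreaterThan(substitute(l, from, to), substitute(r, from, to))
          case EqualTo(l, r)       => EqualTo(substitute(l, from, to), substitute(r, from, to))
          case EqualNullSafe(l, r) => EqualNullSafe(substitute(l, from, to), substitute(r, from, to))
          case other               => other
        }

      // True for equalities of the form x = x or x <=> x, which carry no information.
      private def isTrivial(e: Expr): Boolean = e match {
        case EqualTo(l, r)       => l == r
        case EqualNullSafe(l, r) => l == r
        case _                   => false
      }

      // Single pass: for each attribute equality, copy the other constraints
      // across it in both directions and return only the new ones.
      def inferOnce(constraints: Set[Expr]): Set[Expr] = {
        val equalities: Set[(Attr, Attr)] = constraints.collect {
          case EqualTo(a: Attr, b: Attr)       => (a, b)
          case EqualNullSafe(a: Attr, b: Attr) => (a, b)
        }
        val inferred: Set[Expr] = for {
          (a, b) <- equalities
          c <- constraints
          if c != EqualTo(a, b) && c != EqualNullSafe(a, b)
          d <- Set(substitute(c, a, b), substitute(c, b, a))
          if !isTrivial(d)
        } yield d
        inferred -- constraints
      }
    }

    object Demo {
      def main(args: Array[String]): Unit = {
        val constraints: Set[Expr] = Set(
          GreaterThan(Attr("a"), Lit(5)),
          EqualNullSafe(Attr("a"), Attr("b"))
        )
        // From a > 5 and a <=> b, the sketch derives b > 5.
        val inferred = InferSketch.inferOnce(constraints)
        assert(inferred == Set[Expr](GreaterThan(Attr("b"), Lit(5))))
      }
    }
    ```
    
    Because the sketch makes one pass and never feeds its output back in, it
    cannot loop. It does not model aliases at all.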
     
    
    ## How was this patch tested?
    
    Unit test


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark FixConstraint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20270
    
----
commit e9dd769cc96158ba3b1597bd5eb6fb824aeab22a
Author: Wang Gengliang <ltnwgl@...>
Date:   2018-01-15T15:10:57Z

    SPARK-23079: constraints should be inferred correctly with aliases

----

