GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/21821

[SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWriter

## What changes were proposed in this pull request?

```Scala
val udf1 = udf({(x: Int, y: Int) => x + y})
val df = spark.range(0, 3).toDF("a")
  .withColumn("b", udf1($"a", udf1($"a", lit(10))))
df.cache()
df.write.saveAsTable("t")
```

Cache is not being used because the plans do not match with the cached plan. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.

## How was this patch tested?

Added a test.

Also found a bug in the DSV1 write path. This is not a regression. Thus, opened a separate JIRA https://issues.apache.org/jira/browse/SPARK-24869

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark testMaster22

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21821.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21821

----
commit 23ec09fc3bbedd2f34c594daf461cebd9c0295a6
Author: Xiao Li <gatorsmile@...>
Date:   2018-07-19T23:38:44Z

    fix

----