spark git commit: [SPARK-24151][SQL] Case insensitive resolution of CURRENT_DATE and CURRENT_TIMESTAMP

lixiao Mon, 17 Sep 2018 23:19:46 -0700

Repository: spark
Updated Branches:
  refs/heads/master acc645257 -> ba838fee0



[SPARK-24151][SQL] Case insensitive resolution of CURRENT_DATE and 
CURRENT_TIMESTAMP

## What changes were proposed in this pull request?

SPARK-22333 introduced a regression in the resolution of `CURRENT_DATE` and 
`CURRENT_TIMESTAMP`. Before that ticket, these 2 functions were resolved in a 
case insensitive way. After, this depends on the value of 
`spark.sql.caseSensitive`.

The PR restores the previous behavior and makes their resolution case 
insensitive anyhow. The PR takes over #21217, therefore it closes #21217 and 
credit for this patch should be given to jamesthomp.

## How was this patch tested?

added UT

Closes #22440 from mgaido91/SPARK-24151.

Lead-authored-by: James Thompson <[email protected]>
Co-authored-by: Marco Gaido <[email protected]>
Signed-off-by: gatorsmile <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba838fee
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ba838fee
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ba838fee

Branch: refs/heads/master
Commit: ba838fee001d553e30d2205337c1fa5ccbd57caf
Parents: acc6452
Author: James Thompson <[email protected]>
Authored: Mon Sep 17 23:19:04 2018 -0700
Committer: gatorsmile <[email protected]>
Committed: Mon Sep 17 23:19:04 2018 -0700

----------------------------------------------------------------------
 docs/sql-programming-guide.md                     |  1 +
 .../spark/sql/catalyst/analysis/Analyzer.scala    |  2 +-
 .../sql/catalyst/analysis/AnalysisSuite.scala     | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ba838fee/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index e262987..f25415c 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1879,6 +1879,7 @@ working with timestamps in `pandas_udf`s to get the best 
performance, see
 
 ## Upgrading From Spark SQL 2.3 to 2.4
 
+  - In versions 2.2.1+ and 2.3, if `spark.sql.caseSensitive` is set to true, 
then the `CURRENT_DATE` and `CURRENT_TIMESTAMP` functions incorrectly became 
case-sensitive and would resolve to columns (unless typed in lower case). In 
Spark 2.4 this has been fixed and the functions are no longer case-sensitive.
   - Since Spark 2.4, Spark will evaluate the set operations referenced in a 
query by following a precedence rule as per the SQL standard. If the order is 
not specified by parentheses, set operations are performed from left to right 
with the exception that all INTERSECT operations are performed before any 
UNION, EXCEPT or MINUS operations. The old behaviour of giving equal precedence 
to all the set operations are preserved under a newly added configuration 
`spark.sql.legacy.setopsPrecedence.enabled` with a default value of `false`. 
When this property is set to `true`, spark will evaluate the set operators from 
left to right as they appear in the query given no explicit ordering is 
enforced by usage of parenthesis.
   - Since Spark 2.4, Spark will display table description column Last Access 
value as UNKNOWN when the value was Jan 01 1970.
   - Since Spark 2.4, Spark maximizes the usage of a vectorized ORC reader for 
ORC files by default. To do that, `spark.sql.orc.impl` and 
`spark.sql.orc.filterPushdown` change their default values to `native` and 
`true` respectively.

http://git-wip-us.apache.org/repos/asf/spark/blob/ba838fee/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
----------------------------------------------------------------------
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 580133d..e3b1712 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -1045,7 +1045,7 @@ class Analyzer(
     // support CURRENT_DATE and CURRENT_TIMESTAMP
     val literalFunctions = Seq(CurrentDate(), CurrentTimestamp())
     val name = nameParts.head
-    val func = literalFunctions.find(e => resolver(e.prettyName, name))
+    val func = literalFunctions.find(e => 
caseInsensitiveResolution(e.prettyName, name))
     func.map(wrapper)
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/ba838fee/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
----------------------------------------------------------------------
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
index 3b3edac..f9facbb 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
@@ -32,6 +32,8 @@ import org.apache.spark.sql.catalyst.plans.{Cross, Inner}
 import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, 
Partitioning,
   RangePartitioning, RoundRobinPartitioning}
+import org.apache.spark.sql.catalyst.util._
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 
@@ -586,4 +588,20 @@ class AnalysisSuite extends AnalysisTest with Matchers {
       listRelation.select(MultiAlias(MultiAlias(
         PosExplode('list), Seq("first_pos", "first_val")), Seq("second_pos", 
"second_val"))))
   }
+
+  test("SPARK-24151: CURRENT_DATE, CURRENT_TIMESTAMP should be case 
insensitive") {
+    withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+      val input = Project(Seq(
+        UnresolvedAttribute("current_date"),
+        UnresolvedAttribute("CURRENT_DATE"),
+        UnresolvedAttribute("CURRENT_TIMESTAMP"),
+        UnresolvedAttribute("current_timestamp")), testRelation)
+      val expected = Project(Seq(
+        Alias(CurrentDate(), toPrettySQL(CurrentDate()))(),
+        Alias(CurrentDate(), toPrettySQL(CurrentDate()))(),
+        Alias(CurrentTimestamp(), toPrettySQL(CurrentTimestamp()))(),
+        Alias(CurrentTimestamp(), toPrettySQL(CurrentTimestamp()))()), 
testRelation).analyze
+      checkAnalysis(input, expected)
+    }
+  }
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

spark git commit: [SPARK-24151][SQL] Case insensitive resolution of CURRENT_DATE and CURRENT_TIMESTAMP

Reply via email to