This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new c836e93  [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark 3.2 and Spark 3.1
c836e93 is described below

commit c836e93306d1816a0232b69ef83d86c2782688e3
Author: Jiaan Geng <belie...@163.com>
AuthorDate: Wed Dec 1 16:19:50 2021 +0800

    [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark 3.2 and Spark 3.1
    
    ### What changes were proposed in this pull request?
    The SQL shown below previously returned the date type, but now returns the timestamp type.
    `select date '2011-11-11' + interval 12 hours;`
    `select date '2011-11-11' - interval 12 hours;`
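
    As a minimal illustration of the `cast` workaround mentioned in the migration guide entry (a sketch, assuming default session settings), the original date result can be recovered with:
    `select cast(date '2011-11-11' + interval 12 hours as date);`
    `select cast(date '2011-11-11' - interval 12 hours as date);`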
    
    The root cause is this change in the analyzer:
    In Spark 3.1:
    https://github.com/apache/spark/blob/75cac1fe0a46dbdf2ad5b741a3a49c9ab618cdce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L338
    In Spark 3.2:
    https://github.com/apache/spark/blob/ceae41ba5cafb479cdcfc9a6a162945646a68f05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L376
    
    Because Spark 3.2 has already been released, we add a migration guide entry for this change.
    
    ### Why are the changes needed?
    Provide a migration guide entry for the behavior difference between Spark 3.1 and Spark 3.2.
    
    ### Does this PR introduce _any_ user-facing change?
    'No'.
    It only modifies the docs.
    
    ### How was this patch tested?
    No tests are needed; this is a documentation-only change.
    
    Closes #34766 from beliefer/SPARK-37513.
    
    Authored-by: Jiaan Geng <belie...@163.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
    (cherry picked from commit ec47c3c4394b2410a277e7f7105cf896c28b2ed4)
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 docs/sql-migration-guide.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 9f51c75..6fcc059 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -103,6 +103,8 @@ license: |
 
   - In Spark 3.2, create/alter view will fail if the input query output columns contain auto-generated alias. This is necessary to make sure the query output column names are stable across different spark versions. To restore the behavior before Spark 3.2, set `spark.sql.legacy.allowAutoGeneratedAliasForView` to `true`.
 
+  - In Spark 3.2, date +/- interval with only day-time fields such as `date '2011-11-11' + interval 12 hours` returns timestamp. In Spark 3.1 and earlier, the same expression returns date. To restore the behavior before Spark 3.2, you can use `cast` to convert the timestamp to a date.
+
 ## Upgrading from Spark SQL 3.0 to 3.1
 
   - In Spark 3.1, statistical aggregation function includes `std`, `stddev`, `stddev_samp`, `variance`, `var_samp`, `skewness`, `kurtosis`, `covar_samp`, `corr` will return `NULL` instead of `Double.NaN` when `DivideByZero` occurs during expression evaluation, for example, when `stddev_samp` applied on a single element set. In Spark version 3.0 and earlier, it will return `Double.NaN` in such case. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.statisticalAggrega [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
