This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new ef818ed  [SPARK-38283][SQL] Test invalid datetime parsing under ANSI mode
ef818ed is described below

commit ef818ed86ce41be55bd962a5c809974f957f8734
Author: Gengliang Wang <gengli...@apache.org>
AuthorDate: Tue Feb 22 19:12:02 2022 +0800

    [SPARK-38283][SQL] Test invalid datetime parsing under ANSI mode
    
    ### What changes were proposed in this pull request?
    
    Run datetime-parsing-invalid.sql under ANSI mode in SQLQueryTestSuite to improve test coverage.
    
    Also, we can simply set ANSI mode to off in DateFunctionsSuite, so that the test suite can pass after we set up a new test job with ANSI mode on.
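    
    For context, here is a minimal sketch (not part of this patch) of the behavior difference these tests exercise, using one of the inputs from datetime-parsing-invalid.sql:
    
    ```sql
    -- Under ANSI mode, parsing an invalid datetime string raises an error:
    SET spark.sql.ansi.enabled=true;
    SELECT to_timestamp('Unparseable', 'yyyy-MM-dd HH:mm:ss.SSS');
    -- java.time.format.DateTimeParseException: Text 'Unparseable' could not be parsed at index 0
    
    -- With ANSI mode off (the default), the same query returns NULL:
    SET spark.sql.ansi.enabled=false;
    SELECT to_timestamp('Unparseable', 'yyyy-MM-dd HH:mm:ss.SSS');
    ```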
    
    ### Why are the changes needed?
    
    Improve test coverage and fix DateFunctionsSuite under ANSI mode.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    UT
    
    Closes #35606 from gengliangwang/fixDateFuncSuite.
    
    Authored-by: Gengliang Wang <gengli...@apache.org>
    Signed-off-by: Gengliang Wang <gengli...@apache.org>
---
 .../inputs/ansi/datetime-parsing-invalid.sql       |   2 +
 .../results/ansi/datetime-parsing-invalid.sql.out  | 263 +++++++++++++++++++++
 .../org/apache/spark/sql/DateFunctionsSuite.scala  |   6 +-
 3 files changed, 270 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/test/resources/sql-tests/inputs/ansi/datetime-parsing-invalid.sql b/sql/core/src/test/resources/sql-tests/inputs/ansi/datetime-parsing-invalid.sql
new file mode 100644
index 0000000..70022f3
--- /dev/null
+++ b/sql/core/src/test/resources/sql-tests/inputs/ansi/datetime-parsing-invalid.sql
@@ -0,0 +1,2 @@
+--IMPORT datetime-parsing-invalid.sql
+
diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime-parsing-invalid.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime-parsing-invalid.sql.out
new file mode 100644
index 0000000..e6dd07b
--- /dev/null
+++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime-parsing-invalid.sql.out
@@ -0,0 +1,263 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 29
+
+
+-- !query
+select to_timestamp('294248', 'y')
+-- !query schema
+struct<>
+-- !query output
+java.lang.ArithmeticException
+long overflow
+
+
+-- !query
+select to_timestamp('1', 'yy')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '1' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('-12', 'yy')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '-12' could not be parsed at index 0. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('123', 'yy')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '123' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('1', 'yyy')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '1' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('1234567', 'yyyyyyy')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to recognize 'yyyyyyy' pattern in the DateTimeFormatter. 1) You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0. 2) You can form a valid datetime pattern with the guide from https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
+
+
+-- !query
+select to_timestamp('366', 'D')
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Invalid date 'DayOfYear 366' as '1970' is not a leap year. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('9', 'DD')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '9' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('366', 'DD')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '366' could not be parsed, unparsed text found at index 2. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('9', 'DDD')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '9' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('99', 'DDD')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '99' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_timestamp('30-365', 'dd-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Conflict found: Field DayOfMonth 30 differs from DayOfMonth 31 derived from 1970-12-31. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('11-365', 'MM-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Conflict found: Field MonthOfYear 11 differs from MonthOfYear 12 derived from 1970-12-31. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('2019-366', 'yyyy-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2019-366' could not be parsed: Invalid date 'DayOfYear 366' as '2019' is not a leap year. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('12-30-365', 'MM-dd-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Conflict found: Field DayOfMonth 30 differs from DayOfMonth 31 derived from 1970-12-31. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('2020-01-365', 'yyyy-dd-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-01-365' could not be parsed: Conflict found: Field DayOfMonth 30 differs from DayOfMonth 1 derived from 2020-12-30. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('2020-10-350', 'yyyy-MM-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-10-350' could not be parsed: Conflict found: Field MonthOfYear 12 differs from MonthOfYear 10 derived from 2020-12-15. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp('2020-11-31-366', 'yyyy-MM-dd-DDD')
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-11-31-366' could not be parsed: Invalid date 'NOVEMBER 31'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select from_csv('2018-366', 'date Date', map('dateFormat', 'yyyy-DDD'))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkUpgradeException
+You may get a different result due to the upgrading of Spark 3.0: Fail to parse '2018-366' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
+
+
+-- !query
+select to_date("2020-01-27T20:06:11.847", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-01-27T20:06:11.847' could not be parsed at index 10. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_date("Unparseable", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text 'Unparseable' could not be parsed at index 0. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp("2020-01-27T20:06:11.847", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-01-27T20:06:11.847' could not be parsed at index 10. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_timestamp("Unparseable", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text 'Unparseable' could not be parsed at index 0. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select unix_timestamp("2020-01-27T20:06:11.847", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-01-27T20:06:11.847' could not be parsed at index 10. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select unix_timestamp("Unparseable", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text 'Unparseable' could not be parsed at index 0. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_unix_timestamp("2020-01-27T20:06:11.847", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text '2020-01-27T20:06:11.847' could not be parsed at index 10. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select to_unix_timestamp("Unparseable", "yyyy-MM-dd HH:mm:ss.SSS")
+-- !query schema
+struct<>
+-- !query output
+java.time.format.DateTimeParseException
+Text 'Unparseable' could not be parsed at index 0. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select cast("Unparseable" as timestamp)
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Cannot cast Unparseable to TimestampType. To return NULL instead, use 'try_cast'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
+
+
+-- !query
+select cast("Unparseable" as date)
+-- !query schema
+struct<>
+-- !query output
+java.time.DateTimeException
+Cannot cast Unparseable to DateType. To return NULL instead, use 'try_cast'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala
index 543f845..762bc15 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala
@@ -23,7 +23,7 @@ import java.time.{Instant, LocalDateTime, ZoneId}
 import java.util.{Locale, TimeZone}
 import java.util.concurrent.TimeUnit
 
-import org.apache.spark.{SparkException, SparkUpgradeException}
+import org.apache.spark.{SparkConf, SparkException, SparkUpgradeException}
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{CEST, LA}
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
 import org.apache.spark.sql.functions._
@@ -35,6 +35,10 @@ import org.apache.spark.unsafe.types.CalendarInterval
 class DateFunctionsSuite extends QueryTest with SharedSparkSession {
   import testImplicits._
 
+  // The test cases which throw exceptions under ANSI mode are covered by date.sql and
+  // datetime-parsing-invalid.sql in org.apache.spark.sql.SQLQueryTestSuite.
+  override def sparkConf: SparkConf = super.sparkConf.set(SQLConf.ANSI_ENABLED.key, "false")
+
   test("function current_date") {
     val df1 = Seq((1, 2), (3, 1)).toDF("a", "b")
     val d0 = DateTimeUtils.currentDate(ZoneId.systemDefault())
