vidyasankarv opened a new issue, #440:
URL: https://github.com/apache/datafusion-comet/issues/440
### Describe the bug
When a String that is not a valid date is cast to DateType, the error message in Spark 3.2 is:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID
2) (192.168.1.10 executor driver): java.time.DateTimeException: Cannot cast 0
to DateType.
```
In Spark 3.3 and above, the error message is:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID
2) (192.168.1.10 executor driver): org.apache.spark.SparkDateTimeException:
[CAST_INVALID_INPUT] The value '0' of the type "STRING" cannot be cast to
"DATE" because it is malformed. Correct the value as per the syntax, or change
its target type. Use `try_cast` to tolerate malformed input and return NULL
instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this
error.
```
Currently, Comet's error messages match Spark 3.3 and above.
### Steps to reproduce
Currently, CometTestSuite includes an assumption that restricts this test to Spark 3.3 and above.
Removing that assumption triggers a test failure when the suite is run with the
following env: **jdk-1.8 and spark-3.2.0**.
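As a hedged sketch of one way to handle this without skipping the test: branch on the Spark version and assert the error signature that version actually produces. The object and helper names below (`CastErrorCheck`, `expectedCastError`) are illustrative and not from the Comet codebase.

```scala
object CastErrorCheck {
  // Hypothetical helper: pick the error signature a test should expect,
  // based on the Spark version under test (e.g. "3.2.0", "3.3.1").
  def expectedCastError(sparkVersion: String): String = {
    val Array(major, minor) = sparkVersion.split("\\.").take(2).map(_.toInt)
    if (major > 3 || (major == 3 && minor >= 3))
      "CAST_INVALID_INPUT" // Spark 3.3+: SparkDateTimeException with an error class
    else
      "Cannot cast"        // Spark 3.2: java.time.DateTimeException
  }
}
```

A test could then match the thrown exception's message against `expectedCastError(...)` instead of assuming a single message format.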
Additionally, you can reproduce this error locally using the Spark shell:
`$SPARK_HOME/bin/spark-shell --conf spark.sql.ansi.enabled=true`
```
import org.apache.spark.sql._
import org.apache.spark.sql.types._
import java.io.File
import java.nio.file.Files

// Write the DataFrame to Parquet and read it back
def roundtripParquet(df: DataFrame): DataFrame = {
  val tempDir = Files.createTempDirectory("spark").toString
  val filename = new File(tempDir, s"castTest_${System.currentTimeMillis()}.parquet").toString
  df.write.mode(SaveMode.Overwrite).parquet(filename)
  spark.read.parquet(filename)
}

import spark.implicits._
val data = roundtripParquet(Seq("0").toDF("a"))
data.createOrReplaceTempView("t")
val df = spark.sql(s"select a, cast(a as ${DataTypes.DateType.sql}) from t order by a")
df.collect().foreach(println)
```
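For intuition on why ANSI mode rejects the value, here is a minimal plain-Scala sketch using `java.time`'s ISO parser (this is an illustration, not Spark's actual date parser, and `parseIsoDate` is a name introduced here):

```scala
import java.time.LocalDate
import java.time.format.DateTimeParseException

// Returns Some(date) for a parseable ISO-8601 date, None otherwise.
def parseIsoDate(s: String): Option[LocalDate] =
  try Some(LocalDate.parse(s))
  catch { case _: DateTimeParseException => None }

// "0" is not a valid date, so an ANSI cast raises an error rather than
// producing NULL; the try_cast suggested in the error message gives
// NULL-on-failure semantics instead.
```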
### Expected behavior
The CometTestSuite test `cast String to DateType` should pass in all environments.
### Additional context
https://github.com/apache/datafusion-comet/pull/383#issuecomment-2115341055
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]