andygrove opened a new issue, #14:
URL: https://github.com/apache/arrow-datafusion-comet/issues/14
I was manually experimenting with some cast operations, based on my
experience implementing them in Spark RAPIDS, and found the following example
of incorrect behavior. I would recommend implementing some fuzz tests to find
these kinds of issues.
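As a starting point, a fuzz test could generate a mix of valid, partial, and junk timestamp-like strings and then cast each one with Comet enabled and disabled, comparing the results. The sketch below only covers the input generator; the function name and the string shapes are illustrative, not an existing API:

```scala
import scala.util.Random

// Hypothetical fuzz-input generator: emits full dates, time-only strings
// (e.g. "T2"), combined date+time strings, and random junk. A real test
// would cast each string to TimestampType with spark.comet.enabled set to
// true and then false, and assert the two results match.
def randomTimestampLikeString(rng: Random): String = {
  val date = f"${1900 + rng.nextInt(300)}%04d-${1 + rng.nextInt(12)}%02d-${1 + rng.nextInt(28)}%02d"
  val time = s"T${rng.nextInt(24)}"
  rng.nextInt(4) match {
    case 0 => date        // date only
    case 1 => time        // time only, like the "T2" case below
    case 2 => date + time // date and time
    case _ => rng.alphanumeric.take(1 + rng.nextInt(8)).mkString // junk
  }
}

val rng = new Random(42)
val inputs = Seq.fill(100)(randomTimestampLikeString(rng))
```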
## Test data
```
scala> robots.show
+------+
| name|
+------+
|WALL-E|
| R2D2|
| T2|
+------+
```
## Test with Comet
```
scala> import org.apache.spark.sql.types._
scala> val df = robots.withColumn("date",
col("name").cast(DataTypes.TimestampType))
scala> df.show
+------+----+
| name|date|
+------+----+
|WALL-E|null|
| R2D2|null|
| T2|null|
+------+----+
```
## Test with Spark
```
scala> spark.conf.set("spark.comet.enabled", false)
scala> df.show
+------+-------------------+
| name| date|
+------+-------------------+
|WALL-E| null|
| R2D2| null|
| T2|2024-02-09 02:00:00|
+------+-------------------+
```
`T2` is a valid timestamp because `T` is the separator between the optional
date portion and the time portion, and `2` is a valid time because the minute
and second fields are optional.
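The accepted shape can be sketched with a regex: the date portion before `T` is optional, and within the time portion only the hour is required. This is an illustrative approximation of the grammar described above, not Spark's actual parser:

```scala
// Rough approximation: optional "yyyy[-M[-d]]" date, then optional
// "Th[:m[:s]]" time. "T2" matches (hour-only time, no date); the other
// robot names do not.
val timestampLike =
  """^(\d{4}(-\d{1,2}(-\d{1,2})?)?)?(T\d{1,2}(:\d{1,2}(:\d{1,2})?)?)?$""".r

def looksLikeTimestamp(s: String): Boolean =
  s.nonEmpty && timestampLike.matches(s)
```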