andygrove opened a new issue, #14:
URL: https://github.com/apache/arrow-datafusion-comet/issues/14

   I was manually experimenting with some cast operations, based on my experience implementing them in Spark RAPIDS, and found the following example of incorrect behavior. I would recommend implementing some fuzz tests to find these kinds of issues.
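
   A rough sketch of the kind of fuzz test I have in mind, assuming a spark-shell session where `spark` is in scope (the generator, seeds, and column names here are illustrative, not an existing test):
   
   ```
   // Illustrative only: cast randomly generated strings to timestamp with Comet
   // enabled and disabled, and report any rows where the two engines disagree.
   import scala.util.Random
   import org.apache.spark.sql.functions.col
   import org.apache.spark.sql.types.DataTypes
   import spark.implicits._
   
   val random = new Random(42)
   // A real fuzzer should bias towards timestamp-like characters (digits, '-', ':', 'T', ' ').
   val inputs = (0 until 1000).map(_ => random.alphanumeric.take(random.nextInt(10) + 1).mkString)
   val df = inputs.toDF("str").withColumn("ts", col("str").cast(DataTypes.TimestampType))
   
   spark.conf.set("spark.comet.enabled", "true")
   val cometRows = df.collect()
   
   spark.conf.set("spark.comet.enabled", "false")
   val sparkRows = df.collect()
   
   // Print any (Comet row, Spark row) pairs that differ.
   cometRows.zip(sparkRows).filter { case (c, s) => c != s }.foreach(println)
   ```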
   
   ## Test data
   
   ```
   scala> robots.show
   +------+
   |  name|
   +------+
   |WALL-E|
   |  R2D2|
   |    T2|
   +------+
   ```
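   
   For reference, the `robots` DataFrame above can be created like this (reconstructed from the output; assumes a spark-shell session):
   
   ```
   scala> val robots = Seq("WALL-E", "R2D2", "T2").toDF("name")
   ```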
   
   ## Test with Comet
   
   ```
   scala> import org.apache.spark.sql.types._
   
   scala> import org.apache.spark.sql.functions.col
   
   scala> val df = robots.withColumn("date", col("name").cast(DataTypes.TimestampType))
   
   scala> df.show
   +------+----+
   |  name|date|
   +------+----+
   |WALL-E|null|
   |  R2D2|null|
   |    T2|null|
   +------+----+
   ```
   
   ## Test with Spark
   
   ```
   scala> spark.conf.set("spark.comet.enabled", false)
   
   scala> df.show
   +------+-------------------+
   |  name|               date|
   +------+-------------------+
   |WALL-E|               null|
   |  R2D2|               null|
   |    T2|2024-02-09 02:00:00|
   +------+-------------------+
   ```
   
   `T2` is a valid timestamp string because `T` is the separator between the (optional) date portion and the time portion, and `2` is a valid time because some time fields are optional; Spark interprets it as hour 2 of the current date, hence the `2024-02-09 02:00:00` above.
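   
   If that reading is right, other partially-specified strings should also be accepted by Spark's string-to-timestamp cast, for example (with Comet disabled; exact results depend on the current date and session time zone):
   
   ```
   scala> Seq("T2", "T12:34", "2024-02-09", "2024-02-09T12:34:56").toDF("s").select(col("s").cast(DataTypes.TimestampType)).show
   ```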

