Hi all,

I would like to ask what the community thinks about how Spark handles nanoseconds in the Timestamp type.
As far as I can see in the code, Spark assumes microsecond precision. Therefore, when I specify a timestamp with nanoseconds, I would expect either a timestamp truncated to microseconds or an exception. However, the current implementation silently reinterprets the nanoseconds as microseconds in [1], which results in a wrong timestamp. Consider the example below:

spark.sql("SELECT cast('2015-01-02 00:00:00.000000001' as TIMESTAMP)").show(false)

+------------------------------------------------+
|CAST(2015-01-02 00:00:00.000000001 AS TIMESTAMP)|
+------------------------------------------------+
|2015-01-02 00:00:00.000001                      |
+------------------------------------------------+

This issue was already raised in SPARK-17914, but I do not see any decision there.

[1] - org.apache.spark.sql.catalyst.util.DateTimeUtils, toJavaTimestamp, line 204

Best regards,
Anton
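
P.S. For illustration only, here is a minimal sketch (plain JDK java.time, not Spark code) of the truncation behaviour I would expect; the expectation itself is my assumption, not what the current implementation does:

  import java.time.LocalDateTime
  import java.time.temporal.ChronoUnit

  // Parse the full nanosecond-precision literal.
  val ts = LocalDateTime.parse("2015-01-02T00:00:00.000000001")

  // Truncate to microsecond precision instead of reinterpreting the
  // nanosecond digits as a microsecond value.
  val truncated = ts.truncatedTo(ChronoUnit.MICROS)

  // truncated is 2015-01-02T00:00 (the single nanosecond is dropped),
  // not 2015-01-02 00:00:00.000001 as Spark currently returns.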