Hi all,

I would like to ask what the community thinks about the way Spark
handles nanoseconds in the Timestamp type.

As far as I can see in the code, Spark assumes microsecond precision.
Therefore, if I specify a timestamp with nanoseconds, I would expect it to
be either truncated to microseconds or rejected with an exception. However,
the current implementation in [1] silently stores the nanosecond value as
microseconds, which produces a wrong timestamp. Consider the example below:

spark.sql("SELECT cast('2015-01-02 00:00:00.000000001' as
TIMESTAMP)").show(false)
+------------------------------------------------+
|CAST(2015-01-02 00:00:00.000000001 AS TIMESTAMP)|
+------------------------------------------------+
|2015-01-02 00:00:00.000001                      |
+------------------------------------------------+
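To make the expectation concrete, here is a minimal Scala sketch (plain
java.time, outside Spark, and assuming the input string always carries
exactly nine fractional digits) of the truncating behaviour I would expect
from the cast; it is an illustration, not Spark's actual code:

import java.time.{LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter
import java.time.temporal.ChronoUnit

// Pattern with a literal '.' and exactly nine fractional digits
// (assumption: the fraction is always nine digits long).
val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSSSS")

val parsed = LocalDateTime.parse("2015-01-02 00:00:00.000000001", formatter)

// Expected: drop the sub-microsecond part, so 1 ns truncates to 0 us.
val truncated = parsed.truncatedTo(ChronoUnit.MICROS)

// Spark's internal representation is microseconds since the epoch.
val micros = truncated.toEpochSecond(ZoneOffset.UTC) * 1000000L +
  truncated.getNano / 1000
// micros corresponds to 2015-01-02 00:00:00.000000 (UTC), not .000001.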

This issue was already raised in SPARK-17914, but I do not see any decision
there.

[1] - org.apache.spark.sql.catalyst.util.DateTimeUtils, toJavaTimestamp,
line 204

Best regards,
Anton
