Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22543#discussion_r220420622
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---
@@ -1018,9 +1018,20 @@ case class TimeAdd(start: Expression, interval: Expression, timeZoneId: Option[S
}
/**
- * Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders
- * that time as a timestamp in the given time zone. For example, 'GMT+1' would yield
- * '2017-07-14 03:40:00.0'.
+ * This is a common function for databases supporting TIMESTAMP WITHOUT TIMEZONE. This function
+ * takes a timezone-agnostic timestamp, interprets it as a timestamp in UTC, and renders that
+ * timestamp as a timestamp in the given time zone.
+ *
+ * However, a timestamp in Spark represents the number of microseconds from the Unix epoch,
+ * which is not timezone-agnostic. So in Spark this function just shifts the timestamp value
+ * from the UTC time zone to the given time zone.
+ *
+ * This function may return a confusing result if the input is a string with a timezone, e.g.
+ * '2018-03-13T06:18:23+00:00'. The reason is that Spark first casts the string to a timestamp
+ * according to the timezone in the string, and finally displays the result by converting the
+ * timestamp to a string according to the session-local timezone.
+ *
+ * We may remove this function in Spark 3.0.
--- End diff --
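
For readers following along, here is a minimal sketch of the behavior the new doc comment describes, assuming a local SparkSession with the session time zone pinned to America/Los_Angeles (the master setting and the chosen time zones are illustrative for this example, not part of the PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_utc_timestamp

val spark = SparkSession.builder()
  .master("local[1]")
  .config("spark.sql.session.timeZone", "America/Los_Angeles")
  .getOrCreate()
import spark.implicits._

// Timezone-agnostic input: the string is cast to a timestamp using the
// session time zone, shifted from UTC to GMT+1, and rendered back in the
// session time zone. The cast and the rendering cancel out, so this prints
// the input plus one hour: 2017-07-14 03:40:00.
Seq("2017-07-14 02:40:00").toDF("ts")
  .select(from_utc_timestamp($"ts", "GMT+1"))
  .show(false)

// Input carrying an explicit offset: the string is cast using its own
// offset (+00:00), shifted by one hour, and then rendered in the session
// time zone (PDT, UTC-7 on this date). That prints 2018-03-13 00:18:23
// rather than the 07:18:23 one might expect, which is the confusing case
// the doc comment warns about.
Seq("2018-03-13T06:18:23+00:00").toDF("ts")
  .select(from_utc_timestamp($"ts", "GMT+1"))
  .show(false)
```
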
The decision has not been made yet. In any case, I won't deprecate it in this PR; we definitely need a new PR for that.