A Bradbury created SPARK-23792:
----------------------------------
Summary: Documentation improvements for datetime functions
Key: SPARK-23792
URL: https://issues.apache.org/jira/browse/SPARK-23792
Project: Spark
Issue Type: Documentation
Components: Documentation, SQL
Affects Versions: 2.3.0
Reporter: A Bradbury
Added details about the supported column input types, the column return type,
behaviour on invalid input, supporting examples and clarifications to the
datetime functions in `org.apache.spark.sql.functions` for Java/Scala.
These changes stemmed from confusion over the behaviour of the `date_add`
method. On first use I thought it would add the specified number of days to the
input timestamp, but it also truncates (casts) the input timestamp to a date,
losing the time part.
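A minimal sketch of the behaviour described above, written against plain
`java.time` rather than Spark so that it runs without a cluster. The class and
method names are hypothetical and this is not Spark's implementation; it only
mirrors the "cast to date, then add days" semantics next to what a first-time
reader might have expected.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

class DateAddSketch {
    // Mirrors the behaviour described for `date_add`:
    // the timestamp is cast to a date first, so the time part is dropped.
    static LocalDate dateAddLike(LocalDateTime ts, int days) {
        return ts.toLocalDate().plusDays(days);
    }

    // What a first-time reader might expect instead: the time part survives.
    static LocalDateTime addDaysKeepingTime(LocalDateTime ts, int days) {
        return ts.plusDays(days);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2018, 3, 24, 21, 18, 47);
        System.out.println(dateAddLike(ts, 1));        // 2018-03-25
        System.out.println(addDaysKeepingTime(ts, 1)); // 2018-03-25T21:18:47
    }
}
```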
Some examples:
* Noted that the week used by the `dayofweek` method starts on a Sunday
* Corrected documentation for methods such as `last_day` that listed only one
type of input, e.g. "date column" changed to "date, timestamp or string"
* Renamed the parameters of the `months_between` method to match those of the
`datediff` method and to indicate which parameter is expected to come before
the other chronologically
* `from_unixtime` documentation referenced the "given format" when there was
no format parameter
* Documentation for the `to_timestamp` methods stated that a unix timestamp in
seconds would be returned (implying a value such as 1521926327) when they
actually return the input cast to a timestamp type
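To make the last point concrete, here is a hedged sketch in plain `java.time`
of the difference between a timestamp value and a unix timestamp in seconds.
The class and method names are hypothetical, and UTC is an assumption made for
the example (Spark's result depends on the configured time zone). Under that
assumption, 1521926327 corresponds to 2018-03-24 21:18:47.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class TimestampVsEpochSketch {
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Analogue of casting a string to a timestamp: the result is a date-time value.
    static LocalDateTime toTimestamp(String s) {
        return LocalDateTime.parse(s, FMT);
    }

    // Analogue of a unix timestamp: seconds since 1970-01-01 00:00:00 (UTC assumed).
    static long toUnixSeconds(String s) {
        return toTimestamp(s).toEpochSecond(ZoneOffset.UTC);
    }

    // The reverse direction: seconds since the epoch back to a formatted string.
    static String fromUnixSeconds(long seconds) {
        return LocalDateTime.ofEpochSecond(seconds, 0, ZoneOffset.UTC).format(FMT);
    }

    public static void main(String[] args) {
        String input = "2018-03-24 21:18:47";
        System.out.println(toTimestamp(input));          // 2018-03-24T21:18:47
        System.out.println(toUnixSeconds(input));        // 1521926327
        System.out.println(fromUnixSeconds(1521926327L)); // 2018-03-24 21:18:47
    }
}
```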
Some observations:
* The first day of the week is a Sunday according to the `dayofweek` method,
but a Monday according to the `weekofyear` method
* The `datediff` method returns an integer value, even with timestamp input,
whereas the `months_between` method returns a double, which seems inconsistent
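The two observations above can be illustrated with a small `java.time` sketch.
It is not Spark code: the class and method names are hypothetical, and it
assumes that `weekofyear` follows the ISO-8601 rule (weeks start on Monday)
and that `datediff` truncates both inputs to dates before subtracting.
`months_between` is not sketched here.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.IsoFields;

class WeekAndDiffSketch {
    // Sunday-first numbering (1 = Sunday ... 7 = Saturday), as noted for `dayofweek`.
    // java.time itself numbers days 1 = Monday ... 7 = Sunday, hence the shift.
    static int sundayFirstDayOfWeek(LocalDate d) {
        return d.getDayOfWeek().getValue() % 7 + 1;
    }

    // ISO-8601 week number, where weeks start on Monday (assumed for `weekofyear`).
    static int isoWeekOfYear(LocalDate d) {
        return d.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // Whole-day difference after truncating both timestamps to dates.
    static int dateDiffLike(LocalDateTime end, LocalDateTime start) {
        return (int) ChronoUnit.DAYS.between(start.toLocalDate(), end.toLocalDate());
    }

    public static void main(String[] args) {
        LocalDate sunday = LocalDate.of(2018, 3, 25);
        LocalDate monday = LocalDate.of(2018, 3, 26);
        System.out.println(sundayFirstDayOfWeek(sunday)); // 1
        System.out.println(isoWeekOfYear(sunday));        // 12
        System.out.println(isoWeekOfYear(monday));        // 13

        // Two minutes apart in wall-clock time, but one calendar day apart.
        LocalDateTime start = LocalDateTime.of(2018, 3, 24, 23, 59);
        LocalDateTime end = LocalDateTime.of(2018, 3, 25, 0, 1);
        System.out.println(dateDiffLike(end, start));     // 1
    }
}
```

Under Sunday-first numbering, 25 March starts a new week, yet the ISO week
number only changes on Monday 26 March, which is the mismatch noted above.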