MrPowers opened a new pull request #31000:
URL: https://github.com/apache/spark/pull/31000
### What changes were proposed in this pull request?
This PR adds an `add_hours` function. Here's how users currently need to add
hours to a time column:
```scala
df.withColumn("plus_2_hours", expr("first_datetime + INTERVAL 2 hours"))
```
We don't want to make users manipulate strings in their Scala code. We also
don't want to force users to pass around column names when they should be
passing around Column objects.
The `add_hours` function will make this logic a lot more intuitive and
consistent with the rest of the API:
```scala
df.withColumn("plus_2_hours", add_hours(col("first_datetime"), lit(2)))
```
The [Stack Overflow question on this
issue](https://stackoverflow.com/questions/40883084/adding-12-hours-to-datetime-column-in-spark)
has 21,000 views, so this feature will be useful for a lot of users.
[Spark 3 made some awesome improvements to the dates / times
APIs](https://databricks.com/blog/2020/07/22/a-comprehensive-look-at-dates-and-timestamps-in-apache-spark-3-0.html)
and this PR is one example of an improvement that'll continue making these
APIs easier to use.
### Why are the changes needed?
There are the `INTERVAL` and UDF work-arounds, so this isn't strictly
needed, but it makes the API a lot easier to work with when performing
hour-based computations. It'll also make the answer easier to find. It's not
easy to find the `INTERVAL` solution in [the
docs](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/functions$.html).
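For reference, here's a minimal sketch of what the two work-arounds mentioned above can look like today, using only existing Spark APIs. The names `AddHoursWorkarounds`, `addHoursViaInterval`, and `addHoursUdf` are hypothetical helpers made up for illustration. They are not part of Spark and not how this PR implements `add_hours`.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, expr, lit, udf}

object AddHoursWorkarounds {

  // Work-around 1 (hypothetical helper): hide the INTERVAL string behind a
  // function so callers pass a Column instead of building SQL by hand.
  // The hour count has to be a plain Int because it is spliced into the
  // interval literal.
  def addHoursViaInterval(ts: Column, hours: Int): Column =
    ts + expr(s"INTERVAL $hours HOURS")

  // Work-around 2 (hypothetical helper): a UDF that shifts the timestamp by
  // whole hours in milliseconds. Catalyst treats a UDF as a black box, so it
  // cannot optimize through it.
  val addHoursUdf = udf((ts: Timestamp, hours: Int) =>
    if (ts == null) null else new Timestamp(ts.getTime + hours * 3600000L))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("add-hours-workarounds")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Timestamp.valueOf("2021-01-01 10:00:00")).toDF("first_datetime")

    // Both derived columns should be two hours later than first_datetime.
    df.withColumn("via_interval", addHoursViaInterval(col("first_datetime"), 2))
      .withColumn("via_udf", addHoursUdf(col("first_datetime"), lit(2)))
      .show(false)

    spark.stop()
  }
}
```

Neither helper accepts an arbitrary Column for the hour count without falling back to a UDF, which is part of the gap a built-in `add_hours(col, col)` would close.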
### Does this PR introduce _any_ user-facing change?
Yes, this adds the `add_hours` function to the
`org.apache.spark.sql.functions` object, which is a public-facing API. The
`@since` function annotation will need to be updated with the right version if
this ends up getting merged in.
### How was this patch tested?
The function was unit tested. The unit tests follow the testing patterns of
similar SQL functions.