utkarsh39 opened a new pull request #30303:
URL: https://github.com/apache/spark/pull/30303
### What changes were proposed in this pull request?
The following query produces incorrect results:
```
SELECT date_trunc('minute', '1769-10-17 17:10:02')
```
Steps to reproduce (run the following commands in spark-shell):
```
spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
spark.sql("SELECT date_trunc('minute', '1769-10-17 17:10:02')").show()
```
This happens because `truncTimestamp` in
`org.apache.spark.sql.catalyst.util.DateTimeUtils` incorrectly assumes that a
time zone offset can never have second-level granularity, and therefore skips
the time zone adjustment when truncating the given timestamp to `minute`.
This assumption is currently applied when truncating timestamps to
`microsecond`, `millisecond`, `second`, or `minute`.
This PR fixes the issue by always applying the time zone adjustment when
truncating timestamps, regardless of the truncation unit.
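For context, here is a standalone sketch of why the assumption fails, using plain `java.time` rather than Spark internals (the class name `TruncDemo` is illustrative). Before standard time was adopted in 1883, `America/Los_Angeles` used local mean time with an offset of `-07:52:58`, so the offset itself has a seconds component, and flooring the underlying UTC instant to a whole minute does not land on a whole local minute:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class TruncDemo {
    public static void main(String[] args) {
        ZoneId la = ZoneId.of("America/Los_Angeles");
        // The timestamp from the failing query, interpreted in the session time zone.
        LocalDateTime local = LocalDateTime.of(1769, 10, 17, 17, 10, 2);

        // Local mean time era: the offset has second-level granularity.
        ZoneOffset offset = la.getRules().getOffset(local);
        System.out.println(offset);  // -07:52:58

        // Buggy approach (truncation without time zone adjustment):
        // floor the UTC instant to a whole minute, then convert back.
        // The seconds component of the offset leaks into the result,
        // so the local time is not on a whole-minute boundary.
        Instant flooredUtc = local.atZone(la).toInstant()
                .truncatedTo(ChronoUnit.MINUTES);
        System.out.println(LocalDateTime.ofInstant(flooredUtc, la));

        // Time-zone-aware truncation, as the fix intends:
        System.out.println(local.truncatedTo(ChronoUnit.MINUTES));  // 1769-10-17T17:10
    }
}
```

Truncation units of an hour or coarser were unaffected only by accident; once offsets with a seconds component exist, every truncation unit needs the adjustment.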
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added new tests to `DateTimeUtilsSuite` that failed before this change and pass now.