bersprockets commented on PR #36546: URL: https://github.com/apache/spark/pull/36546#issuecomment-1126602057
This PR brings `Date` in line with `Timestamp` (that is, time-zone aware). But even `Timestamp` sequences have some anomalies, e.g. (from a Spark without my change, in the America/Los_Angeles time zone):
```
spark-sql> select element_at(sequence(timestamp'2021-01-01', timestamp'2021-01-01' + interval 82 hours * 97, interval 82 hours), 97) as a;
2021-11-24 23:00:00
Time taken: 0.076 seconds, Fetched 1 row(s)
spark-sql> select timestamp'2021-01-01' + interval 82 hours * 96 as x;
2021-11-25 00:00:00
Time taken: 0.053 seconds, Fetched 1 row(s)
spark-sql>
```
The 96th (origin 0) element of the sequence from the first query is 1 hour less than the result of the second query. One would think they should be the same (both supposedly being `'2021-01-01' + interval 82 hours * 96`), but the "fall back" is being handled differently around element 92 (origin 0) of the sequence.

`Date` sequences also have (and will continue to have, after this PR) the same anomaly:
```
spark-sql> select date'2021-01-01' + interval 82 hours * 96 as x;
2021-11-25 00:00:00
Time taken: 4.146 seconds, Fetched 1 row(s)
spark-sql> select element_at(sequence(date'2021-01-01', date'2022-01-05', interval 82 hours), 97) as a;
2021-11-24
Time taken: 0.125 seconds, Fetched 1 row(s)
spark-sql>
```
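The root of this kind of anomaly is that, across a DST transition, there are two reasonable ways to add an hour-based interval to a time-zone-aware value: shift the local wall clock, or shift the absolute instant. The sketch below is not Spark's implementation; it is a minimal Python illustration (using the stdlib `zoneinfo`) of how the two approaches diverge by exactly one hour across the America/Los_Angeles "fall back":

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Los_Angeles")
# Start just before the 2021-11-07 fall-back transition.
start = datetime(2021, 11, 1, tzinfo=tz)
step = timedelta(days=7)  # 168 hours, crossing the transition

# Wall-clock arithmetic: Python's aware datetime + timedelta shifts the
# local calendar fields and re-resolves the offset afterwards.
wall = start + step

# Absolute-time arithmetic: convert to UTC, add the same duration as an
# exact number of elapsed seconds, convert back.
absolute = (start.astimezone(timezone.utc) + step).astimezone(tz)

print(wall)      # 2021-11-08 00:00:00-08:00 (same local clock reading)
print(absolute)  # 2021-11-07 23:00:00-08:00 (same elapsed time, 1h earlier local)
```

If a sequence generator uses one style of addition internally while `timestamp + interval` uses the other (or mixes them at different points in the loop), the Nth element and the directly computed `start + interval * N` can land an hour apart, which is the shape of the discrepancy shown above.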
