[ https://issues.apache.org/jira/browse/SPARK-31359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-31359.
---------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 28163
[https://github.com/apache/spark/pull/28163]

> Speed up timestamps rebasing
> ----------------------------
>
>                 Key: SPARK-31359
>                 URL: https://issues.apache.org/jira/browse/SPARK-31359
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Maxim Gekk
>            Assignee: Maxim Gekk
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Currently, timestamps are rebased by converting them to local timestamps
> and back to microseconds. This is a CPU-intensive operation that can be
> avoided by converting via pre-calculated tables, one per time zone. For
> example, below are the timestamps at which the diff changes in the
> America/Los_Angeles time zone for the range 0001-01-01...2100-01-01:
> {code}
> 0001-01-01T00:00 diff = -2872 minutes
> 0100-03-01T00:00 diff = -1432 minutes
> 0200-03-01T00:00 diff = 7 minutes
> 0300-03-01T00:00 diff = 1447 minutes
> 0500-03-01T00:00 diff = 2887 minutes
> 0600-03-01T00:00 diff = 4327 minutes
> 0700-03-01T00:00 diff = 5767 minutes
> 0900-03-01T00:00 diff = 7207 minutes
> 1000-03-01T00:00 diff = 8647 minutes
> 1100-03-01T00:00 diff = 10087 minutes
> 1300-03-01T00:00 diff = 11527 minutes
> 1400-03-01T00:00 diff = 12967 minutes
> 1500-03-01T00:00 diff = 14407 minutes
> 1582-10-15T00:00 diff = 7 minutes
> 1883-11-18T12:22:58 diff = 0 minutes
> {code}
> It seems possible to build rebasing maps from such switch points and to
> perform rebasing via the maps.
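
The table-based rebasing proposed in the description can be sketched as a sorted array of switch points plus a binary search. The following is a minimal illustration in Python, not the code from pull request 28163. The names `rebase_micros`, `SWITCHES` and `DIFFS` are hypothetical. Two things are assumptions, because the issue text specifies neither: the switch points are read as wall-clock microseconds since 1970-01-01 on Python's proleptic Gregorian calendar, and the diff is added to the input value.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# (switch point, diff in minutes) pairs copied from the issue description for
# America/Los_Angeles. Each diff applies from its switch point up to the next.
_TABLE = [
    ("0001-01-01T00:00", -2872),
    ("0100-03-01T00:00", -1432),
    ("0200-03-01T00:00", 7),
    ("0300-03-01T00:00", 1447),
    ("0500-03-01T00:00", 2887),
    ("0600-03-01T00:00", 4327),
    ("0700-03-01T00:00", 5767),
    ("0900-03-01T00:00", 7207),
    ("1000-03-01T00:00", 8647),
    ("1100-03-01T00:00", 10087),
    ("1300-03-01T00:00", 11527),
    ("1400-03-01T00:00", 12967),
    ("1500-03-01T00:00", 14407),
    ("1582-10-15T00:00", 7),
    ("1883-11-18T12:22:58", 0),
]

_EPOCH = datetime(1970, 1, 1)
_MICROS_PER_MINUTE = 60 * 1_000_000


def to_micros(iso: str) -> int:
    """Wall-clock microseconds since 1970-01-01 (assumption: no zone offset applied)."""
    return (datetime.fromisoformat(iso) - _EPOCH) // timedelta(microseconds=1)


# Pre-calculated once per time zone: parallel arrays sorted by switch point.
SWITCHES = [to_micros(ts) for ts, _ in _TABLE]
DIFFS = [minutes * _MICROS_PER_MINUTE for _, minutes in _TABLE]


def rebase_micros(micros: int) -> int:
    """Rebase by table lookup instead of converting through local timestamps."""
    # Index of the last switch point <= micros; clamp values before the table start.
    i = max(bisect_right(SWITCHES, micros) - 1, 0)
    # Assumption: the diff is added; the issue does not state the sign convention.
    return micros + DIFFS[i]
```

Each lookup costs one binary search over 15 entries rather than a calendar conversion in both directions. Any value at or after 1883-11-18T12:22:58 has a diff of 0, so `rebase_micros(0)` returns `0`.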