MaxGekk opened a new pull request #28441:
URL: https://github.com/apache/spark/pull/28441


   ### What changes were proposed in this pull request?
   Skip timestamp rebasing after a global threshold beyond which the Julian and Gregorian calendars no longer differ. This avoids looking up switch points in hash maps and fixes the perf regressions in `toJavaTimestamp()` and `fromJavaTimestamp()`.
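
The idea can be illustrated with a minimal sketch. This is not the code from this PR: the class `RebaseSketch`, its fields, the zone names, and the switch-point values are all hypothetical toy data. It assumes each time zone has a sorted list of switch points with an offset for each, that the final offset is always zero, and that the global threshold is the latest switch point across all zones.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and values are illustrative, not Spark's code or data.
final class RebaseSketch {
    // Per-zone switch points (ascending, microseconds since the epoch) and the
    // offset to add to timestamps that fall at or after each switch point.
    static final class RebaseInfo {
        final long[] switches;
        final long[] diffs;

        RebaseInfo(long[] switches, long[] diffs) {
            this.switches = switches;
            this.diffs = diffs;
        }
    }

    static final Map<String, RebaseInfo> TABLE = new HashMap<>();
    // Latest switch point across all zones: from here on every offset is zero.
    static long lastSwitch = Long.MIN_VALUE;
    // Counts slow-path lookups so the effect of the threshold is observable.
    static int lookups = 0;

    static void register(String zoneId, long[] switches, long[] diffs) {
        TABLE.put(zoneId, new RebaseInfo(switches, diffs));
        lastSwitch = Math.max(lastSwitch, switches[switches.length - 1]);
    }

    static {
        // Toy values (assumption); the final offset of every zone is 0.
        register("Zone/A", new long[] {-1000L, -500L}, new long[] {20L, 0L});
        register("Zone/B", new long[] {-2000L, -100L}, new long[] {30L, 0L});
    }

    static long rebase(String zoneId, long micros) {
        // Fast path: past the global threshold no zone needs an adjustment,
        // so the hash map is never consulted.
        if (micros >= lastSwitch) {
            return micros;
        }
        // Slow path: find the zone's table, then the last switch point that is
        // not after the timestamp. Earlier timestamps use the first offset.
        lookups++;
        RebaseInfo info = TABLE.get(zoneId);
        if (info == null) {
            return micros; // unknown zone: left unchanged in this sketch
        }
        int i = info.switches.length - 1;
        while (i > 0 && micros < info.switches[i]) {
            i--;
        }
        return micros + info.diffs[i];
    }

    public static void main(String[] args) {
        System.out.println(rebase("Zone/A", 0L));    // fast path
        System.out.println(rebase("Zone/A", -700L)); // slow path
        System.out.println(lookups);
    }
}
```

Because the threshold is global rather than per zone, a timestamp between one zone's last switch point and the threshold still takes the slow path and gets a zero offset. Modern timestamps, which dominate typical workloads, skip the lookup entirely.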
   
   ### Why are the changes needed?
   The changes fix perf regressions in conversions to and from the external type `java.sql.Timestamp`.
   
   Before (see the results in https://github.com/apache/spark/pull/28440):
   ```
   ================================================================================================
   Conversion from/to external types
   ================================================================================================

   OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws
   Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
   To/from Java's date-time:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   From java.sql.Timestamp                             376            388          10         13.3          75.2       1.1X
   Collect java.sql.Timestamp                         1878           1937          64          2.7         375.6       0.2X
   ```
   
   After:
   ```
   ================================================================================================
   Conversion from/to external types
   ================================================================================================

   OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws
   Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
   To/from Java's date-time:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   From java.sql.Timestamp                             249            264          24         20.1          49.8       1.7X
   Collect java.sql.Timestamp                         1503           1523          24          3.3         300.5       0.3X
   ```
   
   Average perf improvements:
   
   1. From `java.sql.Timestamp`: ~34%
   2. To `java.sql.Timestamp`: ~16%
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   By existing test suites `DateTimeUtilsSuite` and `RebaseDateTimeSuite`.




