MaxGekk opened a new pull request #25871: [SPARK-29190][SQL] Optimize `extract`/`date_part` for the milliseconds `field`
URL: https://github.com/apache/spark/pull/25871
 
 
   ### What changes were proposed in this pull request?
   
   Changed `DateTimeUtils.getMilliseconds()` to avoid the decimal division. Instead, the scale and precision are set directly while converting microseconds to the decimal type.
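
The idea can be illustrated outside Spark. The sketch below is Python with the standard `decimal` module, not the Scala code touched by this PR; the function names `millis_by_division` and `millis_by_scale` are hypothetical, and it assumes the input is an integer count of microseconds. It shows that shifting the scale of the integer by three digits yields the same value as dividing by 1000, without performing a decimal division.

```python
from decimal import Decimal

MICROS_PER_MILLI = 1000


def millis_by_division(micros: int) -> Decimal:
    """Convert to a decimal first, then divide (the approach being avoided)."""
    return Decimal(micros) / Decimal(MICROS_PER_MILLI)


def millis_by_scale(micros: int) -> Decimal:
    """Treat the integer as an unscaled value and move the decimal point by 3 digits."""
    # scaleb(-3) only adjusts the exponent; no division is performed.
    return Decimal(micros).scaleb(-3)


micros = 45_123_456  # e.g. 45.123456 seconds expressed in microseconds
print(millis_by_division(micros))  # 45123.456
print(millis_by_scale(micros))     # 45123.456
```

Both functions return numerically equal results; the second keeps a fixed scale of 3, which mirrors setting the scale explicitly instead of deriving it from a division.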
   
   ### Why are the changes needed?
   This improves the performance of `extract` and `date_part` for the milliseconds field by more than **50 times**:
   Before:
   ```
   Invoke extract for timestamp:             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   cast to timestamp                                   397            428          45         25.2          39.7       1.0X
   MILLISECONDS of timestamp                         36723          36761          63          0.3        3672.3       0.0X
   ```
   After:
   ```
   Invoke extract for timestamp:             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   cast to timestamp                                   278            284           6         36.0          27.8       1.0X
   MILLISECONDS of timestamp                           592            606          13         16.9          59.2       0.5X
   ```
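
As a quick check of the claimed improvement, the following Python sketch recomputes the ratios from the Best Time(ms) column of the two tables. The variable names are illustrative only. Note that the `cast to timestamp` baseline also differs between the two runs, so the figures are approximate.

```python
# Best Time (ms) values copied from the "Before" and "After" benchmark tables.
before = {"cast": 397, "milliseconds": 36723}
after = {"cast": 278, "milliseconds": 592}

# Speedup of the MILLISECONDS case between the two runs.
speedup = before["milliseconds"] / after["milliseconds"]

# Cost of MILLISECONDS relative to the plain cast baseline in each run.
slowdown_before = before["milliseconds"] / before["cast"]
slowdown_after = after["milliseconds"] / after["cast"]

print(f"speedup: {speedup:.1f}x")                      # speedup: 62.0x
print(f"vs. cast, before: {slowdown_before:.1f}x")     # vs. cast, before: 92.5x
print(f"vs. cast, after: {slowdown_after:.1f}x")       # vs. cast, after: 2.1x
```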
   
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   By the existing test suite `DateExpressionsSuite`.
