[ https://issues.apache.org/jira/browse/SPARK-31426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-31426.
---------------------------------
Fix Version/s: 3.0.0
Resolution: Fixed
Issue resolved by pull request 28189
[https://github.com/apache/spark/pull/28189]
> Regression in loading/saving timestamps from/to ORC files
> ---------------------------------------------------------
>
> Key: SPARK-31426
> URL: https://issues.apache.org/jira/browse/SPARK-31426
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Maxim Gekk
> Assignee: Maxim Gekk
> Priority: Major
> Fix For: 3.0.0
>
>
> Here are the results of DateTimeRebaseBenchmark on the current master branch:
> {code}
> Save timestamps to ORC:     Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> after 1582                          59877          59877           0         1.7         598.8       0.0X
> before 1582                         61361          61361           0         1.6         613.6       0.0X
> Load timestamps from ORC:   Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> after 1582, vec off                 48197          48288         118         2.1         482.0       1.0X
> after 1582, vec on                  38247          38351         128         2.6         382.5       1.3X
> before 1582, vec off                53179          53359         249         1.9         531.8       0.9X
> before 1582, vec on                 44076          44268         269         2.3         440.8       1.1X
> {code}
> The results of the same benchmark on Spark 2.4.6-SNAPSHOT:
> {code}
> Save timestamps to ORC:     Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> after 1582                          18858          18858           0         5.3         188.6       1.0X
> before 1582                         18508          18508           0         5.4         185.1       1.0X
> Load timestamps from ORC:   Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> after 1582, vec off                 14063          14177         143         7.1         140.6       1.0X
> after 1582, vec on                   5955           6029         100        16.8          59.5       2.4X
> before 1582, vec off                14119          14126           7         7.1         141.2       1.0X
> before 1582, vec on                  5991           6007          25        16.7          59.9       2.3X
> {code}
> Here is the PR with DateTimeRebaseBenchmark backported to 2.4:
> https://github.com/MaxGekk/spark/pull/27
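
The size of the regression is easier to see when the two quoted benchmark runs are compared row by row. The sketch below is only an illustration and is not part of the issue or the fix: the names `best_ms`, `slowdown` and `slowdowns` are made up here, and the only inputs are the "Best Time(ms)" values copied from the two tables in the quoted description (master first, 2.4.6-SNAPSHOT second). It says nothing about performance after pull request 28189.

```python
# "Best Time(ms)" values copied from the quoted benchmark tables.
# Each entry is (master at the time of the report, Spark 2.4.6-SNAPSHOT).
# The dictionary and function names are hypothetical, chosen for this sketch.
best_ms = {
    "save, after 1582": (59877, 18858),
    "save, before 1582": (61361, 18508),
    "load, after 1582, vec off": (48197, 14063),
    "load, after 1582, vec on": (38247, 5955),
    "load, before 1582, vec off": (53179, 14119),
    "load, before 1582, vec on": (44076, 5991),
}


def slowdown(master_ms, baseline_ms):
    """Return how many times slower master is than the 2.4 baseline."""
    return master_ms / baseline_ms


# Slowdown factor per benchmark case.
slowdowns = {case: slowdown(m, b) for case, (m, b) in best_ms.items()}

if __name__ == "__main__":
    for case, factor in slowdowns.items():
        print(f"{case:<28} {factor:.1f}x slower on master")
```

With these numbers every case comes out more than 3x slower on master than on 2.4.6-SNAPSHOT, and the vectorized loads ("vec on") show the largest factors.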