[
https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Gekk updated SPARK-31443:
-------------------------------
Description:
DateTimeBenchmark shows the regression
Spark 2.4.6-SNAPSHOT at the PR [https://github.com/MaxGekk/spark/pull/27]
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from Java's date-time: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
>From java.sql.Date 559 603
> 38 8.9 111.8 1.0X
Collect dates 2306 3221
1558 2.2 461.1 0.2X
{code}
Current master:
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from Java's date-time: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
>From java.sql.Date 1052 1130
> 73 4.8 210.3 1.0X
Collect dates 3251 4943
1624 1.5 650.2 0.3X
{code}
If we subtract preparing DATE column:
* Spark 2.4.6-SNAPSHOT is (461.1 - 111.8) = 349.3 ns/row
* master is (650.2 - 210.3) = 439 ns/row
The regression of toJavaDate in master against Spark 2.4.6-SNAPSHOT is (439 -
349.3)/349.3 = 25%
was:
DateTimeBenchmark shows the regression
Spark 2.4.6-SNAPSHOT at the PR https://github.com/MaxGekk/spark/pull/27
{code}
================================================================================================
Conversion from/to external types
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from java.sql.Timestamp: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
>From java.sql.Date 614 655
> 43 8.1 122.8 1.0X
{code}
Current master:
{code}
================================================================================================
Conversion from/to external types
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from java.sql.Timestamp: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
>From java.sql.Date 1154 1206
> 46 4.3 230.9 1.0X
{code}
The regression is ~x2.
> Perf regression of toJavaDate
> -----------------------------
>
> Key: SPARK-31443
> URL: https://issues.apache.org/jira/browse/SPARK-31443
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Maxim Gekk
> Priority: Major
>
> DateTimeBenchmark shows the regression
> Spark 2.4.6-SNAPSHOT at the PR [https://github.com/MaxGekk/spark/pull/27]
> {code:java}
> OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
> 4.15.0-1063-aws
> Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
> To/from Java's date-time: Best Time(ms) Avg Time(ms)
> Stdev(ms) Rate(M/s) Per Row(ns) Relative
> ------------------------------------------------------------------------------------------------------------------------
> From java.sql.Date 559 603
> 38 8.9 111.8 1.0X
> Collect dates 2306 3221
> 1558 2.2 461.1 0.2X
> {code}
> Current master:
> {code:java}
> OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux
> 4.15.0-1063-aws
> Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
> To/from Java's date-time: Best Time(ms) Avg Time(ms)
> Stdev(ms) Rate(M/s) Per Row(ns) Relative
> ------------------------------------------------------------------------------------------------------------------------
> From java.sql.Date 1052 1130
> 73 4.8 210.3 1.0X
> Collect dates 3251 4943
> 1624 1.5 650.2 0.3X
> {code}
> If we subtract preparing DATE column:
> * Spark 2.4.6-SNAPSHOT is (461.1 - 111.8) = 349.3 ns/row
> * master is (650.2 - 210.3) = 439 ns/row
> The regression of toJavaDate in master against Spark 2.4.6-SNAPSHOT is (439 -
> 349.3)/349.3 = 25%
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]