ASF GitHub Bot updated SPARK-42789:
-----------------------------------
Labels: pull-request-available (was: )
> Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is the same
> -----------------------------------------------------------------------------------
>
> Key: SPARK-42789
> URL: https://issues.apache.org/jira/browse/SPARK-42789
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: Yuming Wang
> Priority: Major
> Labels: pull-request-available
>
> Benchmark result:
> {noformat}
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 2
> Stopped after 2 iterations, 77193 ms
> Running case: Rewrite: 2
> Stopped after 2 iterations, 51699 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 2                                  37914         38597        966        0.2       5244.0      1.0X
> Rewrite: 2                                  24887         25850       1361        0.3       3442.2      1.5X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 3
> Stopped after 2 iterations, 110890 ms
> Running case: Rewrite: 3
> Stopped after 2 iterations, 56102 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 3                                  52862         55445        NaN        0.1       7311.6      1.0X
> Rewrite: 3                                  26752         28051       1837        0.3       3700.2      2.0X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 4
> Stopped after 2 iterations, 150828 ms
> Running case: Rewrite: 4
> Stopped after 2 iterations, 57110 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 4                                  71680         75414        NaN        0.1       9914.4      1.0X
> Rewrite: 4                                  28452         28555        145        0.3       3935.4      2.5X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 5
> Stopped after 2 iterations, 223367 ms
> Running case: Rewrite: 5
> Stopped after 2 iterations, 78193 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 5                                 108479        111684       1447        0.1      15004.2      1.0X
> Rewrite: 5                                  36830         39097        NaN        0.2       5094.0      2.9X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 10
> Stopped after 2 iterations, 311453 ms
> Running case: Rewrite: 10
> Stopped after 2 iterations, 65873 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 10                                153952        155727       2510        0.0      21293.7      1.0X
> Rewrite: 10                                 32436         32937        708        0.2       4486.3      4.7X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 15
> Stopped after 2 iterations, 451911 ms
> Running case: Rewrite: 15
> Stopped after 2 iterations, 69790 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 15                                224950        225956       1423        0.0      31113.6      1.0X
> Rewrite: 15                                 34806         34895        126        0.2       4814.2      6.5X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 20
> Stopped after 2 iterations, 587378 ms
> Running case: Rewrite: 20
> Stopped after 2 iterations, 76667 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 20                                293155        293689        756        0.0      40547.3      1.0X
> Rewrite: 20                                 38148         38334        262        0.2       5276.4      7.7X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 25
> Stopped after 2 iterations, 732228 ms
> Running case: Rewrite: 25
> 00:38:55.284 WARN org.apache.spark.sql.catalyst.util.package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> Stopped after 2 iterations, 93089 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 25                                366037        366114        110        0.0      50627.9      1.0X
> Rewrite: 25                                 45725         46545       1159        0.2       6324.4      8.0X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 30
> Stopped after 2 iterations, 881176 ms
> Running case: Rewrite: 30
> Stopped after 2 iterations, 107241 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 30                                439567        440588       1444        0.0      60798.2      1.0X
> Rewrite: 30                                 52949         53621        950        0.1       7323.5      8.3X
> Running benchmark: Benchmark rewrite GetJsonObjects
> Running case: Default: 36
> Stopped after 2 iterations, 1055559 ms
> Running case: Rewrite: 36
> Stopped after 2 iterations, 124081 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark rewrite GetJsonObjects:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> ------------------------------------------------------------------------------------------------------------------------
> Default: 35                                526481        527780       1837        0.0      72819.5      1.0X
> Rewrite: 35                                 60586         62041       2058        0.1       8379.8      8.7X
> {noformat}
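
The Relative column in the benchmark output appears to be the Default best time divided by the Rewrite best time. The following is a minimal Python sketch that recomputes that ratio from the Best Time(ms) values reported above, assuming that definition holds. The names `best_ms` and `speedup` are illustrative only and are not part of Spark's benchmark tooling. The last case is keyed as 35, following the table rows (the run log labels it 36).

```python
# Best Time(ms) copied from the benchmark tables:
# number of GetJsonObject expressions -> (default_ms, rewrite_ms)
best_ms = {
    2: (37914, 24887),
    3: (52862, 26752),
    4: (71680, 28452),
    5: (108479, 36830),
    10: (153952, 32436),
    15: (224950, 34806),
    20: (293155, 38148),
    25: (366037, 45725),
    30: (439567, 52949),
    35: (526481, 60586),  # table label; the run log says 36
}


def speedup(default_ms: int, rewrite_ms: int) -> float:
    """Assumed definition of Relative: default best time / rewrite best time."""
    return default_ms / rewrite_ms


# Round to one decimal place to match the precision of the Relative column.
speedups = {n: round(speedup(d, r), 1) for n, (d, r) in best_ms.items()}

for n, s in speedups.items():
    print(f"{n:>2} expressions: {s}X")
```

Under this assumption the recomputed ratios match the reported Relative values, rising from 1.5X with 2 expressions to 8.7X in the largest case.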