sunchao commented on a change in pull request #34611:
URL: https://github.com/apache/spark/pull/34611#discussion_r751663494
##########
File path: sql/core/benchmarks/DataSourceReadBenchmark-results.txt
##########
@@ -1,252 +1,275 @@
+================================================================================================
+SQL Single Boolean Column Scan
+================================================================================================
+
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1020-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+SQL Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+------------------------------------------------------------------------------------------------------------------------
+SQL CSV 13472 13878 574 1.2 856.5 1.0X
+SQL Json 10036 10477 623 1.6 638.0 1.3X
+SQL Parquet Vectorized 144 167 12 109.2 9.2 93.5X
+SQL Parquet MR 2224 2230 7 7.1 141.4 6.1X
+SQL ORC Vectorized 191 203 6 82.3 12.2 70.5X
+SQL ORC MR 1865 1870 7 8.4 118.6 7.2X
+
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1020-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Parquet Reader Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+-------------------------------------------------------------------------------------------------------------------------
+ParquetReader Vectorized 119 125 8 131.9 7.6 1.0X
+ParquetReader Vectorized -> Row 60 63 2 260.2 3.8 2.0X
+
+
================================================================================================
SQL Single Numeric Column Scan
================================================================================================
-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1020-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
SQL Single TINYINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 15943 15956 18 1.0 1013.6 1.0X
-SQL Json 9109 9158 70 1.7 579.1 1.8X
-SQL Parquet Vectorized 168 191 16 93.8 10.7 95.1X
-SQL Parquet MR 1938 1950 17 8.1 123.2 8.2X
-SQL ORC Vectorized 191 199 6 82.2 12.2 83.3X
-SQL ORC MR 1523 1537 20 10.3 96.8 10.5X
-
-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
+SQL CSV 16820 16859 54 0.9 1069.4 1.0X
+SQL Json 11583 11586 4 1.4 736.4 1.5X
+SQL Parquet Vectorized 164 177 11 96.0 10.4 102.7X
+SQL Parquet MR 2839 2857 25 5.5 180.5 5.9X
+SQL ORC Vectorized 150 161 7 104.8 9.5 112.1X
+SQL ORC MR 1915 1923 12 8.2 121.7 8.8X
+
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1020-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Parquet Reader Single TINYINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized 203 206 3 77.5 12.9 1.0X
-ParquetReader Vectorized -> Row 97 100 2 161.6 6.2 2.1X
+ParquetReader Vectorized 211 218 5 74.6 13.4 1.0X
+ParquetReader Vectorized -> Row 286 293 7 55.1 18.2 0.7X
Review comment:
Yea, not sure what's going on, but I highly doubt it's related to this PR.
Also, "ParquetReader Vectorized" stays the same, so it's related to
the columnar -> row path.
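
For anyone who wants to check the numbers behind that remark, here is a rough sketch that compares the old and new "Best Time(ms)" values for the two TINYINT Parquet reader rows in the hunk above. The helper names (`slowdown`, `relative`) are made up for illustration, and the assumption that "Relative" is the baseline best time divided by each row's best time is inferred from the table values, not taken from the benchmark code. Keep in mind the two runs were recorded on different CPUs and JDK builds, as the header lines in the diff show, so the ratios are only indicative.

```python
# Best Time(ms) values copied from the TINYINT "Parquet Reader" rows in the diff.
BEFORE_MS = {
    "ParquetReader Vectorized": 203,
    "ParquetReader Vectorized -> Row": 97,
}
AFTER_MS = {
    "ParquetReader Vectorized": 211,
    "ParquetReader Vectorized -> Row": 286,
}


def slowdown(before_ms, after_ms):
    """Return after/before best-time ratio per case (above 1.0 means slower)."""
    return {name: after_ms[name] / before_ms[name] for name in before_ms}


def relative(best_ms, baseline):
    """Assumed definition of the Relative column: baseline time / case time."""
    return {name: best_ms[baseline] / ms for name, ms in best_ms.items()}


if __name__ == "__main__":
    for name, ratio in slowdown(BEFORE_MS, AFTER_MS).items():
        print(f"{name}: {ratio:.2f}x the previous best time")
    for name, rel in relative(AFTER_MS, "ParquetReader Vectorized").items():
        print(f"{name}: {rel:.1f}X relative to the vectorized baseline")
```

On these inputs the plain vectorized case moves by only a few percent while the "Vectorized -> Row" case takes roughly three times as long, which is consistent with the slowdown sitting in the columnar -> row path.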
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]