dongjoon-hyun commented on a change in pull request #35100:
URL: https://github.com/apache/spark/pull/35100#discussion_r778462233
##########
File path: sql/hive/benchmarks/OrcReadBenchmark-results.txt
##########
@@ -133,24 +133,90 @@ OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Single Column Scan from 100 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       115            132          17          9.1         109.6       1.0X
-Native ORC Vectorized                                65             77          14         16.0          62.5       1.8X
-Hive built-in ORC                                   718            733          26          1.5         684.6       0.2X
+Native ORC MR                                       124            148          27          8.5         118.2       1.0X
+Native ORC Vectorized                                71             82          11         14.8          67.4       1.8X
+Hive built-in ORC                                   782            804          35          1.3         745.6       0.2X
OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Single Column Scan from 200 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       154            177          23          6.8         147.2       1.0X
-Native ORC Vectorized                               104            126          21         10.1          98.8       1.5X
-Hive built-in ORC                                  1318           1358          56          0.8        1256.8       0.1X
+Native ORC MR                                       155            184          31          6.8         147.9       1.0X
+Native ORC Vectorized                               101            130          24         10.4          96.2       1.5X
+Hive built-in ORC                                  1477           1494          25          0.7        1408.7       0.1X
OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Single Column Scan from 300 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       205            232          41          5.1         195.9       1.0X
-Native ORC Vectorized                               148            162          17          7.1         141.4       1.4X
-Hive built-in ORC                                  1889           1942          75          0.6        1801.6       0.1X
+Native ORC MR                                       191            227          29          5.5         182.4       1.0X
+Native ORC Vectorized                               135            153          18          7.7         129.2       1.4X
+Hive built-in ORC                                  2085           2085           0          0.5        1988.1       0.1X
+
+
+================================================================================================
+Struct scan
+================================================================================================
+
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Single Struct Column Scan with 10 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Native ORC MR                                       1126           1149          33          0.9        1073.7       1.0X
+Native ORC Vectorized                               1136           1141           7          0.9        1083.4       1.0X
+Hive built-in ORC                                    589            595           8          1.8         561.4       1.9X
+
+OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Single Struct Column Scan with 100 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------
+Native ORC MR                                        9880           9995         163          0.1        9422.1       1.0X
+Native ORC Vectorized                                9815           9868          75          0.1        9359.9       1.0X
+Hive built-in ORC                                    3292           3382         127          0.3        3139.3       3.0X
Review comment:
Wow. Nice benchmark to spot this issue. Thanks, @bersprockets.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]