c21 commented on a change in pull request #35090:
URL: https://github.com/apache/spark/pull/35090#discussion_r778575137



##########
File path: sql/hive/benchmarks/OrcReadBenchmark-results.txt
##########
@@ -3,220 +3,220 @@ SQL Single Numeric Column Scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single TINYINT Column Scan:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       832           1153         453         18.9          52.9       1.0X
-Native ORC Vectorized                               148            189          24        106.5           9.4       5.6X
-Hive built-in ORC                                   986           1028          59         15.9          62.7       0.8X
+Native ORC MR                                      1102           1123          30         14.3          70.1       1.0X
+Native ORC Vectorized                               177            254          47         89.0          11.2       6.2X
+Hive built-in ORC                                  1356           1396          57         11.6          86.2       0.8X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single SMALLINT Column Scan:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       868            913          60         18.1          55.2       1.0X
-Native ORC Vectorized                               133            150          21        118.6           8.4       6.5X
-Hive built-in ORC                                  1098           1102           6         14.3          69.8       0.8X
+Native ORC MR                                      1030           1054          33         15.3          65.5       1.0X
+Native ORC Vectorized                               218            245          19         72.3          13.8       4.7X
+Hive built-in ORC                                  1511           1543          45         10.4          96.0       0.7X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single INT Column Scan:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       898            917          24         17.5          57.1       1.0X
-Native ORC Vectorized                               155            175          16        101.4           9.9       5.8X
-Hive built-in ORC                                  1114           1126          17         14.1          70.8       0.8X
+Native ORC MR                                      1151           1162          16         13.7          73.2       1.0X
+Native ORC Vectorized                               224            256          20         70.1          14.3       5.1X
+Hive built-in ORC                                  1589           1661         103          9.9         101.0       0.7X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single BIGINT Column Scan:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       897            981         117         17.5          57.0       1.0X
-Native ORC Vectorized                               182            224          40         86.2          11.6       4.9X
-Hive built-in ORC                                  1194           1368         247         13.2          75.9       0.8X
+Native ORC MR                                      1097           1194         136         14.3          69.8       1.0X
+Native ORC Vectorized                               248            274          23         63.4          15.8       4.4X
+Hive built-in ORC                                  1601           1615          19          9.8         101.8       0.7X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single FLOAT Column Scan:             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       968            987          23         16.2          61.6       1.0X
-Native ORC Vectorized                               219            251          41         71.8          13.9       4.4X
-Hive built-in ORC                                  1229           1477         351         12.8          78.1       0.8X
+Native ORC MR                                      1132           1132           1         13.9          71.9       1.0X
+Native ORC Vectorized                               263            287          21         59.8          16.7       4.3X
+Hive built-in ORC                                  1499           1509          15         10.5          95.3       0.8X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 SQL Single DOUBLE Column Scan:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                      1006           1010           5         15.6          64.0       1.0X
-Native ORC Vectorized                               245            265          20         64.2          15.6       4.1X
-Hive built-in ORC                                  1220           1228          12         12.9          77.6       0.8X
+Native ORC MR                                      1220           1236          22         12.9          77.6       1.0X
+Native ORC Vectorized                               280            316          40         56.2          17.8       4.4X
+Hive built-in ORC                                  1581           1679         138          9.9         100.5       0.8X
 
 
 ================================================================================================
 Int and String Scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Int and String Scan:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                      1906           1923          25          5.5         181.8       1.0X
-Native ORC Vectorized                              1057           1067          14          9.9         100.8       1.8X
-Hive built-in ORC                                  2183           2248          92          4.8         208.2       0.9X
+Native ORC MR                                      2327           2375          68          4.5         221.9       1.0X
+Native ORC Vectorized                              1428           1438          14          7.3         136.2       1.6X
+Hive built-in ORC                                  2811           2865          76          3.7         268.1       0.8X
 
 
 ================================================================================================
 Partitioned Table Scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Partitioned Table:                        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Data column - Native ORC MR                        1039           1107          95         15.1          66.1       1.0X
-Data column - Native ORC Vectorized                 181            205          27         86.7          11.5       5.7X
-Data column - Hive built-in ORC                    1344           1353          13         11.7          85.4       0.8X
-Partition column - Native ORC MR                    686            699          12         22.9          43.6       1.5X
-Partition column - Native ORC Vectorized             54             64           6        291.4           3.4      19.3X
-Partition column - Hive built-in ORC                945            956          13         16.6          60.1       1.1X
-Both columns - Native ORC MR                       1107           1115          11         14.2          70.4       0.9X
-Both columns - Native ORC Vectorized                199            258          52         79.2          12.6       5.2X
-Both columns - Hive built-in ORC                   1383           1386           5         11.4          87.9       0.8X
+Data column - Native ORC MR                        1288           1317          40         12.2          81.9       1.0X
+Data column - Native ORC Vectorized                 265            302          28         59.3          16.9       4.9X
+Data column - Hive built-in ORC                    1710           1753          60          9.2         108.7       0.8X
+Partition column - Native ORC MR                    855            891          46         18.4          54.4       1.5X
+Partition column - Native ORC Vectorized             84             96          12        187.7           5.3      15.4X
+Partition column - Hive built-in ORC               1244           1254          15         12.6          79.1       1.0X
+Both columns - Native ORC MR                       1460           1482          31         10.8          92.8       0.9X
+Both columns - Native ORC Vectorized                301            326          23         52.3          19.1       4.3X
+Both columns - Hive built-in ORC                   1780           1830          70          8.8         113.2       0.7X
 
 
 ================================================================================================
 Repeated String Scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Repeated String:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       908            916           8         11.5          86.6       1.0X
-Native ORC Vectorized                               180            218          42         58.4          17.1       5.1X
-Hive built-in ORC                                  1156           1165          13          9.1         110.3       0.8X
+Native ORC MR                                      1143           1161          26          9.2         109.0       1.0X
+Native ORC Vectorized                               261            298          49         40.1          24.9       4.4X
+Hive built-in ORC                                  1520           1579          84          6.9         145.0       0.8X
 
 
 ================================================================================================
 String with Nulls Scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 String with Nulls Scan (0.0%):            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                      1666           1719          75          6.3         158.9       1.0X
-Native ORC Vectorized                               484            501          15         21.7          46.1       3.4X
-Hive built-in ORC                                  1985           1989           5          5.3         189.3       0.8X
+Native ORC MR                                      2155           2176          30          4.9         205.5       1.0X
+Native ORC Vectorized                               640            684          43         16.4          61.0       3.4X
+Hive built-in ORC                                  2592           2654          88          4.0         247.2       0.8X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 String with Nulls Scan (50.0%):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                      1567           1635          96          6.7         149.5       1.0X
-Native ORC Vectorized                               641            662          30         16.4          61.1       2.4X
-Hive built-in ORC                                  1885           1888           5          5.6         179.7       0.8X
+Native ORC MR                                      1706           1789         117          6.1         162.7       1.0X
+Native ORC Vectorized                               721            814         137         14.5          68.8       2.4X
+Hive built-in ORC                                  2262           2283          29          4.6         215.7       0.8X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 String with Nulls Scan (95.0%):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       845            851           6         12.4          80.6       1.0X
-Native ORC Vectorized                               244            258          16         43.0          23.2       3.5X
-Hive built-in ORC                                  1107           1162          77          9.5         105.6       0.8X
+Native ORC MR                                       951           1001          45         11.0          90.7       1.0X
+Native ORC Vectorized                               254            285          19         41.3          24.2       3.7X
+Hive built-in ORC                                  1352           1382          44          7.8         128.9       0.7X
 
 
 ================================================================================================
 Single Column Scan From Wide Columns
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Single Column Scan from 100 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       124            148          27          8.5         118.2       1.0X
-Native ORC Vectorized                                71             82          11         14.8          67.4       1.8X
-Hive built-in ORC                                   782            804          35          1.3         745.6       0.2X
+Native ORC MR                                       173            205          23          6.0         165.3       1.0X
+Native ORC Vectorized                                99            116          17         10.5          94.8       1.7X
+Hive built-in ORC                                   980           1042          87          1.1         934.4       0.2X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Single Column Scan from 200 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       155            184          31          6.8         147.9       1.0X
-Native ORC Vectorized                               101            130          24         10.4          96.2       1.5X
-Hive built-in ORC                                  1477           1494          25          0.7        1408.7       0.1X
+Native ORC MR                                       200            239          35          5.2         190.5       1.0X
+Native ORC Vectorized                               145            174          23          7.3         137.8       1.4X
+Hive built-in ORC                                  1809           1945         193          0.6        1725.0       0.1X
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Single Column Scan from 300 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                       191            227          29          5.5         182.4       1.0X
-Native ORC Vectorized                               135            153          18          7.7         129.2       1.4X
-Hive built-in ORC                                  2085           2085           0          0.5        1988.1       0.1X
+Native ORC MR                                       278            336          52          3.8         264.7       1.0X
+Native ORC Vectorized                               209            232          21          5.0         199.0       1.3X
+Hive built-in ORC                                  2679           2828         211          0.4        2554.8       0.1X
 
 
 ================================================================================================
 Struct scan
 ================================================================================================
 
 OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 5.11.0-1022-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
 Single Struct Column Scan with 10 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Native ORC MR                                      1126           1149          33          0.9        1073.7       1.0X
-Native ORC Vectorized                              1136           1141           7          0.9        1083.4       1.0X
-Hive built-in ORC                                   589            595           8          1.8         561.4       1.9X
+Native ORC MR                                       363            427          78          2.9         346.2       1.0X
+Native ORC Vectorized                               375            442          88          2.8         357.6       1.0X

Review comment:
       Just FYI @bersprockets: I have fixed the benchmark to enable
vectorization for scans of nested columns. See
https://github.com/apache/spark/commit/4a2ba5b22e84fc79b44604c60320aa5ae679e13a.
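
For anyone reading the numbers in the hunk, here is a minimal sketch of why the
struct rows stand out. It assumes the Relative column is the baseline row's
Best Time divided by each row's Best Time, which matches the values printed in
the tables; the `best_ms` dictionary and the `relative` helper are illustrative
names, not part of the benchmark code. The values are copied from the "+" rows
of the hunk.

```python
# Best Time (ms) copied from the "+" rows of the quoted hunk.
best_ms = {
    "TINYINT": {"mr": 1102, "vectorized": 177},
    "Struct (10 fields)": {"mr": 363, "vectorized": 375},
}


def relative(baseline_ms, case_ms):
    """Speedup over the baseline row, rounded to one decimal like the table.

    Assumption: Relative = baseline Best Time / case Best Time.
    """
    return round(baseline_ms / case_ms, 1)


# Speedup of the vectorized reader over the MR reader for each benchmark.
speedups = {
    name: relative(times["mr"], times["vectorized"])
    for name, times in best_ms.items()
}

for name, speedup in speedups.items():
    print(f"{name}: {speedup}X")
```

The flat column shows a clear gain for the vectorized reader, while the struct
column shows essentially none. That would be consistent with vectorization not
being enabled for nested columns in this run, though the table alone does not
prove it.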




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


