dongjoon-hyun commented on code in PR #39301:
URL: https://github.com/apache/spark/pull/39301#discussion_r1059239575


##########
sql/hive/benchmarks/OrcReadBenchmark-results.txt:
##########
@@ -2,221 +2,221 @@
 SQL Single Numeric Column Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single TINYINT Column Scan:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                   920           1005         120         17.1          58.5       1.0X
-Native ORC MR                                       721            896         206         21.8          45.8       1.3X
-Native ORC Vectorized                               116            140          17        135.5           7.4       7.9X
+Hive built-in ORC                                  1137           1146          14         13.8          72.3       1.0X
+Native ORC MR                                      1034           1048          20         15.2          65.7       1.1X
+Native ORC Vectorized                                92            117          23        170.9           5.8      12.4X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single SMALLINT Column Scan:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1020           1024           6         15.4          64.8       1.0X
-Native ORC MR                                       789            810          32         19.9          50.2       1.3X
-Native ORC Vectorized                               109            126          19        144.6           6.9       9.4X
+Hive built-in ORC                                  1269           1319          71         12.4          80.7       1.0X
+Native ORC MR                                      1087           1088           1         14.5          69.1       1.2X
+Native ORC Vectorized                               155            182          24        101.3           9.9       8.2X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single INT Column Scan:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1102           1133          43         14.3          70.1       1.0X
-Native ORC MR                                       908            933          29         17.3          57.7       1.2X
-Native ORC Vectorized                               143            171          36        110.0           9.1       7.7X
+Hive built-in ORC                                  1369           1466         137         11.5          87.0       1.0X
+Native ORC MR                                      1103           1277         247         14.3          70.1       1.2X
+Native ORC Vectorized                               175            192          26         90.1          11.1       7.8X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single BIGINT Column Scan:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1081           1159         110         14.6          68.7       1.0X
-Native ORC MR                                       940            947          10         16.7          59.7       1.2X
-Native ORC Vectorized                               173            182          10         90.7          11.0       6.2X
+Hive built-in ORC                                  1469           1539          98         10.7          93.4       1.0X
+Native ORC MR                                      1214           1239          36         13.0          77.2       1.2X
+Native ORC Vectorized                               256            274          26         61.4          16.3       5.7X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single FLOAT Column Scan:             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1130           1147          24         13.9          71.9       1.0X
-Native ORC MR                                       962            980          26         16.3          61.2       1.2X
-Native ORC Vectorized                               213            220           7         73.7          13.6       5.3X
+Hive built-in ORC                                  1467           1480          18         10.7          93.3       1.0X
+Native ORC MR                                      1239           1303          90         12.7          78.8       1.2X
+Native ORC Vectorized                               217            224           8         72.6          13.8       6.8X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 SQL Single DOUBLE Column Scan:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1154           1176          31         13.6          73.4       1.0X
-Native ORC MR                                       962            974          19         16.3          61.2       1.2X
-Native ORC Vectorized                               241            254          16         65.2          15.3       4.8X
+Hive built-in ORC                                  1432           1456          33         11.0          91.1       1.0X
+Native ORC MR                                      1230           1244          20         12.8          78.2       1.2X
+Native ORC Vectorized                               243            263          24         64.7          15.5       5.9X
 
 
 ================================================================================================
 Int and String Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Int and String Scan:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  2185           2232          67          4.8         208.4       1.0X
-Native ORC MR                                      1858           1890          44          5.6         177.2       1.2X
-Native ORC Vectorized                              1056           1058           4          9.9         100.7       2.1X
+Hive built-in ORC                                  2601           2642          57          4.0         248.1       1.0X
+Native ORC MR                                      2371           2376           7          4.4         226.1       1.1X
+Native ORC Vectorized                              1270           1294          33          8.3         121.2       2.0X
 
 
 ================================================================================================
 Partitioned Table Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Partitioned Table:                        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Data column - Hive built-in ORC                    1334           1334           0         11.8          84.8       1.0X
-Data column - Native ORC MR                        1210           1274          91         13.0          76.9       1.1X
-Data column - Native ORC Vectorized                 177            193          38         89.1          11.2       7.6X
-Partition column - Hive built-in ORC                998           1002           6         15.8          63.4       1.3X
-Partition column - Native ORC MR                    789            822          36         19.9          50.2       1.7X
-Partition column - Native ORC Vectorized             53             65          12        294.0           3.4      24.9X
-Both columns - Hive built-in ORC                   1472           1530          82         10.7          93.6       0.9X
-Both columns - Native ORC MR                       1224           1241          23         12.8          77.8       1.1X
-Both columns - Native ORC Vectorized                199            207          19         79.0          12.7       6.7X
+Data column - Hive built-in ORC                    1584           1607          33          9.9         100.7       1.0X
+Data column - Native ORC MR                        1502           1537          49         10.5          95.5       1.1X
+Data column - Native ORC Vectorized                 268            280          12         58.7          17.0       5.9X
+Partition column - Hive built-in ORC               1209           1212           5         13.0          76.8       1.3X
+Partition column - Native ORC MR                   1010           1018          12         15.6          64.2       1.6X
+Partition column - Native ORC Vectorized             52             59          12        301.1           3.3      30.3X
+Both columns - Hive built-in ORC                   1712           1735          33          9.2         108.8       0.9X
+Both columns - Native ORC MR                       1608           1704         136          9.8         102.2       1.0X
+Both columns - Native ORC Vectorized                303            314          14         51.9          19.3       5.2X
 
 
 ================================================================================================
 Repeated String Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Repeated String:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1181           1192          15          8.9         112.6       1.0X
-Native ORC MR                                       913            958          74         11.5          87.1       1.3X
-Native ORC Vectorized                               167            172           6         62.9          15.9       7.1X
+Hive built-in ORC                                  1390           1396           9          7.5         132.5       1.0X
+Native ORC MR                                      1144           1155          16          9.2         109.1       1.2X
+Native ORC Vectorized                               222            240          25         47.2          21.2       6.3X
 
 
 ================================================================================================
 String with Nulls Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 String with Nulls Scan (0.0%):            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  2132           2151          28          4.9         203.3       1.0X
-Native ORC MR                                      1621           1651          43          6.5         154.6       1.3X
-Native ORC Vectorized                               494            532          56         21.2          47.1       4.3X
+Hive built-in ORC                                  2446           2516         100          4.3         233.2       1.0X
+Native ORC MR                                      2004           2028          34          5.2         191.1       1.2X
+Native ORC Vectorized                               556            577          30         18.9          53.0       4.4X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 String with Nulls Scan (50.0%):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1954           1966          17          5.4         186.4       1.0X
-Native ORC MR                                      1566           1578          17          6.7         149.3       1.2X
-Native ORC Vectorized                               630            639           9         16.7          60.0       3.1X
+Hive built-in ORC                                  2207           2267          85          4.8         210.4       1.0X
+Native ORC MR                                      1942           1946           6          5.4         185.2       1.1X
+Native ORC Vectorized                               733            747          14         14.3          69.9       3.0X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 String with Nulls Scan (95.0%):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1067           1081          20          9.8         101.7       1.0X
-Native ORC MR                                       843            866          20         12.4          80.4       1.3X
-Native ORC Vectorized                               229            236           6         45.8          21.9       4.7X
+Hive built-in ORC                                  1300           1317          24          8.1         124.0       1.0X
+Native ORC MR                                      1046           1049           3         10.0          99.8       1.2X
+Native ORC Vectorized                               274            282           8         38.2          26.2       4.7X
 
 
 ================================================================================================
 Single Column Scan From Wide Columns
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Column Scan from 100 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                   900            936          31          1.2         858.7       1.0X
-Native ORC MR                                       124            135          13          8.5         117.8       7.3X
-Native ORC Vectorized                                68             78          11         15.4          64.9      13.2X
+Hive built-in ORC                                  1098           1300         285          1.0        1047.4       1.0X
+Native ORC MR                                       155            162           7          6.7         148.2       7.1X
+Native ORC Vectorized                                89             96           9         11.8          84.7      12.4X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Column Scan from 200 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  1447           1488          57          0.7        1380.4       1.0X
-Native ORC MR                                       153            172          29          6.9         145.6       9.5X
-Native ORC Vectorized                                97            103          11         10.8          92.4      14.9X
+Hive built-in ORC                                  2004           2027          33          0.5        1910.7       1.0X
+Native ORC MR                                       202            235          24          5.2         192.4       9.9X
+Native ORC Vectorized                               128            138          16          8.2         121.7      15.7X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Column Scan from 300 columns:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  2102           2114          17          0.5        2004.5       1.0X
-Native ORC MR                                       182            197          13          5.7         174.0      11.5X
-Native ORC Vectorized                               170            186          13          6.2         161.7      12.4X
+Hive built-in ORC                                  3016           3041          36          0.3        2875.9       1.0X
+Native ORC MR                                       247            262          20          4.2         235.5      12.2X
+Native ORC Vectorized                               181            201          24          5.8         172.2      16.7X
 
 
 ================================================================================================
 Struct scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Struct Column Scan with 10 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                   442            459          15          2.4         421.5       1.0X
-Native ORC MR                                       324            335          18          3.2         309.3       1.4X
-Native ORC Vectorized                               169            176          13          6.2         160.8       2.6X
+Hive built-in ORC                                   523            575          68          2.0         498.5       1.0X
+Native ORC MR                                      1477           1477           0          0.7        1408.2       0.4X
+Native ORC Vectorized                               210            225          14          5.0         199.8       2.5X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Struct Column Scan with 100 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 -------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                   3139           3288         212          0.3        2993.1       1.0X
-Native ORC MR                                       2523           2571          69          0.4        2405.9       1.2X
-Native ORC Vectorized                               1434           1456          32          0.7        1367.2       2.2X
+Hive built-in ORC                                   3677           4179         709          0.3        3507.0       1.0X
+Native ORC MR                                      12584          12615          43          0.1       12001.3       0.3X
+Native ORC Vectorized                               1867           1875          11          0.6        1780.9       2.0X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Struct Column Scan with 300 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 -------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  10503          10920         590          0.1       10016.4       1.0X
-Native ORC MR                                       8637           8641           5          0.1        8237.3       1.2X
-Native ORC Vectorized                               8568           8600          45          0.1        8171.3       1.2X
+Hive built-in ORC                                  14307          15087        1103          0.1       13644.5       1.0X
+Native ORC MR                                      39693          41121        2020          0.0       37853.7       0.4X
+Native ORC Vectorized                              39081          39252         243          0.0       37270.5       0.4X
 
-OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Linux 5.13.0-1021-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Single Struct Column Scan with 600 Fields:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 -------------------------------------------------------------------------------------------------------------------------
-Hive built-in ORC                                  27779          28882        1559          0.0       26492.5       1.0X
-Native ORC MR                                      29329          29488         226          0.0       27970.1       0.9X
-Native ORC Vectorized                              30460          30772         441          0.0       29049.4       0.9X
+Hive built-in ORC                                  32291          36141         NaN          0.0       30795.2       1.0X
+Native ORC MR                                      94939          95045         149          0.0       90541.2       0.3X
+Native ORC Vectorized                              93062          93335         386          0.0       88750.4       0.3X

Review Comment:
   I'll take a look at the `Single Struct Column Scan with 600 Fields` result.
   The slowdown seems to be related to some recent Spark patches.
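
To put a number on the change in that block, here is a minimal sketch that compares the old and new `Best Time(ms)` values for the 600-field struct scan. The figures are copied from the `-` and `+` rows of the hunk; the dictionary and the function names `slowdown` and `summarize` are illustrative only and are not part of Spark or its benchmark harness.

```python
# Best Time (ms) for "Single Struct Column Scan with 600 Fields",
# copied from the removed (-) and added (+) rows of the diff hunk.
# Each value is (old_ms, new_ms).
BEST_TIME_MS = {
    "Hive built-in ORC": (27779, 32291),
    "Native ORC MR": (29329, 94939),
    "Native ORC Vectorized": (30460, 93062),
}


def slowdown(old_ms, new_ms):
    """Return how many times longer the new run took than the old one."""
    return new_ms / old_ms


def summarize(results):
    """Map each reader name to its slowdown factor, rounded to 2 decimals."""
    return {
        name: round(slowdown(old_ms, new_ms), 2)
        for name, (old_ms, new_ms) in results.items()
    }


if __name__ == "__main__":
    for name, factor in summarize(BEST_TIME_MS).items():
        print(f"{name:<24}{factor:>6.2f}x")
```

On these numbers the Hive reader is only modestly slower, while both native readers take roughly three times as long. The two runs also used different CPUs (8272CL vs. 8171M), so part of the across-the-board slowdown may come from the hardware rather than from code changes.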



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

