sunchao commented on code in PR #36123:
URL: https://github.com/apache/spark/pull/36123#discussion_r847694826
##########
sql/core/benchmarks/DataSourceReadBenchmark-jdk11-results.txt:
##########
@@ -2,322 +2,430 @@
SQL Single Numeric Column Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 11809 12046
335 1.3 750.8 1.0X
-SQL Json 8588 8592
7 1.8 546.0 1.4X
-SQL Parquet Vectorized: DataPageV1 140 162
18 112.0 8.9 84.1X
-SQL Parquet Vectorized: DataPageV2 103 117
12 152.6 6.6 114.6X
-SQL Parquet MR: DataPageV1 1634 1648
20 9.6 103.9 7.2X
-SQL Parquet MR: DataPageV2 1495 1501
9 10.5 95.1 7.9X
-SQL ORC Vectorized 180 224
42 87.4 11.4 65.6X
-SQL ORC MR 1536 1576
57 10.2 97.7 7.7X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 11004 11065
86 1.4 699.6 1.0X
+SQL Json 7986 8011
35 2.0 507.7 1.4X
+SQL Parquet Vectorized: DataPageV1 124 148
16 127.0 7.9 88.9X
+SQL Parquet Vectorized: DataPageV2 101 115
12 155.0 6.5 108.4X
+SQL Parquet MR: DataPageV1 1614 1620
8 9.7 102.6 6.8X
+SQL Parquet MR: DataPageV2 1445 1446
2 10.9 91.9 7.6X
+SQL ORC Vectorized 163 204
41 96.2 10.4 67.3X
+SQL ORC MR 1407 1429
31 11.2 89.4 7.8X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 109 114
10 144.3 6.9 1.0X
-ParquetReader Vectorized: DataPageV2 90 93
3 175.3 5.7 1.2X
-ParquetReader Vectorized -> Row: DataPageV1 58 60
4 271.9 3.7 1.9X
-ParquetReader Vectorized -> Row: DataPageV2 39 41
3 404.0 2.5 2.8X
+ParquetReader Vectorized: DataPageV1 123 140
14 128.3 7.8 1.0X
+ParquetReader Vectorized: DataPageV2 105 114
11 150.3 6.7 1.2X
+ParquetReader Vectorized -> Row: DataPageV1 56 61
5 279.9 3.6 2.2X
+ParquetReader Vectorized -> Row: DataPageV2 39 43
4 399.4 2.5 3.1X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single TINYINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 14515 14526
16 1.1 922.8 1.0X
-SQL Json 9862 9863
2 1.6 627.0 1.5X
-SQL Parquet Vectorized: DataPageV1 144 167
31 109.5 9.1 101.1X
-SQL Parquet Vectorized: DataPageV2 139 159
27 113.4 8.8 104.6X
-SQL Parquet MR: DataPageV1 1777 1780
3 8.8 113.0 8.2X
-SQL Parquet MR: DataPageV2 1690 1691
2 9.3 107.4 8.6X
-SQL ORC Vectorized 201 238
46 78.3 12.8 72.2X
-SQL ORC MR 1513 1522
14 10.4 96.2 9.6X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 13262 13310
67 1.2 843.2 1.0X
+SQL Json 9104 9173
98 1.7 578.8 1.5X
+SQL Parquet Vectorized: DataPageV1 136 172
31 115.4 8.7 97.3X
+SQL Parquet Vectorized: DataPageV2 138 153
17 114.0 8.8 96.1X
+SQL Parquet MR: DataPageV1 1789 1805
22 8.8 113.7 7.4X
+SQL Parquet MR: DataPageV2 1631 1662
44 9.6 103.7 8.1X
+SQL ORC Vectorized 210 252
33 74.8 13.4 63.0X
+SQL ORC MR 1412 1437
36 11.1 89.7 9.4X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single TINYINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 182 192
11 86.6 11.5 1.0X
-ParquetReader Vectorized: DataPageV2 181 188
7 86.9 11.5 1.0X
-ParquetReader Vectorized -> Row: DataPageV1 96 99
4 163.3 6.1 1.9X
-ParquetReader Vectorized -> Row: DataPageV2 96 99
3 163.4 6.1 1.9X
+ParquetReader Vectorized: DataPageV1 171 183
14 92.0 10.9 1.0X
+ParquetReader Vectorized: DataPageV2 175 184
9 90.1 11.1 1.0X
+ParquetReader Vectorized -> Row: DataPageV1 88 95
12 179.0 5.6 1.9X
+ParquetReader Vectorized -> Row: DataPageV2 88 92
4 179.0 5.6 1.9X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 15326 15437
156 1.0 974.4 1.0X
-SQL Json 10281 10290
13 1.5 653.7 1.5X
-SQL Parquet Vectorized: DataPageV1 164 212
36 95.9 10.4 93.4X
-SQL Parquet Vectorized: DataPageV2 230 244
11 68.5 14.6 66.7X
-SQL Parquet MR: DataPageV1 2108 2111
4 7.5 134.0 7.3X
-SQL Parquet MR: DataPageV2 1940 1963
33 8.1 123.3 7.9X
-SQL ORC Vectorized 229 279
34 68.7 14.6 66.9X
-SQL ORC MR 1903 1906
3 8.3 121.0 8.1X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 14022 14236
303 1.1 891.5 1.0X
+SQL Json 9763 9929
235 1.6 620.7 1.4X
+SQL Parquet Vectorized: DataPageV1 173 226
38 90.7 11.0 80.9X
+SQL Parquet Vectorized: DataPageV2 222 241
13 70.7 14.1 63.1X
+SQL Parquet MR: DataPageV1 2069 2086
24 7.6 131.5 6.8X
+SQL Parquet MR: DataPageV2 1771 1806
49 8.9 112.6 7.9X
+SQL ORC Vectorized 203 263
37 77.6 12.9 69.2X
+SQL ORC MR 1528 1552
34 10.3 97.2 9.2X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 253 262
10 62.2 16.1 1.0X
-ParquetReader Vectorized: DataPageV2 323 327
9 48.8 20.5 0.8X
-ParquetReader Vectorized -> Row: DataPageV1 280 288
8 56.3 17.8 0.9X
-ParquetReader Vectorized -> Row: DataPageV2 301 314
21 52.2 19.1 0.8X
+ParquetReader Vectorized: DataPageV1 246 256
11 63.9 15.6 1.0X
+ParquetReader Vectorized: DataPageV2 301 313
17 52.3 19.1 0.8X
+ParquetReader Vectorized -> Row: DataPageV1 257 292
18 61.2 16.3 1.0X
+ParquetReader Vectorized -> Row: DataPageV2 296 318
25 53.1 18.8 0.8X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single INT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 16756 16776
28 0.9 1065.3 1.0X
-SQL Json 10690 10692
3 1.5 679.6 1.6X
-SQL Parquet Vectorized: DataPageV1 160 208
45 98.1 10.2 104.5X
-SQL Parquet Vectorized: DataPageV2 390 423
23 40.3 24.8 43.0X
-SQL Parquet MR: DataPageV1 2196 2201
8 7.2 139.6 7.6X
-SQL Parquet MR: DataPageV2 2065 2072
10 7.6 131.3 8.1X
-SQL ORC Vectorized 323 338
10 48.7 20.5 51.9X
-SQL ORC MR 1899 1906
11 8.3 120.7 8.8X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 16153 16252
140 1.0 1027.0 1.0X
+SQL Json 10406 10547
200 1.5 661.6 1.6X
+SQL Parquet Vectorized: DataPageV1 159 207
33 99.1 10.1 101.8X
+SQL Parquet Vectorized: DataPageV2 337 402
40 46.6 21.4 47.9X
+SQL Parquet MR: DataPageV1 2160 2193
46 7.3 137.4 7.5X
+SQL Parquet MR: DataPageV2 1892 1900
11 8.3 120.3 8.5X
+SQL ORC Vectorized 297 340
42 53.0 18.9 54.5X
+SQL ORC MR 1705 1732
38 9.2 108.4 9.5X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single INT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 278 285
9 56.6 17.7 1.0X
-ParquetReader Vectorized: DataPageV2 514 518
2 30.6 32.7 0.5X
-ParquetReader Vectorized -> Row: DataPageV1 308 316
11 51.0 19.6 0.9X
-ParquetReader Vectorized -> Row: DataPageV2 498 525
27 31.6 31.6 0.6X
+ParquetReader Vectorized: DataPageV1 251 262
10 62.6 16.0 1.0X
+ParquetReader Vectorized: DataPageV2 418 431
13 37.7 26.6 0.6X
+ParquetReader Vectorized -> Row: DataPageV1 247 288
30 63.7 15.7 1.0X
+ParquetReader Vectorized -> Row: DataPageV2 412 455
39 38.1 26.2 0.6X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single BIGINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 21841 21851
14 0.7 1388.6 1.0X
-SQL Json 12828 12843
21 1.2 815.6 1.7X
-SQL Parquet Vectorized: DataPageV1 241 279
19 65.2 15.3 90.6X
-SQL Parquet Vectorized: DataPageV2 554 596
29 28.4 35.2 39.5X
-SQL Parquet MR: DataPageV1 2404 2428
34 6.5 152.8 9.1X
-SQL Parquet MR: DataPageV2 2153 2166
18 7.3 136.9 10.1X
-SQL ORC Vectorized 417 464
62 37.7 26.5 52.4X
-SQL ORC MR 2136 2146
14 7.4 135.8 10.2X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 19896 20026
183 0.8 1264.9 1.0X
+SQL Json 12540 12634
132 1.3 797.3 1.6X
+SQL Parquet Vectorized: DataPageV1 221 271
30 71.3 14.0 90.1X
+SQL Parquet Vectorized: DataPageV2 546 564
23 28.8 34.7 36.5X
+SQL Parquet MR: DataPageV1 2196 2211
21 7.2 139.6 9.1X
+SQL Parquet MR: DataPageV2 2085 2089
6 7.5 132.5 9.5X
+SQL ORC Vectorized 379 416
39 41.5 24.1 52.5X
+SQL ORC MR 1858 1859
2 8.5 118.1 10.7X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single BIGINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 324 357
34 48.6 20.6 1.0X
-ParquetReader Vectorized: DataPageV2 694 702
11 22.6 44.2 0.5X
-ParquetReader Vectorized -> Row: DataPageV1 378 385
8 41.6 24.0 0.9X
-ParquetReader Vectorized -> Row: DataPageV2 701 708
8 22.4 44.6 0.5X
+ParquetReader Vectorized: DataPageV1 311 340
20 50.5 19.8 1.0X
+ParquetReader Vectorized: DataPageV2 639 647
11 24.6 40.6 0.5X
+ParquetReader Vectorized -> Row: DataPageV1 359 376
13 43.9 22.8 0.9X
+ParquetReader Vectorized -> Row: DataPageV2 653 658
9 24.1 41.5 0.5X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single FLOAT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 17238 17239
2 0.9 1096.0 1.0X
-SQL Json 12295 12307
18 1.3 781.7 1.4X
-SQL Parquet Vectorized: DataPageV1 162 203
27 96.8 10.3 106.1X
-SQL Parquet Vectorized: DataPageV2 157 194
32 100.4 10.0 110.0X
-SQL Parquet MR: DataPageV1 2163 2165
3 7.3 137.5 8.0X
-SQL Parquet MR: DataPageV2 2014 2014
1 7.8 128.0 8.6X
-SQL ORC Vectorized 458 462
5 34.4 29.1 37.7X
-SQL ORC MR 1984 1984
0 7.9 126.1 8.7X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 17338 17710
526 0.9 1102.3 1.0X
+SQL Json 11844 12121
392 1.3 753.0 1.5X
+SQL Parquet Vectorized: DataPageV1 148 187
28 106.2 9.4 117.0X
+SQL Parquet Vectorized: DataPageV2 147 183
31 106.8 9.4 117.7X
+SQL Parquet MR: DataPageV1 2027 2033
9 7.8 128.9 8.6X
+SQL Parquet MR: DataPageV2 1966 1981
21 8.0 125.0 8.8X
+SQL ORC Vectorized 399 425
25 39.4 25.4 43.4X
+SQL ORC MR 1748 1756
11 9.0 111.2 9.9X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single FLOAT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 252 259
10 62.3 16.0 1.0X
-ParquetReader Vectorized: DataPageV2 252 256
9 62.3 16.0 1.0X
-ParquetReader Vectorized -> Row: DataPageV1 259 307
40 60.7 16.5 1.0X
-ParquetReader Vectorized -> Row: DataPageV2 260 295
25 60.5 16.5 1.0X
+ParquetReader Vectorized: DataPageV1 226 240
15 69.6 14.4 1.0X
+ParquetReader Vectorized: DataPageV2 225 237
15 69.9 14.3 1.0X
+ParquetReader Vectorized -> Row: DataPageV1 247 299
38 63.6 15.7 0.9X
+ParquetReader Vectorized -> Row: DataPageV2 245 296
25 64.1 15.6 0.9X
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
SQL Single DOUBLE Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 22485 22536
72 0.7 1429.5 1.0X
-SQL Json 16281 16286
8 1.0 1035.1 1.4X
-SQL Parquet Vectorized: DataPageV1 232 288
35 67.9 14.7 97.1X
-SQL Parquet Vectorized: DataPageV2 277 290
9 56.8 17.6 81.2X
-SQL Parquet MR: DataPageV1 2331 2341
15 6.7 148.2 9.6X
-SQL Parquet MR: DataPageV2 2216 2229
18 7.1 140.9 10.1X
-SQL ORC Vectorized 561 569
9 28.0 35.7 40.1X
-SQL ORC MR 2118 2137
27 7.4 134.6 10.6X
-
-OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV 21052 21617
799 0.7 1338.4 1.0X
+SQL Json 15822 16049
321 1.0 1005.9 1.3X
+SQL Parquet Vectorized: DataPageV1 266 286
19 59.0 16.9 79.0X
+SQL Parquet Vectorized: DataPageV2 277 291
14 56.8 17.6 76.0X
+SQL Parquet MR: DataPageV1 2267 2275
12 6.9 144.1 9.3X
+SQL Parquet MR: DataPageV2 2046 2064
26 7.7 130.1 10.3X
+SQL ORC Vectorized 535 545
10 29.4 34.0 39.3X
+SQL ORC MR 1976 2000
34 8.0 125.6 10.7X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Parquet Reader Single DOUBLE Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 355 356
1 44.3 22.6 1.0X
-ParquetReader Vectorized: DataPageV2 355 356
1 44.3 22.6 1.0X
-ParquetReader Vectorized -> Row: DataPageV1 379 386
9 41.5 24.1 0.9X
-ParquetReader Vectorized -> Row: DataPageV2 379 389
10 41.5 24.1 0.9X
+ParquetReader Vectorized: DataPageV1 314 337
25 50.1 20.0 1.0X
+ParquetReader Vectorized: DataPageV2 309 323
14 50.8 19.7 1.0X
+ParquetReader Vectorized -> Row: DataPageV1 331 348
13 47.5 21.1 0.9X
+ParquetReader Vectorized -> Row: DataPageV2 332 347
11 47.4 21.1 0.9X
+
+
+================================================================================================
+SQL Single Numeric Column Scan in Struct
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
+SQL Single TINYINT Column Scan in Struct: Best Time(ms)
Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+-------------------------------------------------------------------------------------------------------------------------------------------
+SQL ORC MR 2193
2196 4 7.2 139.5 1.0X
+SQL ORC Vectorized (Nested Column Disabled) 2211
2222 16 7.1 140.6 1.0X
+SQL ORC Vectorized (Nested Column Enabled) 268
310 32 58.7 17.0 8.2X
+SQL Parquet MR: DataPageV1 2243
2280 53 7.0 142.6 1.0X
+SQL Parquet Vectorized: DataPageV1 (Nested Column Disabled) 2747
2758 16 5.7 174.6 0.8X
+SQL Parquet Vectorized: DataPageV1 (Nested Column Enabled) 155
174 22 101.7 9.8 14.2X
+SQL Parquet MR: DataPageV2 2193
2203 13 7.2 139.5 1.0X
+SQL Parquet Vectorized: DataPageV2 (Nested Column Disabled) 2709
2733 33 5.8 172.3 0.8X
+SQL Parquet Vectorized: DataPageV2 (Nested Column Enabled) 150
174 27 104.7 9.6 14.6X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
+SQL Single SMALLINT Column Scan in Struct: Best Time(ms)
Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+-------------------------------------------------------------------------------------------------------------------------------------------
+SQL ORC MR 2322
2391 97 6.8 147.6 1.0X
+SQL ORC Vectorized (Nested Column Disabled) 2362
2374 17 6.7 150.2 1.0X
+SQL ORC Vectorized (Nested Column Enabled) 412
419 9 38.2 26.2 5.6X
+SQL Parquet MR: DataPageV1 2393
2400 10 6.6 152.1 1.0X
+SQL Parquet Vectorized: DataPageV1 (Nested Column Disabled) 2919
2922 4 5.4 185.6 0.8X
+SQL Parquet Vectorized: DataPageV1 (Nested Column Enabled) 228
281 54 69.0 14.5 10.2X
+SQL Parquet MR: DataPageV2 2223
2240 25 7.1 141.3 1.0X
+SQL Parquet Vectorized: DataPageV2 (Nested Column Disabled) 2692
2712 28 5.8 171.2 0.9X
+SQL Parquet Vectorized: DataPageV2 (Nested Column Enabled) 341
361 31 46.1 21.7 6.8X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.13.0-1021-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
+SQL Single INT Column Scan in Struct: Best Time(ms)
Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+-------------------------------------------------------------------------------------------------------------------------------------------
+SQL ORC MR 2376
2380 6 6.6 151.0 1.0X
+SQL ORC Vectorized (Nested Column Disabled) 2333
2378 64 6.7 148.4 1.0X
+SQL ORC Vectorized (Nested Column Enabled) 430
451 20 36.6 27.3 5.5X
+SQL Parquet MR: DataPageV1 2485
2501 22 6.3 158.0 1.0X
+SQL Parquet Vectorized: DataPageV1 (Nested Column Disabled) 3017
3062 65 5.2 191.8 0.8X
Review Comment:
I'm not sure. This part calls the row-based Parquet reader from
`parquet-mr`, which I'm not very familiar with. Overall, its performance is not
very good.
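
For anyone reading the table above, here is a minimal sketch of how the `Per Row(ns)`, `Rate(M/s)` and `Relative` columns appear to relate to `Best Time(ms)`. The row count of `15 * 1024 * 1024` is an assumption inferred from the numbers in the diff (11004 ms at 699.6 ns per row); it is not stated in the results file. The function names are hypothetical and are not part of the Spark benchmark code.

```python
# Assumption: each scan reads 15 * 1024 * 1024 rows. This count is inferred
# from the table itself (11004 ms best time at 699.6 ns per row); the results
# file does not state it.
NUM_ROWS = 15 * 1024 * 1024


def per_row_ns(best_ms: float, rows: int = NUM_ROWS) -> float:
    """Nanoseconds spent per row for a best time given in milliseconds."""
    return best_ms * 1_000_000 / rows


def rate_m_per_s(best_ms: float, rows: int = NUM_ROWS) -> float:
    """Throughput in millions of rows per second."""
    return rows / (best_ms / 1000) / 1_000_000


def relative(baseline_ms: float, best_ms: float) -> float:
    """Speed relative to the table's first row; below 1.0 means slower."""
    return baseline_ms / best_ms


if __name__ == "__main__":
    # Values copied from the "SQL Single INT Column Scan in Struct" table.
    orc_mr_ms = 2376       # "SQL ORC MR", the 1.0X baseline
    parquet_off_ms = 3017  # "SQL Parquet Vectorized: DataPageV1 (Nested Column Disabled)"
    print(f"{per_row_ns(parquet_off_ms):.1f} ns/row")
    print(f"{rate_m_per_s(parquet_off_ms):.1f} M rows/s")
    print(f"{relative(orc_mr_ms, parquet_off_ms):.1f}X")
```

Under that assumption the commented row comes out within rounding of the reported 191.8 ns per row, 5.2 M/s and 0.8X, i.e. slower than the ORC MR baseline. The reported figures are computed from unrounded timings, so a recomputation from the rounded `Best Time(ms)` can differ in the last digit.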