sunchao commented on a change in pull request #35262:
URL: https://github.com/apache/spark/pull/35262#discussion_r810218228
##########
File path: sql/core/benchmarks/DataSourceReadBenchmark-jdk11-results.txt
##########
@@ -2,322 +2,322 @@
SQL Single Numeric Column Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single BOOLEAN Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  9636  9771  191  1.6  612.6  1.0X
-SQL Json  7960  8227  378  2.0  506.1  1.2X
-SQL Parquet Vectorized: DataPageV1  113  129  12  139.7  7.2  85.6X
-SQL Parquet Vectorized: DataPageV2  84  93  12  186.6  5.4  114.3X
-SQL Parquet MR: DataPageV1  1466  1470  6  10.7  93.2  6.6X
-SQL Parquet MR: DataPageV2  1334  1347  18  11.8  84.8  7.2X
-SQL ORC Vectorized  163  197  27  96.3  10.4  59.0X
-SQL ORC MR  1554  1558  6  10.1  98.8  6.2X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  14859  14914  77  1.1  944.7  1.0X
+SQL Json  9418  9457  55  1.7  598.8  1.6X
+SQL Parquet Vectorized: DataPageV1  109  128  14  144.6  6.9  136.6X
+SQL Parquet Vectorized: DataPageV2  79  89  8  199.3  5.0  188.3X
+SQL Parquet MR: DataPageV1  1699  1743  62  9.3  108.0  8.7X
+SQL Parquet MR: DataPageV2  1462  1489  38  10.8  93.0  10.2X
+SQL ORC Vectorized  165  200  33  95.3  10.5  90.0X
+SQL ORC MR  1409  1420  16  11.2  89.6  10.5X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single BOOLEAN Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  94  103  13  167.1  6.0  1.0X
-ParquetReader Vectorized: DataPageV2  77  86  11  204.3  4.9  1.2X
-ParquetReader Vectorized -> Row: DataPageV1  44  47  4  357.0  2.8  2.1X
-ParquetReader Vectorized -> Row: DataPageV2  35  37  3  445.2  2.2  2.7X
+ParquetReader Vectorized: DataPageV1  101  104  3  155.2  6.4  1.0X
+ParquetReader Vectorized: DataPageV2  82  85  5  192.0  5.2  1.2X
+ParquetReader Vectorized -> Row: DataPageV1  48  50  2  324.6  3.1  2.1X
+ParquetReader Vectorized -> Row: DataPageV2  29  31  3  539.4  1.9  3.5X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single TINYINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  11479  11919  622  1.4  729.8  1.0X
-SQL Json  9894  9922  39  1.6  629.1  1.2X
-SQL Parquet Vectorized: DataPageV1  123  156  30  128.3  7.8  93.6X
-SQL Parquet Vectorized: DataPageV2  126  138  19  125.2  8.0  91.4X
-SQL Parquet MR: DataPageV1  1986  2500  726  7.9  126.3  5.8X
-SQL Parquet MR: DataPageV2  1810  1898  126  8.7  115.1  6.3X
-SQL ORC Vectorized  174  210  30  90.5  11.0  66.1X
-SQL ORC MR  1645  1652  9  9.6  104.6  7.0X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  17619  17639  28  0.9  1120.2  1.0X
+SQL Json  10590  10606  23  1.5  673.3  1.7X
+SQL Parquet Vectorized: DataPageV1  178  194  10  88.2  11.3  98.8X
+SQL Parquet Vectorized: DataPageV2  178  188  9  88.2  11.3  98.7X
+SQL Parquet MR: DataPageV1  1884  1887  4  8.4  119.8  9.4X
+SQL Parquet MR: DataPageV2  1689  1742  75  9.3  107.4  10.4X
+SQL ORC Vectorized  162  193  24  97.0  10.3  108.7X
+SQL ORC MR  1505  1552  67  10.5  95.7  11.7X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single TINYINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  166  177  14  94.9  10.5  1.0X
-ParquetReader Vectorized: DataPageV2  165  172  11  95.3  10.5  1.0X
-ParquetReader Vectorized -> Row: DataPageV1  95  100  5  165.7  6.0  1.7X
-ParquetReader Vectorized -> Row: DataPageV2  85  89  6  186.0  5.4  2.0X
+ParquetReader Vectorized: DataPageV1  230  236  13  68.3  14.6  1.0X
+ParquetReader Vectorized: DataPageV2  228  233  8  69.1  14.5  1.0X
+ParquetReader Vectorized -> Row: DataPageV1  138  150  26  113.7  8.8  1.7X
+ParquetReader Vectorized -> Row: DataPageV2  137  140  2  114.5  8.7  1.7X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single SMALLINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  12176  12646  664  1.3  774.1  1.0X
-SQL Json  9696  9729  46  1.6  616.5  1.3X
-SQL Parquet Vectorized: DataPageV1  151  201  33  103.9  9.6  80.4X
-SQL Parquet Vectorized: DataPageV2  216  235  15  72.7  13.8  56.3X
-SQL Parquet MR: DataPageV1  1915  2017  145  8.2  121.8  6.4X
-SQL Parquet MR: DataPageV2  1954  1978  33  8.0  124.3  6.2X
-SQL ORC Vectorized  197  235  25  79.7  12.6  61.7X
-SQL ORC MR  1769  1829  85  8.9  112.5  6.9X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  18488  18494  8  0.9  1175.5  1.0X
+SQL Json  11190  11195  7  1.4  711.4  1.7X
+SQL Parquet Vectorized: DataPageV1  125  155  34  125.7  8.0  147.7X
+SQL Parquet Vectorized: DataPageV2  183  192  9  86.1  11.6  101.2X
+SQL Parquet MR: DataPageV1  2153  2160  10  7.3  136.9  8.6X
+SQL Parquet MR: DataPageV2  1876  1889  18  8.4  119.3  9.9X
+SQL ORC Vectorized  212  257  23  74.4  13.4  87.4X
+SQL ORC MR  1653  1658  7  9.5  105.1  11.2X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single SMALLINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  230  237  12  68.5  14.6  1.0X
-ParquetReader Vectorized: DataPageV2  293  298  9  53.6  18.7  0.8X
-ParquetReader Vectorized -> Row: DataPageV1  215  265  23  73.2  13.7  1.1X
-ParquetReader Vectorized -> Row: DataPageV2  279  301  32  56.3  17.8  0.8X
+ParquetReader Vectorized: DataPageV1  198  201  5  79.6  12.6  1.0X
+ParquetReader Vectorized: DataPageV2  256  260  3  61.5  16.3  0.8X
+ParquetReader Vectorized -> Row: DataPageV1  193  226  14  81.4  12.3  1.0X
+ParquetReader Vectorized -> Row: DataPageV2  250  253  2  62.8  15.9  0.8X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single INT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  13069  13409  482  1.2  830.9  1.0X
-SQL Json  10599  10621  32  1.5  673.9  1.2X
-SQL Parquet Vectorized: DataPageV1  142  177  34  110.6  9.0  91.9X
-SQL Parquet Vectorized: DataPageV2  313  359  28  50.2  19.9  41.7X
-SQL Parquet MR: DataPageV1  1979  2044  92  7.9  125.8  6.6X
-SQL Parquet MR: DataPageV2  1958  2030  101  8.0  124.5  6.7X
-SQL ORC Vectorized  277  303  21  56.7  17.6  47.1X
-SQL ORC MR  1692  1782  128  9.3  107.6  7.7X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  18614  18703  125  0.8  1183.5  1.0X
+SQL Json  11673  11711  53  1.3  742.2  1.6X
+SQL Parquet Vectorized: DataPageV1  128  154  26  123.1  8.1  145.7X
+SQL Parquet Vectorized: DataPageV2  270  302  23  58.3  17.1  69.0X
+SQL Parquet MR: DataPageV1  2117  2145  39  7.4  134.6  8.8X
+SQL Parquet MR: DataPageV2  1855  1860  7  8.5  117.9  10.0X
+SQL ORC Vectorized  277  292  16  56.7  17.6  67.2X
+SQL ORC MR  1623  1629  9  9.7  103.2  11.5X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single INT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  253  269  18  62.1  16.1  1.0X
-ParquetReader Vectorized: DataPageV2  1197  1199  3  13.1  76.1  0.2X
-ParquetReader Vectorized -> Row: DataPageV1  273  361  110  57.7  17.3  0.9X
-ParquetReader Vectorized -> Row: DataPageV2  379  438  37  41.5  24.1  0.7X
+ParquetReader Vectorized: DataPageV1  225  226  1  69.9  14.3  1.0X
+ParquetReader Vectorized: DataPageV2  362  365  2  43.4  23.0  0.6X
+ParquetReader Vectorized -> Row: DataPageV1  193  218  18  81.5  12.3  1.2X
+ParquetReader Vectorized -> Row: DataPageV2  360  366  6  43.7  22.9  0.6X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single BIGINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  17143  17467  458  0.9  1089.9  1.0X
-SQL Json  11507  12198  977  1.4  731.6  1.5X
-SQL Parquet Vectorized: DataPageV1  238  253  19  66.0  15.2  71.9X
-SQL Parquet Vectorized: DataPageV2  502  567  48  31.3  31.9  34.1X
-SQL Parquet MR: DataPageV1  2333  2335  3  6.7  148.4  7.3X
-SQL Parquet MR: DataPageV2  1948  1972  34  8.1  123.8  8.8X
-SQL ORC Vectorized  389  408  20  40.5  24.7  44.1X
-SQL ORC MR  1726  1817  128  9.1  109.7  9.9X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  23939  23953  20  0.7  1522.0  1.0X
+SQL Json  14445  14449  5  1.1  918.4  1.7X
+SQL Parquet Vectorized: DataPageV1  186  229  28  84.7  11.8  128.9X
+SQL Parquet Vectorized: DataPageV2  459  493  25  34.3  29.2  52.2X
+SQL Parquet MR: DataPageV1  2180  2184  7  7.2  138.6  11.0X
+SQL Parquet MR: DataPageV2  1954  1973  27  8.1  124.2  12.3X
+SQL ORC Vectorized  368  392  24  42.8  23.4  65.1X
+SQL ORC MR  1793  1794  2  8.8  114.0  13.4X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single BIGINT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  289  340  43  54.4  18.4  1.0X
-ParquetReader Vectorized: DataPageV2  572  609  27  27.5  36.4  0.5X
-ParquetReader Vectorized -> Row: DataPageV1  329  353  48  47.8  20.9  0.9X
-ParquetReader Vectorized -> Row: DataPageV2  639  654  18  24.6  40.6  0.5X
+ParquetReader Vectorized: DataPageV1  280  293  18  56.1  17.8  1.0X
+ParquetReader Vectorized: DataPageV2  577  602  48  27.3  36.7  0.5X
+ParquetReader Vectorized -> Row: DataPageV1  314  321  10  50.1  19.9  0.9X
+ParquetReader Vectorized -> Row: DataPageV2  581  584  4  27.1  37.0  0.5X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single FLOAT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  13721  13812  129  1.1  872.4  1.0X
-SQL Json  12147  17632  2196  1.3  772.3  1.1X
-SQL Parquet Vectorized: DataPageV1  138  164  25  113.9  8.8  99.4X
-SQL Parquet Vectorized: DataPageV2  151  180  26  104.4  9.6  91.1X
-SQL Parquet MR: DataPageV1  2006  2078  101  7.8  127.6  6.8X
-SQL Parquet MR: DataPageV2  2038  2040  2  7.7  129.6  6.7X
-SQL ORC Vectorized  465  475  10  33.8  29.6  29.5X
-SQL ORC MR  1814  1860  64  8.7  115.4  7.6X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  19278  19291  18  0.8  1225.6  1.0X
+SQL Json  13366  13381  21  1.2  849.8  1.4X
+SQL Parquet Vectorized: DataPageV1  130  152  23  120.8  8.3  148.1X
+SQL Parquet Vectorized: DataPageV2  135  157  17  116.8  8.6  143.2X
+SQL Parquet MR: DataPageV1  2126  2137  15  7.4  135.2  9.1X
+SQL Parquet MR: DataPageV2  1970  1985  21  8.0  125.2  9.8X
+SQL ORC Vectorized  387  396  11  40.7  24.6  49.8X
+SQL ORC MR  1831  1832  1  8.6  116.4  10.5X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single FLOAT Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  275  404  187  57.2  17.5  1.0X
-ParquetReader Vectorized: DataPageV2  275  287  12  57.2  17.5  1.0X
-ParquetReader Vectorized -> Row: DataPageV1  227  265  24  69.2  14.4  1.2X
-ParquetReader Vectorized -> Row: DataPageV2  228  259  28  69.1  14.5  1.2X
+ParquetReader Vectorized: DataPageV1  194  197  5  81.1  12.3  1.0X
+ParquetReader Vectorized: DataPageV2  194  197  7  81.2  12.3  1.0X
+ParquetReader Vectorized -> Row: DataPageV1  225  253  18  69.9  14.3  0.9X
+ParquetReader Vectorized -> Row: DataPageV2  224  252  18  70.2  14.2  0.9X
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
SQL Single DOUBLE Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  17269  17620  496  0.9  1097.9  1.0X
-SQL Json  15636  15952  447  1.0  994.1  1.1X
-SQL Parquet Vectorized: DataPageV1  238  267  18  66.0  15.1  72.5X
-SQL Parquet Vectorized: DataPageV2  222  260  21  70.9  14.1  77.9X
-SQL Parquet MR: DataPageV1  2418  2457  56  6.5  153.7  7.1X
-SQL Parquet MR: DataPageV2  2194  2207  18  7.2  139.5  7.9X
-SQL ORC Vectorized  519  528  14  30.3  33.0  33.3X
-SQL ORC MR  1760  1770  14  8.9  111.9  9.8X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  24691  24705  19  0.6  1569.8  1.0X
+SQL Json  18028  18028  0  0.9  1146.2  1.4X
+SQL Parquet Vectorized: DataPageV1  190  225  28  83.0  12.0  130.3X
+SQL Parquet Vectorized: DataPageV2  188  230  26  83.9  11.9  131.7X
+SQL Parquet MR: DataPageV1  2362  2365  4  6.7  150.2  10.5X
+SQL Parquet MR: DataPageV2  2061  2078  25  7.6  131.0  12.0X
+SQL ORC Vectorized  499  524  37  31.6  31.7  49.5X
+SQL ORC MR  1870  1880  14  8.4  118.9  13.2X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parquet Reader Single DOUBLE Column Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1  284  305  30  55.3  18.1  1.0X
-ParquetReader Vectorized: DataPageV2  286  286  1  55.1  18.2  1.0X
-ParquetReader Vectorized -> Row: DataPageV1  325  337  16  48.4  20.6  0.9X
-ParquetReader Vectorized -> Row: DataPageV2  346  361  16  45.5  22.0  0.8X
+ParquetReader Vectorized: DataPageV1  276  295  21  57.0  17.5  1.0X
+ParquetReader Vectorized: DataPageV2  278  289  17  56.6  17.7  1.0X
+ParquetReader Vectorized -> Row: DataPageV1  315  326  15  50.0  20.0  0.9X
+ParquetReader Vectorized -> Row: DataPageV2  315  323  8  49.9  20.0  0.9X
================================================================================================
Int and String Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Int and String Scan:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  12428  12714  405  0.8  1185.2  1.0X
-SQL Json  11088  11251  231  0.9  1057.4  1.1X
-SQL Parquet Vectorized: DataPageV1  1990  1997  10  5.3  189.8  6.2X
-SQL Parquet Vectorized: DataPageV2  2551  2618  95  4.1  243.3  4.9X
-SQL Parquet MR: DataPageV1  3903  3913  15  2.7  372.2  3.2X
-SQL Parquet MR: DataPageV2  3734  3920  263  2.8  356.1  3.3X
-SQL ORC Vectorized  2153  2155  3  4.9  205.3  5.8X
-SQL ORC MR  3485  3549  91  3.0  332.4  3.6X
+SQL CSV  16840  16908  96  0.6  1606.0  1.0X
+SQL Json  12496  12513  25  0.8  1191.7  1.3X
+SQL Parquet Vectorized: DataPageV1  2169  2172  5  4.8  206.9  7.8X
+SQL Parquet Vectorized: DataPageV2  3102  3119  24  3.4  295.9  5.4X
+SQL Parquet MR: DataPageV1  4140  4144  5  2.5  394.8  4.1X
+SQL Parquet MR: DataPageV2  3988  3996  12  2.6  380.3  4.2X
+SQL ORC Vectorized  2180  2196  23  4.8  207.9  7.7X
+SQL ORC MR  3765  3766  2  2.8  359.0  4.5X
================================================================================================
Repeated String Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Repeated String:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  7116  7167  72  1.5  678.7  1.0X
-SQL Json  6700  6741  58  1.6  639.0  1.1X
-SQL Parquet Vectorized: DataPageV1  526  556  36  19.9  50.1  13.5X
-SQL Parquet Vectorized: DataPageV2  518  533  15  20.2  49.4  13.7X
-SQL Parquet MR: DataPageV1  1504  1656  216  7.0  143.4  4.7X
-SQL Parquet MR: DataPageV2  1676  1676  1  6.3  159.8  4.2X
-SQL ORC Vectorized  497  518  20  21.1  47.4  14.3X
-SQL ORC MR  1657  1787  183  6.3  158.1  4.3X
+SQL CSV  9960  9960  0  1.1  949.8  1.0X
+SQL Json  7625  7712  123  1.4  727.2  1.3X
+SQL Parquet Vectorized: DataPageV1  577  582  6  18.2  55.0  17.3X
+SQL Parquet Vectorized: DataPageV2  584  592  6  18.0  55.7  17.1X
+SQL Parquet MR: DataPageV1  1722  1736  19  6.1  164.2  5.8X
+SQL Parquet MR: DataPageV2  1662  1668  9  6.3  158.5  6.0X
+SQL ORC Vectorized  483  524  27  21.7  46.1  20.6X
+SQL ORC MR  1841  1850  14  5.7  175.5  5.4X
================================================================================================
Partitioned Table Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Partitioned Table:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
---------------------------------------------------------------------------------------------------------------------------------
-Data column - CSV  18247  18411  232  0.9  1160.1  1.0X
-Data column - Json  10860  11264  571  1.4  690.5  1.7X
-Data column - Parquet Vectorized: DataPageV1  223  274  26  70.6  14.2  81.9X
-Data column - Parquet Vectorized: DataPageV2  537  559  23  29.3  34.1  34.0X
-Data column - Parquet MR: DataPageV1  2411  2517  150  6.5  153.3  7.6X
-Data column - Parquet MR: DataPageV2  2299  2356  81  6.8  146.2  7.9X
-Data column - ORC Vectorized  417  433  11  37.7  26.5  43.8X
-Data column - ORC MR  2107  2178  101  7.5  134.0  8.7X
-Partition column - CSV  6090  6186  136  2.6  387.2  3.0X
-Partition column - Json  9479  9603  176  1.7  602.7  1.9X
-Partition column - Parquet Vectorized: DataPageV1  49  69  28  322.0  3.1  373.6X
-Partition column - Parquet Vectorized: DataPageV2  49  63  23  322.1  3.1  373.7X
-Partition column - Parquet MR: DataPageV1  1200  1225  36  13.1  76.3  15.2X
-Partition column - Parquet MR: DataPageV2  1199  1240  57  13.1  76.3  15.2X
-Partition column - ORC Vectorized  53  77  26  295.0  3.4  342.2X
-Partition column - ORC MR  1287  1346  83  12.2  81.8  14.2X
-Both columns - CSV  17671  18140  663  0.9  1123.5  1.0X
-Both columns - Json  11675  12167  696  1.3  742.3  1.6X
-Both columns - Parquet Vectorized: DataPageV1  298  303  9  52.9  18.9  61.3X
-Both columns - Parquet Vectorized: DataPageV2  541  580  36  29.1  34.4  33.7X
-Both columns - Parquet MR: DataPageV1  2448  2491  60  6.4  155.6  7.5X
-Both columns - Parquet MR: DataPageV2  2303  2352  69  6.8  146.4  7.9X
-Both columns - ORC Vectorized  385  406  25  40.9  24.5  47.4X
-Both columns - ORC MR  2118  2202  120  7.4  134.6  8.6X
+Data column - CSV  23787  23788  2  0.7  1512.3  1.0X
+Data column - Json  13993  14011  25  1.1  889.7  1.7X
+Data column - Parquet Vectorized: DataPageV1  184  235  36  85.4  11.7  129.2X
+Data column - Parquet Vectorized: DataPageV2  531  542  15  29.6  33.7  44.8X
+Data column - Parquet MR: DataPageV1  2539  2547  13  6.2  161.4  9.4X
+Data column - Parquet MR: DataPageV2  2299  2301  3  6.8  146.2  10.3X
+Data column - ORC Vectorized  379  403  23  41.5  24.1  62.8X
+Data column - ORC MR  2047  2070  33  7.7  130.1  11.6X
+Partition column - CSV  6834  6835  1  2.3  434.5  3.5X
+Partition column - Json  11444  11478  49  1.4  727.6  2.1X
+Partition column - Parquet Vectorized: DataPageV1  51  71  22  308.6  3.2  466.7X
+Partition column - Parquet Vectorized: DataPageV2  51  61  16  310.5  3.2  469.5X
+Partition column - Parquet MR: DataPageV1  1203  1214  15  13.1  76.5  19.8X
+Partition column - Parquet MR: DataPageV2  1210  1224  20  13.0  76.9  19.7X
+Partition column - ORC Vectorized  52  67  14  303.1  3.3  458.4X
+Partition column - ORC MR  1338  1342  5  11.8  85.1  17.8X
+Both columns - CSV  24051  24052  2  0.7  1529.1  1.0X
+Both columns - Json  15016  15030  20  1.0  954.7  1.6X
+Both columns - Parquet Vectorized: DataPageV1  235  269  27  66.9  15.0  101.2X
+Both columns - Parquet Vectorized: DataPageV2  563  617  60  27.9  35.8  42.2X
+Both columns - Parquet MR: DataPageV1  2525  2555  43  6.2  160.5  9.4X
+Both columns - Parquet MR: DataPageV2  2256  2267  15  7.0  143.5  10.5X
+Both columns - ORC Vectorized  407  454  51  38.7  25.9  58.5X
+Both columns - ORC MR  2153  2155  2  7.3  136.9  11.0X
================================================================================================
String with Nulls Scan
================================================================================================
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
String with Nulls Scan (0.0%):  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  7966  12723  2892  1.3  759.7  1.0X
-SQL Json  9897  10008  157  1.1  943.9  0.8X
-SQL Parquet Vectorized: DataPageV1  1176  1264  125  8.9  112.1  6.8X
-SQL Parquet Vectorized: DataPageV2  2224  2326  144  4.7  212.1  3.6X
-SQL Parquet MR: DataPageV1  3431  3483  73  3.1  327.2  2.3X
-SQL Parquet MR: DataPageV2  3845  4043  280  2.7  366.7  2.1X
-ParquetReader Vectorized: DataPageV1  1055  1056  2  9.9  100.6  7.6X
-ParquetReader Vectorized: DataPageV2  2093  2119  37  5.0  199.6  3.8X
-SQL ORC Vectorized  1129  1217  125  9.3  107.7  7.1X
-SQL ORC MR  2931  2982  72  3.6  279.5  2.7X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  11812  11849  53  0.9  1126.4  1.0X
+SQL Json  11454  11467  18  0.9  1092.3  1.0X
+SQL Parquet Vectorized: DataPageV1  1250  1276  37  8.4  119.2  9.5X
+SQL Parquet Vectorized: DataPageV2  2248  2261  17  4.7  214.4  5.3X
+SQL Parquet MR: DataPageV1  3629  3630  1  2.9  346.1  3.3X
+SQL Parquet MR: DataPageV2  3929  3934  6  2.7  374.7  3.0X
+ParquetReader Vectorized: DataPageV1  921  922  2  11.4  87.8  12.8X
+ParquetReader Vectorized: DataPageV2  1890  1890  0  5.5  180.3  6.2X
+SQL ORC Vectorized  1079  1105  36  9.7  102.9  10.9X
+SQL ORC MR  3042  3070  40  3.4  290.1  3.9X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
String with Nulls Scan (50.0%):  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  6338  6508  240  1.7  604.4  1.0X
-SQL Json  7149  7247  138  1.5  681.8  0.9X
-SQL Parquet Vectorized: DataPageV1  937  984  45  11.2  89.3  6.8X
-SQL Parquet Vectorized: DataPageV2  1582  1608  37  6.6  150.9  4.0X
-SQL Parquet MR: DataPageV1  2525  2721  277  4.2  240.8  2.5X
-SQL Parquet MR: DataPageV2  2969  2974  7  3.5  283.1  2.1X
-ParquetReader Vectorized: DataPageV1  933  940  12  11.2  88.9  6.8X
-ParquetReader Vectorized: DataPageV2  1535  1549  20  6.8  146.4  4.1X
-SQL ORC Vectorized  1144  1204  86  9.2  109.1  5.5X
-SQL ORC MR  2816  2822  8  3.7  268.6  2.3X
-
-OpenJDK 64-Bit Server VM 11.0.13+8-LTS on Linux 5.11.0-1025-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+SQL CSV  8399  8410  16  1.2  801.0  1.0X
+SQL Json  8892  8905  18  1.2  848.0  0.9X
+SQL Parquet Vectorized: DataPageV1  1065  1092  38  9.8  101.6  7.9X
+SQL Parquet Vectorized: DataPageV2  1747  1747  0  6.0  166.6  4.8X
+SQL Parquet MR: DataPageV1  2718  2719  1  3.9  259.2  3.1X
+SQL Parquet MR: DataPageV2  2955  2964  12  3.5  281.8  2.8X
+ParquetReader Vectorized: DataPageV1  1082  1084  3  9.7  103.2  7.8X
+ParquetReader Vectorized: DataPageV2  1707  1713  9  6.1  162.8  4.9X
+SQL ORC Vectorized  1345  1357  17  7.8  128.3  6.2X
+SQL ORC MR  3012  3046  47  3.5  287.3  2.8X
+
+OpenJDK 64-Bit Server VM 11.0.14+9-LTS on Linux 5.11.0-1028-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
String with Nulls Scan (95.0%):  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV  4443  4504  86  2.4  423.7  1.0X
-SQL Json  4528  4563  49  2.3  431.8  1.0X
-SQL Parquet Vectorized: DataPageV1  213  233  15  49.2  20.3  20.8X
-SQL Parquet Vectorized: DataPageV2  267  294  22  39.3  25.4  16.7X
-SQL Parquet MR: DataPageV1  1691  1700  13  6.2  161.2  2.6X
-SQL Parquet MR: DataPageV2  1515  1565  70  6.9  144.5  2.9X
-ParquetReader Vectorized: DataPageV1  228  231  2  46.0  21.7  19.5X
-ParquetReader Vectorized: DataPageV2  285  296  9  36.8  27.1  15.6X
-SQL ORC Vectorized  369  425  82  28.4  35.2  12.1X
-SQL ORC MR  1457  1463  9  7.2  138.9  3.0X
+SQL CSV  6169  6176  10  1.7  588.3  1.0X
+SQL Json  5352  5376  35  2.0  510.4  1.2X
+SQL Parquet Vectorized: DataPageV1  248  255  7  42.3  23.6  24.9X
+SQL Parquet Vectorized: DataPageV2  364  372  10  28.8  34.7  17.0X
+SQL Parquet MR: DataPageV1  1624  1626  3  6.5  154.9  3.8X
+SQL Parquet MR: DataPageV2  1520  1526  8  6.9  145.0  4.1X
+ParquetReader Vectorized: DataPageV1  259  262  1  40.4  24.7  23.8X
+ParquetReader Vectorized: DataPageV2  376  378  2  27.9  35.9  16.4X
Review comment:
Hmm, this still shows a degradation (DataPageV2 at 69% of DataPageV1's rate, vs. 80% originally). I wonder why
the vectorized reader doesn't help here.
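
The comment does not say which figures the two percentages come from. One reading that matches the numbers is the DataPageV2 rate as a fraction of the DataPageV1 rate for the "ParquetReader Vectorized" rows in the 95.0% nulls table just above. A minimal sketch of that arithmetic, with the Rate(M/s) values copied from the diff (the helper name `relative_rate` is made up for illustration):

```python
def relative_rate(v2_rate: float, v1_rate: float) -> float:
    """Return the DataPageV2 rate as a fraction of the DataPageV1 rate."""
    return v2_rate / v1_rate


# Rate(M/s) for "ParquetReader Vectorized", String with Nulls Scan (95.0%).
# Assumption: these are the rows the review comment refers to.
old = relative_rate(v2_rate=36.8, v1_rate=46.0)  # removed ("-") rows
new = relative_rate(v2_rate=27.9, v1_rate=40.4)  # added ("+") rows

print(f"old: {old:.0%}, new: {new:.0%}")  # prints "old: 80%, new: 69%"
```

Under this reading, DataPageV2 fell further behind DataPageV1 in the updated results, although the two runs were taken on different JDK builds and CPUs.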
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]