dongjoon-hyun commented on code in PR #46266:
URL: https://github.com/apache/spark/pull/46266#discussion_r1585536429
##########
sql/core/benchmarks/DataSourceReadBenchmark-jdk21-results.txt:
##########
@@ -2,430 +2,430 @@
SQL Single Numeric Column Scan
================================================================================================
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 7930 7984
77 2.0 504.2 1.0X
-SQL Json 8135 8250
163 1.9 517.2 1.0X
-SQL Parquet Vectorized: DataPageV1 76 87
9 205.7 4.9 103.7X
-SQL Parquet Vectorized: DataPageV2 55 65
8 285.3 3.5 143.8X
-SQL Parquet MR: DataPageV1 1785 1787
3 8.8 113.5 4.4X
-SQL Parquet MR: DataPageV2 1643 1680
52 9.6 104.5 4.8X
-SQL ORC Vectorized 114 124
10 138.2 7.2 69.7X
-SQL ORC MR 1494 1496
3 10.5 95.0 5.3X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 9759 9826
94 1.6 620.5 1.0X
+SQL Json 8157 8194
53 1.9 518.6 1.2X
+SQL Parquet Vectorized: DataPageV1 86 99
11 183.5 5.4 113.9X
+SQL Parquet Vectorized: DataPageV2 112 120
6 140.8 7.1 87.4X
+SQL Parquet MR: DataPageV1 1775 1776
1 8.9 112.9 5.5X
+SQL Parquet MR: DataPageV2 1745 1749
5 9.0 110.9 5.6X
+SQL ORC Vectorized 119 133
8 132.4 7.6 82.1X
+SQL ORC MR 1464 1464
0 10.7 93.1 6.7X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single BOOLEAN Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 35 36
1 449.0 2.2 1.0X
-ParquetReader Vectorized: DataPageV2 25 26
1 638.4 1.6 1.4X
-ParquetReader Vectorized -> Row: DataPageV1 29 30
1 548.0 1.8 1.2X
-ParquetReader Vectorized -> Row: DataPageV2 18 20
2 851.6 1.2 1.9X
+ParquetReader Vectorized: DataPageV1 94 96
3 167.7 6.0 1.0X
+ParquetReader Vectorized: DataPageV2 113 115
2 139.1 7.2 0.8X
+ParquetReader Vectorized -> Row: DataPageV1 75 75
1 210.9 4.7 1.3X
+ParquetReader Vectorized -> Row: DataPageV2 95 96
1 166.2 6.0 1.0X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single TINYINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 9218 9237
26 1.7 586.1 1.0X
-SQL Json 8885 8900
21 1.8 564.9 1.0X
-SQL Parquet Vectorized: DataPageV1 74 86
9 212.6 4.7 124.6X
-SQL Parquet Vectorized: DataPageV2 74 88
12 211.4 4.7 123.9X
-SQL Parquet MR: DataPageV1 1832 1837
8 8.6 116.5 5.0X
-SQL Parquet MR: DataPageV2 1761 1763
3 8.9 112.0 5.2X
-SQL ORC Vectorized 104 114
11 150.9 6.6 88.5X
-SQL ORC MR 1523 1560
52 10.3 96.8 6.1X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 9826 9827
2 1.6 624.7 1.0X
+SQL Json 9154 9168
20 1.7 582.0 1.1X
+SQL Parquet Vectorized: DataPageV1 98 107
8 161.1 6.2 100.7X
+SQL Parquet Vectorized: DataPageV2 95 107
11 164.7 6.1 102.9X
+SQL Parquet MR: DataPageV1 1876 1883
9 8.4 119.3 5.2X
+SQL Parquet MR: DataPageV2 1841 1849
11 8.5 117.1 5.3X
+SQL ORC Vectorized 109 120
9 144.5 6.9 90.3X
+SQL ORC MR 1600 1601
2 9.8 101.7 6.1X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single TINYINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 125 138
14 125.8 7.9 1.0X
-ParquetReader Vectorized: DataPageV2 125 137
11 126.2 7.9 1.0X
-ParquetReader Vectorized -> Row: DataPageV1 44 47
5 355.9 2.8 2.8X
-ParquetReader Vectorized -> Row: DataPageV2 44 47
5 357.8 2.8 2.8X
+ParquetReader Vectorized: DataPageV1 76 78
2 207.9 4.8 1.0X
+ParquetReader Vectorized: DataPageV2 76 78
2 208.0 4.8 1.0X
+ParquetReader Vectorized -> Row: DataPageV1 45 46
2 351.2 2.8 1.7X
+ParquetReader Vectorized -> Row: DataPageV2 44 45
1 353.5 2.8 1.7X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 9794 9896
144 1.6 622.7 1.0X
-SQL Json 9146 9163
24 1.7 581.5 1.1X
-SQL Parquet Vectorized: DataPageV1 109 117
7 144.1 6.9 89.7X
-SQL Parquet Vectorized: DataPageV2 126 136
5 124.8 8.0 77.7X
-SQL Parquet MR: DataPageV1 2090 2102
16 7.5 132.9 4.7X
-SQL Parquet MR: DataPageV2 1898 1907
14 8.3 120.6 5.2X
-SQL ORC Vectorized 138 149
14 114.1 8.8 71.0X
-SQL ORC MR 1574 1605
43 10.0 100.1 6.2X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 9858 9859
1 1.6 626.8 1.0X
+SQL Json 9321 9334
18 1.7 592.6 1.1X
+SQL Parquet Vectorized: DataPageV1 115 130
17 137.0 7.3 85.9X
+SQL Parquet Vectorized: DataPageV2 135 149
17 116.9 8.6 73.2X
+SQL Parquet MR: DataPageV1 2192 2199
10 7.2 139.4 4.5X
+SQL Parquet MR: DataPageV2 2003 2026
32 7.9 127.4 4.9X
+SQL ORC Vectorized 143 153
17 109.9 9.1 68.9X
+SQL ORC MR 1944 1951
11 8.1 123.6 5.1X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 140 161
67 112.2 8.9 1.0X
-ParquetReader Vectorized: DataPageV2 163 166
3 96.4 10.4 0.9X
-ParquetReader Vectorized -> Row: DataPageV1 139 140
2 113.1 8.8 1.0X
-ParquetReader Vectorized -> Row: DataPageV2 166 182
10 94.8 10.6 0.8X
+ParquetReader Vectorized: DataPageV1 140 147
8 112.7 8.9 1.0X
+ParquetReader Vectorized: DataPageV2 173 177
3 91.0 11.0 0.8X
+ParquetReader Vectorized -> Row: DataPageV1 134 141
8 117.2 8.5 1.0X
+ParquetReader Vectorized -> Row: DataPageV2 165 176
12 95.2 10.5 0.8X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single INT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 11232 11256
33 1.4 714.1 1.0X
-SQL Json 9725 9740
22 1.6 618.3 1.2X
-SQL Parquet Vectorized: DataPageV1 84 97
15 187.8 5.3 134.1X
-SQL Parquet Vectorized: DataPageV2 162 181
13 96.8 10.3 69.1X
-SQL Parquet MR: DataPageV1 1882 1900
26 8.4 119.6 6.0X
-SQL Parquet MR: DataPageV2 1898 1899
2 8.3 120.7 5.9X
-SQL ORC Vectorized 148 157
13 106.1 9.4 75.7X
-SQL ORC MR 1667 1674
10 9.4 106.0 6.7X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 11219 11235
22 1.4 713.3 1.0X
+SQL Json 9660 9667
9 1.6 614.2 1.2X
+SQL Parquet Vectorized: DataPageV1 122 126
4 129.1 7.7 92.1X
+SQL Parquet Vectorized: DataPageV2 178 195
17 88.5 11.3 63.1X
+SQL Parquet MR: DataPageV1 2007 2031
33 7.8 127.6 5.6X
+SQL Parquet MR: DataPageV2 2060 2084
34 7.6 131.0 5.4X
+SQL ORC Vectorized 175 184
13 89.8 11.1 64.0X
+SQL ORC MR 1804 1844
56 8.7 114.7 6.2X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single INT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 130 140
11 121.1 8.3 1.0X
-ParquetReader Vectorized: DataPageV2 213 230
10 74.0 13.5 0.6X
-ParquetReader Vectorized -> Row: DataPageV1 128 132
6 122.9 8.1 1.0X
-ParquetReader Vectorized -> Row: DataPageV2 222 226
5 70.7 14.1 0.6X
+ParquetReader Vectorized: DataPageV1 150 157
6 104.7 9.6 1.0X
+ParquetReader Vectorized: DataPageV2 212 226
9 74.3 13.5 0.7X
+ParquetReader Vectorized -> Row: DataPageV1 164 170
6 95.8 10.4 0.9X
+ParquetReader Vectorized -> Row: DataPageV2 242 246
4 64.9 15.4 0.6X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single BIGINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 14617 14690
103 1.1 929.3 1.0X
-SQL Json 10772 10780
11 1.5 684.9 1.4X
-SQL Parquet Vectorized: DataPageV1 118 132
13 133.4 7.5 124.0X
-SQL Parquet Vectorized: DataPageV2 268 300
20 58.7 17.0 54.5X
-SQL Parquet MR: DataPageV1 2289 2314
36 6.9 145.5 6.4X
-SQL Parquet MR: DataPageV2 1993 1995
3 7.9 126.7 7.3X
-SQL ORC Vectorized 215 224
12 73.1 13.7 68.0X
-SQL ORC MR 1840 1851
17 8.6 117.0 7.9X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 11095 11134
54 1.4 705.4 1.0X
+SQL Json 9688 9701
18 1.6 616.0 1.1X
+SQL Parquet Vectorized: DataPageV1 293 297
4 53.7 18.6 37.9X
+SQL Parquet Vectorized: DataPageV2 225 253
23 69.9 14.3 49.3X
+SQL Parquet MR: DataPageV1 2423 2437
20 6.5 154.0 4.6X
+SQL Parquet MR: DataPageV2 2041 2055
19 7.7 129.8 5.4X
+SQL ORC Vectorized 165 192
24 95.3 10.5 67.2X
+SQL ORC MR 1742 1753
15 9.0 110.8 6.4X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single BIGINT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 167 179
12 94.0 10.6 1.0X
-ParquetReader Vectorized: DataPageV2 324 331
4 48.5 20.6 0.5X
-ParquetReader Vectorized -> Row: DataPageV1 181 185
5 87.1 11.5 0.9X
-ParquetReader Vectorized -> Row: DataPageV2 322 331
6 48.8 20.5 0.5X
+ParquetReader Vectorized: DataPageV1 308 317
8 51.0 19.6 1.0X
+ParquetReader Vectorized: DataPageV2 276 283
5 56.9 17.6 1.1X
+ParquetReader Vectorized -> Row: DataPageV1 317 321
4 49.6 20.2 1.0X
+ParquetReader Vectorized -> Row: DataPageV2 271 278
7 58.1 17.2 1.1X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single FLOAT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 11070 11076
9 1.4 703.8 1.0X
-SQL Json 11574 11602
39 1.4 735.9 1.0X
-SQL Parquet Vectorized: DataPageV1 86 97
15 182.7 5.5 128.6X
-SQL Parquet Vectorized: DataPageV2 94 103
5 166.9 6.0 117.4X
-SQL Parquet MR: DataPageV1 2065 2130
93 7.6 131.3 5.4X
-SQL Parquet MR: DataPageV2 2157 2169
17 7.3 137.1 5.1X
-SQL ORC Vectorized 266 288
20 59.0 16.9 41.5X
-SQL ORC MR 1740 1780
57 9.0 110.6 6.4X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 11177 11185
13 1.4 710.6 1.0X
+SQL Json 11229 11252
32 1.4 713.9 1.0X
+SQL Parquet Vectorized: DataPageV1 83 97
15 189.6 5.3 134.7X
+SQL Parquet Vectorized: DataPageV2 82 96
13 191.1 5.2 135.8X
+SQL Parquet MR: DataPageV1 2029 2055
36 7.8 129.0 5.5X
+SQL Parquet MR: DataPageV2 1986 2014
39 7.9 126.3 5.6X
+SQL ORC Vectorized 229 241
17 68.7 14.6 48.8X
+SQL ORC MR 1751 1763
18 9.0 111.3 6.4X
+
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parquet Reader Single FLOAT Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1 144 144
1 109.5 9.1 1.0X
-ParquetReader Vectorized: DataPageV2 140 142
1 112.1 8.9 1.0X
-ParquetReader Vectorized -> Row: DataPageV1 149 156
6 105.6 9.5 1.0X
-ParquetReader Vectorized -> Row: DataPageV2 148 153
5 106.2 9.4 1.0X
+ParquetReader Vectorized: DataPageV1 134 141
7 117.5 8.5 1.0X
+ParquetReader Vectorized: DataPageV2 150 159
8 105.0 9.5 0.9X
+ParquetReader Vectorized -> Row: DataPageV1 143 150
7 109.9 9.1 0.9X
+ParquetReader Vectorized -> Row: DataPageV2 143 152
15 109.9 9.1 0.9X
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
SQL Single DOUBLE Column Scan: Best Time(ms) Avg Time(ms)
Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-SQL CSV 14612 14718
150 1.1 929.0 1.0X
-SQL Json 14802 14812
14 1.1 941.1 1.0X
-SQL Parquet Vectorized: DataPageV1 126 144
15 124.3 8.0 115.5X
-SQL Parquet Vectorized: DataPageV2 161 167
5 97.4 10.3 90.5X
-SQL Parquet MR: DataPageV1 2239 2249
14 7.0 142.4 6.5X
-SQL Parquet MR: DataPageV2 2125 2169
63 7.4 135.1 6.9X
-SQL ORC Vectorized 352 366
11 44.6 22.4 41.5X
-SQL ORC MR 1823 1824
1 8.6 115.9 8.0X
-
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+SQL CSV 11485 11545
86 1.4 730.2 1.0X
+SQL Json 11591 11597
8 1.4 737.0 1.0X
+SQL Parquet Vectorized: DataPageV1 269 288
18 58.5 17.1 42.7X
Review Comment:
This one also shows a slightly different ratio between `DataPageV1` and `DataPageV2`.
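
For reference, a minimal sketch of the kind of ratio in question, using the `SQL Parquet Vectorized` Best Time(ms) values for the BIGINT scan from the hunk above. That table shows both page versions before and after, while the DOUBLE table is cut off before its new `DataPageV2` row. The dictionary layout and the helper name `v2_over_v1` are illustrative only and are not part of the benchmark code.

```python
# Best Time (ms) for "SQL Parquet Vectorized" in the BIGINT column scan,
# copied from the diff hunk above as (DataPageV1, DataPageV2).
best_ms = {
    "21.0.2+13-LTS": (118, 268),  # removed ("-") rows
    "21.0.3+9-LTS": (293, 225),   # added ("+") rows
}


def v2_over_v1(v1_ms, v2_ms):
    """Return DataPageV2 time divided by DataPageV1 time (>1 means V2 is slower)."""
    return v2_ms / v1_ms


ratios = {jdk: round(v2_over_v1(*times), 2) for jdk, times in best_ms.items()}

for jdk, ratio in ratios.items():
    print(f"{jdk}: DataPageV2 / DataPageV1 = {ratio:.2f}")
# 21.0.2+13-LTS: DataPageV2 / DataPageV1 = 2.27
# 21.0.3+9-LTS: DataPageV2 / DataPageV1 = 0.77
```

A ratio that moves across 1.0 between runs, as it does here, means the relative order of the two page versions flipped rather than both scaling together.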
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]