HyukjinKwon commented on pull request #31858:
URL: https://github.com/apache/spark/pull/31858#issuecomment-803736270
```diff
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Parsing quoted values:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-One quoted string                                 30131          31843        1489          0.0      602627.2       1.0X
+One quoted string                                 24185          24195          10          0.0      483694.2       1.0X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Wide rows with 1000 columns:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Select 1000 columns                               66630          68022        1345          0.0       66630.3       1.0X
-Select 100 columns                                27846          27948          95          0.0       27846.1       2.4X
-Select one column                                 23184          23574         415          0.0       23184.5       2.9X
-count()                                            6179           6272         151          0.2        6179.1      10.8X
-Select 100 columns, one bad input field           45030          46637        1421          0.0       45029.5       1.5X
-Select 100 columns, corrupt record field          54971          56153        1428          0.0       54971.4       1.2X
+Select 1000 columns                               61793          62388         532          0.0       61793.4       1.0X
+Select 100 columns                                21958          21993          34          0.0       21957.9       2.8X
+Select one column                                 18215          18515         505          0.1       18215.0       3.4X
+count()                                            5865           6168         296          0.2        5865.1      10.5X
+Select 100 columns, one bad input field           39638          39739         124          0.0       39637.5       1.6X
+Select 100 columns, corrupt record field          47290          48133         741          0.0       47290.0       1.3X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Count a dataset with 10 columns:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Select 10 columns + count()                       10923          11008          97          0.9        1092.3       1.0X
-Select 1 column + count()                          7411           7567         138          1.3         741.1       1.5X
-count()                                            2231           2281          43          4.5         223.1       4.9X
+Select 10 columns + count()                        9935          10460         461          1.0         993.5       1.0X
+Select 1 column + count()                          6786           7179         342          1.5         678.6       1.5X
+count()                                            2281           2458         165          4.4         228.1       4.4X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Write dates and timestamps:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-Create a dataset of timestamps                      835            874          34         12.0          83.5       1.0X
-to_csv(timestamp)                                  7808           8024         191          1.3         780.8       0.1X
-write timestamps to files                          6935           7201         239          1.4         693.5       0.1X
-Create a dataset of dates                           947            980          28         10.6          94.7       0.9X
-to_csv(date)                                       5058           5118          54          2.0         505.8       0.2X
-write dates to files                               3964           4026          62          2.5         396.4       0.2X
+Create a dataset of timestamps                      812            826          14         12.3          81.2       1.0X
+to_csv(timestamp)                                  7548           7764         192          1.3         754.8       0.1X
+write timestamps to files                          7052           7193         141          1.4         705.2       0.1X
+Create a dataset of dates                           897            909          13         11.1          89.7       0.9X
+to_csv(date)                                       4778           4787          10          2.1         477.8       0.2X
+write dates to files                               3853           3891          33          2.6         385.3       0.2X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Read dates and timestamps:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-read timestamp text from files                     1272           1296          34          7.9         127.2       1.0X
-read timestamps from files                        22376          22850         429          0.4        2237.6       0.1X
-infer timestamps from files                       44109          44455         345          0.2        4410.9       0.0X
-read date text from files                          1127           1136           8          8.9         112.7       1.1X
-read date from files                              10840          11082         245          0.9        1084.0       0.1X
-infer date from files                             13967          14293         424          0.7        1396.7       0.1X
-timestamp strings                                  1855           1945          91          5.4         185.5       0.7X
-parse timestamps from Dataset[String]             23368          23580         185          0.4        2336.8       0.1X
-infer timestamps from Dataset[String]             46081          46810         633          0.2        4608.1       0.0X
-date strings                                       1867           1962          93          5.4         186.7       0.7X
-parse dates from Dataset[String]                  12308          12349          36          0.8        1230.8       0.1X
-from_csv(timestamp)                               23333          24201        1401          0.4        2333.3       0.1X
-from_csv(date)                                    11734          11898         142          0.9        1173.4       0.1X
+read timestamp text from files                     1259           1262           4          7.9         125.9       1.0X
+read timestamps from files                        20030          20105          80          0.5        2003.0       0.1X
+infer timestamps from files                       39621          39674          61          0.3        3962.1       0.0X
+read date text from files                          1039           1068          40          9.6         103.9       1.2X
+read date from files                               9352           9363          10          1.1         935.2       0.1X
+infer date from files                             11465          11485          23          0.9        1146.5       0.1X
+timestamp strings                                  1759           1812          59          5.7         175.9       0.7X
+parse timestamps from Dataset[String]             20806          20858          75          0.5        2080.6       0.1X
+infer timestamps from Dataset[String]             40537          40821         258          0.2        4053.7       0.0X
+date strings                                       1808           1816          12          5.5         180.8       0.7X
+parse dates from Dataset[String]                  12080          12311         245          0.8        1208.0       0.1X
+from_csv(timestamp)                               20120          21503        1224          0.5        2012.0       0.1X
+from_csv(date)                                    10607          10768         246          0.9        1060.7       0.1X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Filters pushdown:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
-w/o filters                                       12952          13053         157          0.0      129515.9       1.0X
-pushdown disabled                                 12794          12820          42          0.0      127939.7       1.0X
-w/ filters                                         1141           1181          35          0.1       11414.2      11.3X
+w/o filters                                       13109          13249         151          0.0      131086.4       1.0X
+pushdown disabled                                 12951          12994          63          0.0      129509.7       1.0X
+w/ filters                                         1095           1113          15          0.1       10953.7      12.0X
```
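To read the diff as relative changes rather than raw milliseconds, the before/after "Best Time(ms)" values can be compared directly. The snippet below is only a minimal sketch: `pct_change` and `best_times` are illustrative names that are not part of the PR, and the four rows are a hand-picked sample copied from the results above, not the full benchmark.

```python
def pct_change(before_ms: float, after_ms: float) -> float:
    """Percent change in best time; a negative value means the run got faster."""
    return (after_ms - before_ms) / before_ms * 100.0


# (before, after) Best Time(ms) pairs copied from the "-" and "+" rows of the diff.
# This is a sample of rows chosen for illustration only.
best_times = {
    "One quoted string": (30131, 24185),
    "Select 1000 columns": (66630, 61793),
    "read timestamps from files": (22376, 20030),
    "w/ filters": (1141, 1095),
}

changes = {name: pct_change(before, after) for name, (before, after) in best_times.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Since these are single local runs with noticeable standard deviation in some rows, small percentage differences are probably best read as noise rather than as a measured effect.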