sadikovi commented on pull request #34596:
URL: https://github.com/apache/spark/pull/34596#issuecomment-978913178


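   The benchmark results are roughly the same with and without the PR
changes; the best times differ by a few percent in both directions, which is
consistent with run-to-run variability. I think we are good here, no separate
option is required.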
   
   Without the PR changes (bb9e1d92d931a064c52cbc4cc84eaa32528809f0):
   
   ```
   [info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
   [info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
   [info] Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] Create a dataset of timestamps                     1170           1233          60          8.5         117.0       1.0X
   [info] to_csv(timestamp)                                  9771           9838          58          1.0         977.1       0.1X
   [info] write timestamps to files                          8752           8790          34          1.1         875.2       0.1X
   [info] Create a dataset of dates                          1330           1341           9          7.5         133.0       0.9X
   [info] to_csv(date)                                       6502           6518          14          1.5         650.2       0.2X
   [info] write dates to files                               5487           5503          14          1.8         548.7       0.2X
   
   [info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
   [info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
   [info] Read dates and timestamps:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] read timestamp text from files                     1508           1535          26          6.6         150.8       1.0X
   [info] read timestamps from files                        24018          24608         531          0.4        2401.8       0.1X
   [info] infer timestamps from files                       51043          51171         111          0.2        5104.3       0.0X
   [info] read date text from files                          1437           1451          15          7.0         143.7       1.0X
   [info] read date from files                               9391           9433          51          1.1         939.1       0.2X
   [info] infer date from files                             21983          22029          77          0.5        2198.3       0.1X
   [info] timestamp strings                                  2488           2519          46          4.0         248.8       0.6X
   [info] parse timestamps from Dataset[String]             27073          27108          33          0.4        2707.3       0.1X
   [info] infer timestamps from Dataset[String]             53325          53399         106          0.2        5332.5       0.0X
   [info] date strings                                       2802           2809           6          3.6         280.2       0.5X
   [info] parse dates from Dataset[String]                  11487          11577          96          0.9        1148.7       0.1X
   [info] from_csv(timestamp)                               25019          25068          55          0.4        2501.9       0.1X
   [info] from_csv(date)                                    10394          10431          39          1.0        1039.4       0.1X
   ```
   
   With the PR changes:
   ```
   [info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
   [info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
   [info] Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] Create a dataset of timestamps                     1164           1215          44          8.6         116.4       1.0X
   [info] to_csv(timestamp)                                  9733           9831         125          1.0         973.3       0.1X
   [info] write timestamps to files                          8810           8832          22          1.1         881.0       0.1X
   [info] Create a dataset of dates                          1339           1348           9          7.5         133.9       0.9X
   [info] to_csv(date)                                       6511           6519          12          1.5         651.1       0.2X
   [info] write dates to files                               5488           5500          11          1.8         548.8       0.2X
   
   
   [info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
   [info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
   [info] Read dates and timestamps:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] read timestamp text from files                     1479           1488          10          6.8         147.9       1.0X
   [info] read timestamps from files                        24271          24680         412          0.4        2427.1       0.1X
   [info] infer timestamps from files                       50436          50497          54          0.2        5043.6       0.0X
   [info] read date text from files                          1422           1441          25          7.0         142.2       1.0X
   [info] read date from files                               9725           9795          63          1.0         972.5       0.2X
   [info] infer date from files                             21550          21572          28          0.5        2155.0       0.1X
   [info] timestamp strings                                  2483           2528          39          4.0         248.3       0.6X
   [info] parse timestamps from Dataset[String]             27110          27199          82          0.4        2711.0       0.1X
   [info] infer timestamps from Dataset[String]             53590          53720         147          0.2        5359.0       0.0X
   [info] date strings                                       2635           2644          15          3.8         263.5       0.6X
   [info] parse dates from Dataset[String]                  11662          11714          56          0.9        1166.2       0.1X
   [info] from_csv(timestamp)                               25599          25715         139          0.4        2559.9       0.1X
   [info] from_csv(date)                                    10838          10885          41          0.9        1083.8       0.1X
   [success] Total time: 1164 s (19:24), completed Nov 25, 2021 7:00:03 AM
   
   ```
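
For anyone who wants to check the comparison, here is a minimal Python sketch (not part of the PR or of Spark's benchmark tooling) that computes the relative change in Best Time(ms) per case. The numbers are copied from the two runs above; the names `best_ms`, `pct_change`, `changes` and `largest` are only illustrative.

```python
# Best Time (ms) per benchmark case, copied from the two runs above:
# (without the PR changes, with the PR changes).
best_ms = {
    "Create a dataset of timestamps": (1170, 1164),
    "to_csv(timestamp)": (9771, 9733),
    "write timestamps to files": (8752, 8810),
    "Create a dataset of dates": (1330, 1339),
    "to_csv(date)": (6502, 6511),
    "write dates to files": (5487, 5488),
    "read timestamp text from files": (1508, 1479),
    "read timestamps from files": (24018, 24271),
    "infer timestamps from files": (51043, 50436),
    "read date text from files": (1437, 1422),
    "read date from files": (9391, 9725),
    "infer date from files": (21983, 21550),
    "timestamp strings": (2488, 2483),
    "parse timestamps from Dataset[String]": (27073, 27110),
    "infer timestamps from Dataset[String]": (53325, 53590),
    "date strings": (2802, 2635),
    "parse dates from Dataset[String]": (11487, 11662),
    "from_csv(timestamp)": (25019, 25599),
    "from_csv(date)": (10394, 10838),
}


def pct_change(before, after):
    """Relative change of `after` versus `before`, in percent."""
    return (after - before) / before * 100.0


# Positive values mean the run with the PR changes was slower.
changes = {name: pct_change(b, a) for name, (b, a) in best_ms.items()}

# Print cases from the biggest speed-up to the biggest slow-down.
for name, delta in sorted(changes.items(), key=lambda kv: kv[1]):
    print(f"{name:<40} {delta:+6.2f}%")

# Case with the largest change in either direction.
largest = max(changes, key=lambda n: abs(changes[n]))
print(f"largest absolute change: {largest} ({changes[largest]:+.2f}%)")
```

On these numbers the changes fall on both sides of zero, and the largest one is `date strings` at roughly -6%. This only compares single best times from one run each, so it is a rough sanity check and not a statistical test.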


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


