cloud-fan commented on a change in pull request #26190: [SPARK-29532][SQL] simplify interval string parsing URL: https://github.com/apache/spark/pull/26190#discussion_r337974040
########## File path: sql/core/benchmarks/IntervalBenchmark-results.txt ########## @@ -1,25 +1,23 @@ -Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15 -Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz +Java HotSpot(TM) 64-Bit Server VM 1.8.0_161-b12 on Mac OS X 10.14 +Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz cast strings to intervals: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -string w/ interval 420 435 18 2.4 419.8 1.0X -string w/o interval 359 365 10 2.8 358.7 1.2X -1 units w/ interval 752 759 8 1.3 752.0 0.6X -1 units w/o interval 762 766 4 1.3 762.0 0.6X -2 units w/ interval 961 970 8 1.0 960.7 0.4X -2 units w/o interval 970 976 9 1.0 970.2 0.4X -3 units w/ interval 1130 1136 7 0.9 1130.4 0.4X -3 units w/o interval 1150 1158 9 0.9 1150.3 0.4X -4 units w/ interval 1333 1336 3 0.7 1333.5 0.3X -4 units w/o interval 1354 1359 4 0.7 1354.5 0.3X -5 units w/ interval 1523 1525 2 0.7 1523.3 0.3X -5 units w/o interval 1549 1551 3 0.6 1549.4 0.3X -6 units w/ interval 1661 1663 2 0.6 1660.8 0.3X -6 units w/o interval 1691 1704 13 0.6 1691.2 0.2X -7 units w/ interval 1811 1817 8 0.6 1810.6 0.2X -7 units w/o interval 1853 1854 1 0.5 1853.2 0.2X -8 units w/ interval 2029 2037 8 0.5 2028.7 0.2X -8 units w/o interval 2075 2075 1 0.5 2074.5 0.2X -9 units w/ interval 2170 2175 5 0.5 2170.0 0.2X -9 units w/o interval 2204 2212 8 0.5 2203.6 0.2X +1 units w/ interval 5026 5090 73 0.2 5026.2 1.0X +1 units w/o interval 4724 4744 23 0.2 4723.9 1.1X +2 units w/ interval 6002 6066 55 0.2 6002.0 0.8X +2 units w/o interval 5657 5689 47 0.2 5657.3 0.9X +3 units w/ interval 6803 6810 7 0.1 6802.7 0.7X +3 units w/o interval 6606 6611 4 0.2 6606.4 0.8X +4 units w/ interval 7739 7760 21 0.1 7738.8 0.6X +4 units w/o interval 7539 7546 7 0.1 7538.7 0.7X +5 units w/ interval 8698 8700 3 0.1 8697.9 0.6X +5 units w/o interval 8513 8520 12 0.1 8512.9 0.6X +6 units w/ interval 9696 9778 119 0.1 9696.1 0.5X +6 units w/o interval 9625 9693 115 0.1 9625.0 0.5X +7 units w/ interval 10762 10769 8 0.1 10762.0 0.5X +7 units w/o interval 10796 10886 95 0.1 10796.4 0.5X +8 units w/ interval 11791 11802 16 0.1 11790.5 0.4X +8 units w/o interval 11589 11630 42 0.1 11589.3 0.4X +9 units w/ interval 12953 13007 63 0.1 12952.8 0.4X +9 units w/o interval 12739 12760 32 0.1 12738.8 0.4X Review comment: It's about 6 times slower, but perf doesn't matter too much here. It's better to keep a single parser, and make the behavior consistent whenever we parse an interval string. For example 1. `select interval +1 day` works but `select interval '+1 day'` does not. 2. `select interval 1 day 1 year` works (fields can be any order) but `select interval '1 day 1 year'` does not. In general, the handwritten parser is more efficient but is less powerful. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org