panbingkun commented on code in PR #40506:
URL: https://github.com/apache/spark/pull/40506#discussion_r1206118883

##########
sql/core/benchmarks/JsonBenchmark-jdk17-results.txt:
##########
@@ -7,117 +7,118 @@ OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 JSON schema inferring:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        2964           3045          89          1.7         592.8       1.0X
-UTF-8 is set                                       4365           4382          18          1.1         873.1       0.7X
+No encoding                                        3004           3017          12          1.7         600.8       1.0X
+UTF-8 is set                                       4430           4446          17          1.1         886.0       0.7X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 count a short column:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        2326           2381          52          2.1         465.2       1.0X
-UTF-8 is set                                       3834           3846          17          1.3         766.7       0.6X
+No encoding                                        2345           2392          44          2.1         469.0       1.0X
+UTF-8 is set                                       3832           3845          11          1.3         766.4       0.6X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 count a wide column:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        4599           4622          26          0.2        4599.4       1.0X
-UTF-8 is set                                       6079           6120          62          0.2        6078.8       0.8X
+No encoding                                        7234           7306          71          0.1        7234.4       1.0X
+UTF-8 is set                                       6396           6449          57          0.2        6396.1       1.1X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 select wide row:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                       12217          12443         256          0.0      244340.4       1.0X
-UTF-8 is set                                      13720          13823         113          0.0      274409.6       0.9X
+No encoding                                       12800          12832          39          0.0      255996.9       1.0X
+UTF-8 is set                                      13818          13931         115          0.0      276350.2       0.9X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 Select a subset of 10 columns:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Select 10 columns                                  2291           2308          18          0.4        2291.5       1.0X
-Select 1 column                                    1485           1491           8          0.7        1485.2       1.5X
+Select 10 columns                                  1948           1966          31          0.5        1947.5       1.0X
+Select 1 column                                    1453           1456           4          0.7        1452.6       1.3X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 creation of JSON parser per line:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Short column without encoding                       689            691           3          1.5         688.7       1.0X
-Short column with UTF-8                             973            977           3          1.0         972.8       0.7X
-Wide column without encoding                       7239           7283          71          0.1        7238.6       0.1X
-Wide column with UTF-8                             9634           9667          30          0.1        9634.3       0.1X
+Short column without encoding                       664            675          12          1.5         663.6       1.0X
+Short column with UTF-8                             956            975          25          1.0         956.0       0.7X
+Wide column without encoding                       7269           7299          34          0.1        7268.7       0.1X
+Wide column with UTF-8                            10444          10474          28          0.1       10444.5       0.1X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 17.0.7+7 on Linux 5.15.0-1037-azure
 Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 JSON functions:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Text read                                            95            100           9         10.5          95.1       1.0X
-from_json                                          1638           1646           7          0.6        1638.5       0.1X
-json_tuple                                         1971           1996          39          0.5        1970.6       0.0X
-get_json_object                                    1799           1809          13          0.6        1799.3       0.1X
+Text read                                            90             91           3         11.2          89.5       1.0X
+from_json                                          2048           2057          12          0.5        2047.7       0.0X
+json_tuple                                         2334           2340           6          0.4        2334.1       0.0X
+get_json_object wholestage off                     2295           2299           4          0.4        2295.4       0.0X
+get_json_object wholestage on                      2158           2161           3          0.5        2158.3       0.0X

Review Comment:
   ditto

##########
sql/core/benchmarks/JsonBenchmark-results.txt:
##########
@@ -4,120 +4,121 @@ Benchmark for performance of JSON parsing

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 JSON schema inferring:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        3720           3843         121          1.3         743.9       1.0X
-UTF-8 is set                                       5412           5455          45          0.9        1082.4       0.7X
+No encoding                                        3280           3495         218          1.5         655.9       1.0X
+UTF-8 is set                                       4759           4780          18          1.1         951.8       0.7X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 count a short column:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        3234           3254          33          1.5         646.7       1.0X
-UTF-8 is set                                       4847           4868          21          1.0         969.5       0.7X
+No encoding                                        2734           2780          39          1.8         546.9       1.0X
+UTF-8 is set                                       4421           4472          45          1.1         884.2       0.6X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 count a wide column:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        5702           5794         101          0.2        5702.1       1.0X
-UTF-8 is set                                       9526           9607          73          0.1        9526.1       0.6X
+No encoding                                        6322           6442         138          0.2        6322.2       1.0X
+UTF-8 is set                                      10099          10136          49          0.1       10099.0       0.6X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 select wide row:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                       18318          18448         199          0.0      366367.7       1.0X
-UTF-8 is set                                      19791          19887          99          0.0      395817.1       0.9X
+No encoding                                       16237          16377         154          0.0      324735.1       1.0X
+UTF-8 is set                                      17622          17694          71          0.0      352440.5       0.9X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Select a subset of 10 columns:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Select 10 columns                                  2531           2570          51          0.4        2531.3       1.0X
-Select 1 column                                    1867           1882          16          0.5        1867.0       1.4X
+Select 10 columns                                  2481           2495          14          0.4        2480.8       1.0X
+Select 1 column                                    1789           1792           3          0.6        1789.0       1.4X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 creation of JSON parser per line:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Short column without encoding                       868            875           7          1.2         868.4       1.0X
-Short column with UTF-8                            1151           1163          11          0.9        1150.9       0.8X
-Wide column without encoding                      12063          12299         205          0.1       12063.0       0.1X
-Wide column with UTF-8                            16095          16136          51          0.1       16095.3       0.1X
+Short column without encoding                       812            831          17          1.2         811.9       1.0X
+Short column with UTF-8                            1150           1153           3          0.9        1149.9       0.7X
+Wide column without encoding                      11707          11763          49          0.1       11707.4       0.1X
+Wide column with UTF-8                            17484          17524          35          0.1       17484.2       0.0X

 Preparing data for benchmarking ...
 OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 JSON functions:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Text read                                           165            170           4          6.1         164.7       1.0X
-from_json                                          2339           2386          77          0.4        2338.9       0.1X
-json_tuple                                         2667           2730          55          0.4        2667.3       0.1X
-get_json_object                                    2627           2659          32          0.4        2627.1       0.1X
+Text read                                           149            152           4          6.7         148.9       1.0X
+from_json                                          2103           2124          21          0.5        2103.4       0.1X
+json_tuple                                         2482           2490           7          0.4        2481.7       0.1X
+get_json_object wholestage off                     2241           2249          12          0.4        2240.7       0.1X
+get_json_object wholestage on                      2124           2132           8          0.5        2123.9       0.1X

Review Comment:
   ditto
