viirya commented on a change in pull request #32473: URL: https://github.com/apache/spark/pull/32473#discussion_r649470402
##########
File path: sql/core/benchmarks/BloomFilterBenchmark-jdk11-results.txt
##########
@@ -2,23 +2,179 @@
 ORC Write
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 Write 100M rows:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Without bloom filter                              19503          19621         166          5.1         195.0       1.0X
-With bloom filter                                 22472          22710         335          4.4         224.7       0.9X
+Without bloom filter                              13568          13645         109          7.4         135.7       1.0X
+With bloom filter                                 16116          16238         172          6.2         161.2       0.8X


 ================================================================================================
 ORC Read
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
 Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Without bloom filter                               1981           2040          82         50.5          19.8       1.0X
-With bloom filter                                  1428           1467          54         70.0          14.3       1.4X
+Without bloom filter                               1572           1605          47         63.6          15.7       1.0X
+With bloom filter                                  1343           1359          23         74.5          13.4       1.2X
+
+
+================================================================================================
+ORC Read for IN set
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 1M rows:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter                                 51             63          15         19.6          51.1       1.0X
+With bloom filter                                    54             88          23         18.5          54.0       0.9X
+
+
+================================================================================================
+Parquet Write
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Write 100M rows:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter                              13679          13954         389          7.3         136.8       1.0X
+With bloom filter                                 18260          18284          33          5.5         182.6       0.7X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 2097152            954            984          49        104.8           9.5       1.0X
+With bloom filter, blocksize: 2097152               285            307          21        350.4           2.9       3.3X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 3145728            788            831          40        126.9           7.9       1.0X
+With bloom filter, blocksize: 3145728               192            262          47        521.4           1.9       4.1X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 4194304            787            847          75        127.0           7.9       1.0X
+With bloom filter, blocksize: 4194304               201            224          18        496.4           2.0       3.9X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 5242880            854            872          18        117.1           8.5       1.0X
+With bloom filter, blocksize: 5242880               172            222          37        582.7           1.7       5.0X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 6291456            785            813          27        127.4           7.9       1.0X
+With bloom filter, blocksize: 6291456               167            188          14        598.0           1.7       4.7X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 8388608            806            834          42        124.1           8.1       1.0X
+With bloom filter, blocksize: 8388608               360            383          29        277.8           3.6       2.2X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 16777216            812            846          42        123.2           8.1       1.0X
+With bloom filter, blocksize: 16777216               780            807          27        128.2           7.8       1.0X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 33554432            852            862          10        117.4           8.5       1.0X
+With bloom filter, blocksize: 33554432               820            865          59        121.9           8.2       1.0X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 67108864            844            911          58        118.5           8.4       1.0X
+With bloom filter, blocksize: 67108864               851            853           2        117.5           8.5       1.0X
+
+
+================================================================================================
+Parquet Read
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 100M rows:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------------------------
+Without bloom filter, blocksize: 134217728            839            887          53        119.3           8.4       1.0X
+With bloom filter, blocksize: 134217728               872            881           9        114.6           8.7       1.0X
+
+
+================================================================================================
+Parquet Read for IN set
+================================================================================================
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1047-azure
+Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+Read a row from 1M rows:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Without bloom filter                                 70             76           6         14.2          70.2       1.0X
+With bloom filter                                    73            103          22         13.8          72.6       1.0X

Review comment:
   With the bloom filter, this read is slower (73 ms vs. 70 ms best, 103 ms vs. 76 ms average). Is that due to the IN predicate problem?
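
For reference, the size of the IN-set slowdown can be checked with a little arithmetic. The sketch below is not part of the PR; it assumes the Relative column is the baseline best time divided by the row's best time, which agrees with the values in the tables above. The names `IN_SET_MS`, `relative`, and `report` are hypothetical helpers introduced only for this illustration.

```python
# Times in ms copied from the two "Read for IN set" tables in the diff.
IN_SET_MS = {
    "ORC": {"without": {"best": 51, "avg": 63}, "with": {"best": 54, "avg": 88}},
    "Parquet": {"without": {"best": 70, "avg": 76}, "with": {"best": 73, "avg": 103}},
}


def relative(baseline_ms, candidate_ms):
    """Speed of the candidate relative to the baseline (below 1.0 means slower).

    Assumption: this mirrors how the Relative column is derived.
    """
    return baseline_ms / candidate_ms


def report(times, metric):
    """Format one 'N.NX' ratio per file format for the chosen metric."""
    return [
        f"{fmt}: {relative(t['without'][metric], t['with'][metric]):.1f}X"
        for fmt, t in times.items()
    ]


if __name__ == "__main__":
    print("best:", report(IN_SET_MS, "best"))
    print("avg: ", report(IN_SET_MS, "avg"))
```

On best time the gap is only a few percent, but on average time the bloom filter runs come out noticeably slower for both formats. The standard deviations of those runs (23 ms and 22 ms) are large relative to the totals, so part of the gap may be noise.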