GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1354
HAWQ-1606. Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table

This commit decides during query planning whether to create a Bloom filter, and creates the Bloom filter for the inner table, including:

1. Introduce a GUC, hawq_hashjoin_bloomfilter_max_memory_size, which controls the maximum memory size of one Bloom filter in a hash join.
2. Introduce a GUC, hawq_hashjoin_bloomfilter_ratio. When the ratio (estimated number of hash join tuples) / (number of tuples in the outer table) is lower than this GUC, a Bloom filter can be used in the hash join.
3. Decide whether to create a Bloom filter during the query planning phase.
4. During the query execution phase, create the Bloom filter structure and populate it with tuples from the inner table.

Please review it, thanks!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/linwen/incubator-hawq hawq-1606

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/1354.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1354

----
commit 11b20b51026419cf91c71cabb9c17e0e467399f7
Author: Wen Lin <wlin@...>
Date:   2018-04-15T11:29:19Z

    HAWQ-1606. This commit decides during query planning whether to create a Bloom filter, and creates the Bloom filter for the inner table, including:

    1. Introduce a GUC, hawq_hashjoin_bloomfilter_max_memory_size, which controls the maximum memory size of one Bloom filter in a hash join.
    2. Introduce a GUC, hawq_hashjoin_bloomfilter_ratio. When the ratio (estimated number of hash join tuples) / (number of tuples in the outer table) is lower than this GUC, a Bloom filter can be used in the hash join.
    3. Decide whether to create a Bloom filter during the query planning phase.
    4. During the query execution phase, create the Bloom filter structure and populate it with tuples from the inner table.

----
---
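
For reviewers who want a quick mental model of the four items above, here is a minimal sketch in Python. It is not the code in this pull request (HAWQ itself is written in C). The constant values, the function and class names, the bits-per-item sizing, and the SHA-256 based double hashing are all assumptions chosen for illustration. Only the two GUC names and the "ratio lower than the GUC" rule come from the description.

```python
import hashlib

# Hypothetical stand-ins for the two GUCs named in the pull request.
# The values are illustrative assumptions, not HAWQ defaults.
HASHJOIN_BLOOMFILTER_MAX_MEMORY_SIZE = 2 * 1024 * 1024  # bytes
HASHJOIN_BLOOMFILTER_RATIO = 0.4


def should_use_bloom_filter(est_join_rows, outer_rows,
                            ratio=HASHJOIN_BLOOMFILTER_RATIO):
    """Plan-time decision: build a filter only when the estimated join
    output is small relative to the outer table."""
    if outer_rows <= 0:
        return False
    return est_join_rows / outer_rows < ratio


class BloomFilter:
    """Small Bloom filter whose bit array is capped at max_bytes."""

    def __init__(self, expected_items,
                 max_bytes=HASHJOIN_BLOOMFILTER_MAX_MEMORY_SIZE,
                 bits_per_item=8, num_hashes=4):
        wanted_bits = max(8, expected_items * bits_per_item)
        # Never exceed the memory cap, even for a very large inner table.
        self.num_bits = min(wanted_bits, max_bytes * 8)
        self.num_hashes = num_hashes
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive num_hashes bit positions from one digest.
        digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


def build_filter_for_inner(inner_keys, est_join_rows, outer_rows):
    """Execution-time step: populate the filter from inner-table join keys,
    but only if the planner-style check allowed it."""
    if not should_use_bloom_filter(est_join_rows, outer_rows):
        return None
    bloom = BloomFilter(expected_items=len(inner_keys))
    for key in inner_keys:
        bloom.add(key)
    return bloom
```

In this sketch, `build_filter_for_inner([10, 20, 30], est_join_rows=3, outer_rows=1000)` returns a populated filter because 3/1000 is below the assumed ratio, while an estimate of 900 join rows against the same outer table returns `None`. The `might_contain` check is included only to show why the filter is worth building: an outer tuple whose key is reported absent cannot match any inner tuple.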