GitHub user jianqiao opened a pull request: https://github.com/apache/incubator-quickstep/pull/172
Query optimization with ExactFilter

This is a follow-up optimization that builds on the facility provided by LIPFilters. LIP (lookahead information passing) is an optimization that injects efficient filters (e.g. bloom filters) into Select/HashJoin/Aggregate operators to pre-filter their input relations.

This PR strength-reduces `HashJoin`s (including inner/semi/anti joins) into `FilterJoin`s. The semantics of a `FilterJoin` are simple: if certain conditions are met, we can build a bit vector from the build side and use that bit vector to _filter_ the probe side.

The execution part is slightly more optimized: a `FilterJoin` is not always converted into a `SelectOperator` plus a `LIPFilter`, as its semantics would suggest. Instead, in most situations we can avoid creating the `SelectOperator` by attaching the `LIPFilter` to some downstream operators, which avoids unnecessary materialization of intermediate relations.

The table below shows the performance improvement for SSB at scale factor 100 on a CloudLab machine:

**SSB SF100**|**master (ms)**|**w/ ExactFilter (ms)**
:-----:|:-----:|:-----:
Q01|709|574
Q02|648|593
Q03|605|564
Q04|906|675
Q05|754|457
Q06|498|549
Q07|1687|1696
Q08|598|591
Q09|481|470
Q10|450|442
Q11|1208|882
Q12|876|656
Q13|515|475
Total|9937|8625

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep exact-filter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/172.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #172
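To make the `FilterJoin` semantics described above concrete, here is a minimal Python sketch of a semi/anti join reduced to a bit-vector filter. It is not Quickstep's implementation. The PR description does not spell out the conditions under which the reduction applies, so the sketch assumes the build-side join key is an integer within a known range `[min_key, max_key]` and that no build-side attributes are needed in the output. The names `build_exact_filter`, `contains`, and `filter_join` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def build_exact_filter(build_keys: Iterable[int], min_key: int, max_key: int) -> bytearray:
    """Set one bit per build-side key in a bit vector covering [min_key, max_key].

    Assumption: every build key lies inside the range.
    """
    num_bits = max_key - min_key + 1
    bits = bytearray((num_bits + 7) // 8)
    for key in build_keys:
        offset = key - min_key
        bits[offset >> 3] |= 1 << (offset & 7)
    return bits


def contains(bits: bytearray, key: int, min_key: int, max_key: int) -> bool:
    """Exact membership test: no false positives, unlike a bloom filter."""
    if key < min_key or key > max_key:
        return False
    offset = key - min_key
    return bool(bits[offset >> 3] & (1 << (offset & 7)))


def filter_join(
    probe_rows: Sequence[Tuple],
    key_index: int,
    bits: bytearray,
    min_key: int,
    max_key: int,
    anti: bool = False,
) -> List[Tuple]:
    """Keep probe rows whose key is present (semi join) or absent (anti join)."""
    return [
        row
        for row in probe_rows
        if contains(bits, row[key_index], min_key, max_key) != anti
    ]


# Hypothetical data: a small dimension table's keys filter a fact table.
dim_keys = [3, 5, 8]
fact_rows = [(1, "a"), (3, "b"), (8, "c"), (12, "d")]
exact_filter = build_exact_filter(dim_keys, min_key=0, max_key=9)
semi_result = filter_join(fact_rows, 0, exact_filter, 0, 9)
anti_result = filter_join(fact_rows, 0, exact_filter, 0, 9, anti=True)
```

Because the bit vector is exact, the probe side needs no hash table lookup at all. This is the sense in which the join is strength-reduced to a filter.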