Lin Wen created HAWQ-1597:
-----------------------------
Summary: Implement Runtime Filter for Hash Join
Key: HAWQ-1597
URL: https://issues.apache.org/jira/browse/HAWQ-1597
Project: Apache HAWQ
Issue Type: New Feature
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
A Bloom filter is a space-efficient probabilistic data structure, invented in
1970, that tests whether an element is a member of a set. It can report false
positives but never false negatives.
Nowadays, Bloom filters are widely used in OLAP and other data-intensive
applications to filter data quickly, and OLAP systems commonly apply them to
hash joins. The basic idea is that, when two tables are hash joined, the build
phase constructs a Bloom filter over the join keys of the inner table and
pushes it down to the scan of the outer table, so that fewer tuples from the
outer table are returned to the hash join node and probed against the hash
table. This can greatly improve hash join performance when the join is highly
selective, that is, when many outer-table tuples have no match in the inner
table.
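
The idea above can be sketched in a few lines of Python. This is only an
illustration and not HAWQ code: the names BloomFilter, might_contain and
hash_join, the filter size (1024 bits) and the number of hash functions (3)
are all assumptions chosen for the example, and rows are plain tuples with the
join key given as a column index.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        # Assumes join keys have the same type on both sides of the join.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def hash_join(inner, outer, inner_key, outer_key):
    """Join two lists of tuples on the given column indexes.

    Returns (results, scanned, passed), where passed counts the outer
    tuples that survived the Bloom filter and reached the hash table probe.
    """
    # Build phase: fill the hash table and the Bloom filter together.
    table = {}
    bloom = BloomFilter()
    for row in inner:
        key = row[inner_key]
        table.setdefault(key, []).append(row)
        bloom.add(key)

    # Outer scan with the filter applied before the probe.
    results = []
    scanned = 0
    passed = 0
    for row in outer:
        scanned += 1
        key = row[outer_key]
        if not bloom.might_contain(key):
            continue  # dropped at the scan, never reaches the join
        passed += 1
        for match in table.get(key, []):
            results.append((match, row))
    return results, scanned, passed
```

For example, joining a two-row inner table against a 100-row outer table with
hash_join(inner, outer, 0, 0) should let only a handful of outer tuples
through to the probe. The exact count depends on false positives, which the
hash table lookup then discards, so the join result is unaffected.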