Lin Wen created HAWQ-1597:
-----------------------------

             Summary: Implement Runtime Filter for Hash Join
                 Key: HAWQ-1597
                 URL: https://issues.apache.org/jira/browse/HAWQ-1597
             Project: Apache HAWQ
          Issue Type: New Feature
          Components: Query Execution
            Reporter: Lin Wen
            Assignee: Lei Chang


A Bloom filter is a space-efficient probabilistic data structure, invented in
1970, that is used to test whether an element is a member of a set.
Nowadays, Bloom filters are widely used in OLAP and other data-intensive
applications to filter data quickly, and OLAP systems commonly implement them
for hash join. The basic idea is that, when hash joining two tables, the build
phase constructs a Bloom filter for the inner table and then pushes this Bloom
filter down to the scan of the outer table, so that fewer tuples from the outer
table are returned to the hash join node and probed against the hash table.
This can greatly improve hash join performance when the join is highly
selective, that is, when most outer-table tuples have no match in the inner
table.
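
The idea above can be illustrated with a minimal Python sketch. It is not HAWQ
code: the names BloomFilter, might_contain and hash_join_with_runtime_filter,
the filter size (1024 bits), the number of hash functions (3) and the use of
SHA-256 are all assumptions made for this example. A Bloom filter may report
false positives but never false negatives, so the filter can only drop outer
tuples that certainly have no match; the join result is unchanged.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter (illustrative; sizes are arbitrary assumptions)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def hash_join_with_runtime_filter(inner_rows, outer_rows, inner_key, outer_key):
    """Join two lists of tuples on the given column positions.

    Returns (results, passed), where passed is the number of outer rows
    that survived the Bloom filter and reached the hash table probe.
    """
    # Build phase: fill the hash table and the Bloom filter from the inner table.
    hash_table = {}
    bloom = BloomFilter()
    for row in inner_rows:
        key = row[inner_key]
        hash_table.setdefault(key, []).append(row)
        bloom.add(key)

    # Probe phase: the filter is applied at the outer "scan", before probing.
    results = []
    passed = 0
    for row in outer_rows:
        key = row[outer_key]
        if not bloom.might_contain(key):
            continue  # dropped early, never reaches the join node
        passed += 1
        for match in hash_table.get(key, []):
            results.append((match, row))
    return results, passed


inner = [(1, "a"), (2, "b")]
outer = [(i, "x%d" % i) for i in range(100)]
results, passed = hash_join_with_runtime_filter(inner, outer, 0, 0)
print(len(results), "joined rows;", passed, "of", len(outer), "outer rows probed")
```

In this toy run only the outer rows whose keys might be in the inner table
reach the probe step, which is the saving a runtime filter is meant to provide.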



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
