[ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1597:
--------------------------
    Attachment: HAWQ Runtime Filter Design.pdf

> Implement Runtime Filter for Hash Join
> --------------------------------------
>
>                 Key: HAWQ-1597
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1597
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Query Execution
>            Reporter: Lin Wen
>            Assignee: Lin Wen
>            Priority: Major
>             Fix For: 2.4.0.0-incubating
>
>         Attachments: HAWQ Runtime Filter Design.pdf, HAWQ Runtime Filter 
> Design.pdf
>
>
> A Bloom filter is a space-efficient probabilistic data structure, invented 
> in 1970, that tests whether an element is a member of a set. It may report 
> false positives but never false negatives.
> Nowadays, Bloom filters are widely used in OLAP and other data-intensive 
> applications to filter data quickly, and OLAP systems commonly apply them to 
> hash joins. The basic idea is as follows: when two tables are hash joined, 
> the build phase constructs a Bloom filter over the join keys of the inner 
> table and pushes it down to the scan of the outer table. The scan then 
> returns fewer tuples to the hash join node to be probed against the hash 
> table. This can greatly improve hash join performance when the join is 
> highly selective, that is, when most outer tuples have no match in the 
> inner table.
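
The idea described above can be sketched in a few lines of Python. This is
only an illustration of the technique, not HAWQ's implementation or the design
in the attached PDF: the BloomFilter class, the function
hash_join_with_runtime_filter, the filter size (1024 bits) and the number of
hash functions (3) are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes are illustrative assumptions, not tuned values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        data = repr(key).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def hash_join_with_runtime_filter(inner, outer, inner_key, outer_key):
    """Hypothetical helper: hash join with a Bloom filter on the outer scan."""
    # Build phase: populate the hash table and the Bloom filter together.
    hash_table = {}
    bloom = BloomFilter()
    for row in inner:
        key = row[inner_key]
        hash_table.setdefault(key, []).append(row)
        bloom.add(key)

    # Outer scan with the pushed-down filter: drop tuples that cannot match.
    scanned = [row for row in outer if bloom.might_contain(row[outer_key])]

    # Probe phase: only the surviving outer tuples reach the hash table.
    joined = [
        (inner_row, outer_row)
        for outer_row in scanned
        for inner_row in hash_table.get(outer_row[outer_key], [])
    ]
    return joined, len(scanned)


inner_rows = [(1, "a"), (2, "b")]
outer_rows = [(1, "x"), (3, "y"), (2, "z"), (1, "w")]
joined_rows, tuples_passed = hash_join_with_runtime_filter(
    inner_rows, outer_rows, inner_key=0, outer_key=0
)
```

Because a Bloom filter never produces false negatives, the join result is the
same as without the filter; a false positive only means that an outer tuple
reaches the probe phase and is rejected there.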



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
