[
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501160#comment-16501160
]
Hongxu Ma edited comment on HAWQ-1597 at 6/5/18 2:05 AM:
---------------------------------------------------------
Good job, Thanks. [~wlin]
was (Author: hongxu ma):
Thanks! [~wlin]
> Implement Runtime Filter for Hash Join
> --------------------------------------
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: Query Execution
> Reporter: Lin Wen
> Assignee: Lin Wen
> Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> A Bloom filter is a space-efficient probabilistic data structure, invented
> in 1970, that is used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and other data-intensive
> applications to filter data quickly, and OLAP systems usually implement
> them for hash joins. The basic idea is that, when two tables are hash
> joined, the build phase also builds a Bloom filter over the inner table and
> pushes it down to the scan of the outer table, so that fewer tuples from
> the outer table are returned to the hash join node and probed against the
> hash table. This can greatly improve hash join performance when the join is
> highly selective, that is, when many outer tuples have no match in the
> inner table.
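
The build-then-push-down idea in the description can be illustrated with a minimal sketch. This is not HAWQ's implementation: the names `BloomFilter` and `hash_join_with_runtime_filter`, the filter size, the number of hash functions, and the use of SHA-256 are all assumptions made for the example, and Python is used only for brevity.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes and hash choice are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def hash_join_with_runtime_filter(inner_rows, outer_rows, key_index=0):
    """Join two lists of tuples on the column at key_index.

    Returns the joined rows and the number of outer rows that passed
    the Bloom filter and therefore reached the probe step.
    """
    # Build phase: fill the hash table and the Bloom filter together.
    hash_table = {}
    bloom = BloomFilter()
    for row in inner_rows:
        hash_table.setdefault(row[key_index], []).append(row)
        bloom.add(row[key_index])

    # Probe phase: the filter stands in for the check pushed down to the
    # outer scan, so rows that cannot match are dropped before the probe.
    passed = 0
    results = []
    for outer in outer_rows:
        if not bloom.might_contain(outer[key_index]):
            continue
        passed += 1
        for inner in hash_table.get(outer[key_index], ()):
            results.append(inner + outer)
    return results, passed


inner = [(1, "a"), (2, "b")]
outer = [(1, "x"), (3, "y"), (2, "z"), (1, "w")]
joined, passed = hash_join_with_runtime_filter(inner, outer)
```

Because a Bloom filter can return false positives but never false negatives, the join result is unchanged by the filter; only the number of outer rows that reach the probe step goes down.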