use bloom filters to improve the performance of map joins
---------------------------------------------------------
Key: HIVE-1721
URL: https://issues.apache.org/jira/browse/HIVE-1721
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
In a map-join, most rows of the big table are likely to have no matching row in
the small table.
Currently, we perform a hash-map lookup for every row in the big table, which
can be expensive.
It might be useful to try a bloom filter containing all the keys of the small
table.
Each row from the big table is first checked against the bloom filter, and the
small table's hash table is probed only on a positive match.
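
The proposal above could look roughly like the following sketch. It is an
illustration only, not Hive code: the class BloomMapJoinSketch, the nested
SimpleBloomFilter, the methods buildFilter and join, and the bit-array size and
hash count are all hypothetical names and values chosen for the example. It
relies on the fact that a bloom filter never gives a false negative, so a
negative answer lets the join skip the hash-map lookup safely, while a positive
answer still has to be confirmed against the hash table.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from Hive.
class BloomMapJoinSketch {

    /** Minimal bloom filter over String keys, using double hashing. */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        void add(String key) {
            int h1 = key.hashCode();
            int h2 = secondHash(h1);
            for (int i = 0; i < numHashes; i++) {
                bits.set(index(h1, h2, i));
            }
        }

        /** False means "definitely absent"; true means "possibly present". */
        boolean mightContain(String key) {
            int h1 = key.hashCode();
            int h2 = secondHash(h1);
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(index(h1, h2, i))) {
                    return false;
                }
            }
            return true;
        }

        // Cheap bit mixer to derive a second hash; forced odd so it is never zero.
        private static int secondHash(int h) {
            h ^= (h >>> 16);
            h *= 0x85ebca6b;
            h ^= (h >>> 13);
            return h | 1;
        }

        private int index(int h1, int h2, int i) {
            return Math.floorMod(h1 + i * h2, numBits);
        }
    }

    /** Builds a filter holding every join key of the small table. */
    static SimpleBloomFilter buildFilter(Iterable<String> keys, int numBits, int numHashes) {
        SimpleBloomFilter filter = new SimpleBloomFilter(numBits, numHashes);
        for (String key : keys) {
            filter.add(key);
        }
        return filter;
    }

    /** Joins big-table rows {key, value} against the small table's hash map. */
    static List<String> join(Map<String, String> smallTable,
                             List<String[]> bigRows,
                             SimpleBloomFilter filter) {
        List<String> out = new ArrayList<>();
        for (String[] row : bigRows) {
            String key = row[0];
            if (!filter.mightContain(key)) {
                continue; // definitely no match: skip the hash-map lookup
            }
            String match = smallTable.get(key); // confirm; may be a false positive
            if (match != null) {
                out.add(key + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> small = new HashMap<>();
        small.put("k1", "a");
        small.put("k2", "b");
        SimpleBloomFilter filter = buildFilter(small.keySet(), 1024, 3);

        List<String[]> big = new ArrayList<>();
        big.add(new String[] {"k1", "x"});
        big.add(new String[] {"k9", "y"});
        big.add(new String[] {"k2", "z"});

        System.out.println(join(small, big, filter));
    }
}
```

Whether this pays off depends on how selective the join is: the filter check is
extra work for every row that does match, so the benefit comes only when a
large share of big-table rows are rejected by the filter.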