[jira] [Created] (HIVE-2375) join multiple small tables with one big table in one mapside join?

Daniel Wu (JIRA) Mon, 15 Aug 2011 06:07:55 -0700

join multiple small tables with one big table in one mapside join?
------------------------------------------------------------------


                 Key: HIVE-2375
                 URL: https://issues.apache.org/jira/browse/HIVE-2375
             Project: Hive
          Issue Type: New Feature
          Components: Query Processor
         Environment: not related
            Reporter: Daniel Wu
            Priority: Minor


http://mail-archives.apache.org/mod_mbox/hive-user/201108.mbox/%[email protected]%3E

Suppose we join 10 small tables (s1, s2, ..., s10) with one huge table (big) in a
data warehouse system, where every join is between the big table and one of the
small tables, as in a star schema. Is it possible to first build 10 hash tables,
one for each small table, and then loop over each row in the big table,
outputting the row if it survives all the joins and discarding it otherwise?
That way we only need to read the big table once, instead of reading the big
data, writing the big data, reading the big data again, and so on.

The dataflow would look like:
1: build 10 hash tables
2: foreach row in big table
         probe the row against each of the 10 hash tables
         if it matches all 10 hash tables, go to the next step (output, etc.)
         else discard the row
    end loop
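
The dataflow above can be sketched in a few lines of Python. This is only an
illustration of the idea, not how Hive implements map-side joins. The function
names (build_hash_table, map_side_star_join) and the row layout (dicts keyed by
column name) are hypothetical, and the sketch assumes each small table has a
unique join key and fits in memory.

```python
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

Row = Dict[str, object]
HashTable = Dict[Hashable, Row]


def build_hash_table(rows: Iterable[Row], key: str) -> HashTable:
    """Step 1: index one small table by its join key (assumed unique)."""
    return {row[key]: row for row in rows}


def map_side_star_join(
    big_rows: Iterable[Row],
    dims: List[Tuple[str, HashTable]],
) -> Iterator[Row]:
    """Step 2: stream the big table once, probing every hash table per row.

    `dims` pairs the foreign-key column in the big table with the hash
    table built from the matching small table.
    """
    for row in big_rows:
        joined = dict(row)
        for fk_column, table in dims:
            match = table.get(row.get(fk_column))
            if match is None:
                break  # no match in this small table: discard the row
            joined.update(match)
        else:
            # The row matched all hash tables: pass it to the next step.
            yield joined
```

Because map_side_star_join is a generator, the big table is consumed in a
single pass and only the hash tables are held in memory, which is the property
the proposal is after.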


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
