Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hive/JoinOptimization" page has been changed by LiyinTang.
http://wiki.apache.org/hadoop/Hive/JoinOptimization?action=diff&rev1=6&rev2=7

--------------------------------------------------

  Hive-1641 ([[http://issues.apache.org/jira/browse/HIVE-1641|HIVE-1641]]) has solved this problem. As shown in Fig2, the basic idea is to create a new task, a MapReduce Local Task, before the original Join MapReduce Task. This new task reads the small table's data from HDFS into an in-memory hashtable. After reading, it serializes the in-memory hashtable into files on disk and compresses them into a tar file. In the next stage, when the MapReduce task is launched, the tar file is put into the Hadoop Distributed Cache, which copies it to each Mapper's local disk and decompresses it. All the Mappers can then deserialize the hashtable files back into memory and do the join work as before.
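 To make this flow concrete, below is a minimal illustrative Java sketch of the two halves of the hand-off. The class, method, and parameter names are hypothetical and Hive's actual code differs, but the idea is the same: the local task dumps the hashtable to a compressed file, and each Mapper loads it back from its local disk.
 
 {{{#!java
import java.io.*;
import java.util.HashMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of the hand-off between the local task and the mappers.
public class SmallTableHandoff {

  // Local-task side: dump the in-memory hashtable to a compressed file that
  // will later be shipped to the mappers through the Distributed Cache.
  static void dumpHashtable(HashMap<String, String> table, File out)
      throws IOException {
    try (ObjectOutputStream oos =
             new ObjectOutputStream(new GZIPOutputStream(new FileOutputStream(out)))) {
      oos.writeObject(table);
    }
  }

  // Mapper side: the Distributed Cache has already copied the file to the
  // task's local disk; deserialize it back into memory before joining.
  @SuppressWarnings("unchecked")
  static HashMap<String, String> loadHashtable(File in)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream ois =
             new ObjectInputStream(new GZIPInputStream(new FileInputStream(in)))) {
      return (HashMap<String, String>) ois.readObject();
    }
  }
}
}}}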
  
  == 1.2 Removing JDBM ==
+ Previously, Hive used JDBM ([[http://jdbm.sourceforge.net/|JDBM]]) as a persistent hashtable. Whenever the in-memory hashtable could not hold any more data, it swapped the key/value pairs into the JDBM table. However, when profiling the Map Join, we found that this JDBM component takes more than 70% of the CPU time, as shown in Fig3. Also, the persistent file JDBM generates is too large to put into the Distributed Cache. For example, putting 67,000 simple integer key/value pairs into JDBM generates a hashtable file of more than 22MB. So JDBM is too heavyweight for Map Join, and it is better to remove this component from Hive. Map Join is designed for holding a small table's data in memory; if the table is too large to hold, the query just runs as a Common Join, so there is no need for a persistent hashtable any more. Hive-1754 ([[http://issues.apache.org/jira/browse/HIVE-1754|HIVE-1754]]) removes the JDBM component accordingly.
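+ As a rough illustration of the spill pattern described above (not the actual JDBM integration), the old behavior amounted to a bounded hashtable like the following sketch, where the disk-backed map is a stand-in for the JDBM table:
 
 {{{#!java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the spill behavior described above: once the
// in-memory hashtable reaches a threshold, further entries go to a
// persistent on-disk table (JDBM, in the old implementation).
class SpillingHashtable<K, V> {
  private final Map<K, V> inMemory = new HashMap<>();
  private final Map<K, V> onDisk;   // stand-in for the JDBM-backed table
  private final int threshold;

  SpillingHashtable(int threshold, Map<K, V> diskBackedTable) {
    this.threshold = threshold;
    this.onDisk = diskBackedTable;
  }

  void put(K key, V value) {
    // The on-disk path is what dominated CPU time in the profile above.
    if (inMemory.size() < threshold) {
      inMemory.put(key, value);
    } else {
      onDisk.put(key, value);
    }
  }

  V get(K key) {
    V v = inMemory.get(key);
    return (v != null) ? v : onDisk.get(key);
  }
}
}}}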
  
  {{attachment:fig3.jpg}}
  
@@ -32, +32 @@

  
  As shown in Table1, the optimized map join is 12 to 26 times faster than the previous one. Most of the map join performance improvement comes from removing the JDBM component.
  
+ = 2. Converting Join into Map Join Automatically =
  == 2.1 New Join Execution Flow ==
+ Since the map join is faster than the common join, it is better to run the map join whenever possible. Previously, Hive users needed to give a hint in the query to specify which table is the small table, for example: select /*+mapjoin(a)*/ a.key, b.value from srcpart_empty a join src b on a.key=b.key; This is bad for both the user experience and query performance, because users may give a wrong hint, or may not give any hint at all. It would be much better to convert the Common Join into a Map Join without the user's hint.
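+ With the automatic conversion described below in place, the same query needs no hint. A minimal sketch, assuming hive.auto.convert.join is the flag that controls this behavior:
 
 {{{
-- Enable automatic conversion of Common Join into Map Join.
set hive.auto.convert.join=true;

-- No /*+mapjoin(a)*/ hint is needed any more.
select a.key, b.value from srcpart_empty a join src b on a.key=b.key;
}}}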
+ 
+ Hive-1642 ([[http://issues.apache.org/jira/browse/HIVE-1642|HIVE-1642]]) has solved this problem by converting the Common Join into a Map Join automatically. For the Map Join, the query processor needs to know which input table is the big table. The other input tables will be recognized as the small tables during the execution stage, and these tables need to be held in memory. However, the query processor has no idea of the input file sizes at compile time, because some of the tables may be intermediate tables generated from sub-queries. So the query processor can only figure out the input file sizes during execution time.
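+ A minimal sketch of this run-time decision, assuming a hypothetical in-memory size threshold (the constant, names, and logic below are illustrative, not Hive's actual code):
 
 {{{#!java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: once the input file sizes are known at execution
// time, pick the big table to stream and check whether all the other
// tables are small enough to hold in memory; otherwise fall back to the
// Common Join.
public class JoinResolverSketch {

  // Hypothetical threshold: total bytes the small tables may occupy.
  static final long SMALL_TABLES_MAX_BYTES = 25L * 1024 * 1024;

  // Returns the alias of the table to stream (the big table), or null if
  // the remaining tables do not fit in memory and the Common Join must run.
  static String chooseBigTable(Map<String, Long> inputSizes) {
    String bigTable = null;
    long bigSize = -1;
    // The biggest input becomes the streamed (big) table.
    for (Map.Entry<String, Long> e : inputSizes.entrySet()) {
      if (e.getValue() > bigSize) {
        bigSize = e.getValue();
        bigTable = e.getKey();
      }
    }
    // Every other table must fit in memory, or we fall back.
    long others = 0;
    for (Map.Entry<String, Long> e : inputSizes.entrySet()) {
      if (!e.getKey().equals(bigTable)) {
        others += e.getValue();
      }
    }
    return (others <= SMALL_TABLES_MAX_BYTES) ? bigTable : null;
  }

  public static void main(String[] args) {
    Map<String, Long> sizes = new HashMap<>();
    sizes.put("a", 4L * 1024 * 1024);    // 4MB  -> small enough to hold
    sizes.put("b", 900L * 1024 * 1024);  // 900MB -> streamed big table
    String big = chooseBigTable(sizes);
    System.out.println(big == null
        ? "run Common Join"
        : "convert to Map Join, stream table " + big);
  }
}
}}}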
+ 
+ {{attachment:fig5.jpg||height="716px",width="1017px"}}
+ 
+ As shown in Fig5,
+ 
  == 2.2 Resolving the Join Operation at Run Time ==
  == 2.3 Backup Task ==
  == 2.4 Performance Evaluation ==
