[Hadoop Wiki] Update of "Hive/JoinOptimization" by LiyinTang

Apache Wiki Tue, 30 Nov 2010 15:16:01 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "Hive/JoinOptimization" page has been changed by LiyinTang.
http://wiki.apache.org/hadoop/Hive/JoinOptimization?action=diff&rev1=8&rev2=9

--------------------------------------------------

  In this case, the query processor will launch the original Common Join task 
as a Backup Task, which is totally transparent to the user. The basic idea is 
shown in Fig 7.
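
The backup idea can be sketched in a few lines. This is only an illustration, not Hive's implementation: it assumes that "this case" means the converted map join task failed, and the names `run_with_backup`, `map_join_task`, and `common_join_task` are hypothetical.

```python
def run_with_backup(primary, backup):
    """Run primary; if it raises, run backup instead.

    Returns the task result and a label saying which task produced it.
    """
    try:
        return primary(), "primary"
    except Exception:
        # The caller only ever sees a result, so the fallback is transparent.
        return backup(), "backup"


def map_join_task():
    # Hypothetical failure, e.g. the small table does not fit in memory.
    raise MemoryError("hypothetical failure of the map join task")


def common_join_task():
    # Hypothetical stand-in for the original Common Join task.
    return [("k1", "left", "right")]


rows, used = run_with_backup(map_join_task, common_join_task)
print(used, rows)
```

The caller receives the same kind of result whichever task ran, which is the sense in which the fallback stays transparent to the user.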
  
  == 2.4 Performance Evaluation ==
+ Here are some performance comparison results. All the benchmark queries here 
can be converted into Map Join.
  
+ '''Table 2: Comparison between the previous join and the new optimized 
join'''
+ 
+ ''' {{attachment:fig8.jpg}} '''
+ 
+ For the previous common join, the experiment measures only the average map 
reduce task execution time, because the job finish time would also include the 
job scheduling overhead: a job sometimes has to wait before it starts to run 
in the cluster. Likewise, for the new optimized common join, the experiment 
adds the average local task execution time to the average map reduce task 
execution time. Both results therefore exclude the job scheduling overhead.
+ 
+ From the results, if the new common join can be converted into a map join, 
it achieves a 57% to 163% performance improvement.
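
The measurement described above can be sketched as follows. The timings are made-up placeholders, not values from Table 2, and all function names are hypothetical. The page does not define "performance improvement"; the sketch assumes it is the speedup relative to the new time, since a figure above 100% cannot be a fractional reduction of the old time.

```python
def average(values):
    """Arithmetic mean of a non-empty list of timings (seconds)."""
    return sum(values) / len(values)


def common_join_time(map_reduce_times):
    # Previous common join: only the average map reduce task execution time.
    return average(map_reduce_times)


def optimized_join_time(local_task_times, map_reduce_times):
    # Optimized join: average local task time plus average map reduce time.
    return average(local_task_times) + average(map_reduce_times)


def improvement_percent(old_time, new_time):
    # Assumed definition: how much faster the new join is, relative to new.
    return (old_time - new_time) / new_time * 100.0


# Placeholder timings, chosen only to make the arithmetic easy to follow.
old = common_join_time([100.0, 110.0, 90.0])
new = optimized_join_time([10.0, 14.0], [40.0, 36.0])
gain = improvement_percent(old, new)
print(old, new, gain)
```

With these placeholder numbers the old time is 100 s, the new time is 12 s + 38 s = 50 s, and the assumed improvement is 100%. Job scheduling wait time appears in neither figure.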
+
