Hi,
  I am trying to do an inner join on two tables, but the query runs for a very long time.

Tab1 - 100 GB
Tab2 - 2 GB (partitioned table, partitioned on source)

I tried the stream table hint, but the query ran for a long time, about 3 hours. It looks like
only 1 reducer is doing the work.
I also tried a map join with increased memory, but it failed.

Please find the sample query below:


set hive.ignore.mapjoin.hint=false;
set mapred.reduce.tasks=320;

create table ev_claim_claimline_pat_test as
select /*+ streamtable(c) */ c.*, p.col1, p.col2, p.col3
from Tab2 p
inner join Tab1 c
  on (trim(p.pid) = trim(c.p_id) and p.source = 'XYZ');
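
For comparison, a map join form of the same query would look roughly like the sketch below. The mapjoin hint on p is an assumption about how such an attempt is written, not the exact statement that was run, and the memory settings used are not shown.

```sql
-- Sketch only: the same join with Tab2 (p, about 2 GB) hinted as the in-memory table.
set hive.ignore.mapjoin.hint=false;  -- make Hive honor the hint below

create table ev_claim_claimline_pat_test as
select /*+ mapjoin(p) */ c.*, p.col1, p.col2, p.col3
from Tab2 p
inner join Tab1 c
  on (trim(p.pid) = trim(c.p_id) and p.source = 'XYZ');
```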


Can someone help me?


Thanks,

Karthik. B
