Can you also attach the EXPLAIN output for both queries? And what is your data format?

--Xuefu
On Thu, Dec 3, 2015 at 12:09 AM, Jone Zhang <[email protected]> wrote:

> Hive1.2.1 on Spark1.4.1
>
> *The first query is:*
> set mapred.reduce.tasks=100;
> use u_wsd;
> insert overwrite table t_sd_ucm_cominfo_incremental partition (ds=20151202)
> select t1.uin,t1.clientip from
> (select uin,clientip from t_sd_ucm_cominfo_FinalResult where ds=20151202) t1
> left outer join (select uin,clientip from t_sd_ucm_cominfo_FinalResult where ds=20151201) t2
> on t1.uin=t2.uin
> where t2.clientip is NULL;
>
> *The second query is:*
> set mapred.reduce.tasks=100;
> use u_wsd;
> insert overwrite table t_sd_ucm_cominfo_incremental partition (ds=20151201)
> select t1.uin,t1.clientip from
> (select uin,clientip from t_sd_ucm_cominfo_FinalResult where ds=20151201) t1
> left outer join (select uin,clientip from t_sd_ucm_cominfo_FinalResult where ds=20151130) t2
> on t1.uin=t2.uin
> where t2.clientip is NULL;
>
> *The attachment show the two query's stages.*
> *Here is the partition info*
> 104.3 M /user/hive/warehouse/u_wsd.db/t_sd_ucm_cominfo_finalresult/ds=20151202
> 110.0 M /user/hive/warehouse/u_wsd.db/t_sd_ucm_cominfo_finalresult/ds=20151201
> 112.6 M /user/hive/warehouse/u_wsd.db/t_sd_ucm_cominfo_finalresult/ds=20151130
>
> *Why there are two different stages?*
> *The stage1 in first query is very slowly.*
>
> *Thanks.*
> *Best wishes.*
