Thanks for the reply Bejoy.

Sent from my iPhone

On Oct 1, 2012, at 12:30 AM, "Bejoy KS" <bejo...@outlook.com> wrote:

> Hi Abhishek
>
> Both your tables are ideal candidates for map join.
>
> Can you try a plain join statement without setting any properties other
> than num reducers, and a map join as the next step.
>
> hive> set mapred.reduce.tasks=5;
> hive> SELECT t2.col1, t3.col1 FROM table2 t2 JOIN table3 t3;

-- I tried this, but it is still firing only one reducer.

> Once this goes well try doing map side join.
>
> hive> set auto.convert.join=true;
> hive> SELECT t2.col1, t3.col1 FROM table2 t2 JOIN table3 t3;

-- This also does not work. It is only showing me

map = 0%, reduce = 0%
map = 0%, reduce = 0%
map = 0%, reduce = 0%
map = 0%, reduce = 0%

for 10 min.

Regards
Abhi

> ------Original Message------
> From: Abhishek
> To: user@hive.apache.org
> Cc: user@hive.apache.org
> Cc: Bejoy Ks
> Subject: Re: Cartesian Product in HIVE
> Sent: Oct 1, 2012 09:32
>
> Thanks for the reply Bejoy. I did not use any ORDER BY in the query. Here
> are the properties I have used, the query, and the table sizes:
>
> set mapred.reduce.tasks=17;
> set mapred.child.java.opts=xmx2073741824;
> set io.sort.mb=512;
> set io.sort.factor=250;
> set mapred.reduce.parallel.copies=true;
> set mapred.job.reuse.jvm.num.tasks=1;
> set hive.mapred.reduce.tasks.speculative.execution=false;
> set hive.mapred.map.tasks.speculative.execution=false;
>
> CREATE TABLE t1 AS
> SELECT /*+ STREAMTABLE(t2) */ t2.col1, t3.col1
> FROM table2 t2 JOIN table3 t3
>
> table2: 997406 rows, total bytes: 20848934 -- 19.88 mb
> table3: 20773 rows, total bytes: 353127 -- 0.33 mb
>
> # of mappers: 4
> # of reducers: 1
>
> Regards
> Abhi
>
> On Sep 30, 2012, at 9:35 AM, Bejoy KS <bejo...@outlook.com> wrote:
>
> > Hi Abhishek
> >
> > No need of any similar columns for map join to work. It is just taking
> > the join process to the mapper rather than doing the same in a reducer.
> >
> > The actual bottleneck is the single reducer. Need to figure out why only
> > one reducer is fired rather than the set value of 17.
> >
> > Are you using ORDER BY in your query? If so, it sets the number of
> > reducers to 1.
> >
> > Can you provide the full console stack here so that we'll be able to
> > understand your issue and help you better? (Starting from the properties
> > you set, your query and the error.) Also, can you get the exact data
> > sizes for the two tables.
> >
> > Regards
> > Bejoy KS
> >
> > > From: abhishek.dod...@gmail.com
> > > Date: Sat, 29 Sep 2012 07:44:06 -0700
> > > Subject: Re: Cartesian Product in HIVE
> > > To: user@hive.apache.org; bejoy...@yahoo.com
> > >
> > > Thanks for the reply Bejoy.
> > >
> > > I tried the map join by setting the property you mentioned, and even
> > > increased the small table file size (the 20k table size would not be
> > > more than 200 mb), but it does not work.
> > >
> > > This is a Cartesian product of tables; they don't have any similar
> > > columns. Does map join work here?
> > >
> > > By applying the settings below with the STREAMTABLE hint, it was
> > > processing around 5 billion rows per hour, so the process took around
> > > 4 hrs.
> > >
> > > Set io.sort.mb=512
> > > Set mapred.reduce.tasks=17
> > > Set io.sort.factor=256
> > > Set
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
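
For scale, the figures quoted in this thread can be cross-checked with a little arithmetic. The sketch below is only an illustration, not part of anyone's Hive job: it multiplies the two reported row counts to get the size of the Cartesian product and divides by the reported throughput. It assumes that "5 billion rows per hour" refers to joined output rows; the constant and function names are made up for this example.

```python
# Row counts as reported in the thread for table2 and table3.
TABLE2_ROWS = 997_406
TABLE3_ROWS = 20_773

# Reported throughput of the STREAMTABLE run ("around 5 billion rows per hour").
# Assumption: this counts joined output rows, not input rows.
ROWS_PER_HOUR = 5_000_000_000


def cartesian_rows(left_rows: int, right_rows: int) -> int:
    """A join with no join condition pairs every left row with every right row."""
    return left_rows * right_rows


def estimated_hours(total_rows: int, rows_per_hour: int) -> float:
    """Wall-clock estimate at a constant processing rate."""
    return total_rows / rows_per_hour


total = cartesian_rows(TABLE2_ROWS, TABLE3_ROWS)
hours = estimated_hours(total, ROWS_PER_HOUR)

print(f"output rows: {total:,}")
print(f"estimated hours: {hours:.1f}")
```

Under that assumption the product comes to roughly 20.7 billion rows, or a little over 4 hours at the quoted rate, which lines up with the "around 4 hrs" reported above.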