Hi,

It's a simple join:
select count(*) from customer JOIN supplier ON (customer.c_nationkey = supplier.s_nationkey);

customer (2.2GB) and supplier (137MB) are generated TPC-H tables. A total of
40 map tasks are launched for this query.

Thanks

On Mon, Sep 19, 2011 at 7:08 PM, <[email protected]> wrote:

> John
> Can you share the Hive QL you are using for joins?
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: john smith <[email protected]>
> Date: Mon, 19 Sep 2011 19:02:02
> To: <[email protected]>
> Reply-To: [email protected]
> Subject: Re: Out of heap space errors on TTs
>
> Hi all,
>
> Thanks for the inputs.
>
> Can I reduce io.sort.mb, given that I have less RAM (2GB)?
>
> My conf files don't have an entry for mapred.child.java.opts, so I guess
> it's taking the default value of 200MB.
>
> Also, how do I decide the number of tasks per TT? I have 4 cores per node
> and 2GB of total memory, so what maximum number of tasks per node should
> I set?
>
> Thanks
>
> On Mon, Sep 19, 2011 at 6:28 PM, Uma Maheswara Rao G 72686 <
> [email protected]> wrote:
>
> > Hello,
> >
> > You need to configure the heap size for child tasks using the property
> > "mapred.child.java.opts" in mapred-site.xml.
> >
> > By default it is 200MB, but your io.sort.mb (300) is more than that,
> > so configure more heap space for the child tasks.
> >
> > ex:
> > -Xmx512m
> >
> > Regards,
> > Uma
> >
> > ----- Original Message -----
> > From: john smith <[email protected]>
> > Date: Monday, September 19, 2011 6:14 pm
> > Subject: Out of heap space errors on TTs
> > To: [email protected]
> >
> > > Hey guys,
> > >
> > > I am running Hive and I am trying to join two tables (2.2GB and
> > > 136MB) on a cluster of 9 nodes (replication = 3).
> > >
> > > Hadoop version - 0.20.2
> > > Each data node memory - 2GB
> > > HADOOP_HEAPSIZE - 1000MB
> > >
> > > Other heap settings are defaults.
> > > Hive launches 40 map tasks and every task fails with the same error:
> > >
> > > 2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask:
> > > io.sort.mb = 300
> > > 2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker:
> > > Error running child : java.lang.OutOfMemoryError: Java heap space
> > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
> > >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > >
> > > Looks like I need to tweak some of the heap settings for the TTs to
> > > handle memory efficiently. I am unable to understand which variables
> > > to modify (there are too many related to heap sizes).
> > >
> > > Any specific things I must look at?
> > >
> > > Thanks,
> > >
> > > jS
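Putting the suggestions in the thread together, the relevant mapred-site.xml
entries might look roughly like this. This is only a sketch: the -Xmx512m
value comes from Uma's suggestion, while the reduced io.sort.mb and the slot
counts are assumptions sized for the 2GB, 4-core nodes described above and
would need tuning on the actual cluster.

```xml
<!-- mapred-site.xml (Hadoop 0.20.x) - illustrative values only -->
<configuration>
  <!-- Heap for each child (map/reduce) JVM; default is -Xmx200m,
       which is smaller than io.sort.mb=300 and causes the OOM above. -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>

  <!-- Sort buffer must fit comfortably inside the child heap;
       100MB is an assumed value for low-RAM nodes. -->
  <property>
    <name>io.sort.mb</name>
    <value>100</value>
  </property>

  <!-- With ~2GB total RAM per node and ~512MB per child, only a few
       concurrent tasks fit; 2 map + 1 reduce slots is an assumption. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

The TaskTrackers need a restart after changing the slot counts, since those
are daemon-side settings rather than per-job ones.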
