Hi,

It's a simple join:

select count(*) from customer
JOIN supplier ON (customer.c_nationkey = supplier.s_nationkey);

customer (2.2 GB) and supplier (137 MB) are generated TPC-H tables.

A total of 40 map tasks are generated for this query.
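
Putting the advice quoted below together, a minimal mapred-site.xml sketch: the -Xmx512m value is the one suggested in the thread, and io.sort.mb is the 300 MB already in use. The only hard requirement is that io.sort.mb fit comfortably inside the child heap; the exact values are assumptions to be tuned for a 2 GB node.

```xml
<!-- Sketch for mapred-site.xml: give each child task JVM more heap
     than the io.sort.mb buffer needs. Values are assumptions from
     the thread, not a tested configuration. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>300</value> <!-- must stay well below the child heap size -->
</property>
```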

Thanks

On Mon, Sep 19, 2011 at 7:08 PM, <[email protected]> wrote:

> John
>    Can you share the Hive QL you are using for the joins?
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: john smith <[email protected]>
> Date: Mon, 19 Sep 2011 19:02:02
> To: <[email protected]>
> Reply-To: [email protected]
> Subject: Re: Out of heap space errors on TTs
>
> Hi all,
>
> Thanks for the inputs...
>
> Can I reduce io.sort.mb, given that I have only 2 GB of RAM?
>
> My conf files don't have an entry for mapred.child.java.opts, so I guess
> it's taking the default value of 200 MB.
>
> Also, how do I decide the number of tasks per TT? I have 4 cores per node
> and 2 GB of total memory, so what maximum number of tasks per node should I set?
>
> Thanks
>
> On Mon, Sep 19, 2011 at 6:28 PM, Uma Maheswara Rao G 72686 <
> [email protected]> wrote:
>
> > Hello,
> >
> > You need to configure the heap size for child tasks using the property
> > "mapred.child.java.opts" in mapred-site.xml.
> >
> > By default it is 200 MB, but your io.sort.mb (300) is larger than that,
> > so configure more heap space for the child tasks.
> >
> > ex:
> >  -Xmx512m
> >
> > Regards,
> > Uma
> >
> > ----- Original Message -----
> > From: john smith <[email protected]>
> > Date: Monday, September 19, 2011 6:14 pm
> > Subject: Out of heap space errors on TTs
> > To: [email protected]
> >
> > > Hey guys,
> > >
> > > I am running Hive and trying to join two tables (2.2 GB and 136 MB)
> > > on a cluster of 9 nodes (replication = 3).
> > >
> > > Hadoop version - 0.20.2
> > > Each data node memory - 2GB
> > > HADOOP_HEAPSIZE - 1000MB
> > >
> > > Other heap settings are defaults. Hive launches 40 map tasks and
> > > every task fails with the same error:
> > >
> > > 2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask:
> > > io.sort.mb = 300
> > > 2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker:
> > > Error running child : java.lang.OutOfMemoryError: Java heap space
> > >       at
> > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
> > >       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
> > >       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >       at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > >
> > >
> > > Looks like I need to tweak some of the heap settings for TTs to handle
> > > the memory efficiently. I am unable to understand which variables to
> > > modify (there are too many related to heap sizes).
> > >
> > > Any specific things I must look at?
> > >
> > > Thanks,
> > >
> > > jS
> > >
> >
>
>
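
On the tasks-per-TT question above, the limit on a 2 GB node is memory, not the 4 cores. A back-of-the-envelope sketch using the numbers from the thread; the 1300 MB daemon figure (TaskTracker + DataNode + OS overhead) is a rough assumption, not something measured:

```python
# Rough slots-per-node math for one TaskTracker host. The daemon
# overhead (1300 MB) is an assumed figure, not from the thread.
def max_tasks(total_mb, daemon_mb, child_heap_mb):
    """How many child task JVMs fit in the RAM left after the daemons."""
    return (total_mb - daemon_mb) // child_heap_mb

print(max_tasks(2048, 1300, 200))  # with the default -Xmx200m child heap
print(max_tasks(2048, 1300, 512))  # with the suggested -Xmx512m
```

With the default 200 MB heap about three tasks fit; once the child heap is raised to 512 MB to accommodate io.sort.mb = 300, only one task per node fits, so mapred.tasktracker.map.tasks.maximum should be set accordingly.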
