Ah, I see. Take a look at the NN (NameNode). Hive uses HDFS, and if you have jobs with many small files in a table (log files, for example) and a large cluster, the NN could be a bottleneck.
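To get a rough idea of whether small files are an issue, something like the following might help. It is only a sketch: `avg_file_size` is a made-up helper name, the warehouse path in the usage line is an assumption, and it relies on `hadoop fs -count` printing DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME in that order.

```shell
# Sketch: estimate the average file size under a table directory.
# avg_file_size is a hypothetical helper; it reads `hadoop fs -count` output
# (columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME) on stdin.
avg_file_size() {
  awk '{
    if ($2 > 0) printf "%s files=%d avg_bytes=%d\n", $4, $2, $3 / $2
    else        printf "%s files=0\n", $4
  }'
}

# Usage (the path is an assumption; adjust it to your table location):
# hadoop fs -count /user/hive/warehouse/mytable | avg_file_size
```

An average far below the HDFS block size would suggest many small files, and every file and block is tracked in NameNode memory.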
- Alex

On Mon, Dec 12, 2011 at 9:20 AM, 王锋 <wfeng1...@163.com> wrote:

> Before, I set -xmx 2g, but hiveserver threw many OOM exceptions. So I reset
> it, and in the end I set xmx=15g, newRatio=1. I have watched hiveserver for
> a long time: it uses a lot of memory when running jobs, usually 8g, 10g,
> or 15g. So I set xmx=15g and newRatio=1, so that the young generation is
> large enough to support concurrently running jobs and GC runs quickly.
>
> At 2011-12-12 16:09:05, "alo alt" <wget.n...@googlemail.com> wrote:
>
> Hi,
>
> Do I see it right that you set Java with -xmx=15000M? And you set the
> minimum heap size (xms) = 15000M?
> Here you give Java no chance to use less than 15GB of memory, because min
> says 15000M, and max too. I'm wondering why any Java process would need 15G
> of memory. It could be the case in large Tomcat or JBoss environments, but
> for Hive I'm not quite sure.
>
> - Alex
>
> 2011/12/12 王锋 <wfeng1...@163.com>
>
>> I want to know why hiveserver uses so much memory, and where the memory
>> is being used.
>>
>> At 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>
>> The namenode summary:
>>
>> the mr summary:
>>
>> and hiveserver:
>>
>> hiveserver jvm args:
>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC
>> -XX:-UseGCOverheadLimit -verbose:gc -XX:+PrintGCDetails
>> -XX:+PrintGCTimeStamps"
>>
>> We are now running 3 hiveservers on the same machine.
>>
>> At 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>
>> What does the data look like? And what's the size of the cluster?
>>
>> 2011/12/11 王锋 <wfeng1...@163.com>
>>
>>> Hi,
>>>
>>> I'm one of the engineers at sina.com. We have used hive and hiveserver
>>> for several months. We have our own task scheduling system, which
>>> schedules tasks to run on hiveserver via JDBC.
>>>
>>> But hiveserver uses a lot of memory, usually more than 10g. We have
>>> 5-minute tasks, which run every 5 minutes, and hourly tasks; the total
>>> number of tasks is 40. We start 3 hiveservers on one Linux server, and
>>> they are connected to in rotation.
>>>
>>> So why does hiveserver use so much memory, and what should we do? Do
>>> you have any suggestions?
>>>
>>> Thanks and Best Regards!
>>>
>>> Royce Wang
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> Think of the environment: please don't print this email unless you
> really need to.

--
Alexander Lorenz
http://mapredit.blogspot.com
<<hiveserver.png>>
<<mr.png>>
<<namenode.png>>