How about watching one Hive job's stacks? Can they be watched by jobId?

Using ps -Lf hiveserverPid | wc -l, one hiveserver shows 132 threads:

[root@d048049 logs]# ps -Lf 15511|wc -l
132
[root@d048049 logs]#

If every thread stack is 10 MB, the memory will be 1320 MB, about 1.3 GB. So is 1 GB the lowest memory a hiveserver can run with?
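For what it's worth, a minimal sketch of inspecting one hiveserver's threads, assuming the PID 15511 from the output above. Note that both jstack and the stacks servlet dump the whole process, so there is no direct per-jobId view; the closest is grepping the dump:

  # Hiveserver PID taken from the ps output above.
  HS_PID=15511

  # ps -Lf prints one header line plus one line per thread,
  # so subtract 1 to get the actual thread count.
  THREADS=$(( $(ps -Lf $HS_PID | wc -l) - 1 ))

  # The 10 MB per-stack figure above is an assumption; the 64-bit
  # HotSpot default is typically 1 MB unless -Xss overrides it.
  echo "$THREADS threads x 10 MB stack = $(( THREADS * 10 )) MB"

  # Dump every Java thread stack with the JDK's jstack tool,
  # then grep for the threads you care about.
  jstack $HS_PID > /tmp/hiveserver-stacks.txt

  # Or fetch the stack traces from the web interface Alex mentions below:
  curl http://jobtracker:50030/stacks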
On 2011-12-12 17:59:52, "alo alt" <wget.n...@googlemail.com> wrote:
>When you start a high-load Hive query, can you watch the stack traces?
>It's possible over the web interface:
>http://jobtracker:50030/stacks
>
>- Alex
>
>2011/12/12 王锋 <wfeng1...@163.com>
>>
>> hiveserver will throw an OOM after several hours.
>>
>> At 2011-12-12 17:39:21, "alo alt" <wget.n...@googlemail.com> wrote:
>>
>> What happens when you set xmx=2048m or similar? Did that have any negative
>> effects on running queries?
>>
>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>
>>> I have modified the Hive JVM args.
>>> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m.
>>>
>>> But the memory used by hiveserver is still large.
>>>
>>> At 2011-12-12 16:20:54, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>
>>> Not from the running jobs. What I am saying is that the heap size of the
>>> Hadoop namenode really depends on the number of files and directories on
>>> HDFS. Removing old files periodically or merging small files would bring
>>> some performance boost.
>>>
>>> On the Hive end, the memory consumed also depends on the queries that are
>>> executed. Monitor the reducers of the Hadoop job; my experience is that
>>> the reduce part could be the bottleneck here.
>>>
>>> It's totally okay to host multiple Hive servers on one machine.
>>>
>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>
>>>> Are the files you mentioned the files from jobs that have run on our
>>>> system? They can't be that large.
>>>>
>>>> Why would the namenode be the cause? What is the hiveserver doing when
>>>> it uses so much memory?
>>>>
>>>> How do you use Hive? Is our way of using hiveserver correct?
>>>>
>>>> Thanks.
>>>>
>>>> On 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>
>>>> Not sure if this is because of the number of files, since the namenode
>>>> tracks each file, directory, and block.
>>>> See this one: http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>
>>>> Please correct me if I am wrong, because this seems to be more of an
>>>> HDFS problem, which is actually irrelevant to Hive.
>>>>
>>>> Thanks
>>>> Aaron
>>>>
>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>
>>>>> I want to know why the hiveserver uses so much memory, and where the
>>>>> memory has been used.
>>>>>
>>>>> On 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>>>
>>>>> The namenode summary: (screenshot omitted)
>>>>>
>>>>> The MR summary: (screenshot omitted)
>>>>>
>>>>> And hiveserver: (screenshot omitted)
>>>>>
>>>>> hiveserver JVM args:
>>>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit
>>>>> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
>>>>>
>>>>> Now we are running 3 hiveservers on the same machine.
>>>>>
>>>>> On 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>
>>>>> How does the data look? And what's the size of the cluster?
>>>>>
>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm an engineer at sina.com. We have used Hive and hiveserver for
>>>>>> several months. We have our own task scheduling system, which
>>>>>> schedules tasks running against hiveserver over JDBC.
>>>>>>
>>>>>> But the hiveserver uses very large memory, usually more than 10 GB.
>>>>>> We have 5-minute tasks which run every 5 minutes, and hourly tasks;
>>>>>> the total number of tasks is 40. And we start 3 hiveservers on one
>>>>>> Linux server, with connections cycling among them.
>>>>>>
>>>>>> So why is the memory use of hiveserver so large, and what should we
>>>>>> do? Any suggestions?
>>>>>>
>>>>>> Thanks and Best Regards!
>>>>>>
>>>>>> Royce Wang
>>>>>>
>>
>> --
>> Alexander Lorenz
>> http://mapredit.blogspot.com
>>
>> Think of the environment: please don't print this email unless you really
>> need to.
>
>--
>Alexander Lorenz
>http://mapredit.blogspot.com
>
>Think of the environment: please don't print this email unless you
>really need to.
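For reference, a sketch of the more conservative heap settings Alex suggests above (xmx=2048m), with an explicit thread-stack size added on the assumption that stack memory is part of the footprint; the -Xss value is illustrative, not something from the thread:

  # Hypothetical conservative hiveserver settings per Alex's suggestion.
  # -Xss512k caps each thread stack (the 64-bit HotSpot default is about
  # 1 MB), so ~132 threads cost roughly 66 MB of stack rather than the
  # 1.3 GB estimated at the top of the thread.
  export HADOOP_OPTS="$HADOOP_OPTS -Xms2048m -Xmx2048m -Xss512k \
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"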
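Aaron's small-files point can also be checked directly: the namenode keeps every file, directory, and block in heap, so counting them shows whether that is where the pressure comes from. A sketch, assuming the default warehouse path:

  # Totals tracked by the namenode, from an HDFS filesystem check.
  hadoop fsck / | egrep 'Total (dirs|files|blocks)'

  # Per-path counts: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
  # /user/hive/warehouse is Hive's default warehouse location.
  hadoop fs -count /user/hive/warehouse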
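And one plausible way to run the three hiveservers per machine described above (HiveServer1-era; HIVE_PORT is, as far as I recall, how the stock service script picks its Thrift port, defaulting to 10000):

  # Start three HiveServer1 instances on one box, one port each.
  for port in 10000 10001 10002; do
    HIVE_PORT=$port nohup hive --service hiveserver \
      > /var/log/hive/hiveserver-$port.log 2>&1 &
  done

Clients can then rotate across jdbc:hive://host:10000/default, :10001, and :10002, which matches the cycling setup described in the thread.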