[
https://issues.apache.org/jira/browse/SLIDER-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094323#comment-15094323
]
Sumit Mohanty commented on SLIDER-1055:
---------------------------------------
I have noticed this before and assumed that some YARN config option was
missing. It's possible for agents to start the process while keeping it in the
same process group, but I'm not sure whether that will remedy the situation.
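To make the process-group idea concrete, here is a rough sketch, not YARN or Slider code. It takes the PPID/PID/PGID/SID columns from the "ps axjf" output quoted below and compares a parent/child walk (the behaviour the reporter describes) with a lookup by process group. The names PS_ROWS, tree_by_ppid and members_by_pgid are made up for this illustration.

```python
# (ppid, pid, pgid, sid) copied from the "ps axjf" output in the issue.
PS_ROWS = [
    (9798, 9801, 9801, 9801),  # /bin/bash -c python .../main.py
    (9801, 9806, 9801, 9801),  # python .../main.py (slider-agent)
    (1,    9979, 9801, 9801),  # bash .../hbase-daemon.sh, reparented to pid 1
    (9979, 9994, 9801, 9801),  # java ... HRegionServer
]


def tree_by_ppid(rows, root):
    """Collect root and its descendants by following parent -> child links."""
    children = {}
    for ppid, pid, _pgid, _sid in rows:
        children.setdefault(ppid, []).append(pid)
    found, stack = [], [root]
    while stack:
        pid = stack.pop()
        found.append(pid)
        stack.extend(children.get(pid, []))
    return sorted(found)


def members_by_pgid(rows, pgid):
    """Collect every process whose process group id equals pgid."""
    return sorted(pid for _ppid, pid, group, _sid in rows if group == pgid)


print(tree_by_ppid(PS_ROWS, 9801))     # [9801, 9806]
print(members_by_pgid(PS_ROWS, 9801))  # [9801, 9806, 9979, 9994]
```

The parent/child walk reproduces the [ 9801 9806 ] list from the NodeManager log, while the group lookup also finds the hbase-daemon.sh and java processes. In this particular ps output the daemon already shares PGID 9801, so whether grouping by PGID helps depends on how the monitor builds its tree, which this sketch does not model.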
> hbase-daemon executed by slider is excepted from nodemanager container
> monitoring
> ---------------------------------------------------------------------------------
>
> Key: SLIDER-1055
> URL: https://issues.apache.org/jira/browse/SLIDER-1055
> Project: Slider
> Issue Type: Bug
> Components: application/hbase
> Affects Versions: Slider 0.81
> Reporter: kyungwan nam
>
> Here is the NodeManager log from a host where an HBASE_REGIONSERVER component
> is running:
> {code}
> 2016-01-12 14:11:49,237 DEBUG monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(361)) - Current ProcessTree list : [ 9801 ]
> 2016-01-12 14:11:49,237 DEBUG monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(436)) - Constructing ProcessTree for : PID =
> 9801 ContainerId = container_e07_1451897008090_0009_01_000003
> 2016-01-12 14:11:49,262 DEBUG util.ProcfsBasedProcessTree
> (ProcfsBasedProcessTree.java:updateProcessTree(274)) - [ 9801 9806 ]
> 2016-01-12 14:11:49,262 INFO monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(458)) - Memory usage of ProcessTree 9801 for
> container-id container_e07_1451897008090_0009_01_000003: 14.2 MB of 1 GB
> physical memory used; 517.1 MB of 2.1 GB virtual memory used
> {code}
> The memory usage reported for the container is lower than I expected,
> because the monitored pids ( 9801 9806 ) are the slider-agent processes. The
> regionserver process was excluded from monitoring.
> Here is the result of "ps axjf":
> {code}
> 9798 9801 9801 9801 ? -1 Ss 500 0:00 \_ /bin/bash -c
> python ./infra/agent/slider-agent/agent/main.py --label
> container_e07_1451897008090_0009_01_000003___HBASE_REGIONSERVER --zk-quorum
> 9801 9806 9801 9801 ? -1 Sl 500 0:01 \_ python
> ./infra/agent/slider-agent/agent/main.py --label
> container_e07_1451897008090_0009_01_000003___HBASE_REGIONSERVER --zk-quorum
> 1 9979 9801 9801 ? -1 S 500 0:00 bash
> /volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/bin/hbase-daemon.sh
> --config
> /volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/conf
> foreground_start regionserver
> 9979 9994 9801 9801 ? -1 Sl 500 0:10 \_
> /package/jdk-1.7.0_45/bin/java -Dproc_regionserver
> -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC
> -XX:ErrorFile=/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/hs_err_pid%p.log
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -Xloggc:/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/gc.log-201601121408
> -Xmn200m -XX:CMSInitiatingOccupancyFraction=70 -Xms1024m -Xmx1024m
> -Dhbase.log.dir=/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003
> -Dhbase.log.file=hbase-yarn-regionserver.log
> -Dhbase.home.dir=/volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/bin/..
> -Dhbase.id.str=yarn -Dhbase.root.logger=INFO,RFA
> -Djava.library.path=/package/hadoop-yarn-2.7.1-arch-centos6-x86_64/lib/native
> -Dhbase.security.logger=INFO,RFAS
> org.apache.hadoop.hbase.regionserver.HRegionServer start
> {code}
> When I use the ProcfsBasedProcessTree (the default), the process tree is
> determined by the relationship between parent and child processes, so a
> daemonized process (ppid=1) can't be included in the process tree.
> I don't know whether this can be fixed in Slider.
> Does it need another ResourceCalculatorProcessTree implementation to replace
> the ProcfsBasedProcessTree?