Thank you Harsh,

What are the possible values for the state in the LiveNodeManagers bean? Will LOST, ACTIVE, REBOOTED and DECOMMISSIONED show up in the state field?

________________________________
From: Harsh J <ha...@cloudera.com>
Sent: October 15, 2018 12:46:49
To: ims...@outlook.com
Cc: <user@hadoop.apache.org>
Subject: Re: How can I find out which nodemanagers are unhealthy and which nodemanagers are lost?

The JMX servlet query for 'RMNMInfo' done via
/jmx?qry=Hadoop:service=ResourceManager,name=RMNMInfo returns a
LiveNodeManagers bean whose value is a JSON-parseable string of all
currently-tracked NodeManagers and their actual states (UNHEALTHY,
RUNNING, etc.).

You can also use the 'yarn node -list' command to retrieve similar
information from a CLI.

On Mon, Oct 15, 2018 at 8:48 AM Huang Meilong <ims...@outlook.com> wrote:
>
> Hi,
>
> I'm building a system to monitor my Hadoop cluster. I can get metrics about
> the cluster via Hadoop metrics
> (https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Metrics.html?spm=5176.2020520111.111.1.278ad103oLtdlm#NodeManagerMetrics):
>
> ClusterMetrics
>
> ClusterMetrics shows the metrics of the YARN cluster from the
> ResourceManager’s perspective. Each metrics record contains Hostname tag as
> additional information along with metrics.
>
> Name                  Description
> NumActiveNMs          Current number of active NodeManagers
> NumDecommissionedNMs  Current number of decommissioned NodeManagers
> NumLostNMs            Current number of lost NodeManagers for not sending heartbeats
> NumUnhealthyNMs       Current number of unhealthy NodeManagers
> NumRebootedNMs        Current number of rebooted NodeManagers
>
> How can I find out which NodeManagers are unhealthy and which are lost? Better
> if it could be achieved by calling the JMX REST API or a hadoop command.
>
> Any suggestions are appreciated, thank you.
>
> HUANG

--
Harsh J
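
For reference, the JMX approach described in the reply could be scripted roughly as below. This is only a sketch under assumptions not confirmed in this thread: that the servlet wraps results in a top-level "beans" list, that each entry in the LiveNodeManagers string has "HostName" and "State" keys, and that the ResourceManager web address is passed in by the caller. The sample payload and host names are made up for illustration.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

# Query string taken from the reply in this thread.
QUERY = "/jmx?qry=Hadoop:service=ResourceManager,name=RMNMInfo"


def fetch_rmnminfo(rm_http_address):
    """Fetch the RMNMInfo bean from the ResourceManager JMX servlet.

    rm_http_address is the RM web address, e.g. "http://rm-host:8088"
    (hypothetical; use the address of your own cluster).
    """
    with urlopen(rm_http_address + QUERY, timeout=10) as resp:
        return json.load(resp)


def group_nodes_by_state(jmx_payload):
    """Return {state: [hostname, ...]} from a parsed JMX servlet response.

    Assumes LiveNodeManagers is a JSON string holding a list of objects
    with "HostName" and "State" keys (an assumption, see the note above).
    """
    grouped = defaultdict(list)
    for bean in jmx_payload.get("beans", []):
        raw = bean.get("LiveNodeManagers")
        if raw is None:
            continue
        # The attribute value is itself a JSON document encoded as a string.
        for node in json.loads(raw):
            state = node.get("State", "UNKNOWN")
            grouped[state].append(node.get("HostName", "?"))
    return dict(grouped)


# Made-up payload in the assumed shape, for illustration only.
SAMPLE = {
    "beans": [
        {
            "name": "Hadoop:service=ResourceManager,name=RMNMInfo",
            "LiveNodeManagers": json.dumps(
                [
                    {"HostName": "nm1.example.com", "State": "RUNNING"},
                    {"HostName": "nm2.example.com", "State": "UNHEALTHY"},
                ]
            ),
        }
    ]
}

if __name__ == "__main__":
    for state, hosts in sorted(group_nodes_by_state(SAMPLE).items()):
        print(state, ", ".join(hosts))
```

Against a live cluster, group_nodes_by_state(fetch_rmnminfo(...)) would be the call to make. The sketch just groups whatever state strings come back, so it does not settle whether LOST or DECOMMISSIONED nodes appear in this bean.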