Hi,

I'm building a system to monitor my Hadoop cluster. I can get metrics about the
cluster via Hadoop metrics
(https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Metrics.html?spm=5176.2020520111.111.1.278ad103oLtdlm#NodeManagerMetrics):


ClusterMetrics

ClusterMetrics shows the metrics of the YARN cluster from the ResourceManager’s 
perspective. Each metrics record contains Hostname tag as additional 
information along with metrics.

Name                    Description
NumActiveNMs            Current number of active NodeManagers
NumDecommissionedNMs    Current number of decommissioned NodeManagers
NumLostNMs              Current number of lost NodeManagers for not sending heartbeats
NumUnhealthyNMs         Current number of unhealthy NodeManagers
NumRebootedNMs          Current number of rebooted NodeManagers
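
For reference, here is a minimal sketch of how these counters can be read over
HTTP. It assumes the ResourceManager serves the standard Hadoop /jmx servlet on
its web port (8088 by default); "rm-host" and the helper names are
hypothetical placeholders, not part of Hadoop. Note that it only returns the
counts, not the names of the affected nodes.

```python
import json
from urllib.request import urlopen

# Assumption: ResourceManager web UI on the default port 8088, with the
# standard /jmx servlet enabled. "rm-host" is a placeholder hostname.
RM_JMX_URL = (
    "http://rm-host:8088/jmx"
    "?qry=Hadoop:service=ResourceManager,name=ClusterMetrics"
)

# Counter names as listed in the ClusterMetrics table.
NM_COUNTERS = (
    "NumActiveNMs",
    "NumDecommissionedNMs",
    "NumLostNMs",
    "NumUnhealthyNMs",
    "NumRebootedNMs",
)


def extract_nm_counts(payload):
    """Pick the NodeManager counters out of a parsed /jmx response."""
    for bean in payload.get("beans", []):
        if bean.get("name", "").endswith("name=ClusterMetrics"):
            return {key: bean[key] for key in NM_COUNTERS if key in bean}
    return {}


def fetch_nm_counts(url=RM_JMX_URL, timeout=10):
    """Fetch the ClusterMetrics bean and return its NodeManager counters."""
    with urlopen(url, timeout=timeout) as response:
        return extract_nm_counts(json.load(response))
```

Calling fetch_nm_counts() against a live ResourceManager would return a dict
such as {"NumLostNMs": 1, "NumUnhealthyNMs": 0, ...}, depending on the state of
the cluster.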



How can I find out which NodeManagers are unhealthy and which are lost? Ideally
this could be done by calling the JMX REST API or a hadoop command.


Any suggestions are appreciated, thank you.



HUANG


