Hi All,
I have a question about 'gstat'. I'm running Ganglia 3.7.1. When I run
'gstat', some of my machines show plausible Procs/Total numbers, but most
show 0/0. Please see the output below.
================================================================
[root@clusterhn: ~/temp]# gstat
CLUSTER INFORMATION
Name: My Cluster
Hosts: 22
Gexec Hosts: 22
Dead Hosts: 0
Localtime: Wed Nov 18 13:56:58 2015
CLUSTER HOSTS
Hostname LOAD CPU Gexec
CPUs (Procs/Total) [ 1, 5, 15min] [ User, Nice, System, Idle, Wio]
n012.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n006.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n017.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n016.cluster.com
16 ( 1/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n015.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n014.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n013.cluster.com
16 ( 0/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n021.cluster.com
16 ( 0/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n019.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n020.cluster.com
16 ( 1/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n011.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n010.cluster.com
16 ( 1/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n009.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n008.cluster.com
16 ( 0/ 425) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n007.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n018.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n004.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 14.7, 6.7, 63.3, 0.0] ON
n003.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n002.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
n001.cluster.com
16 ( 0/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0] ON
clusterhn.cluster.com
12 ( 0/ 749) [ 0.08, 0.07, 0.08] [ 0.0, 0.0, 0.0, 99.2, 0.7] ON
n005.cluster.com
16 ( 2/ 0) [ 6.94, 8.60, 0.00] [ 0.0, 9.3, 0.8, 87.1, 2.7] ON
================================================================
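In case it helps anyone reproduce the pattern, here is a minimal Python sketch that picks out the hosts whose Total column is 0 from text shaped like the gstat output above. The function names (parse_gstat, hosts_missing_total) and the regular expression are my own assumptions based only on the layout shown here, not anything that ships with Ganglia, and the sample is a shortened copy of three hosts.

```python
import re

# Assumed shape of a stats line, taken from the output above:
#   "16 ( 0/ 424) [ 0.00, ..."  ->  CPUs, Procs, Total
STATS_RE = re.compile(r"^\s*(\d+)\s*\(\s*(\d+)\s*/\s*(\d+)\s*\)")


def parse_gstat(text):
    """Return {hostname: (cpus, procs, total)} for each host block in text."""
    hosts = {}
    previous = None  # last non-empty line seen; the hostname precedes its stats
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = STATS_RE.match(line)
        if match and previous is not None:
            hosts[previous] = tuple(int(g) for g in match.groups())
        previous = line
    return hosts


def hosts_missing_total(hosts):
    """Return the sorted hostnames whose Total value is 0."""
    return sorted(name for name, (_, _, total) in hosts.items() if total == 0)


# Shortened sample copied from the listing above.
SAMPLE = """\
n012.cluster.com
16 ( 0/ 0) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0]
ON
n013.cluster.com
16 ( 0/ 424) [ 0.00, 0.00, 0.00] [ 0.0, 0.0, 0.0, 100.0, 0.0]
ON
clusterhn.cluster.com
12 ( 0/ 749) [ 0.08, 0.07, 0.08] [ 0.0, 0.0, 0.0, 99.2, 0.7]
ON
"""

if __name__ == "__main__":
    for name in hosts_missing_total(parse_gstat(SAMPLE)):
        print(name)
```

The sketch treats the line before each stats line as the hostname, so it should not matter whether the Gexec flag (ON) sits on the stats line or on a line of its own.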
I have done the following to reset the data in gstat:
1) Stop the 'gmetad' and 'gmond' services on the head node
2) Stop the 'gmond' service on the compute nodes
3) Remove '/var/lib/ganglia/rrds/*' on the head node
4) Start the 'gmetad' and 'gmond' services on the head node
5) Start the 'gmond' service on the compute nodes
All of the nodes are responding to 'gstat', but the details aren't populating
for most of the compute nodes.
I would also appreciate a short explanation of what the Procs and Total values
in the 'CPUs (Procs/Total)' column mean, so I can understand the output better.
Any help is greatly appreciated!
Thanks.
--
Kamran Khan
------------------------------------------------------------------------------
_______________________________________________
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general