Hi All,

I have a question about 'gstat'. I'm running Ganglia 3.7.1. When I run
'gstat', some of my machines show plausible Procs/Total numbers, but most
show a Total of 0. Please see below.


================================================================

[root@clusterhn: ~/temp]# gstat
CLUSTER INFORMATION
       Name: My Cluster
      Hosts: 22
Gexec Hosts: 22
 Dead Hosts: 0
  Localtime: Wed Nov 18 13:56:58 2015

CLUSTER HOSTS
Hostname                     LOAD                       CPU              Gexec
 CPUs (Procs/Total) [     1,     5, 15min] [  User,  Nice, System, Idle, Wio]

n012.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n006.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n017.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n016.cluster.com
   16 (    1/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n015.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n014.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n013.cluster.com
   16 (    0/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n021.cluster.com
   16 (    0/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n019.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n020.cluster.com
   16 (    1/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n011.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n010.cluster.com
   16 (    1/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n009.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n008.cluster.com
   16 (    0/  425) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n007.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n018.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n004.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,  14.7,   6.7,  63.3,   0.0] ON
n003.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n002.cluster.com
   16 (    0/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
n001.cluster.com
   16 (    0/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0] ON
clusterhn.cluster.com
   12 (    0/  749) [  0.08,  0.07,  0.08] [   0.0,   0.0,   0.0,  99.2,   0.7] ON
n005.cluster.com
   16 (    2/    0) [  6.94,  8.60,  0.00] [   0.0,   9.3,   0.8,  87.1,   2.7] ON
================================================================
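
To make the pattern easier to see, the hosts reporting a Total of 0 can be
pulled out of the output above. This is only a minimal sketch: it assumes
gstat keeps the layout shown here (a hostname line followed by a stats line),
and the function names and the SAMPLE text are my own, not part of Ganglia.

```python
import re

# Matches the start of a per-host stats line, e.g. "   16 (    0/  424) [ ..."
STATS_RE = re.compile(r"^\s*(\d+)\s*\(\s*(\d+)/\s*(\d+)\)")

# Two hosts copied from the output above (illustrative sample only).
SAMPLE = """\
n016.cluster.com
   16 (    1/    0) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0]
ON
n013.cluster.com
   16 (    0/  424) [  0.00,  0.00,  0.00] [   0.0,   0.0,   0.0, 100.0,   0.0]
ON
"""


def parse_gstat_hosts(text):
    """Return a list of (hostname, cpus, procs, total) tuples."""
    hosts = []
    previous = ""  # last non-blank line seen; the hostname precedes its stats
    for line in text.splitlines():
        match = STATS_RE.match(line)
        if match and previous:
            cpus, procs, total = (int(group) for group in match.groups())
            hosts.append((previous, cpus, procs, total))
        stripped = line.strip()
        if stripped:
            previous = stripped
    return hosts


def hosts_missing_total(text):
    """Return hostnames whose Total column is 0."""
    return [host for host, _, _, total in parse_gstat_hosts(text) if total == 0]


print(hosts_missing_total(SAMPLE))
```

The parser works whether or not the trailing Gexec flag wraps onto its own
line, since it only looks at the line immediately before each stats line.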

I did the following to reset the data shown by gstat:
1) Stop the 'gmetad' and 'gmond' services on the HeadNode
2) Stop the 'gmond' service on the ComputeNodes
3) Remove '/var/lib/ganglia/rrds/*'
4) Start the 'gmetad' and 'gmond' services on the HeadNode
5) Start the 'gmond' service on the ComputeNodes
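
For clarity about the order, the steps above can be written out as a dry-run
plan. This is a sketch under assumptions, not the exact commands I typed: the
'service ... stop/start' form, the 'rm -rf' invocation, running the removal
on the HeadNode, and the helper name reset_plan are all illustrative. It only
builds and prints the list; it does not execute anything.

```python
def reset_plan(head_node, compute_nodes):
    """Return the ordered (host, command) steps of the reset described above."""
    # Step 1: stop gmetad and gmond on the head node.
    steps = [(head_node, "service gmetad stop"), (head_node, "service gmond stop")]
    # Step 2: stop gmond on every compute node.
    steps += [(node, "service gmond stop") for node in compute_nodes]
    # Step 3: remove the RRD files (assumed here to live on the head node).
    steps.append((head_node, "rm -rf /var/lib/ganglia/rrds/*"))
    # Step 4: start gmetad and gmond on the head node.
    steps += [(head_node, "service gmetad start"), (head_node, "service gmond start")]
    # Step 5: start gmond on every compute node.
    steps += [(node, "service gmond start") for node in compute_nodes]
    return steps


# Hostnames taken from the gstat output; only two compute nodes shown.
for host, command in reset_plan("clusterhn", ["n001", "n002"]):
    print(f"{host}: {command}")
```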

All of the nodes are responding to 'gstat', but the details aren't populating
on most of the ComputeNodes.

I would also appreciate an explanation of what the Procs and Total values in
the (Procs/Total) column mean, so that I can interpret the output correctly.

Any help is greatly appreciated!

Thanks.

--
Kamran Khan
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general