Re: [Ganglia-general] Jittery Displays - number of nodes changing erratically.

Ron Reeder Thu, 17 Jun 2004 15:32:34 -0700

I did the updates ... No joy....

I know I updated the web frontend to 2.5.5, but the footer says 2.5.4


(Actually, I pulled the tarball, exploded it... and checked the conf.php file...
<?php
# $Id: conf.php,v 1.16 2003/07/29 23:55:01 sacerdoti Exp $
#
# Gmetad-webfrontend version. Used to check for updates.
#
$majorversion = 2;
$minorversion = 5;
$microversion = 4;

So, they never updated the microversion ....
naughty...)

So, I've got:

webfrontend 2.5.5-1
gmetad      2.5.6
gmond       2.5.6-1

The old gmonds.... do not have this issue...

What else did I change?


Here's a diff from one of the old compute nodes ... vs. new
< # $Id: gmond.conf,v 1.3 2004/01/20 19:15:23 sacerdoti Exp $
---
> # $Id: gmond.conf,v 1.2 2002/09/19 00:37:18 sacerdoti Exp $
11c11
< name  "K-Cluster"
---
> name  "Linux Compute"
17c17
< owner "Schlumberger"
---
> owner "Denver WesternGeco SLB"
23a24
> latlong "N39.75 W104.87"
28a30
> url "http://ddclx01.denver.nam.slb.com/";
45c47
< # mcast_if  eth1
---
> mcast_if  eth0
69c71,73
< 1xx.1xx.147.179
---
> # 2.3.2.3 3.4.3.4 5.6.5.6
note: ips are x'd by me...
> trusted_hosts 1xx.2xx.6.201 1xx.2xx.12.151 192.168.1.1
> #trusted_hosts 192.168.1.1
74c78
< # num_nodes  1024
---
> num_nodes  128
99c103
< #no_setuid  on
---
> # no_setuid  on
113,120c117
< # rpr - on temporarily ... till where gmetad server will live is decided.
< all_trusted on
< #
< # If you want dead nodes to "time out", enter a nonzero value here. If specified,
< # a host will be removed from our state if we have not heard from it in this
< # number of seconds.
< # default: 0 (immortal)
< host_dmax 3600
---
> # all_trusted on


Ok, so num_nodes is different ... i.e. it defaults to 1024 ..
If this is truly a cluster metric (not a grid metric), that should not matter...

I added the host_dmax ... since I've never seen Ganglia display a host as down....
I think I need to start using deaf as well.
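
The host_dmax comment in the diff above describes a simple timeout rule. Here is a rough Python sketch of that rule as the comment states it. It is not gmond's actual code, and the function names (host_expired, prune) and the timestamps are made up for illustration:

```python
import time


def host_expired(last_heard, dmax, now=None):
    """Return True if a host should be dropped from state.

    last_heard and now are timestamps in seconds; dmax is the
    host_dmax value. A dmax of 0 means the host is never removed
    ("immortal"), matching the default described in gmond.conf.
    """
    if now is None:
        now = time.time()
    if dmax == 0:
        return False
    return (now - last_heard) > dmax


def prune(hosts, dmax, now):
    """Keep only hosts heard from within dmax seconds (hypothetical helper)."""
    return {name: seen for name, seen in hosts.items()
            if not host_expired(seen, dmax, now)}
```

With host_dmax 3600, a node last heard from 4000 seconds ago would be dropped, while with the default of 0 it would stay in the list forever.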

Bernard Li wrote:

Hi Ron:

I would actually try to use consistent versions for both gmetad and
gmond (and the webfrontend too but I don't think it has been updated
recently).

I have tried to use mis-matching versions before and it seems okay, but
I guess it's best to keep things consistent to eliminate all the
possibilities...

Cheers,

Bernard

-----Original Message-----
From: Ron Reeder [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 17, 2004 12:58
To: Bernard Li
Cc: [email protected]
Subject: Re: [Ganglia-general] Jittery Displays - number of nodes changing erratically.


gmond was at

Bernard Li wrote:

Hey Ron:

Which version did you upgrade from?


gmond 2.5.1

BUT,  ... making it more interesting ....

A co-worker installed a new Ganglia setup (server/clients) in England.

He's seeing the same thing... on different versions of the server backend/frontend httpd software.


Site  - front  -  back
Denver  2.5.1  -  2.5.5
Gatwick 2.5.4  -  2.5.6

all with gmond 2.5.6-1 are seeing this "jittery" issue...

Ok, I'll upgrade the Denver center to latest - see if that doesn't help.


I kept my Ganglia web server at:  gmetad:

I have upgraded from a previous version without any problems...
2.5.4...?

Cheers,

Bernard
-----Original Message-----
From: Ron Reeder [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 17, 2004 11:37
To: Bernard Li
Cc: [email protected]
Subject: Re: [Ganglia-general] Jittery Displays - number of nodes changing erratically.
Yes,
I've been running Ganglia for well over a year... no problems after the
initial install.
I've upgraded several times... again no biggie....



Bernard Li wrote:
Hi Ron:
Did you recently upgrade from an older version of Ganglia? This is
really odd behaviour...

Cheers,

Bernard

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ron Reeder
Sent: Thursday, June 17, 2004 11:15
To: [email protected]
Subject: [Ganglia-general] Jittery Displays - number of nodes changing erratically.

Sirs,

With the new gmond 2.5.6-1 we are getting 'jittery' displays, where the
number of nodes and number of CPUs varies wildly on the 'Overview
of <Cluster>' page.

The summed LOAD and MEM charts are particularly bad.

Yes, whenever I go to the page it always shows: 82 hosts
(164 CPUs) up and running, none down.

I do have the value:
host_dmax 3600

in gmond.conf

'Cause it seems that Ganglia  _NEVER_ thinks hosts die....

(Maybe a separate problem)

How could the node/CPU lines graph as horrible zig-zags (not
horizontal lines as they should), yet the host count is always the
same?

Chart is in the attached GIF file.
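
One way to tell whether the zig-zags come from gmond itself or from the summaries built on top of it might be to poll a gmond's XML port and log the host and CPU counts over time. Below is a minimal Python sketch, assuming gmond answers on its default XML port (8649) and reports a cpu_num metric for each host. The function names are made up for illustration, and this is only a diagnostic idea, not a known fix:

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the full XML dump that gmond sends on connect (assumed port 8649)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def summarize(xml_data):
    """Return (number of HOST elements, sum of their cpu_num values)."""
    root = ET.fromstring(xml_data)
    hosts = 0
    cpus = 0
    for host in root.iter("HOST"):
        hosts += 1
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == "cpu_num":
                cpus += int(metric.get("VAL"))
    return hosts, cpus
```

Calling summarize(fetch_gmond_xml("some-node")) every few seconds and printing the result would show whether the counts reported by a single gmond actually move; "some-node" is a placeholder host name.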
