Hey Ron:

I assume that you have tried shutting down all the daemons and then restarting them...
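
If a restart does not settle it, it may help to check whether the gmonds themselves report a changing host count, independent of gmetad and the web frontend. Below is a minimal Python sketch, not something shipped with Ganglia. It assumes gmond serves its XML dump on the default TCP port 8649 and that each HOST element carries a cpu_num METRIC; the function names are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a TCP client (assumed default port 8649)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    # Return bytes so the parser can honour the XML encoding declaration.
    return b"".join(chunks)


def count_hosts_and_cpus(xml_data):
    """Return (host_count, cpu_total) from a Ganglia XML dump (str or bytes)."""
    root = ET.fromstring(xml_data)
    hosts = 0
    cpus = 0
    for host in root.iter("HOST"):
        hosts += 1
        for metric in host.iter("METRIC"):
            # cpu_num is assumed to be the per-host CPU count metric.
            if metric.get("NAME") == "cpu_num":
                cpus += int(float(metric.get("VAL", "0")))
    return hosts, cpus
```

Calling `count_hosts_and_cpus(fetch_gmond_xml("some-node"))` every few seconds against one of the new gmonds (the host name here is a placeholder) and comparing the results with an old gmond would show whether the counts already jump around at the source.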

Cheers,

Bernard

> -----Original Message-----
> From: Ron Reeder [mailto:[EMAIL PROTECTED]
> Sent: Thursday, June 17, 2004 15:32
> To: Bernard Li
> Cc: [email protected]
> Subject: Re: [Ganglia-general] Jittery Displays - number of
> nodes changing erratically.
>
> I did the updates ... No joy....
>
> I know I updated to 2.5.5 for the web frontend, but the
> footer says 2.5.4
>
> (Actually, I pulled the tarball, exploded it... and checked
> the conf.php file...
> <?php
> # $Id: conf.php,v 1.16 2003/07/29 23:55:01 sacerdoti Exp $
> #
> # Gmetad-webfrontend version. Used to check for updates.
> #
> $majorversion = 2;
> $minorversion = 5;
> $microversion = 4;
>
> So, they never updated the microversion ....
> naughty...)
>
> So, I've got:
>
> webfrontend 2.5.5-1
> gmetad 2.5.6
> gmond 2.5.6-1
>
> The old gmonds.... do not have this issue...
>
> What else did I change .
>
> Here's a diff from one of old compute nodes ... vs new
> # $Id: gmond.conf,v 1.3 2004/01/20 19:15:23 sacerdoti Exp $
> ---
> > # $Id: gmond.conf,v 1.2 2002/09/19 00:37:18 sacerdoti Exp $
> 11c11
> < name "K-Cluster"
> ---
> > name "Linux Compute"
> 17c17
> < owner "Schlumberger"
> ---
> > owner "Denver WesternGeco SLB"
> 23a24
> > latlong "N39.75 W104.87"
> 28a30
> > url "http://ddclx01.denver.nam.slb.com/"
> 45c47
> < # mcast_if eth1
> ---
> > mcast_if eth0
> 69c71,73
> < 1xx.1xx.147.179
> ---
> > # 2.3.2.3 3.4.3.4 5.6.5.6
> note: ips are x'd by me...
> > trusted_hosts 1xx.2xx.6.201 1xx.2xx.12.151 192.168.1.1
> > #trusted_hosts 192.168.1.1
> 74c78
> < # num_nodes 1024
> ---
> > num_nodes 128
> 99c103
> < #no_setuid on
> ---
> > # no_setuid on
> 113,120c117
> < # rpr - on temporarily ... till where gmetad server will live is decided.
> < all_trusted on
> < #
> < # If you want dead nodes to "time out", enter a nonzero value here. If specified,
> < # a host will be removed from our state if we have not heard from it in this
> < # number of seconds.
> < # default: 0 (immortal)
> < host_dmax 3600
> ---
> > # all_trusted on
>
> Ok, so num_nodes is different ... i.e. defaults to 1024 ..
> If this is truly a cluster metric (not a grid metric) that
> should not matter...
>
> I added the host_dmax ... Since I've never seen ganglia
> display a host as down....
> I think I need to start using deaf as well.
>
> Bernard Li wrote:
> > Hi Ron:
> >
> > I would actually try to use consistent versions for both gmetad and
> > gmond (and the webfrontend too but I don't think it has been updated
> > recently).
> >
> > I have tried to use mis-matching versions before and it seems okay,
> > but I guess it's best to keep things consistent to eliminate all the
> > possibilities...
> >
> > Cheers,
> >
> > Bernard
> >
> >>-----Original Message-----
> >>From: Ron Reeder [mailto:[EMAIL PROTECTED]
> >>Sent: Thursday, June 17, 2004 12:58
> >>To: Bernard Li
> >>Cc: [email protected]
> >>Subject: Re: [Ganglia-general] Jittery Displays - number of nodes
> >>changing erratically.
> >>
> >>gmond was at
> >>
> >>Bernard Li wrote:
> >>
> >>>Hey Ron:
> >>>
> >>>Which version did you upgrade from?
> >>
> >>gmond 2.5.1
> >>
> >>BUT, ... making it more interesting ....
> >>
> >>A co-worker installed a new Ganglia setup (server/clients) in England.
> >>
> >>He's seeing the same thing... on different versions of server
> >>backend/frontend httpd software.
> >>
> >>Site - front - back
> >>Denver 2.5.1 - 2.5.5
> >>Gatwick 2.5.4 - 2.5.6
> >>
> >>all with gmond 2.5.6-1 are seeing this "jittery" issue...
> >>
> >>Ok, I'll upgrade the Denver center to latest - see if that doesn't
> >>help.
> >>
> >>I kept my Ganglia web server at: gmetad:
> >>
> >>>I have upgraded from a previous version without any problems...
> >>>2.5.4...?
> >>>
> >>>Cheers,
> >>>
> >>>Bernard
> >>>
> >>>>-----Original Message-----
> >>>>From: Ron Reeder [mailto:[EMAIL PROTECTED]
> >>>>Sent: Thursday, June 17, 2004 11:37
> >>>>To: Bernard Li
> >>>>Cc: [email protected]
> >>>>Subject: Re: [Ganglia-general] Jittery Displays - number of nodes
> >>>>changing erratically.
> >>>>
> >>>>Yes,
> >>>>
> >>>>I've been running Ganglia - for well over a year... no problems, after
> >>>>initial install.
> >>>>I've upgraded several times... again no biggie....
> >>>>
> >>>>Bernard Li wrote:
> >>>>
> >>>>>Hi Ron:
> >>>>>
> >>>>>Did you recently upgrade from an older version of Ganglia? This is
> >>>>>really an odd behaviour...
> >>>>>
> >>>>>Cheers,
> >>>>>
> >>>>>Bernard
> >>>>>
> >>>>>>-----Original Message-----
> >>>>>>From: [EMAIL PROTECTED]
> >>>>>>[mailto:[EMAIL PROTECTED] On Behalf Of Ron Reeder
> >>>>>>Sent: Thursday, June 17, 2004 11:15
> >>>>>>To: [email protected]
> >>>>>>Subject: [Ganglia-general] Jittery Displays - number of nodes changing
> >>>>>>erratically.
> >>>>>>
> >>>>>>Sirs,
> >>>>>>
> >>>>>>With new gmond 2.5.6-1 - We are getting 'jittery' displays - where the
> >>>>>>number of nodes and number of CPU's is varying wildly on the 'Overview
> >>>>>>of <Cluster>' page.
> >>>>>>
> >>>>>>The summed LOAD and MEM charts are particularly bad.
> >>>>>>
> >>>>>>Yes, whenever I go to the page it always shows: 82 hosts
> >>>>>>(164 CPUs) up and running none down.
> >>>>>>
> >>>>>>I do have the value:
> >>>>>>host_dmax 3600
> >>>>>>
> >>>>>>in gmond.conf
> >>>>>>
> >>>>>>'Cause it seems that Ganglia _NEVER_ thinks hosts die....
> >>>>>>
> >>>>>>(Maybe a separate problem)
> >>>>>>
> >>>>>>How could the node/CPU lines graph as horrible zig-zags (not
> >>>>>>horizontal lines as they should) Yet, the host count is always the
> >>>>>>same?
> >>>>>>
> >>>>>>Chart is attached gif file.

