Ian,

Thanks!  That resolved it.  I had restarted gmetad but not gmond on
the head node.   

--Lew

-----Original Message-----
From: Ian Cunningham [mailto:[EMAIL PROTECTED] 
Sent: Friday, March 30, 2007 3:12 PM
To: Lewis E. Randerson
Cc: [email protected]; Kevin Ying
Subject: Re: [Ganglia-general] Changing IP addresses sometimes causes
heartbeat to not be seen even though TN is an acceptable value.

Lew,

You need to restart gmond on the head node to get it to forget about
the old machine. The head node will be the first node in the data_source
directive in gmetad.conf. One of the features of ganglia is that it
remembers all the nodes that have sent information in the past. Gmond
tracks nodes by IP address, not by host name as you expected.
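
A minimal sketch of how such leftover entries could be spotted before
restarting anything, assuming the XML has already been saved to a string.
It groups HOST elements by NAME and reports names that appear under more
than one IP. The function name, the sample host names and addresses, and
the stale_factor threshold are all hypothetical; the threshold is only an
illustration and is not claimed to be the rule the web front end applies.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_hosts(xml_text, stale_factor=4):
    """Return {host name: entries} for names reported under several IPs.

    stale_factor is an assumed illustration threshold: an entry is
    flagged as stale when TN > stale_factor * TMAX.
    """
    root = ET.fromstring(xml_text)
    by_name = defaultdict(list)
    for host in root.iter("HOST"):
        by_name[host.get("NAME")].append(
            {
                "ip": host.get("IP"),
                "tn": int(host.get("TN")),
                "tmax": int(host.get("TMAX")),
            }
        )

    report = {}
    for name, entries in by_name.items():
        # Only names seen under at least two distinct IPs are interesting.
        if len({e["ip"] for e in entries}) < 2:
            continue
        # Smallest TN first: the most recently heard-from entry leads.
        entries.sort(key=lambda e: e["tn"])
        for e in entries:
            e["stale"] = e["tn"] > stale_factor * e["tmax"]
        report[name] = entries
    return report


# Hypothetical sample; the TN and TMAX values mirror the ones quoted
# later in this thread, the names and addresses are made up.
SAMPLE = """<CLUSTER NAME="example">
<HOST NAME="node1" IP="10.0.0.21" TN="4" TMAX="20"/>
<HOST NAME="node1" IP="10.0.0.11" TN="1409057" TMAX="20"/>
<HOST NAME="node2" IP="10.0.0.22" TN="7" TMAX="20"/>
</CLUSTER>"""

if __name__ == "__main__":
    for name, entries in find_duplicate_hosts(SAMPLE).items():
        for e in entries:
            print(name, e["ip"], "TN=%d" % e["tn"], "stale" if e["stale"] else "live")
```

Any name that shows up in the report with a stale entry under its old
address is a candidate for the gmond restart described above.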

Good Luck,
Ian

Lewis E. Randerson wrote:
> Hi,
>
> I have changed the IP addresses of some members of one of our clusters and
> have seen that in some cases this causes the heartbeat to not be recognized
> even though TN is an acceptable value and current data is still displayed.
> What happens is that a "This host is down" message is displayed and the
> host is not included in the list of up computers.
>
> Do you know what I can do to repair this?
>
> The problem is that the ganglia host report web page is reporting that no
> heartbeat is being seen for four of fifteen machines whose IP addresses
> have been changed.  However, if you look at the XML being served by gmetad
> you see:
>
> <HOST NAME="host name" IP="new ip address" REPORTED="1175278727" TN="4"
>     TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="1175278188">
> <HOST NAME="host name" IP="old ip address" REPORTED="1173869674" TN="1409057"
>     TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="1170351747">
>
> Note the two entries: one for the new IP address with TN=4 and one for the
> old address with TN=1409057.  It looks like the latest entry in the XML
> listing is being used for the heartbeat.
>
> For the other eleven machines, the order of the XML entries is switched,
> with the new IP address being last, and the heartbeat is recognized.
>
> --Lew
>
>
>
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>
>   

