We don't really consider the ganglia monitoring critical, so we can sustain what would be an exceedingly rare failure like both nodes going down.

The real reason I did this was to reduce multicast traffic... the network guys were getting blue in the face talking about how 160 nodes were all broadcasting and listening at the same time. I don't understand the actual mechanics, but they are now happy (or should I say "marginally happier"... they never seem to really be happy ;-)

Paul
Princeton Plasma Physics Lab

Bernard Li wrote:

Hey Paul:

But I guess on the off chance that both of the two nodes go down, then
your history will be lost...

Of course if you are using Ganglia on a large cluster, you probably
don't want every node to be sending packets to each other ;-)

Cheers,

Bernard
-----Original Message-----
From: Paul Henderson [mailto:[EMAIL PROTECTED]
Sent: Friday, June 04, 2004 10:38
To: Johnston Michael J Contr AFRL/DES
Cc: Bernard Li; [email protected]
Subject: Re: [Ganglia-general] All my nodes listed as clusters

What I've been doing is running gmond on all my cluster nodes, but making all but 2 of my 160 nodes "deaf" (see gmond.conf). All the nodes still multicast their information, but only the two listening nodes hold the data; the other nodes just send and don't hold any data.

This is *really* useful, because if one node dies or is moved, then you don't have to restart gmond on every single node to get it to 'forget' the node... you just need to do it on the two listening nodes. Also, network traffic is significantly reduced.
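
As a rough sketch of the setup described above, the sending-only nodes would carry something like the fragment below. This assumes the 2.5.x-style gmond.conf syntax; Ganglia 3.x and later instead use "deaf = yes" inside a "globals" block, so check the comments in your own gmond.conf.

```
# gmond.conf on the "deaf" nodes (assumed 2.5.x-style syntax)
# Keep multicasting this node's own metrics, but do not listen to
# or store the metrics sent by the rest of the cluster.
deaf on
```

The two listening nodes leave this option at its default (off), and those are the nodes gmetad should poll.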

Paul
Princeton Plasma Physics Lab

Johnston Michael J Contr AFRL/DES wrote:

Thanks for the response Bernard!

I guess I didn't think that I could only put 1 node in the data_source
line because how does it know to go and collect the information from
the other nodes? Does it just scan the subnet looking for any machine
running gmond? Every one of my nodes has the exact same gmond.conf file
on it with the name of my cluster in it. Is that how it knows?

Thanks for asking about the graphs... Thanks to everyone's pointers, I
learned that I had listed the path to the RRDtool directory, but hadn't
put the executable name into the path. After I changed that it all
started working... ;) Ganglia is really awesome!

Mike


------------------------------------------------------------------------

*From:* Bernard Li [mailto:[EMAIL PROTECTED]
*Sent:* Friday, June 04, 2004 11:18 AM
*To:* Johnston Michael J Contr AFRL/DES; [email protected]
*Subject:* RE: [Ganglia-general] All my nodes listed as clusters

If you only have one cluster, you only need one data_source (think of
the data_source as the headnode of your cluster, if you will).

So you just need one entry for data_source - you can put more than one
node in the data_source entry for redundancy purposes.
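
A minimal sketch of a single-cluster entry in gmetad.conf, reusing the cluster name and the two addresses that appear elsewhere in this thread (the 60 is the polling interval in seconds). The second address is there only for redundancy: any gmond that is listening can report on the whole cluster, so gmetad needs just one of them to answer.

```
# gmetad.conf: one data_source line per cluster, not one per node
data_source "MyCluster" 60 192.168.3.2:8649 192.168.3.3:8649
```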

So I take it you can see your graph now and the previous thread you posted is dead?

Cheers,

Bernard

------------------------------------------------------------------------

   *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of* Johnston Michael J Contr AFRL/DES
   *Sent:* Friday, June 04, 2004 8:37
   *To:* [email protected]
   *Subject:* [Ganglia-general] All my nodes listed as clusters

   I have a silly question, as usual...

   When I bring up the view of my cluster, it comes up as a Grid... so
   it looks like this:

   Grid > MyCluster > Choose a Node

   I'm guessing that's because in my gmetad.conf file I have every
   node in my cluster listed as:

   data_source "N1" 60 192.168.3.2:8649

   data_source "N2" 60 192.168.3.3:8649

   I'm sure that I'm listing them wrong because Ganglia thinks that
   each node is its own cluster. My question is how do I make them
   appear like one unit as I see in the demo pages? Do I add them all
   to one data_source line?

   On a side question, is it normal for my head node to always be in
   the red? It looks like it's only using about 8% CPU, but it's
   always red or orange.




