Thanks again, Steven, for the pointer.  I was able to get it to work by
turning all_trusted on, or by setting the trusted host to the IP of the
web front-end machine.  I did not think to look there at first, since it
seemed to be able to connect when I used the default multicast IP
address.  Now I don't even have to run gmond on the front-end, which is
what I expected.  What I had earlier was kind of confusing, but it seems
to work now.  Very nice tool.

Btw, since documentation is so sparse, if you all need help writing some
user documentation for the next release, just let me know.

Thanks!
Phuong

-----Original Message-----
From: Steven Wagner [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 15, 2003 7:43 PM
To: Vu, Phuong A (MP Technology)
Cc: '[email protected]'
Subject: Re: [Ganglia-general] How to setup multiple clusters using
different multicast IP


Vu, Phuong A (MP Technology) wrote:
> This is probably very obvious, and I have read some of the previous
> discussions on this topic, but the answers given do not seem to work for
> me at all, and it is not obvious to me from the documentation how to
> proceed further.
> 
> Scenario:  I have, say, 32 Linux boxes on the same core switch, no routing
> involved, but would like to partition them into, say, 2 clusters of 16
> machines each, with gmetad running on another Linux box, say XYZ.  I gave
> each cluster a name, as defined in the /etc/gmond.conf file on the
> individual machines, and use it as the data_source in the
> /etc/gmetad.conf file on XYZ:
> 
> data_source "cluster1" ... here I list the names of the 16 machines on the
> 1st cluster
> data_source "cluster2" ... here I list the names of the 16 machines on the
> 2nd cluster

When all else fails, go to the source!

When you telnet to localhost:8651 on XYZ (the host running the web 
front-end and the metadaemon), you should see two open <CLUSTER> tags (grep 
for it!) - one for cluster1, one for cluster2.

If you see one, then gmetad is having trouble reaching the hosts in one of 
the two data sources.  If you see none, it's having trouble reaching both 
of them.  If you see "Connection refused," gmetad is not running or it's 
crashing on startup.
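
For anyone who would rather script this check than eyeball telnet output, here is a minimal Python sketch of the same idea.  It assumes gmetad dumps its XML and then closes the connection, and that each open <CLUSTER> tag carries a NAME attribute; the function names fetch_xml and cluster_names are made up for illustration.

```python
import re
import socket


def fetch_xml(host="localhost", port=8651, timeout=5.0):
    """Read everything the daemon sends until it closes the connection.

    Raises ConnectionRefusedError if nothing is listening on the port,
    which corresponds to the "Connection refused" case above.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def cluster_names(xml_text):
    """Return the NAME of every open <CLUSTER> tag, in document order.

    Assumption: the tag looks like <CLUSTER NAME="..." ...>.
    Closing </CLUSTER> tags and <HOST> tags are not matched.
    """
    return re.findall(r'<CLUSTER\s[^>]*?\bNAME="([^"]*)"', xml_text)


if __name__ == "__main__":
    names = cluster_names(fetch_xml())
    print("%d cluster(s) reported: %s" % (len(names), ", ".join(names)))
```

With two data sources configured, two names should come back; one or zero means gmetad cannot reach some or all of the hosts it was told to poll.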

Even if your RRD permissions are all jacked up on the front-end, you should 
be able to get a basic web page up.

Make sure that conf.php in the web front-end is connecting to
localhost:8651.

Also, make sure that your monitoring cores trust the box running the 
front-end.  Otherwise, gmetad will not be able to connect to them.

You can verify that they trust the box by telnetting to $MEMBER_OF_CLUSTER 
port 8649.  If you see the connection accepted and then closed immediately, 
you have a trust issue (add your front-end box's IP to /etc/gmond.conf on 
the cluster machines you intend to poll).  If the connection is refused, 
you may need to check the monitoring core's configuration (is it listening 
on that port?  Configured as mute?).  If XML data goes shooting across your 
xterm, you win a prize - that could mean you have a weird gmetad XML 
parsing error on your hands (with custom metrics, this can happen... ). 
Your prize is, you get to run gmetad with debug output on! :)
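
The three outcomes of that telnet test can also be told apart in a few lines of Python.  This is only a sketch under the same assumptions as the telnet check: a refused connection, an accepted-then-closed connection with no data, or a stream of XML.  The names probe and classify and the result strings are invented for this example.

```python
import socket


def classify(data):
    """Map what came back from the port onto the three cases above.

    data is None      -> the connection was refused
    data is empty     -> accepted, then closed without sending anything
    data has content  -> the monitoring core sent its XML
    """
    if data is None:
        return "refused: is gmond listening on this port, or set to mute?"
    if not data.strip():
        return "closed: likely a trust issue, check gmond.conf"
    return "xml: gmond is answering, look at gmetad debug output next"


def probe(host, port=8649, timeout=5.0):
    """Connect to a cluster member and classify the response.

    A timeout (socket.timeout) is not handled here and will propagate.
    """
    chunks = []
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            try:
                while True:
                    data = sock.recv(65536)
                    if not data:
                        break
                    chunks.append(data)
            except ConnectionResetError:
                # Treat a reset after accept like a close; keep what arrived.
                pass
    except ConnectionRefusedError:
        return classify(None)
    return classify(b"".join(chunks))
```

Run probe("some-cluster-member") from the box that hosts gmetad, since trust is decided by the address the connection comes from.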

> 1. Unless I also start gmond on XYZ, I won't be able to see anything on
>    the web front-end.
>    I don't want XYZ to be part of the monitored clusters.

The monitoring core probably shouldn't be running on the front-end.  The 
metadaemon should be enough.

> I know that in a previous reply, Steven Wagner said that this should
> work, but I am not able to get it to behave that way.  Am I missing
> something very obvious?

You know, gentle readers, you really shouldn't assume that because my setup 
works, anyone else's will.  The stuff I did to get my Ganglia setup working 
went way beyond messing with the configs.  Especially when it comes to the 
metadaemon...

By this point in Ganglia 2.x's development, if someone is going through the 
kind of stuff I went through on any of the currently-supported platforms, 
something is wrong...
