Hi,

One way to look at it is that multicast is only used between gmonds to
communicate, not between gmond and everything else.  A gmetad can query
either a gmond (typically port 8649) or another gmetad (port 8651 by
default).  This is unicast TCP delivering an XML stream.  So the gmonds use
multicast to find each other, and unicast to deliver data to querying
connections.

> > gmetads can be data sources to other gmetads, as long as they are trusted
> > and such.  That part should be fairly straightforward - I've done it to

> gmetad + webfront-end (master in another subnet)
>   |
>   |- gmetad (local_A in subnet A)
>   |    |
>   |    |- cluster A
>   |
>   |
>   |- gmetad (local_B in subnet B)
>        |
>        |- cluster B
> 
> The 'master' is outside the clusters subnets.
> 
> Until gmetad (local_A) everything is fine. cluster B is not implemented
> yet. All gmetads trust each other by means of the appropriate line (I
> think) in /etc/gmetad.conf, namely:
> 
> trusted_hosts localhost master local_A
> 
> I cannot directly connect the gmonds to the master gmetad because
> multi_cast is not allowed outside a subnet (routing!).

Well, here, the 'master' gmetad would use unicast TCP to contact the gmond
on port 8649 (unless you changed that).  You can try 'telnet clusterA 8649'
and it should give you an XML stream.  If the connection opens, then closes
immediately, you (probably) don't have your trust configured correctly (or
perhaps a firewall or ... ).
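
If you'd rather script that check than use telnet, here is a minimal Python
sketch of the same thing.  The function names are made up for illustration,
and the check for a "<GANGLIA_XML" root element is my assumption about what a
healthy dump looks like, so adjust it to whatever your gmond actually returns.

```python
import socket


def fetch_xml(host, port=8649, timeout=5.0):
    """Open a TCP connection, read until the peer closes, return the text.

    This mirrors 'telnet host port': the daemon is expected to write its
    XML dump and then close the connection on its own.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def looks_like_xml_dump(text):
    """Rough sanity check; the root element name is an assumption."""
    return "<GANGLIA_XML" in text
```

Calling fetch_xml("clusterA", 8649) and getting an empty string back is the
scripted equivalent of the connection opening and closing immediately.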

> With debug level 10 I see that the master gmetad tries to listen to
> port 8651 which is the gmond multi_cast port and does not see anything
> there ("[local A] is a dead source").

The master gmetad's listening port, here, should only be used by the web
frontend.  The gmetad will *query* the *other* gmonds or gmetads (i.e. "pull"
data), rather than having it "pushed" to the master gmetad on the listening
port.  Or am I misunderstanding what you think occurs?
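
To make the "pull" idea concrete, here is a rough Python sketch of one polling
pass.  It is not gmetad's actual code; poll_sources and its (name, host, port)
tuples are hypothetical, and treating an unreachable or empty source as "dead"
is only my guess at what the debug message means.

```python
import socket


def poll_sources(sources, timeout=5.0):
    """Pull one dump from each (name, host, port) source.

    Returns {name: text or None}.  None marks a source that refused the
    connection, timed out, or closed without sending anything.
    """
    results = {}
    for name, host, port in sources:
        try:
            chunks = []
            # The poller is the client: it connects out to each source.
            with socket.create_connection((host, port), timeout=timeout) as sock:
                while True:
                    data = sock.recv(4096)
                    if not data:  # source finished and closed
                        break
                    chunks.append(data)
            text = b"".join(chunks).decode("utf-8", errors="replace")
            results[name] = text or None
        except OSError:  # refused, unreachable, or timed out
            results[name] = None
    return results
```

Note that nothing here listens on a port: the poller only makes outgoing
connections, which is why the master's own listening port plays no part in
collecting the data.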

> What has to be given in the "data_source" line? I tried:
> 
> data_source [cluster A] local_A

Two notes.  First, I find it helpful to specify the polling frequency
explicitly, since there's currently a bug in the parsing code related to
that.  Second, specify the port number explicitly too; it's a good habit.

So, if you want to query the gmond on local_A:

data_source 15 "Cluster A" local_A.realm.org:8649

or the gmetad on local_A:

data_source 15 "Cluster A" local_A.realm.org:8651

These will poll those data_sources every 15 seconds.  Adjust the port
numbers if you have changed them in the .conf files.

If these give you problems, start from the basics: try telnetting to those
ports manually and see if the output is correct.  If that doesn't work to
begin with, then you have other problems.

-- 
Ken MacInnis - System Research Programmer II - MGrid
130 CCAB, 1071 Beal Ave, Ann Arbor, MI 48109
kmacinni at umich dot edu - +1 734 647 8307 (w) - +1 734 936 4919 (f)


