Hi Ken,
Thanks for your reply.
> > I run a couple of gmond's and one gmetad in a common subnet.
> > How can I make the gmetad talk to another gmetad, running in another
> > subnet. I want to operate a central web server outside my farm.
> > Within the subnet everything, incl. the web-frontend, works fine.
> >
> > I guess the gmond's (/etc/gmond.conf) do not have to know about the
> > second gmetad? It's all in /etc/gmetad.conf?
>
>
> gmetads can be data sources to other gmetads, as long as they are trusted
> and such. That part should be fairly straightforward - I've done it to
> hierarchically 'arrange' different clusters within our "Grid"/VO. Have you
> run into problems doing this, or are you still at the conceptual level here?
Well, I want to run a 'master' gmetad outside my cluster(s) to keep the
web front-end and Apache outside the cluster too.
My sketch would look like yours:
gmetad + web front-end (master in another subnet)
 |
 |- gmetad (local_A in subnet A)
 |    |
 |    |- cluster A
 |
 |
 |- gmetad (local_B in subnet B)
      |
      |- cluster B
The 'master' is outside the clusters' subnets.
Up to gmetad (local_A) everything is fine; cluster B is not implemented
yet. All gmetad's trust each other by means of the appropriate line (I
think) in /etc/gmetad.conf, namely:
trusted_hosts localhost master local_A
I cannot connect the gmond's directly to the master gmetad because
multicast is not allowed outside a subnet (routing!).
With debug level 10 I see that the master gmetad tries to listen on
port 8651, which is the gmond multicast port, and does not see anything
there ("[local A] is a dead source").
What has to be given in the "data_source" line? I tried:
data_source [cluster A] local_A
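To narrow down whether the master can reach local_A at all, I could run something like the sketch below on the master. It assumes that local_A's gmetad hands out its XML over a plain TCP connection and closes the connection when it is done; the host name and port are simply the ones from my setup, and the function names are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_ganglia_xml(host, port, timeout=5.0):
    """Connect over plain TCP and read until the peer closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def summarize(xml_bytes):
    """Return the root SOURCE attribute and the names of all GRID/CLUSTER elements."""
    root = ET.fromstring(xml_bytes)
    names = [el.get("NAME") for el in root.iter() if el.tag in ("GRID", "CLUSTER")]
    return root.get("SOURCE"), names


# Hypothetical usage on the master (host and port taken from my setup):
#   print(summarize(fetch_ganglia_xml("local_A", 8651)))
```

If this already times out or is refused, the problem would be routing or trust rather than the data_source syntax.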
Thanks
Andreas Gellrich
-------------------------------------------------------------------------
Dr. Andreas Gellrich, Physicist  | E-mail: [EMAIL PROTECTED]
Deutsches Elektronen-Synchrotron | Office: 2b/322
DESY IT-Division                 | Phone: +49 40 8998-2732
Notkestr. 85                     | Fax/Voice: +49 40 8994-2732
22607 Hamburg, Germany           | Mobile: 0170 780 7479 (899892732)
-------------------------------------------------------------------------
>
> I will note that I don't do this at the moment because I encountered very
> very severe data loss. I was running three gmetads (two subordinate to one
> "master") on a machine with a web frontend for each. A rough sketch:
>
> gmetad
>  |-gmetad (local)
>  |   |
>  |   --128 node cluster (gmond remote)
>  |   --50 node cluster (gmond remote)
>  |   --32 node cluster (gmond remote)
>  |
>  |
>  |-gmetad (local)
>  |   |
>  |   --4-6 node cluster (gmond local)
>  |
>  |-remote gmond for a 10 node cluster
>
> I think it has something to do with the timestamping issues being discussed
> on the dev list, but for now I just run the single gmetad querying all the
> gmonds directly, and it's fine (albeit uglier).
>
> The problem I refer to is the RRD_Update getting double updates for a single
> timestamp, i.e. the rrd delta was too small being 0 seconds. The debug
> output showed "RRD_update: illegal attempt to update using time [NNNN] when
> last update time is [same-NNNN] (minimum one second step)" every minute or
> so, with corresponding lack of data in the graphs. Hope that'll be fixed in
> 2.5.4 so I can run this stuff properly. :)
>
> --
> Ken MacInnis - System Research Programmer II - MGrid
> 130 CCAB, 1071 Beal Ave, Ann Arbor, MI 48109
> kmacinni at umich dot edu - +1 734 647 8307 (w) - +1 734 936 4919 (f)
>
>
>
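Regarding the RRD_update error quoted above: as I read the message, RRD rejects an update whose timestamp is not at least one second later than the previous one for the same file. Just to illustrate the condition, here is a minimal sketch of a guard that drops such duplicate updates. This is not how gmetad handles it; the class and method names are invented, and timestamps are assumed to be whole seconds.

```python
class UpdateGuard:
    """Remember the last accepted timestamp per RRD and reject repeats."""

    def __init__(self):
        self._last = {}  # rrd name -> last accepted timestamp (seconds)

    def should_update(self, rrd_name, timestamp):
        last = self._last.get(rrd_name)
        # An update needs to be at least one second newer than the last one.
        if last is not None and timestamp - last < 1:
            return False
        self._last[rrd_name] = timestamp
        return True
```

With such a guard, a second update carrying the same timestamp would be skipped instead of triggering the error, although the duplicate sample itself would still be lost.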