Hey folks,

We have 5 identical (but separate) ganglia setups, all running gmond/gmetad
3.0.2. All but one of these setups have a common problem: they only write
data to the data-source RRDs and not to the global cluster RRDs.

The setups are fairly simple:

- each ganglia setup is a cluster
- each cluster is divided into classes
- all of the boxes in each class _unicast_ their data to the 1st and 2nd
instance of their class (foo1, foo2, foo3, foo4..fooN all unicast to foo1
and foo2)
- all of the boxes also _unicast_ their data to the 'ganglia server' (which
is running gmetad, gmond, and the ganglia-web stuff)
- gmetad has a data_source set up for each class, where it pulls from the
first instance of that class (and, if that fails, the second instance). For
example:
   data_source "foo" foo1 foo2
And a 'misc' class which it pulls from itself:
   data_source "misc" localhost

Here comes the problem. For any given host - let's say foo4 - gmetad writes
the data to the RRDs in foo/foo4 but NOT to the ones in the cluster directory
(we'll call it 'cluster1' - each one is different), cluster1/foo4/. The RRDs
exist in cluster1/foo4, but they have nothing but NaNs in them.
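
For anyone who wants to compare the two directories the same way, here is a
rough sketch of how the "nothing but NaNs" check could be scripted. It only
parses the text that `rrdtool fetch <file> AVERAGE` prints; the function name
nan_fraction is made up for this example, and the output layout ("timestamp:
value value ...") is assumed to match what rrdtool normally emits.

```python
import math


def nan_fraction(fetch_output):
    """Return the fraction of data rows whose values are all NaN.

    `fetch_output` is the text printed by `rrdtool fetch <file> AVERAGE`.
    Header and blank lines (no "timestamp:" prefix) are skipped.
    """
    total = 0
    empty = 0
    for line in fetch_output.splitlines():
        stamp, sep, rest = line.partition(":")
        if not sep or not stamp.strip().isdigit():
            continue  # not a data row
        values = [float(v) for v in rest.split()]
        if not values:
            continue
        total += 1
        if all(math.isnan(v) for v in values):
            empty += 1
    return empty / total if total else 0.0
```

Running it on the fetch output for the same metric under foo/foo4 and under
cluster1/foo4 should give something well below 1.0 for the first and exactly
1.0 for the second if the symptom is the one described above.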

If you telnet to localhost 8649 you can see all the data under CLUSTER
NAME="cluster1"... the data *is* there. It just only gets written under the
data_source directory and not under the global directory.
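
The telnet check can be scripted too. This is only a sketch: it assumes the
daemon on port 8649 dumps its whole XML document and then closes the
connection (which is what telnet shows), and the helper names read_gmond_xml
and hosts_by_cluster are invented for this example.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump the daemon sends on connect, until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER NAME to the sorted HOST NAMEs reported under it."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for cluster in root.iter("CLUSTER"):
        names = sorted(h.get("NAME") for h in cluster.iter("HOST"))
        result[cluster.get("NAME")] = names
    return result
```

Calling hosts_by_cluster(read_gmond_xml()) on the ganglia server lists which
hosts show up under which CLUSTER name, which makes it easy to compare against
the directory names gmetad is writing to.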

I've started up gmetad at debug level 9 and I see no problems.
I've also tried dropping down to only one data_source (which housed 4 boxes).

I'm out of ideas. Any suggestions would be helpful. Thanks.


-- 
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com

"Never write it in C if you can do it in 'awk';
 Never do it in 'awk' if 'sed' can handle it;
 Never use 'sed' when 'tr' can do the job;
 Never invoke 'tr' when 'cat' is sufficient;
 Avoid using 'cat' whenever possible" -- Taylor's Laws of Programming

