Hey folks,

We have 5 identical (but separate) ganglia setups, all running gmond/gmetad 3.0.2. All but one of these setups have a common problem: they only write data to the data-source RRDs and not to the global cluster RRDs.
The setups are fairly simple:

- each ganglia setup is a cluster
- each cluster is divided into classes
- all of the boxes in each class _unicast_ their data to the 1st and 2nd instance of their class (foo1, foo2, foo3, foo4 .. fooN all unicast to foo1 and foo2)
- all of the boxes also _unicast_ their data to the 'ganglia server' (which is running gmetad, gmond, and the ganglia-web stuff)
- gmetad has a data_source set up for each class, where it pulls from the first instance of that class (and, if that fails, the second instance). For example:

      data_source "foo" foo1 foo2

  And a 'misc' class which it pulls from itself:

      data_source "misc" localhost

Here comes the problem. For any given host (let's say foo4), gmetad writes the data to the RRDs in foo/foo4 but NOT in the cluster directory (we'll call it 'cluster1'; each one is different), cluster1/foo4/. The RRDs exist in cluster1/foo4, but they have nothing but NaNs in them.

If you telnet to localhost 8649 you can see all the data under CLUSTER NAME="cluster1"... the data *is* there. It just only gets written under the data_source directory and not under the global directory.

I've started up gmetad with debug level 9 and I see no problems. I've tried dropping down to only one data_source (which housed 4 boxes). I'm out of ideas.

Any ideas would be helpful. Thanks.

-- 
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com

"Never write it in C if you can do it in 'awk';
 Never do it in 'awk' if 'sed' can handle it;
 Never use 'sed' when 'tr' can do the job;
 Never invoke 'tr' when 'cat' is sufficient;
 Avoid using 'cat' whenever possible"
   -- Taylor's Laws of Programming
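
P.S. In case it helps anyone reproduce this, the telnet check against port 8649 can be scripted. Below is a minimal Python sketch, not a tested tool. It assumes gmond dumps its full XML state when you connect to TCP port 8649 and then closes the connection, and that hosts appear as HOST elements nested under CLUSTER elements, as in the telnet output. The function names fetch_gmond_xml and hosts_by_cluster are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read everything gmond sends on connect (assumed: one XML dump, then EOF)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER NAME to the sorted list of HOST NAMEs reported under it."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for cluster in root.iter("CLUSTER"):
        names = sorted(h.get("NAME") for h in cluster.iter("HOST"))
        result[cluster.get("NAME")] = names
    return result
```

Calling hosts_by_cluster(fetch_gmond_xml()) on the ganglia server should list every box under its cluster name. Comparing that against the directories gmetad actually fills would show which hosts are reported by gmond but end up with NaN-only RRDs.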

