Me again. :) I have a grid with ~100 hosts divided into ~10 clusters (not evenly).
All of the clusters work fine except one.
This specific cluster has 4 boxes in it. Of these 4 boxes, 1 of them gets
data into ganglia as expected - the other 3 do not.
More specifically, when you restart gmond on one of those 3 boxes, you get
data for about a minute. A little sliver of data on the graph and
that's it.
But it gets stranger. The boxes are ses1, ses2, ses3, ses4.
ses1 - ses3 are the 'bad' boxes - but a telnet to port 8649 shows data.
ses4, however, shows no data on a telnet to port 8649 (even though this is
the one that _works_).
ses1-4 all unicast their data to ses1 and ses2, and then gmetad sucks up the
data from ses1 and ses2.
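
For anyone who wants to repeat the telnet check on all four boxes at once,
here is a rough Python sketch. It connects to port 8649, reads whatever
gmond dumps, and lists the hosts found in it. The function names are made
up for illustration, and the HOST/METRIC element names and the NAME
attribute are my assumption about the XML layout, so adjust them to match
what telnet actually shows.

```python
import socket
import xml.etree.ElementTree as ET

GMOND_PORT = 8649  # same port as the telnet checks


def fetch_gmond_dump(host, port=GMOND_PORT, timeout=5.0):
    """Read everything gmond writes on its TCP port until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return {host name: metric count} for each HOST element (assumed layout)."""
    if not xml_bytes.strip():
        return {}  # an empty dump, like the one seen on ses4
    root = ET.fromstring(xml_bytes)
    return {h.get("NAME"): len(h.findall("METRIC")) for h in root.iter("HOST")}


if __name__ == "__main__":
    for box in ("ses1", "ses2", "ses3", "ses4"):
        try:
            print(box, hosts_in_dump(fetch_gmond_dump(box)))
        except (OSError, ET.ParseError) as exc:
            print(box, "failed:", exc)
```

Since ses1 and ses2 are the unicast targets, their dumps are the ones that
should list all four hosts if the data is arriving.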
I don't see anything in the logs. I tried starting gmond in debug mode but
didn't see anything resembling an error.
Thanks,
--
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com
"Never write it in C if you can do it in 'awk';
Never do it in 'awk' if 'sed' can handle it;
Never use 'sed' when 'tr' can do the job;
Never invoke 'tr' when 'cat' is sufficient;
Avoid using 'cat' whenever possible" -- Taylor's Laws of Programming