I was able to remove the "dead" host (that isn't really dead) from the overview display. I had to kill all the gmonds everywhere, and the gmetad. Then I removed the rrd files for the "dead" host from gmetad's rrds directory, and the rrd directory itself. Then I removed the "dead" host's IP address from gmetad.conf. Then I brought up all the gmonds (except the "dead" one) and then the gmetad.

Apparently, these steps will have to be added to our failover procedure.

Martin Knoblauch wrote:
> Also, just to better understand the situation, what is the exact setup? Is one of the "gmond"s designated as a collector? Or do all "gmond"s carry all metrics from all hosts? Which "gmond" is queried by "gmetad" (snippet from config file)? You should telnet/nc to that "gmond" and check whether it has current metrics from "B".

I don't know what "designated as a collector" means. Nor do I know how to control which gmonds carry all metrics from which hosts. There is only one udp_send_channel in gmond.conf, and the host in there is the one running gmetad.

My /etc/ganglia/gmetad.conf file has only one line in it: data_source "clustername" followed by a list of IP addresses of all the gmond hosts. (My original understanding was that the gmetad queries each gmond, or that the gmonds all report to the gmetad, so I just listed all the IP addresses there. But now it seems the flow is more complex than that.) I don't have a manpage for gmetad.conf, so I just guessed what to put in there from the sample file.

-Cameron
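
The telnet/nc check suggested in the quoted message can also be scripted. What follows is only a minimal sketch, not something taken from this thread: it assumes the gmond accepts TCP connections on the default port 8649 and dumps its XML state on connect, and that each HOST element carries NAME, IP and REPORTED (Unix time of the last report) attributes. The function names and the 60-second threshold are hypothetical choices for illustration.

```python
import socket
import time
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump a gmond writes to a TCP client (assumed default port)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def host_ages(xml_bytes, now=None):
    """Map each HOST name to (ip, seconds since its REPORTED timestamp)."""
    now = time.time() if now is None else now
    root = ET.fromstring(xml_bytes)
    return {
        h.get("NAME"): (h.get("IP"), now - int(h.get("REPORTED")))
        for h in root.iter("HOST")
    }


def stale_hosts(xml_bytes, max_age=60, now=None):
    """Return the sorted names of hosts whose last report is older than max_age."""
    ages = host_ages(xml_bytes, now=now)
    return sorted(name for name, (_ip, age) in ages.items() if age > max_age)


# Hypothetical usage against a live gmond (the address is a placeholder):
#   xml_bytes = fetch_gmond_xml("192.0.2.10")
#   print(stale_hosts(xml_bytes))
```

Running this against the gmond that gmetad polls would show whether that gmond still holds fresh metrics for host "B", or only a stale entry.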
_______________________________________________ Ganglia-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/ganglia-general

