[Ganglia-general] ".pyc" file not getting made for python module

2012-10-25 Thread deep desai
Hi all, I have made a Python module for getting metrics for RabbitMQ. I followed these steps:
1) Put this rabbitmq.py file here: "/usr/lib64/ganglia/python_modules/"
2) chmod it to 755
3) Put the ".pyconf" file here: "/etc/ganglia/conf.d"
4) Restart gmond
The problem is it is no
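
For readers following this thread, here is a minimal sketch of the shape gmond expects from a Python metric module: a `metric_init(params)` function that returns a list of descriptor dictionaries, a per-metric callback, and a `metric_cleanup()` function. The metric name `rabbitmq_queue_depth`, the group name, and the constant return value are assumptions made for illustration; the poster's actual rabbitmq.py is not shown in the archive.

```python
# Hypothetical gmond Python metric module (e.g. saved as rabbitmq.py).
# The metric name and the placeholder value are illustrative assumptions,
# not taken from the original post.

def get_queue_depth(name):
    """Callback that gmond invokes with the metric name; returns the value."""
    # Placeholder: a real module would query RabbitMQ here.
    return 0


def metric_init(params):
    """Called once by gmond at load time; returns the metric descriptors."""
    descriptor = {
        "name": "rabbitmq_queue_depth",   # hypothetical metric name
        "call_back": get_queue_depth,
        "time_max": 90,
        "value_type": "uint",
        "units": "messages",
        "slope": "both",
        "format": "%u",
        "description": "Placeholder RabbitMQ queue depth",
        "groups": "rabbitmq",
    }
    return [descriptor]


def metric_cleanup():
    """Called once by gmond on shutdown; nothing to release in this sketch."""
    pass


if __name__ == "__main__":
    # Running the file directly exercises the callbacks outside gmond,
    # which surfaces syntax or import errors before gmond loads it.
    for d in metric_init({}):
        print("%s = %s" % (d["name"], d["call_back"](d["name"])))
```

Running the file with the same Python interpreter that gmond embeds is a quick way to rule out a syntax or import error as the reason the module is not being loaded.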

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Potter,Mark L
Vladimir, It is still reporting random nodes as down with gmetad set to collect every 15 seconds. Unfortunately I have to be done with this for today but will be back at it first thing in the morning (CDT). I have also made sure nothing else is running on this box. At the moment it's just gangl

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Vladimir Vuksan
60 seconds is likely the problem. I would leave it at the default, i.e. 15. I can explain later.

"Potter,Mark L" wrote:
> Nicholas,
>
> I have it set to collect every 60 seconds at the moment as per the gmetad I posted yesterday but even with that, running "netstat -ua" in a 1 second watch loop, once

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Potter,Mark L
Nicholas, I have it set to collect every 60 seconds at the moment as per the gmetad I posted yesterday but even with that, running "netstat -ua" in a 1 second watch loop, once Recv-Q pops it is still responding immediately and the Recv-Q never stays lit, so to speak, for more than two seconds.
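
The Recv-Q check described above can also be scripted. The following is a rough sketch, assuming a Linux host where the kernel exposes UDP socket queues in `/proc/net/udp`, and assuming gmond listens on its default port 8649 (the port is not stated in the thread). The function name `parse_udp_queues` and the sample row are hypothetical.

```python
def parse_udp_queues(proc_net_udp_text, port=8649):
    """Return (tx_queue, rx_queue) byte counts for UDP sockets bound to `port`.

    `proc_net_udp_text` is the full text of /proc/net/udp (header included).
    The port default of 8649 is an assumption about the gmond configuration.
    """
    results = []
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        # local_address looks like "00000000:21C9" (hex IP, hex port).
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port:
            continue
        # The fifth field is "tx_queue:rx_queue", both in hex.
        tx_hex, rx_hex = fields[4].split(":")
        results.append((int(tx_hex, 16), int(rx_hex, 16)))
    return results


# Made-up sample row for illustration; real data comes from /proc/net/udp.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when "
    "retrnsmt   uid  timeout inode ref pointer drops\n"
    "  42: 00000000:21C9 00000000:0000 07 00000000:00000A00 00:00000000 "
    "00000000     0        0 12345 2 0000000000000000 0\n"
)

if __name__ == "__main__":
    print(parse_udp_queues(SAMPLE))
```

On a live host the input would be `open("/proc/net/udp").read()`, polled in a loop; a receive queue that stays non-zero across several polls would suggest the listener is not draining packets fast enough.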

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Nicholas Satterly
Hi Mark, I wouldn't be so quick to dismiss timeouts as the problem. The "0.9751s" it took to download and parse ganglia's XML tree refers to the time it took the PHP web frontend to query the gmetad XML, whereas the timeouts I was referring to occur when the gmetad polls the gmonds during metric c

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Potter,Mark L
Well, things blew up at ~184 hosts. The web interface shows a random number of hosts down on each refresh, although sometimes they are all up. It reports just ~1 second to download and process the XML: "Downloading and parsing ganglia's XML tree took 0.9751s." So I don't think timeouts are the problem

Re: [Ganglia-general] Question about scaling

2012-10-25 Thread Potter,Mark L
> Hi Mark,
>
> I assume cnode340 is the head node that all ~340 other gmond's send their data to. If so, you could reduce the amount of redundant metadata flying around by increasing "send_metadata_interval" to 120 seconds or higher.

That is correct, cnode340 is the head node for ganglia. I h

Re: [Ganglia-general] [Ganglia-developers] Adding Holt-Winters databases to existing rrd causes __SummaryInfo__ metric to fail to render on graphs

2012-10-25 Thread Aaron Nichols
On Wed, Oct 24, 2012 at 9:13 AM, Vladimir Vuksan wrote:
> I don't have a lot of time to look into it; however, the difference between SummaryInfo RRDs and other RRDs is that SummaryInfo contains ds[num], which is the number of nodes that are being summarized. I wonder if that is somehow throwing off yo