Thanks for all the input. Perhaps unsurprisingly, we are already looking at most of this for dealing with a large cluster, but I'm happy to see that we're not the biggest :)

The RRD issues are good to know. We have a _very_ large disk array that we can write to, and I already ordered the collection server with 32 GB of RAM, so that bottleneck should be easy to avoid. We're also already planning to break the cluster up into chunks.

Sadly, I don't think anyone has been able to answer my question about jitter. It's not a simple overhead issue: you could have something that costs 1 ms of overhead on each node per minute, but if that 1 ms delays a calculation the other nodes are waiting on, that 1 ms impacts all of the nodes. Then another node takes its 1 ms hit a moment later, slowing the calculation down again.
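
To put numbers on what I mean, here is a toy simulation of that effect (not a measurement of gmond; the node count, step time and noise rate below are all made up). Each step ends in a barrier, so the slowest node sets the pace for everyone:

# Toy model of noise amplification in a tightly coupled, bulk-synchronous job.
# Illustrative only -- the parameters below are invented, not measured.
import random

NODES = 1000              # hypothetical cluster size
ITERATIONS = 6000         # compute steps, each ending in a barrier
STEP = 0.001              # 1 ms of useful work per node per step
NOISE = 0.001             # a 1 ms interruption...
NOISE_PROB = 1.0 / 60000  # ...hitting each node roughly once per minute of steps

random.seed(0)

ideal = ITERATIONS * STEP
actual = 0.0
for _ in range(ITERATIONS):
    # The step finishes only when the slowest node does.
    slowest = max(
        STEP + (NOISE if random.random() < NOISE_PROB else 0.0)
        for _ in range(NODES)
    )
    actual += slowest

per_node_overhead = NOISE_PROB * NOISE / STEP
print("per-node overhead  : %.4f%%" % (100 * per_node_overhead))
print("collective slowdown: %.2f%%" % (100 * (actual - ideal) / ideal))

Even though each node only loses a tiny fraction of its time, the whole job slows down by roughly that fraction times the number of nodes, because at almost every barrier somebody is the unlucky one.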

The question could be answered if someone compared performance with all the gmonds off versus on for a latency-sensitive application; some large Gaussian job would do it.

On Nov 7, 2007, at 6:28 PM, Richard Grevis wrote:

Douglas,

What Matthias said is good.

At one stage we had a grid of 6,000 servers in maybe 50+ clusters with
5-10 second polling (!!!).

Here are my experiences and some tips, some of which you will already know:
- The overhead of the gmond agent is very low on the monitored hosts,
both for CPU and network I/O. Not storing any local data is a Good Thing.

- Network overhead from UDP data is really, really low. In our case we
unicast the UDP to headnodes. Headnode CPU load was still really small.

- Your first (and also biggest) bottleneck is calling rrdupdate and writing the
  RRD data to the filesystem. Many posts talk about this. We used a SAN for the
  RRD files; others put them on a tmpfs with a periodic rsync backup (a rough
  sketch of that approach follows this list). Run strace on gmetad and you will
  see what I mean.

- gmetad spawns one thread per data_source as best I can tell, and each thread
  does the TCP/XML data retrieval and then the RRD updates. This affected us
  because of the 5-10 second polling of data sources.

- Personally I like 10 second polling, but it depends on your typical job
  durations.
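
A rough sketch of the tmpfs backup idea mentioned above (the paths, interval and rsync flags are just placeholders for whatever fits your layout; you would also copy the files back onto the tmpfs at boot before starting gmetad):

#!/usr/bin/env python
# Periodically copy RRDs held on a tmpfs back to persistent storage so a
# reboot does not lose them. Sketch only -- paths and interval are made up.
import subprocess
import time

TMPFS_RRDS = "/var/lib/ganglia/rrds/"      # gmetad's RRD directory, mounted as tmpfs
BACKUP_DIR = "/data/ganglia-rrd-backup/"   # persistent copy on the disk array
INTERVAL = 600                             # seconds between syncs

while True:
    # -a preserves times/permissions, --delete keeps the backup from going stale
    subprocess.run(["rsync", "-a", "--delete", TMPFS_RRDS, BACKUP_DIR], check=False)
    time.sleep(INTERVAL)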

Tips?
- Make a grid, chopping your cluster up into several data sources. It helps on the display side too!

- Integer values returned from gmond still give rise to RRD files that are
  updated at the poll rate, even if they are constant (e.g. CPU clock
  speed). Remove the ones you don't need or morph them into string values
  (see the first sketch after these tips).

- Gaps in graphed data? For us it was each gmetad thread being unable to do
  everything it had to within the polling interval window. The ganglia server
  itself did not run out of overall CPU; in fact its CPU usage was quite low.

- We also got the occasional gap exactly on the hour. Matt Toy postulated that this was the moment RRD had to update its aggregated values.

- Make gmetric scripts on the ganglia server that report I/O wait, disk
  service time, etc. (see the second sketch after these tips). Spikes in I/O
  wait correlated with the gaps for us. Umm. Mostly.
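
For the string-value trick, one way (assuming you can turn the built-in metric off in your gmond configuration) is to publish the constant yourself with gmetric as a string, since string metrics don't get RRDs. The gmetric path and metric name here are just examples:

#!/usr/bin/env python
# Publish a constant (CPU clock speed) as a string metric so it does not get
# an RRD rewritten every poll. Sketch only -- the gmetric path and the metric
# name are examples.
import subprocess

def cpu_mhz():
    # Read the clock speed from /proc/cpuinfo (Linux).
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("cpu MHz"):
                return line.split(":")[1].strip()
    return "unknown"

subprocess.run([
    "/usr/bin/gmetric",
    "--name", "cpu_speed_str",
    "--value", cpu_mhz(),
    "--type", "string",
], check=False)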
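
And for the I/O wait tip, a minimal wrapper along these lines, run from cron or a loop on the ganglia server (the metric name, units and 5 second sampling window are arbitrary choices):

#!/usr/bin/env python
# Sample aggregate I/O wait from /proc/stat over a short window and push it
# to ganglia via gmetric. Minimal sketch; adjust names and window to taste.
import subprocess
import time

def cpu_counters():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        return [int(x) for x in f.readline().split()[1:]]

before = cpu_counters()
time.sleep(5)
after = cpu_counters()

deltas = [b - a for a, b in zip(before, after)]
total = sum(deltas)
iowait_pct = 100.0 * deltas[4] / total if total else 0.0   # fifth counter is iowait

subprocess.run([
    "/usr/bin/gmetric",
    "--name", "iowait_pct",
    "--value", "%.1f" % iowait_pct,
    "--type", "float",
    "--units", "%",
], check=False)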

regards,
Richard G


--



Doug Nordwall
Unix Administrator
EMSL Computer and Network Support
Unclassified Computer Security
Phone: (509)372-6776; Fax: (509)376-0420
The best book on programming for the layman is "Alice in Wonderland"; but that's because it's the best book on anything for the layman.



