Thanks for all the input. Perhaps unsurprisingly, we are already
looking at most of this for dealing with a large cluster, but I'm
happy to see that we're not the biggest :)
The RRD issue is good to know about. We have a _very_ large disk array
that we can write to, and I already ordered the collection server
with 32 GB of RAM, so that bottleneck should be easy to get around.
We're already going to break it up into chunks.
Sadly, I don't think anyone has been able to answer my question about
jitter. It's not a simple overhead issue: something might cost only
1 ms of overhead on each node per minute, but if that 1 ms keeps the
calculation from completing, it delays all of the nodes. Then another
1 ms hit lands a millisecond later, again slowing down the calculation.
The question could be answered if someone compared their performance
with all the gmonds off versus on for a latency-sensitive application.
Some large Gaussian job would do it.
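For what it's worth, here is roughly the kind of on/off comparison I have
in mind, as a rough sketch only (the unit of work and iteration counts are
made up, and a real MPI or Gaussian run would be the better test): time a
fixed chunk of work many times and compare the tail of the per-iteration
timings with gmond running versus stopped.

#!/usr/bin/env python
# Rough jitter probe: time a fixed unit of work repeatedly and report the
# spread of the timings. Run once with gmond running and once with it
# stopped, then compare the medians and the tails.
import time

def unit_of_work(n=200000):
    # Arbitrary, deterministic CPU work; any fixed loop will do.
    s = 0.0
    for i in range(1, n):
        s += 1.0 / i
    return s

def main(iterations=2000):
    samples = []
    for _ in range(iterations):
        start = time.time()
        unit_of_work()
        samples.append(time.time() - start)
    samples.sort()
    median = samples[len(samples) // 2]
    p99 = samples[int(len(samples) * 0.99)]
    print("median %.6fs  99th pct %.6fs  max %.6fs" % (median, p99, samples[-1]))

if __name__ == "__main__":
    main()

If the 99th percentile and the max move noticeably with gmond on, that is
the kind of jitter I'm worried about; if they don't, I'll stop worrying.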
On Nov 7, 2007, at 6:28 PM, richard grevis wrote:
Douglas,
what Matthias said is good.
At one stage we had a grid of 6,000 servers in maybe 50+ clusters with
5-10 second polling (!!!).
Here are my experiences and some tips, some you will already know:
- The overhead of the gmond agent is very low on the monitored hosts, both
for CPU and network I/O. Not storing any local data is a Good Thing.
- Network overhead from the UDP data is really, really low. In our case we
unicast the UDP to headnodes, and headnode CPU load was still really small.
- Your first (and also biggest) bottleneck is calling RRDupdate and writing
RRD data to the filesystem. Many posts talk of this. We used a SAN for the
RRD files; others made a tmpfs with a periodic rsync backup (see the sketch
after this list). strace gmetad and you will see what I mean.
- gmetad spawns one thread per data_source as best I can see, and each
thread does the TCP/XML data retrieval and then the RRD updates. This
affected us because of the 5-10 second polling of data sources.
- Personally I like 10 second polling, but it depends on your typical job
durations.
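To make the tmpfs point concrete, a bare-bones sketch (not what we ran; the
paths and interval are made up, and it assumes the RRD directory is already
mounted as tmpfs and that rsync is installed):

#!/usr/bin/env python
# Sketch: gmetad writes its RRDs into a tmpfs mount, and this loop copies
# them to persistent storage every so often. Paths/interval are examples.
import subprocess
import time

RRD_TMPFS = "/var/lib/ganglia/rrds/"     # tmpfs mount gmetad writes into
RRD_BACKUP = "/san/ganglia-rrd-backup/"  # persistent copy (SAN or disk array)
INTERVAL = 15 * 60                       # seconds between backups

while True:
    # -a preserves attributes; --delete drops RRDs for retired hosts/metrics.
    subprocess.call(["rsync", "-a", "--delete", RRD_TMPFS, RRD_BACKUP])
    time.sleep(INTERVAL)

On boot you rsync in the other direction before starting gmetad, so at worst
you lose the last interval's worth of data.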
Tips?
- Make a grid, chopping your cluster up. Helps on the display side too!
- Integer values returned from gmond still give rise to RRD files that are
updated at the poll rate, even if they are constant (e.g. cpu clock speed).
Remove the ones you don't need or morph them into string values.
- Gaps in graphed data? For us it was each thread not being able to do
everything it had to within the polling interval window. The ganglia server
itself did not run out of overall CPU; in fact its usage was quite low.
- We also got the occasional gap exactly on the hour. Matt Toy postulated
that this was the moment RRD had to update its aggregated values.
- Make gmetric scripts on the ganglia server that give you I/O wait, disk
service time, etc. Spikes in I/O wait correlated with the gaps for us.
Umm, mostly. (A rough sketch follows.)
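Something along these lines, as a sketch only (the metric name is made up;
it assumes gmetric is on the PATH and uses its standard
--name/--value/--type/--units options):

#!/usr/bin/env python
# Sketch: sample CPU I/O-wait from /proc/stat over a short interval and
# publish it through gmetric. Metric name is an example.
import subprocess
import time

def cpu_times():
    # First line of /proc/stat: "cpu user nice system idle iowait irq ..."
    with open("/proc/stat") as f:
        return [float(x) for x in f.readline().split()[1:]]

def iowait_percent(interval=10):
    before = cpu_times()
    time.sleep(interval)
    after = cpu_times()
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0  # field 4 = iowait

if __name__ == "__main__":
    pct = iowait_percent()
    subprocess.call(["gmetric", "--name", "cpu_io_wait_pct",
                     "--value", "%.2f" % pct,
                     "--type", "float", "--units", "%"])
    # Constant values (cpu clock speed and friends) can instead be sent with
    # --type string, so they don't force an RRD update every poll.

Run it from cron or a simple loop on the ganglia server and graph it next
to wherever your gaps show up.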
regards,
Richard G
--
Doug Nordwall
Unix Administrator
EMSL Computer and Network Support
Unclassified Computer Security
Phone: (509)372-6776; Fax: (509)376-0420
The best book on programming for the layman is "Alice in Wonderland";
but that's because it's the best book on anything for the layman.