Do any of you who run very large clusters find that Ganglia fits
the bill? Do you notice a performance hit from it? How much tuning
have you done with it? Is there anything you wish it did better, and
if so, what? I'm looking for experience with very large clusters
here, since we already have a number of clusters in the 32-256 node
range that we use Ganglia on.
For those of you with even more detailed knowledge, and perhaps the
time to read a paper, have you figured out whether it introduces
jitter into your cluster, and if so, how to avoid it?
Relevant paper:
http://www.sc-conference.org/sc2003/paperpdfs/pap301.pdf
The short of it is that for tightly coupled jobs, overhead on one
node can affect all the nodes involved in the job. If one node has to
spend 1ms dealing with some overhead on the system, all the other
nodes end up spending that 1ms waiting for it to finish.
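To make the effect concrete, here is a rough Python sketch of a toy model. It is not taken from the paper and does not measure Ganglia; the 10ms compute step, the 1% chance of an interruption per node per step, and all function names are assumptions picked purely for illustration. Each step ends at a barrier, so the step takes as long as the slowest node.

```python
import random


def step_time(node_times):
    """A barrier-synchronized step finishes only when the slowest node does."""
    return max(node_times)


def prob_step_delayed(num_nodes, p_interrupt):
    """Chance that at least one of num_nodes is interrupted during a step."""
    return 1.0 - (1.0 - p_interrupt) ** num_nodes


def simulate(num_nodes, steps, compute_ms=10.0, overhead_ms=1.0,
             p_interrupt=0.01, seed=0):
    """Average step time (ms) when each node is independently interrupted.

    All parameter values are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        times = [
            compute_ms + (overhead_ms if rng.random() < p_interrupt else 0.0)
            for _ in range(num_nodes)
        ]
        total += step_time(times)
    return total / steps


if __name__ == "__main__":
    # Columns: node count, chance a step is delayed, average step time in ms.
    for n in (1, 32, 256, 2048):
        print(n, round(prob_step_delayed(n, 0.01), 3),
              round(simulate(n, 2000), 3))
```

Under these assumptions a single node rarely pays the 1ms, but once the job spans a few hundred nodes almost every step has at least one interrupted node, so nearly every step pays it.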
Doug Nordwall
Unix Administrator
EMSL Computer and Network Support
Unclassified Computer Security
Phone: (509)372-6776; Fax: (509)376-0420
The best book on programming for the layman is "Alice in Wonderland";
but that's because it's the best book on anything for the layman.
_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general