[Ganglia-general] ganglia and performance on large (1000+ node) clusters

Douglas Nordwall Wed, 07 Nov 2007 09:39:20 -0800

Do any of you who have very large clusters find that ganglia fills the bill for your cluster? Do you notice a performance hit from it? How much tuning have you done with it? Do you wish it was better in some fashion, and how so? I'm looking for experience with very large clusters here, since we already have a number of clusters in the 32-256 node range that we use ganglia on.

For those of you with even more detailed knowledge, and perhaps the time to read a paper, have you figured out if it introduces jitter into your cluster, and if so, how to avoid it?


relevant paper:
http://www.sc-conference.org/sc2003/paperpdfs/pap301.pdf

The short of it is that for tightly coupled jobs, the overhead of one job can affect all the nodes involved in the job. One node has to spend 1ms dealing with some overhead on the system, and all the nodes end up spending that 1ms waiting for him to get done.
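
To make that concrete, here is a rough sketch in Python of a barrier-synchronized job, where each step takes as long as its slowest node. It is only a toy model, not taken from the paper: the function names, the 1ms compute step, and the 0.1% chance of a node being interrupted per step are all assumptions picked for illustration.

```python
import random

def step_time(compute_ms, delays_ms):
    """One barrier-synchronized step: every node waits for the slowest one."""
    return compute_ms + max(delays_ms)

def simulate(nodes, steps, compute_ms=1.0, noise_ms=1.0, p_noise=0.001, seed=0):
    """Total wall time in ms for `steps` steps across `nodes` nodes.

    Assumption: each node independently loses `noise_ms` to system overhead
    with probability `p_noise` in any given step.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        delays = [noise_ms if rng.random() < p_noise else 0.0
                  for _ in range(nodes)]
        total += step_time(compute_ms, delays)
    return total

def expected_slowdown(nodes, compute_ms=1.0, noise_ms=1.0, p_noise=0.001):
    """Expected step time relative to a noise-free step, under the same model."""
    # Probability that at least one node is interrupted during a step.
    p_any = 1.0 - (1.0 - p_noise) ** nodes
    return (compute_ms + noise_ms * p_any) / compute_ms

if __name__ == "__main__":
    for n in (1, 32, 256, 1000):
        print(n, round(expected_slowdown(n), 3))
```

Under these made-up numbers a single node barely notices the overhead, but the chance that at least one node is interrupted in a given step grows quickly with the node count, so the whole job pays the 1ms far more often on a large cluster.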


Doug Nordwall
Unix Administrator
EMSL Computer and Network Support
Unclassified Computer Security
Phone: (509)372-6776; Fax: (509)376-0420

The best book on programming for the layman is "Alice in Wonderland"; but that's because it's the best book on anything for the layman.
