Dan Moniz wrote:

<snip>

3) Load on the monitor host/head node seems higher than it should be. It hovers around 2.6 - 3.0. While other software is running on this host, shutting down gmetad results in load falling back down to levels similar to other compute hosts (since the monitor host/head node is currently also a host in the Compute Hosts cluster). Also, in concert with the higher than expected load, ssh sessions to the monitor host/head node seem to take a long time to establish. Again, shutting down gmetad seems to alleviate these problems. While both of these issues don't prevent work from being done or gmetad from working (in the current configuration), it does seem abnormally high and is something of an annoyance.

Does this happen all the time, or do you happen to have a web browser open all the time on the Ganglia page? If so, I might know why.

Over here we noticed that when one person, or especially several people, keep a web browser open on that page continuously, it generates a bigger load on the web frontend server. This seemed to happen because the cluster overview page shows all host graphs by default, and it refreshes automatically.

Meaning that every time the overview automatically refreshes, it redraws 280 host graphs, which can be quite expensive depending on the hardware.
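To get a feel for how quickly this adds up, here is a rough sketch that estimates how many host graphs the frontend has to render per minute. The function name, the number of viewers and the refresh interval are all assumptions for illustration, not values measured on our setup.

```python
def graph_renders_per_minute(host_graphs, viewers, refresh_seconds):
    """Estimate host-graph renders per minute caused by auto-refreshing overview pages.

    host_graphs     -- graphs drawn on one load of the overview page
    viewers         -- browsers left open on that page (assumed input)
    refresh_seconds -- auto-refresh interval of the page (assumed input)
    """
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    # Each viewer triggers a full redraw once per refresh interval.
    return host_graphs * viewers * 60 / refresh_seconds


if __name__ == "__main__":
    # Hypothetical example: 280 graphs, 3 open browsers, 5-minute refresh.
    print(graph_renders_per_minute(280, 3, 300))
```

The estimate grows linearly with the number of open browsers, which would fit with what we saw when several people kept the page open.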

If this seems to be the case for you, I have a little patch that makes the cluster overview not show the host graphs by default. This decreased the load on our web frontend server. It still stays around 0.9 over here, but that's better than 2.5+.

I would also recommend running gmetad and the web frontend on a separate machine, not on your head/login node, if you can spare the hardware.

You could also use a ramdisk, as Matt suggested, to store the .rrd files, if you have enough RAM in the machine. However, our cluster (275 machines) generates about 150 MB worth of .rrd files, which is a pretty big chunk of RAM.
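Before deciding on a ramdisk it helps to measure how much space your own .rrd files take. Below is a minimal sketch that adds up the sizes of all .rrd files under a directory. The path in the example is an assumption (use whatever rrd_rootdir your gmetad is configured with), and the helper names and the 25% threshold are made up for illustration.

```python
import os


def total_rrd_bytes(rrd_root):
    """Return the combined size in bytes of all .rrd files below rrd_root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(rrd_root):
        for name in filenames:
            if name.endswith(".rrd"):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total


def fits_in_ram(rrd_bytes, ram_bytes, max_fraction=0.25):
    """Rule-of-thumb check: are the RRDs at most max_fraction of RAM?

    max_fraction is an arbitrary illustrative threshold, not a Ganglia setting.
    """
    return rrd_bytes <= ram_bytes * max_fraction


if __name__ == "__main__":
    root = "/var/lib/ganglia/rrds"  # assumed location; adjust to your setup
    size = total_rrd_bytes(root)
    print("%.1f MB of .rrd files under %s" % (size / (1024.0 * 1024.0), root))
```

For comparison, our 150 MB over 275 machines works out to roughly 0.55 MB per host.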


Ramon.
