I have Ganglia 3.0.4 installed and mostly working properly. We have lots of hosts, so the RRDs are on tmpfs. I have a grid-of-grids configuration, with a separate gmetad running for each grid, all on the same host. The server in question is pretty beefy: a Sun T2000 that typically has a load average around one.
One of the grids (GRID-A) has about 700 hosts and consists of 2 clusters (525 and 175 hosts). When you visit the page for GRID-A, you will sometimes see accurate counts of hosts up and CPUs total, and sometimes some of those numbers will be low. It seems as though the web server is handing out data that comes from the middle of a calculation cycle. Sometimes the CPUs total number for one of the clusters will be quite low; occasionally you'll see zeros for all three (CPUs total, hosts up, hosts down) in the grid summary. Sometimes the grid summary number is not the same as the sum of the numbers from the two clusters.

The graphs shown on this page always seem to be correct. The red and green lines showing CPUs and Nodes are perfectly flat.

If you visit the parent of GRID-A (the top-level grid that rolls up all the other grids), you will sometimes see low numbers for GRID-A, and sometimes the overall total numbers will be wrong. On the top-level grid page, the red and green lines for GRID-A and for the summary are not flat, but show "dips" of varying severity.

This problem was worse before I moved the RRDs to tmpfs. It has improved markedly, but has not gone away completely. The other grids are smaller (the largest has 400 hosts) and do not show these symptoms.

Ideas?
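To catch the mismatch between a grid summary and the sum of its clusters outside the web frontend, something like the following could be run against saved gmetad XML. This is only a minimal sketch. It assumes a summary-style dump in which each GRID and CLUSTER element has a direct HOSTS child with UP and DOWN attributes; that layout, the function names, the cluster names, and the sample numbers are all illustrative assumptions, not verified output from 3.0.4.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the ASSUMED summary layout; cluster names are hypothetical.
SAMPLE = """<GANGLIA_XML VERSION="3.0.4" SOURCE="gmetad">
<GRID NAME="GRID-A">
<HOSTS UP="650" DOWN="0"/>
<CLUSTER NAME="cluster-1"><HOSTS UP="525" DOWN="0"/></CLUSTER>
<CLUSTER NAME="cluster-2"><HOSTS UP="175" DOWN="0"/></CLUSTER>
</GRID>
</GANGLIA_XML>"""


def hosts_counts(elem):
    """Return (up, down) from a direct HOSTS child, or None if there is none."""
    hosts = elem.find("HOSTS")
    if hosts is None:
        return None
    return int(hosts.get("UP", 0)), int(hosts.get("DOWN", 0))


def check_rollup(xml_text):
    """List (grid name, grid summary, sum over clusters) for each mismatch."""
    root = ET.fromstring(xml_text)
    problems = []
    for grid in root.iter("GRID"):
        # Skip grids that contain nested grids: their totals are not
        # expected to equal the sum of their direct clusters alone.
        if grid.find("GRID") is not None:
            continue
        grid_counts = hosts_counts(grid)
        cluster_counts = [hosts_counts(c) for c in grid.findall("CLUSTER")]
        cluster_counts = [c for c in cluster_counts if c is not None]
        if grid_counts is None or not cluster_counts:
            continue
        total = (sum(c[0] for c in cluster_counts),
                 sum(c[1] for c in cluster_counts))
        if total != grid_counts:
            problems.append((grid.get("NAME"), grid_counts, total))
    return problems


if __name__ == "__main__":
    for name, summary, total in check_rollup(SAMPLE):
        print(f"{name}: grid summary {summary} != cluster sum {total}")
```

Running this repeatedly against fresh dumps and logging the mismatches with timestamps might show whether the low numbers line up with gmetad's polling cycle.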

