Hi, I have been running ganglia for most of the last year, quite happily. My hosts are configured to send unicast data to a single gmetad server.
Recently, large portions of the cluster's graphs have been empty. A sample is shown at http://cryptio.net/~ben/ganglia/blank_graphs.png
Notice that not all hosts are missing data (Burgertime, for example, has all of its data).

I thought it was due to high load, because I first noticed it while the gmetad server was being hammered by a separate process. But the server has long since recovered and the graphs have not; in fact, they have gotten worse.

I was running 3.0.1, and tried upgrading to 3.0.2 on the off chance it would fix something, but it did not. I have since downgraded the webui because I have made some changes[*] and I don't want to spend the time to migrate them just now. :)

When I go into the page for a single host and click on the 'gmetrics' link, I find that all of my metrics have a record of being received within the last two minutes (my time period). And yet, their graphs show up empty.

Any thoughts? What logs should I be looking at?

I am running on a Fedora Core 3 system, with version 3.0.1 (now 3.0.2). I don't think I've made any gross changes to the environment within the last week, which is the period in which all this annoyance started. The only thing I can say is that the beginning of this strangeness coincides with a brief (12-hr) period of intense load on the gmetad server.

Thanks,

-ben

[*] for those interested - I added an 8-hour and a 3-day view; I find the 8-hour view the most useful by far. I also changed the size of the graphs to fit my 20" screen. Finally, I added a Disk summary graph, in addition to the Load, CPU, Memory, and Network graphs. Is there any interest in patching these into the source?

-- 
Ben Hartshorne
email: [EMAIL PROTECTED]
http://ben.hartshorne.net

