> As a side note, I found another problem: the gmonds on each node in the
> cluster do not expire their timeouts for all metric transmission when a
> particular node restarts its gmond. That is, the restarted gmond doesn't
> _know_ about certain less frequent metric updates from other nodes. I have
> more details. I'm not sure if this is related to the default route
> interface problem I just ran into, but I'll get back to you once I can
> devote more time to digging deep into my issues.

This is likely occurring because the gmond was not down for longer than the
time threshold period (the default is 60 seconds). I need to document that
better and update the gmond restart script to sleep(90) before starting
gmond again. Try stopping gmond, waiting 90 seconds, and then restarting it.

Thanks for the feedback.
-matt
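
For illustration, here is a minimal sketch of the stop, wait, start sequence suggested above. It is not the actual gmond restart script: the function name `restart_gmond`, the `run` and `sleep` parameters, and the init-script path in the usage comment are all assumptions made for this example. Only the 60-second default threshold and the 90-second wait come from the message.

```python
import subprocess
import time

# From the message: the default time threshold is 60 seconds, so a
# 90-second pause keeps gmond down for longer than that threshold.
RESTART_DELAY_SECONDS = 90


def restart_gmond(stop_cmd, start_cmd, delay=RESTART_DELAY_SECONDS,
                  run=subprocess.check_call, sleep=time.sleep):
    """Stop gmond, wait `delay` seconds, then start it again.

    `run` and `sleep` are injectable (hypothetical parameters) so the
    ordering can be exercised without touching a real daemon.
    """
    run(stop_cmd)    # stop the running gmond
    sleep(delay)     # stay down past the time threshold
    run(start_cmd)   # start gmond again


# Example call; the init-script path is an assumption and varies by system:
# restart_gmond(["/etc/init.d/gmond", "stop"], ["/etc/init.d/gmond", "start"])
```

Making the delay a parameter means it can be raised if the threshold is configured above the default.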

