On Monday 05 May 2003 19:31, Federico Sacerdoti wrote:
> On Monday, May 5, 2003, at 05:07 AM, Martin Knoblauch wrote:
> > Hi,
> >
> > today I upgraded one of our clusters from 2.5.1 to 2.5.3 (gmond,
> > gmetad and the web-frontend). Since then the log-files on the gmetad
> > node get filled with stuff like:
> >
> > headnode /usr/sbin/gmetad[22664]: RRD_update: illegal attempt to update
> > using time 1052131942 when last update time is 1052132440 (minimum
> > one second step)
>
> I think I know what this is. We recently changed the RRD update logic
> to use CLUSTER LOCALTIME as the rrd timestamp. This was done in 2.5.3
> I believe.
>
> Now if your gmetad has a data source which is another gmetad (port
> 8651), it will try to update its rrds multiple times with the same
> CLUSTER LOCALTIME. Why? Because gmetad only updates its XML every
> 20-30s.
>
Hmm. There is only one gmetad in our setup. All sources come from
various gmonds (we have two clusters on ports 8650 and 8652).

> So it is possible for your gmetad to attempt to update its rrds twice
> with the same LOCALTIME timestamp, causing the errors you see in your
> logs.

Actually, I was able to get rid of the messages, and the graphs are OK
again. It turned out that in the course of a total rebuild of our
cluster we forgot to synchronize the system clocks on (most of) the
nodes. The clocks were pretty far apart. Since I restarted ntpd, all
looks OK.

> This is one of those hard-to-anticipate bugs which occur from
> unintended side effects to the system. To fix it, I believe we need to
> use the true localtime when updating rrds for which we are not the
> "authority" on. (The authority mode is off whenever we get our data
> from another gmetad).

Seems the time-keeping is a bit touchy :-)

Martin
-- 
----------------------------------
Martin Knoblauch
[EMAIL PROTECTED]
http://www.knobisoft.de
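
For illustration, the rule behind the log message can be sketched in a few lines of Python. This is a toy model, not gmetad or RRDtool code: the class `MonotonicStore` and its `update` method are hypothetical names. It assumes only what the log message itself states, namely that each update must be at least one second newer than the previous one, and it reuses the two timestamps from the log above.

```python
class MonotonicStore:
    """Toy stand-in for an RRD file (hypothetical, not the RRDtool API).

    It accepts an update only if its timestamp is strictly newer than
    the timestamp of the last accepted update.
    """

    def __init__(self):
        self.last_update = None   # timestamp of the last accepted update
        self.samples = []         # accepted (timestamp, value) pairs

    def update(self, timestamp, value):
        # Reject timestamps that are equal to or older than the last one.
        if self.last_update is not None and timestamp <= self.last_update:
            raise ValueError(
                f"illegal attempt to update using time {timestamp} "
                f"when last update time is {self.last_update} "
                "(minimum one second step)"
            )
        self.last_update = timestamp
        self.samples.append((timestamp, value))


store = MonotonicStore()

# A sample stamped by a node whose clock is ahead is stored first ...
store.update(1052132440, 0.42)

# ... so a sample stamped by a node whose clock is behind is refused.
try:
    store.update(1052131942, 0.40)
except ValueError as err:
    print(err)

# Gap between the two timestamps from the log, in seconds.
print(1052132440 - 1052131942)  # 498
```

In this model a repeated timestamp is refused for the same reason as an older one, which matches both explanations in the thread: the same CLUSTER LOCALTIME arriving twice, or node clocks that disagree by several minutes.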
