Hi,

Our gmetad boxes (2 of them), with 12 data sources (6 gmetads and
6 gmonds), are flooding syslog with the following message:

Sep  6 06:33:32 localhost.localdomain /usr/sbin/gmetad[2526]:
RRD_update (/var/lib/ganglia/rrds/...metric.rrd): illegal attempt to
update using time 1252244010 when last update time is 1252244010
(minimum one second step)

This happens for both metrics and summary graphs.

Looking at the hosts, everything appears to be fine to me, and ntp is
running everywhere and in sync.

Looking at the code instead, both gmetad/gmetad.c and
gmetad/data_thread.c have a possibly suspicious call to sleep:

in gmetad.c:417
         sleep_time = 10 + ((30-10)*1.0) * rand()/(RAND_MAX + 1.0);
         sleep(sleep_time);

in data_thread.c:193
         sleep_time = (d->step - 5) + (10 * (rand()/(float)RAND_MAX))
                      - (end.tv_sec - start.tv_sec);
         if( sleep_time > 0 )
            sleep(sleep_time);
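
To make the ranges concrete, here is a small sketch of the two
expressions with the random term passed in as a parameter r, so the
results are reproducible. The function and parameter names are mine,
not from the gmetad source; step stands in for d->step and elapsed for
end.tv_sec - start.tv_sec. I use double throughout, whereas the
original mixes float and integer types.

```c
/* Sketch only: names are hypothetical, not taken from gmetad. */

/* gmetad.c expression: 10 + 20 * r, where r stands in for
 * rand()/(RAND_MAX + 1.0) and so lies in [0, 1).
 * The result lies in [10, 30). */
static double summary_sleep(double r)
{
    return 10 + ((30 - 10) * 1.0) * r;
}

/* data_thread.c expression: (step - 5) + 10 * r - elapsed, where r
 * stands in for rand()/(float)RAND_MAX and so lies in [0, 1].
 * Before elapsed is subtracted the value lies in [step - 5, step + 5]. */
static double poll_sleep(int step, double r, long elapsed)
{
    return (step - 5) + (10 * r) - elapsed;
}
```

For example, with a step of 15 and r = 0, poll_sleep() goes negative
as soon as one polling pass takes longer than 10 seconds.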

Two observations:
- According to man 3 sleep, sleep() returns early if a signal that is
  not ignored is delivered to gmetad, so the effective sleep interval
  can be close to 0.
- end.tv_sec - start.tv_sec could come out considerably high, which
  together with a short step could result in a sleep_time <= 0.
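
If those two cases really are the cause, a guard along the following
lines might help. This is only a sketch of the idea, not a patch
against gmetad: clamp_sleep_time(), sleep_full() and the min_wait
parameter are names I made up for illustration. The only library call
is sleep() from unistd.h, which returns the number of unslept seconds
when it is interrupted.

```c
#include <unistd.h>

/* Hypothetical guard: never wait less than min_wait seconds, even when
 * the computed sleep_time comes out as zero or negative. */
static int clamp_sleep_time(int sleep_time, int min_wait)
{
    return sleep_time < min_wait ? min_wait : sleep_time;
}

/* Hypothetical helper: keep sleeping until the full interval has
 * passed. sleep() returns 0 when it ran to completion, or the number
 * of seconds left when a signal cut it short, so the loop resumes
 * with the remainder. Returns how many times the sleep was interrupted. */
static unsigned int sleep_full(unsigned int seconds)
{
    unsigned int interruptions = 0;
    unsigned int left = seconds;

    while (left > 0) {
        left = sleep(left);
        if (left > 0)
            interruptions++;
    }
    return interruptions;
}
```

The call site would then look something like
sleep_full(clamp_sleep_time(sleep_time, 1)), so that two polls of the
same source are always at least one second apart.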

thoughts?

thanks

-- 
"Behind every great man there's a great backpack" - B.

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
