On Mon, 2011-04-11 at 17:08 -0700, Rick Jones wrote:
> I'm noticing messages such as:
> 
> Mon Apr 11 16:13:14 2011  **WARNING** RRD:
> rrd_update(/usr/local/var/ntop/rrd/interfaces/sFlow-device.2/sFlow/48/ifOutBroadcastPkts.rrd)
>  error: 
> /usr/local/var/ntop/rrd/interfaces/sFlow-device.2/sFlow/48/ifOutBroadcastPkts.rrd:
>  illegal attempt to update using time 1302563594 when last update time is 
> 1302563594 (minimum one second step)
> 
> appearing when ntop is running - they seem to travel in fairly large
> groups - one entry per statistic per port it seems (I don't have that
> much scroll-back and ntop is not running as a daemon so it isn't going
> to syslog (iirc)).
> 
> At one level the error message is encouraging - it suggests that one can
> indeed store to the RRD as often as once a second :) but it is still
> worrisome regardless.

As part of learning more about RRD, I have been writing my own little
"store the sflow counters to an RRD" program, and was noticing similar
errors there, eventhough I was creating the RRD with a start time of
even as much as 2 seconds prior to "current" (using --start -2).

I have a hypothesis as to why I am seeing it in my program but am still
not familiar enough with the ntop code to say if it is happening there,
but sFlow counters tend to travel in groups.  If one snaps the arrival
timestamp of the sFlow PDU when making the recvfrom() call, say to use
as the timestamp for the update... but uses "current" time in the create
(even subtracting some seconds from it) by the time one has created five
or six new RRDs, several seconds can pass and so the sixth or seventh
can be well beyond the arrival timestamp.  I won't inundate the list
with the strace, but here is some output from my little program:

(successful updates are actually silent)

2011-04-27-13:42:33.307211: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/31/generic.rrd
2011-04-27-13:42:33.754664: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/33/generic.rrd
2011-04-27-13:42:34.335818: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/37/generic.rrd
2011-04-27-13:42:34.925467: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/38/generic.rrd
2011-04-27-13:42:35.377701: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/40/generic.rrd
2011-04-27-13:42:35.916924: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/41/generic.rrd
2011-04-27-13:42:36.344945: stat_and_create_generic_counter_db creating
new RRD /tmp/counters/192.168.1.7/0/19/generic.rrd
2011-04-27-13:42:36.912633: update_generic_counter_db() rrd_update()
returned '/tmp/counters/192.168.1.7/0/19/generic.rrd: illegal attempt to
update using time 1303936953 when last update time is 1303936954
(minimum one second step)'
2011-04-27-13:42:36.912814: update_generic_counter_db failure storing to
generic counter db line 1512 argc 32 position 3

I suppose that part of "the answer" is to use the rrcached, but the way
I've fixed the issue in my little program is to use the PDU's arrival
timestamp less one second.

Oh, and one other thing I've noticed - "contemporary" (perhaps for
versions of contemporary going back several years) allow update
timestamps to include milliseconds.  While the minimum step remains one
second, this helps avoid issues when one is using a step of one second,
and the actual arrival times are fluttering back and forth across the
one second interval.

rick jones


_______________________________________________
Ntop mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop

Reply via email to