Hi Jason,

comparing SNMP counters and NetFlow data can be tricky. This is mainly because SNMP counters are updated in real time, while NetFlow, by the way it has been conceived, involves a sort of "buffering": the first packet seen in a flow creates a flow structure; then the NetFlow agent accumulates counters and other stuff (e.g. TCP flags); finally it kicks the flow to the collector, either because it reckons the flow is closed (e.g. a TCP RST was seen), or because of inactivity timers (which vary from protocol to protocol), or because it's a long-lived flow (so you have another timer for that), or because, for example, the NetFlow agent is running out of resources (and this also adds a touch of fun).
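To make the buffering effect concrete, here is a rough Python sketch (not pmacct code). It assumes, purely for illustration, a single flow sending at a constant rate, an agent that exports a record only when the long-lived-flow timer fires or the flow ends, and a collector that books all bytes of a record into the 5-minute bin in which the record is exported. The function names and numbers are made up.

```python
from collections import defaultdict

BIN = 300  # bin size in seconds (5 minutes)


def snmp_bins(start, end, rate, bin_size=BIN):
    """Bytes per bin as a real-time interface counter would see them."""
    bins = defaultdict(int)
    for t in range(start, end):
        # every second of traffic lands in the bin it was actually sent in
        bins[t // bin_size * bin_size] += rate
    return dict(bins)


def netflow_bins(start, end, rate, active_timeout, bin_size=BIN):
    """Bytes per bin when each record is booked at its export time.

    Assumption: the agent exports when the long-lived-flow timer
    (active_timeout) fires or when the flow ends, whichever is first.
    """
    bins = defaultdict(int)
    t = start
    while t < end:
        export_at = min(t + active_timeout, end)
        # the whole record is attributed to the bin of the export instant
        bins[export_at // bin_size * bin_size] += (export_at - t) * rate
        t = export_at
    return dict(bins)


if __name__ == "__main__":
    # one flow, 10 minutes long, 1000 bytes/s
    print(snmp_bins(0, 600, 1000))
    print(netflow_bins(0, 600, 1000, active_timeout=600))
    print(netflow_bins(0, 600, 1000, active_timeout=60))
```

Under these assumptions the totals always match, but with a 600-second timer all 600000 bytes fall into a single bin, twice what SNMP shows for any one bin, while with a 60-second timer the two views stay much closer. Real agents and collectors are more subtle than this, so take it only as an illustration of why the timers matter.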
Then, once flows reach the collector, they are buffered further: pmacct can do this in two stages, to optimize resources and cope with sustained traffic rates: a) when a flow is handed off/distributed from the core plugin to the backend (memory, xSQL) plugins, which is tuned via the 'plugin_buffer_size' directive, and b) in case an xSQL plugin is used, when a flow is cached inside the plugin waiting to be sent to the database; this is tuned via the 'sql_refresh_time' directive.

Comparing accuracy can get even trickier when enabling the 'sql_history' feature, e.g. to chop the traffic per IP address into bins of 5 minutes, if any of the eviction timers at the NetFlow agent is larger than the SQL history timeframe (in production this is not an issue, but getting a clue about accuracy is a different call).

Summarizing:

* you can limit the impact of the collector on the SNMP vs NetFlow comparison by storing collected data into a memory table
* if testing in a lab without a huge number of concurrent flows, you can disable buffering by omitting the 'plugin_buffer_size' directive (by default pmacct doesn't buffer)
* make sure the collector doesn't lose any NetFlow datagrams, e.g. run pmacct in foreground (or in background, logging somewhere) and watch out for any suspicious message
* reduce all the timers at the NetFlow agent to a bare minimum
* take into account that SNMP counters might very possibly reason in terms of frames instead of IP packets. The required math has to be applied in this case

What surprises me a little bit is the NetFlow counters being greater than the SNMP counters by a factor of 2. Any chance such flows have been seen by, say, two agents and thus reported twice to the collector?

Cheers,
Paolo

On Mon, Sep 22, 2008 at 11:48:33AM -0700, Jason Chambers wrote:
> Hello all,
>
> Great tool, very useful.
>
> I'm trying to understand the total bytes per time bin collected by nfacctd.
>
> The problem I have is a calculated bandwidth per time bin (5 minute
> intervals) does not match the calculated bandwidth from SNMP byte counters.
>
> On one link, NFacct data is always more than what SNMP data reports. In
> some cases it is by a factor of 2. On another link it is usually below
> the average bandwidth however there are some instances of the described.
>
> If anything, I would expect the calculated bandwidth to always be less
> than what SNMP reports since I am limiting NFacct collection to a minb
> value.
>
> I suspect maybe this is something to do with the accounting time window
> within nfacct and the netflow timers. I'm looking through the source
> code for clues, but maybe someone has seen this before or can point out
> my mistake ?
>
>
> Regards,
>
> --Jason

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
