On Fri, May 14, 2004 at 08:32:42AM -0700, Link King scratched on the wall:
> > Any ideas here? Not possible to get these numbers accurately?
"It depends." First off, you have to understand that anything that deals with flows and time is largely statistical. You can't get real accurate numbers because there is no information about the "shape" of a flow-- if it has a lot of burst traffic up front, and then a slower stream of small packets, the flow will be "front loaded" when looked at in time (with both packets and bytes), but there is no way to tell that from an export record-- the best you can do is assume the flow is linear in time for both packets and data. In our case, this wasn't too big of a deal. We run about 350M-flows a day, so averages are on our side-- and individual flows don't mean a whole lot. This also means that there is no such thing as "max" attributes. You can calculate average values down to some resolution-- like 60 seconds-- and that's that. You only get a "max" if you aggregate those values further. If you've used RRDTOOL, you know what I mean. You can't have a MAX type on your highest resolution track (well, you can, but it's the exact same as an AVERAGE). The difference between MAX and AVERAGE only matters when you aggregate data. If this is what you want, that's great. If you want some higher resolution "max", like 1 second, you're out of luck. Anyways, I wrote some stuff that did this to generate graphs, but eventually gave up on it. It was too complex for the data we had, especially since you can never have real-time data (delayed by the max flow life-- 30 min., usually). As I'm sure you are aware, the big problem is that flows are not reported until they are done. If you try to build a graph based off the reporting time, your graph will have lot of big spikes in it, as it will assume that all the data and bytes were transfered in the instant the flow was reported. That has little to do with reality, especially for very long (== very big in packets and data) flows. What we did was create a large array of 1 minute "buckets" in memory that held several hours of data. 
It was a sliding window kind of thing. When a flow was reported, we looked at the start time, end time, and the total number of bytes and packets. Assuming the flow was totally linear (they never are, but...), we calculated how many bytes and packets to throw in each bucket, accounting for fractional time values for the start and end buckets.

Once we did this, our graphs looked much nicer. You still can't call a value "final" until it is older than the max flow lifetime, so the graph actually slopes down towards zero as it approaches "now." That confuses a lot of people, so we tended to just run the graphs in a delayed mode (or highlighted the last 30 minutes with a red background or something).

It's a lot of work, and the results are still one average and assumption piled on top of the next, but it's something.

We also have our own suite of netflow tools, so I have no idea whether this kind of thing is available in one of the public tools.

I also thought coming up with a better statistical model for the "shape" of flows would be a great master's thesis, but so far I've had no takers.

 -j

> > As part of a report I'm trying to build using flow-report I'd like to
> > include average and maximum bps and pps numbers. The problem that I run
> > into is that I sample only 1 out of every 100 packets and the numbers
> > generated via flow-report (with scale 100 option) do not appear to be
> > accurate.
> >
> > I can figure out the average numbers with a little math but am stumped on
> > how to generate maximum bps and pps numbers from the captured flow
> > information.
> >
> > Is there a way to get accurate max bps and pps info via flow-report or
> > another tool?
> >
> >
> > --
> Link King
> [EMAIL PROTECTED]
>
> _______________________________________________
> Flow-tools mailing list
> [EMAIL PROTECTED]
> http://mailman.splintered.net/mailman/listinfo/flow-tools

--
Jay A. Kreibich | Integration & Software Eng.
[EMAIL PROTECTED] | Campus IT & Edu. Svcs.
<http://www.uiuc.edu/~jak> | University of Illinois at U/C
