Hi Greg,

Sorry for the delay. I've been away on holiday and business for several weeks.

As for your question:
First of all, which nfdump version are you using?

I cannot really reproduce your findings. Lists in nfdump are optimised
and should have only a small impact.

Example:
nfdump -r /.../nfcapd.201110031000 'not any'
Date flow start          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows
Summary: total flows: 0, total bytes: 0, total packets: 0, avg bps: 0, avg pps: 0, avg bpp: 0
Total flows processed: 3488792, Blocks skipped: 0, Bytes read: 183063912
Sys: 0.564s flows/second: 6185417.6  Wall: 0.577s flows/second: 6040832.2

This is the throughput of nfdump on this host when it just reads a
compressed file of 68 MB containing about 3.5 million flows and rejects
every flow with a filter. The same can be done with a list filter that has
only a single entry, which uses a bit more CPU:

nfdump -r /.../nfcapd.201110031000 'ip in [127.0.0.0/24]'
Date flow start          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows
Summary: total flows: 0, total bytes: 0, total packets: 0, avg bps: 0, avg pps: 0, avg bpp: 0
Total flows processed: 3488792, Blocks skipped: 0, Bytes read: 183063912
Sys: 0.660s flows/second: 5285728.1  Wall: 0.661s flows/second: 5277556.4

There is only a small difference in speed and throughput.
Lists are quite efficient, and 5k entries do not slow things down much.
Your system resources, such as I/O throughput and available memory, have a
much bigger impact.
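
To put a number on that difference, here is a small Python sketch that
compares the two "Wall" flows/second figures reported in the runs above.
The function name is only illustrative; the two input values are copied from
the nfdump output and nothing else is measured.

```python
def relative_slowdown(baseline_fps, filtered_fps):
    """Return the extra wall time per flow as a fraction of the baseline."""
    # Time per flow is 1/fps, so the time ratio is baseline_fps / filtered_fps.
    return baseline_fps / filtered_fps - 1.0


# "Wall" flows/second as reported by nfdump in the two runs above.
baseline_wall_fps = 6040832.2   # filter: 'not any'
list_wall_fps = 5277556.4       # filter: 'ip in [127.0.0.0/24]'

slowdown = relative_slowdown(baseline_wall_fps, list_wall_fps)
print(f"extra wall time with the list filter: {slowdown:.1%}")
```

On these figures the single-entry list filter costs roughly 14 to 15 percent
more wall time, which is a long way from the 1 second versus 27 seconds you
describe.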

As for the lists:

ip in [ iplist ]

where iplist is a list of IP addresses or network blocks such as 192.168.0.0/24.

'net in [ ]' does not work, as the source net is not defined. Use network
blocks in ip lists instead.
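
If the 5k networks live in a text file, a filter of this form can be
generated rather than written by hand. The following Python sketch is only
an illustration: the function name is hypothetical, and it assumes that list
entries are separated by spaces, as in your own example. Please check the
resulting filter against your nfdump version.

```python
import ipaddress


def build_ip_list_filter(networks):
    """Build an 'ip in [ ... ]' filter string from addresses or netblocks."""
    entries = []
    for net in networks:
        # strict=False tolerates entries with host bits set, e.g. 192.168.0.1/24.
        parsed = ipaddress.ip_network(net.strip(), strict=False)
        if parsed.prefixlen == parsed.max_prefixlen:
            # A single host: emit the plain address without a prefix length.
            entries.append(str(parsed.network_address))
        else:
            entries.append(str(parsed))
    # Assumption: entries inside the brackets are separated by spaces.
    return "ip in [ " + " ".join(entries) + " ]"


if __name__ == "__main__":
    print(build_ip_list_filter(["192.168.0.0/24", "10.0.0.0/8"]))
```

The printed string can be passed as the filter argument on the command line
or saved to a filter file.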

nfdump does not use threading so far. This may come in the future, but the
benefit is questionable, as disk I/O is normally the bottleneck.

Hope this helps; otherwise let me know.

        - Peter

On 8/15/11 10:02, Greg Zapp wrote:
> I'm using netflow data for billing based on international vs
> domestic bandwidth.  As such I'm using a filter file with about 5k
> networks.  For testing I've been processing about 536k flows from a
> 27M flow file brought in from a pcap file.  This represents about 5
> minutes of capture.
> This takes a little less than 1 second with no filter file.  Here are
> my questions:
> 
> Why does it take about 27 seconds to process when using a filter file
> with only one network?
> 
> Why does it take about 27 seconds to process when using a filter file
> with 5k networks?
> 
> Are lists supported for network filtering?  I read about lists for
> IP's but I get a syntax error when saying "net in [ x.x.x.x.x
> x.x.x.x]"
> 
> Should nfdump be using only 1 core?  I thought it was multi threaded
> but perhaps I'm mistaken.
> 
> 
> 
> Any assistance would be much appreciated.
> 
> Thanks,
>      Greg
> 
> _______________________________________________
> Nfdump-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfdump-discuss

-- 
--
Be nice to your netflow data
