Re: [Nfsen-discuss] [BULK] Re: Increasing performance for plugins

Peter Haag Fri, 09 Feb 2007 00:55:55 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Adrian,



- --On February 9, 2007 9:18:50 +0200 Adrian Popa <[EMAIL PROTECTED]> wrote:

| Peter, I have a question about performance.
|
| If I were to give up on my plugin and instead create about 200 profiles
| , each searching for a different network prefix and input/output
| interface in the flows, would nfsen be able to handle this kind of data?
| (actually I have 20 prefixes on 6 routers with 1 to 4 interfaces each,
| and I'd like to plot upstream and downstream traffic - so I think there
| will be more than 200 profiles...).
| For each profile, I could set an expire time of 10 minutes (because I
| don't need to save the flows - just to get the graphs).
|
| Or, as you suggested, I could use channels in the same profile, but I
| have no idea how I could create or manage them... (perhaps you have some
| tips where I could find some documentation about that).
|
| My main question is: do you think nfsen can handle 200 profiles? I have
| only a production machine, and I'm not eager to experiment on it! :)

I understand your concerns very well! However, it's difficult to give an
absolute answer. It may well work - it mainly depends on:

1. The amount of data in a 5min time slot in your live profile.
2. The power of your production machine - IO
3. The power of your production machine - CPU

A few explanations of how NfSen works under the hood, so you can decide how
you want to proceed. Everything below is based on NfSen >= snapshot-20070110.
NfSen versions < snapshot-20070110 have a slightly different, less flexible
implementation, but follow a similar principle.

A profile consists of one or more channels. Each channel is based on one or
more netflow sources from the live profile and has its own filter. Therefore a
channel is no longer bound to a specific netflow source of profile 'live'. The
traditional 1:1 profile from NfSen 1.2.4 is implemented such that each channel
is built from one netflow source in the live profile and all channels have the
same filter. When selecting '1:1 channels from profile live' while creating a
profile, NfSen will automatically create the appropriate number of channels
for you. When using individual channels, you can add new channels to the
created profile by clicking on the '+' icon to the right of 'Channel list'.
When you are done adding all channels, click on the 'accept' icon to the right
of 'Status'. This will commit the new profile and switch it to the active
state.
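
To illustrate the "one channel per prefix, interface and direction" idea, here
is a minimal Python sketch that only builds the filter strings you would paste
into each channel. It is not part of NfSen. The function name, the channel
naming scheme and the interface numbers are hypothetical, and the
'in if N and net a.b.c.d/m' spelling is an assumption modelled on the filter
quoted further down in this thread, so please check it against the nfdump
filter documentation before using it.

```python
from itertools import product

def channel_filters(prefixes, interfaces):
    """Build one filter string per (prefix, interface, direction).

    The filter spelling is an assumption based on the example in this
    thread; channel names are purely illustrative.
    """
    channels = {}
    for prefix, ifindex, direction in product(prefixes, interfaces, ("in", "out")):
        # Make the prefix safe to use as part of a channel name.
        safe = prefix.replace("/", "_").replace(".", "-")
        name = f"{direction}_if{ifindex}_{safe}"
        channels[name] = f"{direction} if {ifindex} and net {prefix}"
    return channels

# Hypothetical input: two prefixes seen on interfaces 3 and 7.
filters = channel_filters(["1.2.3.0/24", "10.0.0.0/8"], [3, 7])
for name, flt in sorted(filters.items()):
    print(f"{name}: {flt}")
```

Two prefixes on two interfaces in two directions give eight channels, which
shows how quickly the channel count grows with the number of prefixes.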

As for processing:
At every 5min slot, the periodic process updates all continuous profiles. All
channels in all continuous profiles are updated at once using 'nfprofile'.
Therefore it makes no difference whether you have 1 profile with 200 channels
or 200 profiles with 1 channel each. nfprofile reads the flows from the live
profile only once, regardless of how many channels you have, and applies all
channel filters in sequence.
Performance-wise, you need to read all flow data and write all new channel
data back to disk. If the channel filters do not overlap, you essentially read
and write the same amount of data - a somewhat more expensive "copy" operation
from 1 file to many files. Furthermore, you apply each channel filter to each
flow. Filtering is implemented quite efficiently, so most of the time your
system is in IO wait and you still have some idle CPU cycles. Of course
filtering may become an issue if your filters are several 10k in size, which I
do not expect in your case. 'nfprofile' will become more parallelized in
future versions, so the profiler can make better use of today's more powerful
systems.
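
The read-once, filter-many principle described above can be sketched in a few
lines of Python. This is only an illustration of the idea, not nfprofile's
actual code: the flow records are plain dictionaries with made-up field names,
and the sketch sums bytes per channel instead of writing channel files or
updating RRDs.

```python
import ipaddress

def make_filter(direction, ifindex, prefix):
    """Return a predicate for 'traffic on this interface touching this prefix'."""
    net = ipaddress.ip_network(prefix)
    key = "in_if" if direction == "in" else "out_if"

    def match(flow):
        return flow[key] == ifindex and (
            ipaddress.ip_address(flow["src"]) in net
            or ipaddress.ip_address(flow["dst"]) in net
        )
    return match

def profile(flows, channels):
    """Read the flows once and apply every channel filter to each flow."""
    totals = {name: 0 for name in channels}
    for flow in flows:                      # single pass over the input
        for name, match in channels.items():
            if match(flow):
                totals[name] += flow["bytes"]
    return totals

# Hypothetical flow records; the field names are assumptions for this sketch.
flows = [
    {"src": "1.2.3.4", "dst": "8.8.8.8", "in_if": 3, "out_if": 1, "bytes": 1500},
    {"src": "8.8.8.8", "dst": "1.2.3.9", "in_if": 1, "out_if": 3, "bytes": 400},
    {"src": "9.9.9.9", "dst": "8.8.8.8", "in_if": 3, "out_if": 1, "bytes": 100},
]
channels = {
    "up": make_filter("in", 3, "1.2.3.0/24"),
    "down": make_filter("out", 3, "1.2.3.0/24"),
}
totals = profile(iter(flows), channels)     # an iterator can be consumed only once
print(totals)
```

The cost of the inner loop grows with the number of channels, but the input is
still read from disk only once, which is the point made above about IO versus
CPU.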

To get rid of the data you can set the expire time to 10min or to 1byte. If
you like to try out new things, I can send you a copy of the (not yet
published) snapshot implementing shadow profiles - profiles which do not
collect any netflow data, but only graphical data. This means the profiler
just reads the flow data and does not write anything, just updates the RRD DBs
for the graphs.

To make a long story short - it depends. :)
You may start by implementing 20 channels first, and then double the number at
each subsequent step, while observing the behaviour of your system.
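
As a small illustration of that doubling approach, the following Python sketch
lists the channel counts you would step through. The function name is
hypothetical and the target of 200 is taken from your question; adjust both
numbers to your setup.

```python
def rollout_steps(start, target):
    """Return channel counts that double from 'start' until 'target' is reached."""
    steps = []
    n = start
    while n < target:
        steps.append(n)
        n *= 2
    steps.append(target)    # finish at the real target, not beyond it
    return steps

steps = rollout_steps(20, 200)
print(steps)
```

Between steps, watch IO wait and the run time of the periodic update before
moving on to the next count.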

And last but not least - yes, documentation will come too.

Hope this helps you make your decision.

    - Peter
|
| Thank you!
|
| Peter Haag wrote:
| > -----BEGIN PGP SIGNED MESSAGE-----
| > Hash: SHA1
| >
| > Hi Adrian,
| >
| > - --On February 1, 2007 15:28:32 +0200 Adrian Popa <[EMAIL PROTECTED]> wrote:
| >
| > | Hello,
| > |
| > | I have a question about the performance of nfdump, but first, let me
| > | explain what I'm trying to do:
| > | I have a plugin that searches the collected flows for specific network
| > | prefixes (or AS-es) on each exporting router, on specific interfaces. The
| > | information is then fed into custom rrd files and plotted as png images.
| > | Searching is done by using a top 1 record/bytes and filtering by 'inif x
| > | and net 1.2.3.0/24'. Here's an example:
| > |
| > | $nfdump -r /data/nfsen/profiles/live/$border/nfcapd.$timeslot -n 1 -s
| > | record/bytes -o "fmt:%ts %td %pr %sap -> %dap %pkt %byt %bps %in %out
| > | %sas %das %fl" '$ifType $if and net $prefix'
| > |
| > | I have to search for input traffic on a specific interface for a
| > | specific network prefix and also for output traffic for the same thing.
| > |
| > | I managed to do this, and it works well, but execution time for 2
| > | borders (with 3-4 interfaces each), 20 prefixes and 50 AS-es for a peak
| > | traffic of about 2Gbps is about 3.5 minutes.
| >
| > If I understand you right, you are going to call this nfdump command for
| > each of the prefixes, which results in a lot of sequential nfdump commands.
| >
| > |
| > | In the future I will want to monitor other routers, on the same
| > | principle. As far as I see, I can do that, but either I gather less
| > | data, or I use a different machine for collecting.
| > |
| > | A colleague of mine proposed that I split my script (which is 100%
| > | sequential) into several threads that run at the same time. Each thread
| > | would call nfdump and update its particular rrd.
| > |
| > | My question to you is this: Assuming that I start the new processes like
| > | threads (or more likely like forked processes), would I get a speed
| > | increase? I'd like to say that this script keeps the processor usage at
| > | about 60-80%.
| >
| > The CPU usage is only half of the story for your plugin. Almost always,
| > I/O is much more of a problem. For each nfdump command you read a lot of
| > data. If this amount of data does not fit into the file system cache of
| > your OS, performance drops rapidly. So I'd recommend you analyse how your
| > system behaves during such a plugin cycle. Have a look at the IO using
| > iostat and check the service time of your disks. If you still have
| > headroom, creating more threads can result in better performance. If your
| > IO system is at its limit, you will not gain anything, and CPU will stay
| > at 60-80% as your system has lots of IO wait. To overcome this, you would
| > need more RAM to increase the memory available for the file system cache.
| > A high service time in iostat means slow disks - so you'll need the right
| > balance of disks and memory.
| >
| > Furthermore, you can try to optimise IO by reading the data only once and
| > processing it in parallel - the way nfprofile profiles all your channels.
| > It reads the data only once and applies all filters to the same data.
| >
| > Coming back to your plugin - you may try to optimise IO by creating a
| > profile with a channel per prefix and an adequate filter per channel. Your
| > plugin then needs to create the Top 1 stat per channel, which may result
| > in reading less data overall - but this is just a guess.
| >
| > So - it's a bit of all.
| > Hope this helps anyway.
| >
| >     - Peter
| >
| > |
| > | I don't know if the forked processes would load the same input file into
| > | memory again and again, or if they would share the same file (lowering
| > | memory consumption)?
| > |
| > | What are your recommendations?
| > |
| > | Thank you for your time,
| > |
| > | --
| > | Adrian Popa
| > |
| > |
| > |
| > | -------------------------------------------------------------------------
| > | Using Tomcat but need to do more? Need to support web services, security?
| > | Get stuff done quickly with pre-integrated technology to make your job easier.
| > | Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
| > | http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
| > | _______________________________________________
| > | Nfsen-discuss mailing list
| > | [email protected]
| > | https://lists.sourceforge.net/lists/listinfo/nfsen-discuss
| >
| >
| >
| > - --
| > _______ SWITCH - The Swiss Education and Research Network ______
| > Peter Haag,  Security Engineer,  Member of SWITCH CERT
| > PGP fingerprint: D9 31 D5 83 03 95 68 BA  FB 84 CA 94 AB FC 5D D7
| > SWITCH,  Limmatquai 138,  CH-8001 Zurich,  Switzerland
| > E-mail: [EMAIL PROTECTED] Web: http://www.switch.ch/
| > -----BEGIN PGP SIGNATURE-----
| > Version: GnuPG v1.4.3 (Darwin)
| >
| > iQCVAwUBRcMAaP5AbZRALNr/AQLXEwP/bymvl/R3I5MqF8qXSq82QXwDng9VPcyH
| > 56KfUdgDFYpVSOM/Jjn08t8LPaGCA/2DQFxzjzXc+g/YngLfOFZFxkjZEDfRo3AS
| > 53T8cZeTHIx8gy4Xn1y5VqerK0+Q4BNB+I+1+YYo/g8wVfE+pBMNNKh1m+krIwLO
| > isZnG514jMA=
| > =E3HQ
| > -----END PGP SIGNATURE-----
| >
| >
| >
|
|



- --
_______ SWITCH - The Swiss Education and Research Network ______
Peter Haag,  Security Engineer,  Member of SWITCH CERT
PGP fingerprint: D9 31 D5 83 03 95 68 BA  FB 84 CA 94 AB FC 5D D7
SWITCH,  Limmatquai 138,  CH-8001 Zurich,  Switzerland
E-mail: [EMAIL PROTECTED] Web: http://www.switch.ch/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iQCVAwUBRcw3DP5AbZRALNr/AQIp6AP6AmgcojbVR/9okVONs+EmUNzVch4TqwO6
goRXY5OQ6z2TB3z4afFGIsyN+ghaIpqmG7nhqZ71vzfU/qmSIeAmwnOv92sZlJcK
ZgDW8DDeHnCsfM9lilwiZAsXFRahgrLr1MNnU6mSaie1/OoYxAEYU4gUOEwaqkZZ
tsv50dK32h0=
=fDg7
-----END PGP SIGNATURE-----


