----- Original Message ----
From: Karl Tatgenhorst <[EMAIL PROTECTED]>
To: jay alvarez <[EMAIL PROTECTED]>
Cc: Jonathan Glass <[EMAIL PROTECTED]>; [email protected]
Sent: Thursday, January 11, 2007 10:24:06 PM
Subject: Re: [Flow-tools] flow-cat "20gig of flows" |flow-stat -f8 -S2 takes forever to complete...

>
> The performance of all these tools is largely determined by how much
> you tune your hardware and learn the nuances of the software.
> Explore the -m option on flow-cat (disables mmap());
> this will buy much better performance. Also, always keep in mind
> everything that is going on, and as you think it through you can learn
> where to optimize. Our system for flow analysis uses SAN space, a RAM
> SAN and various other tricks. Yesterday, after I wrote this post, my
> coworker parsed, sorted and flow-stat'd 61 GB of flows. The operation
> took just over 13 minutes.

Wow, that is impressive! The only thing on my mind was using a 64-bit OS so I
can use plenty of RAM, but that is just a dream. My boss wants immediate
output, but he won't allow me to buy such expensive toys for my netflow lab.
I'm not sure about that SAN space; I will try googling for it, and that mmap
thing too. Great, thanks for the tips!

For now I have decided to flow-cat 1 week of flows (around 5-8 GB) and then
run a sorted flow-stat on it. Sadly, it runs for about 30-40 minutes. Well,
better than no output at all.

Thanks again!
-jay

On Wed, 2007-01-10 at 15:30 -0800, jay alvarez wrote:
>
> ----- Original Message ----
> From: Karl Tatgenhorst <[EMAIL PROTECTED]>
> To: Jonathan Glass <[EMAIL PROTECTED]>
> Cc: jay alvarez <[EMAIL PROTECTED]>; [email protected]
> Sent: Wednesday, January 10, 2007 10:53:00 PM
> Subject: Re: [Flow-tools] flow-cat "20gig of flows" |flow-stat -f8 -S2 takes forever to complete...
>
> > Hi,
> >
> > Not sure about your hardware's specs but here are some tips.
>
> # cat /proc/version
> Linux version 2.6.8-2-386 ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 Tue Aug 16 12:46:35 UTC 2005
>
> Intel Xeon 3.00 GHz with 1 GB RAM and lots of HD space (200 GB).
>
> > First, the file size is unbelievably unwieldy.
> > You are most likely looking at only certain types of traffic (and if not,
> > perhaps you should consider breaking it out by traffic type), so why not
> > rewrite the files in that way? Let us say, for example, that ICMP is not
> > important, nor are flows with fewer than 3 pkts (not a full TCP handshake).
> > I bet this would cut a sizable percentage out of your files.
>
> The goal is to have an output of top destination IPs (using flow-stat), then
> parse it using custom scripts to aggregate all IPs belonging to a particular
> country. Sort of finding out the top destination countries for the month so
> that the network guys can do all their routing tricks...
>
> > Next, flow-stat needs flow-cat to finish entirely in memory before it
> > can build the hashes. This means that you need 20 GB worth of memory
> > used PRIOR to flow-stat building the hashes.
>
> I see... so I guess I'll have to limit my flow-cat to 1 GB of flows or less,
> or make use of the extra 2 GB of swap just for flow-stat to function
> smoothly.
>
> > This is a difficult trick
> > since Debian would need to be specially tuned to use 3 GB (2 GB is the
> > usual max, 3 is high end). To accomplish this you would need 20 GB
> > minimum of swap space, and that would need to be physically on a drive
> > other than the drive holding the flow files or you will just be I/O
> > bound. Why not cat together single weeks of traffic (with the above
> > mentioned edits) and then put them in Excel to create the monthly
> > reports?
>
> The 1st day of flows totals 750 MB, and adding the 2nd day makes 1.6 GB. I guess
> this would still be tolerable considering I am planning to use the swap space.
>
> So what now? I mean, is it OK if I just "flow-cat 2_days_of_flows", flow-cat
> another 2 days and so on? Then I will flow-cat each output all together, then
> throw it to flow-stat? Hmm... this is getting trickier...
> I guess it would really be impossible to do a "flow-cat 20gb_flows |flow-stat -f8",
> even if I remove sorting, right?
>
> I haven't tried flow-cat'ing a week of flows yet, and given 750 MB per day
> of flows, it would be roughly 5 to 6 GB. Will it be OK if I flow-cat
> |flow-stat -f8 -S2 this size considering my current hardware specs? Or should
> I just show them the top destination countries for every two days,
> then aggregate them as needed?
>
> Or do you have any other suggestion?
>
> I see, perhaps this is why Flowviewer takes so long when showing flow
> reports for a long span of time. I wonder if the Flowviewer guys have already
> considered this. No wonder other admins here said they have left
> Flowviewer because it takes forever to complete a month of reports.
>
> Thanks.
> -jay
>
> > The tip on lsof -p <pid> is very cool, just thought I would mention
> > that. Thanks.
> >
> > Karl Tatgenhorst
> >
> > On Wed, 2007-01-10 at 09:28 -0500, Jonathan Glass wrote:
> > jay alvarez wrote:
> > > Hi,
> > >
> > > I have a directory of flow-captured flows for a whole month (Dec 2006) and
> > > I'm trying to do a
> > > flow-cat "flows_dir" | flow-stat -f8 -S2 > topdestination
> > >
> > > I left it in the background and it's been running for 30 hours now.
> > > Doing a "top" shows flow-stat at the top of the list from time to time,
> > > consuming around 60% of memory on a Debian system. Noticeably, flow-cat
> > > doesn't appear in "top" (perhaps it's done with its job);
> > > however, ps shows them both.
> > >
> > > #ps -aux |grep flow
> > >
> > > root 22604 0.9 0.0 6448 284 ? S Jan09 16:31 flow-cat /var/netflow/ft/all/dec2006/
> > > root 22605 7.0 52.3 875204 474452 ?
> > >   D Jan09 123:07 flow-stat -f8 -S2
> > >
> > > Also lsof
> > >
> > > # lsof |grep flow-cat
> > >
> > > flow-cat 22604 root cwd DIR 8,3 224 36536
> > > flow-cat 22604 root rtd DIR 8,4 584 2 /
> > > flow-cat 22604 root txt REG 8,3 88716 25290 /usr/bin/flow-cat
> > > flow-cat 22604 root mem REG 8,4 90248 110 /lib/ld-2.3.2.so
> > > flow-cat 22604 root mem REG 8,4 73304 5891 /lib/tls/libnsl-2.3.2.so
> > > flow-cat 22604 root mem REG 8,4 28880 6019 /lib/libwrap.so.0.7.6
> > > flow-cat 22604 root mem REG 8,3 67468 5598 /usr/lib/libz.so.1.2.2
> > > flow-cat 22604 root mem REG 8,4 1254660 5886 /lib/tls/libc-2.3.2.so
> > > flow-cat 22604 root mem REG 8,1 3548008 48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > > flow-cat 22604 root 0u CHR 136,0 2 /dev/pts/0 (deleted)
> > > flow-cat 22604 root 1w FIFO 0,7 12005820 pipe
> > > flow-cat 22604 root 2u CHR 136,0 2 /dev/pts/0 (deleted)
> > > flow-cat 22604 root 3r REG 8,1 3548008 48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > >
> > > The above shows that flow-cat seems to have stopped processing at Dec 21;
> > > I don't know why.
> > >
> > > # lsof |grep flow-stat
> > >
> > > flow-stat 22605 root cwd DIR 8,3 224 36536 /usr/local/home/jayson/topcountries
> > > flow-stat 22605 root rtd DIR 8,4 584 2 /
> > > flow-stat 22605 root txt REG 8,3 130208 25291 /usr/bin/flow-stat
> > > flow-stat 22605 root mem REG 8,4 90248 110 /lib/ld-2.3.2.so
> > > flow-stat 22605 root mem REG 8,4 73304 5891 /lib/tls/libnsl-2.3.2.so
> > > flow-stat 22605 root mem REG 8,4 28880 6019 /lib/libwrap.so.0.7.6
> > > flow-stat 22605 root mem REG 8,3 67468 5598 /usr/lib/libz.so.1.2.2
> > > flow-stat 22605 root mem REG 8,4 1254660 5886 /lib/tls/libc-2.3.2.so
> > > flow-stat 22605 root 0r FIFO 0,7 12005820 pipe
> > > flow-stat 22605 root 1w REG 8,3 0 36353 /usr/local/home/jayson/topcountries/topdestinationip
> > > flow-stat 22605 root 2u CHR 136,0 2 /dev/pts/0 (deleted)
> > >
> > > As you can see above, I have redirected the output to "topdestinationip",
> > > but up to now the file is still empty.
> > >
> > > Do you know how I can find out the progress of what I'm doing?
> > > I'm just afraid that the program might have stopped running and I am
> > > waiting for nothing now.
> > >
> > > Thanks
> > > - jay
> > >
> > > _______________________________________________
> > > Flow-tools mailing list
> > > [EMAIL PROTECTED]
> > > http://mailman.splintered.net/mailman/listinfo/flow-tools
> >
> > Just as a personal preference, I like to start my flow-cat sessions in
> > the background, find their process id, and watch it.
> > Literally:
> >
> > flow-cat &
> > ps -aef|grep flow-cat
> > watch "lsof -p <flow-cat-pid>"
> >
> > So I can see exactly what files flow-cat is processing, and watch for it
> > to die.
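
A minimal sketch of the chunked approach discussed in this thread (run flow-stat on a week or a couple of days at a time, then combine the per-chunk reports instead of using a spreadsheet). It is only an illustration: the function name merge_flow_stat is hypothetical, and it assumes each data line of a `flow-stat -f8` report has the columns IP address, flows, octets, packets, with header lines starting with `#`. Check the header of your own reports before relying on that column order.

```shell
# merge_flow_stat (hypothetical helper): sum per-IP counters across several
# saved flow-stat -f8 reports and print them sorted by octets, largest first.
# Assumption: data lines look like "IP flows octets packets" and header
# lines start with '#'.
merge_flow_stat() {
    awk '
        /^#/ || NF < 4 { next }      # skip header, blank and short lines
        {
            flows[$1]   += $2
            octets[$1]  += $3
            packets[$1] += $4
        }
        END {
            # %.0f avoids integer overflow in awk versions with 32-bit %d
            for (ip in flows)
                printf "%s %.0f %.0f %.0f\n", ip, flows[ip], octets[ip], packets[ip]
        }
    ' "$@" | sort -k3,3nr
}
```

For example, `merge_flow_stat week1.txt week2.txt | head -n 20` would list the 20 destinations with the most octets across both reports, where week1.txt and week2.txt are hypothetical files each produced by a separate `flow-cat ... | flow-stat -f8` run. Only one chunk of flows has to be processed at a time, and the merge step works on the much smaller text reports.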
