Hi all: I was not happy with the performance of some of my rules that handle ~100 events/sec (with 1-2 second bursts to 800 events/sec every minute or so).
It was consuming 50% of the CPU at this rate, and the lag was significant (> 100 events). This is unusual: at these levels it should be taking a few percent of the CPU, and the lag should be under 10 or so events most of the time.

I did some rewriting of the rules using jump and cfsets to streamline the handling of particular high-volume event sources (slapd, firewall), so their events go through one ruleset rather than the entire pipeline of rulesets (a rough sketch of this arrangement is appended at the end of this message). This sped things up by 10-15%, but I was still seeing 35-40% CPU use.

I did a state dump (kill -USR1) and saw that I had 100K+ contexts defined. These were created to record postgres connection information (user, host, port) and timed out after a day. They are created at a rate of roughly 1.5/second by our load balancer, which verifies that the postgres instances are up and running. Another 10-20 thousand were created by actual postgres queries over a 24-hour period. They all timed out after 24 hours, but in the meantime they were a huge burden at runtime, since many of my rules use context expressions to control their operation, so the tree of 100K contexts had to be walked for 50% of the rules, if not more. The virtual memory size of the process was also close to 250MB.

I changed my postgres logging to include logging of disconnects from the postgres server, and I added a rule to delete the connection context when a disconnect happens (also sketched at the end). This reduced the steady-state number of contexts to approximately 800 and the memory to 100MB. As a result my sec process is now using 2-5% of the CPU and has a lag of less than 10 events (which is within the error range of the technique I am using to measure the lag).

So what's the moral? Clean up after yourself whenever possible. Even if you have to add extra logging events to the stream, if they help keep things clean they will help performance.

On another topic, does anybody have a good method for measuring the sec lag? I have a single input file (/var/log/messages) and I am using:

  tail -n 75 /var/log/messages > /tmp/a; /etc/init.d/sec_rsyslog_master_mon dump | \
    sed -ne '/Content of input buffer/,+101p' | tail -n 75 | \
    comm -13 /tmp/a - | wc -l

which grabs the last 75 lines of the input file and compares them to the last 75 lines in sec's input buffer. The comm -13 prints the lines that appear only in its second input (sec's buffer) and not in the tail of the input file; when sec is behind, that is roughly the number of events it has yet to process, and wc counts them. This has a few issues, but it seems to be the most accurate mechanism I have come up with. Obviously it falls down if you have multiple input files, separate buffers per file, etc.

Enjoy the new year.

--
-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.
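
P.S. In case the jump/cfset arrangement isn't clear, here is a minimal sketch (the patterns, file names, and ruleset names are made up for illustration, not my actual rules). The slapd ruleset lives in its own configuration file that joins a configuration file set and is marked procallin=no, so its rules only see events explicitly jumped there; the main file routes slapd events to it and stops processing them:

  # main.sec: route slapd events to their own ruleset and stop here
  type=Jump
  continue=DontCont
  ptype=RegExp
  pattern=slapd\[\d+\]:
  desc=route slapd events to the slapd ruleset
  cfset=SlapdRules

  # slapd.sec: joins the SlapdRules set; procallin=no means these rules
  # only process events submitted by a Jump rule, not the whole stream
  type=Options
  joincfset=SlapdRules
  procallin=no

  type=Single
  ptype=RegExp
  pattern=slapd\[\d+\]: (.*)
  desc=handle slapd event $1
  action=logonly

This way a slapd event is examined once by the routing rule and then only by the slapd ruleset, instead of being tested against every rule in every file.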
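
P.P.S. The context cleanup amounts to one extra rule. A minimal sketch, assuming the postgres connection and disconnection log lines carry matching host/port fields (the patterns and the PG_CONN_ context names are made up for illustration):

  # Create a context per connection; the 86400s (24h) lifetime is only a
  # fallback in case the disconnect is never logged
  type=Single
  ptype=RegExp
  pattern=postgres\[\d+\]: .*connection received: host=([\d.]+) port=(\d+)
  desc=record postgres connection from $1:$2
  action=create PG_CONN_$1_$2 86400

  # Delete the context as soon as the disconnect is seen, instead of
  # letting tens of thousands of them pile up until they time out
  type=Single
  ptype=RegExp
  pattern=postgres\[\d+\]: .*disconnection: .*host=([\d.]+) port=(\d+)
  desc=drop postgres connection context for $1:$2
  action=delete PG_CONN_$1_$2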