David, Did a few runs with those lines present / commented out with rsyslog-master. I see no difference in either on-cpu or off-cpu performance. The left-hand-side is with edge-relay and core-relay lines being active, and the right-side has them commented out. Here is the cpu profile: https://drive.google.com/open?id=0B_XhUZLNFT4dWDFOZlB6VW1hc0E The hypercall on top is cpu-idle call. This was on a c1.4large box on ec2.
Both were on the same version of Rsyslog (built off master). I'll do another round of tests with the version I run on production (which is patched fork of 8.12, afair) to identify any large change in cpu profile. This is a much more simplified version of config though. I extracted valid (non garbled) json messages from debug log(got 9089 lines) and pumped them in 100 times using tcpflood. tcpflood call was timed, it took between 19.7 - 20.5 seconds for each run regardless of edge and core relay counters being incremented. It was ~ 45k / sec (but again, neither the load profile, not the machine is not exactly like yours), so its not apples to apples. On Fri, Apr 8, 2016 at 12:15 PM, singh.janmejay <singh.janme...@gmail.com> wrote: > It doesn't treat them any different. It takes the key and puts it in a > hash-table and set an accumulator as its value. Then it'll look it up > every time you ask it to increment the counter followed by actual > increment. That is about it. > > The only way I can foresee the contents affecting its behavior is the > hash-fn takes more CPU time when given a large string. But that should > be on-cpu and not off-cpu time. > > Today im working on a getting an Rsyslog build running with config > into which I inject messages from your debug-log (i'll extract > messages and tcpflood them in) and try to understand why it behaves > different between those lines commented out or not. > > If I fail to reproduce it in this simple setup, we'll have to track > backtraces that are taking Rsyslog off-cpu in your environment. > > On Fri, Mar 25, 2016 at 12:57 AM, David Lang <da...@lang.hm> wrote: >> doing a little more digging (and some accidental stuff), is it possible that >> it's running into grief if the contents of the variable are not a simple >> word (spaces or other funny characters in the value)? >> >> David Lang >> >> On Wed, 23 Mar 2016, singh.janmejay wrote: >> >>> Date: Wed, 23 Mar 2016 01:07:49 +0530 >>> >>> From: singh.janmejay <singh.janme...@gmail.com> >>> Reply-To: rsyslog-users <rsyslog@lists.adiscon.com> >>> To: rsyslog-users <rsyslog@lists.adiscon.com> >>> Subject: Re: [rsyslog] dynastats problems with user defined variables? >>> >>> Yep, it looked like that. Its interesting though, its 100% off CPU >>> (very unique). >>> >>> On Wed, Mar 23, 2016 at 1:02 AM, David Lang <da...@lang.hm> wrote: >>>> >>>> Yes, the first one is with the user variables commented out, exactly what >>>> I >>>> pasted in from the grep. >>>> >>>> the second is enabling the dyn_inc for edge relays. >>>> >>>> David Lang >>>> >>>> On Wed, 23 Mar 2016, singh.janmejay wrote: >>>> >>>>> Date: Wed, 23 Mar 2016 00:51:18 +0530 >>>>> From: singh.janmejay <singh.janme...@gmail.com> >>>>> Reply-To: rsyslog-users <rsyslog@lists.adiscon.com> >>>>> To: rsyslog-users <rsyslog@lists.adiscon.com> >>>>> Subject: Re: [rsyslog] dynastats problems with user defined variables? >>>>> >>>>> You meant the first one was with "commented out" version right? (you >>>>> said uncommenting edge_relay, so double-checking). >>>>> >>>>> On Wed, Mar 23, 2016 at 12:38 AM, David Lang <da...@lang.hm> wrote: >>>>>> >>>>>> >>>>>> here is vmstat 1 running >>>>>> >>>>>> # vmstat 1 >>>>>> procs -----------memory---------- ---swap-- -----io---- -system-- >>>>>> ------cpu----- >>>>>> r b swpd free buff cache si so bi bo in cs us sy >>>>>> id >>>>>> wa st >>>>>> 2 0 62280 2363804 584 27280376 0 1 21 596 0 1 46 >>>>>> 3 >>>>>> 51 0 0 >>>>>> 1 0 62280 2349476 584 27280612 0 0 0 87042 979 891 4 >>>>>> 0 >>>>>> 95 0 0 >>>>>> 1 0 62280 2330488 584 27280908 0 0 0 0 388 258 4 >>>>>> 0 >>>>>> 96 0 0 >>>>>> 1 0 62280 2314744 584 27281148 0 0 0 0 396 255 4 >>>>>> 0 >>>>>> 96 0 0 >>>>>> 1 0 62280 2297736 584 27281388 0 0 0 0 376 245 4 >>>>>> 0 >>>>>> 96 0 0 >>>>>> 2 0 62280 2546260 584 27245272 0 0 0 4 639 568 6 >>>>>> 1 >>>>>> 94 0 0 >>>>>> 2 0 62280 2464300 584 27245928 0 0 0 1716 936 1146 8 >>>>>> 0 >>>>>> 91 0 0 >>>>>> 2 0 62280 2394452 584 27246444 0 0 0 8 687 333 8 >>>>>> 0 >>>>>> 92 0 0 >>>>>> >>>>>> >>>>>> here is vmstat 1 uncommenting edge_relay >>>>>> >>>>>> # vmstat 1 >>>>>> procs -----------memory---------- ---swap-- -----io---- -system-- >>>>>> ------cpu----- >>>>>> r b swpd free buff cache si so bi bo in cs us sy >>>>>> id >>>>>> wa st >>>>>> 1 0 62280 3198672 584 26770624 0 1 21 596 0 1 46 >>>>>> 3 >>>>>> 51 0 0 >>>>>> 0 0 62280 3198904 584 26770664 0 0 0 0 151 247 0 >>>>>> 0 >>>>>> 100 0 0 >>>>>> 0 0 62280 3200332 584 26770664 0 0 0 0 128 233 0 >>>>>> 0 >>>>>> 100 0 0 >>>>>> 0 0 62280 3202244 584 26770664 0 0 0 4 172 315 0 >>>>>> 0 >>>>>> 100 0 0 >>>>>> 0 0 62280 3203132 584 26770664 0 0 0 0 132 218 0 >>>>>> 0 >>>>>> 100 0 0 >>>>>> 0 0 62280 3203912 584 26770668 0 0 0 8 203 347 0 >>>>>> 0 >>>>>> 100 0 0 >>>>>> >>>>>> >>>>>> things just are not running with it uncommented. >>>>>> >>>>>> I'm going to send you my config directly, it is a very heavy config. >>>>>> >>>>>> David Lang >>>>>> >>>>>> >>>>>> >>>>>> On Wed, 23 Mar 2016, singh.janmejay wrote: >>>>>> >>>>>>> On my cluster, each node (those 66 nodes) have 4 threads for ruleset >>>>>>> (it does everything, lognorm based parsing, lookups using loopup >>>>>>> table, re_extract, string concat, several if-else based conditions and >>>>>>> queues up for egress using omkafka and I generally see 3 being >>>>>>> utilized). This setup runs at ~37k msg/s (2.5M / 66 = 37k). >>>>>>> >>>>>>> 300k/Min is still 5k/s which should be doable on a single-thd (unless >>>>>>> your ruleset is much heavier than mine). Im thinking for a >>>>>>> equivalently complex ruleset 37k / 4 = 9k should be achievable. >>>>>>> >>>>>>> I'll try to reproduce this. Can you share out the exact commit-id you >>>>>>> are working with? >>>>>>> >>>>>>> On Tue, Mar 22, 2016 at 8:38 PM, David Lang <da...@lang.hm> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, 22 Mar 2016, singh.janmejay wrote: >>>>>>>> >>>>>>>>> On Tue, Mar 22, 2016 at 5:01 AM, David Lang <da...@lang.hm> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I added this to my configs >>>>>>>>>> >>>>>>>>>> # grep -B 1 -A 1 dyn_ /etc/rsyslog.conf >>>>>>>>>> # custom stats to track in rsyslog >>>>>>>>>> dyn_stats(name="msgs_per_host" resettable="on" >>>>>>>>>> maxCardinality="3000" >>>>>>>>>> unusedMetricLife="1200") >>>>>>>>>> dyn_stats(name="msgs_per_edge_relay" resettable="on" >>>>>>>>>> maxCardinality="3000" >>>>>>>>>> unusedMetricLife="1200") >>>>>>>>>> dyn_stats(name="msgs_per_core_relay" resettable="on" >>>>>>>>>> maxCardinality="3000" >>>>>>>>>> unusedMetricLife="1200") >>>>>>>>>> dyn_stats(name="msgs_per_program" resettable="on" >>>>>>>>>> maxCardinality="3000" >>>>>>>>>> unusedMetricLife="1200") >>>>>>>>>> dyn_stats(name="msgs_per_tag" resettable="on" maxCardinality="3000" >>>>>>>>>> unusedMetricLife="1200") >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> /var/log/sources-messages;sources >>>>>>>>>> set $.inc = dyn_inc("msgs_per_host", $hostname); >>>>>>>>>> set $.inc = dyn_inc("msgs_per_program", $programname); >>>>>>>>>> # if $!trusted!edge!relay != "" then { >>>>>>>>>> # set $.inc = dyn_inc("msgs_per_edge_relay", >>>>>>>>>> $!trusted!edge!relay); >>>>>>>>>> # } >>>>>>>>>> # if $!trusted!core!relay != "" then { >>>>>>>>>> # set $.inc = dyn_inc("msgs_per_edge_relay", >>>>>>>>>> $!trusted!core!relay); >>>>>>>>>> # } >>>>>>>>>> -- >>>>>>>>>> } >>>>>>>>>> #set $.inc = dyn_inc("msgs_per_tag", $.custommessage); >>>>>>>>>> /var/log/messages-tags;manual >>>>>>>>>> >>>>>>>>>> if I uncomment any of the lines that refer to $! or $. variables, >>>>>>>>>> rsyslog >>>>>>>>>> basically stops processing messages (a few messages seem to be >>>>>>>>>> processed >>>>>>>>>> in >>>>>>>>>> spurts, bit a lot of processing never happens) >>>>>>>>>> >>>>>>>>>> any chance that this has problems with locking around user defined >>>>>>>>>> variables? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Not sure, I run it in production (66 6-cpu(HT) haswell VMs handling >>>>>>>>> an >>>>>>>>> aggregate of ~2.5 M-msg / sec (~1.5kB each)) and haven't seen such >>>>>>>>> jitters, but then I don't use 3 level deep keys either and in my >>>>>>>>> case, >>>>>>>>> the feature is backported to 8.12 branch), so still uses json-c (not >>>>>>>>> libfastjson). I do use it with user-defined variables too, just not >>>>>>>>> nested as deep. >>>>>>>>> >>>>>>>>> We don't treat variables differently after dereferencing them in the >>>>>>>>> top-most layer handling rainerscript fn-calls, so $hostname and >>>>>>>>> $!x!y!z, as far as any rainerscript fn invocation is considered, are >>>>>>>>> the same (performance-wise) except for the top-level dereference >>>>>>>>> (outside of fn-impl-body). >>>>>>>>> >>>>>>>>> Some more info may come handy in reproducing it: >>>>>>>>> - What is the message throughput (in msg/s) and what is the avg msg >>>>>>>>> size? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> with these commented out as above, ~50k messages/min normal, ~200K >>>>>>>> message/min if I'm replaying queued logs (box saturated, see the >>>>>>>> other >>>>>>>> thread where 8.17 final seems to have slowed down from ~300K >>>>>>>> messages/min >>>>>>>> that I was getting with a 8.16+ git) >>>>>>>> >>>>>>>>> - Do you see messages being written to sources-messages file, while >>>>>>>>> they don't get written to messages-tags file? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Yes, but not all messages. I have an alarm that goes off if >>>>>>>> sources-messages >>>>>>>> isn't written to in two 35 second checks (70 seconds total), that >>>>>>>> alarm >>>>>>>> never fired. But there would be a burst of messages show up in >>>>>>>> sources-messages and then 20 or so seconds before another burst would >>>>>>>> show >>>>>>>> up. >>>>>>>> >>>>>>>>> - Do you see a significant jump in voluntary context switches(say >>>>>>>>> per >>>>>>>>> 10 seconds or even per second) when you uncomment those lines? How >>>>>>>>> big >>>>>>>>> is the jump? >>>>>>>>> - How does the observation change if you replace it with this kind >>>>>>>>> of >>>>>>>>> variable access pattern: >>>>>>>>> >>>>>>>>> set $.x = $!trusted!edge!relay; >>>>>>>>> if $.x != "" then { >>>>>>>>> set $.inc = dyn_inc("msgs_per_edge_relay", $.x); >>>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> will check. >>>>>>>> >>>>>>>>> - Is this stock 8.17 build? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> stock config, but against git as of yesterday. I needed to get >>>>>>>> liblognorm >>>>>>>> 2.0 and the current libjsonfast >>>>>>>> >>>>>>>>> - Is this x86-64 Linux? What distro (and version)? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ubuntu trusty (14.04) 64 bit. >>>>>>>> >>>>>>>>>> >>>>>>>>>> Also, the documentation doesn't say why we are setting $.inc to the >>>>>>>>>> result >>>>>>>>>> of the command, what does $.inc contain after the command is run? >>>>>>>>>> from >>>>>>>>>> the >>>>>>>>>> documentation, it looks like it contains an error code (at least >>>>>>>>>> the >>>>>>>>>> way >>>>>>>>>> it's used is similar to error code returns in other languages), but >>>>>>>>>> I >>>>>>>>>> don't >>>>>>>>>> see what the errors are in the doc page. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, it is the error-code. It has value 0 when successful, an >>>>>>>>> non-zero >>>>>>>>> when it fails to increment (it may fail due to maxCardinality being >>>>>>>>> hit or due to a contended for lock etc (it doesn't wait for the >>>>>>>>> lock, >>>>>>>>> it uses try-lock and doesn't increment the counter if it can't get a >>>>>>>>> shared-lock over the bucket). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> While I do have multiple threads running on the main queue, the >>>>>>>> places >>>>>>>> I >>>>>>>> am >>>>>>>> calling this are on action queues with only one worker, so the only >>>>>>>> lock >>>>>>>> conflict should be when the data is reported. >>>>>>>> >>>>>>>>> These are rsyslog error-codes. It does an error pass-thru. >>>>>>>>> >>>>>>>>> I'll add this to documentation (didn't know this was missed). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks. >>>>>>>> >>>>>>>> Sorry I didn't get a chance to run this pre-release, this is exactly >>>>>>>> why >>>>>>>> I >>>>>>>> try to use features like this pre-release to spot the things that the >>>>>>>> person >>>>>>>> developign them misses because they are too close to the problem. :-) >>>>>>>> >>>>>>>> >>>>>>>> David Lang >>>>>>>> _______________________________________________ >>>>>>>> rsyslog mailing list >>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>>>>> http://www.rsyslog.com/professional-services/ >>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >>>>>>>> myriad >>>>>>>> of >>>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >>>>>>>> DON'T >>>>>>>> LIKE THAT. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> rsyslog mailing list >>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>>> http://www.rsyslog.com/professional-services/ >>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >>>>>> myriad >>>>>> of >>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >>>>>> DON'T >>>>>> LIKE THAT. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> rsyslog mailing list >>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>> http://www.rsyslog.com/professional-services/ >>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >>>> of >>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T >>>> LIKE THAT. >>> >>> >>> >>> >>> >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com/professional-services/ >> What's up with rsyslog? Follow https://twitter.com/rgerhards >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of >> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T >> LIKE THAT. > > > > -- > Regards, > Janmejay > http://codehunk.wordpress.com -- Regards, Janmejay http://codehunk.wordpress.com _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.