David,

Did a few runs with those lines present / commented out with
rsyslog-master. I see no difference in either on-cpu or off-cpu
performance. The left-hand-side is with edge-relay and core-relay
lines being active, and the right-side has them commented out. Here is
the cpu profile:
https://drive.google.com/open?id=0B_XhUZLNFT4dWDFOZlB6VW1hc0E
The hypercall on top is cpu-idle call. This was on a c1.4large box on ec2.

Both were on the same version of Rsyslog (built off master). I'll do
another round of tests with the version I run on production (which is
patched fork of 8.12, afair) to identify any large change in cpu
profile.

This is a much more simplified version of config though. I extracted
valid (non garbled) json messages from debug log(got 9089 lines) and
pumped them in 100 times using tcpflood.

tcpflood call was timed, it took between 19.7 - 20.5 seconds for each
run regardless of edge and core relay counters being incremented. It
was ~ 45k / sec (but again, neither the load profile, not the machine
is not exactly like yours), so its not apples to apples.


On Fri, Apr 8, 2016 at 12:15 PM, singh.janmejay
<singh.janme...@gmail.com> wrote:
> It doesn't treat them any different. It takes the key and puts it in a
> hash-table and set an accumulator as its value. Then it'll look it up
> every time you ask it to increment the counter followed by actual
> increment. That is about it.
>
> The only way I can foresee the contents affecting its behavior is the
> hash-fn takes more CPU time when given a large string. But that should
> be on-cpu and not off-cpu time.
>
> Today im working on a getting an Rsyslog build running with config
> into which I inject messages from your debug-log (i'll extract
> messages and tcpflood them in) and try to understand why it behaves
> different between those lines commented out or not.
>
> If I fail to reproduce it in this simple setup, we'll have to track
> backtraces that are taking Rsyslog off-cpu in your environment.
>
> On Fri, Mar 25, 2016 at 12:57 AM, David Lang <da...@lang.hm> wrote:
>> doing a little more digging (and some accidental stuff), is it possible that
>> it's running into grief if the contents of the variable are not a simple
>> word (spaces or other funny characters in the value)?
>>
>> David Lang
>>
>> On Wed, 23 Mar 2016, singh.janmejay wrote:
>>
>>> Date: Wed, 23 Mar 2016 01:07:49 +0530
>>>
>>> From: singh.janmejay <singh.janme...@gmail.com>
>>> Reply-To: rsyslog-users <rsyslog@lists.adiscon.com>
>>> To: rsyslog-users <rsyslog@lists.adiscon.com>
>>> Subject: Re: [rsyslog] dynastats problems with user defined variables?
>>>
>>> Yep, it looked like that. Its interesting though, its 100% off CPU
>>> (very unique).
>>>
>>> On Wed, Mar 23, 2016 at 1:02 AM, David Lang <da...@lang.hm> wrote:
>>>>
>>>> Yes, the first one is with the user variables commented out, exactly what
>>>> I
>>>> pasted in from the grep.
>>>>
>>>> the second is enabling the dyn_inc for edge relays.
>>>>
>>>> David Lang
>>>>
>>>> On Wed, 23 Mar 2016, singh.janmejay wrote:
>>>>
>>>>> Date: Wed, 23 Mar 2016 00:51:18 +0530
>>>>> From: singh.janmejay <singh.janme...@gmail.com>
>>>>> Reply-To: rsyslog-users <rsyslog@lists.adiscon.com>
>>>>> To: rsyslog-users <rsyslog@lists.adiscon.com>
>>>>> Subject: Re: [rsyslog] dynastats problems with user defined variables?
>>>>>
>>>>> You meant the first one was with "commented out" version right? (you
>>>>> said uncommenting edge_relay, so double-checking).
>>>>>
>>>>> On Wed, Mar 23, 2016 at 12:38 AM, David Lang <da...@lang.hm> wrote:
>>>>>>
>>>>>>
>>>>>> here is vmstat 1 running
>>>>>>
>>>>>> # vmstat 1
>>>>>> procs -----------memory---------- ---swap-- -----io---- -system--
>>>>>> ------cpu-----
>>>>>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
>>>>>> id
>>>>>> wa st
>>>>>>  2  0  62280 2363804    584 27280376    0    1    21   596    0    1 46
>>>>>> 3
>>>>>> 51  0 0
>>>>>>  1  0  62280 2349476    584 27280612    0    0     0 87042  979  891  4
>>>>>> 0
>>>>>> 95  0 0
>>>>>>  1  0  62280 2330488    584 27280908    0    0     0     0  388  258  4
>>>>>> 0
>>>>>> 96  0 0
>>>>>>  1  0  62280 2314744    584 27281148    0    0     0     0  396  255  4
>>>>>> 0
>>>>>> 96  0 0
>>>>>>  1  0  62280 2297736    584 27281388    0    0     0     0  376  245  4
>>>>>> 0
>>>>>> 96  0 0
>>>>>>  2  0  62280 2546260    584 27245272    0    0     0     4  639  568  6
>>>>>> 1
>>>>>> 94  0 0
>>>>>>  2  0  62280 2464300    584 27245928    0    0     0  1716  936 1146  8
>>>>>> 0
>>>>>> 91  0 0
>>>>>>  2  0  62280 2394452    584 27246444    0    0     0     8  687  333  8
>>>>>> 0
>>>>>> 92  0 0
>>>>>>
>>>>>>
>>>>>> here is vmstat 1 uncommenting edge_relay
>>>>>>
>>>>>> # vmstat 1
>>>>>> procs -----------memory---------- ---swap-- -----io---- -system--
>>>>>> ------cpu-----
>>>>>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
>>>>>> id
>>>>>> wa st
>>>>>>  1  0  62280 3198672    584 26770624    0    1    21   596    0    1 46
>>>>>> 3
>>>>>> 51  0 0
>>>>>>  0  0  62280 3198904    584 26770664    0    0     0     0  151  247  0
>>>>>> 0
>>>>>> 100 0  0
>>>>>>  0  0  62280 3200332    584 26770664    0    0     0     0  128  233  0
>>>>>> 0
>>>>>> 100 0  0
>>>>>>  0  0  62280 3202244    584 26770664    0    0     0     4  172  315  0
>>>>>> 0
>>>>>> 100 0  0
>>>>>>  0  0  62280 3203132    584 26770664    0    0     0     0  132  218  0
>>>>>> 0
>>>>>> 100 0  0
>>>>>>  0  0  62280 3203912    584 26770668    0    0     0     8  203  347  0
>>>>>> 0
>>>>>> 100 0  0
>>>>>>
>>>>>>
>>>>>> things just are not running with it uncommented.
>>>>>>
>>>>>> I'm going to send you my config directly, it is a very heavy config.
>>>>>>
>>>>>> David Lang
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, 23 Mar 2016, singh.janmejay wrote:
>>>>>>
>>>>>>> On my cluster, each node (those 66 nodes) have 4 threads for ruleset
>>>>>>> (it does everything, lognorm based parsing, lookups using loopup
>>>>>>> table, re_extract, string concat, several if-else based conditions and
>>>>>>> queues up for egress using omkafka and I generally see 3 being
>>>>>>> utilized). This setup runs at ~37k msg/s (2.5M / 66 = 37k).
>>>>>>>
>>>>>>> 300k/Min is still 5k/s which should be doable on a single-thd (unless
>>>>>>> your ruleset is much heavier than mine). Im thinking for a
>>>>>>> equivalently complex ruleset 37k / 4  = 9k should be achievable.
>>>>>>>
>>>>>>> I'll try to reproduce this. Can you share out the exact commit-id you
>>>>>>> are working with?
>>>>>>>
>>>>>>> On Tue, Mar 22, 2016 at 8:38 PM, David Lang <da...@lang.hm> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, 22 Mar 2016, singh.janmejay wrote:
>>>>>>>>
>>>>>>>>> On Tue, Mar 22, 2016 at 5:01 AM, David Lang <da...@lang.hm> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I added this to my configs
>>>>>>>>>>
>>>>>>>>>> # grep -B 1 -A 1 dyn_ /etc/rsyslog.conf
>>>>>>>>>> # custom stats to track in rsyslog
>>>>>>>>>> dyn_stats(name="msgs_per_host" resettable="on"
>>>>>>>>>> maxCardinality="3000"
>>>>>>>>>> unusedMetricLife="1200")
>>>>>>>>>> dyn_stats(name="msgs_per_edge_relay" resettable="on"
>>>>>>>>>> maxCardinality="3000"
>>>>>>>>>> unusedMetricLife="1200")
>>>>>>>>>> dyn_stats(name="msgs_per_core_relay" resettable="on"
>>>>>>>>>> maxCardinality="3000"
>>>>>>>>>> unusedMetricLife="1200")
>>>>>>>>>> dyn_stats(name="msgs_per_program" resettable="on"
>>>>>>>>>> maxCardinality="3000"
>>>>>>>>>> unusedMetricLife="1200")
>>>>>>>>>> dyn_stats(name="msgs_per_tag" resettable="on" maxCardinality="3000"
>>>>>>>>>> unusedMetricLife="1200")
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>   /var/log/sources-messages;sources
>>>>>>>>>>   set $.inc = dyn_inc("msgs_per_host", $hostname);
>>>>>>>>>>   set $.inc = dyn_inc("msgs_per_program", $programname);
>>>>>>>>>> #  if $!trusted!edge!relay != "" then {
>>>>>>>>>> #    set $.inc = dyn_inc("msgs_per_edge_relay",
>>>>>>>>>> $!trusted!edge!relay);
>>>>>>>>>> #  }
>>>>>>>>>> #  if $!trusted!core!relay != "" then {
>>>>>>>>>> #    set $.inc = dyn_inc("msgs_per_edge_relay",
>>>>>>>>>> $!trusted!core!relay);
>>>>>>>>>> #  }
>>>>>>>>>> --
>>>>>>>>>>     }
>>>>>>>>>>     #set $.inc = dyn_inc("msgs_per_tag", $.custommessage);
>>>>>>>>>>     /var/log/messages-tags;manual
>>>>>>>>>>
>>>>>>>>>> if I uncomment any of the lines that refer to $! or $. variables,
>>>>>>>>>> rsyslog
>>>>>>>>>> basically stops processing messages (a few messages seem to be
>>>>>>>>>> processed
>>>>>>>>>> in
>>>>>>>>>> spurts, bit a lot of processing never happens)
>>>>>>>>>>
>>>>>>>>>> any chance that this has problems with locking around user defined
>>>>>>>>>> variables?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Not sure, I run it in production (66 6-cpu(HT) haswell VMs handling
>>>>>>>>> an
>>>>>>>>> aggregate of ~2.5 M-msg / sec (~1.5kB each)) and haven't seen such
>>>>>>>>> jitters, but then I don't use 3 level deep keys either and in my
>>>>>>>>> case,
>>>>>>>>> the feature is backported to 8.12 branch), so still uses json-c (not
>>>>>>>>> libfastjson). I do use it with user-defined variables too, just not
>>>>>>>>> nested as deep.
>>>>>>>>>
>>>>>>>>> We don't treat variables differently after dereferencing them in the
>>>>>>>>> top-most layer handling rainerscript fn-calls, so $hostname and
>>>>>>>>> $!x!y!z, as far as any rainerscript fn invocation is considered, are
>>>>>>>>> the same (performance-wise) except for the top-level dereference
>>>>>>>>> (outside of fn-impl-body).
>>>>>>>>>
>>>>>>>>> Some more info may come handy in reproducing it:
>>>>>>>>> - What is the message throughput (in msg/s) and what is the avg msg
>>>>>>>>> size?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> with these commented out as above, ~50k messages/min normal, ~200K
>>>>>>>> message/min if I'm replaying queued logs (box saturated, see the
>>>>>>>> other
>>>>>>>> thread where 8.17 final seems to have slowed down from ~300K
>>>>>>>> messages/min
>>>>>>>> that I was getting with a 8.16+ git)
>>>>>>>>
>>>>>>>>> - Do you see messages being written to sources-messages file, while
>>>>>>>>> they don't get written to messages-tags file?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, but not all messages. I have an alarm that goes off if
>>>>>>>> sources-messages
>>>>>>>> isn't written to in two 35 second checks (70 seconds total), that
>>>>>>>> alarm
>>>>>>>> never fired. But there would be a burst of messages show up in
>>>>>>>> sources-messages and then 20 or so seconds before another burst would
>>>>>>>> show
>>>>>>>> up.
>>>>>>>>
>>>>>>>>> - Do you see a significant jump in voluntary context switches(say
>>>>>>>>> per
>>>>>>>>> 10 seconds or even per second) when you uncomment those lines? How
>>>>>>>>> big
>>>>>>>>> is the jump?
>>>>>>>>> - How does the observation change if you replace it with this kind
>>>>>>>>> of
>>>>>>>>> variable access pattern:
>>>>>>>>>
>>>>>>>>> set $.x = $!trusted!edge!relay;
>>>>>>>>> if $.x != "" then {
>>>>>>>>>  set $.inc = dyn_inc("msgs_per_edge_relay", $.x);
>>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> will check.
>>>>>>>>
>>>>>>>>> - Is this stock 8.17 build?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> stock config, but against git as of yesterday. I needed to get
>>>>>>>> liblognorm
>>>>>>>> 2.0 and the current libjsonfast
>>>>>>>>
>>>>>>>>> - Is this x86-64 Linux? What distro (and version)?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ubuntu trusty (14.04) 64 bit.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Also, the documentation doesn't say why we are setting $.inc to the
>>>>>>>>>> result
>>>>>>>>>> of the command, what does $.inc contain after the command is run?
>>>>>>>>>> from
>>>>>>>>>> the
>>>>>>>>>> documentation, it looks like it contains an error code (at least
>>>>>>>>>> the
>>>>>>>>>> way
>>>>>>>>>> it's used is similar to error code returns in other languages), but
>>>>>>>>>> I
>>>>>>>>>> don't
>>>>>>>>>> see what the errors are in the doc page.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, it is the error-code. It has value 0 when successful, an
>>>>>>>>> non-zero
>>>>>>>>> when it fails to increment (it may fail due to maxCardinality being
>>>>>>>>> hit or due to a contended for lock etc (it doesn't wait for the
>>>>>>>>> lock,
>>>>>>>>> it uses try-lock and doesn't increment the counter if it can't get a
>>>>>>>>> shared-lock over the bucket).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> While I do have multiple threads running on the main queue, the
>>>>>>>> places
>>>>>>>> I
>>>>>>>> am
>>>>>>>> calling this are on action queues with only one worker, so the only
>>>>>>>> lock
>>>>>>>> conflict should be when the data is reported.
>>>>>>>>
>>>>>>>>> These are rsyslog error-codes. It does an error pass-thru.
>>>>>>>>>
>>>>>>>>> I'll add this to documentation (didn't know this was missed).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Sorry I didn't get a chance to run this pre-release, this is exactly
>>>>>>>> why
>>>>>>>> I
>>>>>>>> try to use features like this pre-release to spot the things that the
>>>>>>>> person
>>>>>>>> developign them misses because they are too close to the problem. :-)
>>>>>>>>
>>>>>>>>
>>>>>>>> David Lang
>>>>>>>> _______________________________________________
>>>>>>>> rsyslog mailing list
>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>> myriad
>>>>>>>> of
>>>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>>>> DON'T
>>>>>>>> LIKE THAT.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> rsyslog mailing list
>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>> http://www.rsyslog.com/professional-services/
>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>> myriad
>>>>>> of
>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>> DON'T
>>>>>> LIKE THAT.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> of
>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>>>> LIKE THAT.
>>>
>>>
>>>
>>>
>>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>> LIKE THAT.
>
>
>
> --
> Regards,
> Janmejay
> http://codehunk.wordpress.com



-- 
Regards,
Janmejay
http://codehunk.wordpress.com
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to