You meant the first one was with "commented out" version right? (you
said uncommenting edge_relay, so double-checking).

On Wed, Mar 23, 2016 at 12:38 AM, David Lang <[email protected]> wrote:
> here is vmstat 1 running
>
> # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
> wa st
>  2  0  62280 2363804    584 27280376    0    1    21   596    0    1 46  3
> 51  0 0
>  1  0  62280 2349476    584 27280612    0    0     0 87042  979  891  4  0
> 95  0 0
>  1  0  62280 2330488    584 27280908    0    0     0     0  388  258  4  0
> 96  0 0
>  1  0  62280 2314744    584 27281148    0    0     0     0  396  255  4  0
> 96  0 0
>  1  0  62280 2297736    584 27281388    0    0     0     0  376  245  4  0
> 96  0 0
>  2  0  62280 2546260    584 27245272    0    0     0     4  639  568  6  1
> 94  0 0
>  2  0  62280 2464300    584 27245928    0    0     0  1716  936 1146  8  0
> 91  0 0
>  2  0  62280 2394452    584 27246444    0    0     0     8  687  333  8  0
> 92  0 0
>
>
> here is vmstat 1 uncommenting edge_relay
>
> # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
> wa st
>  1  0  62280 3198672    584 26770624    0    1    21   596    0    1 46  3
> 51  0 0
>  0  0  62280 3198904    584 26770664    0    0     0     0  151  247  0  0
> 100 0  0
>  0  0  62280 3200332    584 26770664    0    0     0     0  128  233  0  0
> 100 0  0
>  0  0  62280 3202244    584 26770664    0    0     0     4  172  315  0  0
> 100 0  0
>  0  0  62280 3203132    584 26770664    0    0     0     0  132  218  0  0
> 100 0  0
>  0  0  62280 3203912    584 26770668    0    0     0     8  203  347  0  0
> 100 0  0
>
>
> things just are not running with it uncommented.
>
> I'm going to send you my config directly, it is a very heavy config.
>
> David Lang
>
>
>
> On Wed, 23 Mar 2016, singh.janmejay wrote:
>
>> On my cluster, each node (those 66 nodes) have 4 threads for ruleset
>> (it does everything, lognorm based parsing, lookups using loopup
>> table, re_extract, string concat, several if-else based conditions and
>> queues up for egress using omkafka and I generally see 3 being
>> utilized). This setup runs at ~37k msg/s (2.5M / 66 = 37k).
>>
>> 300k/Min is still 5k/s which should be doable on a single-thd (unless
>> your ruleset is much heavier than mine). Im thinking for a
>> equivalently complex ruleset 37k / 4  = 9k should be achievable.
>>
>> I'll try to reproduce this. Can you share out the exact commit-id you
>> are working with?
>>
>> On Tue, Mar 22, 2016 at 8:38 PM, David Lang <[email protected]> wrote:
>>>
>>> On Tue, 22 Mar 2016, singh.janmejay wrote:
>>>
>>>> On Tue, Mar 22, 2016 at 5:01 AM, David Lang <[email protected]> wrote:
>>>>>
>>>>>
>>>>> I added this to my configs
>>>>>
>>>>> # grep -B 1 -A 1 dyn_ /etc/rsyslog.conf
>>>>> # custom stats to track in rsyslog
>>>>> dyn_stats(name="msgs_per_host" resettable="on" maxCardinality="3000"
>>>>> unusedMetricLife="1200")
>>>>> dyn_stats(name="msgs_per_edge_relay" resettable="on"
>>>>> maxCardinality="3000"
>>>>> unusedMetricLife="1200")
>>>>> dyn_stats(name="msgs_per_core_relay" resettable="on"
>>>>> maxCardinality="3000"
>>>>> unusedMetricLife="1200")
>>>>> dyn_stats(name="msgs_per_program" resettable="on" maxCardinality="3000"
>>>>> unusedMetricLife="1200")
>>>>> dyn_stats(name="msgs_per_tag" resettable="on" maxCardinality="3000"
>>>>> unusedMetricLife="1200")
>>>>>
>>>>> --
>>>>>   /var/log/sources-messages;sources
>>>>>   set $.inc = dyn_inc("msgs_per_host", $hostname);
>>>>>   set $.inc = dyn_inc("msgs_per_program", $programname);
>>>>> #  if $!trusted!edge!relay != "" then {
>>>>> #    set $.inc = dyn_inc("msgs_per_edge_relay", $!trusted!edge!relay);
>>>>> #  }
>>>>> #  if $!trusted!core!relay != "" then {
>>>>> #    set $.inc = dyn_inc("msgs_per_edge_relay", $!trusted!core!relay);
>>>>> #  }
>>>>> --
>>>>>     }
>>>>>     #set $.inc = dyn_inc("msgs_per_tag", $.custommessage);
>>>>>     /var/log/messages-tags;manual
>>>>>
>>>>> if I uncomment any of the lines that refer to $! or $. variables,
>>>>> rsyslog
>>>>> basically stops processing messages (a few messages seem to be
>>>>> processed
>>>>> in
>>>>> spurts, bit a lot of processing never happens)
>>>>>
>>>>> any chance that this has problems with locking around user defined
>>>>> variables?
>>>>
>>>>
>>>>
>>>> Not sure, I run it in production (66 6-cpu(HT) haswell VMs handling an
>>>> aggregate of ~2.5 M-msg / sec (~1.5kB each)) and haven't seen such
>>>> jitters, but then I don't use 3 level deep keys either and in my case,
>>>> the feature is backported to 8.12 branch), so still uses json-c (not
>>>> libfastjson). I do use it with user-defined variables too, just not
>>>> nested as deep.
>>>>
>>>> We don't treat variables differently after dereferencing them in the
>>>> top-most layer handling rainerscript fn-calls, so $hostname and
>>>> $!x!y!z, as far as any rainerscript fn invocation is considered, are
>>>> the same (performance-wise) except for the top-level dereference
>>>> (outside of fn-impl-body).
>>>>
>>>> Some more info may come handy in reproducing it:
>>>> - What is the message throughput (in msg/s) and what is the avg msg
>>>> size?
>>>
>>>
>>>
>>> with these commented out as above, ~50k messages/min normal, ~200K
>>> message/min if I'm replaying queued logs (box saturated, see the other
>>> thread where 8.17 final seems to have slowed down from ~300K messages/min
>>> that I was getting with a 8.16+ git)
>>>
>>>> - Do you see messages being written to sources-messages file, while
>>>> they don't get written to messages-tags file?
>>>
>>>
>>>
>>> Yes, but not all messages. I have an alarm that goes off if
>>> sources-messages
>>> isn't written to in two 35 second checks (70 seconds total), that alarm
>>> never fired. But there would be a burst of messages show up in
>>> sources-messages and then 20 or so seconds before another burst would
>>> show
>>> up.
>>>
>>>> - Do you see a significant jump in voluntary context switches(say per
>>>> 10 seconds or even per second) when you uncomment those lines? How big
>>>> is the jump?
>>>> - How does the observation change if you replace it with this kind of
>>>> variable access pattern:
>>>>
>>>> set $.x = $!trusted!edge!relay;
>>>> if $.x != "" then {
>>>>  set $.inc = dyn_inc("msgs_per_edge_relay", $.x);
>>>> }
>>>
>>>
>>>
>>> will check.
>>>
>>>> - Is this stock 8.17 build?
>>>
>>>
>>>
>>> stock config, but against git as of yesterday. I needed to get liblognorm
>>> 2.0 and the current libjsonfast
>>>
>>>> - Is this x86-64 Linux? What distro (and version)?
>>>
>>>
>>>
>>> ubuntu trusty (14.04) 64 bit.
>>>
>>>>>
>>>>> Also, the documentation doesn't say why we are setting $.inc to the
>>>>> result
>>>>> of the command, what does $.inc contain after the command is run? from
>>>>> the
>>>>> documentation, it looks like it contains an error code (at least the
>>>>> way
>>>>> it's used is similar to error code returns in other languages), but I
>>>>> don't
>>>>> see what the errors are in the doc page.
>>>>
>>>>
>>>>
>>>> Yes, it is the error-code. It has value 0 when successful, an non-zero
>>>> when it fails to increment (it may fail due to maxCardinality being
>>>> hit or due to a contended for lock etc (it doesn't wait for the lock,
>>>> it uses try-lock and doesn't increment the counter if it can't get a
>>>> shared-lock over the bucket).
>>>
>>>
>>>
>>> While I do have multiple threads running on the main queue, the places I
>>> am
>>> calling this are on action queues with only one worker, so the only lock
>>> conflict should be when the data is reported.
>>>
>>>> These are rsyslog error-codes. It does an error pass-thru.
>>>>
>>>> I'll add this to documentation (didn't know this was missed).
>>>
>>>
>>>
>>> Thanks.
>>>
>>> Sorry I didn't get a chance to run this pre-release, this is exactly why
>>> I
>>> try to use features like this pre-release to spot the things that the
>>> person
>>> developign them misses because they are too close to the problem. :-)
>>>
>>>
>>> David Lang
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com/professional-services/
>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>> of
>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>>> LIKE THAT.
>>
>>
>>
>>
>>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
> LIKE THAT.



-- 
Regards,
Janmejay
http://codehunk.wordpress.com
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to