Excerpts from David Lang's message of 2014-12-19 18:25:27 +0100: > > On Fri, 19 Dec 2014, Marc Fournier wrote: > > > Excerpts from David Lang's message of 2014-12-18 19:03:54 +0100: > >> On Wed, 17 Dec 2014, Marc Fournier wrote: > >> > >>> Excerpts from David Lang's message of 2014-12-16 20:41:49 +0100: > >>>> On Tue, 16 Dec 2014, Marc Fournier wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> openldap writes its logs using syslog(3). Three log messages are emitted > >>>>> for each ldap search operation. > >>>>> > >>>>> On a given openldap setup, I reach about 2500 operations per second. The > >>>>> simple fact of shutting rsyslog down icreases this number up to 5000 > >>>>> (logs are lost obviously). > >>>>> > >>>>> Writing to a local file, to /dev/null, or to a remote server over UDP > >>>>> doesn't significantly change this number. I tried tweaking various > >>>>> settings related to the main queue but didn't see any huge difference. > >>>>> > >>>>> ipmstats shows exactly the same numbers for "imuxsock: submitted" than > >>>>> for "main Q: enqueued". All the discarded/full/failed/etc counters have > >>>>> a value of 0. > >>>>> > >>>>> This is rsyslog-8.6.0-2.el6.x86_64 from the adiscon RPM repo (but had > >>>>> similar problems with version 5.8.10-9.el6_6 which comes with el6). > >>>>> > >>> > >>> Ok, right my config is a mess I agree ! Been iterating over it since > >>> back when I used 5.8.10. > >>> > >>> So I started over with a brand new config. The best results I got so far > >>> was with these settings: > >>> > >>> module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log" > >>> resetCounters="off" interval="5") > >>> module(load="imuxsock" SysSock.RateLimit.Interval="0" > >>> SysSock.Annotate="off" SysSock.FlowControl="off" > >>> SysSock.IgnoreTimestamp="off") > >>> > >>> $MaxMessageSize 2k > >>> > >>> $MainMsgQueueDequeueBatchSize 4096 > >>> $MainMsgQueueType LinkedList > >>> $MainMsgQueueSize 250000 > >>> $MainMsgQueueWorkerThreads 4 > >>> $MainMsgQueueWorkerThreadMinimumMessages 100000 > >>> $RepeatedMsgReduction off > >>> $SystemLogRateLimitInterval 0 > >>> > >>> *.* ~ > >>> > >>> The numbers are slightly better now: ~3500 ops/s vs 2500 yesterday. But > >>> still a significant difference with the max I reach when running with > >>> rsyslogd stopped (~5000 ops/s). > >>> > >>> Is this an expected compromise ? Or are there possible improvements I > >>> can do ? > >> > >> I'm not seeing anything obviously wrong here, does impstats show anything > >> interesting (errors, etc) > > > > No, no errors. The discard/full counters also stay at 0. > > > >> I would actually reduce the worker threads down to 1 (it's > >> counter-intuitive, > >> but having too many threads in rsyslog can actually slow things down > >> because of > >> the overhead involved with locking and unlocking the queues) > > > > Ok, tried that. No significant change with the issue I'm facing though. > > > > Meanwhile I improved my stress-test setup, and upgraded the target > > machine to a quad-core (I was suspecting competition for cpu time > > between openldap and rsyslog on the previous single-core machine). > > > > The good news is that I can reproduce the problem much more reliably. > > The bad news is that numbers are even worse and that I have proof that > > the bottleneck lies in rsyslog. Here are my latest observations: > > > > - no rsyslog running: 15k to 17k ldap operations/second > > - rsyslog with above config, except for MainMsgQueueWorkerThreads back > > to 1: down to 7.5k ops/s > > wihtout rsyslog running, there shouldn't even be a /dev/log to write to, so > any > logging should be a noop. > > are you running systemd by any chance?
No. All this is on an up to date rhel6, which predates systemd. I basically wanted to compare the throughput of openldap with vs without logging. My point here is that openldap has no way of knowing that nothing is reading /dev/log. The same code paths leading to syslog() are used, etc. It apparently just suffers from the performance issues downstream. > let's simplify the config more, try making it just > > module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log" > resetCounters="off" interval="5") > module(load="imuxsock" SysSock.RateLimit.Interval="0") > $MaxMessageSize 2k > $MainMsgQueueDequeueBatchSize 4096 > $RepeatedMsgReduction off > > *.* ~ > > The next step I would try is to change the parser settings to avoid wasting > effort trying to parse the message, but I would be a bit surprised if that > was > the problem here. Thanks ! I'll give this a try on monday. > The fact that rsyslog is spending so much time in select puzzles me (the rest > makes sense). > > Rainer is going to need to take a look at this, but we are about to loose him > for Christmas vacation (if we haven't already) I'm sure he'll come back :) I also have a few days off, no problem resuming the discussion in January. Thanks for caring David, much appreciated ! Cheers, Marc > > - when monitoring openldap's threads state, a surge of threads in > > "pending" state as soon as rsyslog is started. > > - a trivial syslog-ng config, reading from /dev/log and sending > > everything to /dev/null, results in only about 10% ops/s decrease. > > - also the pending threads count stay close to zero when using > > syslog-ng. > > > > To recap, the starting point of this issue was that openldap temporarily > > stopped responding when a sudden surge of requests came in (ie: critical > > service disruption for users). And I found out this problem didn't > > happen when logging was disabled. > > > > As the performance numbers we can read on the web about rsyslog are an > > order of magnitude bigger *but* usually relate to the imtcp/imudp > > modules, I'm wondering if there possibly was a problem with the imuxsock > > module (FWIW "in:imuxsock" is constantly in the top 3 of "top -H" during > > my tests) ? > > > > Also no idea what value this has, but strace shows that rsyslog spends > > significantly more time in system calls than syslog-ng, in my setup: > > > > # timeout 10 strace -c -fp $(pidof rsyslogd) > > Process 27446 attached with 4 threads - interrupt to quit > > Process 27446 detached > > Process 27447 detached > > Process 27448 detached > > Process 27449 detached > > % time seconds usecs/call calls errors syscall > > ------ ----------- ----------- --------- --------- ---------------- > > 94.85 2.981049 49 61079 select > > 4.39 0.137935 0 287362 55920 futex > > 0.68 0.021477 0 61077 recvmsg > > 0.07 0.002301 0 61084 gettimeofday > > 0.00 0.000000 0 6 writev > > 0.00 0.000000 0 2 getrusage > > ------ ----------- ----------- --------- --------- ---------------- > > 100.00 3.142762 470610 55920 total > > > > # timeout 10 strace -c -fp $(pgrep -f ^syslog-ng) > > Process 29477 attached - interrupt to quit > > Process 29477 detached > > % time seconds usecs/call calls errors syscall > > ------ ----------- ----------- --------- --------- ---------------- > > 41.39 0.001228 0 359614 gettimeofday > > 23.12 0.000686 0 1774 1 read > > 15.67 0.000465 0 105768 lseek > > 15.57 0.000462 0 105767 write > > 4.25 0.000126 0 21153 poll > > 0.00 0.000000 0 114 stat > > 0.00 0.000000 0 1 restart_syscall > > ------ ----------- ----------- --------- --------- ---------------- > > 100.00 0.002967 594191 1 total > > > > > > Thanks a lot for your guidance so far ! > > > > Cheers, > > Marc > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com/professional-services/ > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of > > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T > > LIKE THAT. > > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

