Excerpts from David Lang's message of 2014-12-19 18:25:27 +0100:
> 
> On Fri, 19 Dec 2014, Marc Fournier wrote:
> 
> > Excerpts from David Lang's message of 2014-12-18 19:03:54 +0100:
> >> On Wed, 17 Dec 2014, Marc Fournier wrote:
> >>
> >>> Excerpts from David Lang's message of 2014-12-16 20:41:49 +0100:
> >>>> On Tue, 16 Dec 2014, Marc Fournier wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> openldap writes its logs using syslog(3). Three log messages are emitted
> >>>>> for each ldap search operation.
> >>>>>
> >>>>> On a given openldap setup, I reach about 2500 operations per second. The
> >>>>> simple fact of shutting rsyslog down icreases this number up to 5000
> >>>>> (logs are lost obviously).
> >>>>>
> >>>>> Writing to a local file, to /dev/null, or to a remote server over UDP
> >>>>> doesn't significantly change this number. I tried tweaking various
> >>>>> settings related to the main queue but didn't see any huge difference.
> >>>>>
> >>>>> ipmstats shows exactly the same numbers for "imuxsock: submitted" than
> >>>>> for "main Q: enqueued". All the discarded/full/failed/etc counters have
> >>>>> a value of 0.
> >>>>>
> >>>>> This is rsyslog-8.6.0-2.el6.x86_64 from the adiscon RPM repo (but had
> >>>>> similar problems with version 5.8.10-9.el6_6 which comes with el6).
> >>>>>
> >>>
> >>> Ok, right my config is a mess I agree ! Been iterating over it since
> >>> back when I used 5.8.10.
> >>>
> >>> So I started over with a brand new config. The best results I got so far
> >>> was with these settings:
> >>>
> >>> module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log" 
> >>> resetCounters="off" interval="5")
> >>> module(load="imuxsock" SysSock.RateLimit.Interval="0" 
> >>> SysSock.Annotate="off" SysSock.FlowControl="off" 
> >>> SysSock.IgnoreTimestamp="off")
> >>>
> >>> $MaxMessageSize 2k
> >>>
> >>> $MainMsgQueueDequeueBatchSize 4096
> >>> $MainMsgQueueType LinkedList
> >>> $MainMsgQueueSize 250000
> >>> $MainMsgQueueWorkerThreads 4
> >>> $MainMsgQueueWorkerThreadMinimumMessages 100000
> >>> $RepeatedMsgReduction off
> >>> $SystemLogRateLimitInterval 0
> >>>
> >>> *.*  ~
> >>>
> >>> The numbers are slightly better now: ~3500 ops/s vs 2500 yesterday. But
> >>> still a significant difference with the max I reach when running with
> >>> rsyslogd stopped (~5000 ops/s).
> >>>
> >>> Is this an expected compromise ? Or are there possible improvements I
> >>> can do ?
> >>
> >> I'm not seeing anything obviously wrong here, does impstats show anything
> >> interesting (errors, etc)
> >
> > No, no errors. The discard/full counters also stay at 0.
> >
> >> I would actually reduce the worker threads down to 1 (it's 
> >> counter-intuitive,
> >> but having too many threads in rsyslog can actually slow things down 
> >> because of
> >> the overhead involved with locking and unlocking the queues)
> >
> > Ok, tried that. No significant change with the issue I'm facing though.
> >
> > Meanwhile I improved my stress-test setup, and upgraded the target
> > machine to a quad-core (I was suspecting competition for cpu time
> > between openldap and rsyslog on the previous single-core machine).
> >
> > The good news is that I can reproduce the problem much more reliably.
> > The bad news is that numbers are even worse and that I have proof that
> > the bottleneck lies in rsyslog. Here are my latest observations:
> >
> > - no rsyslog running: 15k to 17k ldap operations/second
> > - rsyslog with above config, except for MainMsgQueueWorkerThreads back
> >   to 1: down to 7.5k ops/s
> 
> wihtout rsyslog running, there shouldn't even be a /dev/log to write to, so 
> any 
> logging should be a noop.
> 
> are you running systemd by any chance?

No. All this is on an up to date rhel6, which predates systemd.

I basically wanted to compare the throughput of openldap with vs without
logging.

My point here is that openldap has no way of knowing that nothing is
reading /dev/log. The same code paths leading to syslog() are used, etc.
It apparently just suffers from the performance issues downstream.

> let's simplify the config more, try making it just
> 
> module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log" 
> resetCounters="off" interval="5")
> module(load="imuxsock" SysSock.RateLimit.Interval="0")
> $MaxMessageSize 2k
> $MainMsgQueueDequeueBatchSize 4096
> $RepeatedMsgReduction off
> 
> *.*  ~
> 
> The next step I would try is to change the parser settings to avoid wasting 
> effort trying to parse the message, but I would be a bit surprised if that 
> was 
> the problem here.

Thanks ! I'll give this a try on monday.

> The fact that rsyslog is spending so much time in select puzzles me (the rest 
> makes sense).
> 
> Rainer is going to need to take a look at this, but we are about to loose him 
> for Christmas vacation (if we haven't already)

I'm sure he'll come back :) I also have a few days off, no problem
resuming the discussion in January.

Thanks for caring David, much appreciated !

Cheers,
Marc

> > - when monitoring openldap's threads state, a surge of threads in
> >   "pending" state as soon as rsyslog is started.
> > - a trivial syslog-ng config, reading from /dev/log and sending
> >   everything to /dev/null, results in only about 10% ops/s decrease.
> > - also the pending threads count stay close to zero when using
> >   syslog-ng.
> >
> > To recap, the starting point of this issue was that openldap temporarily
> > stopped responding when a sudden surge of requests came in (ie: critical
> > service disruption for users). And I found out this problem didn't
> > happen when logging was disabled.
> >
> > As the performance numbers we can read on the web about rsyslog are an
> > order of magnitude bigger *but* usually relate to the imtcp/imudp
> > modules, I'm wondering if there possibly was a problem with the imuxsock
> > module (FWIW "in:imuxsock" is constantly in the top 3 of "top -H" during
> > my tests) ?
> >
> > Also no idea what value this has, but strace shows that rsyslog spends
> > significantly more time in system calls than syslog-ng, in my setup:
> >
> > # timeout 10 strace -c -fp $(pidof rsyslogd)
> > Process 27446 attached with 4 threads - interrupt to quit
> > Process 27446 detached
> > Process 27447 detached
> > Process 27448 detached
> > Process 27449 detached
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> > 94.85    2.981049          49     61079           select
> >  4.39    0.137935           0    287362     55920 futex
> >  0.68    0.021477           0     61077           recvmsg
> >  0.07    0.002301           0     61084           gettimeofday
> >  0.00    0.000000           0         6           writev
> >  0.00    0.000000           0         2           getrusage
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    3.142762                470610     55920 total
> >
> > # timeout 10 strace -c -fp $(pgrep -f ^syslog-ng)
> > Process 29477 attached - interrupt to quit
> > Process 29477 detached
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> > 41.39    0.001228           0    359614           gettimeofday
> > 23.12    0.000686           0      1774         1 read
> > 15.67    0.000465           0    105768           lseek
> > 15.57    0.000462           0    105767           write
> >  4.25    0.000126           0     21153           poll
> >  0.00    0.000000           0       114           stat
> >  0.00    0.000000           0         1           restart_syscall
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.002967                594191         1 total
> >
> >
> > Thanks a lot for your guidance so far !
> >
> > Cheers,
> > Marc
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
> > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T 
> > LIKE THAT.
> >
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to