Also, be sure to give your queues descriptive names. Otherwise they will just
be automatically assigned numbers. For example, here's a config snippet that
defines a single new TCP port input and binds a ruleset to it that contains
just one output action (that writes everything to a file):
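| # Assumes the imptcp module is already loaded, e.g. module(load="imptcp").
| # Ruleset with its own named queue and a single named file-output action.
| ruleset(name="net-1234" queue.type="linkedList") {
|     action(type="omfile"
|            name="net-1234-action"
|            file="/var/log/1234-messages.log"
|            template="t_standard")
| }
| # TCP listener on port 1234, bound to the ruleset above.
| input(type="imptcp"
|       name="net-1234-input"
|       port="1234"
|       ruleset="net-1234")
| # Catch-all in the default ruleset; impstats output ends up here too.
| *.* /var/log/messages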
The names will help with thread identification in 'top' (press 'H' to show
threads): the thread name will be "rs:______", where the underscores stand for
the first 12 or so characters of the queue name, such as "rs:net-1234-act".
Also, they will help you identify queue statistics in the impstats output
(which in this example will wind up in /var/log/messages via the *.* selector
above). For example:
Aug 22 12:00:00 loghost rsyslogd-pstats: imptcp(*/1234/IPv4): submitted=1449868
Aug 22 12:00:00 loghost rsyslogd-pstats: net-1234: size=0 enqueued=1449868 full=0 discarded.full=0 discarded.nf=0 maxqsize=180
Aug 22 12:00:00 loghost rsyslogd-pstats: net-1234-action: processed=1449868 failed=0
which shows you messages submitted by the input (TCP port 1234), messages
enqueued (and possibly dropped) by the ruleset's main message queue, and
messages processed by the action queue. (I've rearranged the pstats log
messages into this order to make the input->rulesetQ->output data flow
clearer.)
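If impstats is not enabled yet, a minimal sketch for turning it on might look
like the following. The interval and severity values are illustrative
assumptions on my part, not settings from the box that produced the output
above:

```
# Sketch only. The rsyslog docs recommend loading impstats near the top of
# rsyslog.conf so that counters are set up for everything loaded after it.
# interval is in seconds; severity 7 is "debug". Both values are assumptions.
module(load="impstats"
       interval="60"
       severity="7")
```

With no separate stats file configured, the pstats messages are handed to the
regular rule processing, which is how a catch-all selector such as *.* ends up
writing them to /var/log/messages.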
Things to watch for are increasing "full=__" counts (which indicate that this
particular queue's buffer is getting 100% full), and of course the
"discarded.full=__" and "discarded.nf=__" counts, which indicate that messages
are being discarded (and therefore not sent on to the output action queue). In
the example above, 100% of the 1,449,868 messages that were received on the TCP
port were successfully enqueued, processed, and output to the target logfile
"/var/log/1234-messages.log".
So if you were seeing increasing discards, you could then increase the queue
size for this ruleset's queue (it is analogous to the Main Message Queue, but
specific to this ruleset, and defaults to 10,000 messages).
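For illustration, here is a sketch of the same ruleset with an explicit queue
size. The value 100000 is only an assumed example; pick something that fits
your message rate and the memory you can spare:

```
# Sketch: the ruleset from the example above, with an explicit queue size.
# queue.size is counted in messages; 100000 is an illustrative value only.
ruleset(name="net-1234"
        queue.type="linkedList"
        queue.size="100000") {
    action(type="omfile"
           name="net-1234-action"
           file="/var/log/1234-messages.log"
           template="t_standard")
}
```

After a restart, the maxqsize= counter in the pstats line for net-1234 shows
how much of the queue was actually used at peak, which helps you judge whether
the new size is enough.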
Alternatively, if the action (output) queue had increasing "failed=__" counts,
you could adjust that queue size instead (default is 1,000 messages). (Or, you
could filter out messages you don't need to process, or try to prevent them
from even entering the input module in the first place, or...)
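A sketch of the same output action with its own in-memory queue follows. It
assumes you want the action to run asynchronously (an action with no
queue.type runs in direct mode, without a queue of its own), and the size
shown is only an example:

```
# Sketch: the omfile action from the example above, given its own queue.
# queue.size is counted in messages; 10000 is an illustrative value only.
action(type="omfile"
       name="net-1234-action"
       file="/var/log/1234-messages.log"
       template="t_standard"
       queue.type="linkedList"
       queue.size="10000")
```

Once the action has a queue, impstats should report a queue line for it as
well, so you can watch its size= and full= counters the same way as for the
ruleset queue.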
Hope this helps,
--
Dave Caplinger
On Aug 21, 2013, at 11:55 AM, David Lang wrote:
> As Radu says, we need to see where the bottleneck is. Seeing which thread is
> maxing out the CPU will probably point us in the right direction.
>
> Start top on the receiving box, then hit "H" to show threads, and let's see
> which one is maxing out the CPU. They are usually labeled in a way that lets
> us figure out what that thread is doing, but the worst case may require
> running strace on that thread.
>
> What version are you running now?
>
> David Lang
>
> On Wed, 21 Aug 2013, Radu Gheorghe wrote:
>
>> BTW, what do your config files look like now? Maybe some of us can point out
>> places where you can optimize.
>>
>>
>> 2013/8/21 Radu Gheorghe
>>
>>> Hello Robert,
>>>
>>> Some pointers can be:
>>> - use the impstats module to see the state of your queues
>>> - use htop or something like that to see which threads consume the most
>>> CPU (maybe you can start more of those threads to solve the problem). I
>>> didn't do this myself but David and Rainer keep saying that threads are
>>> labeled so you can understand what uses more resources
>>> - netstat -su should also help with some information on packet loss and
>>> stuff
>>>
>>> Best regards,
>>> Radu
>>>
>>>
>>> 2013/8/21 Robert Ortiz
>>>
>>>> Hello guys,
>>>>
>>>> So I was able to get the logs to come in at 25k mps and not drop a single
>>>> one. I changed the ctl file to increase the mem to 200000, and I also
>>>> installed nscd and was able to get this to work. Unfortunately, when I went
>>>> up to 50k mps I dropped about 20k mps. Is there a way I can see something
>>>> that can tell me where I might have a problem?
>>>>
>>>> Robert
>>>> ----- Original Message -----
>>>> From: David Lang
>>>> Sent: 08/08/13 02:14 PM
>>>> To: rsyslog-users
>>>> Subject: Re: [rsyslog] performance tweaking
>>>>
>>>> The first thing I would do is make sure that you start rsyslog without
>>>> DNS lookups (add the -x flag to startup); the overhead of doing a DNS
>>>> lookup on each message that comes in is very significant. The newest
>>>> versions of rsyslog (7.4) include some caching of DNS data, but it can
>>>> still be significant. With 5.x I think this change by itself will probably
>>>> get you over 100K logs/sec.
>>>>
>>>> The next thing is the main message queue size. Your configuration leaves
>>>> it at the default of 10K; if you are looking to receive 100K messages/sec,
>>>> that's not very big. I would set it large enough to handle at least a
>>>> couple seconds' worth of logs, and if this box is a dedicated syslog
>>>> server, set it so that it will use the majority of RAM on your system.
>>>> With 32G of RAM on the system, and a default 2k message size, setting this
>>>> well above 1M is very reasonable.
>>>>
>>>> As noted by someone else, setting larger buffers in /etc/sysctl.conf may
>>>> help.
>>>>
>>>> If you can disable connection tracking in the iptables stack, it will
>>>> significantly reduce the kernel overhead (how many systems are you
>>>> receiving logs from?). Setting net.ipv4.netfilter.ip_conntrack_max large
>>>> may help.
>>>>
>>>> As far as your rules go:
>>>>
>>>> 'contains' is significantly more expensive than 'startswith'.
>>>>
>>>> On version 5.x, the if..then structure is significantly slower than the
>>>> property filter, like:
>>>>
>>>> :hostname, contains, "pdc" /var/log/test/f_ad
>>>>
>>>> rsyslog 7.x contains a ruleset optimizer that eliminates this performance
>>>> problem.
>>>>
>>>> What do you have in your included files?
>>>>
>>>> It's worth checking to see where your bottleneck is: simplify your rules
>>>> to write everything to one file and see what the resulting performance is
>>>> like. That way you know if your problem is on the input side or the
>>>> output side.
>>>>
>>>> If you run top and hit 'H' to show the different threads, you can see
>>>> what threads are running out of CPU time. My guess is that it will be a
>>>> thread labeled "Main Q", which is the output side of things (due to the
>>>> use of the inefficient if..then filters), and that's causing the too-small
>>>> queue to fill up, causing UDP messages to be lost.
>>>>
>>>> rsyslog 7.4 combined with a recent Linux kernel also has the ability to
>>>> receive multiple UDP packets in a single system call; this would
>>>> significantly improve performance. I don't know if RHEL 6.4 includes a
>>>> recent enough kernel. This is the batchSize parameter.
>>>>
>>>> Another useful parameter for UDP input is TimeRequery. If you have a lot
>>>> of messages arriving at the same time, doing a gettimeofday() call to the
>>>> system can be slow, and many consecutive calls will return the same value,
>>>> so rsyslog lets you say that as long as the incoming buffer from the OS
>>>> has more logs ready, only do a time lookup every N messages instead of
>>>> every message. Setting this to something like 100 or 1000 will virtually
>>>> eliminate the overhead of doing this lookup, and the worst that can happen
>>>> is that the time-received timestamp may be off by 1 second for messages
>>>> that arrive in a batch right at the end of one second and the beginning of
>>>> the next second (i.e. you will almost certainly never notice this; this
>>>> does not affect the timestamp generated by the host system in any case).
>>>>
>>>> Back in the rsyslog 4.x days, I was able to get rsyslog to handle gig-e
>>>> wire speed (~380K logs/sec), and rsyslog has only gotten faster since.
>>>>
>>>> David Lang
>>>>
>>>> On Thu, 8 Aug 2013, Robert Ortiz wrote:
>>>>
>>>> > Hey Guys,
>>>> >
>>>> > I am new to this mailing list and I wanted to see about getting some pointers
>>>> > if possible regarding tweaking rsyslog:
>>>> >
>>>> > I am pretty new to rsyslog, and I've been given a pretty fun task... to test
>>>> > rsyslog vs syslog-ng and pick the best one. I am having a problem with rsyslog
>>>> > where I'm at 25K/mps and I'm dropping logs. I need to get it to 100k mps
>>>> > and I'm not sure where the misconfiguration is. If anyone could take a look I
>>>> > would really appreciate it.
>>>> >
>>>> > my current setup:
>>>> >
>>>> > rhel 6.4 x86_64
>>>> > rsyslog-5.8.10-2.el6.x86_64
>>>> > Dual Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
>>>> > 32GB RAM
>>>> > 500GB 15k raid 0
>>>> >
>>>> >
>>>> > # rsyslog v5 configuration file
>>>> >
>>>> > # For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
>>>> > # If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html
>>>> >
>>>> > #### MODULES ####
>>>> >
>>>> > $ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
>>>> > $ModLoad imklog # provides kernel logging support (previously done by rklogd)
>>>> > #$ModLoad immark # provides --MARK-- message capability
>>>> >
>>>> > # Provides UDP syslog reception
>>>> > $ModLoad imudp
>>>> > $UDPServerRun 514
>>>> > # $UDPServerTimeRequery 10
>>>> >
>>>> > # Provides TCP syslog reception
>>>> > #$ModLoad imtcp
>>>> > #$InputTCPServerRun 514
>>>> >
>>>> >
>>>> > #### GLOBAL DIRECTIVES ####
>>>> >
>>>> > # Use default timestamp format
>>>> > $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>>>> >
>>>> > # File syncing capability is disabled by default. This feature is usually not required,
>>>> > # not useful and an extreme performance hit
>>>> > #$ActionFileEnableSync on
>>>> >
>>>> > # Include all config files in /etc/rsyslog.d/
>>>> > $IncludeConfig /etc/rsyslog.d/*.conf
>>>> >
>>>> > # Set Buffer Size - default is 4k
>>>> > # $OMFileIOBufferSize 128k
>>>> > # Set Main Message Queue Size - default is 10000
>>>> > # $MainMsgQueueSize 50000
>>>> >
>>>> > #### RULES ####
>>>> >
>>>> > # Log all kernel messages to the console.
>>>> > # Logging much else clutters up the screen.
>>>> > #kern.* /dev/console
>>>> >
>>>> > if $hostname contains 'pdc' then /var/log/test/f_ad
>>>> > & ~
>>>> > if $hostname contains 'fdfw' then /var/log/test/f_fw
>>>> > & ~
>>>> > if $hostname contains 'mail' then /var/log/test/f_mail
>>>> > & ~
>>>> > if $hostname contains 'pix' then /var/log/test/ix
>>>> > & ~
>>>> > if $hostname contains 'rout' then /var/log/test/rout
>>>> > & ~
>>>> > if $hostname contains 'networks' then /var/log/test/net
>>>> > & ~
>>>> > #if $fromhost-ip == '10.0.0.10' then /var/log/test/thost
>>>> > #& ~
>>>> > #if $hostname startswith 'virtserv' then /var/log/test/test_virtserv
>>>> > #&~
>>>> > #if $fromhost-ip startswith '10.0.6' then /var/log/test/test_10.0.6
>>>> > #& ~
>>>> >
>>>> >
>>>> > # Log anything (except mail) of level info or higher.
>>>> > # Don't log private authentication messages!
>>>> > #*.info;mail.none;authpriv.none;cron.none /var/log/messages
>>>> > *.debug /var/log/messages
>>>> >
>>>> > # Log all the mail messages in one place.
>>>> > mail.* -/var/log/maillog
>>>> >
>>>> >
>>>> > # Log cron stuff
>>>> > cron.* /var/log/cron
>>>> >
>>>> > # Everybody gets emergency messages
>>>> > *.emerg *
>>>> >
>>>> > # Save news errors of level crit and higher in a special file.
>>>> > uucp,news.crit /var/log/spooler
>>>> >
>>>> > # Save boot messages also to boot.log
>>>> > local7.* /var/log/boot.log
>>>> >
>>>> >
>>>> > # ### begin forwarding rule ###
>>>> > # The statement between the begin ... end define a SINGLE forwarding
>>>> > # rule. They belong together, do NOT split them. If you create multiple
>>>> > # forwarding rules, duplicate the whole block!
>>>> > # Remote Logging (we use TCP for reliable delivery)
>>>> > #
>>>> > # An on-disk queue is created for this action. If the remote host is
>>>> > # down, messages are spooled to disk and sent when it is up again.
>>>> > #$WorkDirectory /var/lib/rsyslog # where to place spool files
>>>> > #$ActionQueueFileName fwdRule1 # unique name prefix for spool files
>>>> > #$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
>>>> > #$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
>>>> > #$ActionQueueType LinkedList # run asynchronously
>>>> > #$ActionResumeRetryCount -1 # infinite retries if host is down
>>>> > # remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
>>>> > #*.* @@remote-host:514
>>>> > # ### end of the forwarding rule ###
>>>> >
>>>> >
>>>> >
>>>> > Robert.
>>>> > _______________________________________________
>>>> > rsyslog mailing list
>>>> > http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> > http://www.rsyslog.com/professional-services/
>>>> > What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>> > DON'T LIKE THAT.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Robert.
>>>>
>>>
>>>
>>