On Tue, 13 Dec 2016, Stefan Midjich wrote:


Hi

I recently discovered that one of my log servers was dropping a
majority of its UDP packets according to netstat -su. I could also see
that lines were not being written into log files from remote logging
clients.
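
For anyone who wants to watch these counters without eyeballing netstat output, here is a minimal sketch that reads the UDP counters from /proc/net/snmp, which is where (as far as I know) netstat -su gets its numbers on Linux. The function names and the sample values are made up for illustration; RcvbufErrors should correspond to what netstat reports as receive buffer errors.

```python
def parse_udp_counters(snmp_text):
    """Return the Udp: counters from /proc/net/snmp-style text as a dict of ints."""
    # The kernel prints two lines per protocol: a header line with the
    # counter names and a second line with the values, both prefixed "Udp:".
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("expected a Udp: header line and a Udp: value line")
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


# Made-up sample in the kernel's two-line format, not real data from this server.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1000 5 700 20 700 0
"""

if __name__ == "__main__":
    counters = parse_udp_counters(SAMPLE)
    print("received:", counters["InDatagrams"])
    print("dropped (receive buffer full):", counters["RcvbufErrors"])
```

Calling read_udp_counters() twice, some seconds apart, and comparing RcvbufErrors shows whether drops are still happening.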

CentOS 7 with rsyslog 7.4.7, Physical HW with 24 cores and 20G RAM
receiving about 7Mbit/s UDP logdata non-stop from various clients over
network.

I mitigated the issue by disabling log forwarding to logstash, a TCP
service on localhost:5544. After that no more packets were being
dropped.

Now I'm trying to resolve these performance issues. It's my
understanding that the packets were being dropped because of full
queues. Either the main queue was full, preventing UDP packets from
being received, or the action queue was full, preventing the main
queue from being emptied.

Here is my config: https://bpaste.net/show/cf7d49a2d7b1
Here is my rsyslog-stats.log: https://bpaste.net/show/7fe6878fc0a5

I won't be able to dig into these this morning.

In the rsyslog config at the bottom you can see my attempt at
increasing the queue size for the action queue, but it has not helped.
Seconds after I restart rsyslog I see dropped UDP packets spiking,
indicating that logs are being dropped again.

If I shut down the logstash listener, or disable forwarding to logstash
in rsyslog, the UDP packet errors stop.

This means that logstash is not keeping up. Since the connection is TCP, rsyslog's queues fill up, and once they are full it can no longer process new messages.
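
To illustrate the back-pressure effect, here is a toy model of a bounded queue with a producer that is faster than its consumer. This is not how rsyslog implements its queues; the function name and all the numbers are assumptions chosen only to show the shape of the problem.

```python
from collections import deque


def simulate(arrivals_per_tick, drained_per_tick, queue_size, ticks):
    """Toy model: returns (dropped, first_drop_tick, final_queue_length)."""
    queue = deque()
    dropped = 0
    first_drop_tick = None
    for tick in range(ticks):
        # Producer side (think: incoming UDP messages).
        for _ in range(arrivals_per_tick):
            if len(queue) < queue_size:
                queue.append(tick)
            else:
                # No room left, so the message is lost.
                dropped += 1
                if first_drop_tick is None:
                    first_drop_tick = tick
        # Consumer side (think: a slow TCP receiver draining the queue).
        for _ in range(min(drained_per_tick, len(queue))):
            queue.popleft()
    return dropped, first_drop_tick, len(queue)


if __name__ == "__main__":
    # Consumer slower than producer: the queue fills, then messages drop.
    print(simulate(10, 6, 100, 50))
    # A ten times larger queue only postpones the first drop.
    print(simulate(10, 6, 1000, 500))
    # Consumer keeps up: nothing is dropped.
    print(simulate(6, 10, 100, 50))
```

The point of the second call is that a bigger queue buys time but does not fix a consumer that is persistently too slow.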

I don't know what to make of rsyslog-stats telling me I have 238
million messages enqueued, is that a current or total number? And I
don't know why it says maxqsize=3580 when I've set
main_queue(queue.size="100000").

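enqueued is the total number of messages put on the queue since rsyslog started, not the current count. Look at size and maxqsize: size is the current size of the queue, and maxqsize is the largest it has been since rsyslog started, so it is a high-water mark, not the configured queue.size.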

During the quick bursts that I've had forwarding enabled to
troubleshoot I've seen these messages in rsyslog-stats but I can't
interpret what they're trying to say about the action queue.

Mon Dec 12 15:57:44 2016: action 26 queue[DA]: size=0 enqueued=0
full=0 discarded.full=0 discarded.nf=0 maxqsize=0
Mon Dec 12 15:57:44 2016: action 26 queue: size=0 enqueued=63733
full=0 discarded.full=0 discarded.nf=0 maxqsize=32
Mon Dec 12 15:58:14 2016: action 26 queue[DA]: size=0 enqueued=0
full=0 discarded.full=0 discarded.nf=0 maxqsize=0
Mon Dec 12 15:58:14 2016: action 26 queue: size=32 enqueued=134485
full=0 discarded.full=0 discarded.nf=0 maxqsize=32
Mon Dec 12 15:58:44 2016: action 26 queue[DA]: size=0 enqueued=0
full=0 discarded.full=0 discarded.nf=0 maxqsize=0
Mon Dec 12 15:58:44 2016: action 26 queue: size=0 enqueued=208181
full=0 discarded.full=0 discarded.nf=0 maxqsize=32

I hope someone can shed some light on this for me. Thanks.

This action is not the problem, but you have some other queues that are filling up.
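
A rough sketch of how the stats lines quoted above could be checked mechanically. It assumes each entry has been joined back onto one line (the mail wrapped them) and that the format is exactly "timestamp: name: key=value ...". The helper names are made up, and my reading of full and discarded.full as "times the queue was full" and "messages discarded because it was full" is how I understand the counters, so treat that as an assumption.

```python
import re

# One entry from the stats quoted above, joined back onto a single line.
SAMPLE_LINE = ("Mon Dec 12 15:58:14 2016: action 26 queue: size=32 "
               "enqueued=134485 full=0 discarded.full=0 discarded.nf=0 "
               "maxqsize=32")


def parse_stats_line(line):
    """Split a stats line into (timestamp, queue name, counters dict)."""
    # The timestamp contains colons but no colon followed by a space,
    # so splitting on ": " twice separates the three parts.
    timestamp, name, fields = line.split(": ", 2)
    counters = {key: int(value)
                for key, value in re.findall(r"([\w.]+)=(\d+)", fields)}
    return timestamp, name, counters


def queue_ever_full(counters):
    """True if the queue has been full or discarded messages for being full."""
    return counters.get("full", 0) > 0 or counters.get("discarded.full", 0) > 0


if __name__ == "__main__":
    timestamp, name, counters = parse_stats_line(SAMPLE_LINE)
    print(name, "current size:", counters["size"],
          "high-water mark:", counters["maxqsize"],
          "ever full:", queue_ever_full(counters))
```

Running the same check over every queue in the stats file is a quick way to find the ones where full is non-zero.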

David Lang

--
Vänliga Hälsningar / Sincerely
Stefan M
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.