Hi David,

> the problem is probably not on the system that stops forwarding messages, but 
> rather on the system they are forwarding the messages to.
> 
> When the queues fill up, unless you have configured rsyslog to throw away 
> messages, it will stop accepting any new messages as it can't put them in the 
> queue. This is "working as designed" (one of these days I've got to sit down 
> and finish writing my "how to make your logs unreliable" article :-)

There are several reasons I do not think this is the case:

We have multiple relays downstream connected to upstream relays, and see
messages come through these other paths when this situation occurs.

Also, a frequent solution to the problem is to restart the stuck process
(and only that one), where we see the messages flush through upstream
relays when shutting down, implying they are not holding back messages.

Finally, we do have impstats enabled. It goes through the main queue,
but that actually lets us probe the status of the queue. We have a
Nagios alert that fires when no stats messages arrive within a fixed
time window. Before a relay gets stuck, the messages in its queue are at
maximum (actually we see 700k in the main queue, which is set at 1M),
then we see no more stats from the stuck relays only. The others keep
pushing stats, and their main queue sizes reflect the reduction in
message throughput.
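
To illustrate the kind of check described above, here is a minimal
Python sketch. It is not our actual Nagios plugin: the function names
and the thresholds (max_age, queue_limit, warn_ratio) are hypothetical,
and it assumes the legacy impstats line format for the main queue
("main Q: origin=core.queue size=... maxqsize=...").

```python
import re
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Assumes the legacy impstats text format for the main queue, e.g.
# "main Q: origin=core.queue size=700000 enqueued=... maxqsize=..."
# The \b keeps "maxqsize=" from matching as "size=".
MAIN_Q = re.compile(r"main Q: .*?\bsize=(\d+)")


def main_queue_size(line):
    """Return the main queue size from an impstats line, or None if absent."""
    match = MAIN_Q.search(line)
    return int(match.group(1)) if match else None


def check_relay(last_stats_ts, last_line, now=None,
                max_age=300, queue_limit=1_000_000, warn_ratio=0.6):
    """Return (status, message) for one relay.

    last_stats_ts: epoch seconds of the most recent stats message seen.
    last_line:     text of that stats message.
    max_age, queue_limit, warn_ratio: hypothetical thresholds.
    """
    now = time.time() if now is None else now
    age = now - last_stats_ts
    # No stats inside the window: the relay is probably stuck.
    if age > max_age:
        return CRITICAL, f"no stats for {age:.0f}s"
    # Stats are fresh, but the main queue is filling up.
    size = main_queue_size(last_line)
    if size is not None and size >= warn_ratio * queue_limit:
        return WARNING, f"main queue at {size}/{queue_limit}"
    return OK, "stats fresh"
```

The staleness test runs first because a stuck relay stops reporting
altogether, so its last known queue size says nothing about its current
state.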

> what version are you running, there have been some unicode related fixes in 
> the last few versions.

We have a mix of systems but in general rsyslog is at least version
8.22, with a number of systems being 8.23.

> A couple things to do would be
> 
> 1. make sure you have impstats enabled, and since you are having problems 
> delivering messages, make sure it either uses a different ruleset (with a 
> queue) or writes a file to disk so that you don't risk the pstats data 
> getting stuck as well.

As above, we have it enabled, but it is going through the default
ruleset at the moment. I can look into moving it to a different
ruleset that both writes maybe a day's worth to disk and sends the
stats upstream.

> 2. as a debugging tool, consider writing the logs to disk before forwarding 
> them. You don't need to keep a very long history of them, but seeing the 
> message that rsyslog was trying to send could be very helpful

Will look into this as well, though I suspect it cannot be done anytime
real soon. It would be nice to see what message was being processed,
though it is possible the issue would also prevent it from being written
to disk, if it gets stuck in the main queue and not the omrelp action...

> 3. look at the systems receiving the messages to see if anything odd happens 
> there around the time that things start failing.

As above, I don't believe it is the receiving systems, but if anything
still stands out let me know. Thanks for all your input.

Arik

P.S. I only got the digest, not the original response, so apologies if
this does not get properly inserted in the thread.
