Thank you both David and Rainer - I really appreciate the detailed responses
and it's already helping put my box in a better state.
good to hear
I used the impstats analyzer and started increasing dynaFileCacheSize for
the actions that were problematic. CPU usage increased by about 2.5x and
context switches went up to 800k for about 15 minutes after I restarted
rsyslogd but seems to have settled back in around 60-70% and around 100k
context switches now. Top showed three in:imtcp threads as the busiest
during that time.
a backlog of inbound messages via tcp after a restart isn't unusual.
with heavy inbound tcp use you may want to look at using imptcp instead of
imtcp.
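
For reference, a minimal sketch of what the switch to imptcp could look like. This assumes the RainerScript syntax of rsyslog v7 or later; the port and the ruleset name "remote" are placeholders, not taken from your config. Note that imptcp is Linux-only and does not do TLS, so it only fits if you receive plain TCP.

```
# load the plain-TCP input module in place of imtcp
module(load="imptcp")

# listen on a placeholder port and bind to a placeholder ruleset
input(type="imptcp" port="514" ruleset="remote")
```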
Cacti stats show about half the number of disk operations as before. So it
seems to be a general performance gain, but it did not affect message loss:
I can already see that some messages have disappeared.
I'm surprised it didn't help more than that. what do the impstats look like?
Regarding queues, some of my rules in each ruleset do forward messages to
another host (a SIEM) over TCP. Should I keep the queue to support that, or
is TCP good enough?
it's a good idea to have a queue on outbound delivery methods that can block.
without one, if the remote system goes down it will block all message processing.
Consider using a disk queue for this as well (to handle an extended outage of the
remote system).
also note for the record that TCP is not as reliable as you may be thinking.
http://blog.gerhards.net/2008/04/on-unreliability-of-plain-tcp-syslog.html
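
As a rough illustration of the queue advice above, here is a sketch of a TCP forwarding action with its own disk-assisted queue. It assumes rsyslog v7 or later syntax; the target host, port, queue file name, and disk limit are all placeholders that you would replace with values that fit your SIEM and your spool partition.

```
# forward to the SIEM over TCP with a dedicated, disk-assisted queue
action(type="omfwd"
       target="siem.example.com"      # placeholder host
       port="514"                     # placeholder port
       protocol="tcp"
       queue.type="LinkedList"        # in-memory queue, decouples this action
       queue.filename="fwd_siem"      # a file name enables spooling to disk
       queue.maxDiskSpace="1g"        # placeholder cap on spool size
       queue.saveOnShutdown="on"      # write queued messages to disk at shutdown
       action.resumeRetryCount="-1")  # keep retrying while the SIEM is down
```

With a queue on the action itself, a blocked connection to the SIEM only backs up this one action; the file-writing rules in the same ruleset keep running.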
RH support suggests turning off DNS lookups. Would that have any effect?
We've come to rely on hostname for creating directories for each server.
the %hostname% variable is what the sending system puts in the message, DNS
lookups aren't needed for that. Turning off DNS will avoid delays on UDP
messages. With DNS lookups, a slow DNS server can cause delays (without rsyslog
using CPU) that can end up causing some messages to be lost if the OS queues end
up running out of space.
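
To make the distinction above concrete, a sketch under assumptions: the directory layout, template name, and cache size below are made up for illustration, and the global() setting needs rsyslog 8.6.0 or later (on older versions the -x command-line option does the same job). The template uses %hostname%, which comes from the message itself; %fromhost% is the property that depends on a DNS lookup.

```
# skip reverse DNS lookups on received messages
global(net.enableDNS="off")

# per-host directories built from the hostname field inside the message
template(name="PerHostFile" type="string"
         string="/var/log/remote/%hostname%/messages.log")

# placeholder cache size; set it to at least the number of active hosts
action(type="omfile" dynaFile="PerHostFile"
       dynaFileCacheSize="100")
```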
Now that you have addressed some of the worst problems in your config, let's
talk about the log loss problem.
are you losing messages that arrive via UDP or TCP (or both)?
what are you doing to roll logs? are you doing a restart of rsyslog or a HUP? Do
you have HUPisRestart turned on? logs will be lost when you do a restart
(messages in the memory queues will be lost when you shut down for a restart).
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog