On Wed, 25 May 2016, Karri Vrkreddy wrote:

Hi,

We have multiple central rsyslog servers behind a DNS Round-Robin
and a stand alone syslog server as a fallback. I am trying to
implement failover with the following configuration.


************
$ModLoad impstats
$PStatInterval 60
$PStatSeverity 7
syslog.debug /var/log/rsyslog-stats

$RuleSet CentralLogging
$MainMsgQueueSize 100000
$MainMsgQueueDiscardMark 90000
$MainMSgQueueDiscardSeverity 3
$RulesetCreateMainQueue on
$ActionSendTCPRebindInterval 900
$ActionName dnsrr
*.*    @@dnsrr;RSYSLOG_SyslogProtocol23Format
$ActionExecOnlyWhenPreviousIsSuspended on
$ActionName failover
& @@fallbackserver;RSYSLOG_SyslogProtocol23Format
$ActionExecOnlyWhenPreviousIsSuspended off

$InputUDPServerBindRuleset stormcentral
$UDPServerRun 514
**************

I tried to test the failover by adding an iptables rule to DROP the
outbound packets to the remote server currently in use (one host
behind dnsrr).

As expected I see an increase in QueueSize. I was expecting an increase
in "failed" metric for dnsrr action and increase in "processed" metric
for failover action.

But it did not happen. The queue size increased and, after some time, at
May 24 12:58:53 in the log below, I noticed that rsyslogd established a
new connection to a different host behind the DNS RR and drained the queue.

Logs at end of the mail.


Rsyslog V7 is the one in use.


1. Shouldn't rsyslog fail over to the failover action before waiting for so
long and establishing another connection?
2. If this is the expected behavior, can I drop
ActionExecOnlyWhenPreviousIsSuspended, since rsyslog itself establishes a new
connection to a different server behind the DNSRR
(probably calling gethostbyname() again)?
3. I do not have ActionSendTCPRebindInterval set.

This is one of the things that is much clearer in the new syntax.

What you actually have is:

ruleset {
 deliver message to worker queue
 if you can't deliver it to the queue, send it to the failover host
}

worker queue {
 deliver to the main destination
}

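What you need to do is make the queue part of the ruleset, not part of the first action in the ruleset. As it stands, handing a message to the queue almost always succeeds, so the first action never looks suspended and the failover action never fires. With the queue on the ruleset, you can fail over between the actions.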
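
As a rough illustration of that layout, here is a minimal sketch in the new (RainerScript) syntax. It reuses the names from the original config (CentralLogging, dnsrr, fallbackserver) and the same queue limits. The port 514, the LinkedList queue type, and the assumption that your v7 build accepts queue parameters on ruleset() are mine, so check them against the documentation for your version before relying on it.

```
# Sketch only: the queue belongs to the ruleset, and neither action
# has a queue of its own, so a suspended first action is visible to
# the second one.
module(load="imudp")

ruleset(name="CentralLogging"
        queue.type="LinkedList"        # assumption: any non-direct queue type
        queue.size="100000"
        queue.discardMark="90000"
        queue.discardSeverity="3") {

    # Primary destination: the round-robin DNS name.
    action(name="dnsrr" type="omfwd"
           target="dnsrr" port="514" protocol="tcp"   # port is an assumption
           template="RSYSLOG_SyslogProtocol23Format"
           rebindInterval="900")

    # Runs only while the previous action is suspended.
    action(name="failover" type="omfwd"
           target="fallbackserver" port="514" protocol="tcp"
           template="RSYSLOG_SyslogProtocol23Format"
           action.execOnlyWhenPreviousIsSuspended="on")
}

input(type="imudp" port="514" ruleset="CentralLogging")
```

With this shape, the impstats counters for the dnsrr and failover actions should reflect what each action actually did, rather than only showing the queue growing.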

In addition, you probably want to look at the rebind interval ($ActionSendTCPRebindInterval). It tells rsyslog to disconnect and reconnect every X messages, which gives the load-balancing mechanism a chance to work.

I don't like round-robin DNS. It sometimes works, but only if the clients do repeated DNS queries; if they cache the result, they won't shift. Typically only DNS resolvers look at the DNS info to figure out how frequently they need to check again. Everything else either does a full lookup every time access is needed, or looks the name up once and uses the IP after that.

What I prefer to do is to use a local load balancer at the destination. This could be an appliance like an F5, but you can also use CLUSTERIP in iptables directly.

I cover this in the LISA presentation and the first article in my logging series:

https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david
https://www.usenix.org/publications/login/david-lang-series
https://www.usenix.org/publications/login/april14/lang

David Lang