On 12/17/2014 07:44 PM, Otis Gospodnetic wrote:
> Hi,
> I remember reading something about not being able to use ELB with
> multiple rsyslog servers behind it. I can't find this information
> anymore, so I'm wondering if anyone here knows of any issues with
> running multiple rsyslog instances behind ELB?
That might have been my emails.
Based on my experience:
- both ends: definitely use RELP
- sender: if possible, use keep-alive (it wasn't available for RELP
back then, but I think it is now)
- sender: set action.resumeRetryCount and action.resumeInterval
(otherwise rsyslog does not reconnect, or at least it takes a loooong
time)
- sender: give the RELP action its own queue, otherwise a blocked RELP
action blocks the main queue (you want rsyslog to keep reading incoming
messages even when it temporarily can't send them out)
example action for RELP (sender):
action(
    type="omrelp"
    target="elb.hostname"
    port="5140"
    template="json"
    # see http://www.rsyslog.com/doc/node32.html
    # disk used if forwarding blocked
    queue.type="LinkedList"
    queue.spoolDirectory="/some/path/"
    queue.filename="json"
    queue.maxdiskspace="75161927680" # 70GB (valuable data)
    # these might help with ELB disconnecting after 60s of inactivity
    action.resumeRetryCount="-1"
    action.resumeInterval="5"
)
where:
- elb.hostname is the ELB hostname
- /some/path/ is the spool directory
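For completeness, a receiver-side counterpart (the instances behind the
ELB) could look roughly like the sketch below. This is not taken from
the setup described here: the port simply mirrors the sender example,
the keep-alive values are illustrative assumptions, and the KeepAlive
parameters need an rsyslog/librelp version that supports them. Since the
ELB terminates TCP, these settings only affect the connection between
the ELB and the receiver, and whether the probes count as activity for
the ELB idle timeout is something to test.

```
module(load="imrelp")

input(
    type="imrelp"
    port="5140"                 # assumed to match the ELB backend port
    # TCP keep-alive on accepted connections (values are assumptions)
    KeepAlive="on"
    KeepAlive.Time="30"         # seconds of idle before the first probe
    KeepAlive.Interval="10"     # seconds between probes
    KeepAlive.Probes="3"        # unanswered probes before giving up
)
```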
The main problem with ELB is that rsyslog opens a bunch of connections
(one per thread), and these connections might or might not be used,
depending on traffic. If a connection is not used for some time, ELB
closes it, but rsyslog has a hard time recognizing that. A connection
can also go dead because of a backend problem (the instance behind ELB
has a problem). Rsyslog is not able to figure out that these connections
are dead and continues to send data. At least in some scenarios that's
because ELB simply pretends everything is fine; in other scenarios I am
not entirely sure what happens.
The end result is that no data is logged. In some cases rsyslog keeps
retrying, in some cases it keeps sending data to nowhere, etc., but it
does not recover.
The above config solves the problem, but you still might get into a
situation where data is not being sent out for 10-15 minutes. That's
why it's important not to block the main queue.
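Two further omrelp parameters that may be worth experimenting with for
the dead-connection problem are timeout and rebindInterval. The sketch
below is an untested assumption, not part of the configuration described
above, and the values are purely illustrative. As far as the omrelp
documentation describes it, timeout is the RELP session timeout in
seconds, and rebindInterval forces a reconnect after a number of
messages rather than after a time interval.

```
action(
    type="omrelp"
    target="elb.hostname"
    port="5140"
    template="json"
    # ... queue.* settings as in the sender example ...
    timeout="30"              # assumption: treat the session as dead sooner
    rebindInterval="10000"    # assumption: reconnect periodically
    action.resumeRetryCount="-1"
    action.resumeInterval="5"
)
```

Setting timeout too low can cause healthy sessions to be treated as
dead, so any value here should be checked against real traffic.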
Hope this helps. There are lots of details and examples in my previous
emails; just search for my email address or name on the rsyslog
mailing list.
erik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog