On 12/18/2014 03:53 PM, singh.janmejay wrote:
True, but even a non-RELP delivery mechanism will lose messages in the same
way as it would when talking to a remote instance without a load balancer. So
the risks are the same, provided the load balancer sends an RST for packets
belonging to an unknown session.
RELP or not (assuming a well-behaved load balancer) is hence an independent
choice from having a load balancer or not.
Also, it is important to double-check the keep-alive settings of the socket.
The keep-alive must fire before the idle timeout on the LB, not just before
the timeout on the receiver host.
ELB does not behave well, and there is virtually nothing you can configure.
You can ask Amazon support to lengthen the time before it kills an inactive
connection (the default is 60s, I think). That's about it.
That's why RELP AND the retry settings I posted in another email on
this thread are absolutely crucial. If interested, you can search the
archives of this mailing list for tons of details on various aspects of
ELB and rsyslog.
erik
--
Regards,
Janmejay
PS: Please blame the typos in this mail on my phone's uncivilized soft
keyboard sporting its not-so-smart-assist technology.
On Dec 19, 2014 4:27 AM, "Erik Steffl" <[email protected]> wrote:
On 12/18/2014 10:00 AM, David Lang wrote:
On Thu, 18 Dec 2014, singh.janmejay wrote:
Usually load-balancers don't issue RST for unknown sessions. This is done
in order to avoid additional overhead of processing packets even for
unknown sessions.
I haven't worked with ELB, but doesn't it have a config to enable issuing
RST (instead of simply dropping packets for unknown sessions)? Usually
load-balancers have a way to turn on RST for this case.
Once turned on, regardless of it being RELP or not, it should work just
fine. A write to the socket on the rsyslog side will fail with the same error
codes as it would if the receiver dropped the connection (crash or whatever).
without relp, rsyslog considers the log delivered to the remote system
when it sends it to the local TCP stack (because there is no mechanism
to know anything beyond this), so you can have a bunch of messages
'sent' on a connection that is in the process of being closed, and these
messages are going to be lost. RELP adds application layer acks so that
rsyslog can know that the message really got to the remote system.
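A minimal sketch of the two sender variants being compared here. The target
name is reused from elsewhere in this thread; the plain-TCP port is a
placeholder and not from the thread.

```
# Plain TCP: a message counts as delivered once the local TCP stack accepts it
action(type="omfwd" target="elb.hostname" port="10514" protocol="tcp")

# RELP: a message counts as delivered only after the receiver acknowledges it
module(load="omrelp")
action(type="omrelp" target="elb.hostname" port="5140")
```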
As far as needing a separate queue for sending RELP, it depends on what
else is going on in the system. The separate queue can fill up, and when
it does the main queue blocks anyway. So you have to have enough space
in the queues to handle the backlog. Once you have enough space, you
won't stop accepting new messages. The difference between having the
space in the main queue and in a separate RELP queue is that if you have
other destinations you are delivering to, having a separate queue allows
you to deliver to those other destinations while RELP is blocked.
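As a rough sketch of that last point, the config below gives the RELP action
its own queue while a second action writes to a local file from the main
queue. The file path and queue size are assumptions for illustration, not
values from this thread.

```
module(load="omrelp")

# RELP forwarding with a dedicated action queue: a stalled connection backs
# up here first, and only blocks the main queue once this queue is full
action(
    type="omrelp"
    target="elb.hostname"
    port="5140"
    queue.type="LinkedList"
    queue.size="100000"              # hypothetical in-memory message limit
    action.resumeRetryCount="-1"     # retry indefinitely
)

# Second destination, fed directly from the main queue: keeps receiving
# messages while the RELP queue absorbs the backlog
action(type="omfile" file="/var/log/all-messages.log")
```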
totally agreed (after all, you and Rainer helped me with exactly this
problem), just want to stress that in a realistic production setup you very
likely want a separate queue so that incoming messages are not blocked.
as you point out, you need enough memory and/or disk space to buffer
messages that were received but not sent (some failure scenarios take 5 to
15 minutes to recover from).
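A rough sizing sketch for that buffer. The message rate and average size
below are made-up numbers for illustration; only the 15-minute window comes
from this thread. Substitute your own measurements.

```
# Hypothetical load: 5,000 msg/s at ~500 bytes each, 15-minute (900 s) outage
#   messages to hold: 5,000 * 900      = 4,500,000
#   bytes to hold:    4,500,000 * 500  = 2,250,000,000 (about 2.1 GiB)
module(load="omrelp")
action(
    type="omrelp"
    target="elb.hostname"
    port="5140"
    queue.type="LinkedList"
    queue.filename="relp_fwd"        # setting a file name makes the queue disk-assisted
    queue.spoolDirectory="/some/path/"
    queue.maxDiskSpace="4g"          # about 2x the estimate, for headroom
    action.resumeRetryCount="-1"
    action.resumeInterval="5"
)
```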
erik
David Lang
On Thu, Dec 18, 2014 at 10:17 AM, Erik Steffl <[email protected]> wrote:
On 12/17/2014 07:44 PM, Otis Gospodnetic wrote:
Hi,
I remember reading something about not being able to use ELB with multiple
rsyslog servers behind it. I can't find this information any more.... so
I'm wondering if anyone here knows if there are any issues with running
multiple rsyslog instances behind ELB?
might have been my emails,
based on my experience:
- both ends: definitely use RELP
- sender: if possible try to use keep alive (it wasn't available for
  RELP but I think it's available now)
- sender: use action.resumeRetryCount and action.resumeInterval
  (otherwise rsyslog does not reconnect or at least it takes a loooong
  time)
- sender: make sure you have a separate queue for RELP, otherwise
  blocked RELP blocks main queue (you want rsyslog to read incoming
  messages even if it has temporary problems sending them out)
example action for RELP (sender):
# omrelp has to be loaded once before the action can be used
module(load="omrelp")
action(
    type="omrelp"
    target="elb.hostname"
    port="5140"
    template="json"
    # see http://www.rsyslog.com/doc/node32.html
    # disk used if forwarding blocked
    queue.type="LinkedList"
    queue.spoolDirectory="/some/path/"
    queue.filename="json"
    queue.maxdiskspace="75161927680" # 70GB (valuable data)
    # these might help with ELB disconnecting after 60s of inactivity
    action.resumeRetryCount="-1"     # retry indefinitely
    action.resumeInterval="5"        # seconds between retries
)
where:
- elb.hostname is the ELB hostname
- /some/path/ is spool directory
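For the receiving side (the instances behind ELB), a minimal sketch of a
matching RELP listener. The port matches the sender example; the keep-alive
values are assumptions chosen to sit below ELB's 60s idle timeout, and this
thread does not confirm whether ELB treats TCP keep-alive probes as activity.

```
module(load="imrelp")

input(
    type="imrelp"
    port="5140"
    keepAlive="on"             # enable TCP keep-alive on accepted connections
    keepAlive.time="30"        # hypothetical: idle seconds before the first probe
    keepAlive.interval="10"    # hypothetical: seconds between probes
    keepAlive.probes="3"       # hypothetical: failed probes before the connection is dropped
)
```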
the main problem with ELB is that rsyslog opens a bunch of connections
(one per thread) and these connections might or might not be used
(depending on traffic). If a connection is not used for some time, ELB
closes it, but rsyslog has a hard time recognizing that. A connection can
also go dead because of a backend problem (an instance behind ELB has a
problem). Rsyslog is not able to figure out that these connections are dead
and continues to send data. At least in some scenarios it's because ELB
simply pretends everything is fine; in some other scenarios I am not
entirely sure what happens.
The end result is that no data is logged: in some cases rsyslog keeps
retrying, in some cases it keeps sending data to nowhere, etc., but it
does not recover.
The above config solves the problem, but you still might get into a
situation where the data is not being sent out for 10-15 minutes. That's
why it's important to not block the main queue.
hope this helps... there's lots of details and examples in my previous
emails, just search for my email address or name on rsyslog mailing
list...
erik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.