On 12/15/2013 11:59 PM, Rainer Gerhards wrote:
On Mon, Dec 16, 2013 at 8:47 AM, Erik Steffl <[email protected]> wrote:
Looking at the messages and RELP responses being sent, I realized that
there are multiple connections open that are used by the same action, as
far as I can tell.
E.g. I see these relp confirmations received:
[pid 21291] 07:24:00.744578 recvfrom(14, "1 rsp 92 200 OK\nrelp_version=0\nrelp_software=librelp,1.2.0,http://librelp.adiscon.com\ncommands=syslog\n", 32768, MSG_DONTWAIT, NULL, NULL) = 102
[pid 21291] 07:25:05.093520 recvfrom(16, "1 rsp 92 200 OK\nrelp_version=0\nrelp_software=librelp,1.2.0,http://librelp.adiscon.com\ncommands=syslog\n", 32768, MSG_DONTWAIT, NULL, NULL) = 102
Then I look at lsof:
rsyslogd 21254 syslog 14u IPv4 8699295 0t0 TCP ip-10-158-97-169.ec2.internal:52708->ec2-50-19-250-187.compute-1.amazonaws.com:5140 (ESTABLISHED)
rsyslogd 21254 syslog 16u IPv4 8699315 0t0 TCP ip-10-158-97-169.ec2.internal:52709->ec2-50-19-250-187.compute-1.amazonaws.com:5140 (ESTABLISHED)
Where ec2-50-19-250-187.compute-1.amazonaws.com is the load balancer.
So if there are a few connections open, MARK is sent every minute over ONE
of them and keeps it alive, the other one goes bad (ELB timeout). Then
somebody else comes in, tries to use the bad connection and there it is,
silence...
So I guess cron-ing the messages works because that uses /dev/log, same
as the program that runs every 15 minutes and encounters bad connection
problem. However MARK uses the other connection so does not help.
Does that make sense?
definitely, if you have multiple actions, they go over different
connections.
oh! it's two actions, one for json format (@cee:{...} messages)
another one for text format!
But there is no way to make immark produce @cee messages, right?
So it will never go via action that uses json format (config and few
specific questions below)
Can one action also have multiple connections? Would that happen if
more than one worker works on one queue/action?
That would mean that the possible fixes for this are:
- use the TCP keepalive (not released yet but hopefully soon)
I won't do any official release right before I go off on vacation. Tried
librelp last week, but you know what happened ;) In any case, everything is
present and can be built from source.
ok, if I can I'll try to build librelp 1.2.2 too and test it.
- somehow make rsyslog deal with the bad connection better (not sure how
yet but since there is no actual network problem I guess there must be
something rsyslog can do to talk to the collector again)
if we don't get an error, how should *we* improve? Fix the load balancer
that accepts messages and throws them away. IMHO rsyslog/relp work just
like they should and I have no idea on how we could fix that problem.
yeah, it's more like I'm hoping that something can be done. I have
some logs (both debug and strace) which I plan to go over in detail. I
saw some suspicious close/open in strace that I think might shed more
light on what's happening. Based on what you said it should recover
after 90 seconds timeout (that you pointed out in debug log) but it does
not.
Anyway, if I find anything interesting in logs I'll report it,
otherwise nevermind...
just asking whether that makes sense, of course not asking anybody to do
anything, obviously for the second solution I'd have to do more work to
figure out what can be done and maybe come up with patches...
If someone finds out a way to make it work with the broken balancer, I'll
gladly add patches. I also think you should probably get this going by just
setting the retry settings in a useful way. Have you played with them? Can
you post the relp action config (I am sure you did, but it takes time to
find it...).
if
    prifilt("local0.*") or
    ...
    (prifilt("kern.info") and ($msg == '-- MARK --'))
then {
    action(type="mmjsonparse")
    if $parsesuccess == "OK" then {
        action(
            type="omrelp"
            target="elb.collector.prod.logs.ylmmuy.com"
            port="5140"
            template="json"
        )
    } else {
        action(
            type="omrelp"
            target="elb.collector.prod.logs.ylmmuy.com"
            port="5140"
            template="text"
        )
    }
    stop
}
http://www.rsyslog.com/doc/omrelp.html don't see the retry settings,
are these some generic action retries? Or queue retries? Any
hints/pointers what to try?
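On the retry question: as far as I know these are the generic action parameters rather than anything specific to omrelp. A minimal sketch, assuming the same target and template as in the config above; the values (retry forever, every 10 seconds) are illustrative assumptions and have not been tested against the ELB setup:

```
action(
    type="omrelp"
    target="elb.collector.prod.logs.ylmmuy.com"
    port="5140"
    template="json"
    # generic action parameter: retry indefinitely instead of
    # discarding messages once the action has been suspended
    action.resumeRetryCount="-1"
    # generic action parameter: seconds between retry attempts
    # (assumed value)
    action.resumeInterval="10"
)
```

These settings only come into play once the action actually notices a failure, so on their own they do not help while the load balancer silently accepts data on a dead connection.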
Btw would it be possible to change $msg from '-- MARK --' to e.g.
@cee:{"message":"MARK"} in config? Something like this:
if $msg == '-- MARK --' then {
set $msg = '@cee:{"message":"MARK"}';
}
right before action(type="mmjsonparse")
i.e. can I assign to $msg?
Of course if I somehow manage MARK message to go via json action then
text action will never get MARK message and that connection will go
stale... oh well...
Is it possible to first use template to create new message and then
just have ONE action?
erik
Rainer
thanks!
erik
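On the question of getting by with ONE action: a rough sketch, assuming that a non-JSON message (plain text or MARK) may simply be wrapped into a JSON field so that every message leaves through the same omrelp connection. The template name "unified_json" and the field name "message" are made up for illustration, the other facilities from the original filter are left out, and the sketch relies on set being applied to a $! variable rather than to $msg itself:

```
# Hypothetical template: emit whatever ends up in the JSON tree.
template(name="unified_json" type="string" string="%$!all-json%")

if
    prifilt("local0.*") or
    (prifilt("kern.info") and ($msg == '-- MARK --'))
then {
    action(type="mmjsonparse")
    if $parsesuccess != "OK" then {
        # Not an @cee message: keep the raw text in a JSON field
        # (assumed field name) instead of rewriting $msg.
        set $!message = $msg;
    }
    # Single action, hence a single RELP connection that MARK
    # messages also travel over.
    action(
        type="omrelp"
        target="elb.collector.prod.logs.ylmmuy.com"
        port="5140"
        template="unified_json"
    )
    stop
}
```

The trade-off is that the collector then receives text messages in JSON form, which may or may not fit what the receiving side expects.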
On 12/15/2013 08:41 PM, David Lang wrote:
On Sun, 15 Dec 2013, Erik Steffl wrote:
changed config so now the MARK messages are sent (and received). BTW
they use kern.info facility, not sure why it's not the same as what
Rainer found in source (syslog.info).
This solves the simple test case in which I only have sender, load
balancer (ELB) and receiver. I strace both sender and receiver, see
the MARK messages and the connection is fine, i.e. I send a log
message using e.g. logger, then wait 5 min, then send another one and
it works. Without MARK the second message never arrives.
However in a more complex scenario this does not help at all. Complex
scenario looks like this (ascii arrows are flows of syslog messages):
- 6 machines -> ELB-prod -> collector-prod
- 1 machine -> ELB-test -> collector-test
- every 5 minutes: collector-test -> ELB-prod -> collector-prod
- every 5 minutes: collector-prod -> ELB-prod -> collector-prod (yes,
program on collector-prod sends message to rsyslog on collector-prod
over ELB)
That 5-minute pause causes the connection to go stale somehow, which
results in periods of silence. I configured both collector-prod and
collector-test to send MARK messages (to collector-prod since that's
where they send regular messages) but I still see the periods of
silence (on both collectors). Used strace to verify that MARK messages
arrive (I guess it's possible that I confused the MARK messages from
collector-prod and collector-test, will continue investigating on that
front)
However adding a cron entry to send a few log messages every minute DOES
solve the problem (there is no silence anymore).
Any ideas why that would be? Is it possible that MARK messages are
being sent through different connections than other messages?
As far as I can tell the only difference between MARK messages and
the cron'd messages is that the MARK messages are generated by immark
and use kern.info facility and the cron'd messages arrive via /dev/log
and use local0.info facility.
Any ideas why the simpler scenario would behave differently than the
complex scenario? Or why MARK messages do not solve the problem in
the complex scenario while cron'd messages do?
my guess is that you have some bug in your config that is not forwarding
the mark messages in the more complex case.
There really isn't a difference as far as this problem is concerned
between cron generating the message and immark generating the message.
As you note, it's just which process does the work.
David Lang
thanks!
erik
On 12/12/2013 02:39 PM, David Lang wrote:
On Thu, 12 Dec 2013, Erik Steffl wrote:
I will try to test librelp (with keepalive) but I need some
workaround in the meantime (sort of right now).
Already tested that cron-ing logger once per minute keeps the
connection alive so that's my backup workaround.
immark would be better cause then I only need to install rsyslog
config (easier deployment) plus it would be more efficient, do you
think that what David suggested is the best option?
if I understood David's comment something like this is what I am
looking for:
if
    prifilt("local0.*") or
    prifilt("local1.*") or
    prifilt("local2.*") or
    prifilt("local3.*") or
    prifilt("local4.*") or
    prifilt("local5.*") or
    prifilt("local6.*") or
    prifilt("local7.*") or
    ( prifilt("syslog.info") and ... message is --MARK--)
pretty much, I would do $msg == '--MARK--' as the second test
David Lang
then {
    action(type="mmjsonparse")
    if $parsesuccess == "OK" then {
        action(
            type="omrelp"
            target="someHost"
            port="5140"
            template="json"
            # see http://www.rsyslog.com/doc/node32.html
            # disk used if forwarding blocked
            queue.filename="json"
            queue.maxdiskspace="75161927680" # 70GB (valuable data)
            action.writeAllMarkMessages="on"
        )
    } else {
        ...
    }
}
reasonable? Can be improved?
thanks!
erik
On 12/12/2013 12:44 PM, Rainer Gerhards wrote:
On Thu, Dec 12, 2013 at 9:17 PM, David Lang <[email protected]> wrote:
On Thu, 12 Dec 2013, Erik Steffl wrote:
On 12/12/2013 08:29 AM, David Lang wrote:
what facility and severity do the immark messages show up as?
immark just generates messages, normal filtering rules determine where
they are sent, and the transport used (in this case RELP) has no effect
on if they are sent or not, it's all in the filters.
thanks, that makes my question a lot more specific. How do I configure
immark to use a specific facility?
I don't think you do. I think they are using the syslog or kernel
facility, but I'd have to set up a quick test to check. I'll try to do it
tonight if I can, but since you are seeing the messages locally, log with
RSYSLOG_DebugFormat for a couple of minutes and look at what they are
logged as.
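A minimal sketch of that check, assuming a scratch output file (/var/log/mark-debug.log is a made-up path) and that the mark text contains the string "MARK":

```
# Temporary rule: dump all properties of mark messages, including
# facility and severity, using the built-in debug template.
if $msg contains "MARK" then {
    action(
        type="omfile"
        file="/var/log/mark-debug.log"
        template="RSYSLOG_DebugFormat"
    )
}
```

The rule can be removed again once the facility and severity are known.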
it's syslog.=info:
http://git.adiscon.com/?p=rsyslog.git;a=blob;f=plugins/immark/immark.c;h=0e946c0b92d555174b38de42dd236ac4432b98e7;hb=HEAD#l196
All I found when searching is this:
$ModLoad immark.so
$MarkMessagePeriod 60
which is what I have in my config.
Given that I see the --MARK-- messages in /var/log/syslog and
/var/log/kern.log I guess they are going to kern facility. Given the
config below I need to use e.g. local0 facility.
no, you need to change your filtering config to send these messages, not
try to change the messages to match your current config.
you actually can't. I considered mark a legacy feature and have not
enhanced it in 8 years.
Keepalive is the better option. librelp is not yet built due to the
current workload. The code actually right now is at github only, as I
have some problems with the Adiscon repo. Easy to clone from here
https://github.com/rgerhards/librelp
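Once a build with keepalive support is in place, enabling it might look like the sketch below. The parameter names are the ones documented for later omrelp versions; whether a given rsyslog/librelp build accepts them is an assumption to verify, and the timing values are placeholders chosen so that probes are sent well within an idle period of a few minutes:

```
action(
    type="omrelp"
    target="someHost"
    port="5140"
    template="json"
    # ask the kernel to send TCP keepalive probes on this connection
    keepalive="on"
    # seconds of idle time before the first probe (placeholder value)
    keepalive.time="30"
    # seconds between probes (placeholder value)
    keepalive.interval="10"
    # unanswered probes before the connection counts as dead
    keepalive.probes="3"
)
```

Unlike MARK messages, keepalive works per TCP connection, so it would cover every action's connection regardless of which messages the filters route to it.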
messages have the facility that they have, you don't change the facility
any more than you re-write the message to say something different.
actually, in this case a config option would make sense. But again, I
thought this is just legacy...
Rainer
David Lang
Unfortunately can't find anything related to --MARK-- and facilities (or
anything else other than the two settings above).
Any ideas/pointers? Or if not possible to configure immark can I catch
the --MARK-- message and change its facility? Or catch the --MARK--
message and have action that uses omrelp and same target (would that use
same TCP connection)?
Thanks!
erik
David Lang
On Thu, 12 Dec 2013, Erik Steffl wrote:
Date: Thu, 12 Dec 2013 02:30:52 -0800
From: Erik Steffl <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: [rsyslog] immark - how to use with action(...)
How would I use immark to send mark messages for defined actions that
use omrelp?
I have tried something like this:
$ModLoad immark.so
$MarkMessagePeriod 60
if(..)
if
    prifilt("local0.*") or
    prifilt("local1.*") or
    prifilt("local2.*") or
    prifilt("local3.*") or
    prifilt("local4.*") or
    prifilt("local5.*") or
    prifilt("local6.*") or
    prifilt("local7.*")
then {
    action(type="mmjsonparse")
    if $parsesuccess == "OK" then {
        action(
            type="omrelp"
            target="someHost"
            port="5140"
            template="json"
            # see http://www.rsyslog.com/doc/node32.html
            # disk used if forwarding blocked
            queue.filename="json"
            queue.maxdiskspace="75161927680" # 70GB (valuable data)
            action.writeAllMarkMessages="on"
        )
    } else {
        ...
    }
}
I see --MARK-- messages in /var/log/syslog and /var/log/kern.log but
they are not sent by the omrelp action (the action works fine, normal
messages are going through).
Verified where the --MARK-- messages are going using strace so pretty
sure they are only going to those two local files, nothing goes over
RELP. Also checked on the receiving side of RELP, no incoming messages
have --MARK-- in them. And the connection goes down which is also a very
strong indicator that there are no --MARK-- messages.
How do I configure it so that the --MARK-- messages are sent over the
RELP protocol to someHost (over the same TCP connection that the given
action uses; the purpose is to keep the connection alive since RELP does
not support KeepAlive (yet, Rainer just added it to master, thanks!))?
This is on Ubuntu 13.10 using rsyslog 7.5.6, librelp 1.2.0 from
adiscon repo.
Thanks!
erik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
POST
if you DON'T LIKE THAT.