Changed the config as suggested, but it does not seem to retry
that often. Here's what tcpdump shows:
ubuntu@ip-10-158-97-169:~$ sudo tcpdump -A port 5140
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
01:05:43.652804 IP ip-10-158-97-169.ec2.internal.53118 >
ec2-50-19-250-187.compute-1.amazonaws.com.5140: Flags [P.], seq
3725470457:3725470859, ack 2328419555, win 237, options [nop,nop,TS val
840645376 ecr 1017783106], length 402
E.....@.@...
.a.2....~....*................
2.;.<.#B26 syslog 387 <134>2013-12-17T00:58:13.436136+00:00
ip-10-158-97-169 logBurst[3103]:
@cee:{"message-json":{"count":"0/1","now":"Tue Dec 17 00:58:13
2013"},"yummlyLogOrigin":{"supportLevel":"prod","system":
"LOGS","cluster":"prod","role":"collectorErik","host":"ip-
10-158-97-169","tag":"logBurst[3103]:","programname"
:"logBurst","priority":"local0.info","timestamp":"2013-12-
17T00:58:13.436136+00:00"}}
01:07:43.972781 IP ip-10-158-97-169.ec2.internal.53118 >
ec2-50-19-250-187.compute-1.amazonaws.com.5140: Flags [P.], seq 0:402,
ack 1, win 237, options [nop,nop,TS val 840675456 ecr 1017783106],
length 402
E.....@[email protected]
.a.2....~....*................
2...<.#B26 syslog 387 <134>2013-12-17T00:58:13.436136+00:00
ip-10-158-97-169 logBurst[3103]:
@cee:{"message-json":{"count":"0/1","now":"Tue Dec 17 00:58:13
2013"},"yummlyLogOrigin":{"supportLevel":"prod","system":
"LOGS","cluster":"prod","role":"collectorErik","host":"ip-
10-158-97-169","tag":"logBurst[3103]:","programname"
:"logBurst","priority":"local0.info","timestamp":"2013-12-
17T00:58:13.436136+00:00"}}
syslog 387 keeps repeating every two minutes.
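As a quick check of the "every two minutes" figure, here is a minimal sketch that computes the gap between the two tcpdump timestamps shown above. The helper name seconds_between is made up for illustration, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> float:
    """Seconds elapsed between two HH:MM:SS.ffffff timestamps (same day assumed)."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Timestamps of the two captured packets carrying "syslog 387"
gap = seconds_between("01:05:43.652804", "01:07:43.972781")
print(f"{gap:.3f} s between the two captured sends")
```

The gap comes out at roughly 120 seconds, which is far longer than the 5 second action.resumeInterval in the config.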
Config:
if
    prifilt("local0.*") or
    ...
    prifilt("local7.*")
then {
    action(type="mmjsonparse")
    if $parsesuccess == "OK" then {
        action(
            type="omrelp"
            target="elb.collector.prod.logs.ylmmuy.com"
            port="5140"
            template="json"
            queue.type="LinkedList"
            queue.filename="json"
            queue.maxdiskspace="75161927680" # 70GB (valuable data)
            action.resumeRetryCount="-1"
            action.resumeInterval="5"
        )
    } else {
        ...
    }
    stop
}
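A small sanity check on the queue.maxdiskspace value, assuming the "70GB" in the comment means binary gigabytes (1024^3 bytes each). The variable names are only for illustration.

```python
GIB = 1024 ** 3  # bytes per binary gigabyte (assumption: "GB" in the comment means GiB)

# Value of queue.maxdiskspace from the config above
max_disk_space = 75161927680

print(max_disk_space / GIB, "GiB")
```

Under that assumption the byte count matches the 70GB comment exactly.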
Same test as before (host to load balancer to another host using
RELP): no MARK, no other messages, just wait for the connection to go
stale, then start sending messages every 5 seconds. It takes about
15 minutes for it to recover.
First message (strace output):
3081 00:58:13.528459 sendto(13, "26 syslog 387
<134>2013-12-17T00:58:13.436136+00:00 ip-10-158-97-169 logBurst[3103]:
@cee:{\"message-json\":{\"count\":\"0/1\",\"now\":\"Tue Dec 17 00:58:13
2013\"},\"yummlyLogOrigin\":{\"supportLevel\":\"prod\",\"
system\":\"LOGS\",\"cluster\":\"prod\",\"role\":\"
collectorErik\",\"host\":\"ip-10-158-97-169\",\"tag\":\"
logBurst[3103]:\",\"programname\":\"logBurst\",\"priority\":\"local0.info
\",\"timestamp\":\"2013-12-17T00:58:13.436136+00:00\"}}\n\n",
402, 0, NULL, 0) = 402
...
3081 00:58:13.529411 setsockopt(13, SOL_TCP, TCP_CORK, [0], 4) = 0
...
3081 00:58:18.725420 setsockopt(13, SOL_TCP, TCP_CORK, [1], 4) = 0
...
3081 00:58:18.726657 sendto(13, "27 syslog 387 ... same as above ...
Just as tcpdump shows, the message is being resent (strace output
just like the one above) until:
3081 01:02:27.982896 sendto(13, "77 syslog 387
<134>2013-12-17T01:02:27.893264+00:00 ip-10-158-97-169 logBurst[3257]:
@cee:{\"message-json\":{\"count\":\"0/1\",\"now\":\"Tue Dec 17 01:02:27
2013\"},\"yummlyLogOrigin\":{\"supportLevel\":\"prod\",\"
system\":\"LOGS\",\"cluster\":\"prod\",\"role\":\"
collectorErik\",\"host\":\"ip-10-158-97-169\",\"tag\":\"
logBurst[3257]:\",\"programname\":\"logBurst\",\"priority\":\"local0.info
\",\"timestamp\":\"2013-12-17T01:02:27.893264+00:00\"}}\n\n",
402, 0, NULL, 0 <unfinished ...>
... other threads ...
3081 01:13:44.932842 <... sendto resumed> ) = 45
... writing debug info ...
3081 01:13:44.934579 setsockopt(13, SOL_SOCKET, SO_LINGER, {onoff=1,
linger=0}, 8) = 0
3081 01:13:44.934662 close(13) = 0
After this it recovers. The total time is 15 minutes or so. Is there
any way to shorten this time?
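To put numbers on the "15 minutes or so", here is a rough sketch that derives the elapsed times from the strace timestamps quoted above. The event labels and the helper name elapsed are made up for illustration, and all timestamps are assumed to be on the same day.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"

# (timestamp, event) pairs copied from the strace output above
EVENTS = [
    ("00:58:13.528459", "first sendto of 'syslog 387'"),
    ("01:02:27.982896", "sendto blocks (<unfinished ...>)"),
    ("01:13:44.932842", "blocked sendto returns"),
    ("01:13:44.934662", "close(13)"),
]


def elapsed(start: str, end: str) -> timedelta:
    """Time between two HH:MM:SS.ffffff timestamps (same day assumed)."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


blocked = elapsed(EVENTS[1][0], EVENTS[2][0])  # time spent inside the blocked sendto
total = elapsed(EVENTS[0][0], EVENTS[3][0])    # first send until the socket is closed

print("blocked in sendto:  ", blocked)
print("first send to close:", total)
```

That works out to a bit over 11 minutes spent inside the single blocked sendto, and about 15.5 minutes from the first send to the close of the socket.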
erik
On 12/16/2013 12:42 AM, Rainer Gerhards wrote:
On Mon, Dec 16, 2013 at 9:35 AM, Erik Steffl <[email protected]> wrote:
if
    prifilt("local0.*") or
    ...
    (prifilt("kern.info") and ($msg == '-- MARK --'))
then {
    action(type="mmjsonparse")
    if $parsesuccess == "OK" then {
        action(
            type="omrelp"
            target="elb.collector.prod.logs.ylmmuy.com"
            port="5140"
            template="json"
        )
    } else {
        action(
            type="omrelp"
            target="elb.collector.prod.logs.ylmmuy.com"
            port="5140"
            template="text"
        )
    }
    stop
}
that's what I suspected. You use the defaults, which means "disable me for
30 seconds if the connections break continuously". Try using
action(
    type="omrelp"
    target="elb.collector.prod.logs.ylmmuy.com"
    port="5140"
    template="text"
    *action.resumeRetryCount="-1"*
    *action.resumeInterval="5"*
)
to get you started. It will try infinitely to send messages, but will
pause 5 seconds between retries. Note that you may run into trouble if
the destination is offline for an extended period of time.
http://www.rsyslog.com/doc/omrelp.html don't see the retry settings, are
these some generic action retries?
action parameters applying to all actions:
http://www.rsyslog.com/doc/rsyslog_conf_actions.html
(you know the doc discussion, so no need to explain it may be unintuitive
to find ;-))
Rainer