Changed the config as suggested; however, it does not seem to retry that often. Here's what tcpdump shows:

ubuntu@ip-10-158-97-169:~$ sudo tcpdump -A port 5140
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
01:05:43.652804 IP ip-10-158-97-169.ec2.internal.53118 > ec2-50-19-250-187.compute-1.amazonaws.com.5140: Flags [P.], seq 3725470457:3725470859, ack 2328419555, win 237, options [nop,nop,TS val 840645376 ecr 1017783106], length 402
E.....@.@...
.a.2....~....*................
2.;.<.#B26 syslog 387 <134>2013-12-17T00:58:13.436136+00:00 ip-10-158-97-169 logBurst[3103]: @cee:{"message-json":{"count":"0/1","now":"Tue Dec 17 00:58:13 2013"},"yummlyLogOrigin":{"supportLevel":"prod","system":"LOGS","cluster":"prod","role":"collectorErik","host":"ip-10-158-97-169","tag":"logBurst[3103]:","programname":"logBurst","priority":"local0.info","timestamp":"2013-12-17T00:58:13.436136+00:00"}}


01:07:43.972781 IP ip-10-158-97-169.ec2.internal.53118 > ec2-50-19-250-187.compute-1.amazonaws.com.5140: Flags [P.], seq 0:402, ack 1, win 237, options [nop,nop,TS val 840675456 ecr 1017783106], length 402
E.....@[email protected]
.a.2....~....*................
2...<.#B26 syslog 387 <134>2013-12-17T00:58:13.436136+00:00 ip-10-158-97-169 logBurst[3103]: @cee:{"message-json":{"count":"0/1","now":"Tue Dec 17 00:58:13 2013"},"yummlyLogOrigin":{"supportLevel":"prod","system":"LOGS","cluster":"prod","role":"collectorErik","host":"ip-10-158-97-169","tag":"logBurst[3103]:","programname":"logBurst","priority":"local0.info","timestamp":"2013-12-17T00:58:13.436136+00:00"}}

  syslog 387 keeps repeating every two minutes.

  Config:

if
  prifilt("local0.*") or
  ...
  prifilt("local7.*")
then {
  action(type="mmjsonparse")
  if $parsesuccess == "OK" then {
    action(
      type="omrelp"
      target="elb.collector.prod.logs.ylmmuy.com"
      port="5140"
      template="json"
      queue.type="LinkedList"
      queue.filename="json"
      queue.maxdiskspace="75161927680" # 70GB (valuable data)
      action.resumeRetryCount="-1"
      action.resumeInterval="5"
    )
  } else {
...
  }
  stop
}

Same test as before (host to load balancer to another host using RELP): no MARK, no other messages; just wait for the connection to go stale, then start sending messages every 5 seconds. It takes about 15 minutes to recover.

  First message (strace output):

3081 00:58:13.528459 sendto(13, "26 syslog 387 <134>2013-12-17T00:58:13.436136+00:00 ip-10-158-97-169 logBurst[3103]: @cee:{\"message-json\":{\"count\":\"0/1\",\"now\":\"Tue Dec 17 00:58:13 2013\"},\"yummlyLogOrigin\":{\"supportLevel\":\"prod\",\"system\":\"LOGS\",\"cluster\":\"prod\",\"role\":\"collectorErik\",\"host\":\"ip-10-158-97-169\",\"tag\":\"logBurst[3103]:\",\"programname\":\"logBurst\",\"priority\":\"local0.info\",\"timestamp\":\"2013-12-17T00:58:13.436136+00:00\"}}\n\n", 402, 0, NULL, 0) = 402
...
3081  00:58:13.529411 setsockopt(13, SOL_TCP, TCP_CORK, [0], 4) = 0
...
3081  00:58:18.725420 setsockopt(13, SOL_TCP, TCP_CORK, [1], 4) = 0
...
3081  00:58:18.726657 sendto(13, "27 syslog 387 ... same as above ...

Just as tcpdump shows, the message keeps being resent (strace output like the one above) until:

3081 01:02:27.982896 sendto(13, "77 syslog 387 <134>2013-12-17T01:02:27.893264+00:00 ip-10-158-97-169 logBurst[3257]: @cee:{\"message-json\":{\"count\":\"0/1\",\"now\":\"Tue Dec 17 01:02:27 2013\"},\"yummlyLogOrigin\":{\"supportLevel\":\"prod\",\"system\":\"LOGS\",\"cluster\":\"prod\",\"role\":\"collectorErik\",\"host\":\"ip-10-158-97-169\",\"tag\":\"logBurst[3257]:\",\"programname\":\"logBurst\",\"priority\":\"local0.info\",\"timestamp\":\"2013-12-17T01:02:27.893264+00:00\"}}\n\n", 402, 0, NULL, 0 <unfinished ...>
... other threads ...
3081  01:13:44.932842 <... sendto resumed> ) = 45
... writing debug info ...
3081 01:13:44.934579 setsockopt(13, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
3081  01:13:44.934662 close(13)         = 0

After this it recovers. The total time is 15 minutes or so. Is there any way to shorten this time?
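
For what it's worth, the "15 minutes or so" is close to what Linux's default TCP retransmission limit would give. This is an assumption on my part, not something confirmed in this thread: with net.ipv4.tcp_retries2 at its default of 15, an initial retransmission timeout of 200 ms and a cap of 120 s, the kernel keeps retransmitting an unacknowledged segment with exponential backoff before it reports an error to the application. The two-minute spacing in the tcpdump output and the roughly 931 seconds between the first sendto (00:58:13) and the close (01:13:44) would fit that picture. A minimal sketch of the arithmetic (the function name and its parameters are made up for illustration; the real RTO depends on the measured round-trip time):

```python
def hypothetical_tcp_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the waits before the kernel gives up on an unacknowledged segment.

    Assumptions (not taken from this thread): the first wait is rto_min
    seconds, each following wait doubles, waits are capped at rto_max, and
    there are `retries` retransmissions, i.e. retries + 1 waits in total.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)  # exponential backoff with a ceiling
    return total


if __name__ == "__main__":
    seconds = hypothetical_tcp_timeout()
    print(f"{seconds:.1f} s (~{seconds / 60:.1f} min)")  # 924.6 s (~15.4 min)
```

If that is indeed what is happening here, lowering tcp_retries2 on the sending host might shorten the recovery time, at the cost of giving up sooner on every TCP connection on that host.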

        erik

On 12/16/2013 12:42 AM, Rainer Gerhards wrote:
On Mon, Dec 16, 2013 at 9:35 AM, Erik Steffl <[email protected]> wrote:


if
   prifilt("local0.*") or
   ...
   (prifilt("kern.info") and ($msg == '-- MARK --'))

then {
   action(type="mmjsonparse")
   if $parsesuccess == "OK" then {
     action(
       type="omrelp"
       target="elb.collector.prod.logs.ylmmuy.com"
       port="5140"
       template="json"
     )
   } else {
     action(
       type="omrelp"
       target="elb.collector.prod.logs.ylmmuy.com"
       port="5140"
       template="text"
     )
   }
   stop
}


that's what I suspected. You use the defaults, which means "disable me for
30 seconds if the connections break continuously". Try

     action(
       type="omrelp"
       target="elb.collector.prod.logs.ylmmuy.com"
       port="5140"
       template="text"
       action.resumeRetryCount="-1"
       action.resumeInterval="5"
     )

  to get you started. It will try infinitely to send messages, but will
pause 5 seconds between retries. Note that you may run into trouble if the
destination is offline for an extended period of time.

   I don't see the retry settings at http://www.rsyslog.com/doc/omrelp.html; are
these some generic action retries?


action parameters applying to all actions:

http://www.rsyslog.com/doc/rsyslog_conf_actions.html

(you know the doc discussion, so no need to explain it may be unintuitive
to find ;-))

Rainer
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

