25.11.2013 15:06, Pavel Levshin:
- 6 hosts that were sending syslog messages to the log collector STOP
sending anything (as verified with strace on rsyslogd, with tcpdump, and
by the network-in metric in the Amazon AWS console)
- after this nothing is ever written into
/path/2013-10-10/03/00/log.json
- the 6 clients start sending syslog messages again when the next file
is created (in this example it would be /path/2013-10-10/03/01/log.json)
So, most probably, rsyslog is unable to write the file.
OK, the problem is not the file but omrelp. From the debug log:
8333.028788456:7fe205cce700: relp session 0x7fe1fc006d70 flagged as
broken, IO error
8333.028816098:7fe205cce700: error forwarding via relp, suspending
8333.028826395:7fe205cce700: Action 0x2676750 transitioned to state: rtry
8333.028836333:7fe205cce700: action 0x2676750 call returned -2007
8333.028845453:7fe205cce700: tryDoAction: unexpected error code
-2007[nElem 147, Commited UpTo 32], finalizing
8333.028854197:7fe205cce700: action ret RS_RET_SUSPENDED - retry full batch
8333.028862577:7fe205cce700: tryDoAction 0x2676750, pnElem 147, nElem 147
8333.028873041:7fe205cce700: actionDoRetry: enter loop, iRetries=0
8333.039382517:7fe205cce700: frame to send: '1 open 85 relp_version=0
relp_software=librelp,1.2.0,http://librelp.adiscon.com
commands=syslog
'
8333.039423945:7fe205cce700: ADD sess 0x7fe1fc006d70 unacked 90, sessState 1
8333.039434614:7fe205cce700: sendbuf added to unacked list
8333.039449223:7fe205cce700: relp session read -1 octets, buf '@T̅㿧
8333.039465005:7fe205cce700: relpSessWaitRsp waiting for data on fd 15,
timeout 90.0
8392.214176529:7fe205cce700: relpSessWaitRsp select returns, nfds 1,
errno 11
8392.214221684:7fe205cce700: relp session read 0 octets, buf ''
8392.214231351:7fe205cce700: server closed relp session 0x7fe1fc006d70,
session broken
8392.214239828:7fe205cce700: relpSessWaitState returns 10007
8392.214248528:7fe205cce700: end relpSessConnect, iRet 10007
8392.214271434:7fe205cce700: after TryReestablish, sess state 9
8392.214281794:7fe205cce700: actionDoRetry: action->tryResume returned -2007
8392.214290803:7fe205cce700: actionDoRetry: check for max retries,
iResumeRetryCount 0, iRetries 0
8392.214300988:7fe205cce700: Action 0x2676750 transitioned to state: susp
8392.214309092:7fe205cce700: action suspended, earliest retry=1385088422
8392.214318263:7fe205cce700: action 0x2676750 call returned -2007
8392.214327180:7fe205cce700: tryDoAction: unexpected error code
-2007[nElem 147, Commited UpTo 32], finalizing
8392.214335702:7fe205cce700: action ret RS_RET_SUSPENDED - retry full batch
8392.214343853:7fe205cce700: tryDoAction 0x2676750, pnElem 147, nElem 147
8392.214354236:7fe205cce700: actionTryResume: action 0x2676750 state:
susp, next retry (if applicable): 1385088422 [now 1385088392]
8392.214363589:7fe205cce700: omrelp: shutdownImmediate ptr now is 0x26754b8
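
To make the timing in this log easier to see, here is a minimal Python sketch that finds long gaps between consecutive debug lines and reads the retry delay from the actionTryResume line. It is only an illustration: the function names (parse_debug_lines, find_stalls, retry_delay) and the 5-second threshold are my own assumptions, not part of rsyslog or librelp, and the line format is inferred from the excerpt above. On these lines it should show a stall of roughly 59 seconds while relpSessWaitRsp waits on fd 15, and a retry scheduled 30 seconds after "now".

```python
import re

# Assumed line format, inferred from the excerpt above:
#   "<seconds>.<nanoseconds>:<thread id>: <message>"
LINE_RE = re.compile(r"^(\d+\.\d+):([0-9a-f]+): (.*)$")
RETRY_RE = re.compile(r"next retry \(if applicable\): (\d+) \[now (\d+)\]")

# A few lines copied from the debug log above, including the mail line wraps.
SAMPLE = """\
8333.039465005:7fe205cce700: relpSessWaitRsp waiting for data on fd 15,
timeout 90.0
8392.214176529:7fe205cce700: relpSessWaitRsp select returns, nfds 1,
errno 11
8392.214354236:7fe205cce700: actionTryResume: action 0x2676750 state:
susp, next retry (if applicable): 1385088422 [now 1385088392]
"""


def parse_debug_lines(text):
    """Return (timestamp, thread, message) tuples.

    Lines without a timestamp prefix are treated as wrapped continuations
    and appended to the previous message.
    """
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if match:
            entries.append([float(match.group(1)), match.group(2), match.group(3)])
        elif entries:
            entries[-1][2] += " " + raw.strip()
    return [tuple(entry) for entry in entries]


def find_stalls(entries, threshold=5.0):
    """Yield (gap_seconds, message_before, message_after) for large gaps."""
    for (t0, _, before), (t1, _, after) in zip(entries, entries[1:]):
        gap = t1 - t0
        if gap > threshold:
            yield gap, before, after


def retry_delay(entries):
    """Return seconds between 'now' and the next retry, or None if not logged."""
    for _, _, message in entries:
        match = RETRY_RE.search(message)
        if match:
            return int(match.group(1)) - int(match.group(2))
    return None


if __name__ == "__main__":
    parsed = parse_debug_lines(SAMPLE)
    for gap, before, after in find_stalls(parsed):
        print(f"{gap:.3f}s between '{before}' and '{after}'")
    print("retry delay (s):", retry_delay(parsed))
```

Running the same functions over a complete debug log would list every place where the omrelp worker thread sat blocked, which may help to correlate the stalls with the moments the clients stop sending.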
--
Pavel Levshin