2016-04-21 8:25 GMT+02:00 Kane Kim <[email protected]>: > What would happen if it will be retried forever at this point? Would it > affect any other queues or only current action queue. If it will not affect > anything else I'd really prefer it to block there until successfully > committed.
I need to read the code to be 100% sure. But I would expect that it holds processing of just the queue where it is running on (98% sure). Rainer > > On Wed, Apr 20, 2016 at 11:16 PM, Rainer Gerhards <[email protected]> > wrote: > >> 2016-04-21 3:22 GMT+02:00 David Lang <[email protected]>: >> > Thanks, I didn't know that you had been able to take any time to think >> about >> > this issue yet. >> >> Sorry for not being more explicit about it. But there was not yet >> anything that could be posted. I did this as a side-activity all since >> this opened. The plan acutally is: >> >> 1. get a testbench test to expose that problem -- this greatly eases >> debugging >> line of thought is to update omfile, which is needed anyhow, so >> that we can induce errors via it >> 2. add missing functionality to write error file >> this does not directly address the issue, but is useful and >> ensures we are on the right path >> 3. check the end transaction endpoint. After review of old notes and >> some thinking, I think that >> part of the "not yet implemented" comment is actually right. It's >> not as easy as suggested by it, >> but we probably can do better. >> >> > >> > It sounds like you are tenitively agreeing with my speculation that the >> end >> > of the block processing is happening just a little sooner than it should, >> > preventing the chance to detect output errors. >> >> No, I don't think so. I think it happens as it should, but we are not >> handling the full situation. Originally, it was planned to write an >> error file (point 2 above) when we cannot commit at end of >> transaction. But we probably can give it some retries at that point as >> well. In v8, we **need** to buffer transactional data until the end of >> batch. If we wouldn't do that, we would always need to commit after >> each and every transaction (because we need processing "completed" >> when we advance to the next action). The, at end of batch, the >> transaction commit processing happens. Right now, we have one try, and >> then give up (unfortunately without any error indication). So we need >> to go into that part of processing and enhance/fix it. >> >> I hope this clarifies. >> Rainer >> >> > >> > David Lang >> > >> > >> > On Wed, 20 Apr 2016, Rainer Gerhards wrote: >> > >> >> I should also say that it had a release goal, so it is on my active >> to-do >> >> list. >> >> >> >> Sent from phone, thus brief. >> >> Am 20.04.2016 19:40 schrieb "Rainer Gerhards" <[email protected] >> >: >> >> >> >>> Yup... Look aty work Log - still no time... Was meant for last release >> >>> but >> >>> then came libfastjson work. >> >>> >> >>> Sent from phone, thus brief. >> >>> Am 20.04.2016 19:39 schrieb "David Lang" <[email protected]>: >> >>> >> >>>> Rainer, ping on this issue. >> >>>> >> >>>> David Lang >> >>>> >> >>>> On Tue, 29 Mar 2016, Kane Kim wrote: >> >>>> >> >>>> Date: Tue, 29 Mar 2016 10:46:49 -0700 >> >>>>> >> >>>>> From: Kane Kim <[email protected]> >> >>>>> Reply-To: rsyslog-users <[email protected]> >> >>>>> To: rsyslog-users <[email protected]> >> >>>>> Subject: Re: [rsyslog] retry if output module returns >> RS_RET_SUSPENDED >> >>>>> >> >>>>> Updated config to: >> >>>>> module(load="omrelp") >> >>>>> local0.* action(type="omrelp" >> >>>>> target="10.0.0.1" >> >>>>> port="2514" >> >>>>> name="debug_relp" >> >>>>> queue.filename="relp_queue" >> >>>>> queue.maxdiskspace="1G" >> >>>>> queue.saveonshutdown="on" >> >>>>> queue.spoolDirectory="/var/spool/rsyslog" >> >>>>> queue.type="Disk" >> >>>>> action.resumeRetryCount="-1" >> >>>>> action.resumeInterval="1" >> >>>>> ) >> >>>>> >> >>>>> Still message is lost: >> >>>>> >> >>>>> 3489.658349703:imuxsock.c : Message from UNIX socket: #3 >> >>>>> 3489.658367582:imuxsock.c : main Q: qqueueAdd: entry added, size >> >>>>> now >> >>>>> log 1, phys 1 entries >> >>>>> 3489.658372561:imuxsock.c : main Q: EnqueueMsg advised worker >> start >> >>>>> 3489.658374985:imuxsock.c : --------imuxsock calling select, >> active >> >>>>> file descriptors (max 3): 3 >> >>>>> 3489.658384201:main Q:Reg/w0 : wti 0x23c8ba0: worker awoke from idle >> >>>>> processing >> >>>>> 3489.658386792:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 >> >>>>> objects >> >>>>> and enqueued 0 objects >> >>>>> 3489.658388548:main Q:Reg/w0 : doDeleteBatch: delete batch from >> store, >> >>>>> new >> >>>>> sizes: log 1, phys 1 >> >>>>> 3489.658391356:main Q:Reg/w0 : processBATCH: batch of 1 elements >> must >> >>>>> be >> >>>>> processed >> >>>>> 3489.658392786:main Q:Reg/w0 : processBATCH: next msg 0: <134>Mar 29 >> >>>>> 17:44:49 logger: test >> >>>>> 3489.658395027:main Q:Reg/w0 : PRIFILT 'local0.*' >> >>>>> 3489.658397540:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X X >> >>>>> X X X X X FF X X X X X X X X X >> >>>>> 3489.658413235:main Q:Reg/w0 : PRIFILT condition result is 1 >> >>>>> 3489.658414686:main Q:Reg/w0 : ACTION 0 >> >>>>> [omrelp:action(type="omrelp" >> >>>>> ...)] >> >>>>> 3489.658417648:main Q:Reg/w0 : executing action 0 >> >>>>> 3489.658419520:main Q:Reg/w0 : action 'debug_relp': called, logging >> to >> >>>>> omrelp (susp 0/0, direct q 0) >> >>>>> 3489.658433462:main Q:Reg/w0 : strm 0x23b74c0: file -1(relp_queue) >> >>>>> flush, >> >>>>> buflen 410 >> >>>>> 3489.658435496:main Q:Reg/w0 : strmPhysWrite, stream 0x23b74c0, len >> >>>>> 410 >> >>>>> 3489.658484243:main Q:Reg/w0 : file >> >>>>> '/var/spool/rsyslog/relp_queue.00000001' opened as #5 with mode 384 >> >>>>> 3489.658487347:main Q:Reg/w0 : strm 0x23b74c0: opened file >> >>>>> '/var/spool/rsyslog/relp_queue.00000001' for WRITE as 5 >> >>>>> 3489.658500498:main Q:Reg/w0 : strm 0x23b74c0: file 5 write wrote >> 410 >> >>>>> bytes >> >>>>> 3489.658502751:main Q:Reg/w0 : debug_relp queue: write wrote 410 >> >>>>> octets >> >>>>> to >> >>>>> disk, queue disk size now 410 octets, EnqOnly:0 >> >>>>> 3489.658504568:main Q:Reg/w0 : debug_relp queue: qqueueAdd: entry >> >>>>> added, >> >>>>> size now log 1, phys 1 entries >> >>>>> 3489.658506705:main Q:Reg/w0 : debug_relp queue:Reg: high activity - >> >>>>> starting 1 additional worker thread(s). >> >>>>> 3489.658528951:main Q:Reg/w0 : debug_relp queue:Reg: started with >> >>>>> state >> >>>>> 0, >> >>>>> num workers now 1 >> >>>>> 3489.658530689:main Q:Reg/w0 : debug_relp queue: EnqueueMsg advised >> >>>>> worker >> >>>>> start >> >>>>> 3489.658532183:main Q:Reg/w0 : action 'debug_relp': set suspended >> >>>>> state >> >>>>> to >> >>>>> 0 >> >>>>> 3489.658533672:main Q:Reg/w0 : PRIFILT >> >>>>> '*.info;mail.none;authpriv.none;cron.none' >> >>>>> 3489.658536164:main Q:Reg/w0 : pmask: 7F 7F X 7F 7F 7F 7F 7F 7F >> >>>>> X X >> >>>>> 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F >> >>>>> 3489.658552240:main Q:Reg/w0 : PRIFILT condition result is 1 >> >>>>> 3489.658553393:main Q:Reg/w0 : ACTION 1 >> >>>>> [builtin:omfile:/var/log/messages] >> >>>>> 3489.658556301:main Q:Reg/w0 : executing action 1 >> >>>>> 3489.658557878:main Q:Reg/w0 : action 'action 1': called, logging to >> >>>>> builtin:omfile (susp 0/0, direct q 1) >> >>>>> 3489.658561265:main Q:Reg/w0 : action 'action 1': is transactional - >> >>>>> executing in commit phase >> >>>>> 3489.658563696:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>> 3489.658564902:main Q:Reg/w0 : action 'action 1': set suspended >> state >> >>>>> to 0 >> >>>>> 3489.658566053:main Q:Reg/w0 : PRIFILT 'authpriv.*' >> >>>>> 3489.658568233:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X >> >>>>> FF >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3489.658582888:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658584058:main Q:Reg/w0 : PRIFILT 'mail.*' >> >>>>> 3489.658586243:main Q:Reg/w0 : pmask: X X FF X X X X X X >> >>>>> X X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3489.658600961:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658602136:main Q:Reg/w0 : PRIFILT 'cron.*' >> >>>>> 3489.658604562:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> FF X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3489.658619159:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658620352:main Q:Reg/w0 : PRIFILT '*.emerg' >> >>>>> 3489.658622551:main Q:Reg/w0 : pmask: 1 1 1 1 1 1 1 1 1 >> >>>>> 1 1 >> >>>>> 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 >> >>>>> 3489.658638435:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658639529:main Q:Reg/w0 : PRIFILT 'uucp,news.crit' >> >>>>> 3489.658641711:main Q:Reg/w0 : pmask: X X X X X X X 7 7 >> >>>>> X X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3489.658656380:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658657532:main Q:Reg/w0 : PRIFILT 'local7.*' >> >>>>> 3489.658659794:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X X >> >>>>> X X X X X X X X X X X X FF X X >> >>>>> 3489.658674346:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3489.658675686:main Q:Reg/w0 : END batch execution phase, entering >> to >> >>>>> commit phase [processed 1 of 1 messages] >> >>>>> 3489.658677400:main Q:Reg/w0 : actionCommitAll: action 1, state 1, >> nbr >> >>>>> to >> >>>>> commit 0 isTransactional 1 >> >>>>> 3489.658679044:main Q:Reg/w0 : doTransaction: have commitTransaction >> >>>>> IF, >> >>>>> using that, pWrkrInfo 0x23c8ce0 >> >>>>> 3489.658680668:main Q:Reg/w0 : entering >> actionCallCommitTransaction(), >> >>>>> state: itx, actionNbr 1, nMsgs 1 >> >>>>> 3489.658682704:main Q:Reg/w0 : omfile: write to stream, pData->pStrm >> >>>>> 0x7f1ef4002480, lenBuf 42, strt data Mar 29 17:44:49 314959ac8cb0 >> >>>>> logger: >> >>>>> test >> >>>>> >> >>>>> 3489.658685296:main Q:Reg/w0 : strm 0x7f1ef4002480: file 4(messages) >> >>>>> flush, buflen 42 >> >>>>> 3489.658686866:main Q:Reg/w0 : strmPhysWrite, stream 0x7f1ef4002480, >> >>>>> len 42 >> >>>>> 3489.658696046:main Q:Reg/w0 : strm 0x7f1ef4002480: file 4 write >> wrote >> >>>>> 42 >> >>>>> bytes >> >>>>> 3489.658697884:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>> 3489.658699301:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>> 3489.658700575:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>> 3489.658701773:main Q:Reg/w0 : actionCommit, in retry loop, iRet 0 >> >>>>> 3489.658703176:main Q:Reg/w0 : processBATCH: batch of 1 elements has >> >>>>> been >> >>>>> processed >> >>>>> 3489.658704989:main Q:Reg/w0 : regular consumer finished, iret=0, >> >>>>> szlog >> >>>>> 0 >> >>>>> sz phys 1 >> >>>>> 3489.658707214:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 1 >> >>>>> objects >> >>>>> and enqueued 0 objects >> >>>>> 3489.658708626:main Q:Reg/w0 : doDeleteBatch: delete batch from >> store, >> >>>>> new >> >>>>> sizes: log 0, phys 0 >> >>>>> 3489.658710186:main Q:Reg/w0 : regular consumer finished, iret=4, >> >>>>> szlog >> >>>>> 0 >> >>>>> sz phys 0 >> >>>>> 3489.658711550:main Q:Reg/w0 : main Q:Reg/w0: worker IDLE, waiting >> for >> >>>>> work. >> >>>>> 3489.660013052:7f1efbb14700: thread created, tid 30310, name >> >>>>> 'rs:debug_relp >> >>>>> queue:Reg' >> >>>>> 3489.660298870:debug_relp queue:Reg/w0: wti 0x23c78b0: worker >> starting >> >>>>> 3489.660645336:debug_relp queue:Reg/w0: DeleteProcessedBatch: we >> >>>>> deleted >> >>>>> 0 >> >>>>> objects and enqueued 0 objects >> >>>>> 3489.660922752:debug_relp queue:Reg/w0: doDeleteBatch: delete batch >> >>>>> from >> >>>>> store, new sizes: log 1, phys 1 >> >>>>> 3489.660947418:debug_relp queue:Reg/w0: file >> >>>>> '/var/spool/rsyslog/relp_queue.00000001' opened as #6 with mode 384 >> >>>>> 3489.660959920:debug_relp queue:Reg/w0: strm 0x23b7760: opened file >> >>>>> '/var/spool/rsyslog/relp_queue.00000001' for READ as 6 >> >>>>> 3489.660967370:debug_relp queue:Reg/w0: strm 0x23b7760: file 6 read >> 410 >> >>>>> bytes >> >>>>> 3489.661768120:debug_relp queue:Reg/w0: action 'debug_relp': is >> >>>>> transactional - executing in commit phase >> >>>>> 3489.661782408:debug_relp queue:Reg/w0: wti 0x23c78b0: we need to >> >>>>> create >> >>>>> a >> >>>>> new action worker instance for action 0 >> >>>>> 3489.661789583:debug_relp queue:Reg/w0: relp engine created new >> client >> >>>>> 0x7f1ef00027e0 >> >>>>> 3489.661794073:debug_relp queue:Reg/w0: wti 0x23c78b0: created action >> >>>>> worker instance 1 for action 0 >> >>>>> 3489.661798129:debug_relp queue:Reg/w0: omrelp: beginTransaction >> >>>>> ^C3497.710058579:main thread : DoDie called. >> >>>>> 3497.710272055:main thread : exiting on signal 2 >> >>>>> 3497.710576430:main thread : Terminating input threads... >> >>>>> 3497.710765753:main thread : request term via SIGTTIN for input >> >>>>> thread >> >>>>> 'imuxsock' 0xfcb16700 >> >>>>> 3497.711033925:imuxsock.c : thrdStarter: usrThrdMain imuxsock - >> >>>>> 0x7f1efcb16700 returned with iRet 0, exiting now. >> >>>>> 3497.711447372:main thread : non-cancel input thread termination >> >>>>> succeeded for thread imuxsock 0xfcb16700 >> >>>>> 3497.713028584:imuxsock.c : destructor for debug call stack >> >>>>> 0x7f1eec0008c0 called >> >>>>> 3497.713390963:main thread : imuxsock: unlinking unix socket >> file[0] >> >>>>> /dev/log >> >>>>> 3497.713734513:main thread : main Q: qqueueAdd: entry added, size >> >>>>> now >> >>>>> log 1, phys 1 entries >> >>>>> 3497.713906635:main thread : main Q: EnqueueMsg advised worker >> start >> >>>>> 3497.714257550:main Q:Reg/w0 : wti 0x23c8ba0: worker awoke from idle >> >>>>> processing >> >>>>> 3497.714544345:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 >> >>>>> objects >> >>>>> and enqueued 0 objects >> >>>>> 3497.715372923:main Q:Reg/w0 : doDeleteBatch: delete batch from >> store, >> >>>>> new >> >>>>> sizes: log 1, phys 1 >> >>>>> 3497.715611163:main Q:Reg/w0 : processBATCH: batch of 1 elements >> must >> >>>>> be >> >>>>> processed >> >>>>> 3497.715762447:main Q:Reg/w0 : processBATCH: next msg 0: [origin >> >>>>> software="rsyslogd" swVersion="8.18.0.master" x-pid="30306" x-info=" >> >>>>> http://www.rsyslog.com"] exiting on signal 2. >> >>>>> 3497.715965911:main Q:Reg/w0 : PRIFILT 'local0.*' >> >>>>> 3497.715976154:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X X >> >>>>> X X X X X FF X X X X X X X X X >> >>>>> 3497.716332698:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.716337713:main Q:Reg/w0 : PRIFILT >> >>>>> '*.info;mail.none;authpriv.none;cron.none' >> >>>>> 3497.716347729:main Q:Reg/w0 : pmask: 7F 7F X 7F 7F 7F 7F 7F 7F >> >>>>> X X >> >>>>> 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F >> >>>>> 3497.716424684:main Q:Reg/w0 : PRIFILT condition result is 1 >> >>>>> 3497.716429241:main Q:Reg/w0 : ACTION 1 >> >>>>> [builtin:omfile:/var/log/messages] >> >>>>> 3497.716439928:main Q:Reg/w0 : executing action 1 >> >>>>> 3497.716445132:main Q:Reg/w0 : action 'action 1': called, logging to >> >>>>> builtin:omfile (susp 0/0, direct q 1) >> >>>>> 3497.716452142:main Q:Reg/w0 : action 'action 1': is transactional - >> >>>>> executing in commit phase >> >>>>> 3497.716457817:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>> 3497.716462371:main Q:Reg/w0 : action 'action 1': set suspended >> state >> >>>>> to 0 >> >>>>> 3497.716466771:main Q:Reg/w0 : PRIFILT 'authpriv.*' >> >>>>> 3497.716476330:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X >> >>>>> FF >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3497.716796250:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.716802372:main Q:Reg/w0 : PRIFILT 'mail.*' >> >>>>> 3497.716811926:main Q:Reg/w0 : pmask: X X FF X X X X X X >> >>>>> X X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3497.717436983:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.717444114:main Q:Reg/w0 : PRIFILT 'cron.*' >> >>>>> 3497.717454023:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> FF X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3497.717823841:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.717829094:main Q:Reg/w0 : PRIFILT '*.emerg' >> >>>>> 3497.717838767:main Q:Reg/w0 : pmask: 1 1 1 1 1 1 1 1 1 >> >>>>> 1 1 >> >>>>> 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 >> >>>>> 3497.717916499:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.717920817:main Q:Reg/w0 : PRIFILT 'uucp,news.crit' >> >>>>> 3497.717930304:main Q:Reg/w0 : pmask: X X X X X X X 7 7 >> >>>>> X X >> >>>>> X X X X X X X X X X X X X X X >> >>>>> 3497.718043828:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.718044097:main Q:Reg/w0 : PRIFILT 'local7.*' >> >>>>> 3497.718055458:main Q:Reg/w0 : pmask: X X X X X X X X X >> >>>>> X X >> >>>>> X X X X X X X X X X X X FF X X >> >>>>> 3497.718838257:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>> 3497.718842803:main Q:Reg/w0 : END batch execution phase, entering >> to >> >>>>> commit phase [processed 1 of 1 messages] >> >>>>> 3497.718847686:main Q:Reg/w0 : actionCommitAll: action 1, state 1, >> nbr >> >>>>> to >> >>>>> commit 0 isTransactional 1 >> >>>>> 3497.718852750:main Q:Reg/w0 : doTransaction: have commitTransaction >> >>>>> IF, >> >>>>> using that, pWrkrInfo 0x23c8ce0 >> >>>>> 3497.718857172:main Q:Reg/w0 : entering >> actionCallCommitTransaction(), >> >>>>> state: itx, actionNbr 1, nMsgs 1 >> >>>>> 3497.718862516:main Q:Reg/w0 : omfile: write to stream, pData->pStrm >> >>>>> 0x7f1ef4002480, lenBuf 161, strt data Mar 29 17:44:57 314959ac8cb0 >> >>>>> rsyslogd: [origin software="rsyslogd" swVersion="8.18.0.master" >> >>>>> x-pid="30306" x-info="http://www.rs >> >>>>> 3497.718868427:main Q:Reg/w0 : strm 0x7f1ef4002480: file 4(messages) >> >>>>> flush, buflen 161 >> >>>>> 3497.718873173:main Q:Reg/w0 : strmPhysWrite, stream 0x7f1ef4002480, >> >>>>> len >> >>>>> 161 >> >>>>> 3497.718890732:main Q:Reg/w0 : strm 0x7f1ef4002480: file 4 write >> wrote >> >>>>> 161 >> >>>>> bytes >> >>>>> 3497.718896519:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>> 3497.718901151:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>> 3497.718905316:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>> 3497.718909316:main Q:Reg/w0 : actionCommit, in retry loop, iRet 0 >> >>>>> 3497.718913541:main Q:Reg/w0 : processBATCH: batch of 1 elements has >> >>>>> been >> >>>>> processed >> >>>>> 3497.718918158:main Q:Reg/w0 : regular consumer finished, iret=0, >> >>>>> szlog >> >>>>> 0 >> >>>>> sz phys 1 >> >>>>> 3497.718923739:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 1 >> >>>>> objects >> >>>>> and enqueued 0 objects >> >>>>> 3497.718928090:main Q:Reg/w0 : doDeleteBatch: delete batch from >> store, >> >>>>> new >> >>>>> sizes: log 0, phys 0 >> >>>>> 3497.718932606:main Q:Reg/w0 : regular consumer finished, iret=4, >> >>>>> szlog >> >>>>> 0 >> >>>>> sz phys 0 >> >>>>> 3497.718936888:main Q:Reg/w0 : main Q:Reg/w0: worker IDLE, waiting >> for >> >>>>> work. >> >>>>> 3497.718942403:main thread : Terminating main queue... >> >>>>> 3497.718946280:main thread : main Q: initiating worker thread >> >>>>> shutdown >> >>>>> sequence >> >>>>> 3497.718950170:main thread : main Q: trying shutdown of regular >> >>>>> workers >> >>>>> 3497.718957067:main thread : sent SIGTTIN to worker thread >> >>>>> 0xfc315700 >> >>>>> 3497.718961130:main thread : main Q:Reg: waiting 1500ms on worker >> >>>>> thread >> >>>>> termination, 1 still running >> >>>>> 3497.718968288:main Q:Reg/w0 : wti 0x23c8ba0: worker awoke from idle >> >>>>> processing >> >>>>> 3497.718972108:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 >> >>>>> objects >> >>>>> and enqueued 0 objects >> >>>>> 3497.718975498:main Q:Reg/w0 : doDeleteBatch: delete batch from >> store, >> >>>>> new >> >>>>> sizes: log 0, phys 0 >> >>>>> 3497.718979083:main Q:Reg/w0 : regular consumer finished, iret=4, >> >>>>> szlog >> >>>>> 0 >> >>>>> sz phys 0 >> >>>>> 3497.718982620:main Q:Reg/w0 : wti 0x23c8ba0: terminating worker >> >>>>> terminateRet=5, bInactivityTOOccured=0 >> >>>>> 3497.718986378:main Q:Reg/w0 : DDDD: wti 0x23c8ba0: worker cleanup >> >>>>> action >> >>>>> instances >> >>>>> 3497.718990027:main Q:Reg/w0 : wti 0x23c8ba0, action 0, ptr (nil) >> >>>>> 3497.718993327:main Q:Reg/w0 : wti 0x23c8ba0, action 1, ptr >> >>>>> 0x7f1ef40022a0 >> >>>>> 3497.718997548:main Q:Reg/w0 : wti 0x23c8ba0, action 2, ptr (nil) >> >>>>> 3497.719000789:main Q:Reg/w0 : wti 0x23c8ba0, action 3, ptr (nil) >> >>>>> 3497.719003996:main Q:Reg/w0 : wti 0x23c8ba0, action 4, ptr (nil) >> >>>>> 3497.719007201:main Q:Reg/w0 : wti 0x23c8ba0, action 5, ptr (nil) >> >>>>> 3497.719010409:main Q:Reg/w0 : wti 0x23c8ba0, action 6, ptr (nil) >> >>>>> 3497.719013604:main Q:Reg/w0 : wti 0x23c8ba0, action 7, ptr (nil) >> >>>>> 3497.719016916:main Q:Reg/w0 : wti 0x23c8ba0: worker exiting >> >>>>> 3497.719020757:main Q:Reg/w0 : main Q:Reg: Worker thread 23c8ba0, >> >>>>> terminated, num workers now 0 >> >>>>> 3497.719090631:main thread : main Q: regular queue workers shut >> >>>>> down. >> >>>>> 3497.719097597:main thread : main Q: checking to see if we need to >> >>>>> cancel any worker threads of the primary queue >> >>>>> 3497.719101335:main thread : main Q: worker threads terminated, >> >>>>> remaining queue size log 0, phys 0. >> >>>>> 3497.719111115:main thread : main Q: queue (type 0) will lose 0 >> >>>>> messages, destroying... >> >>>>> 3497.719501847:main Q:Reg/w0 : destructor for debug call stack >> >>>>> 0x7f1ef40008c0 called >> >>>>> 3497.719547524:main thread : Terminating outputs... >> >>>>> 3497.719553777:main thread : destructing ruleset 0x23ab150, name >> >>>>> 0x23ab190 >> >>>>> 3497.719558276:main thread : debug_relp queue: initiating worker >> >>>>> thread >> >>>>> shutdown sequence >> >>>>> 3497.719561940:main thread : debug_relp queue: trying shutdown of >> >>>>> regular workers >> >>>>> 3497.719568068:main thread : sent SIGTTIN to worker thread >> >>>>> 0xfbb14700 >> >>>>> 3497.719587478:main thread : debug_relp queue:Reg: waiting 0ms on >> >>>>> worker >> >>>>> thread termination, 1 still running >> >>>>> 3497.719593945:debug_relp queue:Reg/w0: end relpSessConnect, iRet >> 10014 >> >>>>> 3497.719599684:debug_relp queue:Reg/w0: Action 0 transitioned to >> state: >> >>>>> rtry >> >>>>> 3497.719603211:debug_relp queue:Reg/w0: doTransaction: action 0, >> >>>>> currIParam >> >>>>> 1 >> >>>>> 3497.719607186:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> enter >> >>>>> loop, iRetries=0 >> >>>>> 3497.719677407:main thread : debug_relp queue:Reg: timeout waiting >> >>>>> on >> >>>>> worker thread termination >> >>>>> 3497.719677554:main thread : sent SIGTTIN to worker thread >> >>>>> 0xfbb14700 >> >>>>> 3497.719677725:main thread : debug_relp queue: regular shutdown >> >>>>> timed >> >>>>> out on primary queue (this is OK) >> >>>>> 3497.719677897:main thread : debug_relp queue: trying to shutdown >> >>>>> workers within Action Timeoutsetting EnqOnly mode >> >>>>> 3497.719678191:main thread : debug_relp queue: trying immediate >> >>>>> shutdown >> >>>>> of regular workers (if any) >> >>>>> 3497.719678362:main thread : sent SIGTTIN to worker thread >> >>>>> 0xfbb14700 >> >>>>> 3497.719678558:main thread : debug_relp queue:Reg: waiting 1000ms >> on >> >>>>> worker thread termination, 1 still running >> >>>>> 3497.719678754:debug_relp queue:Reg/w0: end relpSessConnect, iRet >> 10014 >> >>>>> 3497.719678877:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> >>>>> action->tryResume returned -2007 >> >>>>> 3497.719678999:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> check >> >>>>> for >> >>>>> max retries, iResumeRetryCount -1, iRetries 0 >> >>>>> 3498.720321005:debug_relp queue:Reg/w0: actionTryResume: action >> >>>>> 0x23bebf0 >> >>>>> state: rtry, next retry (if applicable): 0 [now 1459273498] >> >>>>> 3498.720769277:debug_relp queue:Reg/w0: actionCommit, in retry loop, >> >>>>> iRet >> >>>>> -2007 >> >>>>> 3498.721034552:debug_relp queue:Reg/w0: regular consumer finished, >> >>>>> iret=-2007, szlog 0 sz phys 1 >> >>>>> 3498.721240322:debug_relp queue:Reg/w0: DeleteProcessedBatch: we >> >>>>> deleted >> >>>>> 1 >> >>>>> objects and enqueued 0 objects >> >>>>> 3498.721396664:debug_relp queue:Reg/w0: doDeleteBatch: delete batch >> >>>>> from >> >>>>> store, new sizes: log 0, phys 0 >> >>>>> 3498.721831840:debug_relp queue:Reg/w0: wti 0x23c78b0: terminating >> >>>>> worker >> >>>>> because of TERMINATE_NOW mode, del iRet 0 >> >>>>> 3498.721991265:debug_relp queue:Reg/w0: DDDD: wti 0x23c78b0: worker >> >>>>> cleanup >> >>>>> action instances >> >>>>> 3498.722274489:debug_relp queue:Reg/w0: wti 0x23c78b0, action 0, ptr >> >>>>> 0x7f1ef00020d0 >> >>>>> 3498.722431811:debug_relp queue:Reg/w0: relp engine destructing >> client >> >>>>> 0x7f1ef00027e0 >> >>>>> 3498.723389457:debug_relp queue:Reg/w0: wti 0x23c78b0, action 1, ptr >> >>>>> (nil) >> >>>>> 3498.723559462:debug_relp queue:Reg/w0: wti 0x23c78b0, action 2, ptr >> >>>>> (nil) >> >>>>> 3498.723794716:debug_relp queue:Reg/w0: wti 0x23c78b0, action 3, ptr >> >>>>> (nil) >> >>>>> 3498.723972736:debug_relp queue:Reg/w0: wti 0x23c78b0, action 4, ptr >> >>>>> (nil) >> >>>>> 3498.724207978:debug_relp queue:Reg/w0: wti 0x23c78b0, action 5, ptr >> >>>>> (nil) >> >>>>> 3498.724215660:debug_relp queue:Reg/w0: wti 0x23c78b0, action 6, ptr >> >>>>> (nil) >> >>>>> 3498.724219766:debug_relp queue:Reg/w0: wti 0x23c78b0, action 7, ptr >> >>>>> (nil) >> >>>>> 3498.724223775:debug_relp queue:Reg/w0: wti 0x23c78b0: worker exiting >> >>>>> 3498.724228430:debug_relp queue:Reg/w0: debug_relp queue:Reg: Worker >> >>>>> thread >> >>>>> 23c78b0, terminated, num workers now 0 >> >>>>> 3498.724243673:debug_relp queue:Reg/w0: destructor for debug call >> stack >> >>>>> 0x7f1ef00008c0 called >> >>>>> 3498.724371931:main thread : debug_relp queue: checking to see if >> we >> >>>>> need to cancel any worker threads of the primary queue >> >>>>> 3498.724379712:main thread : debug_relp queue: worker threads >> >>>>> terminated, remaining queue size log 0, phys 0. >> >>>>> 3498.724386288:main thread : debug_relp queue: persisting queue to >> >>>>> disk, >> >>>>> 0 entries... >> >>>>> 3498.724391611:main thread : strm 0x23b74c0: file 5(relp_queue) >> >>>>> closing >> >>>>> 3498.724396110:main thread : strm 0x23b74c0: file 5(relp_queue) >> >>>>> flush, >> >>>>> buflen 0 (no need to flush) >> >>>>> 3498.724411574:main thread : strm 0x23b7760: file 6(relp_queue) >> >>>>> closing >> >>>>> 3498.724418797:main thread : strm 0x23b7a00: file -1(relp_queue) >> >>>>> closing >> >>>>> 3498.724423956:main thread : strmCloseFile: deleting >> >>>>> '/var/spool/rsyslog/relp_queue.00000001' >> >>>>> 3498.724490018:main thread : strm 0x7f1ef4002480: file 4(messages) >> >>>>> closing >> >>>>> 3498.724607874:main thread : strm 0x7f1ef4002480: file 4(messages) >> >>>>> flush, buflen 0 (no need to flush) >> >>>>> 3498.724731690:main thread : all primary multi-thread sources have >> >>>>> been >> >>>>> terminated - now doing aux cleanup... >> >>>>> 3498.724737248:main thread : destructing current config... >> >>>>> 3498.724741760:main thread : calling freeCnf(0x23a91e0) for module >> >>>>> 'builtin:omfile' >> >>>>> 3498.724745773:main thread : calling freeCnf(0x23a93a0) for module >> >>>>> 'builtin:ompipe' >> >>>>> 3498.724749977:main thread : calling freeCnf(0x23aa3a0) for module >> >>>>> 'builtin:omfwd' >> >>>>> 3498.724754588:main thread : calling freeCnf(0x23b9b00) for module >> >>>>> 'imuxsock' >> >>>>> 3498.724758956:main thread : calling freeCnf(0x23bac70) for module >> >>>>> 'imklog' >> >>>>> 3498.724807014:main thread : destructing parser 'rsyslog.rfc5424' >> >>>>> 3498.724871141:main thread : destructing parser 'rsyslog.rfc3164' >> >>>>> 3498.724876440:main thread : pmrfc3164: free parser instance >> >>>>> 0x23aa8b0 >> >>>>> 3498.724887558:main thread : file rsyslogd.c released module >> >>>>> 'lmnet', >> >>>>> reference count now 2 >> >>>>> 3498.724892004:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 2 >> >>>>> 3498.724895745:main thread : Unloading module builtin:omfile >> >>>>> 3498.724900191:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 2 >> >>>>> 3498.724904194:main thread : Unloading module builtin:ompipe >> >>>>> 3498.724908035:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 2 >> >>>>> 3498.724911707:main thread : Unloading module builtin-shell >> >>>>> 3498.725044125:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 2 >> >>>>> 3498.725050401:main thread : Unloading module builtin:omdiscard >> >>>>> 3498.725054843:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 2 >> >>>>> 3498.725058772:main thread : Unloading module builtin:omfwd >> >>>>> 3498.725064464:main thread : file omfwd.c released module 'lmnet', >> >>>>> reference count now 1 >> >>>>> 3498.725068747:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725072460:main thread : Unloading module builtin:omusrmsg >> >>>>> 3498.725076324:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725080023:main thread : Unloading module builtin:pmrfc5424 >> >>>>> 3498.725084039:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725087646:main thread : Unloading module builtin:pmrfc3164 >> >>>>> 3498.725091651:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725095255:main thread : Unloading module builtin:smfile >> >>>>> 3498.725099010:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725102601:main thread : Unloading module builtin:smtradfile >> >>>>> 3498.725106419:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725110003:main thread : Unloading module builtin:smfwd >> >>>>> 3498.725113689:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725117357:main thread : Unloading module builtin:smtradfwd >> >>>>> 3498.725121132:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725124782:main thread : Unloading module imuxsock >> >>>>> 3498.725157459:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.725162797:main thread : Unloading module imklog >> >>>>> 3498.725179208:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.726112994:main thread : Unloading module omrelp >> >>>>> 3498.726184405:main thread : module lmnet NOT unloaded because it >> >>>>> still >> >>>>> has a refcount of 1 >> >>>>> 3498.726189821:main thread : module lmregexp NOT unloaded because >> it >> >>>>> still has a refcount of 1 >> >>>>> 3498.726195161:main thread : file conf.c released module 'lmnet', >> >>>>> reference count now 0 >> >>>>> 3498.726198620:main thread : module 'lmnet' has zero reference >> >>>>> count, >> >>>>> unloading... >> >>>>> 3498.726201936:main thread : Unloading module lmnet >> >>>>> 3498.726218655:main thread : rsyslog runtime de-initialized, >> current >> >>>>> users 0 >> >>>>> 3498.726222660:main thread : Clean shutdown completed, bye >> >>>>> >> >>>>> >> >>>>> On Mon, Mar 28, 2016 at 4:41 PM, Kane Kim <[email protected]> >> >>>>> wrote: >> >>>>> >> >>>>> Hello Reiner, >> >>>>>> >> >>>>>> >> >>>>>> I'm seeing the same behavior with omrelp module. I've configured >> >>>>>> omrelp >> >>>>>> to >> >>>>>> non-routable ip address: >> >>>>>> module(load="omrelp") >> >>>>>> local0.* action(type="omrelp" >> >>>>>> target="10.0.0.1" >> >>>>>> port="2514" >> >>>>>> name="debug_relp" >> >>>>>> queue.filename="relp_queue" >> >>>>>> queue.maxdiskspace="1G" >> >>>>>> queue.saveonshutdown="on" >> >>>>>> queue.spoolDirectory="/var/spool/rsyslog" >> >>>>>> queue.type="Disk" >> >>>>>> ) >> >>>>>> Then I try to log a line: >> >>>>>> [root@314959ac8cb0 prism]# logger -p local0.info test >> >>>>>> [root@314959ac8cb0 prism]# ls /var/spool/rsyslog/ >> >>>>>> relp_queue.00000001 >> >>>>>> # I shutdown rsyslog at this point >> >>>>>> [root@314959ac8cb0 prism]# ls /var/spool/rsyslog/ >> >>>>>> # after shutdown queue is deleted and message is lost >> >>>>>> [root@314959ac8cb0 prism]# >> >>>>>> >> >>>>>> Here is rsyslog log, so omrelp starts transaction and then I >> shutdown >> >>>>>> rsyslog, it seems at some point queue is deleted: >> >>>>>> >> >>>>>> 8184.798028612:debug_relp queue:Reg/w0: omrelp: beginTransaction >> >>>>>> ^C8194.245122305:main thread : DoDie called. >> >>>>>> 8194.245301525:main thread : exiting on signal 2 >> >>>>>> 8194.245532709:main thread : Terminating input threads... >> >>>>>> 8194.245873683:main thread : request term via SIGTTIN for input >> >>>>>> thread >> >>>>>> 'imuxsock' 0xcb767700 >> >>>>>> 8194.246079748:imuxsock.c : thrdStarter: usrThrdMain imuxsock - >> >>>>>> 0x7f9bcb767700 returned with iRet 0, exiting now. >> >>>>>> 8194.247556671:main thread : non-cancel input thread termination >> >>>>>> succeeded for thread imuxsock 0xcb767700 >> >>>>>> 8194.249357547:imuxsock.c : destructor for debug call stack >> >>>>>> 0x7f9bbc0008c0 called >> >>>>>> 8194.249784271:main thread : imuxsock: unlinking unix socket >> >>>>>> file[0] >> >>>>>> /dev/log >> >>>>>> 8194.250043368:main thread : main Q: qqueueAdd: entry added, size >> >>>>>> now >> >>>>>> log 1, phys 1 entries >> >>>>>> 8194.250321193:main thread : main Q: EnqueueMsg advised worker >> >>>>>> start >> >>>>>> 8194.250493505:main Q:Reg/w0 : wti 0x1c13bb0: worker awoke from >> idle >> >>>>>> processing >> >>>>>> 8194.250735988:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 >> >>>>>> objects >> >>>>>> and enqueued 0 objects >> >>>>>> 8194.250959294:main Q:Reg/w0 : doDeleteBatch: delete batch from >> >>>>>> store, >> >>>>>> new sizes: log 1, phys 1 >> >>>>>> 8194.251913535:main Q:Reg/w0 : processBATCH: batch of 1 elements >> must >> >>>>>> be >> >>>>>> processed >> >>>>>> 8194.252193683:main Q:Reg/w0 : processBATCH: next msg 0: [origin >> >>>>>> software="rsyslogd" swVersion="8.18.0.master" x-pid="30298" x-info=" >> >>>>>> http://www.rsyslog.com"] exiting on signal 2. >> >>>>>> 8194.252377044:main Q:Reg/w0 : PRIFILT 'local0.*' >> >>>>>> 8194.252669683:main Q:Reg/w0 : pmask: X X X X X X X X >> X >> >>>>>> X >> >>>>>> X X X X X X FF X X X X X X X X X >> >>>>>> 8194.252898063:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.252902083:main Q:Reg/w0 : PRIFILT >> >>>>>> '*.info;mail.none;authpriv.none;cron.none' >> >>>>>> 8194.252908847:main Q:Reg/w0 : pmask: 7F 7F X 7F 7F 7F 7F 7F >> 7F >> >>>>>> X >> >>>>>> X 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F >> >>>>>> 8194.252956209:main Q:Reg/w0 : PRIFILT condition result is 1 >> >>>>>> 8194.252959849:main Q:Reg/w0 : ACTION 1 >> >>>>>> [builtin:omfile:/var/log/messages] >> >>>>>> 8194.252967510:main Q:Reg/w0 : executing action 1 >> >>>>>> 8194.252971541:main Q:Reg/w0 : action 'action 1': called, logging >> to >> >>>>>> builtin:omfile (susp 0/0, direct q 1) >> >>>>>> 8194.252977583:main Q:Reg/w0 : action 'action 1': is transactional >> - >> >>>>>> executing in commit phase >> >>>>>> 8194.252982638:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>>> 8194.252986300:main Q:Reg/w0 : action 'action 1': set suspended >> state >> >>>>>> to 0 >> >>>>>> 8194.252989729:main Q:Reg/w0 : PRIFILT 'authpriv.*' >> >>>>>> 8194.252996951:main Q:Reg/w0 : pmask: X X X X X X X X >> X >> >>>>>> X >> >>>>>> FF X X X X X X X X X X X X X X X >> >>>>>> 8194.253051858:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.253056228:main Q:Reg/w0 : PRIFILT 'mail.*' >> >>>>>> 8194.253065299:main Q:Reg/w0 : pmask: X X FF X X X X X >> X >> >>>>>> X >> >>>>>> X X X X X X X X X X X X X X X X >> >>>>>> 8194.253407472:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.253412679:main Q:Reg/w0 : PRIFILT 'cron.*' >> >>>>>> 8194.253422055:main Q:Reg/w0 : pmask: X X X X X X X X >> X >> >>>>>> FF >> >>>>>> X X X X X X X X X X X X X X X X >> >>>>>> 8194.253493551:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.253498038:main Q:Reg/w0 : PRIFILT '*.emerg' >> >>>>>> 8194.253507065:main Q:Reg/w0 : pmask: 1 1 1 1 1 1 1 1 >> 1 >> >>>>>> 1 >> >>>>>> 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 >> >>>>>> 8194.253580307:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.253584400:main Q:Reg/w0 : PRIFILT 'uucp,news.crit' >> >>>>>> 8194.253593442:main Q:Reg/w0 : pmask: X X X X X X X 7 >> 7 >> >>>>>> X >> >>>>>> X X X X X X X X X X X X X X X X >> >>>>>> 8194.253664984:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.253669184:main Q:Reg/w0 : PRIFILT 'local7.*' >> >>>>>> 8194.253678130:main Q:Reg/w0 : pmask: X X X X X X X X >> X >> >>>>>> X >> >>>>>> X X X X X X X X X X X X X FF X X >> >>>>>> 8194.254445021:main Q:Reg/w0 : PRIFILT condition result is 0 >> >>>>>> 8194.254450651:main Q:Reg/w0 : END batch execution phase, entering >> to >> >>>>>> commit phase [processed 1 of 1 messages] >> >>>>>> 8194.254456257:main Q:Reg/w0 : actionCommitAll: action 1, state 1, >> >>>>>> nbr >> >>>>>> to >> >>>>>> commit 0 isTransactional 1 >> >>>>>> 8194.254461646:main Q:Reg/w0 : doTransaction: have >> commitTransaction >> >>>>>> IF, >> >>>>>> using that, pWrkrInfo 0x1c13cf0 >> >>>>>> 8194.254466238:main Q:Reg/w0 : entering >> >>>>>> actionCallCommitTransaction(), >> >>>>>> state: itx, actionNbr 1, nMsgs 1 >> >>>>>> 8194.254471876:main Q:Reg/w0 : omfile: write to stream, >> pData->pStrm >> >>>>>> 0x7f9bc4002480, lenBuf 161, strt data Mar 28 23:36:34 314959ac8cb0 >> >>>>>> rsyslogd: [origin software="rsyslogd" swVersion="8.18.0.master" >> >>>>>> x-pid="30298" x-info="http://www.rs >> >>>>>> 8194.254478237:main Q:Reg/w0 : strm 0x7f9bc4002480: file >> 4(messages) >> >>>>>> flush, buflen 161 >> >>>>>> 8194.254483210:main Q:Reg/w0 : strmPhysWrite, stream >> 0x7f9bc4002480, >> >>>>>> len >> >>>>>> 161 >> >>>>>> 8194.254500329:main Q:Reg/w0 : strm 0x7f9bc4002480: file 4 write >> >>>>>> wrote >> >>>>>> 161 bytes >> >>>>>> 8194.254506254:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>>> 8194.254511506:main Q:Reg/w0 : Action 1 transitioned to state: itx >> >>>>>> 8194.254515961:main Q:Reg/w0 : Action 1 transitioned to state: rdy >> >>>>>> 8194.254520243:main Q:Reg/w0 : actionCommit, in retry loop, iRet 0 >> >>>>>> 8194.254524679:main Q:Reg/w0 : processBATCH: batch of 1 elements >> has >> >>>>>> been >> >>>>>> processed >> >>>>>> 8194.254529731:main Q:Reg/w0 : regular consumer finished, iret=0, >> >>>>>> szlog 0 >> >>>>>> sz phys 1 >> >>>>>> 8194.254536104:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 1 >> >>>>>> objects >> >>>>>> and enqueued 0 objects >> >>>>>> 8194.254540861:main Q:Reg/w0 : doDeleteBatch: delete batch from >> >>>>>> store, >> >>>>>> new sizes: log 0, phys 0 >> >>>>>> 8194.254545792:main Q:Reg/w0 : regular consumer finished, iret=4, >> >>>>>> szlog 0 >> >>>>>> sz phys 0 >> >>>>>> 8194.254550314:main Q:Reg/w0 : main Q:Reg/w0: worker IDLE, waiting >> >>>>>> for >> >>>>>> work. >> >>>>>> 8194.254556418:main thread : Terminating main queue... >> >>>>>> 8194.254560776:main thread : main Q: initiating worker thread >> >>>>>> shutdown >> >>>>>> sequence >> >>>>>> 8194.254564653:main thread : main Q: trying shutdown of regular >> >>>>>> workers >> >>>>>> 8194.254572961:main thread : sent SIGTTIN to worker thread >> >>>>>> 0xcaf66700 >> >>>>>> 8194.254577260:main thread : main Q:Reg: waiting 1499ms on worker >> >>>>>> thread termination, 1 still running >> >>>>>> 8194.254585507:main Q:Reg/w0 : wti 0x1c13bb0: worker awoke from >> idle >> >>>>>> processing >> >>>>>> 8194.254589465:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 >> >>>>>> objects >> >>>>>> and enqueued 0 objects >> >>>>>> 8194.254593070:main Q:Reg/w0 : doDeleteBatch: delete batch from >> >>>>>> store, >> >>>>>> new sizes: log 0, phys 0 >> >>>>>> 8194.254596813:main Q:Reg/w0 : regular consumer finished, iret=4, >> >>>>>> szlog 0 >> >>>>>> sz phys 0 >> >>>>>> 8194.254600412:main Q:Reg/w0 : wti 0x1c13bb0: terminating worker >> >>>>>> terminateRet=5, bInactivityTOOccured=0 >> >>>>>> 8194.254604469:main Q:Reg/w0 : DDDD: wti 0x1c13bb0: worker cleanup >> >>>>>> action >> >>>>>> instances >> >>>>>> 8194.254608268:main Q:Reg/w0 : wti 0x1c13bb0, action 0, ptr (nil) >> >>>>>> 8194.254611752:main Q:Reg/w0 : wti 0x1c13bb0, action 1, ptr >> >>>>>> 0x7f9bc40022a0 >> >>>>>> 8194.254616114:main Q:Reg/w0 : wti 0x1c13bb0, action 2, ptr (nil) >> >>>>>> 8194.254619450:main Q:Reg/w0 : wti 0x1c13bb0, action 3, ptr (nil) >> >>>>>> 8194.254622767:main Q:Reg/w0 : wti 0x1c13bb0, action 4, ptr (nil) >> >>>>>> 8194.254626075:main Q:Reg/w0 : wti 0x1c13bb0, action 5, ptr (nil) >> >>>>>> 8194.254629392:main Q:Reg/w0 : wti 0x1c13bb0, action 6, ptr (nil) >> >>>>>> 8194.254632707:main Q:Reg/w0 : wti 0x1c13bb0, action 7, ptr (nil) >> >>>>>> 8194.254636124:main Q:Reg/w0 : wti 0x1c13bb0: worker exiting >> >>>>>> 8194.254639920:main Q:Reg/w0 : main Q:Reg: Worker thread 1c13bb0, >> >>>>>> terminated, num workers now 0 >> >>>>>> 8194.254656279:main thread : main Q: regular queue workers shut >> >>>>>> down. >> >>>>>> 8194.254656402:main thread : main Q: checking to see if we need >> to >> >>>>>> cancel any worker threads of the primary queue >> >>>>>> 8194.254656524:main thread : main Q: worker threads terminated, >> >>>>>> remaining queue size log 0, phys 0. >> >>>>>> 8194.254656647:main thread : main Q: queue (type 0) will lose 0 >> >>>>>> messages, destroying... >> >>>>>> 8194.255021761:main Q:Reg/w0 : destructor for debug call stack >> >>>>>> 0x7f9bc40008c0 called >> >>>>>> 8194.255069521:main thread : Terminating outputs... >> >>>>>> 8194.255084818:main thread : destructing ruleset 0x1bf6150, name >> >>>>>> 0x1bf6190 >> >>>>>> 8194.255089525:main thread : debug_relp queue: initiating worker >> >>>>>> thread >> >>>>>> shutdown sequence >> >>>>>> 8194.255093230:main thread : debug_relp queue: trying shutdown of >> >>>>>> regular workers >> >>>>>> 8194.255099210:main thread : sent SIGTTIN to worker thread >> >>>>>> 0xca765700 >> >>>>>> 8194.255119598:main thread : debug_relp queue:Reg: waiting 0ms on >> >>>>>> worker thread termination, 1 still running >> >>>>>> 8194.255126357:debug_relp queue:Reg/w0: end relpSessConnect, iRet >> >>>>>> 10014 >> >>>>>> 8194.255132022:debug_relp queue:Reg/w0: Action 0 transitioned to >> >>>>>> state: >> >>>>>> rtry >> >>>>>> 8194.255135551:debug_relp queue:Reg/w0: doTransaction: action 0, >> >>>>>> currIParam 1 >> >>>>>> 8194.255139603:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> >>>>>> enter >> >>>>>> loop, iRetries=0 >> >>>>>> 8194.255210939:main thread : debug_relp queue:Reg: timeout >> waiting >> >>>>>> on >> >>>>>> worker thread termination >> >>>>>> 8194.255211086:main thread : sent SIGTTIN to worker thread >> >>>>>> 0xca765700 >> >>>>>> 8194.255211257:main thread : debug_relp queue: regular shutdown >> >>>>>> timed >> >>>>>> out on primary queue (this is OK) >> >>>>>> 8194.255211429:main thread : debug_relp queue: trying to shutdown >> >>>>>> workers within Action Timeoutsetting EnqOnly mode >> >>>>>> 8194.255215492:main thread : debug_relp queue: trying immediate >> >>>>>> shutdown of regular workers (if any) >> >>>>>> 8194.255221059:main thread : sent SIGTTIN to worker thread >> >>>>>> 0xca765700 >> >>>>>> 8194.255226792:main thread : debug_relp queue:Reg: waiting 1000ms >> >>>>>> on >> >>>>>> worker thread termination, 1 still running >> >>>>>> 8194.255232713:debug_relp queue:Reg/w0: end relpSessConnect, iRet >> >>>>>> 10014 >> >>>>>> 8194.255237080:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> >>>>>> action->tryResume returned -2007 >> >>>>>> 8194.255240684:debug_relp queue:Reg/w0: actionDoRetry: debug_relp >> >>>>>> check >> >>>>>> for max retries, iResumeRetryCount 0, iRetries 0 >> >>>>>> 8194.255244459:debug_relp queue:Reg/w0: Action 0 transitioned to >> >>>>>> state: >> >>>>>> susp >> >>>>>> 8194.255257975:debug_relp queue:Reg/w0: Called LogMsg, msg: action >> >>>>>> 'debug_relp' suspended, next retry is Mon Mar 28 23:37:04 2016 >> >>>>>> 8194.255266857:debug_relp queue:Reg/w0: submitMsg2() could not >> submit >> >>>>>> message - queue does (no longer?) exist - ignored >> >>>>>> 8194.255271065:debug_relp queue:Reg/w0: action 'debug_relp' >> suspended, >> >>>>>> earliest retry=1459208224 (now 1459208194), iNbrResRtry 0, duration >> 30 >> >>>>>> 8194.255275244:debug_relp queue:Reg/w0: actionTryResume: action >> >>>>>> 0x1c09af0 >> >>>>>> state: susp, next retry (if applicable): 1459208224 [now 1459208194] >> >>>>>> 8194.255278825:debug_relp queue:Reg/w0: actionCommit, in retry loop, >> >>>>>> iRet >> >>>>>> -2123 >> >>>>>> 8194.255282409:debug_relp queue:Reg/w0: regular consumer finished, >> >>>>>> iret=-2123, szlog 0 sz phys 1 >> >>>>>> 8194.255287286:debug_relp queue:Reg/w0: DeleteProcessedBatch: we >> >>>>>> deleted 1 >> >>>>>> objects and enqueued 0 objects >> >>>>>> 8194.255290952:debug_relp queue:Reg/w0: doDeleteBatch: delete batch >> >>>>>> from >> >>>>>> store, new sizes: log 0, phys 0 >> >>>>>> 8194.255295204:debug_relp queue:Reg/w0: wti 0x1c12880: terminating >> >>>>>> worker >> >>>>>> because of TERMINATE_NOW mode, del iRet 0 >> >>>>>> 8194.255298822:debug_relp queue:Reg/w0: DDDD: wti 0x1c12880: worker >> >>>>>> cleanup action instances >> >>>>>> 8194.255302306:debug_relp queue:Reg/w0: wti 0x1c12880, action 0, ptr >> >>>>>> 0x7f9bc00020d0 >> >>>>>> 8194.255305929:debug_relp queue:Reg/w0: relp engine destructing >> client >> >>>>>> 0x7f9bc00027e0 >> >>>>>> 8194.255309843:debug_relp queue:Reg/w0: wti 0x1c12880, action 1, ptr >> >>>>>> (nil) >> >>>>>> 8194.255313226:debug_relp queue:Reg/w0: wti 0x1c12880, action 2, ptr >> >>>>>> (nil) >> >>>>>> 8194.255316479:debug_relp queue:Reg/w0: wti 0x1c12880, action 3, ptr >> >>>>>> (nil) >> >>>>>> 8194.255319704:debug_relp queue:Reg/w0: wti 0x1c12880, action 4, ptr >> >>>>>> (nil) >> >>>>>> 8194.255322934:debug_relp queue:Reg/w0: wti 0x1c12880, action 5, ptr >> >>>>>> (nil) >> >>>>>> 8194.255326175:debug_relp queue:Reg/w0: wti 0x1c12880, action 6, ptr >> >>>>>> (nil) >> >>>>>> 8194.255329409:debug_relp queue:Reg/w0: wti 0x1c12880, action 7, ptr >> >>>>>> (nil) >> >>>>>> 8194.255332655:debug_relp queue:Reg/w0: wti 0x1c12880: worker >> exiting >> >>>>>> 8194.255336231:debug_relp queue:Reg/w0: debug_relp queue:Reg: Worker >> >>>>>> thread 1c12880, terminated, num workers now 0 >> >>>>>> 8194.255344540:debug_relp queue:Reg/w0: destructor for debug call >> >>>>>> stack >> >>>>>> 0x7f9bc00008c0 called >> >>>>>> 8194.255371534:main thread : debug_relp queue: checking to see if >> >>>>>> we >> >>>>>> need to cancel any worker threads of the primary queue >> >>>>>> 8194.255376798:main thread : debug_relp queue: worker threads >> >>>>>> terminated, remaining queue size log 0, phys 0. >> >>>>>> 8194.255381256:main thread : debug_relp queue: persisting queue >> to >> >>>>>> disk, 0 entries... >> >>>>>> 8194.255385525:main thread : strm 0x1c024c0: file 5(relp_queue) >> >>>>>> closing >> >>>>>> 8194.255389222:main thread : strm 0x1c024c0: file 5(relp_queue) >> >>>>>> flush, >> >>>>>> buflen 0 (no need to flush) >> >>>>>> 8194.255398716:main thread : strm 0x1c02760: file 6(relp_queue) >> >>>>>> closing >> >>>>>> 8194.255404785:main thread : strm 0x1c02a00: file -1(relp_queue) >> >>>>>> closing >> >>>>>> 8194.255409145:main thread : strmCloseFile: deleting >> >>>>>> '/var/spool/rsyslog/relp_queue.00000001' >> >>>>>> 8194.255460812:main thread : strm 0x7f9bc4002480: file >> 4(messages) >> >>>>>> closing >> >>>>>> 8194.255466309:main thread : strm 0x7f9bc4002480: file >> 4(messages) >> >>>>>> flush, buflen 0 (no need to flush) >> >>>>>> 8194.255479325:main thread : all primary multi-thread sources >> have >> >>>>>> been >> >>>>>> terminated - now doing aux cleanup... >> >>>>>> 8194.255482809:main thread : destructing current config... >> >>>>>> 8194.255486662:main thread : calling freeCnf(0x1bf41e0) for >> module >> >>>>>> 'builtin:omfile' >> >>>>>> 8194.255490098:main thread : calling freeCnf(0x1bf43a0) for >> module >> >>>>>> 'builtin:ompipe' >> >>>>>> 8194.255493809:main thread : calling freeCnf(0x1bf53a0) for >> module >> >>>>>> 'builtin:omfwd' >> >>>>>> 8194.255497793:main thread : calling freeCnf(0x1c04b00) for >> module >> >>>>>> 'imuxsock' >> >>>>>> 8194.255501565:main thread : calling freeCnf(0x1c05c70) for >> module >> >>>>>> 'imklog' >> >>>>>> 8194.255543579:main thread : destructing parser 'rsyslog.rfc5424' >> >>>>>> 8194.255547468:main thread : destructing parser 'rsyslog.rfc3164' >> >>>>>> 8194.255550784:main thread : pmrfc3164: free parser instance >> >>>>>> 0x1bf58b0 >> >>>>>> 8194.255559659:main thread : file rsyslogd.c released module >> >>>>>> 'lmnet', >> >>>>>> reference count now 2 >> >>>>>> 8194.255563470:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 2 >> >>>>>> 8194.255566676:main thread : Unloading module builtin:omfile >> >>>>>> 8194.255570420:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 2 >> >>>>>> 8194.255573674:main thread : Unloading module builtin:ompipe >> >>>>>> 8194.255576857:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 2 >> >>>>>> 8194.255579938:main thread : Unloading module builtin-shell >> >>>>>> 8194.255583060:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 2 >> >>>>>> 8194.255586144:main thread : Unloading module builtin:omdiscard >> >>>>>> 8194.255589509:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 2 >> >>>>>> 8194.255592658:main thread : Unloading module builtin:omfwd >> >>>>>> 8194.255596441:main thread : file omfwd.c released module >> 'lmnet', >> >>>>>> reference count now 1 >> >>>>>> 8194.255599981:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255603068:main thread : Unloading module builtin:omusrmsg >> >>>>>> 8194.255606240:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255609287:main thread : Unloading module builtin:pmrfc5424 >> >>>>>> 8194.255612755:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255615826:main thread : Unloading module builtin:pmrfc3164 >> >>>>>> 8194.255619252:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255622315:main thread : Unloading module builtin:smfile >> >>>>>> 8194.255625406:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255628486:main thread : Unloading module builtin:smtradfile >> >>>>>> 8194.255631753:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255634819:main thread : Unloading module builtin:smfwd >> >>>>>> 8194.255637916:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255640987:main thread : Unloading module builtin:smtradfwd >> >>>>>> 8194.255644187:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255647267:main thread : Unloading module imuxsock >> >>>>>> 8194.255676232:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255680751:main thread : Unloading module imklog >> >>>>>> 8194.255694428:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255698085:main thread : Unloading module omrelp >> >>>>>> 8194.255750953:main thread : module lmnet NOT unloaded because it >> >>>>>> still >> >>>>>> has a refcount of 1 >> >>>>>> 8194.255755236:main thread : module lmregexp NOT unloaded because >> >>>>>> it >> >>>>>> still has a refcount of 1 >> >>>>>> 8194.255759705:main thread : file conf.c released module 'lmnet', >> >>>>>> reference count now 0 >> >>>>>> 8194.255762875:main thread : module 'lmnet' has zero reference >> >>>>>> count, >> >>>>>> unloading... >> >>>>>> 8194.255765901:main thread : Unloading module lmnet >> >>>>>> 8194.255780970:main thread : rsyslog runtime de-initialized, >> >>>>>> current >> >>>>>> users 0 >> >>>>>> 8194.255784704:main thread : Clean shutdown completed, bye >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Sun, Mar 13, 2016 at 9:30 PM, David Lang <[email protected]> wrote: >> >>>>>> >> >>>>>> On Sun, 13 Mar 2016, Kane Kim wrote: >> >>>>>>> >> >>>>>>> >> >>>>>>> +1to David, having module handle the errors is not 100% reliable, >> >>>>>>> what >> >>>>>>> if >> >>>>>>> >> >>>>>>>> module dies before having a chance to save error log, those >> messages >> >>>>>>>> will >> >>>>>>>> be lost. >> >>>>>>>> >> >>>>>>>> >> >>>>>>> well, the module is a thread of rsyslog, so if it dies, rsyslog has >> >>>>>>> died. >> >>>>>>> >> >>>>>>> But if you are doing large batches (hundreds or thousands), putting >> >>>>>>> all >> >>>>>>> of them in a 'bad messages' file because one has a problem seems >> like >> >>>>>>> the >> >>>>>>> wrong thing to do. >> >>>>>>> >> >>>>>>> given that the error is not detected until the batch is being >> flushed >> >>>>>>> for >> >>>>>>> the output, any outputs that have already flushed their batches >> can't >> >>>>>>> be >> >>>>>>> changed, so the question is if it's worse to have messages resent >> to >> >>>>>>> those >> >>>>>>> outputs as the batch gets re-processed, or having to have the >> output >> >>>>>>> go to >> >>>>>>> a "couldn't deliver" file. >> >>>>>>> >> >>>>>>> It's bad enough if the problem is a corrupt message that can never >> be >> >>>>>>> delivered and the error is a permanent one, but if the problem is >> >>>>>>> instead a >> >>>>>>> temporary error (server is down), you will have a LOT of messages >> >>>>>>> going >> >>>>>>> into this 'couldn't deliver' bucket, when what should really be >> >>>>>>> happening >> >>>>>>> is that rsyslog should be pausing in the processing of messages, >> >>>>>>> retrying >> >>>>>>> after the suspended timeout. >> >>>>>>> >> >>>>>>> David Lang >> >>>>>>> >> >>>>>>> >> >>>>>>> On Mar 13, 2016 9:07 PM, "David Lang" <[email protected]> wrote: >> >>>>>>> >> >>>>>>>> >> >>>>>>>> On Sun, 13 Mar 2016, Rainer Gerhards wrote: >> >>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Rainer, can you explain what happens when a batch fails on an >> >>>>>>>>> action? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>>> It depends on when what happens and if we have batches of one >> or >> >>>>>>>>>> >> >>>>>>>>>> batches of many. For batches of one, the failure is immediately >> >>>>>>>>>> detected and acted upon. For batches of many,all action strings >> or >> >>>>>>>>>> stored temporarily. During that time, it is assume that the >> >>>>>>>>>> messages >> >>>>>>>>>> could be processed. There is no way around this assumption, >> >>>>>>>>>> because >> >>>>>>>>>> we >> >>>>>>>>>> now run each action completely through the rule engine before >> the >> >>>>>>>>>> next >> >>>>>>>>>> one is begun to process, which means that other parts of the >> >>>>>>>>>> config >> >>>>>>>>>> have actually acted on a message, which means we cannot undo >> that >> >>>>>>>>>> state. If is needed to act on the "real" state, we need to >> process >> >>>>>>>>>> batches of one. There simply is no other solution possible. So >> >>>>>>>>>> let's >> >>>>>>>>>> stay with batches of many. Once the batch is completed, the >> >>>>>>>>>> temporary >> >>>>>>>>>> strings are passed to the output's commitTransaction entry >> point. >> >>>>>>>>>> It >> >>>>>>>>>> is currently assumed that the output does inteligently handle >> the >> >>>>>>>>>> case, and as it looks most do this well (no regression tests >> >>>>>>>>>> failed, >> >>>>>>>>>> no bad reports since we introduced this with the initial v8 >> >>>>>>>>>> version >> >>>>>>>>>> (15+ month ago). If the action fails, status is reflected >> >>>>>>>>>> accordingly. >> >>>>>>>>>> Also, the core engine *should* (but obviously does not support >> >>>>>>>>>> this >> >>>>>>>>>> right now) write an error file with the messages in error (just >> >>>>>>>>>> like >> >>>>>>>>>> omelasticsearch does, which has proven to be quite good). >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> well, it's not in the specs that modules should have local error >> >>>>>>>>> >> >>>>>>>>> files >> >>>>>>>>> the >> >>>>>>>>> way omelasticsearch does. >> >>>>>>>>> >> >>>>>>>>> What I would expect if there is a problem when commiting the >> batch >> >>>>>>>>> is >> >>>>>>>>> that >> >>>>>>>>> the entire batch would be marked as not delivered and try again >> >>>>>>>>> with >> >>>>>>>>> half >> >>>>>>>>> of it. >> >>>>>>>>> >> >>>>>>>>> This will cause messages to other outputs working from the same >> >>>>>>>>> queue to >> >>>>>>>>> get duplciate messages, but it's better than loosing an entire >> >>>>>>>>> batch >> >>>>>>>>> of >> >>>>>>>>> messages because one is corrupt in some way. or there is a >> >>>>>>>>> temporary >> >>>>>>>>> problem with the output. >> >>>>>>>>> >> >>>>>>>>> David Lang >> >>>>>>>>> >> >>>>>>>>> I have managed to look at least briefly at the commitTransaction >> >>>>>>>>> core >> >>>>>>>>> >> >>>>>>>>> engine code. I think we can even improve it and may get some >> >>>>>>>>> ability >> >>>>>>>>>> >> >>>>>>>>>> to do retries if the action itself does not handle that. At the >> >>>>>>>>>> minimum, it should have the ability (which was intended >> >>>>>>>>>> originally, >> >>>>>>>>>> but pushed away by more urgent work) to write that error file. >> >>>>>>>>>> >> >>>>>>>>>> I am very grateful that Kane is working on a repro with the >> >>>>>>>>>> standard >> >>>>>>>>>> modules. Once we have that, I can go deeper into the code and >> >>>>>>>>>> check >> >>>>>>>>>> what I am thinking about. It might be possible that we cannot >> >>>>>>>>>> manage >> >>>>>>>>>> to repro this with the standard modules. Then we need to think >> >>>>>>>>>> about a >> >>>>>>>>>> way via the testing module to do the same thing. >> >>>>>>>>>> >> >>>>>>>>>> I will try to address this within the 8.18 timeframe, but cannot >> >>>>>>>>>> totally commit to it. There has a lot of work been going on, and >> >>>>>>>>>> especially all that CI testing has taken up lots of works. >> >>>>>>>>>> Nevertheless, I've been able to iron out quite some nits, so >> it's >> >>>>>>>>>> probably worth it. I would still like to continue to get the CI >> >>>>>>>>>> environment into better shape, because now I know what I am >> doing. >> >>>>>>>>>> If >> >>>>>>>>>> I stop again and let it go for a couple of weeks, I would lose a >> >>>>>>>>>> lot >> >>>>>>>>>> of momentum. I have also do to some work for support customers, >> >>>>>>>>>> which >> >>>>>>>>>> keeps me able to work on rsyslog in general, so this, too, has >> >>>>>>>>>> priority (and validly has so because without support customers I >> >>>>>>>>>> would >> >>>>>>>>>> be totally unable to work on rsyslog other than as a hobby) >> >>>>>>>>>> >> >>>>>>>>>> I hope this explains the current state. >> >>>>>>>>>> Rainer >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> Kane, can you try and duplicate your problem using standard >> >>>>>>>>>> modules? >> >>>>>>>>>> >> >>>>>>>>>>> I'm >> >>>>>>>>>>> thinking that if you define a omrelp action that doesn't go >> >>>>>>>>>>> anywhere, >> >>>>>>>>>>> it >> >>>>>>>>>>> should return errors similar to what you are trying to do, so >> you >> >>>>>>>>>>> should >> >>>>>>>>>>> be >> >>>>>>>>>>> able to duplicate the problem that way. >> >>>>>>>>>>> >> >>>>>>>>>>> David Lang >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> On Tue, 23 Feb 2016, Kane Kim wrote: >> >>>>>>>>>>> >> >>>>>>>>>>> Looking at omkafka module source code it seems that it relies >> on >> >>>>>>>>>>> rsyslog >> >>>>>>>>>>> >> >>>>>>>>>>> retries in DoAction, returning RS_RET_SUSPENDED: >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> DBGPRINTF("omkafka: writeKafka returned %d\n", iRet); >> >>>>>>>>>>>> if(iRet != RS_RET_OK) { >> >>>>>>>>>>>> iRet = RS_RET_SUSPENDED; >> >>>>>>>>>>>> } >> >>>>>>>>>>>> >> >>>>>>>>>>>> I've tried similar code in DoAction, it seems that action >> >>>>>>>>>>>> processing >> >>>>>>>>>>>> is suspended at that point and messages are not retried to >> >>>>>>>>>>>> module. >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> On Tue, Feb 23, 2016 at 9:06 AM, Kane Kim >> >>>>>>>>>>>> <[email protected]> >> >>>>>>>>>>>> wrote: >> >>>>>>>>>>>> >> >>>>>>>>>>>> Thanks for your help Rainer, I'll try to debug what's going >> on, >> >>>>>>>>>>>> so >> >>>>>>>>>>>> far >> >>>>>>>>>>>> >> >>>>>>>>>>>> it >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> seems rsyslog doesn't retry even with batch size 1. It >> doesn't >> >>>>>>>>>>>>> retry >> >>>>>>>>>>>>> if I >> >>>>>>>>>>>>> return RS_RET_SUSPENDED from DoAction as well. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On Tue, Feb 23, 2016 at 9:05 AM, Rainer Gerhards >> >>>>>>>>>>>>> <[email protected] >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>> 2016-02-23 18:03 GMT+01:00 David Lang <[email protected]>: >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> On Tue, 23 Feb 2016, Rainer Gerhards wrote: >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> 2016-02-23 17:38 GMT+01:00 Kane Kim <[email protected] >> >: >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> Hello Rainer, thanks for the prompt reply! To give you >> some >> >>>>>>>>>>>>>>>>> context: >> >>>>>>>>>>>>>>>>> I >> >>>>>>>>>>>>>>>>> want >> >>>>>>>>>>>>>>>>> to write module that both using batching and also can't >> >>>>>>>>>>>>>>>>> loose >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> messages in >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> any circumstances. Are you saying it is by design that >> rsyslog >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> can't >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> do >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> that together? According to documentation rsyslog will retry >> >>>>>>>>>>>>>> if >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> module >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> returns any error. Do you plan to fix this in rsyslog or >> >>>>>>>>>>>>>>>>> update >> >>>>>>>>>>>>>>>>> documentation to say batching and retries don't work? >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> It depends on many things. In almost all cases, the retry >> >>>>>>>>>>>>>>>> should >> >>>>>>>>>>>>>>>> work >> >>>>>>>>>>>>>>>> well (and does so in practice). Unfortunately, I am pretty >> >>>>>>>>>>>>>>>> swamped. >> >>>>>>>>>>>>>>>> I >> >>>>>>>>>>>>>>>> need to go to a conference tomorrow and have had quite >> some >> >>>>>>>>>>>>>>>> unexpected >> >>>>>>>>>>>>>>>> work today. It would probably be good if you could ping me >> >>>>>>>>>>>>>>>> next >> >>>>>>>>>>>>>>>> week >> >>>>>>>>>>>>>>>> to see if we can look into more details what is causing >> you >> >>>>>>>>>>>>>>>> pain. >> >>>>>>>>>>>>>>>> But >> >>>>>>>>>>>>>>>> I can't guarantee that I will be available early next >> week. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> In general, we cannot handle a fatal error here from an >> >>>>>>>>>>>>>>>> engine >> >>>>>>>>>>>>>>>> PoV, >> >>>>>>>>>>>>>>>> because everything is already processed and we do no >> longer >> >>>>>>>>>>>>>>>> have >> >>>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>> original messages. This is simply needed if you want to >> >>>>>>>>>>>>>>>> process >> >>>>>>>>>>>>>>>> messages one after another through the full config (a goal >> >>>>>>>>>>>>>>>> for v8 >> >>>>>>>>>>>>>>>> that >> >>>>>>>>>>>>>>>> was muuuuch requested). As I said, the solution is to use >> >>>>>>>>>>>>>>>> batches of >> >>>>>>>>>>>>>>>> one, because otherwise we would really need to turn back >> >>>>>>>>>>>>>>>> time >> >>>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>> undo >> >>>>>>>>>>>>>>>> everything that was already done on the messages in >> question >> >>>>>>>>>>>>>>>> by >> >>>>>>>>>>>>>>>> other >> >>>>>>>>>>>>>>>> modules (including state advances). >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> I thought that if a batch failed, it pushed all the >> messages >> >>>>>>>>>>>>>>> back >> >>>>>>>>>>>>>>> on >> >>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>> queue and retried with a half size batch until it got to >> the >> >>>>>>>>>>>>>>> one >> >>>>>>>>>>>>>>> message >> >>>>>>>>>>>>>>> that could not be processed and only did a fatal fail on >> that >> >>>>>>>>>>>>>>> message. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> Now, there is a big difference between a module giving a >> hard >> >>>>>>>>>>>>>>> error >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> "this >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> message is never going to be able to be processed no matter >> >>>>>>>>>>>>>> how >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> many >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> times >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> it's retried" vs a soft error "there is a problem delivering >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> things >> >>>>>>>>>>>>>>> to >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> this >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> destination right now, retry later". I thought the batch >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> processing >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> handled >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> these differently. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> That's no longer possible with v8, at least generically. As >> I >> >>>>>>>>>>>>>> said >> >>>>>>>>>>>>>> above, we would need to turn back time. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> But I really run out of time now... >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Rainer >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> David Lang >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> _______________________________________________ >> >>>>>>>>>>>>>>> rsyslog mailing list >> >>>>>>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>>>>>>>> What's up with rsyslog? Follow >> https://twitter.com/rgerhards >> >>>>>>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are >> ARCHIVED >> >>>>>>>>>>>>>>> by a >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> myriad of >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST >> >>>>>>>>>>>>>> if >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> you >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> DON'T >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> LIKE THAT. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> _______________________________________________ >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> rsyslog mailing list >> >>>>>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>>>>>>> What's up with rsyslog? Follow >> https://twitter.com/rgerhards >> >>>>>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED >> >>>>>>>>>>>>>> by >> >>>>>>>>>>>>>> a >> >>>>>>>>>>>>>> myriad >> >>>>>>>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT >> >>>>>>>>>>>>>> POST >> >>>>>>>>>>>>>> if >> >>>>>>>>>>>>>> you >> >>>>>>>>>>>>>> DON'T LIKE THAT. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> _______________________________________________ >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> rsyslog mailing list >> >>>>>>>>>>>> >> >>>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED >> by >> >>>>>>>>>>>> a >> >>>>>>>>>>>> myriad >> >>>>>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT >> POST >> >>>>>>>>>>>> if >> >>>>>>>>>>>> you >> >>>>>>>>>>>> DON'T >> >>>>>>>>>>>> LIKE THAT. >> >>>>>>>>>>>> >> >>>>>>>>>>>> _______________________________________________ >> >>>>>>>>>>>> >> >>>>>>>>>>>> rsyslog mailing list >> >>>>>>>>>>> >> >>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED >> by a >> >>>>>>>>>>> myriad >> >>>>>>>>>>> of >> >>>>>>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >> >>>>>>>>>>> you >> >>>>>>>>>>> DON'T >> >>>>>>>>>>> LIKE THAT. >> >>>>>>>>>>> >> >>>>>>>>>>> _______________________________________________ >> >>>>>>>>>>> >> >>>>>>>>>> rsyslog mailing list >> >>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by >> a >> >>>>>>>>>> myriad >> >>>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST >> if >> >>>>>>>>>> you >> >>>>>>>>>> DON'T LIKE THAT. >> >>>>>>>>>> >> >>>>>>>>>> _______________________________________________ >> >>>>>>>>>> >> >>>>>>>>>> rsyslog mailing list >> >>>>>>>>> >> >>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> >>>>>>>>> myriad >> >>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST >> if >> >>>>>>>>> you >> >>>>>>>>> DON'T LIKE THAT. >> >>>>>>>>> >> >>>>>>>>> _______________________________________________ >> >>>>>>>>> >> >>>>>>>> rsyslog mailing list >> >>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> >>>>>>>> myriad >> >>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >> >>>>>>>> you >> >>>>>>>> DON'T LIKE THAT. >> >>>>>>>> >> >>>>>>>> _______________________________________________ >> >>>>>>>> >> >>>>>>> rsyslog mailing list >> >>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>>>> http://www.rsyslog.com/professional-services/ >> >>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> >>>>>>> myriad >> >>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >> >>>>>>> you >> >>>>>>> DON'T LIKE THAT. >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> _______________________________________________ >> >>>>> >> >>>>> rsyslog mailing list >> >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>>> http://www.rsyslog.com/professional-services/ >> >>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> >>>>> myriad >> >>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >> you >> >>>>> DON'T LIKE THAT. >> >>>>> >> >>>>> _______________________________________________ >> >>>> >> >>>> rsyslog mailing list >> >>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >>>> http://www.rsyslog.com/professional-services/ >> >>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >> myriad >> >>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> >>>> DON'T LIKE THAT. >> >>>> >> >>> >> >> _______________________________________________ >> >> rsyslog mailing list >> >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> >> http://www.rsyslog.com/professional-services/ >> >> What's up with rsyslog? Follow https://twitter.com/rgerhards >> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >> >> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> DON'T >> >> LIKE THAT. >> >> >> > _______________________________________________ >> > rsyslog mailing list >> > http://lists.adiscon.net/mailman/listinfo/rsyslog >> > http://www.rsyslog.com/professional-services/ >> > What's up with rsyslog? Follow https://twitter.com/rgerhards >> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >> of >> > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T >> > LIKE THAT. >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com/professional-services/ >> What's up with rsyslog? Follow https://twitter.com/rgerhards >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >> DON'T LIKE THAT. >> > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T > LIKE THAT. _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

