Thanks, David, I'll give it a try. What about the TryResume/DoAction question? I changed TryResume to always return RS_RET_OK and DoAction to return RS_RET_SUSPENDED, and added debug output to both. When I log one line, DoAction is called once and is never retried after that.
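
To make the question concrete, here is a minimal, self-contained sketch of that test setup. It is not the actual module code: the stub names, the stand-in return-code enum, and the retry driver are all assumptions made for illustration. In particular, the driver models the behavior I expected (a successful TryResume followed by another DoAction attempt), not what rsyslog's action engine actually does.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in return codes for this sketch only; a real output module takes
 * rsRetVal and the RS_RET_* constants from the rsyslog headers. */
typedef enum { RS_RET_OK = 0, RS_RET_SUSPENDED = -2007 } rsRetVal;

static int tryResumeCalls = 0;
static int doActionCalls = 0;

/* Hypothetical stub: always claims the destination is reachable again. */
static rsRetVal stubTryResume(void)
{
    ++tryResumeCalls;
    fprintf(stderr, "stub: tryResume called (%d)\n", tryResumeCalls);
    return RS_RET_OK;
}

/* Hypothetical stub: every delivery attempt fails with a soft error. */
static rsRetVal stubDoAction(const char *msg)
{
    assert(msg != NULL);
    ++doActionCalls;
    fprintf(stderr, "stub: doAction called (%d) for \"%s\"\n",
            doActionCalls, msg);
    return RS_RET_SUSPENDED;
}

/* Hypothetical driver showing the EXPECTED behavior: after each successful
 * tryResume, the same message is handed to doAction again.
 * Returns the attempt number that succeeded, or -1 if none did. */
static int deliverWithRetry(const char *msg, int maxAttempts)
{
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (stubTryResume() != RS_RET_OK)
            continue;               /* still suspended, try again later */
        if (stubDoAction(msg) == RS_RET_OK)
            return attempt;         /* message delivered */
    }
    return -1;                      /* gave up, message still undelivered */
}
```

With these stubs I would expect one DoAction debug line per retry attempt. What I actually see in rsyslog is a single DoAction call and nothing after it.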

On Thu, Feb 25, 2016 at 3:11 PM, David Lang <[email protected]> wrote:

> Ok, this is the change I remembered seeing. It's going to be in 8.17, but
> missed 8.16, so try a nightly build or build from git and see if you still
> have a problem.
>
> commit b435f4e7d2ece7f2ea0a7b42826498e224be3f23
> Author: Rainer Gerhards <[email protected]>
> Date:   Wed Feb 3 16:32:07 2016 +0100
>
>     bugfix: queue engine can loose one message during queue shutdown
>
>     ... due to improper checking of return states.
>
>     closes https://github.com/rsyslog/rsyslog/issues/262
>
> David Lang
>
> On Thu, 25 Feb 2016, Kane Kim wrote:
>
>> Date: Thu, 25 Feb 2016 14:56:03 -0800
>> From: Kane Kim <[email protected]>
>> Reply-To: rsyslog-users <[email protected]>
>> To: rsyslog-users <[email protected]>
>> Subject: Re: [rsyslog] retry if output module returns RS_RET_SUSPENDED
>>
>> I'm trying to post to kafka reliably from rsyslog. The omkafka module uses
>> doAction to post to kafka. I've changed doAction to always return
>> RS_RET_SUSPENDED, which is what would happen on any error posting to
>> kafka. There are a couple of use cases where I want to make sure we are
>> not losing messages:
>> 1. log a couple of lines, restart rsyslog - make sure those lines are not
>>    lost (on graceful restart rsyslog always loses the in-flight message)
>> 2. the same as above, but kill -9 rsyslog (it seems that rsyslog loses
>>    all uncommitted messages)
>>
>> So far I can't make it work, some messages are always lost.
>> Also, rsyslog's retry handling seems to rely on tryResume blocking before
>> doAction is called, e.g. if TryResume can't reach kafka it will block
>> until kafka is reachable again and DoAction will *probably* work. The
>> issue is when TryResume returns success and then DoAction fails: in this
>> case rsyslog never retries that message.
>>
>> On Wed, Feb 24, 2016 at 5:37 PM, David Lang <[email protected]> wrote:
>>
>>> I saw something in a changelog recently about how the first message could
>>> be lost under some conditions. Try the current nightly build (I think
>>> this was in something post 8.16) and see if the same thing happens.
>>>
>>> David Lang
>>>
>>> On Wed, 24 Feb 2016, Kane Kim wrote:
>>>
>>>> Date: Wed, 24 Feb 2016 17:27:27 -0800
>>>> From: Kane Kim <[email protected]>
>>>> Reply-To: rsyslog-users <[email protected]>
>>>> To: rsyslog-users <[email protected]>
>>>> Subject: Re: [rsyslog] retry if output module returns RS_RET_SUSPENDED
>>>>
>>>> BTW, another question. I've created a dummy plugin with all endpoints
>>>> returning RS_RET_SUSPENDED
>>>> (BeginTransaction/TryResume/DoAction/EndTransaction).
>>>> I'm logging 3 lines: 1, 2 and 3. queue.saveonshutdown is set to
>>>> "on", action.resumeRetryCount="-1"
>>>> action.reportSuspensionContinuation="on"
>>>> action.resumeInterval="1" queue.dequeuebatchsize="1" queue.type="Disk".
>>>> So rsyslog is retrying line "1" indefinitely, then I stop and restart
>>>> rsyslog, it starts to retry line "2", and after another restart it
>>>> retries line "3". After the 3rd restart the queue is deleted and nothing
>>>> is retried. Apparently none of the lines were committed successfully by
>>>> the output module and they are all lost.
>>>>
>>>> Is this expected behavior?
>>>>
>>>> On Tue, Feb 23, 2016 at 9:11 AM, Kane Kim <[email protected]> wrote:
>>>>
>>>>> Looking at the omkafka module source code, it seems that it relies on
>>>>> rsyslog retries in DoAction, returning RS_RET_SUSPENDED:
>>>>>
>>>>>     DBGPRINTF("omkafka: writeKafka returned %d\n", iRet);
>>>>>     if(iRet != RS_RET_OK) {
>>>>>         iRet = RS_RET_SUSPENDED;
>>>>>     }
>>>>>
>>>>> I've tried similar code in DoAction. It seems that action processing is
>>>>> suspended at that point and messages are not retried to the module.
>>>>>
>>>>> On Tue, Feb 23, 2016 at 9:06 AM, Kane Kim <[email protected]> wrote:
>>>>>
>>>>>> Thanks for your help Rainer, I'll try to debug what's going on. So far
>>>>>> it seems rsyslog doesn't retry even with batch size 1. It doesn't
>>>>>> retry if I return RS_RET_SUSPENDED from DoAction as well.
>>>>>>
>>>>>> On Tue, Feb 23, 2016 at 9:05 AM, Rainer Gerhards
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> 2016-02-23 18:03 GMT+01:00 David Lang <[email protected]>:
>>>>>>>
>>>>>>>> On Tue, 23 Feb 2016, Rainer Gerhards wrote:
>>>>>>>>
>>>>>>>>> 2016-02-23 17:38 GMT+01:00 Kane Kim <[email protected]>:
>>>>>>>>>
>>>>>>>>>> Hello Rainer, thanks for the prompt reply! To give you some
>>>>>>>>>> context: I want to write a module that both uses batching and
>>>>>>>>>> also can't lose messages in any circumstances. Are you saying it
>>>>>>>>>> is by design that rsyslog can't do both together? According to
>>>>>>>>>> the documentation rsyslog will retry if the module returns any
>>>>>>>>>> error. Do you plan to fix this in rsyslog or update the
>>>>>>>>>> documentation to say batching and retries don't work?
>>>>>>>>>
>>>>>>>>> It depends on many things. In almost all cases, the retry should
>>>>>>>>> work well (and does so in practice). Unfortunately, I am pretty
>>>>>>>>> swamped. I need to go to a conference tomorrow and have had quite
>>>>>>>>> some unexpected work today. It would probably be good if you could
>>>>>>>>> ping me next week to see if we can look in more detail at what is
>>>>>>>>> causing you pain. But I can't guarantee that I will be available
>>>>>>>>> early next week.
>>>>>>>>>
>>>>>>>>> In general, we cannot handle a fatal error here from an engine PoV,
>>>>>>>>> because everything is already processed and we no longer have the
>>>>>>>>> original messages. This is simply needed if you want to process
>>>>>>>>> messages one after another through the full config (a goal for v8
>>>>>>>>> that was muuuuch requested). As I said, the solution is to use
>>>>>>>>> batches of one, because otherwise we would really need to turn
>>>>>>>>> back time and undo everything that was already done on the
>>>>>>>>> messages in question by other modules (including state advances).
>>>>>>>>
>>>>>>>> I thought that if a batch failed, it pushed all the messages back on
>>>>>>>> the queue and retried with a half size batch until it got to the one
>>>>>>>> message that could not be processed and only did a fatal fail on
>>>>>>>> that message.
>>>>>>>>
>>>>>>>> Now, there is a big difference between a module giving a hard error
>>>>>>>> "this message is never going to be able to be processed no matter
>>>>>>>> how many times it's retried" vs a soft error "there is a problem
>>>>>>>> delivering things to this destination right now, retry later". I
>>>>>>>> thought the batch processing handled these differently.
>>>>>>>
>>>>>>> That's no longer possible with v8, at least generically. As I said
>>>>>>> above, we would need to turn back time.
>>>>>>>
>>>>>>> But I really run out of time now...
>>>>>>>
>>>>>>> Rainer
>>>>>>>
>>>>>>>> David Lang
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> rsyslog mailing list
>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>>>>>>>> POST if you DON'T LIKE THAT.

