I saw something in a changelog recently about how the first message could be
lost under some conditions. Try the current nightly build (I think this was in
something post 8.16) and see if the same thing happens.
David Lang
On Wed, 24 Feb 2016, Kane Kim wrote:
Date: Wed, 24 Feb 2016 17:27:27 -0800
From: Kane Kim <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: Re: [rsyslog] retry if output module returns RS_RET_SUSPENDED
BTW, another question: I've created a dummy plugin with all entry points
returning RS_RET_SUSPENDED
(BeginTransaction/TryResume/DoAction/EndTransaction).
I'm logging 3 lines: 1, 2 and 3. queue.saveonshutdown is set to "on",
along with action.resumeRetryCount="-1",
action.reportSuspensionContinuation="on", action.resumeInterval="1",
queue.dequeuebatchsize="1" and queue.type="Disk".
rsyslog retries line "1" indefinitely. When I stop and restart rsyslog, it
starts retrying line "2", and after another restart it retries line "3".
After the 3rd restart the queue is deleted and nothing is retried.
Apparently none of the lines were committed successfully by the output
module, and they are all lost.
Is this expected behavior?
On Tue, Feb 23, 2016 at 9:11 AM, Kane Kim <[email protected]> wrote:
Looking at the omkafka module source code, it seems that it relies on
rsyslog retries in DoAction, returning RS_RET_SUSPENDED:
    DBGPRINTF("omkafka: writeKafka returned %d\n", iRet);
    if(iRet != RS_RET_OK) {
        iRet = RS_RET_SUSPENDED;
    }
I've tried similar code in DoAction. It seems that action processing is
suspended at that point and the messages are not passed to the module again.
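A minimal sketch of the retry contract this pattern assumes, for readers
following along. This is not rsyslog code: RET_OK and RET_SUSPENDED are local
stand-ins for RS_RET_OK and RS_RET_SUSPENDED with arbitrary values, and
write_backend, do_action and deliver_with_retry are hypothetical names. It
only shows what a module author would expect from the caller when every
failure is reported as "suspended": the same message is offered again until
it is accepted.

```c
#include <stdio.h>

/* Local stand-ins for rsyslog's return codes; the values are arbitrary. */
typedef enum { RET_OK = 0, RET_ERR = 1, RET_SUSPENDED = 2 } ret_t;

/* Hypothetical backend: rejects the first `fail_first` calls, then accepts. */
static int backend_calls = 0;
static int fail_first = 2;

static ret_t write_backend(const char *msg)
{
    (void)msg; /* the payload does not matter for this sketch */
    backend_calls++;
    return backend_calls <= fail_first ? RET_ERR : RET_OK;
}

/* Same shape as the omkafka snippet: any failure becomes "suspended". */
static ret_t do_action(const char *msg)
{
    ret_t r = write_backend(msg);
    if (r != RET_OK) {
        r = RET_SUSPENDED;
    }
    return r;
}

/*
 * Assumed caller behavior: re-offer the SAME message until it is accepted.
 * max_retries < 0 means "retry forever" (like resumeRetryCount="-1").
 * Returns the number of attempts made, or -1 if the retries ran out.
 */
static int deliver_with_retry(const char *msg, int max_retries)
{
    int attempts = 0;
    for (;;) {
        attempts++;
        if (do_action(msg) == RET_OK) {
            return attempts;
        }
        if (max_retries >= 0 && attempts > max_retries) {
            return -1;
        }
        /* a real caller would wait for the resume interval here */
    }
}
```

With the defaults above, deliver_with_retry("1", -1) returns once the third
attempt is accepted. The behavior reported in this thread differs from that
expectation: the action is suspended and the message is not offered again.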
On Tue, Feb 23, 2016 at 9:06 AM, Kane Kim <[email protected]> wrote:
Thanks for your help, Rainer. I'll try to debug what's going on; so far it
seems rsyslog doesn't retry even with batch size 1. It doesn't retry if I
return RS_RET_SUSPENDED from DoAction either.
On Tue, Feb 23, 2016 at 9:05 AM, Rainer Gerhards <[email protected]> wrote:
2016-02-23 18:03 GMT+01:00 David Lang <[email protected]>:
On Tue, 23 Feb 2016, Rainer Gerhards wrote:
2016-02-23 17:38 GMT+01:00 Kane Kim <[email protected]>:
Hello Rainer, thanks for the prompt reply! To give you some context: I
want to write a module that both uses batching and can't lose messages
under any circumstances. Are you saying it is by design that rsyslog can't
do both together? According to the documentation, rsyslog will retry if a
module returns any error. Do you plan to fix this in rsyslog, or to update
the documentation to say batching and retries don't work together?
It depends on many things. In almost all cases, the retry should work
well (and does so in practice). Unfortunately, I am pretty swamped. I
need to go to a conference tomorrow and have had quite some unexpected
work today. It would probably be good if you could ping me next week
to see if we can look in more detail at what is causing you pain. But
I can't guarantee that I will be available early next week.
In general, we cannot handle a fatal error here from an engine PoV,
because everything is already processed and we no longer have the
original messages. This is simply needed if you want to process
messages one after another through the full config (a goal for v8 that
was muuuuch requested). As I said, the solution is to use batches of
one, because otherwise we would really need to turn back time and undo
everything that was already done on the messages in question by other
modules (including state advances).
I thought that if a batch failed, it pushed all the messages back on the
queue and retried with a half size batch until it got to the one message
that could not be processed and only did a fatal fail on that message.
Now, there is a big difference between a module giving a hard error "this
message is never going to be able to be processed no matter how many times
it's retried" vs a soft error "there is a problem delivering things to this
destination right now, retry later". I thought the batch processing handled
these differently.
That's no longer possible with v8, at least generically. As I said
above, we would need to turn back time.
But I'm really running out of time now...
Rainer
David Lang
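For reference, the halving retry David describes above can be sketched in a
few lines. This is an illustration only, not rsyslog's implementation:
submit_batch and process_halving are hypothetical names, and a "bad" message
is modeled as a negative integer. It also assumes that a rejected batch
leaves no side effects behind, which, as I read Rainer's reply, is the
assumption that no longer holds generically in v8.

```c
#include <stddef.h>

/*
 * Hypothetical batch sink: rejects the whole batch if it contains a
 * message that can never be processed (modeled here as a negative id).
 * Returns 0 on success, -1 on failure.
 */
static int submit_batch(const int *msgs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (msgs[i] < 0) {
            return -1;
        }
    }
    return 0;
}

/*
 * Submit msgs[0..n). On failure, split the batch in half and retry each
 * half, down to single messages. Only a message that fails on its own is
 * marked as permanently failed (failed[i] = 1).
 * Returns the number of permanently failed messages.
 */
static size_t process_halving(const int *msgs, size_t n, int *failed)
{
    if (n == 0) {
        return 0;
    }
    if (submit_batch(msgs, n) == 0) {
        for (size_t i = 0; i < n; i++) {
            failed[i] = 0;
        }
        return 0;
    }
    if (n == 1) {
        failed[0] = 1; /* isolated the message that cannot be processed */
        return 1;
    }
    size_t half = n / 2;
    return process_halving(msgs, half, failed)
         + process_halving(msgs + half, n - half, failed + half);
}
```

For a batch of five messages where only the third is bad, the other four
are delivered in sub-batches and only the third is marked as failed.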
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.