OK, this was actually a tougher issue than I thought. After a couple of
hours of testing, the fix looks good. There was some potential for a
very slight optimization, but I left it out because it would have
required considerable work to ensure that the rest of the engine still
works (hint: testing showed it did not work without further changes).
So I am not 100% happy, but I think this issue is solved. More details
from the commit text:

https://github.com/rsyslog/rsyslog/pull/4688

The call rscript statement is able to call a ruleset either synchronously or
asynchronously. We did this because practice showed that both modes
are needed. For various reasons we decided to make async
calls if the ruleset has a queue assigned and sync calls if not.

To know if a "queue is assigned" we just checked if queue parameters were
given. We overlooked the case of someone explicitly specifying a
"direct queue", aka "no queue". As such, queue.type="direct" triggered async
calls. That in turn meant that when a write operation to a variable was
made inside that ruleset, other rulesets might or might not see the
write. While it was often not seen, this was a data race where the
change could also be seen by the outside.

This is now fixed. No matter if queue.type="direct" is specified or
left out, the call will always be synchronous. Any values written to
variables will also be seen by the "outside world" in later processing
stages.
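
As an illustration, here is a minimal sketch of the kind of configuration
that was affected. The ruleset names, the variable name and the file path
are made up for this example and are not taken from the commit:

```
# hypothetical ruleset with an explicitly specified direct queue
ruleset(name="inner" queue.type="direct") {
    set $.marker = "seen";
}

ruleset(name="outer") {
    call inner
    # before the fix, the call ran async, so this check raced with the
    # write above; with the fix, the call is sync and the write is visible
    if $.marker == "seen" then {
        action(type="omfile" file="/var/log/example.log")
    }
}
```

The same applies if queue.type="direct" is simply left out of the inner
ruleset.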

Note that this has some potential to BREAK EXISTING CONFIGURATIONS.
We deem this acceptable because:

1. this was racy anyway, so unexpected behaviour could always occur
2. it is actually unlikely that someone used the triggering conditions
in practice. But we cannot rule this out, especially when the
configuration was auto-generated.

Potential compatibility issues can be solved by defining a small
array-memory queue on the ruleset in question instead of specifying
direct type.
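
A sketch of that workaround, assuming "array-memory queue" refers to the
FixedArray queue type. The ruleset name, queue size and file path are
example values only:

```
# hypothetical ruleset; a real (non-direct) queue keeps the call async
ruleset(name="inner" queue.type="FixedArray" queue.size="100") {
    action(type="omfile" file="/var/log/example.log")
}
```

With a queue like this on the ruleset, call posts the message to the
queue as before, so do not rely on variables written inside the ruleset
being visible to the caller.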

Again, we expect that almost all users will never experience any
problems. If you do, however, please let us know: we may add an
option to re-enable the bug.


Thanks everyone for being persistent.

Rainer

On Fri, 17 Sep 2021 at 14:41, Rainer Gerhards
(<[email protected]>) wrote:
>
> David,
>
> I reconsidered your point and am currently having a more in-depth
> look. Maybe something is indeed fishy... Just for everyone's info.
> Will post more when I know more ;-)
>
> Rainer
>
> On Fri, 17 Sep 2021 at 11:21, Rainer Gerhards
> (<[email protected]>) wrote:
> >
> > > the issue is that throughout the documentation, we say that not
> > > specifying a queue is the same as specifying a queue type of direct.
> > > In this case, it's not.
> >
> > no, no!
> >
> > All is right, except that the "call" statement calls a ruleset async
> > even when only a queue type direct is set.
> >
> > Behaviour of queues is as it always was, and direct does not de-couple.
> >
> > It is "call" which is executed on a different thread when we detect
> > queue parameters on a ruleset. Anything else is correct. This can be
> > clarified here:
> >
> > https://www.rsyslog.com/doc/master/rainerscript/rainerscript_call.html
> >
> > but nowhere else!
> >
> > Tech details: if call detects a queue on the ruleset, it posts the
> > message to that queue. Otherwise, it runs the ruleset as a subroutine
> > (links directly to the ruleset's AST). This ensures synchronicity even
> > when running on multiple threads.
> >
> > Rainer