On Tue, 5 Nov 2013, Rainer Gerhards wrote:

On Mon, Nov 4, 2013 at 7:13 PM, Pavel Levshin <[email protected]> wrote:


04.11.2013 20:16, Rainer Gerhards:


Let's ignore for a moment that I say it is a bug and take a somewhat broader
look at it: why do we actually need this capability? What are the use
cases? I am asking because, at least in the new engine, this will have
noticeable overhead. I have also seen some features inside the engine that
looked "cool", but after 6 years nobody seems to have used them. I'd
like to avoid introducing new ones like that. Feedback appreciated.


For me, this feature (the possibility to take a message from one ruleset and
submit it for processing into another ruleset with its own queue) could be
useful to decouple different actions. If the secondary queue just discards
messages when it is full, then it will not interfere with the primary queue.
Also, the secondary queue could have different thread settings, which affects
performance.
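
As a rough illustration of the decoupling described above (Python, not
rsyslog configuration; the names submit_or_drop and secondary are
hypothetical and nothing here reflects rsyslog internals), a bounded
secondary queue that drops on overflow instead of blocking the producer
might be sketched like this:

```python
import queue


def submit_or_drop(secondary, msg):
    """Enqueue msg without blocking; return False if it had to be dropped."""
    try:
        secondary.put_nowait(msg)
        return True
    except queue.Full:
        # The secondary queue is full: discard the message so the caller
        # (standing in for the primary queue's worker) is never held up.
        return False


# Assumed capacity of 2, chosen only to make the overflow visible.
secondary = queue.Queue(maxsize=2)
results = [submit_or_drop(secondary, f"msg {i}") for i in range(4)]
print(results)
```

The first two submissions fit and the remaining two are dropped, so the
producer never waits on the secondary queue.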

Currently, we can attach a queue to a single action, but this approach
cannot be applied to multiple actions at once (or it is possible with
omruleset only). Therefore, we cannot asynchronously parse/modify a message
and then output it to a file. I haven't tried omruleset, because it is
declared obsolete.


ah, ok, so the actual use case is to run some activity in the background.
This was not among the original design goals (years ago...). It used to
work well with just the right number of worker threads, and this was what
was the idea behind it (that was long before the problematic message
modification modules came up). But I think you are digging deeper and want
the ability to do a totally different kind of background activity. Makes
sense to me, but I still wonder how important this use case is. Also I
would be interested in comments from others on that case.

The thing is that it's sometimes preferable to be able to manage queues at a level higher than individual actions. When you have many actions, each with its own queue, it's hard (if not impossible) to manage the memory/disk space that's used by rsyslog, since all the different queues operate independently.

Also, with disk queues, there is a lot of overhead with the queue options; having to do those many times for different actions is inefficient.

And finally, sometimes you want to couple different actions together (you always send the messages to two different systems), and so you want to have them share the queues.

In the current configuration, it seems unnatural that a queue is a
property of an action or a ruleset. As I see it, a queue is a separate
entity. A message may be passed to a queue from an input module or from
another place (such as omruleset, better named omqueue in this case, or
another new statement). Then, the message can be dequeued towards a
ruleset, which may consist of one or more statements (actions).


Well, in rsyslog design it is quite natural. There are actually two
different kinds of queues:

1. main queues, which are used as a buffer between an input source and
processing stage
2. action queues, which are used to buffer slow actions

Rulesets have queues because their primary use is to enable inputs to bind
to them. So they need to have main queue capabilities. Later, the desire to
call rulesets like subroutines was voiced, and we permitted this first via
omruleset (thus the name!). That it duplicated the message was just a
much-hated (!) side effect. When I implemented call for the first time, it
was the desire to remove that side effect. Some time after, it was asked
that call should submit to the queue, if the ruleset had one. I was a bit
surprised but (I think; this is what I am currently puzzled about, as it is
not in git) "fixed" that.

Long story short -- lots of features have crept in that rsyslog was
never designed to do, and many were rooted in the use of side effects that
were not expected to be used in that way. The question is whether this
feature creep is actually something good; I have been very liberal with
that in the past 12 months, and now begin to reconsider that. Adding
something quickly may not be the right thing to do, even if it solves a use
case --- it ends up opening one can of worms after the other. This is one
reason I am thinking so hard about what to support. BTW: rewriting the
engine brings up all of these undesired (but existing) side effects and it
really shows how bad some of them are.



Another rather strange property of the current engine is the possibility to
attach an asynchronous queue to a message modification action. Messages are
passed by reference to the queue, which means that they will be modified
asynchronously, and their content is unpredictable for both the main and
secondary queues. Then, after modification, the message is just discarded
from the secondary queue. This asynchronous processing, nevertheless, makes
sense for real output actions, when a message can be considered read-only.


Yet another undesired side effect -- it was never thought that someone
would do that (TBH, this sounds outright crazy to *me* [and probably just
me ;)]). Yesterday I already noted down on the todo list that in v7
calling an output module with message passing mode will lead to a warning
at rsyslog startup.


Thus, there should be two ways to pass a message to the secondary queue:
by value (as a copy), or by reference. It could be automated with
copy-on-write, but this can lead to unneeded copying if the message is not
used in the main queue.


ack

Using CoW makes a lot of sense. I don't think it makes sense to do a non-CoW copy by reference, because the 'message' can be changed by the other thread (either via a mm module, or by just getting other variables set).
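
To make the copy-on-write idea concrete, here is a minimal Python sketch.
It is only an illustration under stated assumptions: the class CowMessage
and its methods are hypothetical, it is not thread-safe, and it does not
describe how rsyslog represents messages. Two handles share one set of
fields until one of them writes, at which point the writer takes a private
copy.

```python
class CowMessage:
    """Hypothetical message handle that shares its fields until first write."""

    def __init__(self, fields, shared=False):
        self._fields = fields
        self._shared = shared

    def share(self):
        # Hand out a second handle on the same dict. Both handles are now
        # marked shared, so each must copy before its first write.
        self._shared = True
        return CowMessage(self._fields, shared=True)

    def get(self, key):
        return self._fields[key]

    def set(self, key, value):
        if self._shared:
            # First write through this handle: take a private copy.
            self._fields = dict(self._fields)
            self._shared = False
        self._fields[key] = value


main = CowMessage({"msg": "original"})
secondary = main.share()          # no copy yet, both see the same fields
secondary.set("msg", "rewritten by a modification module")
```

After the write, main still reads "original" while secondary sees the
rewritten text. In this simplified version the other handle also copies on
its own first write even when nobody else holds the data any more; a
reference count would avoid that extra copy.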


To summarize: the current call behaviour makes perfect sense (because call
implies return), but, to be consistent, ruleset objects should not contain
a queue as a property. Or, at least, there should be a warning when the
queue is not of "direct" type.

And then, the possibility to resubmit a message into another queue is also
helpful. It can be done externally (via omfwd/imudp, for example), but, for
performance reasons, it is certainly better to do this internally. Maybe
omruleset is the right way to go; it only lacks the possibility to pass the
message by reference for better performance.

I disagree, I think both types make perfect sense, but for different situations.

call/return (i.e. direct queue) makes sense when you are using the ruleset to simplify your config by taking something that you may have to do in many places in the rules and only entering it once (or where you want to output to one file from many places in the rules, after different manipulation has been done to the message).

async call (i.e. non-direct queue) can be used in many of the same places that call/return can be used, but it can also be used for many cases where call/return won't work (say you are sending to remote machines and you want to limit the total disk space used by your queues; otherwise you are stuck either using a single queue, or doing a queue per destination where you may have messages duplicated in the different queues).

In fact, I think the cases where call/return matters (i.e. something in the ruleset modifies the message and then things outside the ruleset depend on the change) are probably the rarer need
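
The difference between the two calling styles can be sketched in a few
lines of Python. This is an illustration only: tag_ruleset, call_direct,
call_queued and drain are hypothetical names, drain stands in for the
ruleset's worker threads, and passing a copy to the queue is an assumption
made here (the thread above leaves by-value versus by-reference open).

```python
import copy
import queue


def tag_ruleset(msg):
    """Hypothetical ruleset body that modifies the message it is given."""
    msg["tag"] = "parsed"


def call_direct(ruleset, msg):
    # call/return: the ruleset runs inline, so the caller sees its changes.
    ruleset(msg)


def call_queued(ruleset_queue, msg):
    # async call: hand a copy to the ruleset's queue and return immediately.
    ruleset_queue.put(copy.deepcopy(msg))


def drain(ruleset, ruleset_queue):
    # Stand-in for the worker thread(s) that would dequeue in the background.
    while not ruleset_queue.empty():
        ruleset(ruleset_queue.get())


a = {"msg": "hello"}
call_direct(tag_ruleset, a)       # a now carries the "tag" field

b = {"msg": "hello"}
q = queue.Queue()
call_queued(q, b)
drain(tag_ruleset, q)             # only the queued copy is modified
```

With the direct call, later rules working on a can depend on the tag; with
the queued call, b is unchanged, which is the case where things outside the
ruleset cannot rely on the modification.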

David Lang

omruleset is definitely not the way to go ;)

My conclusion is that it looks like a call-by-queue-submit should be
supported, as I originally thought. I just need to find out if I was just
dreaming of this bugfix (I have to admit I am *really* puzzled).

One other conclusion is to keep a bit more focussed on feature creep in the
future (see above).

Thanks again,
Rainer



--
Pavel Levshin


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.
