Re: [rsyslog] discussion request: performance enhancement for imtcp

Rainer Gerhards Wed, 09 Jun 2010 01:53:05 -0700

> > Also, strings are not generated before the actions, but while we are
> > processing them. Doing it up front would require even more memory and
> > processing time, because we would need to run over all actions twice
> > (once to create the strings, once to call the actions, storing all
> > strings created).
> > This does not even make sense from a lock contention POV because each
> > action has a separate lock, so there can be no lock contention between
> > different actions. The question of whether to generate all strings for
> > ONE action upfront was the initial question, and I think we have
> > reached some consensus on it (meaning that it is at least worth trying
> > out the performance effects and then deciding).
> 
> I'm not quite clear on the granularity here.
> 
> if I have the config
> 
> *.* file1
> *.* file2
> *.* @ip1
> *.* @ip2
> *.* @@ip3
> *.* @@ip4
> 
> for purposes of the locking, how many separate things are there?


Sorry, I think I should have defined some terms first.

An *action* is a specific instance of some desired output. The actual
processing carried out is NOT termed "action", even though one could easily
do so. I have to admit I have not defined any term for that, so let's call
it "processing". That actual processing is carried out by the output module
(and the really bad thing is that the entry point is named "doAction", which
somewhat implies that the output module is called the action, which is not
the case).

Each action can use the service of exactly one output module. Each output
module can provide services to many actions. So we have an N:1 relationship
between actions and output modules.

> depending on how I read your explanation, sometimes it sounds like 6
> (one for each line) and sometimes it sounds like 3 (one for file output,
> one for UDP send, one for TCP send)

In the above sample, 3 output modules are involved, and each output module
is used by two actions. We have 6 actions, and so we have 6 action locks.
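
To make the counting concrete, here is a minimal sketch in Python (not rsyslog code). It assumes the three-way split used in this thread (file, UDP for "@", TCP for "@@"); the function names, the dict layout and the module labels are made up for illustration only.

```python
import threading
from collections import Counter

# The sample config from the question, one action per line.
CONFIG = """\
*.* file1
*.* file2
*.* @ip1
*.* @ip2
*.* @@ip3
*.* @@ip4
"""

def output_kind(target):
    """Classify a target the way the thread does: file, UDP (@) or TCP (@@).
    The labels are illustrative, not real module names."""
    if target.startswith("@@"):
        return "tcp"
    if target.startswith("@"):
        return "udp"
    return "file"

def build_actions(config):
    """Create one action, with its own lock, per non-empty config line."""
    actions = []
    for line in config.splitlines():
        if not line.strip():
            continue
        selector, target = line.split(None, 1)
        actions.append({
            "selector": selector,
            "target": target,
            "module": output_kind(target),  # N:1 link from action to module
            "lock": threading.Lock(),       # one lock per action
        })
    return actions

actions = build_actions(CONFIG)
modules = Counter(a["module"] for a in actions)
print(len(actions), "actions,", len(modules), "output modules")
```

The point of the sketch is only that locks are attached to the actions, so their number follows the number of config lines, not the number of modules.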

So the output module interface does not serialize access to the output
module, but rather to the action instance. All action-specific data is kept
in a separate, per-action data structure and passed into the output module at
the time the doAction call is made. The output module can modify all of this
instance data as if it were running on a single thread. HOWEVER, any global
data items (in short: everything not inside the action instance data) are
*not* synchronized by the rsyslog core. The output module must itself take
care of synchronization if it desires to have concurrent access to such
data items. All current output modules do NOT access global data other than
for config parsing (which is serial and single-threaded by nature).
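
As a rough illustration of that rule, the following Python sketch (again not rsyslog code) models a core that holds a per-action lock around each call, and a toy output module that touches both instance data and one global counter. All names here (Action, do_action, core_call, global_total) are hypothetical; only the locking pattern is the point.

```python
import threading

class Action:
    """Per-action instance data plus the lock the core holds around calls."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.count = 0  # instance data, only touched under the action lock

# Global state shared by all actions of this toy output module. The core
# does not protect it, so the module has to bring its own lock.
global_total = 0
global_lock = threading.Lock()

def do_action(action, msg):
    """Toy stand-in for an output module's doAction entry point."""
    global global_total
    action.count += 1      # safe: the caller holds action.lock
    with global_lock:      # module-level synchronization for global data
        global_total += 1

def core_call(action, msg):
    """What the core does here: serialize per action, not per module."""
    with action.lock:
        do_action(action, msg)

def worker(actions, n):
    for i in range(n):
        for a in actions:
            core_call(a, "msg %d" % i)

actions = [Action("file1"), Action("file2")]
threads = [threading.Thread(target=worker, args=(actions, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([a.count for a in actions], global_total)
```

Two threads can be inside do_action at the same time as long as they work on different actions, which is why the global counter needs global_lock while the per-action counter does not.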

I hope this clarifies things. If not, please keep asking. It is important to
get this right, and maybe I will finally end up expressing myself precisely
enough ;)

Rainer

> 
> >> note that the output lock is only needed when the two threads really
> >> are accessing the same thing (probably only for files, as you can have
> >> two network connections to the same destination at the same time, in
> >> which case you can use the path name as the lock id). For things like
> >> databases, network relays (including relp) it would probably be better
> >> if each worker thread opened its own connection. In these cases the
> >> destination is designed to accept messages in parallel on multiple
> >> connections anyway.
> >> The good news is that the more complex (and slower) sending methods
> >> also tend to be the ones that can have multiple outbound connections.
> >
> > I agree, but that's another quite large effort. None of the current
> > outputs are designed in that way, and it introduces quite some
> > complexity in error and recovery cases.  Right now, I'd consider this
> > the last thing that I'd address.
> 
> Ok, we'll discuss this when dealing with thread-safe output modules
> 
> >> I seem to remember reading in the module explanation that you do some
> >> trickery to take fairly normal code written in the module and make it
> >> thread-safe (by doing something with the variable access IIRC).
> >
> > That trick simply is the action lock -- so there is no concurrency at
> > that level. But I agree (and have begun to work on that idea) that it
> > would be useful to provide that capability, at least if the output
> > supports it. As it turned out today, there is still some other ground
> > to explore before going down that path.
> 
> ahh, that makes sense (I was puzzled over what trickery you had done to
> make the variables be thread-safe)
> 
> >> If you have this (and use the filename as the lock) you also gain
> >> protection against two different actions stepping on each other.
> >>
> >> I have a growing number of cases where I have things like
> >> :hostname, isequal, "foo" /var/log/messages;fixup_format
> >> & ~
> >> *.* /var/log/messages
> >>
> >> this works today if I'm sending over the network instead of writing to
> >> a file, but on my relay boxes (which do both) I have a number of
> >> corrupted messages each day due to the different actions stepping on
> >> each other.
> >
> > That is a bug I would be interested in finding. The threading model
> > does NOT allow for that possibility (I mean from a design point of
> > view: as you experience, it happens, but by design it should not be
> > possible). Still, I will keep myself focused on the performance
> > optimization for now. It doesn't make sense, now that I have gained
> > momentum and knowledge in that area again, to start another bug hunt
> > and lose that momentum. But that's definitely something I am
> > interested in, as it shows that something is fundamentally flawed.
> 
> Ok, one thing at a time.
> 
> >> note that if you do this output locking on files, it may be possible
> >> to do strange things like
> >>
> >> *.=info /var/log/messages
> >> *.=debug /var/log/messages
> >> etc
> >>
> >> and allow these to have multiple worker threads running so that each
> >> worker can be processing messages with different severity as different
> >> actions in parallel (with just a write lock around the final output to
> >> the file).
> >> This is far uglier than being able to do the action processing in
> >> parallel, but may work.
> >
> > ah, OK, I guess I get the picture. You are writing to files with more
> > than one action. That does not work well. Ruleset inclusion is the
> > current solution to it. In the long term, it may be useful to have a
> > single object that represents the file being written, no matter which
> > rule is used to do it. I'd say that's something that would go together
> > with the new config format...
> 
> I think that it wouldn't need any change to the configs. the more I
> think about it the more I think this is only really a significant
> problem for file output (and there it should be pretty trivial to
> implement), everything else can just have multiple sockets open (one
> per thread)
> 
> >> I don't see much here where threads handling one message instead of
> >> multiple messages could speed things up much. Since writes are not
> >> atomic, you still need the output locks (or multiple outputs) even if
> >> only processing one message at a time.
> >>
> >> single thread, single message is a simpler case, but in that case the
> >> locking will be very close to a no-op anyway (since there will never
> >> be contention)
> >
> > One thing that I found out during my research and testing is that it
> > pays to look at a far more granular level, and today's change is the
> > first real-world approach to this. Do not craft one method that does
> > it all, but look at the different config params and what they demand
> > (same for transactions, etc, etc.). Then code "driver"-like functions
> > for each specific case and call the right one for the config params
> > given. That way it is possible to provide high speed where it is
> > possible but provide some costly features as well. Then, they do not
> > affect the majority of cases that do not need them (in other words:
> > pay the performance penalty only if you also get some benefits from
> > them). The same holds true for some other optimizations that can only
> > be done when looking at a very fine-granular level. I think that it
> > will be possible to even get rid of locks altogether in some important
> > cases. I will most probably try to introduce some lock-free
> > alternative for the "mark" case, not only to cover it, but also to see
> > how it works in practice. Judging from my testing and research, it
> > should provide superb performance. If that turns out to be true, I see
> > much more potential for these methods.
> 
> sounds good. lock free will almost always win.
> 
> > I will try this at the moment, but at the expense of stability. The
> > next days, I'll try out at least some ideas and only after that I will
> > see what it takes to stabilize the engine in all cases again (getting
> > a too-large delta may make this stabilization too hard, doing the
> > stabilization too early distracts me from the real facts I intend to
> > look at - but who said life is easy ;)).
> 
> I'm going to get my lab set up again to test this.
> 
> David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
