Re: [rsyslog] discussion request: performance enhancement for imtcp

david Wed, 09 Jun 2010 08:49:11 -0700

On Wed, 9 Jun 2010, Rainer Gerhards wrote:

> Date: Wed, 9 Jun 2010 10:52:55 +0200
> From: Rainer Gerhards <[email protected]>


>>> Also, strings are not generated before the actions, but while we are
>>> processing them. Doing it up front would require even more memory and
>>> processing time, because we would need to run over all actions twice
>>> (once to create the string, once to call the actions, storing all
>>> strings created).
>>> This does not even make sense from a lock contention POV because each
>>> action has a separate lock, so there can be no lock contention between
>>> different actions. The question of whether to generate all strings for
>>> ONE action upfront was the initial question, and I think we have
>>> reached some consensus on it (meaning that it is at least worth trying
>>> out the performance effects and then decide).
>>
>> I'm not quite clear on the granularity here.
>>
>> if I have the config
>>
>> *.* file1
>> *.* file2
>> *.* @ip1
>> *.* @ip2
>> *.* @@ip3
>> *.* @@ip4
>>
>> for purposes of the locking, how many separate things are there?
>
> Sorry, I think I should have defined some terms first.
>
> An *action* is a specific instance of some desired output. The actual
> processing carried out is NOT termed "action", even though one could easily
> do so. I have to admit I have not defined any term for that. So let's call
> this processing. That actual processing is carried out by the output module
> (and the really bad thing is that the entry point is named "doAction", which
> somewhat implies that the output module is called the action, which is not
> the case).
>
> Each action can use the service of exactly one output module. Each output
> module can provide services to many actions. So we have a N:1 relationship
> between actions and output modules.
>
>> depending on how I read your explanation, sometimes it sounds like 6
>> (one for each line) and sometimes it sounds like 3 (one for file output,
>> one for UDP send, one for TCP send)
>
> In the above samples, 3 output modules are involved, where each output module
> is used by two actions. We have 6 actions, and so we have 6 action locks.
>
> So the output module interface does not serialize access to the output
> module, but rather to the action instance. All action-specific data is kept
> in a separate, per-action data structure and passed into the output module at
> the time the doAction call is made. The output module can modify all of this
> instance data as if it were running on a single thread. HOWEVER, any global
> data items (in short: everything not inside the action instance data) are
> *not* synchronized by the rsyslog core. The output module must itself take
> care of synchronization if it desires to have concurrent access to such
> data items. All current output modules do NOT access global data other than
> for config parsing (which is serial and single-threaded by nature).
>
> I hope this clarifies. If not, please keep asking. It is important to get
> this right, and maybe I will finally end up expressing myself precisely
> enough ;)

this clarifies a lot, but not everything

if we can handle the following

*.* file1
*.* file2

by having two worker threads running, one working on file1 and one
working on file2 (with each having per-thread variables for everything
except the global data and the actual file descriptor being written to),

why couldn't we handle the case

*.* file1

with two threads working on different messages. (if there was a lock 
around the actual write to file1)?

I'm not understanding the difference between the two cases. I understand
that you have a lock to prevent this, but I don't understand what the lock
is protecting.
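
To make the two cases concrete, here is a rough sketch of the model as I
understand it from the description above. It is Python rather than
rsyslog's C, and the Action class, do_action(), the buffer and the batch
size are all invented for illustration; none of it is rsyslog's actual
code. It only shows one guess at what a per-action lock might cover: any
per-action state touched before the write, not just the write itself.

```python
import threading


class Action:
    """Hypothetical stand-in for one configured action (one config line)."""

    def __init__(self, target):
        self.target = target
        self.lock = threading.Lock()  # one lock per action, as described
        self.buffer = []              # assumed per-action instance data
        self.flushed = []             # what has been "written" to the target

    def do_action(self, msg):
        # Calls are serialized per action, so this code may touch the
        # instance data as if it were running on a single thread.
        with self.lock:
            self.buffer.append(msg)
            if len(self.buffer) >= 4:  # arbitrary batch size
                self.flushed.extend(self.buffer)
                self.buffer.clear()

    def total(self):
        return len(self.flushed) + len(self.buffer)


def worker(action, worker_id, count):
    for i in range(count):
        action.do_action(f"w{worker_id}-m{i}")


def run(actions_for_workers, count=1000):
    """Start one worker thread per list entry and wait for all of them."""
    threads = [
        threading.Thread(target=worker, args=(action, wid, count))
        for wid, action in enumerate(actions_for_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Case 1: two actions, one worker each; the two locks never contend.
file1, file2 = Action("file1"), Action("file2")
run([file1, file2])

# Case 2: one action, two workers; both serialize on the same lock.
shared = Action("file1")
run([shared, shared])

print(file1.total(), file2.total(), shared.total())
```

In this toy version the second case works, but only because the lock
spans the buffer handling as well as the flush. A lock around just the
write would leave the buffer unprotected, which may or may not be the
kind of state the real action lock is guarding.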

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
