On Thu, 10 Jun 2010, Rainer Gerhards wrote:

>> this clarifies a lot, but not everything
>>
>> if we can handle the following
>>
>> *.* file1
>> *.* file2
>>
>> with having two worker threads running, one working on file1 and one
>> working on file2 (with each having per-thread variables for everything
>> except the global data and the actual file descriptor being written to).
>>
>> why couldn't we handle the case
>>
>> *.* file1
>>
>> with two threads working on different messages. (if there was a lock
>> around the actual write to file1)?
>>
>> I'm not understanding the difference between the two cases. I
>> understand that you have a lock to prevent this, but I don't understand
>> what the lock is protecting.
>
> Well, the simple answer is "because the plugin interface specifies it in this
> way". Remember that the interface was created around 2004 (mmmhhh, maybe a
> bit later, but in any case quite a while ago...) and the problems were very
> different from today's. So the main focus was on getting things done,
> and guaranteeing single-threadedness within an action definitely helped
> getting things done.
>
> Looking at the omfile implementation as an example, there *are* lots of
> things guarded by this lock. Just think about the dynafile cache, the ZIP writer or
> the background write process. All of them use relatively simple algorithms
> based on the assumption that the core guarantees exclusive use of the
> instance data structures. Removing that guarantee is a non-trivial task.
>
> HOWEVER, as I wrote, I will head into the direction where an output module
> can actually *request* the core to call it concurrently, even for a single
> action instance. This capability is part of the interface "spec", but was so
> far never implemented. When I finally do this, I need to check each plugin,
> and some will require large modifications to support that.
>
> I will probably begin with omfile, where the first thing again is to
> partition processing based on configuration selected. If you just use plain
> files, without zip, without async writing, and without other bells and
> whistles I currently do not have in mind, the algorithm can be greatly
> simplified and a single lock around the write loop would be sufficient. But
> for the dynafile case, I then need to create a lot of new code to ensure the
> cache structure is not damaged by concurrent access and I also need totally
> different ways to convey the fd to be used back to the actual file writer.
> All of this heavily depends on the exclusivity guaranteed by the interface
> spec. I am not even sure yet this finer lock granularity will be faster -- it
> may introduce more overhead than it saves (but I don't want to argue about
> this today, as it is too far away and I do not yet have a clear
> understanding. Being optimistic, it may be possible to do it lock-free).
>
> Note that the structure of this work is very closely related to what I
> currently do in regard to stage 1 action processing. So once I have tackled
> that, I think I have a quite good blueprint of the type of modifications I
> need to make.
>
> Please keep asking if there are still things that are not certain.

this makes lots of sense.

It may be worthwhile doing some testing on how much time is spent in the
output portion vs. time spent on the selection/string manipulation
(especially with the new high-speed fixed templates). I am speculating
that there is a significant amount of time spent in the selection/template
stage crafting the strings that are then written out (possibly by being
sent through zip, accessing a dynafile, etc). If so, there may be a
significant win in just moving the lock acquisition from the beginning of
the action to just before writing (by allowing one thread to be crafting
the string while another is writing it out).
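
To make the idea concrete, here is a rough sketch of the two lock placements. It is in Python for brevity (rsyslog itself is C), and every name in it (format_message, action_coarse, action_narrow, run) is made up for illustration; none of this is rsyslog code or its plugin interface.

```python
import io
import threading


def format_message(msg: str) -> str:
    """Hypothetical stand-in for the selection/template stage."""
    return f"<{len(msg)}> {msg.upper()}\n"


def action_coarse(lock, out, msg):
    """Lock taken at the beginning of the action (string crafting included)."""
    with lock:
        line = format_message(msg)
        out.write(line)


def action_narrow(lock, out, msg):
    """Lock taken just before writing; the string is crafted unlocked."""
    line = format_message(msg)  # touches only per-thread data
    with lock:
        out.write(line)


def run(action, messages, workers=2):
    """Feed the messages to `workers` threads that share one output and one lock."""
    lock = threading.Lock()
    out = io.StringIO()  # stands in for file1

    def worker(chunk):
        for m in chunk:
            action(lock, out, m)

    threads = [
        threading.Thread(target=worker, args=(messages[i::workers],))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue().splitlines()
```

Both variants write the same set of complete lines; only the order between threads may differ. The sketch assumes the formatting step touches nothing shared, which is exactly the part that would need checking in a real output module.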

I would expect the win to be less with more complex output
(dynafiles/compression) but I also suspect that high-speed sites will tend
to not use these features as much (especially if they make a noticeable
difference in the speed ;-)

something to try a quick test on sometime.
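
A quick test along these lines could look like the sketch below. It is Python rather than rsyslog's C, and craft and measure are hypothetical names with a made-up message format, so the numbers it produces say nothing about rsyslog itself; it only shows the shape of the measurement.

```python
import io
import time


def craft(msg: str) -> str:
    """Hypothetical template stage: build the line that will be written."""
    return "%s host app: %s\n" % (time.strftime("%b %d %H:%M:%S"), msg)


def measure(messages, out):
    """Return (seconds spent crafting strings, seconds spent writing them)."""
    t_format = 0.0
    t_write = 0.0
    for m in messages:
        t0 = time.perf_counter()
        line = craft(m)
        t1 = time.perf_counter()
        out.write(line)
        t2 = time.perf_counter()
        t_format += t1 - t0
        t_write += t2 - t1
    return t_format, t_write
```

For example, `measure(["test message"] * 100000, open("/tmp/out.log", "w"))` gives the two totals for a plain file. If the first number is a large share of the sum, moving the lock to just before the write should pay off.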

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
