> this clarifies a lot, but not everything
> 
> if we can handle the following
> 
> *.* file1
> *.* file2
> 
> with two worker threads running, one working on file1 and one
> working on file2 (with each having per-thread variables for everything
> except the global data and the actual file descriptor being written to).
> 
> why couldn't we handle the case
> 
> *.* file1
> 
> with two threads working on different messages (if there was a lock
> around the actual write to file1)?
> 
> I'm not understanding the difference between the two cases. I understand
> that you have a lock to prevent this, but I don't understand what the lock
> is protecting.

Well, the simple answer is "because the plugin interface specifies it this
way". Remember that the interface was created around 2004 (mmmhhh, maybe a
bit later, but in any case quite some while ago...) and the problems back
then were very different from today's. So the main focus was on getting
things done, and guaranteeing single-threadedness within an action definitely
helped with that.

Looking at the omfile implementation as an example, there *are* lots of things
guarded by this lock. Just think about the dynafile cache, the ZIP writer or
the background write process. All of them use relatively simple algorithms
based on the assumption that the core guarantees exclusive use of the
instance data structures. Removing that guarantee is a non-trivial task.
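
To make the exclusivity guarantee concrete, here is a minimal sketch in
Python (not rsyslog code, which is C). The names FileAction, Core, submit and
run_demo are hypothetical and only illustrate the idea: the core holds one
lock per action instance for the whole call, so the module itself can use
plain, unlocked read-modify-write code on its instance data.

```python
import threading


class FileAction:
    """Hypothetical output-module instance with no internal locking."""

    def __init__(self):
        self.cache = {}    # stands in for instance state such as a dynafile cache
        self.written = []  # stands in for the data handed to the writer

    def do_action(self, msg):
        # Unlocked read-modify-write on instance data. This is only safe
        # because the caller guarantees that calls never overlap.
        count = self.cache.get(msg["file"], 0)
        self.cache[msg["file"]] = count + 1
        self.written.append(msg["text"])


class Core:
    """Hypothetical core: one lock per action instance, held for the whole call."""

    def __init__(self, action):
        self.action = action
        self.action_lock = threading.Lock()

    def submit(self, msg):
        with self.action_lock:
            self.action.do_action(msg)


def run_demo(n_threads=4, per_thread=1000):
    """Feed one action instance from several worker threads via the core."""
    action = FileAction()
    core = Core(action)

    def worker(tid):
        for i in range(per_thread):
            core.submit({"file": "file1", "text": f"{tid}:{i}"})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return action
```

If the lock in Core.submit were dropped, everything inside do_action would
have to be made safe for concurrent use by the module itself, which is the
non-trivial part.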

HOWEVER, as I wrote, I will head in the direction where an output module
can actually *request* the core to call it concurrently, even for a single
action instance. This capability is part of the interface "spec", but has so
far never been implemented. When I finally do this, I need to check each
plugin, and some of them will require large modifications to support it.

I will probably begin with omfile, where the first step, again, is to
partition processing based on the configuration selected. If you just use
plain files, without zip, without async writing, and without any other bells
and whistles that I do not have in mind right now, the algorithm can be
greatly simplified and a single lock around the write loop would be
sufficient. But for the dynafile case, I then need to create a lot of new
code to ensure the cache structure is not damaged by concurrent access, and I
also need totally different ways to convey the fd to be used back to the
actual file writer. All of this currently depends heavily on the exclusivity
guaranteed by the interface spec. I am not even sure yet that this finer lock
granularity will be faster -- it may introduce more overhead than it saves
(but I don't want to argue about this today, as it is too far away and I do
not yet have a clear understanding; being optimistic, it may even be possible
to do it lock-free).
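
For the plain-file case, the finer-grained scheme could look roughly like the
following Python sketch. Again this is an illustration under assumptions, not
the omfile implementation: PlainFileWriter and run_plain_demo are made-up
names, and an in-memory stream stands in for the file. Formatting happens per
thread without any lock, and only the actual write is serialized.

```python
import io
import threading


class PlainFileWriter:
    """Hypothetical plain-file action: no cache, no zip, no async writer."""

    def __init__(self, stream):
        self.stream = stream
        self.write_lock = threading.Lock()

    def do_action(self, msg):
        # Per-thread work (formatting) needs no lock: it touches only locals.
        line = f"{msg['host']} {msg['text']}\n"
        # The shared output is the only thing that needs protection.
        with self.write_lock:
            self.stream.write(line)


def run_plain_demo(n_threads=4, per_thread=500):
    """Let several threads write to the same output concurrently."""
    out = io.StringIO()
    writer = PlainFileWriter(out)

    def worker(tid):
        for i in range(per_thread):
            writer.do_action({"host": f"host{tid}", "text": f"msg{i}"})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue().splitlines()
```

Nothing comparable exists here for the dynafile case: a shared cache that
maps names to open files would need its own protection, and the sketch says
nothing about whether the finer locking is actually faster.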

Note that the structure of this work is very closely related to what I am
currently doing with regard to stage 1 action processing. So once I have
tackled that, I think I will have a quite good blueprint of the type of
modifications I need to make.

Please keep asking if there are still things that are not certain.

Rainer 
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
