On Mon, 31 May 2010, Rainer Gerhards wrote:

>
> You are looking in the same direction I am, and I think this is good news ;)
>
> The current engine supports functions coded in C, but not yet as real plugins
> nor in an easy to see way. It is done via a crude function interface library
> module, and only within the script engine. My original plan (over a year, or
> even two, ago) was to generalize these library plugins, so that it is easy to
> add new code and load them as plugins. Actually, making them available as
> plugins should not be too much work given the already existing
> infrastructure. There already exist a handful of "function modules", the
> control structure is just statically created during compile time, much as
> some of the output plugins are statically linked.
>
> Then the original plan was to enable templates to call scripts and enable
> scripts to define templates (kind of). Unfortunately, I got distracted by
> more important things before I could complete all of this.
>
> HOWEVER, at this time performance was not a major concern. With what has
> evolved in the mean time, I do not like the original approach that much any
> longer. At least the script engine must become much faster before I can take
> a real look at that capability. Right now, scripts generate an interim code
> that then is interpreted by a (kind of) virtual machine. A script invocation
> inside a template would mean that a VM must be instantiated, the script
> interpreted and the resulting string be used as template contents. Clearly,
> this is not for high-performance use. Still, however, it may be useful to
> have that capability for those cases, where performance is not the #1
> consideration. But given that everything would need to be implemented, it
> does make limited sense to look into something known to be too slow in the
> long run. BTW, this is one reason that I have not yet continued to work on
> the script engine, knowing that some larger redesign is due to fit it into
> the now much tighter runtime constraints.
>
> On the performance of the output system: I think the system in general is
> quite fast and efficient, with only ONE important exception: that is, if
> multiple replacements need to happen. Still, the algorithm is quite
> efficient, but it is generic and needs to run through a number of steps. Of
> course, it is definitely faster to permit a C plugin to look at the message
> and then format, in an "atomic" way, the resulting custom string. Thus, you
> need to write multiple pieces of C code instead of using a generic engine, but can do
> so in a much higher performance way. I would assume, however, that this
> approach cannot beat the simple templates we usually use by much (maybe by
> less than 5% and, of course, there may be cases where this matters).
>
> As you know, my current focus is speed, together with some functional
> enhancements. I was looking at queue operations improvements, but the
> potential output speed improvements may be more interesting than the queue
> mode improvements (and apply to more use cases). So it may make sense to look
> into these, first. My challenge here is to find something that is
>
> a) generic enough to be useful in various (usual) cases
> b) specific enough to be rather fast
>
> and it should also be implementable within a few weeks at most, because I
> can probably not spend much more time on a single feature/refactoring.
>
> One solution may be to create "template modules". I could envision a template
> module to be something that generates the template string *as a whole* from
> the input message.
>
> That is, we would have
>
> $template current-style,"%msg%\n"
>
> but also  (**)
>
> $modload tplcustom
> $template custom,tplcustom
>
> where tplcustom generates the template string.
>
> While this sounds promising, we have some issues. One immediately comes to
> mind: we will probably be able to use the same template for file writing or
> forwarding, but for file writing we need a LF at the end, while for
> forwarding we do not need it.

this sounds very promising. I question whether you really would use the same 
format for writing to a local file as you do when forwarding; the local 
file normally doesn't log the severity info (or at least not in the same 
format)

I believe you already have differing templates for standard vs forwarding 
in many cases.

rather than trying to do it in the config, is there a way to let the C 
module say "have that module do its thing, then I'll tweak the result" so 
that the code doesn't need to get duplicated between modules for the most 
common cases?
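
A minimal sketch of that "do its thing, then tweak the result" idea in C. Everything here is hypothetical and for illustration only: struct msg, the tpl_fn signature and the tpl_* function names are assumptions, not rsyslog's real message object or module interface. One base formatter produces the common part; the file variant appends the LF, the forwarding variant prepends the PRI.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified message; rsyslog's real msg object differs. */
struct msg {
    int pri;
    const char *timestamp;
    const char *hostname;
    const char *tag;
    const char *text;
};

/* Assumed formatter signature: writes a NUL-terminated string into buf and
 * returns its length, or -1 if it does not fit. */
typedef int (*tpl_fn)(const struct msg *m, char *buf, size_t buflen);

/* Common part shared by both variants. Returns snprintf's result, so callers
 * must check for truncation themselves. */
static int tpl_base(const struct msg *m, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "%s %s %s%s",
                    m->timestamp, m->hostname, m->tag, m->text);
}

/* File variant: let the base formatter run, then append the LF. */
static int tpl_file(const struct msg *m, char *buf, size_t buflen)
{
    int n = tpl_base(m, buf, buflen);
    if (n < 0 || (size_t)n + 1 >= buflen)
        return -1;                      /* no room for LF plus NUL */
    buf[n] = '\n';
    buf[n + 1] = '\0';
    return n + 1;
}

/* Forwarding variant: write the PRI, then let the base formatter fill in
 * the rest. No trailing LF. */
static int tpl_forward(const struct msg *m, char *buf, size_t buflen)
{
    int p = snprintf(buf, buflen, "<%d>", m->pri);
    if (p < 0 || (size_t)p >= buflen)
        return -1;
    int n = tpl_base(m, buf + p, buflen - (size_t)p);
    if (n < 0 || (size_t)n >= buflen - (size_t)p)
        return -1;
    return p + n;
}
```

With this shape, only tpl_base knows how the fields are laid out, and each variant adds a few lines on top of it.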

> So the most natural way would be to have the ability to embed a "custom
> template" into a regular template, like suggested by this syntax:
>
> $template both,"%=tplcustom%\n"
>
> however, this brings us onto the slippery slope of the original design. As
> a next step, one could ask to use not the msg object (with its fixed
> unmodified properties), but rather a transformation of the message object.
> So we would end up with something like this:
>
> $template cmplx,"%=tplcustom(syslogtag & msg)%"
>
> Which would require a much more complex logic working behind the scenes.
>
> Of course, depending on the format used, the engine could select different
> processing algorithms. Doing this on the fly seems possible, but requires
> more work than I can commit in one sequence.

this is definitely ugly

> Also, it would be useful to have the ability to persist already-generated
> properties with the message while it continues to be processed in the rule
> engine. So far, we do not have this ability, and the reason is processing
> time (plus, as usual, implementation effort): for that, we would need to
> maintain a list (or hash, ...) of name/value pairs, store them to disk for
> disk queues and shuffle them through the rule engine as processing is carried
> out. As I said, quite doable, but another big addition.

I expect that this isn't worthwhile for a couple of reasons.

1. with something like this you need to worry about multi-thread 
protection and locks, which will kill your performance

2. with modern CPUs you really want to only work in the cache if you can. 
Any access to additional memory will stall the CPU long enough that you 
could have done a significant amount of processing instead. It's to the 
point where the kernel developers have measured and said that if the CPU 
needs to copy the contents of a TCP packet, they can do the checksum 
calculation at the same time and the overhead of the memory I/O will make 
the time taken for the calculation a net zero additional cost (the 
CPU internally processes things in parallel)
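
To illustrate the copy-plus-checksum point, here is a small C sketch. It is not the kernel's code and the sum is not the real Internet checksum; copy_and_sum is a made-up name. It only shows the pattern of folding a cheap computation into a loop that has to touch every byte anyway, so the data is pulled through the cache once instead of twice.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from src to dst and return a plain 32-bit sum of the bytes.
 * The sum is computed on the byte that was just loaded for the copy, so no
 * second pass over memory is needed. */
static uint32_t copy_and_sum(uint8_t *dst, const uint8_t *src, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];   /* the copy we had to do anyway */
        sum += src[i];     /* extra work on data already in a register */
    }
    return sum;
}
```

A persisted name/value list would work against this pattern, since every lookup is a potential trip to memory that is not already in cache.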

> So I am somewhat stuck with things that sound interesting, but are a bit
> interdependent. Doing them all together is too big to be useful, and it will
> probably fail because I can probably not keep focus on all of them for the
> next, say, 9 to 12 months that it would require to complete everything.
>
> So I am again down to picking what is most useful. Out of this discussion, it
> looks like the idea I marked with (**), the plain C template generator could
> be a useful route to take. I am saying this under the assumption that it
> would be relatively easy to implement and cause at least some speedup in
> standard cases (contrary to what I expect, I have to admit...). But that
> approach is highly specialized, requiring a C module for each custom format.
> So does it really serve the rsyslog community well - or just some very
> isolated use cases?
>
> Thinking more about it, it would probably be useful if it is both
>
> a) relatively easy to implement   and
> b) causes some speedup in standard cases
>
> But b) cannot be proven without actually implementing the interface. So, in
> practice, the question boils down to what we *expect* about the usefulness
> of this utility.

well, rather than creating an entire interface, what about creating a 
patch to hard-code TraditionalFormat and TraditionalForwardFormat (or pick 
a couple) and we can benchmark the system with the hard-coded C formats vs 
the current process.
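
A rough, standalone C sketch of such a comparison, under loud assumptions: struct msg, the tpl_entry list and all function names are invented here and are much simpler than rsyslog's real template processor. It only contrasts "walk a parsed template entry by entry" with "one hard-coded function per format", both producing the same string, plus a small timing helper.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified message; not rsyslog's real msg object. */
struct msg {
    const char *timestamp, *hostname, *tag, *text;
};

enum prop { P_CONST, P_TIMESTAMP, P_HOSTNAME, P_TAG, P_MSG };

/* One step of a parsed template: constant text or a property reference. */
struct tpl_entry {
    enum prop kind;
    const char *text;                 /* only used for P_CONST */
};

/* Stand-in for a file template like "%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%\n". */
static const struct tpl_entry file_tpl[] = {
    { P_TIMESTAMP, NULL }, { P_CONST, " " }, { P_HOSTNAME, NULL },
    { P_CONST, " " }, { P_TAG, NULL }, { P_MSG, NULL }, { P_CONST, "\n" },
};

/* Append s at buf[pos], truncating so one byte stays free for the NUL. */
static size_t append(char *buf, size_t pos, size_t buflen, const char *s)
{
    size_t len = strlen(s);
    if (pos + len >= buflen)
        len = buflen - pos - 1;
    memcpy(buf + pos, s, len);
    return pos + len;
}

/* Generic path: interpret the entry list for every message. */
static size_t format_generic(const struct msg *m, char *buf, size_t buflen)
{
    size_t pos = 0;
    for (size_t i = 0; i < sizeof file_tpl / sizeof file_tpl[0]; i++) {
        const char *s = "";
        switch (file_tpl[i].kind) {
        case P_CONST:     s = file_tpl[i].text; break;
        case P_TIMESTAMP: s = m->timestamp;     break;
        case P_HOSTNAME:  s = m->hostname;      break;
        case P_TAG:       s = m->tag;           break;
        case P_MSG:       s = m->text;          break;
        }
        pos = append(buf, pos, buflen, s);
    }
    buf[pos] = '\0';
    return pos;
}

/* Hard-coded path: same output, no per-entry dispatch. */
static size_t format_hardcoded(const struct msg *m, char *buf, size_t buflen)
{
    size_t pos = 0;
    pos = append(buf, pos, buflen, m->timestamp);
    pos = append(buf, pos, buflen, " ");
    pos = append(buf, pos, buflen, m->hostname);
    pos = append(buf, pos, buflen, " ");
    pos = append(buf, pos, buflen, m->tag);
    pos = append(buf, pos, buflen, m->text);
    pos = append(buf, pos, buflen, "\n");
    buf[pos] = '\0';
    return pos;
}

typedef size_t (*fmt_fn)(const struct msg *, char *, size_t);

/* CPU seconds spent formatting the same message `iters` times. */
static double bench(fmt_fn fn, const struct msg *m, long iters)
{
    char buf[512];
    volatile size_t sink = 0;         /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink += fn(m, buf, sizeof buf);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling bench(format_generic, &m, N) and bench(format_hardcoded, &m, N) with the same message gives a first impression only; a meaningful number would have to come from patching the hard-coded formats into rsyslog itself and measuring the full output path.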

David Lang

> Having said that, I'd appreciate feedback, both on the concrete question of
> the usefulness of this feature as well as any and all comments on the
> situation at large. I am trying to put my development resources, which
> thankfully have been somewhat increased nowadays :) to the area where they
> provide greatest benefit.
>
> Rainer
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
>