Hi Michael,

As promised, here is a more complete response. See inline below...

> > It took me a while to craft a response to your excellent question.
> > I have done this as a blog post so that it is easier to reference
> > it in the future.
> >
> > I suggest that everyone interested in the v3 design has a look at
> > it, because it describes the way I am heading. If someone doesn't
> > like that, it is now time to speak up - in a few weeks it will
> > probably be impossible to change routes.
> 
> Based on your recent blog post, here are some thoughts of mine. Please
> keep in mind that I am a (Debian) package maintainer, so I speak from
> a distributor's point of view.
> 
> 1.) external dependencies
> Having loadable modules for stuff like MySQL/PostgreSQL output support
> is great (for a package maintainer)
> This means I can package a basic rsyslog package with minimal
> dependencies (glibc) and people can choose to install
> rsyslog-(mysql,pgsql) based on their needs.
> Here in this case, loadable modules have a real benefit.
> The only remaining external dependencies currently in rsyslog are libz
> (NETZIP support) and gssapi-krb (KERBEROS support). If that
> functionality were put into a loadable module, I'd support that.
> 2.) memory usage
> I don't think this is a real issue, even for embedded systems. See
> point 1. If we manage to put external dependencies into loadable
> modules, that would be sufficient imho.
> Modularising everything won't have that much of a memory benefit
> imho. Given that a standard rsyslogd is about 1M RSS, this doesn't
> imply the need for more modularization.

I tend to agree with you here. However, I think that a clear line needs
to be drawn between "modularization" and "loadable modules". I probably
crossed this line myself a couple of times without even noticing.

The modularization effort *is* needed, because the current monolithic
code base is hard to maintain and has "evolved" rather than been
designed. While this is not bad in itself, you notice in many instances
that the underlying design is no longer appropriate for a number of
things. In order to add more really useful features, the code must be
modularized. This will reduce complexity and hopefully make it less
prone to programming errors (which, of course, still happen). [That
doesn't mean I am about to rewrite everything, more on that below.]

So the bottom line is we need modularization for new features. It also
helps greatly with portability, as I outlined in my other mail.

However, we do NOT necessarily need loadable modules - but so far, I
find them desirable. Either way, it seems to be very important to
differentiate between these two concepts.

> 3.) Security
> You mentioned, that you try to improve security through modules.
> Usually, having loadable module support is considered a security risk.

Sure - but the risk lies in the ability to load modules at all. It
does not scale with the number of modules. The problem is that the
interface can be attacked. It's only safe if you do not offer that
functionality at all. At least this is my point of view.

> One messed up $IncludeConfig directive (or manipulated through a
> malicious attacker), and you load potentially hazardous modules.
> Loadable modules support introduces a much bigger attack vector.

Agreed - but the risk does not grow by providing more than a few
modules, as outlined above. The module interface tries to mitigate some
of the risk. For example, output modules never see the internal
structure but only the resulting strings. Of course, so far everything
is in-process, so a malicious module may access whatever it wants. I
have plans, however, to optionally execute modules out of process. That
would be a good option for not well-known third-party modules.
Obviously at the expense of quite a performance hit...
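
To illustrate the "only the resulting strings" point, here is a minimal
C sketch. It is not the actual rsyslog module interface: struct msg,
om_do_action_t, core_call_output() and om_record() are all hypothetical
names, and the fixed "<severity>host text" format merely stands in for a
real template.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical internal message object; only the core ever sees it. */
struct msg {
    int  severity;
    char hostname[64];
    char text[256];
};

/* Hypothetical output-module entry point: the module receives only the
 * finished, NUL-terminated string, never a pointer to struct msg. */
typedef int (*om_do_action_t)(const char *rendered);

/* Core side: render the message with a fixed, illustrative format and
 * hand the resulting string to the module. Returns -1 if formatting
 * fails or the string would be truncated. */
static int core_call_output(const struct msg *m, om_do_action_t do_action)
{
    char rendered[512];
    int n;

    assert(m != NULL && do_action != NULL);
    n = snprintf(rendered, sizeof rendered, "<%d>%s %s",
                 m->severity, m->hostname, m->text);
    if (n < 0 || (size_t)n >= sizeof rendered)
        return -1;
    return do_action(rendered);
}

/* Example module: it simply remembers the last string it was given. */
static char om_last_seen[512];

static int om_record(const char *rendered)
{
    strncpy(om_last_seen, rendered, sizeof om_last_seen - 1);
    om_last_seen[sizeof om_last_seen - 1] = '\0';
    return 0;
}
```

Note that this narrows the interface only. As long as the module runs
in-process, it can still reach any memory it likes, which is why the
out-of-process option matters.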

> 
> I'm not a SELinux guy. But I'd be interested if loadable modules could
> cause trouble in putting rsyslog in its own security domain. Maybe
> the fedora guys can comment on this.

I am not qualified to talk about this, but I think along the lines that
Peter posted...

> 
> 4.) (code) modularization
> Writing modular code doesn't strictly imply that the modules have to
> be loadable *.so files.
> You can still write modular code, with a strict API etc. and organize
> it e.g. via sub directories.

Fully agree.

> 5.) Performance penalty through loadable modules.
> I could be wrong on this point, but given that you have pointers on
> functions, when you use modules (dlsym), there is an additional
> indirection on each function call. This could have a performance
> impact for core functionality. This is just speculation and should be
> tested/evaluated beforehand. After all, rsyslog tries to meet
> high-performance needs, too.

No, that doesn't happen (by design). There is a very slight overhead
during module load, when interfaces are queried and symbols are fixed
up. Some indirection, of course, results from a generic approach, but
that is inevitable if the design is modular. With today's hardware, an
indirect function call is not much of a problem. Again, that has
nothing to do with loadable modules but with non-monolithic code.
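
The load-time versus per-call cost can be sketched in C as follows.
This is an illustration under assumptions, not rsyslog code:
struct module_if, module_load(), demo_query_entry() and the "doAction"
entry name are hypothetical, and demo_query_entry() merely stands in
for a dlsym()-style lookup so that the example needs no shared library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*action_fn)(const char *msg);

/* Interface table that the core fills in once per module. */
struct module_if {
    action_fn do_action;
};

/* --- hypothetical module side --- */
static int demo_calls;                  /* counts invocations */

static int demo_do_action(const char *msg)
{
    (void)msg;
    return ++demo_calls;
}

/* Stand-in for a dlsym()-style lookup: maps an entry-point name to
 * its address and returns NULL for unknown names. */
static action_fn demo_query_entry(const char *name)
{
    if (strcmp(name, "doAction") == 0)
        return demo_do_action;
    return NULL;
}

/* --- core side --- */
static int lookups;                     /* counts name lookups */

/* Runs once at module load: the only place where names are resolved. */
static int module_load(struct module_if *mif,
                       action_fn (*query)(const char *))
{
    assert(mif != NULL && query != NULL);
    lookups++;
    mif->do_action = query("doAction");
    return mif->do_action != NULL ? 0 : -1;
}

/* Hot path: one indirect call per message, no lookup. */
static int process_messages(const struct module_if *mif, int count)
{
    int last = 0;
    for (int i = 0; i < count; i++)
        last = mif->do_action("test message");
    return last;
}
```

The same pointer table would be used whether the module came from a
shared object or was linked in statically, so the indirect call is a
cost of the modular design, not of dynamic loading.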

> 6.) Inconvenience
> This is just a gut feeling, but having to load all sorts of modules
> first, before rsyslog does anything, could prove cumbersome. As
> administrator you'd have to know, which modules to load, to get
> rsyslog to do what you want. This means additional effort (reading
> docs) and inconvenience.

That's why I said the "core modules" could be present by default.
But more on that below...

> 7.) Robustness
> Having a single binary can prove to be a lifesaver. E.g. you could
> simply copy the rsyslogd binary from another machine if there is
> something wrong with your local copy.

Agreed, but copying a directory with a number of executables shouldn't
be that much harder ;)

> As you were talking about embedded systems, having the ability to
> compile a static binary including all functionality is a definite
> plus. There might be platforms out there which don't support dlopen().

That's an excellent point!

> 
> I'm still missing the complete picture, too. It's still a bit too
> vague for me.

That's probably a problem with what I can convey. As I wrote above, I am
not doing a rewrite, with a new design and a new code base. My goal is
to *migrate* rsyslog (which was originally based on sysklogd code) over
to a new, better architecture. I try to make as few compromises as
possible, but every now and then there are some things that can't be
ignored. My backward-compatibility questions on this list are part of
the process. So while I have an idea of where I am heading and what I
want to achieve, I do not have a crystal clear view of how that will
actually happen. I am going iteratively, tackling one feature after
another and feeding experience gained back into the process. I am
currently at the input modules and there are definitely some lessons to
learn ;)

> 
> You were talking about input, output and filter modules. Rainer, could
> you try to complete this list somehow, maybe draw/sketch a flow
> diagram, marking the modules.
> 
> input: local, kernel and network, ...
> filter: regexp, ...
> output: mysql, pgsql, file, forward...
> authorization?

Let me give it a try:

Input: immark, imrfc3195, imklog, imfile, imudp, imtcp, imgssapi?,
imtls, imuxsock, ...

Filter: many (many!)... regexp, substr, tolower, toupper, dnslookup?,
left, right, mysqldate, pgsqldate, whateverdate, md5sum, ...

Output: ommysql, ompgsql, omusrmsg, omfwd (split into omudp, omtcp,
omgssapi, ...), ...

Authorization will be provided by the inputs... But there will be some
helpers. How this will happen is not yet clear to me - that's part of
the iterative process. Initially it will be with the modules and then
possibly be migrated to somewhere else.

I'll try to get you a drawing, but that's always a pain for me ;)

> 
> I hope, this all doesn't sound too negative. 

No, much appreciated. It is extremely useful to get that kind of
feedback.

> But before going all
> modular, all these issues should be considered imho.
> 
> Hopefully these comments will help.

After going through all of this, I begin to think that libtool probably
has the answer. I don't know exactly how, but I think it can take
loadable modules and I can tell it to make them into static modules. If
that's the case, I can create loadable modules, but the build process
will change that into static linking. If I am wrong about that, I can
mangle the main entry point names in the modules a bit and achieve the
same result with just changes to the build process (I don't know yet
how to modify autotools to do that, but I hope there will be some
helping hands ;)).
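
The name-mangling fallback could look roughly like the C sketch below.
It is an assumption-laden illustration, not libtool's mechanism and not
existing rsyslog code: the MOD_ENTRY macro, the modInit_ prefix, the
static_mods table and find_static_module() are all hypothetical, and
the two module bodies are empty placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*mod_init_fn)(void);

/* In a loadable build, every module could export the same entry point
 * name and be found via dlsym(). In a static build the names must be
 * unique, so a macro mangles them per module. */
#define MOD_ENTRY(name) modInit_##name

static int MOD_ENTRY(imudp)(void)   { return 1; }  /* placeholder module */
static int MOD_ENTRY(ommysql)(void) { return 2; }  /* placeholder module */

/* Table that the build process would generate for a static binary. */
struct static_mod {
    const char *name;
    mod_init_fn init;
};

static const struct static_mod static_mods[] = {
    { "imudp",   MOD_ENTRY(imudp)   },
    { "ommysql", MOD_ENTRY(ommysql) },
};

/* Static replacement for dlopen() plus dlsym(): look the module up by
 * name in the compiled-in table. Returns NULL if it is not linked in. */
static mod_init_fn find_static_module(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof static_mods / sizeof static_mods[0]; i++)
        if (strcmp(static_mods[i].name, name) == 0)
            return static_mods[i].init;
    return NULL;
}
```

With something like this, the module loader would only need one switch:
ask the table first (or only, on platforms without dlopen()) and fall
back to the dynamic loader otherwise.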

How does this sound?

Rainer
PS: please keep the comments coming, especially if you don't agree! 
