Hi Michael, as promised, a more complete response. See inline below...
> > It took me a while to craft a response to your excellent question. I have
> > done this as a blog post so that it is easier to reference it in the
> > future.
> >
> > I suggest that everyone interested in the v3 design has a look at it,
> > because it describes the way I am heading. If someone doesn't like that,
> > it is now time to speak up - in a few weeks it will probably be
> > impossible to change routes.
>
> Based on your recent blog post here are some thoughts of mine. Please
> keep in mind that I am a (Debian) package maintainer, so I speak from
> a distributor's pov.
>
> 1.) external dependencies
> Having loadable modules for stuff like MySQL/PostgreSQL output support
> is great (for a package maintainer).
> This means I can package a basic rsyslog package with minimal
> dependencies (glibc) and people can choose to install
> rsyslog-(mysql,pgsql) based on their needs.
> Here in this case, loadable modules have a real benefit.
> The only remaining external dependencies currently in rsyslog are libz
> (NETZIP support) and gssapi-krb (KERBEROS support). If that
> functionality were put into a loadable module, I'd support that.
>
> 2.) memory usage
> I don't think this is a real issue, even for embedded systems. See
> point 1. If we manage to put external dependencies into loadable
> modules, that would be sufficient imho.
> Modularising everything won't have that much of a memory benefit
> imho. Given that a standard rsyslogd is about 1M RSS, this doesn't
> imply the need for more modularization.

I tend to agree here with you. However, I think that a clear line needs
to be drawn between "modularization" and "loadable modules". Probably I
did cross this line myself a couple of times without even noticing. The
modularization effort *is* needed, because the current monolithic code
base is hard to maintain and has "evolved" - in contrast to being designed.
While this is not bad, you notice in many instances that the underlying
design is no longer appropriate to do a number of things. In order to add
more really useful features, the code must be modularized. This will
reduce complexity and hopefully make it less subject to programming
errors (which, of course, do still happen). [That doesn't mean I am about
to rewrite everything, more on that below.] So the bottom line is we need
modularization for new features. It also helps greatly with portability,
as I outlined in my other mail. However, we do NOT necessarily need
loadable modules - but so far, I find them desirable. Either way, it
seems to be very important to differentiate between these two concepts.

> 3.) Security
> You mentioned, that you try to improve security through modules.
> Usually, having loadable module support is considered a security risk.

Sure - but it is the ability to have loadable modules that is the risk.
This risk does not scale with the number of modules. The problem is that
the interface can be attacked. It's only safe if you do not offer that
functionality at all. At least this is my point of view.

> One messed up $IncludeConfig directive (or manipulated through a
> malicious attacker), and you load potentially hazardous modules.
> Loadable modules support introduces a much bigger attack vector.

Agreed - but not by providing more than a few, as outlined above. The
module interface tries to mitigate some of the risk. For example, output
modules never see the internal structure but only the resulting strings.
Of course, so far everything is in-process, so a malicious module may
access whatever it wants. I have plans, however, to optionally execute
modules out of process. That would be a good option for third-party
modules that are not well known. Obviously at the expense of quite a
performance hit...

>
> I'm not a SELinux guy. But I'd be interested if loadable modules could
> cause trouble in putting rsyslog in its own security domain.
> Maybe the Fedora guys can comment on this.

I am not qualified to talk about this, but I think along the lines that
Peter posted...

>
> 4.) (code) modularization
> Writing modular code doesn't strictly imply that the modules have to
> be loadable *.so files.
> You can still write modular code, with a strict API etc. and organize
> it e.g. via sub directories.

Fully agree.

> 5.) Performance penalty through loadable modules.
> I could be wrong on this point, but given that you have pointers on
> functions, when you use modules (dlsym), there is an additional
> indirection on each function call. This could have a performance
> impact for core functionality. This is just speculation and should be
> tested/evaluated beforehand. After all, rsyslog tries to meet
> high-performance needs, too.

No, that doesn't happen (by design). There is a very slight overhead
during module load, when interfaces are queried and symbols are fixed up.
Some indirection, of course, results from a generic approach, but that is
inevitable if the design is modular. With today's hardware, an indirect
function call is not much of a problem. Again, that has nothing to do
with loadable modules but with non-monolithic code.

> 6.) Inconvenience
> This is just a gut feeling, but having to load all sorts of modules
> first, before rsyslog does anything, could prove cumbersome. As
> administrator you'd have to know which modules to load to get
> rsyslog to do what you want. This means additional effort (reading
> docs) and inconvenience.

That's why I said the "core modules" could be present by default. But
more on that below...

> 7.) Robustness
> Having a single binary can prove to be a lifesaver. E.g. you could
> simply copy the rsyslogd binary from another machine if there is
> something wrong with your local copy.

Agreed, but copying a directory with a number of executables shouldn't be
that much harder ;)

> As you were talking about embedded systems, having the ability to
> compile a static binary including all functionality is a definite
> plus. There might be platforms out there which don't support dlopen().

That's an excellent point!

>
> I'm still missing the complete picture, too. It's still a bit
> too vague for me.

That's probably a problem with what I can convey. As I wrote above, I am
not doing a rewrite, with a new design and a new code base. My goal is to
*migrate* rsyslog (which was originally based on sysklogd code) over to a
new, better architecture. I try to make as few compromises as possible,
but every now and then there are some things that can't be ignored. My
backward-compatibility questions on this list are part of the process. So
while I have an idea of where I am heading and what I want to achieve, I
do not have a crystal clear view of how that will actually happen. I am
going iteratively, tackling one feature after another and feeding
experience gained back into the process. I am currently at the input
modules and there are definitely some lessons to learn ;)

>
> You were talking about input, output and filter modules. Rainer, could
> you try to complete this list somehow, maybe draw/sketch a flow
> diagram, marking the modules.
>
> input: local, kernel and network, ...
> filter: regexp, ...
> output: mysql, pgsql, file, forward...
> authorization?

Let me give it a try:

Input: immark, imrfc3195, imklog, imfile, imudp, imtcp, imgssapi?, imtls,
imuxsock, ...
Filter: many (many!)... regexp, substr, tolower, toupper, dnslookup?,
left, right, mysqldate, pgsqldate, whateverdate, md5sum, ...
Output: ommysql, ompgsql, omusrmsg, omfwd (split into omudp, omtcp,
omgssapi, ...), ...

Authorization will be provided by the inputs... But there will be some
helpers. How this will happen is not yet clear to me - that's part of the
iterative process.
Initially it will be with the modules and then possibly be migrated to
somewhere else. I'll try to get you a drawing, but that's always a pain
for me ;)

>
> I hope, this all doesn't sound too negative.

No, much appreciated. It is extremely useful to get that kind of feedback.

> But before going all
> modular, all these issues should be considered imho.
>
> Hopefully these comments will help.

After going through all of this, I begin to think that libtool probably
has the answer. I don't know exactly how, but I think it can take
loadable modules and I can tell it to turn them into static modules. If
that's the case, I can create loadable modules, but the build process
will change that into static linking. If I am wrong about that, I can
fiddle a bit with the main entry point names in the modules and achieve
the same result with just changes to the build process (I don't know yet
how to modify autotools to do that, but I hope there will be some helping
hands ;)).

How does this sound?

Rainer

PS: please keep the comments coming, especially if you don't agree!

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
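
The entry-point renaming idea from the closing paragraph of the message can
be sketched roughly as follows. This is only an illustration under stated
assumptions: `module_if_t`, `MODULE_ENTRY`, the `_mod_init` suffix,
`load_builtin_modules` and `dispatch` are hypothetical names made up for
this sketch, not rsyslog's actual module interface. It shows a static build
in which every module gets a unique entry point, the core queries each
interface once at startup, and output modules receive only the finished
string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical module interface: the core sees only this struct. */
typedef struct {
    const char *name;                   /* module name, e.g. "omfile" */
    int (*do_action)(const char *msg);  /* gets only the finished string */
} module_if_t;

/* In a static build every module needs a unique entry point name, so it
 * is derived from the module name here. In a loadable build all modules
 * could instead export one fixed name that is resolved with dlsym(). */
#define MODULE_ENTRY(modname) modname##_mod_init

/* --- hypothetical output module "omfile" --- */
static int omfile_action(const char *msg)
{
    return printf("omfile: %s\n", msg) < 0 ? -1 : 0;
}

int MODULE_ENTRY(omfile)(module_if_t *mod)
{
    assert(mod != NULL);
    mod->name = "omfile";
    mod->do_action = omfile_action;
    return 0;
}

/* --- hypothetical output module "omfwd" --- */
static int omfwd_action(const char *msg)
{
    return printf("omfwd: %s\n", msg) < 0 ? -1 : 0;
}

int MODULE_ENTRY(omfwd)(module_if_t *mod)
{
    assert(mod != NULL);
    mod->name = "omfwd";
    mod->do_action = omfwd_action;
    return 0;
}

/* Entry points compiled into the binary; a build system would generate
 * this table from the list of modules selected at configure time. */
typedef int (*mod_init_fn)(module_if_t *mod);

static const mod_init_fn builtin_modules[] = {
    MODULE_ENTRY(omfile),
    MODULE_ENTRY(omfwd),
};

/* Query every built-in module once; returns how many were set up. */
int load_builtin_modules(module_if_t *tab, int max)
{
    int count = (int)(sizeof(builtin_modules) / sizeof(builtin_modules[0]));
    int n = 0;

    for (int i = 0; i < count && n < max; i++) {
        if (builtin_modules[i](&tab[n]) == 0)
            n++;
    }
    return n;
}

/* Hand a message to the named module; returns -1 if no such module. */
int dispatch(const module_if_t *tab, int n, const char *name, const char *msg)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].do_action(msg);  /* one indirect call per message */
    }
    return -1;
}
```

A caller would fill a small `module_if_t` array with `load_builtin_modules()`
at startup and then route each formatted message through `dispatch()`. The
only per-message cost of the modular layout is the single indirect call,
which is the same whether the modules were linked statically or loaded at
run time.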

