[Warning for the faint of heart; some plugging of bcfg2 occurs in this mail.]
>>>>> "Luke" == Luke Crawford <[EMAIL PROTECTED]> writes:

Luke> First off, I'm a "practical" type- a "computer janitor" as it
Luke> were. I am also a consultant and an Entrepreneur, so I see my
Luke> job as eliminating my job- I'm quite interested in
Luke> configuration management systems, but on a conceptual level, I
Luke> simply don't understand how they would usefully work.

Do you mean tools at all, or more researchy systems like autonomics
and the like?

Luke> Now, I see you are speaking of validation tools; This is
Luke> something I understand quite well, and have implemented
Luke> (perhaps in a more 'bottom up' than 'top down' fashion than
Luke> the theory types would have) using Nagios; The idea being that
Luke> those who consume my services should not need to know my phone
Luke> number, so my policy is to run an external nagios server to
Luke> monitor every service I provide to other people. As I am
Luke> checking from an external perspective (sending mail through a
Luke> mail system, or retrieving a html page and comparing
Luke> known-good bits) the nagios-with-plugin system can catch just
Luke> about any configuration error the customer can.

Luke> Now, the problem with this is that it catches the error after,
Luke> not before the error hits production; as I'm doing quite a lot
Luke> of work with Xen-based paravirtualized servers, I'm thinking
Luke> about simply running a full test environment with a duplicate
Luke> of every real server within my virtual environment; then use
Luke> some tool (perhaps systemimager? maybe systemimager with
Luke> service-specific scripts to gracefully reload modified
Luke> servers?) to copy configs from test to production after the
Luke> test passes validation.

We've done a bit of work using staging to catch errors before
installation.
We basically could tag new configuration bits as being in testing,
and then the testing systems would get them before the rest of the
systems, so that errors could be freely encountered and repaired.
Basically what you are describing here is a set of software
engineering/testing methodologies. Our paper at LISA this year
describes a set of slick ways to integrate timeline and versioning
data into configuration management specification, and the things you
can do with this info once you have it. (All implemented with bcfg2,
of course.) I would suggest taking a look at it once it comes out.

We have done a lot of this sort of server replication, now that our
specification is complete. We have found that using the configuration
management system to rebuild a system (upon system disk failure or
the like) is frequently faster and easier than going to backups.
Producing multiple instances of the same service in less tense
situations is a breeze. I would strongly suggest you look at bcfg2
for this sort of thing. (We've actually used bcfg2 on top of
systemimager; SI is used for basic system builds, but all
differentiation and updates are done through bcfg2.)

<snip>

Luke> Me, I'm on the list because I'm interested in configuration
Luke> management; but frankly, my brain it too small to comprehend
Luke> how you might go about replacing my configuration management
Luke> duties with a program. I like the idea, I just don't know how
Luke> you would do any better than a systemimager style "base image
Luke> for each class of machine, then per-box lists of diffs to
Luke> apply"

This is a good illustration of the communication gaps between
researchers and practitioners. Bcfg2 does a great job of managing a
symbolic configuration for a system in a similar fashion, while
allowing much more fine-grained control over things. Ping me off-list
for more details, if you are interested.

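As a rough illustration of the two ideas discussed above (a per-class
base with per-box differences, plus entries tagged as testing that
reach the test systems first), here is a minimal Python sketch. It is
not bcfg2's specification format or API, and it is not how
systemimager works; Entry, resolve, and the sample hosts, paths, and
values are all hypothetical names made up for the example.

```python
# Hypothetical sketch only: none of these names come from bcfg2 or systemimager.
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    key: str               # e.g. the path of a managed config file
    value: str             # desired contents or version label
    testing: bool = False  # True while the entry is still being staged


def resolve(host_class, host, is_test_host, class_entries, host_entries):
    """Return the effective {key: value} configuration for one host.

    Class-level entries form the base; per-host entries override them.
    Entries tagged as testing are applied only to test hosts.
    """
    config = {}
    for entry in class_entries.get(host_class, []) + host_entries.get(host, []):
        if entry.testing and not is_test_host:
            continue  # production hosts skip entries still in testing
        config[entry.key] = entry.value
    return config


class_entries = {
    "webserver": [
        Entry("/etc/httpd.conf", "v1"),
        Entry("/etc/httpd.conf", "v2", testing=True),  # staged change
    ],
}
host_entries = {
    "web1": [Entry("/etc/motd", "web1 banner")],
}

prod = resolve("webserver", "web1", False, class_entries, host_entries)
test = resolve("webserver", "web1-test", True, class_entries, host_entries)
print(prod)
print(test)
```

In this sketch, promoting the staged change to production would just
mean dropping the testing tag once the test host has run cleanly with
it; a real tool would also need to handle ordering, services, and
verification, which are left out here.
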
Luke> The basic problem that *I* would like the theory people to
Luke> solve is how to break down the "configure the system"
Luke> high-level problem into a easy to understand set of tools like
Luke> nagios that people like me can come in and configure on a low
Luke> level for each one of our services. Heck, you don't even need
Luke> to write the actual tools, just describe what the tools need
Luke> to do (of course, that's what writing the tools would do,
Luke> right? computer languages are designed to precisely specify
Luke> what a program ought to do.)

I think this also illustrates an interesting point. The theory folks
aren't all that interested in problems at that level. They are a
little past it. Keep in mind that most are working on fields like
constraint solvers, autonomics, etc., and need to publish papers to
maintain good academic standing. Until the tools provide a conduit
between researchers and our ready users, we will have this
disconnect.

 -nld
_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss