While I traditionally dislike roadmaps, mostly due to their inevitable inaccuracy, I think that it's useful at this point, especially given the recent period of inactivity, to define one for the next major Upstart milestone: 0.5.
The main goal of this milestone is to define the structure and behaviour of Upstart for its eventual 1.0 release. It should be largely feature-complete in terms of basic behaviour, allowing further development to concentrate on extensions and improvements.

Quite by accident, the roadmap has been divided into different parts of the code base; I realised when doing so that this was probably the right order in which to make the changes as well. Unsurprisingly, it starts off quite detailed and becomes hand-wavy towards the end.

libnih
------

No, Upstart won't be switching over to glib anytime soon.

The main change for libnih is to increase the use of destructors, to avoid the problem of determining which *_free() function needs to be called for a particular structure. Each structure will have a *_new() function that allocates it, calls an *_init() function to populate it and sets the destructor to an *_destroy() function. Separate *_free() functions like nih_list_free() will be removed; all calls will just be to nih_free().

The use of an *_init() function will allow structures to be embedded in others rather than using pointers for everything (saving on memory overhead); the containing structure will just call *_init() in its own *_init() function and *_destroy() in its own *_destroy() function.

Upstart, the Service Manager
----------------------------

The most important piece of Upstart is being as good a service manager as it can be, so this will be a particular focus for 0.5. Service management in this case is the configuration and management of the individually defined services, including their life cycle and processes; the service manager isn't concerned with how services are started and stopped, but just with what to do once they are.

Bzr trunk already has a fair amount of changes to the configuration code, supporting the ability to reload the configuration and identify which jobs have come or gone.
This will be further developed so that for each job name there is a list of available configurations with that name, with the appropriate one being selected. This allows conflicting configuration to exist and be handled sanely, which is especially important since I'd like to support job creation by external processes. They'll be able to register themselves as a configuration source and define jobs under it. Upstart will intelligently handle two jobs attempting to own the same name, selecting one until it is deleted.

Continuing in this vein, I'm planning to separate the definition of a job, which is static, from its state data. Jobs which permit multiple instances will simply have multiple state structures, rather than each instance existing as a new copy of the job. The intent is that this will allow jobs to have arbitrary instance limits (e.g. one for each tty) rather than just one or infinite. It also rationalises the code somewhat.

Upstart currently doesn't pay much attention to a process once it's been forked, other than to wait() for it when it dies. Certainly no effort is made to check that the exec() call works, let alone any of the earlier environment setup. This is mostly fine, since the problem is logged, but for various reasons we'd like to pay a little more attention. This will be done through a close-on-exec pipe to the child process; on error, information will be written to it. Thus all the parent has to do is poll for reading, and if it receives data, it knows there was a problem. This allows Upstart to better handle failures, for example taking action if /bin/sh isn't present.

An important missing ability is to be able to disable a job from its definition, without having to resort to deleting the file. This will be added; such jobs will still be visible, but will report an error if an attempt is made to start them.
Transient disabling will also be permitted through "dependencies"; these are lists of paths on the disk (such as the one being exec'd) that must exist. If they do not, the job reports an error on start.

Jobs will also support "resources" as a method of throttling; a resource is a string name and a floating-point number. Jobs define how much of a particular resource they use while running, and can only run while the resource is greater than or equal to that number. This will typically be used for locking, or for utilisation problems. If a job is started but has insufficient resources, it will stay in the start/waiting state until the resource goes above the necessary level.

Since service management is largely concerned with UNIX processes, the environment they run in remains important. As well as letting the job definition set environment variables and their values, this will be extended to allow the definition to specify variables to be taken from init's own environment (typically PATH, TERM, etc.), so that these cease to be hard-coded. In addition, it does not seem unreasonable for environment to be specified when starting a job.

Continuing this thought, it becomes logical that the environment variables for an instance are what makes that instance different from others of the same job definition. This may end up being the method by which we define the uniqueness of instances; for example, "instance TTY" might mean that an instance is only spawned if the $TTY variable is different from any others running. This isn't fully decided yet, but it does seem to me that inventing some other mechanism for doing this is folly, since the method of passing those values would just be environment variables anyway!

Assuming we develop in this manner, it becomes very reasonable to expand the definition of environment variables in other configuration stanzas than just exec.
It's useful to pass more than just environment variables to a job when starting it; it's also useful to be able to pass file descriptors. Some safe and secure mechanism will be found whereby a job started from the command-line can be told what its standard input/output/error should be (normally the terminal from which it was called).

Finally, the long-standing missing feature of being able to supervise forking daemons will be finished; a few alternate methods of doing this will be available, most likely finding the replacement process at SIGCHLD-time (though we might need to do this twice!) and watching for pid files.

Upstart, the System V init Emulator
-----------------------------------

Ironically, one of Upstart's stand-out features among init replacements is its ability to reasonably emulate sysvinit. A few improvements in this area are planned for 0.5.

The main one is going to be increased correct use of utmp and wtmp; telinit will handle setting the runlevel itself, and include the RUNLEVEL and PREVLEVEL variables in the event, rather than relying on the runlevel being stored in utmp to set them. The shutdown command will write a proper shutdown record, and there will be something to write a startup/reboot record on boot. More usefully, the compatibility reboot tool will check the runlevel, and use that to determine whether or not to call shutdown as the original sysvinit does (rather than relying on -f).

Finally, init will maintain INIT_PROCESS and DEAD_PROCESS entries in utmp for jobs that require it, through a "utmp id" stanza. The general user of this will be the getty job, where such utmp entries are necessary for correct behaviour.

Upstart, the IPC Server
-----------------------

One minor, trivial change that almost doesn't seem worth mentioning: Upstart's own home-brew IPC will be dropped, and instead it will depend on D-BUS.
IPC is very difficult to get right, and is the biggest focus for security attacks on a daemon, so it's the most important part *to* get right! Maintaining Upstart's own IPC code has been a huge burden; a third of Upstart's code is concerned with it. Making even trivial changes, such as adding a "re-exec" command (a long-standing bug), requires careful changes and testing, with extensive consideration of backwards-compatibility issues. Switching to an off-the-shelf IPC protocol just makes sense; D-BUS is the currently fashionable one, and its object model fits Upstart quite well.

From an external point of view, this will make very little difference, since initctl will still appear to behave the same way; the only change is that libupstart goes away, which nobody else is using anyway. The likely method for communication is that the current UNIX abstract-namespace socket will remain, and that the protocol spoken over it will be peer-to-peer D-BUS.

It will then also be possible to mark a job as the "message bus" in some way; once it is running, Upstart will connect to it as other processes do (while also keeping its own socket open). Almost all contributed software will talk to it through the bus, rather than the peer-to-peer socket.

Connecting to the message bus allows us to find out when certain D-BUS names are claimed, and thus this too can become a method by which daemons are held out of the running state. Consider a job marked with "dbus org.freedesktop.Hal": Upstart wouldn't mark it as running just because it has been forked, but would wait until the D-BUS name was claimed. Thus services no longer need to use D-BUS tricks to be singletons; they can rely on Upstart handling that for them, and wait to claim their bus name until they are ready -- knowing that no other copy of themselves can be running, because the service manager won't allow it.
Upstart, the Service Activation Manager
---------------------------------------

The other side of the Upstart coin to simply managing services is that it also manages their activation, automatically starting and stopping them when certain events are received. This has, unsurprisingly, proved to be Upstart's most compelling and controversial feature. To those on the side of controversy, it's worth noting that you don't *have* to start services in this manner.

Take the case of D-BUS Service Activation, for example; it makes sense for D-BUS to utilise Upstart to handle this, just as Upstart will utilise D-BUS for IPC. There's no reason for D-BUS to invent spurious "events" to pass to Upstart; instead it can simply ask Upstart (over D-BUS) to start a service by name, if it prefers to continue maintaining the D-BUS Name to Service Name mapping. Upstart could even support starting services by D-BUS Name directly -- since it will likely have this information anyway, to be able to defer the running state until the bus name is claimed.

Initially it seems to make sense to discard the notion of "events" entirely, since they appear to be handled already by D-BUS Service Activation (managed by Upstart) and D-BUS Signals. Even in this model, though, you'd want to be able to start or stop services on D-BUS Signals, which D-BUS doesn't currently provide (Service Activation only happens when you are addressed by name).

This doesn't quite fill the entire picture either; there are still interesting cases where events can be considered methods rather than signals, most notably the compatibility or near-compatibility events like startup, runlevel, etc. Even if signals were enough, there would need to be some way to pass the data of the signal to the process being started -- since it would be running too late to get on the bus and catch the signal before it was lost. Unfortunately, many signals don't contain enough information or context anyway; HAL is a notable culprit for this.
For example, a job is likely to want an instance of itself running for each device with a certain capability. Unfortunately, HAL's signals only include the object path, so it's necessary to perform some communication first to convert the DeviceAdded and DeviceRemoved signals into useful events that can be matched and used to start/stop services -- which will want to know what they are supposed to be handling. And this doesn't even cover events from non-D-BUS sources, such as inotify or even temporal events (cron).

So we still appear to need the ability to define an abstract "event", with the interesting distinguishing feature that rather than watching an abstract flow, events are defined in advance and may actually require some kind of code to run to find out more information. This fits in with one of the original plans for Upstart, where you would have processes that performed particular jobs, such as listening for signals from HAL and converting them into useful events for jobs. Not to mention that the one key reason to still think in terms of events is translating them into environment variables for the jobs.

Since events still appear to need to stay around, so do states (defined in terms of events); again, the very earliest musings about Upstart included the distinction between "edge" and "level" events -- and no matter how hard we try, they just won't go away.

Jobs shouldn't need to care about tracking DeviceAdded signals, checking for the camera capability, etc.; neither should they have to wordily repeat that they're started on one event and stopped on another. It should be enough that they can just say "while a camera is plugged in" (and then use the inherent environment to define whether one copy is run for all cameras, or one for each camera).
This is what makes Upstart more than just a dumb service manager: by taking some effort to automatically start and stop services as required, it can keep the number running to the minimum needed -- thus conserving resources and improving performance of even the most hefty workstation.

Scott
-- 
Have you ever, ever felt like this?
Had strange things happen?  Are you going round the twist?
-- 
upstart-devel mailing list
[email protected]
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/upstart-devel
