This is a second update to the Upstart 0.5 Roadmap sent to this mailing list five months ago, which you can find in the archives here:

    https://lists.ubuntu.com/archives/upstart-devel/2007-October/000468.html
You can find the first update in the archives here:

    https://lists.ubuntu.com/archives/upstart-devel/2008-January/000573.html

Progress
--------

Much of the work since the last update has been on the interaction between events and jobs, and what I've come to term the atomicity of jobs.

One of the most immediately obvious changes is the loss of arguments to events. The simple reason for this is that with event expressions, there's no logical way to pass all of the arguments of all of the events to the job; so you'd have to duplicate the information anyway, ending up with something like:

    interface-up eth0 00:11:D8:98:1B:37 IFACE=eth0 HWADDR=00:11:D8:98:1B:37 TYPE=1

Obviously this is a bit silly. The change means that events now only have the environment variables ("parameters") part:

    interface-up IFACE=eth0 HWADDR=00:11:D8:98:1B:37 TYPE=1

The order they are specified in is preserved, so to match them you can either do it by name:

    start on interface-up IFACE=eth*

or by position:

    start on interface-up eth*

As long as positional matches come first, you can use both:

    start on interface-up eth* TYPE=1

I figure that the documentation for events will indicate which ones you can rely on being in order, and that they'll be the primary ones for the event.

This change means that the environment of an event expression is now predictable, and can be extracted at the point the expression becomes true. This environment is combined with that present in the job configuration (which may take variables from init's own environment) and stored in the new job instance when it is started.

If the start expression becomes TRUE while the instance is stopping, the new environment does not immediately replace the old one; the old environment remains, since the post-stop script may need it. Once the job has finished stopping and restarts, the new environment is used. i.e. given the definition:

    start on foo
    stop on bar

    pre-start exec echo pre-start $FOO
    post-start exec echo post-start $FOO

    exec echo main $FOO && sleep inf

    pre-stop exec echo pre-stop $FOO
    post-stop exec echo post-stop $FOO

you would expect to see the following:

    $ initctl emit foo FOO=hello
    pre-start hello
    post-start hello
    main hello

    $ initctl emit bar ; initctl emit foo FOO=goodbye
    pre-stop hello
    post-stop hello
    pre-start goodbye
    post-start goodbye
    main goodbye

This, I think, makes much more sense.

The list of events that started the job can now be found in the $UPSTART_EVENTS variable, instead of as positional arguments, so that they're consistently available.

The same holds true for the events that stop the job, except that the environment from these is not generally useful for the job, since it's often just a match for what started it. That being said, since pre-stop is only run for natural stops and can cancel the stop without ill effect, it makes sense that this script should receive the stop event environment. Thus it does so, overriding that from the start events where different. The list of events that stopped the job can be found in the $UPSTART_STOP_EVENTS variable.

One of the other changes this introduced was removing the need for the job to keep a reference to the event for longer than it needed to block, since it now has the environment. This was originally a fix for the "respawn loses environment" problem, but that's irrelevant anyway now. In order to reset the start and stop operators immediately after matching (so that they need to be completely repeated), as well as to copy the environment out, we build a blocking list of events.
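Before moving on, here's a sketch of what a job written against these semantics might look like; the job itself and the logger calls are invented purely for illustration, but the interface-up event and the $UPSTART_EVENTS variable are as described above:

    start on interface-up IFACE=eth*

    # $UPSTART_EVENTS lists the events that started this instance,
    # here simply "interface-up"
    pre-start exec logger -t example "started by: $UPSTART_EVENTS"

    # IFACE and HWADDR come from the environment of the matching event
    exec logger -t example "interface $IFACE ($HWADDR) is up"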
At the same time, the periods during which events (and by inference, start and stop commands) are blocked were rationalised. "start job" will now block until the job is running (or stopped again, for tasks), or until the command is somehow interrupted: if a process fails, or a stop event occurs, or another admin runs "stop job", the start command will exit immediately. Likewise for the "stop" command. Previously the commands would block until the job was at rest again; this seemed overkill and was causing problems for event sequencing.

So what does this all buy us? Jobs now have a table of environment variables given to them by the events that started them. We'll extend the start/stop commands to be able to do this as well.

We can use this environment to expand variable references in some special job stanzas. The first and most obvious one where this is useful is "stop on":

    start on tty-added
    stop on tty-removed $TTY

The value of the $TTY variable is taken from the job's environment, and thus from the start events. Where a variable isn't found, it can never match; so assuming you keep the names unique:

    start on tty-added or cua-added
    stop on tty-removed $TTY or cua-removed $CUA

cannot pair a tty-added event with a cua-removed event.

The expansion is somewhat shell-like, though we should stress that it is only intended to be a limited subset that may not be truly compatible. (Compare the expansion in script/exec, which is actually done by passing the string unmodified to a shell and letting it worry about it.) Current forms we support:

    $VAR          simple reference
    ${VAR}        reference where there might be confusion
    ${VAR:-foo}   foo used if $VAR unset or NULL
    ${VAR:+foo}   foo used unless $VAR is unset or NULL
    ${VAR-foo}    foo used if $VAR unset
    ${VAR+foo}    foo used unless $VAR is unset

I'd like to support the #, ##, % and %% forms too, but I haven't figured those out yet.

The other stanza where these are expanded is a new one, well, an extension to an existing one. Previously Upstart has supported singleton jobs, where only one copy could be active at any one time, and instance jobs (now "unlimited-instance"), where any number could be running. We now have a middle ground; you can define a string by which instances must be unique. Only one instance of a given "name" may be active at any one time. The way to define these is by giving an argument to the "instance" stanza:

    instance $TTY

Obviously it makes no sense not to specify any variable expansions here, since the effect would be the same as a singleton job.

Other stanzas where these will be expanded will be the planned file dependencies and resources stanzas (see the roadmap). They may be expanded for other similar service activation stanzas as and when we invent them.

Variables are explicitly *not* expanded in process stanzas such as "umask", "nice", etc. This is because events aren't sanitised, so you could be at risk of a malicious user injecting bad resource limits, etc. The right way to do this is in the script itself, and to check the value first. (The reason this doesn't apply to the service activation stanzas is that the worst you can do is start a service that will immediately fail.)

A surprising and last-minute change has been to the behaviour of the "respawn limit" stanza. This now only limits Upstart's automatic respawning of the job (ie. the "respawn" stanza itself). Manual restarts of the job are expressly not limited in this way, since the proper way to stop an administrator restarting a job in a while loop is to hit them.
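To pull these pieces together, a per-tty job under these rules might look something like the following sketch; the tty-added and tty-removed events, the TTY and TERM variables they're assumed to carry, and the getty invocation are all made up for illustration:

    start on tty-added
    stop on tty-removed $TTY

    # one active instance per unique value of $TTY; using the limited
    # expansion above, fall back to "console" if the starting event
    # carried no TTY variable
    instance ${TTY:-console}

    # expansion in exec is done by the shell, not by Upstart itself
    exec /sbin/getty 38400 $TTY

A second tty-added event carrying the same TTY value would not start another copy, while events for other ttys would each get their own instance.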
Missing Pieces
--------------

In other words, "when's the release?" There are two remaining pieces to land before I'm ready to release an 0.5.0 version.

The first is a change to the state machine; this is to support features such as resources in the future. A new "inactive" state will be introduced, which will replace "waiting" as the default and final state of the job. "waiting" will become an intermediate state between "inactive" and "starting". Jobs may go from "stopped" into "waiting" directly if being restarted, and may go from "waiting" into "dead" directly if being ... Nothing will wait in the waiting state at first, but it means we have a state where we can wait later on and leave without worrying about countering event emissions with their opposites.

The second is to reintroduce the IPC layer with D-BUS; initially this will likely be limited to the basic methods needed to get initctl working again, with more methods such as job registration coming in later releases.

Future
------

0.5.0 isn't intended to be a complete release by any stretch of the imagination, but a first release of the work in trunk in order to widen the testing that it can get and so we can discover what else we need to do.

0.5.x releases will quite quickly see the addition of dependencies and resources; they're only not targeted for 0.5.0 because they're new features and I don't want to delay too long.

Another thing I want to work into a relatively early 0.5.x release is the ability to disable jobs from starting automatically. The two favourite methods for doing this are "profiles" and "flags", which are basically just different ways of doing the same thing.

A profile would be a first-class object, only one of which can be active at a time. A profile either hand-picks which jobs are enabled while active, or excludes jobs that are to be disabled. If the "single-user" profile were active, only the jobs it lists would be able to start.

Flags are not so much first-class objects as tags that can appear in job definitions, either positively or negatively. Likewise, any number of flags can be "switched on" or "switched off" on the kernel command line, collectively enabling or disabling jobs. If "!networking" were placed on the kernel command line, no job with "if networking" in its definition would be able to start automatically, but jobs with "unless networking" would be able to start.

It's not clear which of these two is the right way to do it yet.

The current fork following code is relatively simple; when a process forks, it follows the fork and stops tracing the parent, expecting it to go away. This could be done in a much more heuristic way to provide the right behaviour for most daemons.

Upstart would trace the process, and follow forks as it does now; but it wouldn't forget the previous one or change the pid. Instead it would record the new pid as an additional process for the job. Should any process terminate or call exec(), it will be struck from the list of known processes and the next one in the list selected instead. If we run out of processes, then we deem the job to have died. We may also keep a timer so that after a sensible time (30s?), if we've kept the same process, we forget about the others.

Other interesting suggestions for the "wait for" stanza are the ability to wait for the listen() syscall, the creation of a file, or the announcement of a D-BUS name.

Scott
-- 
Have you ever, ever felt like this?
Had strange things happen?  Are you going round the twist?
