Re: Process Dependency?
Admittedly my posting was hurried.

On Fri, Oct 31, 2014 at 3:34 PM, Laurent Bercot wrote:

> On 31/10/2014 21:04, Avery Payne wrote:
>> Of course, you are right, in the perspective that dbus itself is
>> encouraging a design that is problematic, and that problem is now
>> extending into the issues I face when writing my (legacy?) scripts.
>
> But it's okay to have dependencies !

I was picking on dbus because, from a practical standpoint, it's a thorn in my side. Writing the ./run script for lightdm was...traumatic. Nothing like a flickering screen that prevents you from entering "sv stop lightdm" because it keeps switching to vt7, then dies, switches to vt7, then dies, switches to vt7...all because dbus didn't "report back". It was rather nasty.

>> We currently have down -> up -> finish -> down as a cycle. In order
>> for dependencies to work, we would need a 4-state, i.e. down ->
>> starting -> up -> finish -> down. (...)
>
> You are thinking mechanism design here, but I believe it's way too
> early for that: I still fail to see how the whole thing would be
> beneficial.

I'm probably twisting things quite a lot. The idea was to have a logical state (as part of a finite state machine) that we want to be "in", and then have the software attempt to align the actual state with the logical one. Once aligned, they are "in sync" and considered valid. If they are not in sync, then keep trying to align. It's probably too simplistic.

> The changes you suggest are very intrusive, so they have to bring
> matching benefits. What are those benefits ? What could you do with
> those changes that you cannot do right now ?

There isn't guesswork with regard to cleanup inside ./finish, which may or may not need to touch "global state". Now ./finish is always a "local" cleanup, i.e. it doesn't care about anything else other than cleaning up after itself, and I don't have to worry about some secondary dependency that I left running behind.
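For what it's worth, the vt7 flicker loop can be damped without any framework changes, with a guard at the top of ./run that defers the start when a dependency isn't up. Below is a hedged sketch: dep_up() and the DEP_FLAG flag file are invented stand-ins for a real readiness check, not anything lightdm or runit actually provides.

```shell
#!/bin/sh
# Hypothetical ./run guard: defer startup instead of letting the
# daemon grab vt7 and die in a loop.  dep_up() is a stand-in for a
# real readiness check; DEP_FLAG is a made-up knob for this sketch.
dep_up() {
  [ -e "${DEP_FLAG:-/nonexistent/dep-flag}" ]
}

start_or_defer() {
  if ! dep_up; then
    echo "dependency not up; deferring start"
    return 1                # a real ./run would: sleep 1; exit 1
  fi
  echo "starting service"   # a real ./run would: exec lightdm
}

start_or_defer || true      # dependency missing: defers
DEP_FLAG=/                  # "/" always exists: pretend the dep is up
start_or_defer
```

The supervisor's normal restart delay then does the pacing: the script exits quickly and quietly instead of letting X seize the console on every attempt.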
> So you are subjecting the starting of a process to an externally
> checked global state. Why ?

There's some subtlety here. In this schema, the only state tracking done is "how many times has someone else needed Service X to be up".

> The dependency checking can also be auto-generated.
>
>> If we fail, we go to finish, where the dependencies are notified
>> that they aren't needed by our process; it's up to either B or B's
>> supervisor (not sure yet) to decide if B needs to go down.
>
> And thus you put service management logic into the supervisor.
> I hate these #blurredlines. :P

Actually I didn't write clearly, and that was my fault. I'll walk it through and clean up my examples this time. In this example, when I say service, I mean process, and not "service management".

1. Sysadmin asks Service A to start.

2. The svscan process "sees" that Service A has a ./needs directory.

3. Svscan walks the directory entries for ./needs and "starts" each symlink, one at a time, as if someone had asked that service to start normally. Each successful start increments a single counter for that service after the fact. The counter-per-service is the only change, and is the "global state" you are talking about.

4. If a ./needs entry fails to start, or fails a timeout, svscan kills whatever it was working on, and then walks backwards through the list of ./needs it just started (we were just there! we should be able to do this). It decrements the counter for each entry as it visits it during the walk-back. For each Service X that it visits, if the counter is zero after the decrement, AND Service X is not marked "wanted up", then svscan signals Service X to shut down normally through the supervisor associated with it; otherwise it is left running. At this point everything happens as if Service X had been asked to shut down normally, i.e. ./finish is run, etc.

5. If all of the ./needs are reported as up, then the supervisor for Service A is started as normal.

That's pretty much it.
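The counter walk-back in steps 3-4 can be simulated in a few lines of portable shell, which also shows how little state the scheme needs. This is a toy model, not real svscan behaviour: the service names, the per-service count files, and the start_ok() stub are all invented for illustration.

```shell
#!/bin/sh
# Self-contained simulation of the counter walk-back in steps 3-4.
# Nothing here is real svscan behaviour: the service names, the
# per-service "count" files and start_ok() are invented.
D=$(mktemp -d)
for s in dbus logger; do
  mkdir -p "$D/$s"
  echo 0 > "$D/$s/count"
done
mkdir -p "$D/A/needs"
ln -s "$D/dbus"   "$D/A/needs/dbus"
ln -s "$D/logger" "$D/A/needs/logger"

start_ok() { [ "$1" = dbus ]; }     # pretend dbus starts, logger fails

started="" stopped=""
for link in "$D"/A/needs/*; do
  s=$(basename "$link")
  if start_ok "$s"; then
    echo $(( $(cat "$D/$s/count") + 1 )) > "$D/$s/count"
    started="$s $started"           # most recently started first
  else
    # step 4: walk back in reverse start order, decrementing counters
    for t in $started; do
      n=$(( $(cat "$D/$t/count") - 1 ))
      echo "$n" > "$D/$t/count"
      if [ "$n" -eq 0 ]; then       # and not "wanted up" (assumed here)
        stopped="$stopped$t "
      fi
    done
    break
  fi
done
echo "would stop: $stopped"
```

When logger fails, dbus's counter drops back to zero and dbus is the only service flagged for shutdown, which is exactly the "clean up only what we started" behaviour described above.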
If we fail, Service A never starts, and svscan can clean up *by asking all the dependencies to clean themselves up*, using the existing mechanisms to shut down the service. If we succeed, nothing changes from what happens already. Process dependency moves out of the run script and into a location that can "see" all the other processes, rather than needing a helper to "ask" if a process is up inside of the script.

* * * * *

Ok, yeah, I'm looking at the list now and I can see some objections in it. The tally table is probably not ideal because it consumes RAM. I'm picturing a one-time allocation of a blo
Re: Process Dependency?
On 31/10/2014 21:04, Avery Payne wrote:

> The reason I was attempting to tackle process dependencies is
> actually driven by dbus & friends. I've stumbled on many "modern"
> processes that need to have A, and possibly B, running before they
> launch. Of course, you are right, in the perspective that dbus
> itself is encouraging a design that is problematic, and that problem
> is now extending into the issues I face when writing my (legacy?)
> scripts.

But it's okay to have dependencies ! If there's a problem with D-Bus, it's not that it needs to be up early because other stuff depends on it. It's that it's way too complex for what it does at such a low level, and it bundles together functionality that has no business being provided by a bus. But you cannot blame software for having dependencies - system software is written to be depended on.

> I haven't had time to think the entire state diagram through but I
> did give it some brief thought. Personally, I see current frameworks
> as potentially having an incomplete process state diagram. Most of
> them have a tri-state arrangement, and I think this is where part of
> the problem with dependencies shows up. We currently have down -> up
> -> finish -> down as a cycle. In order for dependencies to work, we
> would need a 4-state, i.e. down -> starting -> up -> finish -> down.
> (...)

You are thinking mechanism design here, but I believe it's way too early for that: I still fail to see how the whole thing would be beneficial. The changes you suggest are very intrusive, so they have to bring matching benefits. What are those benefits ? What could you do with those changes that you cannot do right now ?

> The starting state is where magic happens. During starting, other
> dependencies are notified to start. If they all succeed, we go to
> up.

So you are subjecting the starting of a process to an externally checked global state. Why ? A process can start 1. when the global service state changes and this process is now wanted up, or 2.
when it has died unexpectedly and the supervisor just maintains it alive.

In case 1, there's no need to modify the supervisor at all : the existing functionality is sufficient. Service management can be done on top of it, possibly via generated scripts.

In case 2, the global state has not changed, the process is simply wanted up and should be restarted. If it is wanted up, then its dependencies are *also* wanted up, and should be up. Why would you need to perform a check at that point ? Just restart the process. If something goes wrong, it will die and try again; it's only a transient state, so it's no big deal. Heavyweight applications for which it *is* a big deal to unsuccessfully start up several times can have a safeguard at the top of their run script that checks dependencies for *this* service. The dependency checking can also be auto-generated.

> If we fail, we go to finish, where the dependencies are notified
> that they aren't needed by our process; it's up to either B or B's
> supervisor (not sure yet) to decide if B needs to go down.

And thus you put service management logic into the supervisor. I hate these #blurredlines. :P

> Looping A seems undesirable until you realize that your issue isn't
> A, it's B. And when A fails to start, there should be a notification
> to the effect "can't start A because of B".

But what are you trying to achieve with this that you cannot already do ? Why can't you just let A be restarted until it works ? If A is too heavy, then specific guards can be put in place, but that should be the exception, not the rule.

What I can envision is keeping the global "wanted" state somewhere along with the global "actual" state, and writing primitives, to be called from run scripts, that block until the needed "actual" sub-state matches the needed "wanted" sub-state. That way, processes that choose it can block, instead of loop, until their dependencies are met.
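The blocking primitive described above could be prototyped in a few lines of shell. This is only a sketch under invented conventions: the "up" marker file standing in for the "actual" state is an assumption, and s6 users would reach for an existing tool such as s6-svwait instead.

```shell
#!/bin/sh
# Sketch of a run-script primitive that blocks until a dependency's
# "actual" state is up, or a timeout expires.  The "up" marker file
# is an invented convention for this illustration.
wait_up() {    # usage: wait_up /path/to/service-state timeout-seconds
  t=0
  while [ "$t" -lt "$2" ]; do
    [ -e "$1/up" ] && return 0    # actual state matches wanted: go
    sleep 1
    t=$((t + 1))
  done
  return 1                        # still down after the timeout
}
```

A run script would then begin with something like `wait_up /service-state/dbus 30 || exit 1` before exec'ing the daemon, turning the restart loop into a single blocking wait.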
But that's a job for a global dependency manager, not a supervision suite, and there is still no need to modify the supervisor.

> So you can see, supervisors don't talk to each other, the overlord
> process pretty much stays as it is

Not at all - the changes you suggest would be quite heavy on the overlord and the supervisor. Today, overlords can send signals to supervisors and that's it; whereas what you want involves real two-way communication. Sure, it's simple communication with a trivial protocol, but it's still a significant architectural change and complexity increase.

> Yes, indirectly. Just because A wants to start, but can't because B
> is having its own issues. I see it as separation of responsibilities
> - B has to get itself off the ground to start running, but
> B-the-process doesn't have to be aware of A's needs, that's a
> problem for B's supervisor.

Yes. So let B do its thing, and let A do its thing, and everything will be all right eventually. If looping A is a problem, then add something in A's run script that prevents it from do
Re: Process Dependency?
Part of the temptation I've been fighting with templates is to write "the grand unified template", the one that does it all. It sounds horrible and barely feasible, but the more I poke at it, the more I realize that there is a specific, constrained set of requirements that could be met in a single script...under the right circumstances.

The reality is there will be more than one template in this arrangement. One that is "simple service", one that covers the unique needs of "getties", one that needs a "var/run-and-pid" (which is just simple-service with extras), and one that I haven't done yet that I call "swiss army knife", the script of scripts. There are still lots of "one-offs" that will be needed, and that shows the limits of what can be done. All of these solve the current issues with process management, and as Laurent has pointed out, *none* of them address service management.

As a stop-gap, until service management is really ready, the plan is to temporarily patch over the issue by having smaller processes manage the state of services, and then controlling it through process management (and again, Laurent has pointed out this is sub-optimal). An example is using per-interface instances of dhcpcd (no, not *dhcpd*) to manage each interface. This is heavy and bloats up the process tree for larger systems, because a single process is needed for each instance of state to manage, when the kernel itself is already using/managing that state.

With regard to coming up with something akin to a domain-specific language in the form of a specification for services, this is ideal and solves plenty. I would love, love, LOVE to see a JSON specification that addressed the 9+ specific needs of starting processes as a base point, and then extend it to provide full service coverage, becoming a domain-specific language that encompasses what is needed.
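A service description in such a JSON format might look like the following. Every field name here is hypothetical, invented for illustration; no supervision suite currently consumes this:

```json
{
  "name": "dhcpcd-eth0",
  "description": "DHCP client for eth0",
  "exec": ["/sbin/dhcpcd", "--nobackground", "eth0"],
  "needs": ["udevd"],
  "user": "root",
  "restart": "always",
  "log": { "destination": "/var/log/dhcpcd-eth0" }
}
```

A generator could translate one such file into a daemontools-style service directory (./run, ./log/run, and, in the scheme discussed in this thread, ./needs symlinks), or into scripts for any other supervisor.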
This would be backwards compatible with daemontools/runit/s6 (within the limitations of the environment), forwards compatible with future service management, and would completely supplant the need for templates. I'd like to hear more.

On Fri, Oct 31, 2014 at 2:40 AM, John Albietz wrote:

> Script generators are the way I've been leaning.
>
> It's really convenient to have one or more services defined in some
> kind of structured data format like yaml or json and to then generate
> and install the service scripts.
>
> I wish there was a standard format to define services so the
> generator could take one input file and output appropriate service
> scripts for different process supervisor systems.
>
> Anyone seen any efforts in this direction? Most upstart and sysv
> scripts have standard boilerplate, so it looks like there are common
> standards that could be derived.
>
> - John
>
>> On Oct 31, 2014, at 1:05 AM, Laurent Bercot wrote:
>>
>> First, I need to apologize, because I spend too much time talking
>> and dismissing ideas, and not enough time coding and releasing
>> stuff. The thing is, coding requires sizeable amounts of
>> uninterrupted time - which I definitely do not have at the moment
>> and won't until December or so - while writing to the mailing-list
>> is a much lighter commitment. Please don't see me as the guy who
>> criticizes initiatives and isn't helping or doing anything
>> productive. That's not who I am. (Usually.)
>>
>> On to your idea. What you're suggesting is implementing the
>> dependency tree in the filesystem and having the supervisor consult
>> it. The thing is, it's indeed a good thing (for code simplicity
>> etc.) to implement the dependency tree in the filesystem, but that
>> does not make it a good idea to make the process supervision tree
>> handle dependencies itself!
>> (Additionally, the implementation should be slightly different,
>> because your ./needs directory only addresses service startup, and
>> you would also need a reverse dependency graph for service
>> shutdown. This is an implementation detail - it needs to be solved,
>> but it's not the main problem I see with your proposal. The symlink
>> idea itself is sound.)
>>
>> The design issues I see are:
>>
>> * First and foremost, as always, services can be more than
>> processes. Your design would only handle dependencies between
>> long-lived processes; those are easy to solve and are not the
>> annoying part of service management, iow I don't think "process
>> dependency" is worth tackling per se. Dependencies between
>> long-lived processes and machine state that is *not* represented by
>> long-lived processes is the critical part of dependency manage
Fwd: Process Dependency?
A message was dropped...passing it along as part of the discussion.

---------- Forwarded message ----------
From: Casper Ti. Vector
Date: Fri, Oct 31, 2014 at 3:37 AM
Subject: Re: Process Dependency?
To: Avery Payne

Sorry, but I just found that I did not list-reply your original mail, so this practically became a private message. Nevertheless, you may forward this message to the mailing list if you consider it favourable :)

On Fri, Oct 31, 2014 at 08:15:03AM +0800, Casper Ti. Vector wrote:
> For one already implemented way of dependency interface in
> daemontools-like service managers, you can have a look at how nosh
> does it.

--
My current OpenPGP key: 4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A
Re: Process Dependency?
Script generators are the way I've been leaning.

It's really convenient to have one or more services defined in some kind of structured data format like yaml or json and to then generate and install the service scripts.

I wish there was a standard format to define services so the generator could take one input file and output appropriate service scripts for different process supervisor systems.

Anyone seen any efforts in this direction? Most upstart and sysv scripts have standard boilerplate, so it looks like there are common standards that could be derived.

- John

> On Oct 31, 2014, at 1:05 AM, Laurent Bercot wrote:
>
> First, I need to apologize, because I spend too much time talking and
> dismissing ideas, and not enough time coding and releasing stuff. The
> thing is, coding requires sizeable amounts of uninterrupted time -
> which I definitely do not have at the moment and won't until December
> or so - while writing to the mailing-list is a much lighter
> commitment. Please don't see me as the guy who criticizes initiatives
> and isn't helping or doing anything productive. That's not who I am.
> (Usually.)
>
> On to your idea. What you're suggesting is implementing the
> dependency tree in the filesystem and having the supervisor consult
> it. The thing is, it's indeed a good thing (for code simplicity etc.)
> to implement the dependency tree in the filesystem, but that does not
> make it a good idea to make the process supervision tree handle
> dependencies itself!
>
> (Additionally, the implementation should be slightly different,
> because your ./needs directory only addresses service startup, and
> you would also need a reverse dependency graph for service shutdown.
> This is an implementation detail - it needs to be solved, but it's
> not the main problem I see with your proposal. The symlink idea
> itself is sound.)
>
> The design issues I see are:
>
> * First and foremost, as always, services can be more than processes.
> Your design would only handle dependencies between long-lived
> processes; those are easy to solve and are not the annoying part of
> service management, iow I don't think "process dependency" is worth
> tackling per se. Dependencies between long-lived processes and
> machine state that is *not* represented by long-lived processes is
> the critical part of dependency management, and what supervision
> frameworks are really lacking today.
>
> * Let's focus on "process dependency" for a bit. Current process
> supervision roughly handles 4 states: the wanted up/down state x the
> actual up/down state. This is a simple model that works well for
> maintaining daemons, but what happens when you establish dependencies
> across daemons ? What does it mean for the supervisor that "A needs
> B" ?
>
> - Does that just mean that B should be started before A at boot
> time, and that A should be stopped before B at shutdown time ? That's
> sensible, but it simply translates to "have a constraint on the order
> of the wanted up/down state changes at startup or shutdown". Which
> can be handled by the init and shutdown scripts, without the need for
> direct support in the supervision framework; an offline init script
> generator could analyze the dependency tree and output the proper
> script, which would contain the appropriate calls to sv or s6-svc in
> the correct order.
>
> - Or does that mean that every time A is started (even if it is
> already wanted up and has just unexpectedly died) the supervisor
> should check the state of B and not restart A if B happens to be
> down ? What would be the benefit of that behaviour over the current
> one which is "try and restart A after one second no matter what" ? If
> B is supposed to be up, then A should restart without making a fuss.
> If B is wanted up but happens to be down, it will be back up at some
> point and A will then be able to start again. If B is not wanted up,
> then why is A ?
> The dependency management system has not properly set the wanted
> states and that is the problem that needs to be fixed.
>
> * Supervisors currently have no way of notifying their parent, and
> they don't need to. Their parent simply maintains them; supervisors
> are pretty much independent. You can even run s6-supervise/runsv
> without s6-svscan/runsvdir, even if that won't build a complete
> supervision tree. The point is that a supervisor maintains one
> process according to its wanted state, and that's it. (With an
> optional logger for runsv.) Adding a notification mechanism from the
> supervisor to its parent (other than dying and sending a SIGCHLD,
> obviously) would be a heavy change in the
Re: Process Dependency?
First, I need to apologize, because I spend too much time talking and dismissing ideas, and not enough time coding and releasing stuff. The thing is, coding requires sizeable amounts of uninterrupted time - which I definitely do not have at the moment and won't until December or so - while writing to the mailing-list is a much lighter commitment. Please don't see me as the guy who criticizes initiatives and isn't helping or doing anything productive. That's not who I am. (Usually.)

On to your idea. What you're suggesting is implementing the dependency tree in the filesystem and having the supervisor consult it. The thing is, it's indeed a good thing (for code simplicity etc.) to implement the dependency tree in the filesystem, but that does not make it a good idea to make the process supervision tree handle dependencies itself!

(Additionally, the implementation should be slightly different, because your ./needs directory only addresses service startup, and you would also need a reverse dependency graph for service shutdown. This is an implementation detail - it needs to be solved, but it's not the main problem I see with your proposal. The symlink idea itself is sound.)

The design issues I see are:

* First and foremost, as always, services can be more than processes. Your design would only handle dependencies between long-lived processes; those are easy to solve and are not the annoying part of service management, iow I don't think "process dependency" is worth tackling per se. Dependencies between long-lived processes and machine state that is *not* represented by long-lived processes is the critical part of dependency management, and what supervision frameworks are really lacking today.

* Let's focus on "process dependency" for a bit. Current process supervision roughly handles 4 states: the wanted up/down state x the actual up/down state.
This is a simple model that works well for maintaining daemons, but what happens when you establish dependencies across daemons ? What does it mean for the supervisor that "A needs B" ?

- Does that just mean that B should be started before A at boot time, and that A should be stopped before B at shutdown time ? That's sensible, but it simply translates to "have a constraint on the order of the wanted up/down state changes at startup or shutdown". Which can be handled by the init and shutdown scripts, without the need for direct support in the supervision framework; an offline init script generator could analyze the dependency tree and output the proper script, which would contain the appropriate calls to sv or s6-svc in the correct order.

- Or does that mean that every time A is started (even if it is already wanted up and has just unexpectedly died) the supervisor should check the state of B and not restart A if B happens to be down ? What would be the benefit of that behaviour over the current one which is "try and restart A after one second no matter what" ? If B is supposed to be up, then A should restart without making a fuss. If B is wanted up but happens to be down, it will be back up at some point and A will then be able to start again. If B is not wanted up, then why is A ? The dependency management system has not properly set the wanted states and that is the problem that needs to be fixed.

* Supervisors currently have no way of notifying their parent, and they don't need to. Their parent simply maintains them; supervisors are pretty much independent. You can even run s6-supervise/runsv without s6-svscan/runsvdir, even if that won't build a complete supervision tree. The point is that a supervisor maintains one process according to its wanted state, and that's it. (With an optional logger for runsv.)
Adding a notification mechanism from the supervisor to its parent (other than dying and sending a SIGCHLD, obviously) would be a heavy change in the design and take away from the modularity. It can be done, but not without overwhelming benefits to it; and so far I've found that all the additional stuff that we might need would be best handled *outside of* the supervisors themselves.

The more I think about it, the more convinced I am that script generators are the way to go for dependency management, and service management in general. Script generators can take input in any format that we want, and output correct startup/shutdown sequences, and correct run scripts, using the infrastructure and tools we already have without adding complexity to them. It's something I will definitely be looking into.

--
Laurent
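As a thought experiment, the offline generator described in this message is small: given a dependency table, a depth-first walk emits the start commands in a safe order. The service names and the deps table below are invented for illustration; only the s6-svc syntax in the output is real.

```shell
#!/bin/sh
# Toy offline generator: read a (hardcoded) dependency table and emit
# "s6-svc -u" calls so that every service starts after its needs.
deps() {                 # deps <service> -> its dependencies
  case "$1" in
    lightdm) echo dbus ;;
    dbus)    echo logger ;;
    *)       ;;          # leaf: no dependencies
  esac
}

emitted=""
emit_start() {           # depth-first: dependencies first, then $1
  case " $emitted " in *" $1 "*) return 0 ;; esac   # already emitted
  for d in $(deps "$1"); do
    emit_start "$d"
  done
  emitted="$emitted $1"
  echo "s6-svc -u /service/$1"
}

emit_start lightdm
```

Running it prints the logger, dbus and lightdm starts in that order; a shutdown script is the same walk with the edges reversed and `-d` instead of `-u`.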
Process Dependency?
I know that most process management tools look at having the script do the heavy lifting (or at least carry the load by calling other tools) when trying to bring up dependencies. Couldn't we just have a (service)/needs directory?

The idea is that the launch program (s6-supervise or runsv) would "see" the ./needs directory the same way it would "see" the ./log directory. Each entry in ./needs is a symlink to an existing (service) directory, so everything needed to start (dependency) is already available. The (service) launcher would notify its *parent* that it wants those launched, and it would be the parent's responsibility to bring up each process entry. For s6 the parent would be s6-svscan, for runit it would be runsvdir. During this time the launcher simply waits until it is signaled to either proceed, or to abort and clean up. Once all dependency entries are up, the parent would signal that the launcher can proceed to start ./run.

There isn't much in the way of state-tracking beyond the signals, and the symlinks reduce the requirement for more memory. The existing mechanisms for checking processes remain in place, and can be re-used to ensure that a dependent process didn't die before the ./run script turns over control to (service). Just about all ./run scripts remain as-is, and even if they "launch a dependency" they continue to work (because it's already launched).

What are the hidden issues that I'm not aware of?
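On disk, the proposed layout might look like the sketch below. Everything is illustrative: the tree is created under a temporary directory rather than /service, and no existing scanner actually reads a ./needs directory.

```shell
#!/bin/sh
# Sketch of the proposed layout: service A declares that it needs B
# and C via symlinks in its ./needs directory.
SVDIR=$(mktemp -d)                   # stand-in for /service
mkdir -p "$SVDIR/A/needs" "$SVDIR/B" "$SVDIR/C"
ln -s ../../B "$SVDIR/A/needs/B"     # A needs B
ln -s ../../C "$SVDIR/A/needs/C"     # A needs C

# A dependency-aware scanner could enumerate them like this:
for link in "$SVDIR"/A/needs/*; do
  echo "A needs $(basename "$link")"
done
```

The symlinks point at the sibling service directories themselves, so the scanner needs no extra configuration: the dependency graph *is* the filesystem.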