Re: Process Dependency?

2014-10-31 Thread Avery Payne
Admittedly, my posting was hurried.

On Fri, Oct 31, 2014 at 3:34 PM, Laurent Bercot  wrote:

> Le 31/10/2014 21:04, Avery Payne a écrit :
>
> Of course, you are right, in the perspective that dbus
>> itself is encouraging a design that is problematic, and that problem
>> is now extending into the issues I face when writing my (legacy?)
>> scripts.
>>
>
>  But it's okay to have dependencies !
>

I was picking on dbus because from a practical standpoint, it's a thorn in
my side.  Writing the ./run script for lightdm was...traumatic.  Nothing
like a flickering screen that prevents you from entering "sv stop lightdm"
because it keeps switching to vt7, then dies, switches to vt7, then dies,
switches to vt7...all because dbus didn't "report back".  It was rather
nasty.
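
In hindsight, a guard at the top of ./run would have spared me most of
that.  A minimal sketch (untested; the socket path is the stock location
on my system, and lightdm has to stay in the foreground for this to work):

  #!/bin/sh
  # Wait for the system bus to actually exist before starting lightdm,
  # instead of letting it crash-loop and steal vt7 on every respawn.
  until [ -S /var/run/dbus/system_bus_socket ]; do
    sleep 1
  done
  exec lightdm 2>&1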


> We currently have down -> up
>> -> finish -> down as a cycle.  In order for dependencies to work, we
>> would need a 4-state, i.e. down -> starting -> up -> finish -> down.
>> (...)
>>
>
>  You are thinking mechanism design here, but I believe it's way too
> early for that: I still fail to see how the whole thing would be
> beneficial.
>

I'm probably twisting things quite a lot.  The idea was to have a logical
state (as part of a finite state machine) that we want to be "in", and then
have the software attempt to align the actual state with the logical.  Once
aligned, they are "in sync" and considered valid.  If they are not in sync,
then keep trying to align.  It's probably too simplistic.
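
Spelled out as a loop, it would be something like this (purely
illustrative; the "wanted" file and "probe" check are made up, and
nothing like this exists in runit or s6 today):

  #!/bin/sh
  # Hypothetical reconcile loop: keep nudging the actual state toward
  # the logical state until they are "in sync".
  svc=/service/foo
  while : ; do
    wanted=$(cat "$svc/wanted")            # logical state: "up" or "down"
    if "$svc/probe" ; then actual=up ; else actual=down ; fi
    if [ "$wanted" != "$actual" ] ; then   # out of sync: try to align
      case "$wanted" in
        up)   sv start "$svc" ;;
        down) sv stop "$svc" ;;
      esac
    fi
    sleep 1                                # then check again
  done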


> The changes you suggest are very intrusive, so they have to bring matching
> benefits. What are those benefits ? What could you do with those changes
> that you cannot do right now ?


The benefit is that there's no more guesswork with regard to cleanup inside
./finish, which may or may not need to touch "global state".  Now ./finish
is always a "local" cleanup, i.e. it doesn't care about anything other than
cleaning up after itself, and I don't have to worry about some secondary
dependency that I left running behind.


>  So you are subjecting the starting of a process to an externally checked
> global state. Why ?
>

There's some subtlety here.  In this scheme, the only state tracking done
is "how many times has someone else needed Service X to be up".


> The dependency checking can also
> be auto-generated.
>
>  If we fail, we go to finish, where the dependencies are notified
>> that they aren't needed by our process; it's up to either B or B's
>> supervisor (not sure yet) to decide if B needs to go down.
>>
>
>  And thus you put service management logic into the supervisor.
> I hate these #blurredlines. :P


Actually, I didn't write that clearly, and that was my fault.  I'll walk it
through and clean up my examples this time.  In this example, when I say
service, I mean process, and not "service management".

1. Sysadmin asks Service A to start.

2. The svscan process "sees" that Service A has a ./needs directory.

3. Svscan walks the directory entries for ./needs and "starts" each
symlink, one at a time, as if someone asked that service to start
normally.  Each successful start increments a single counter for that
service after-the-fact.  The counter-per-service is the only change, and is
the "global state" you are talking about.

4. If a ./needs entry fails to start, or times out, svscan kills whatever
it was working on, and then walks backwards through the list of ./needs it
just started (we were just there!  we should be able to do this).  It decrements
the counter for that entry as it visits it during the walk-back.  For each
Service X that it visits, if the counter is zero after decrement, AND
Service X is not marked "wanted up", then svscan signals Service X to shut
down normally through the supervisor associated with it; otherwise it is
left running.  At this point everything happens as if Service X was asked
to shut down normally, i.e. ./finish is run, etc.

5. If all of the ./needs are reported as up, then the supervisor for
Service A is started as normal.

That's pretty much it.  If we fail, Service A never starts, and svscan can
clean up *by asking all the dependencies to clean themselves up*, using the
existing mechanisms to shut down the service.  If we succeed, nothing
changes from what happens already.  Process dependency moves out of the run
script and into a location that can "see" all the other processes, rather
than needing a helper inside the script to "ask" if a process is up.
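
If it helps, here is the walk from steps 1-5 approximated as a plain shell
script (illustrative only; the real logic would live inside svscan, and
the tally directory and "wanted-up" marker are made up):

  #!/bin/sh
  # Illustrative walk of ./needs with a per-service tally and walk-back.
  # The tally directory and "wanted-up" marker file are hypothetical.
  tally=/var/run/needs-tally
  started=""
  cd /service/A || exit 1
  for dep in ./needs/* ; do
    name=$(basename "$(readlink "$dep")")
    if sv start "$dep" ; then               # step 3: start one dependency
      n=$(cat "$tally/$name" 2>/dev/null || echo 0)
      echo $(( n + 1 )) > "$tally/$name"    # ...and bump its counter
      started="$name $started"              # newest first, for the walk-back
    else
      for prev in $started ; do             # step 4: walk backwards
        n=$(( $(cat "$tally/$prev") - 1 ))
        echo "$n" > "$tally/$prev"
        if [ "$n" -eq 0 ] && [ ! -e "/service/$prev/wanted-up" ] ; then
          sv stop "/service/$prev"          # ./finish runs as usual
        fi
      done
      exit 1                                # Service A never starts
    fi
  done
  # step 5: all needs are up; Service A starts as normal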

* * * * *

Ok, yeah, I'm looking at the list now and I can see some objections in it.
The tally table is probably not ideal because it consumes RAM.  I'm
picturing a one-time allocation of a block

Re: Process Dependency?

2014-10-31 Thread Laurent Bercot

Le 31/10/2014 21:04, Avery Payne a écrit :


The reason I was attempting to tackle process dependencies is
actually driven by dbus & friends.  I've stumbled on many "modern"
processes that need to have A, and possibly B, running before they
launch.  Of course, you are right, in the perspective that dbus
itself is encouraging a design that is problematic, and that problem
is now extending into the issues I face when writing my (legacy?)
scripts.


 But it's okay to have dependencies !
 If there's a problem with D-Bus, it's not that it needs to be up
early because other stuff depends on it. It's that it's way too
complex for what it does at such a low level and it bundles together
functionality that has no business being provided by a bus. But you
cannot blame software for having dependencies - system software is
written to be depended on.



I haven't had time to think the entire state diagram through but I
did give it some brief thought.  Personally, I see current frameworks
as potentially having an incomplete process state diagram.  Most of
them have a tri-state arrangement, and I think this is where part of
the problem with dependencies shows up.  We currently have down -> up
-> finish -> down as a cycle.  In order for dependencies to work, we
would need a 4-state, i.e. down -> starting -> up -> finish -> down.
(...)


 You are thinking mechanism design here, but I believe it's way too
early for that: I still fail to see how the whole thing would be beneficial.
The changes you suggest are very intrusive, so they have to bring matching
benefits. What are those benefits ? What could you do with those changes
that you cannot do right now ?



The starting state is where magic happens.  During starting, other
dependencies are notified to start.  If they all succeed, we go to
up.


 So you are subjecting the starting of a process to an externally checked
global state. Why ?
 A process can start 1. when the global service state changes and this
process is now wanted up, or 2. when it has died unexpectedly and the
supervisor just maintains it alive.
 In case 1, there's no need to modify the supervisor at all : the existing
functionality is sufficient. Service management can be done on top of it,
possibly via generated scripts.
 In case 2, the global state has not changed, the process is simply wanted
up and should be restarted. If it is wanted up, then its dependencies are
*also* wanted up, and should be up. Why would you need to perform a check
at that point ? Just restart the process. If something goes wrong, it will
die and try again; it's only a transient state, so it's no big deal.
Heavyweight applications for which it *is* a big deal to unsuccessfully
start up several times can have a safeguard at the top of their run script
that checks dependencies for *this* service. The dependency checking can also
be auto-generated.
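
 Such a generated guard can be tiny.  A sketch, using runit's sv and
example service names (an s6 version would use s6-svwait with a timeout):

  #!/bin/sh
  # Auto-generated guard: bail out if a dependency isn't up yet;
  # the supervisor will simply try again in a second.
  for dep in /service/dbus /service/postgresql ; do
    sv check "$dep" >/dev/null 2>&1 || exit 1
  done
  exec myheavydaemon 2>&1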



If we fail, we go to finish, where the dependencies are notified
that they aren't needed by our process; it's up to either B or B's
supervisor (not sure yet) to decide if B needs to go down.


 And thus you put service management logic into the supervisor.
I hate these #blurredlines. :P



 Looping A
seems undesirable until you realize that your issue isn't A, it's B.
And when A fails to start, there should be a notification to the
effect "can't start A because of B".


 But what are you trying to achieve with this that you cannot already
do ? Why can't you just let A be restarted until it works ?
If A is too heavy, then specific guards can be put in place, but that
should be the exception, not the rule.

 What I can envision is keeping the global "wanted" state somewhere
along with the global "actual" state, and writing primitives, to be
called from run scripts, that block until the needed "actual" sub-state
matches the needed "wanted" sub-state. That way, processes that choose
it can block, instead of loop, until their dependencies are met. But
that's a job for a global dependency manager, not a supervision suite,
and there is still no need to modify the supervisor.
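
 For the "actual" side, s6 already ships something close: s6-svwait
blocks until the given service directories reach the requested state, so
a run script that chooses to block could do (sketch; service names are
examples):

  #!/bin/sh
  # Block until both dependencies are up, then start.
  s6-svwait -u /service/dbus /service/udevd
  exec mydaemon 2>&1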



So you can see, supervisors don't talk to each other, the overlord
process pretty much stays as it is


 Not at all - the changes you suggest would be quite heavy on the
overlord and the supervisor. Today, overlords can send signals to
supervisors and that's it; whereas what you want involves real two-way
communication. Sure, it's simple communication with a trivial protocol,
but it's still a significant architectural change and complexity
increase.



Yes, indirectly.  Just because A wants to start, but can't because B
is having its own issues.  I see it as separation of responsibilities
- B has to get itself off the ground to start running, but
B-the-process doesn't have to be aware of A's needs, that's a problem
for B's supervisor.


 Yes. So let B do its thing, and let A do its thing, and everything will
be all right eventually. If looping A is a problem, then add something in
A's run script that prevents it from doing so.

Re: Process Dependency?

2014-10-31 Thread Avery Payne
Part of the temptation I've been fighting with templates is to write "the
grand unified template", that does it all.  It sounds horrible and barely
feasible, but the more I poke at it, the more I realize that there is a
specific, constrained set of requirements that could be met in a single
script...under the right circumstances.  The reality is there will be more
than one template in this arrangement.  One that is "simple service", one
that covers the unique needs of "getties", one that needs a
"var/run-and-pid" (which is just simple-service with extras), and one that
I haven't done yet that I call "swiss army knife", the script of scripts.
There are still lots of "one-offs" that will be needed, and that shows the
limits of what can be done.  All of these solve the current issues with
process management, and as Laurent has pointed out, *none* of them address
service management.  As a stop-gap, until service management is really
ready, the plan is to temporarily patch over the issue by having smaller
processes manage the state of services, and then controlling them through
process management (and again, Laurent has pointed out this is
sub-optimal). An example is using per-interface instances of dhcpcd (no,
not *dhcpd*) to manage each interface.  This is heavy and bloats up the
process tree for larger systems because a single process is needed for each
instance of state to manage, when the kernel itself is already
using/managing that state.

With regard to coming up with something akin to a domain-specific language
in the form of a specification for services, this is ideal and solves
plenty. I would love, love, LOVE to see a JSON specification that addressed
the 9+ specific needs of starting processes as a base point, and then
extend it to provide full service coverage, becoming a domain-specific
language that encompasses what is needed.  This would be backwards
compatible with daemontools/runit/s6 (within the limitations of the
environment), forwards compatible with future service management, and would
completely supplant the need for templates.  I'd like to hear more.
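
Just to make that concrete, a single service definition might look
something like this (every field here is invented for illustration; no
such schema exists yet):

  {
    "name": "nginx",
    "run": "/usr/sbin/nginx -g 'daemon off;'",
    "user": "www-data",
    "env": { "LANG": "C" },
    "needs": [ "network", "syslogd" ],
    "log": { "run": "s6-log t /var/log/nginx" }
  }

A generator would read that and emit a daemontools-, runit-, or s6-style
run script plus the matching log/run, and a future service manager could
consume the "needs" list directly.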

On Fri, Oct 31, 2014 at 2:40 AM, John Albietz 
wrote:

> Script generators are the way I've been leaning.
>
> It's really convenient to have one or more services defined in some kind
> of structured data format like yaml or json and to then generate and
> install the service scripts.
>
> I wish there was a standard format to define services so the generator
> could take one input file and output appropriate service scripts for
> different process supervisor systems.
>
> Anyone seen any efforts in this direction? Most upstart and sysv scripts
> have standard boilerplate, so it looks like there are common standards that
> could be derived.
>
> - John
>
> > On Oct 31, 2014, at 1:05 AM, Laurent Bercot 
> wrote:
> >
> >
> > First, I need to apologize, because I spend too much time talking and
> > dismissing ideas, and not enough time coding and releasing stuff. The
> > thing is, coding requires sizeable amounts of uninterrupted time - which
> > I definitely do not have at the moment and won't until December or so -
> > while writing to the mailing-list is a much lighter commitment. Please
> > don't see me as the guy who criticizes initiatives and isn't helping or
> > doing anything productive. That's not who I am. (Usually.)
> >
> > On to your idea.
> > What you're suggesting is implementing the dependency tree in the
> > filesystem and having the supervisor consult it.
> > The thing is, it's indeed a good thing (for code simplicity etc.) to
> > implement the dependency tree in the filesystem, but that does not make
> > it a good idea to make the process supervision tree handle dependencies
> > itself!
> >
> > (Additionally, the implementation should be slightly different, because
> > your ./needs directory only addresses service startup, and you would also
> > need a reverse dependency graph for service shutdown. This is an
> > implementation detail - it needs to be solved, but it's not the main
> > problem I see with your proposal. The symlink idea itself is sound.)
> >
> > The design issues I see are:
> >
> > * First and foremost, as always, services can be more than processes.
> > Your design would only handle dependencies between long-lived processes;
> > those are easy to solve and are not the annoying part of service
> > management, iow I don't think "process dependency" is worth tackling
> > per se. Dependencies between long-lived processes and machine state that
> > is *not* represented by long-lived processes is the critical part of
> dependency management, and what supervision frameworks are really lacking
> today.

Fwd: Process Dependency?

2014-10-31 Thread Avery Payne
A message was dropped...passing it along as part of the discussion
-- Forwarded message --
From: Casper Ti. Vector 
Date: Fri, Oct 31, 2014 at 3:37 AM
Subject: Re: Process Dependency?
To: Avery Payne 


Sorry, but I just found that I did not list-reply your original mail, so
this practically became a private message.  Nevertheless, you may forward
this message to the mailing list if you consider it favourable :)

On Fri, Oct 31, 2014 at 08:15:03AM +0800, Casper Ti. Vector wrote:
> For one already implemented way of dependency interface in
> daemontools-like service managers, you can have a look at how nosh does
> it.

--
My current OpenPGP key:
4096R/0xE18262B5D9BF213A (expires: 2017.1.1)
D69C 1828 2BF2 755D C383 D7B2 E182 62B5 D9BF 213A


Re: Process Dependency?

2014-10-31 Thread John Albietz
Script generators are the way I've been leaning. 

It's really convenient to have one or more services defined in some kind of 
structured data format like yaml or json and to then generate and install the 
service scripts. 

I wish there was a standard format to define services so the generator could 
take one input file and output appropriate service scripts for different 
process supervisor systems. 

Anyone seen any efforts in this direction? Most upstart and sysv scripts have 
standard boilerplate, so it looks like there are common standards that could be 
derived. 

- John 

> On Oct 31, 2014, at 1:05 AM, Laurent Bercot  
> wrote:
> 
> 
> First, I need to apologize, because I spend too much time talking and
> dismissing ideas, and not enough time coding and releasing stuff. The
> thing is, coding requires sizeable amounts of uninterrupted time - which
> I definitely do not have at the moment and won't until December or so -
> while writing to the mailing-list is a much lighter commitment. Please
> don't see me as the guy who criticizes initiatives and isn't helping or
> doing anything productive. That's not who I am. (Usually.)
> 
> On to your idea.
> What you're suggesting is implementing the dependency tree in the
> filesystem and having the supervisor consult it.
> The thing is, it's indeed a good thing (for code simplicity etc.) to
> implement the dependency tree in the filesystem, but that does not make
> it a good idea to make the process supervision tree handle dependencies
> itself!
> 
> (Additionally, the implementation should be slightly different, because
> your ./needs directory only addresses service startup, and you would also
> need a reverse dependency graph for service shutdown. This is an
> implementation detail - it needs to be solved, but it's not the main
> problem I see with your proposal. The symlink idea itself is sound.)
> 
> The design issues I see are:
> 
> * First and foremost, as always, services can be more than processes.
> Your design would only handle dependencies between long-lived processes;
> those are easy to solve and are not the annoying part of service
> management, iow I don't think "process dependency" is worth tackling
> per se. Dependencies between long-lived processes and machine state that
> is *not* represented by long-lived processes is the critical part of
> dependency management, and what supervision frameworks are really lacking
> today.
> 
> * Let's focus on "process dependency" for a bit. Current process
> supervision roughly handles 4 states: the wanted up/down state x the
> actual up/down state. This is a simple model that works well for
> maintaining daemons, but what happens when you establish dependencies
> across daemons ? What does it mean for the supervisor that "A needs B" ?
> 
>   - Does that just mean that B should be started before A at boot
> time, and that A should be stopped before B at shutdown time ? That's
> sensible, but it simply translates to "have a constraint on the order
> of the wanted up/down state changes at startup or shutdown". Which can
> be handled by the init and shutdown scripts, without the need for direct
> support in the supervision framework; an offline init script generator
> could analyze the dependency tree and output the proper script, which
> would contain the appropriate calls to sv or s6-svc in the correct order.
> 
>   - Or does that mean that every time A is started (even if it is
> already wanted up and has just unexpectedly died) the supervisor should
> check the state of B and not restart A if B happens to be down ? What
> would be the benefit of that behaviour over the current one which is
> "try and restart A after one second no matter what" ? If B is supposed to
> be up, then A should restart without making a fuss. If B is wanted up but
> happens to be down, it will be back up at some point and A will then be
> able to start again. If B is not wanted up, then why is A ? The dependency
> management system has not properly set the wanted states and that is the
> problem that needs to be fixed.
>  * Supervisors currently have no way of notifying their parent, and they
> don't need to. Their parent simply maintains them; supervisors are pretty
> much independent. You can even run s6-supervise/runsv without s6-svscan/
> runsvdir, even if that won't build a complete supervision tree. The point
> is that a supervisor maintains one process according to its wanted state,
> and that's it. (With an optional logger for runsv.) Adding a notification
> mechanism from the supervisor to its parent (other than dying and sending
> a SIGCHLD, obviously) would be a heavy change in the design and take away
> from the modularity.

Re: Process Dependency?

2014-10-31 Thread Laurent Bercot


 First, I need to apologize, because I spend too much time talking and
dismissing ideas, and not enough time coding and releasing stuff. The
thing is, coding requires sizeable amounts of uninterrupted time - which
I definitely do not have at the moment and won't until December or so -
while writing to the mailing-list is a much lighter commitment. Please
don't see me as the guy who criticizes initiatives and isn't helping or
doing anything productive. That's not who I am. (Usually.)

 On to your idea.
 What you're suggesting is implementing the dependency tree in the
filesystem and having the supervisor consult it.
 The thing is, it's indeed a good thing (for code simplicity etc.) to
implement the dependency tree in the filesystem, but that does not make
it a good idea to make the process supervision tree handle dependencies
itself!

 (Additionally, the implementation should be slightly different, because
your ./needs directory only addresses service startup, and you would also
need a reverse dependency graph for service shutdown. This is an
implementation detail - it needs to be solved, but it's not the main
problem I see with your proposal. The symlink idea itself is sound.)

 The design issues I see are:

 * First and foremost, as always, services can be more than processes.
Your design would only handle dependencies between long-lived processes;
those are easy to solve and are not the annoying part of service
management, iow I don't think "process dependency" is worth tackling
per se. Dependencies between long-lived processes and machine state that
is *not* represented by long-lived processes is the critical part of
dependency management, and what supervision frameworks are really lacking
today.

 * Let's focus on "process dependency" for a bit. Current process
supervision roughly handles 4 states: the wanted up/down state x the
actual up/down state. This is a simple model that works well for
maintaining daemons, but what happens when you establish dependencies
across daemons ? What does it mean for the supervisor that "A needs B" ?

   - Does that just mean that B should be started before A at boot
time, and that A should be stopped before B at shutdown time ? That's
sensible, but it simply translates to "have a constraint on the order
of the wanted up/down state changes at startup or shutdown". Which can
be handled by the init and shutdown scripts, without the need for direct
support in the supervision framework; an offline init script generator
could analyze the dependency tree and output the proper script, which
would contain the appropriate calls to sv or s6-svc in the correct order.

   - Or does that mean that every time A is started (even if it is
already wanted up and has just unexpectedly died) the supervisor should
check the state of B and not restart A if B happens to be down ? What
would be the benefit of that behaviour over the current one which is
"try and restart A after one second no matter what" ? If B is supposed to
be up, then A should restart without making a fuss. If B is wanted up but
happens to be down, it will be back up at some point and A will then be
able to start again. If B is not wanted up, then why is A ? The dependency
management system has not properly set the wanted states and that is the
problem that needs to be fixed.
 
 * Supervisors currently have no way of notifying their parent, and they
don't need to. Their parent simply maintains them; supervisors are pretty
much independent. You can even run s6-supervise/runsv without s6-svscan/
runsvdir, even if that won't build a complete supervision tree. The point
is that a supervisor maintains one process according to its wanted state,
and that's it. (With an optional logger for runsv.) Adding a notification
mechanism from the supervisor to its parent (other than dying and sending
a SIGCHLD, obviously) would be a heavy change in the design and take away
from the modularity. It can be done, but not without overwhelming benefits
to it; and so far I've found that all the additional stuff that we might
need would be best handled *outside of* the supervisors themselves.

 The more I think about it, the more convinced I am that script generators
are the way to go for dependency management, and service management in
general. Script generators can take input in any format that we want, and
output correct startup/shutdown sequences, and correct run scripts, using
the infrastructure and tools we already have without adding complexity
to them. It's something I will definitely be looking into.
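
 For instance, for "A needs B", a generated startup fragment could be as
simple as this (sketch; the timeout value is arbitrary):

  #!/bin/sh
  # Ordered state changes only - nothing in the supervisors changes.
  s6-svc -u /service/B
  s6-svwait -t 5000 -u /service/B || exit 1  # give B five seconds to come up
  s6-svc -u /service/A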

--
 Laurent


Process Dependency?

2014-10-30 Thread Avery Payne
I know that most process management tools look at having the script do the
heavy lifting (or at least carry the load by calling other tools) when
trying to bring up dependencies.  Couldn't we just have a (service)/needs
directory?

The idea is that the launch program (s6-supervise or runsv) would "see" the
./needs directory the same way it would "see" the ./log directory.  Each
entry in ./needs is a symlink to an existing (service) directory, so
everything needed to start (dependency) is already available.  The
(service) launcher would notify its *parent* that it wants those launched,
and it would be the parent's responsibility to bring up each process
entry.  For s6 the parent would be s6-svscan, for runit it would be
runsvdir.  During this time the launcher simply waits until it is signaled
to either proceed, or to abort and clean up.  Once all dependency entries
are up, the parent would signal that the launcher can proceed to start
./run.  There isn't much state-tracking beyond the signals, and the
symlinks keep the additional memory requirement small.  The existing
mechanisms for checking processes remain in place, and can be re-used to
ensure that a dependent process didn't die before the ./run script turns
over control to (service).  Just about all ./run scripts remain as-is and
even if they "launch a dependency" they continue to work (because it's
already launched).
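
On disk, the layout would look something like this (service names are
just examples):

  /service/A/run
  /service/A/needs/B -> /service/B   # e.g. ln -s /service/B /service/A/needs/B
  /service/A/needs/C -> /service/C
  /service/B/run
  /service/C/run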

What are the hidden issues that I'm not aware of?