Re: Thoughts on First Class Services

2015-04-29 Thread Avery Payne
Note: this re-post is due to an error I made earlier today.  I've gutted 
out a bunch of stuff as well.  My apologies for the duplication.


On 4/28/2015 11:34 AM, Laurent Bercot wrote:


I'm also interested in Avery's experience with dependency handling.


Hm.  Today isn't the best day to write this (having been up since 4am) 
but I'll try to digest all the little bits and pieces into something.  
Here we go...


First, I will qualify a few things.  The project's scope is, compared to 
a lot of the discussion on the mailing list, very narrow.  There are 
several goals but the primary thrust of the project is to create a 
generic, universal set of service definitions that could be plugged into 
many init, distribution, and supervision framework arrangements. That's 
a tall order in itself, but there are ways around a lot of this.  So
while the next three paragraphs are off-topic, they are there to address
the three concerns just mentioned.


With regard to init work, I don't touch it.  Trying to describe a proper 
init sequence is already beyond the scope of the project. I'm leaving 
that to other implementers.


With regard to distributions, well, I'm trying to make it as generic as 
possible.  Development is done on a Debian 7 box but I have made efforts 
to avoid any Debian-isms in the actual project itself.  In theory, you 
should be able to use the scripts on any distribution.


With regard to the supervision programs used, the differences in command
names have been abstracted away.  I'm not entirely wild about how it is
currently done, but creating definitions is a higher priority than 
revisiting this at the moment.  In the future, I will probably 
restructure it.


~ ~ ~ ~ ~ ~ ~ ~

What
---
The dependency handling in supervision-scripts is meant for
installations that don't have any other dependency handling available.
Put another way, it's a Poor Man's Solution to the problem and functions
as a convenience.
The feature is turned off by default, and this will cause any service 
definition that requires other services to run-loop repeatedly until 
someone starts them manually.  This could be said to be the default 
behavior of most installations that don't have dependency handling, so 
I'm not introducing a disruptive behavior with this feature.


Why
---
I could have hard-coded many of the dependencies into the various run 
scripts, but this would have created a number of problems for other areas.


1. Hard-coding makes a future switch from shell to execline much harder,
because every script would need to be re-written.  There will be an
estimated 1,000+ scripts when the project is complete, so this is a
major concern.


2. We are already using the filesystem as an ad-hoc database, so it 
makes sense to continue with this concept.  The dependencies should be 
stored on the filesystem and not inside of the script.


With this in mind, I picked sv/(service)/needs as a directory to hold 
the definitions to be used.  Because I can't envision what every init 
and future dependency management framework would look like, I'll simply 
make it as generic as I can, leaving things as open as possible to 
additional changes.  A side note: it is by fortuitous circumstance
that anopa uses a ./needs directory that has the same functionality and 
behavior.  I use soft links just because.  Anopa uses named files.  
The net effect is the same.


Each dependency is simply a named soft link that points to a service 
that needs to be started, typically something like 
sv/(service)/needs/foobar points to /service/foobar.  In this case, a 
soft link is made with the name of the service, pointing to the service 
definition in /service.  This also allows me to ensure that the 
dependency is actually available, and not just assume that it is there.
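
As a concrete sketch, suppose a hypothetical service named mydaemon needs
foobar; the link would be created roughly like this (mydaemon is purely
illustrative):

    # "mydaemon needs foobar": a soft link named after the dependency,
    # pointing at foobar's definition in the scan directory
    ln -s /service/foobar sv/mydaemon/needs/foobar

    # if foobar was never installed, the link dangles and a run script
    # can notice that before trying to start anything
    test -e sv/mydaemon/needs/foobar || echo "dependency foobar is missing"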


A single rule determines what goes into ./needs: it may only contain the
names of other services that are explicitly and immediately needed.  You
can say "foo needs baz" and "baz needs bar", but NEVER "foo needs baz and
foo needs bar".  This is intentional because it's not the job of the
starting service to handle the entire chain.  It simplifies the list of 
dependencies because a service will only worry about its immediate 
needs, and not the needs of dependent services it launches.  It also has 
the desirable property of making dependency chains self-organizing, 
which is an important decision with hundreds of services having 
potentially hundreds of dependencies.  Setup is straightforward and you
can easily extend a service's needs by adding one soft link to the new
dependency.  This also fits with my current use of a single launch 
script; I don't have to change the script, just the parameters that the 
script uses.  The new soft link becomes just another parameter.  You 
could call this peer-level dependency resolution if you like.
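
Laid out on disk, the foo/baz/bar example above looks something like this
(the names are placeholders, of course):

    sv/foo/needs/baz -> /service/baz    # foo lists only its immediate need
    sv/baz/needs/bar -> /service/bar    # bar is baz's problem, not foo's
    sv/bar/needs/                       # empty: bar needs nothing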



How
---
Enabling this behavior requires that you set sv/.env/NEEDS_ENABLED to
the single character 1.  It is normally set to 0.  With the setting
disabled (zero), the entire dependency-handling block in the run script
is skipped.
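
For example, flipping the switch for the whole tree is just (paths taken
from the text above):

    echo 1 > sv/.env/NEEDS_ENABLED    # enable dependency handling
    echo 0 > sv/.env/NEEDS_ENABLED    # back to the shipped default

Each run script then gates its dependency block on a test along the lines
of "test $( cat ../.env/NEEDS_ENABLED ) -gt 0".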

Re: Thoughts on First Class Services

2015-04-28 Thread bougyman
On Tue, Apr 28, 2015 at 8:38 AM, Avery Payne avery.p.pa...@gmail.com wrote:

 The steps described are explicit and hard-coded into a service's ./run
 script.  This design works just OK and ensures complicated start-up
 sequences can be carried out, but it does not allow easy replacement of
 child services.  The catch is that the name of child(ren) will be baked
 into the parent ./run script.

I guess I don't know what this means, in practice. My child services
generally know about the
parent in their ./run script and the parent (sometimes) has to know
about the children in his
./finish script.


 The loosely-coupled approach taken by anopa (and my own work) allows easy
 replacement of a child service with some other service, at the cost of
 slightly increased complexity in the child's run script.

My child scripts are definitely more complex than the parents'.
I've never had a problem replacing a child service; it's the children
who have to know about their parents,
and not the other way around. In practical terms:

/etc/sv/postgresql/run does not know about any of his children, but
/etc/sv/pgbouncer/run does know about his parent (postgresql) and
/etc/sv/app_which_uses_pg/run does know about his parent (pgbouncer),
but not his grandparent.

None of the parents know anything about their children.
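
In runit terms, something along these lines (a sketch, not the literal
scripts; paths and options will vary):

    #!/bin/sh
    # /etc/sv/pgbouncer/run: the child waits on its parent, postgresql
    exec 2>&1
    sv check postgresql || exit 1    # parent not up yet: exit, let runit retry
    exec chpst -u pgbouncer pgbouncer /etc/pgbouncer/pgbouncer.ini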

 The net effect is
 the same but the difference is subtle.


I'd like to see the difference in a code example. I haven't had a
chance to dig into anopa enough yet
to see how they couple it more loosely.

Tj


Re: Thoughts on First Class Services

2015-04-28 Thread Avery Payne

Dang it.  Hit the send button.

It will be a bit, I'll follow up with the completed email.  Sorry for 
the half-baked posting.


Re: Thoughts on First Class Services

2015-04-28 Thread bougyman
On Tue, Apr 28, 2015 at 12:31 PM, Steve Litt sl...@troubleshooters.com wrote:

 Good! I was about to ask the definitions of parent and child, but the
 preceding makes it clear.

Well at least we're talking the same language now, though reversing
parent/child is
disconcerting to my OCD.


 Here's the current version of run.sh, with dependency support baked
 in:
 https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default


 That's a gnarley run script. It's as big as a lot of sysvinit or OpenRC
 scripts I've seen. One of the reasons I like daemontools style package
 management is my run scripts are usually less than 10 lines.

This was my thought, as well. It adds a level of complexity we try to
avoid in our run scripts.
It also seems to me that there is less typing involved in writing
individual run scripts than in configuring all the individual pieces
this script needs. If one goal of this abstraction is to minimize
mistakes, adding more moving parts to edit doesn't seem to work
towards that goal.

 And, as you said in a past email, having a run-once capability without
 insane kludges would be nice, and as you said in another past email,
 it's not enough to test for the child service to be up according to
 runit, but it must pass a test to indicate the process itself is
 functional. I've been doing that ever since you mentioned it.

When run-once is necessary we decide per-service whether it's a true
one-shot (networking,
etc.) or a spooky background daemon that we may want to watch. If the
latter, the ./run script contains our (probably polling) supervisor; if
it's a networking-type job, we end with exec /bin/pause (part of
runit-void).
For the 'making sure it is actually up', a ./check script suffices.
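
Roughly, for the networking-type case (the interface and commands are just
an example):

    #!/bin/sh
    # ./run for a one-shot networking job
    exec 2>&1
    ip link set dev eth0 up    # the one-time work
    exec /bin/pause            # park the process so the supervisor sees "up"

    #!/bin/sh
    # ./check: succeed only once the work has actually taken effect
    ip link show eth0 | grep -q 'state UP'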

Tj


Re: Thoughts on First Class Services

2015-04-28 Thread Avery Payne



On 4/28/2015 10:50 AM, bougyman wrote:
Well at least we're talking the same language now, though reversing 
parent/child is disconcerting to my OCD. 


Sorry if the terminology is reversed.


Here's the current version of run.sh, with dependency support baked
in:
https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default


That's a gnarley run script. It's as big as a lot of sysvinit or OpenRC
scripts I've seen. One of the reasons I like daemontools style package
management is my run scripts are usually less than 10 lines.


This was my thought, as well. It adds a level of complexity we try to
avoid in our run scripts.
It also seems to me that there is less typing involved in writing
individual run scripts than in configuring all the individual pieces
this script needs. If one goal of this abstraction is to minimize
mistakes, adding more moving parts to edit doesn't seem to work
towards that goal.


Currently there are the following sections, in sequence (a rough sketch
follows the list):

1. shunt stderr to stdout for logging purposes
2. shunt supporting symlinks into the $PATH so that tools are called 
correctly.  This is critical to supporting more than just a single 
framework; all of the programs referenced in .bin are actually symlinks 
that point to the correct program to run.  See the .bin/use-* scripts 
for details.
3. if a definition is broken in some way, then immediately write a 
message to the log and abort the run.
4. if dependency handling is enabled, then process dependencies. 
Otherwise, just skip the entire thing.  By default, dependencies are 
disabled; this means ./run scripts behave as if they have no dependency 
support.
4a. should dependency handling fail, log the failing child in the
parent's log, and abort the run.
5. figure out if user or group IDs are in use, and define them.
6. figure out if a run state directory is needed.  If so, set it up.
7. start the daemon.
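
As a rough sketch of how those sections hang together (env/DAEMON and
svc-start are illustrative stand-ins, not the names the project actually
uses):

    #!/bin/sh
    exec 2>&1                                  # 1. stderr joins stdout for the logger
    PATH=../.bin:$PATH                         # 2. framework abstraction symlinks first

    [ -r ./env/DAEMON ] || {                   # 3. broken definition: log it and abort
        echo "definition is incomplete, refusing to start"; exit 1; }
    DAEMON=$( cat ./env/DAEMON )

    if test $( cat ../.env/NEEDS_ENABLED ) -gt 0; then   # 4. dependency handling, if enabled
        for dep in ./needs/*; do
            [ -e "$dep" ] || continue          # no links present; nothing to do
            name=$( basename "$dep" )
            svc-start "$name" || {             # 4a. failing child: log it, abort this run
                echo "could not start dependency $name"; exit 1; }
        done
    fi

    # 5. and 6. (user/group lookup and run-state directory setup omitted here)
    exec "$DAEMON"                             # 7. finally, the daemon itself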



Re: Thoughts on First Class Services

2015-04-28 Thread Avery Payne

On 4/28/2015 11:34 AM, Laurent Bercot wrote:


 If a lot of people would like to participate but don't want to
subscribe to the skaware mailing-list, I'll move the thread here.

Good point, I'm going to stop discussion here and go over there, where 
the discussion belongs.


Re: Thoughts on First Class Services

2015-04-28 Thread Avery Payne



On 4/28/2015 10:31 AM, Steve Litt wrote:

Good! I was about to ask the definitions of parent and child, but the
preceding makes it clear.


I'm taking it from the viewpoint that says the service that the user 
wishes to start is the parent of all other service dependencies that 
must start.



So what you're doing here is minimizing polling, right? Instead of
saying whoops, child not running yet, continue the runit loop, you
actually start the child, the hope being that no service will ever be
skipped and have to wait for the next iteration. Do I have that right?

Kinda.  A failure of a single child still causes a run loop, but the
next time around, some of the children are already started, and a start 
of the child will quickly return a success, allowing the script to skip 
over it quickly until it is looking at the same problem child from the 
last time.  The time lost is only on failed starts, and child starts 
typically don't take that long.  If they do, well, it's not the
parent's fault...


  

Here's the current version of run.sh, with dependency support baked
in:
https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default


That's a gnarley run script.


Yup.  For the moment.


If I'm not mistaken, everything inside the if test
$( cat ../.env/NEEDS_ENABLED ) -gt 0; then block is boilerplate that
could be put inside a shellscript callable from any ./run.


True, and that idea has merit.


  That would
hack off 45 lines right there. I think you could do something similar
with everything between lines 83 and 110. The person who is truly
interested in the low level details could look at the called
shellscripts (perhaps called with the dot operator). I'm thinking you
could knock this ./run down to less than 35 lines of shellscript by
putting boilerplate in shellscripts.

I've seen this done in other projects, and for the sake of simplicity
(and reducing subshell spawns) I've tried to avoid it. But that doesn't 
mean I'm against the idea.  Certainly, all of these are improvements 
with merit, provided that they don't interfere with some of the other 
project goals.  If I can get the time to look at all of it, I'll 
re-write it by segmenting out the various components.
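
For what it's worth, the dot-operator route would also sidestep the
subshell concern, since the fragment runs inside the ./run shell itself;
something like this (the file name is purely illustrative):

    # inside ./run, in place of the inline dependency block:
    . ../.run/needs.sh    # the dot operator runs the fragment in this shell, no subshell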


In fact, you may have given me an idea to solve an existing problem I'm 
having with certain daemons...




You're doing more of a recursive start. No doubt, when there are two or
three levels of dependency and services take a non-trivial amount of
time to start (seconds), yours results in the quicker boot. But for
typical stuff, I'd imagine the old wait til next time if your ducks
aren't in line will be almost as fast, will be conceptually
simpler, and more codeable by the end user. Not because your method is
any harder, but because you're applying it against a program whose
native behavior is wait til next cycle.


Actually, I was looking for the lowest-cost solution to "how do I keep
track of dependency trees between multiple services".  The result was a
self-organizing set of data and scripts.  I don't manage *anything*
beyond "service A must have service B".  It doesn't matter how deep that
dependency tree goes, or even if there are common leaf nodes at the 
end of the tree, because it self-organizes.  This reduces my cognitive 
workload; as the project grows to hundreds of scripts, the number of 
possible combinations reaches a point where it would be unmanageable 
otherwise.  Using this approach means I don't care how many there are, I 
only care about what is needed for a specific service.



And, as you said in a past email, having a run-once capability without
insane kludges would be nice, and as you said in another past email,
it's not enough to test for the child service to be up according to
runit, but it must pass a test to indicate the process itself is
functional. I've been doing that ever since you mentioned it.


At some point I have to go back and start writing ./check scripts. :(


Re: Thoughts on First Class Services

2015-04-28 Thread Laurent Bercot

On 28/04/2015 20:49, Avery Payne wrote:

Good point, I'm going to stop discussion here and go over there, where the 
discussion belongs.


 That's not what I meant :)
 Keep your thread here, it interests people who are not subscribed
to skaware, and it will be simpler. But please come over there to
give your opinion on how s6-rc should do things. :)

--
 Laurent