RE: thoughts on rudimentary dependency handling

2015-01-06 Thread James Powell
The way I see it is this... Either you have high-level dependency handling 
within the service supervision system itself, or you have low-level dependency 
handling within the service execution files.

Keeping the system as simple as possible lowers the probability of issues. 
That's one of the basic rules of UNIX programming.

I personally don't see a need for anything other than low-level handling. 
Systemd/uselessd even does this within the unit files, as do some instances of 
sysvinit/bsdinit, using numbered symlinks for execution order or a master 
control script.

Sent from my Windows Phone

From: Avery Payne
Sent: 1/6/2015 10:35 PM
To: Steve Litt
Cc: supervision@list.skarnet.org
Subject: Re: thoughts on rudimentary dependency handling






Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Avery Payne




> On Jan 6, 2015, at 4:56 PM, Steve Litt  wrote:
> 
> On Tue, 6 Jan 2015 13:17:39 -0800
> Avery Payne  wrote:
> 
>> On Tue, Jan 6, 2015 at 10:20 AM, Laurent Bercot
>> wrote:
>>> 
>>> 
>>> I firmly believe that a tool, no matter what it is, should do what
>>> the user wants, even if it's wrong or can't possibly work. If you
>>> cannot do what the user wants, don't try to be smart; yell at the
>>> user, spam the logs if necessary, and fail. But don't do anything
>>> the user has not explicitly told you to do.
>> 
>> And there's the rub.  I'm at a crossroad with regard to this because:
>> 
>> 1. The user wants service A to run.
>> 2. Service A needs B (and possibly C) running, or it will fail.
>> 
>> Should the service fail because of B and C, even though the user
>> wants A up,
>> 
>> or
>> 
>> Should the service start B and C because the user requested A be
>> running?
> 
> I thought the way to do the latter was like this:
> 
> http://smarden.org/runit/faq.html#depends
> 
> If every "upstream" simply declared that his program needs B and C
> running before his program runs, it's easy to translate that into an sv
> start command in the run script.

Normally, with hand-written scripts, that would be the case.  You cobble 
together what is needed and go on your way.  But this time things are a little 
different.  The project I'm doing uses templates - pre-written scripts that 
factor the various launch-time differences out into variables while sharing the 
same code.  This reduces development time and bugs - write the template once, 
debug it once, and reuse it over and over.

The idea for dependencies is that I could write something that looks at 
symlinks in a directory and if the template finds anything, it starts the 
dependency.  Otherwise it remains blissfully unaware. 
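
For illustration, a minimal sketch of what such a template fragment could look
like, assuming a runit-style layout (the flag file, paths, and daemon name are
all made up, and s6/daemontools would use their own tools):

#!/bin/sh
# Hypothetical template fragment: only walk ./needs when the admin opts in.
if [ -e ./env/USE_NEEDS ]; then
    for dep in ./needs/*; do
        [ -e "$dep" ] || continue       # no symlinks: stay blissfully unaware
        # runit shown; s6 would use s6-svc -u plus a readiness check
        sv start "$dep" || { echo "needed service $dep is not up" >&2; exit 1; }
    done
fi
exec my-daemon 2>&1                     # placeholder for the real daemon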

The point Laurent is arguing is that by creating such a framework - even one 
this thin and carefully planned - I shut out future possibilities that would 
handle this correctly at a very high level, without the need to write or 
maintain this layout.  To accommodate that possibility and to maximize 
compatibility, he is arguing to stay with the same "service is unaware of its 
surroundings" approach that has been part of the design of all the frameworks - 
let things fail and make the admin fix it.  There is merit to this, not just 
because of future expansions, but also because it leaves the end user (read: 
SysAdmin) the choice of what to run.  Or, long story short, "don't set policy, 
let the end user decide".  And I agree; I'm a bit old school about these things, 
and I picked up Linux over a decade ago because I wanted choices.

But from a practical perspective there isn't anything right now that handles 
dependencies at a global level.  The approach of "minimum knowledge needed and 
best effort separation of duty" would give a minimal environment for the time 
being.  The design is very decentralized and works at a peer level; no service 
definition knows anything other than what is linked in its ./needs directory, 
and side effects are minimized by trying hard to keep separation of duty.  
Because the scripts are distributed and meant to be installed as a whole, I can 
kinda get away with this because all of the dependencies are hard coded out of 
the box and the assumption is that you won't break something by tinkering with 
the links. But of course this runs against the setup that Laurent was 
discussing.  It's also brittle and that means something is wrong with the 
design.  It should be flexible. 

So for now, I will look at having a minimum implementation that starts things 
as needed.  But only if the user sets a flag to use that feature.  This keeps 
the scripts open for the future while giving a minimum functionality today that 
is relatively "safe".  So you don't get dependency resolution unless you 
specifically turn it on.  And turning it on comes with caveats. It's not a 
perfect solution, it has potential issues, and the user abdicates/delegates 
some of their decisions to my scripts.  A no-no, to be sure. But again, that's 
the user's choice to flip the switch on...

Also keep in mind that I'm bouncing ideas off of him and he is looking at it 
from a perspective much different from mine.  I'm taking a pragmatic approach 
that helps deal with an old situation - the frameworks would be more 
usable/adoptable if there were a baseline set of service definitions that 
allowed for a wholesale switch-over to using them from $(whatever you're using 
now).  So it's very pragmatic and legacy and "old school" for desktops and 
servers, and that's still needed because those systems are still in use 
everywhere.  It's basically a way to grandfather the frameworks in using 
the current technology by providing all of the missing "glue" in the form of 
service definitions.  And unfortunately dependency resolution is an old issue 
that has to be worked out as one of my goals, which is to increase ease of use 
so that people w

Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Steve Litt
On Tue, 6 Jan 2015 13:17:39 -0800
Avery Payne  wrote:

> On Tue, Jan 6, 2015 at 10:20 AM, Laurent Bercot
> wrote:
> >
> >
> >  I firmly believe that a tool, no matter what it is, should do what
> > the user wants, even if it's wrong or can't possibly work. If you
> > cannot do what the user wants, don't try to be smart; yell at the
> > user, spam the logs if necessary, and fail. But don't do anything
> > the user has not explicitly told you to do.
> >
> 
> And there's the rub.  I'm at a crossroad with regard to this because:
> 
> 1. The user wants service A to run.
> 2. Service A needs B (and possibly C) running, or it will fail.
> 
> Should the service fail because of B and C, even though the user
> wants A up,
> 
>  or
> 
> Should the service start B and C because the user requested A be
> running?

I thought the way to do the latter was like this:

http://smarden.org/runit/faq.html#depends

If every "upstream" simply declared that his program needs B and C
running before his program runs, it's easy to translate that into an sv
start command in the run script.
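
For instance, roughly the shape of that FAQ pattern (the service names here are
just the lightdm/dbus example from elsewhere in the thread):

#!/bin/sh
# Fail until the dependency reports ready; runsv keeps re-running this
# script, so the dependency check is retried automatically.
sv -w7 check dbus || exit 1
exec 2>&1 lightdm    # placeholder for the service's foreground invocation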

If "upstreams" declared this stuff, it would also be pretty easy to
write a script, with or without the make command, to do the whole
dependency tree. Given that not every init has "provides" ability,
perhaps a standard list of service names could be distributed. There's
already an /etc/services for port numbers: Maybe there could be
an /etc/servicenames for standard names for services.

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance



Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Laurent Bercot

On 06/01/2015 22:17, Avery Payne wrote:

And there's the rub.  I'm at a crossroad with regard to this because:

1. The user wants service A to run.
2. Service A needs B (and possibly C) running, or it will fail.

Should the service fail because of B and C, even though the user wants A up,

  or

Should the service start B and C because the user requested A be running?


 My answer is: there are two layers, and what to do depends on what exactly
the user is asking and whom it is asking.

 If the user is asking *A* to run, as in "s6-svc -u /service/A", and A needs B
and B is down, then A should fail.
 If the user is asking *the global state manager*, i.e. the upper layer, to
change the current state to the current state + A, then the global state
manager should look up its dependency database, see that in order to bring up
A it also needs to bring up B and C, and do it.

 If you have a global state manager, that is the entity the user should
communicate with to change the state of services. It is the only entity
deciding which individual service goes up or down; if you type
"s6-svc -u /service/A", the state manager should notice this and go
"nope, this is not the state I've been asked to enforce" and bring down
A immediately. But if you tell it to bring A up itself, then it should do
whatever it takes to fulfill your request, including bringing up B and C.

 This is Unix: for every shared resource, there should be one daemon which
centralizes access to that resource to avoid conflicts. If you want service
dependencies, then the set of service states becomes a resource, and you
need that centralization.

 I'll get to writing such a daemon at some point, but it won't be
tomorrow, so feel free to implement whatever fulfills your needs:
whoever writes the code is right. I just wanted to make sure you
don't start an underspecified kitchen sink that will end up being a
maintenance nightmare, and I would really advise you to stick to the naive,
dumb approach until you can commit to a full centralized state manager.
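
(Purely to make the two layers concrete - a deliberately naive polling sketch of
such a state manager, not Laurent's design; the wanted-state file and the
five-second poll are arbitrary assumptions:)

#!/bin/sh
# /etc/wanted-up: assumed file listing one service name per line;
# anything not listed is wanted down.
while :; do
    for dir in /service/*; do
        name=$(basename "$dir")
        if grep -qx "$name" /etc/wanted-up; then
            s6-svstat "$dir" | grep -q '^up'   || s6-svc -u "$dir"
        else
            s6-svstat "$dir" | grep -q '^down' || s6-svc -d "$dir"
        fi
    done
    # a real manager would subscribe to events and consult a dependency
    # database instead of blindly polling
    sleep 5
done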

--
 Laurent



Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Avery Payne
On Tue, Jan 6, 2015 at 10:20 AM, Laurent Bercot  wrote:
>
>
>  I firmly believe that a tool, no matter what it is, should do what the
> user wants, even if it's wrong or can't possibly work. If you cannot do
> what the user wants, don't try to be smart; yell at the user, spam the
> logs if necessary, and fail. But don't do anything the user has not
> explicitly told you to do.
>

And there's the rub.  I'm at a crossroad with regard to this because:

1. The user wants service A to run.
2. Service A needs B (and possibly C) running, or it will fail.

Should the service fail because of B and C, even though the user wants A up,

 or

Should the service start B and C because the user requested A be running?

For some, the first choice, which is to immediately fail, is perfectly
fine.  I can agree to that, and I understand the "why" of it, and it makes
sense.  But in other use cases, you'll have users that aren't looking at
this chain of details.  They asked for A to be up, why do they need to
bring up B, oh look there's C too...things suddenly look "broken", even
though they aren't.  I'm caught between making sure the script comes up,
and doing the right thing consistently.

I can certainly make the scripts "naive" of each other and not start
anything at all...and leave everything up to the administrator to figure
out how to get things working.  Currently this is how the majority of them
are done, and it wouldn't take much to change the rest to match this
behavior.

It's also occurred to me that instead of making the "dependency feature" a
requirement, I can make it optional.  It could be a feature that you choose
to activate by setting a file or environment variable.  Without the
setting, you would get the default behavior you are wanting to see, with no
dependency support; this would be the default "out of the box" experience.
With the setting, you get the automatic start-up that I think people will
want.  So the choice is back with the user, and they can decide.  That
actually might be the way to handle this, and both parties - the ones that
want full control and visibility, and the ones after ease of use - will get
what they want.  On the one hand I can assure that you will get working
scripts, because scripts that have dependencies can be made to work that
way.  On the other hand, if you want strict behavior, that is assured as
well.

The only drawback is you can't get both because of the limitations of the
environment that I am in.


RE: s6 init-stage1

2015-01-06 Thread James Powell
The problem with using sockets rather than named pipes is that each UNIX socket 
requires more POSIX shared memory, increasing the system's base resource 
requirements. Named pipes just use normal process memory, which keeps system 
requirements lower. That Lennart failed to mention this in the systemd 
presentation is insane.

Sent from my Windows Phone

From: post-sysv
Sent: 1/6/2015 12:03 PM
To: Laurent Bercot
Cc: supervision@list.skarnet.org
Subject: Re: s6 init-stage1



Execline: was s6 init-stage1

2015-01-06 Thread Steve Litt
On Tue, 06 Jan 2015 04:28:59 +0100
Laurent Bercot  wrote:


>   Far be it from me to discourage you from your noble quest ! But
> you could write it in sh and just use the 'redirfd' command from
> execline, which does the FIFO magic.

I read the documentation of execline, complete with the diagrams, and
didn't understand a word of it. I think a lot more documentation and a
lot of examples would help immensely.
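
(For what it's worth, the "FIFO magic" can also be driven from a plain sh
script. A sketch only - the fifo path is made up and the flags are from memory,
so check the redirfd page before trusting it:)

#!/bin/sh
# redirfd -w opens for writing, -n opens non-blocking (so it doesn't hang
# waiting for a reader), -b then switches the fd back to blocking mode;
# output accumulates in the FIFO until the catch-all logger reads it.
mkfifo -m 600 /run/uncaught-logs-fifo 2>/dev/null || true
exec redirfd -wnb 1 /run/uncaught-logs-fifo s6-svscan /run/service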

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance



Re: thoughts on rudimentary dependency handling

2015-01-06 Thread John Albietz
"Down by default" makes sense to me and will be a great feature.
I think that having it will require all services to have a 'down' script
that defines how to make sure that a service is actually down.

I wonder if this will help address a common situation for me where I
install a package and realize that at the end of the installation the
daemon is started using upstart or sysv.

At that point, to 'supervise' the app, I first have to stop the current
daemon and then start it up using runit or another process manager.

Otherwise I end up with two copies of the app running, with only one of them
being supervised.
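
(A hypothetical, Debian-flavoured sequence for that takeover - the package name
and paths are only examples:)

#!/bin/sh
# Move a freshly installed daemon from sysv/upstart control to runit.
service nginx stop                 # stop the copy the package started
update-rc.d nginx disable          # keep sysvinit from starting it at boot
mkdir -p /etc/sv/nginx
cat > /etc/sv/nginx/run <<'EOF'
#!/bin/sh
exec 2>&1
exec nginx -g 'daemon off;'        # must stay in the foreground to be supervised
EOF
chmod +x /etc/sv/nginx/run
ln -s /etc/sv/nginx /etc/service/  # runsvdir notices it and starts supervising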

John Albietz
m: 516-592-2372
e: inthecloud...@gmail.com
l'in: www.linkedin.com/in/inthecloud247

On Tue, Jan 6, 2015 at 10:20 AM, Laurent Bercot  wrote:

> On 06/01/2015 18:46, Avery Payne wrote:
>
>> 1. A service can ask another service to start.
>> 2. A service can only signal itself to go down.  It can never ask another
>> service to go down.
>> 3. A service can only mark itself with a ./down file.  It can never mark
>> another service with a ./down file.
>>
>> That's it.  Numbers 2 and 3 are the only times I would go against what you
>> are saying.
>>
>
>  So does number 1.
>  When you ask another service to start, you change the global state. If
> it is done automatically by any tool or service, this means the global
> state changes without the admin having requested it. This is trying to
> be smarter than the user, which is almost always a bad thing.
>
>  Number 2 is in the same boat. If the admin wants you to be up,
> then you can't just quit and decide that you will be down. You can't
> change the global state behind the admin's back.
>
>  Number 3 is different. It doesn't change the global state - unless there's
> a serious incident. But it breaks resiliency against that incident: it
> kills the guarantee the supervision tree offers you.
>
>  I firmly believe that a tool, no matter what it is, should do what the
> user wants, even if it's wrong or can't possibly work. If you cannot do
> what the user wants, don't try to be smart; yell at the user, spam the
> logs if necessary, and fail. But don't do anything the user has not
> explicitly told you to do.
>
>  Maybe a dependency manager needs to be smarter than that. In which case
> I would call it a "global state manager". There would be the "current
> state", which starts at "everything down" when the machine boots, and
> the "wanted state", which starts at "everything up"; as long as those
> states are not matched, the global state manager is running, implementing
> retry policies and such, and may change the global state at any time,
> bringing individual services up and down as it sees fit, with the
> ultimate goal of matching the global current state with the global wanted
> state. It could do stuff like exponential backoff so failing services
> would not be "wanted up" all the time; but it would never, ever change
> the global wanted state without an order to do so from the admin.
>
>  If you want a dependency manager with online properties, I think this
> is the way to do it.
>
> --
>  Laurent
>
>


Re: s6 init-stage1

2015-01-06 Thread post-sysv

On 01/06/2015 07:48 AM, Laurent Bercot wrote:

Interesting. Thanks for the heads-up - I had heard of tsort, but didn't
know exactly what it does.

 However, I'd like a tool that knows what steps it can parallelize.
A sequential output is great for functions name in a piece of code,
but for services, the point is to start as many as possible in
parallel, and minimize the amount of synchronization points.

For instance, given
1 2
3 4
meaning 2 should happen after 1, and 4 should happen after 3,
tsort gives
1
3
2
4
but instead, I need something like
1 3
2 4
because 1 and 3 can happen in parallel, and same for 2 and 4.

 AFAICT, tsort cannot do that. (make might not be able to either,
but since it's more complex, it's harder to tell.)



 About that. Actually, I'm not even certain if there exists a service
manager that *actually* starts processes in parallel. Usually what I've
noticed is that most of the time what is really meant is that services are
started asynchronously, or at best concurrently.

 Debian and other formerly sysvinit-based distributions had what was known
as a "Makefile-style concurrent boot". To the best of my knowledge, this was
done using a combination of LSB initscript headers through insserv, and a
program called startpar.

 Reading the source code of startpar, I was surprised to see that it does
its job through a primitive form of socket activation in the run() function,
where it allocates a so-called "preload" socket and determines exit status by
its availability for connection. Secondary routines include meddling with
ptys and file descriptors to curb interleaving and make sure the execution
state is clean and free of potentially blocking operations.

 Makes me wonder if Poettering ever read it, though his ostensible
inspiration was from launchd. That said, it does show that the systemd
supporters have overhyped the novelty of "socket activation" (inetd) even
more significantly than I had previously thought. Someone should make note
of this.

 In any event, I'm under the impression that most so-called parallel service
starters are really ones that start asynchronously in a clean execution
state, as true parallelism and even concurrency sounds conceptually quite
difficult, particularly when you keep in mind that many boot processes are
I/O-bound, primarily. systemd itself has a complex dependency system at its
backbone, with socket activation not being a mandatory thing from what I've
learned. It also blocks on occasion to fulfill start jobs, so evidently it
has synchronization methods that are contrary to its claims.

 If someone can clarify this issue or point to any concurrent/parallel
schemes for starting services at boot time that have been implemented, that
would be appreciated.


Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Laurent Bercot

On 06/01/2015 18:46, Avery Payne wrote:

1. A service can ask another service to start.
2. A service can only signal itself to go down.  It can never ask another
service to go down.
3. A service can only mark itself with a ./down file.  It can never mark
another service with a ./down file.

That's it.  Numbers 2 and 3 are the only times I would go against what you
are saying.


 So does number 1.
 When you ask another service to start, you change the global state. If
it is done automatically by any tool or service, this means the global
state changes without the admin having requested it. This is trying to
be smarter than the user, which is almost always a bad thing.

 Number 2 is in the same boat. If the admin wants you to be up,
then you can't just quit and decide that you will be down. You can't
change the global state behind the admin's back.

 Number 3 is different. It doesn't change the global state - unless there's
a serious incident. But it breaks resiliency against that incident: it
kills the guarantee the supervision tree offers you.

 I firmly believe that a tool, no matter what it is, should do what the
user wants, even if it's wrong or can't possibly work. If you cannot do
what the user wants, don't try to be smart; yell at the user, spam the
logs if necessary, and fail. But don't do anything the user has not
explicitly told you to do.

 Maybe a dependency manager needs to be smarter than that. In which case
I would call it a "global state manager". There would be the "current
state", which starts at "everything down" when the machine boots, and
the "wanted state", which starts at "everything up"; as long as those
states are not matched, the global state manager is running, implementing
retry policies and such, and may change the global state at any time,
bringing individual services up and down as it sees fit, with the
ultimate goal of matching the global current state with the global wanted
state. It could do stuff like exponential backoff so failing services
would not be "wanted up" all the time; but it would never, ever change
the global wanted state without an order to do so from the admin.

 If you want a dependency manager with online properties, I think this
is the way to do it.

--
 Laurent



Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Avery Payne
On Tue, Jan 6, 2015 at 8:52 AM, Laurent Bercot 
wrote:

>
>  I'm not sure exactly in what context your message needs to be taken
> - is that about a tool you have written or are writing, or something
> else ? - but if you're going to work on dependency management, it's
> important that you get it right. It's complex stuff that needs
> planning and thought.


This is in the context of "service definition A needs service definition B
to be up".


>
>  * implement a ./needs directory.  This would have symlinks to any
>> definitions that would be required to run before the main definition can
>> run.  For instance, Debian's version of lightdm requires that dbus be
>> running, or it will abort.  Should a ./needs not be met, the current
>> definition will receive a ./down file, write out a message indicating what
>> service blocked it from starting, and then will send a "down service" to
>> itself.
>>
>
>  For instance, I'm convinced that the approach you're taking here actually
> takes away from reliability. Down files are dangerous: they break the
> supervision chain guarantee. If the supervisor dies and is respawned by
> its parent, it *will not* restart the service if there's a down file.
> You want down files to be very temporary, for debugging or something,
> you don't want them to be a part of your normal operation.
>
>  If your dependency manager works online, you *will* bring services down
> when you don't want to. You *will* have more headaches making things work
> than if you had no dependency manager at all. I guarantee it.


I should have added some clarifications.  There are some basic rules I'm
using with regard to starting/stopping services:

1. A service can ask another service to start.
2. A service can only signal itself to go down.  It can never ask another
service to go down.
3. A service can only mark itself with a ./down file.  It can never mark
another service with a ./down file.

That's it.  Numbers 2 and 3 are the only times I would go against what you
are saying.  And since the only reason I would do that is because of some
failure that was unexpected, there would be a good reason to do so.  And in
all cases, there would be a message output as to why it signaled itself
back down, or why it marked itself with a ./down file.  The ./down file is,
I think, being used correctly - I'm trying to flag to the sysadmin that
"something is wrong with this service, and it shouldn't restart until you
fix it".

I'm sorry if the posting was confusing.  Hopefully the rules clarify when
and how I would be using these features.  I believe it should be safe if
they are confined within the context of the service definition itself, and
not other dependencies.  If there is something to the contrary that I'm
missing in those three rules, I'm listening.


Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Tue, Jan 6, 2015 at 6:27 AM, Avery Payne  wrote:
> On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot 
> wrote:
>  But on servers and embedded systems, / should definitely be read-only.
>> Having it read-write makes it susceptible to filesystem corruption,
>> which kills the guarantee that your machine will boot to at least a
>> debuggable state. A read-only / saves you the hassle of having a
>> recovery system.
>>
>
> Interesting concept.

I use xfs. If I'm going to use a journaling file system, I might as
well use one that doesn't have filesystem corruption.

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake


Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot
 wrote:
> On 06/01/2015 09:00, Colin Booth wrote:
>
>> 1. Depending on your initramfs and your on-disk layout you can skip
>> mounting proc and sys. I know this is the case with Debian, probably
>> true elsewhere as well.
>
>
>  It all depends on the assumptions that init-stage2 makes, but yes,
> now that you're mentioning it, mounting /proc and /sys may be
> delayed, as long as none of the very early services need them.
> Make sure the login process and interactive root shell do not need
> them either, because if init-stage2 fails very early, being able to
> log in will make debugging/recovery a lot easier.
>
In Debian's case, initramfs had already loaded /proc and /sys so
trying to mount them again was causing things to fail.
>
>> 2. If you aren't starting udev until init-stage2, you'll need to
>> manually mknod null and console devices before the "Reopen
>> stdin/stdout/stderr" comment.
>
>
>  That only applies to people who want a static /dev. Most people
> will run some flavour of udev, and will probably want to keep the
> devtmpfs mounted on /dev, in which case the kernel exports
> /dev/null and /dev/console itself. (Probably with the wrong rights,
> but they're functional enough to get by until udev runs.)
>
Hm, true. I guess that note is only if you are running with /dev as a
symlink to /mnt/tmpfs/dev since you get a tmpfs in that case. This is
what the init-stage1 script assumes. So, either make the nodes, run
udev as part of init-stage1, or use devtmpfs. I suggest the last :)
>
>> 3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
>> on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
>> loaded. Otherwise the catch-all logger won't be able to do its thing
>> as written. Same deal with /service, though that one is documented and
>> expected.
>
>
>  Actually, neither of those 3 things are needed for /tmp. :)
>  What *is* needed is a writable-by-root-only directory, to store the
> information init needs:
>  - The scan directory, which must be rw
>  - rw places to store the supervise/ and event/ subdirectories of
> the service directories, or a copy of the service directories
> themselves
>  - a rw place for the catch-all logger to run
>
>  /tmp is not ideal for this, for several reasons. One of which is
> as soon as stage 2 begins and user stuff runs on the system, creating
> files in /tmp isn't absolutely secure anymore, because filenames can
> be predicted and DoSsed. Another reason is conceptual: the information
> we need to store is not exactly temporary, it's not the throwaway
> stuff you'd expect to see in /tmp - on the contrary, it's vital to the
> system. So it's very unsightly to put it in /tmp.
>
Makes total sense. In that case though, s6-svscan-log/run should
probably be updated in the examples so that it doesn't try to use /tmp
since any /tmp/uncaught-logs symlink will be unavailable if a tmpfs
does get mounted or something cleans up /tmp. In the first case you're
doing more work in init-stage1 than necessary, in the second you're
back to having a rw root (if even for a second).
>
>  That is why I'm saying that s6 needs a tmpfs, distinct from /tmp,
> made in stage 1. Having a "private" tmpfs allows init to store the
> scan directory, the copies of service directories, and the catch-all
> logger directory, without impacting the rest of the system.
>  Since that tmpfs is needed anyway, /tmp might as well be a symlink
> to a public (mode 1777) subdirectory of it: it makes /proc/mounts
> cleaner. But it's not a requirement, and /tmp may be mounted as a
> separate tmpfs at some point in stage 2.
>
>
>  If you are reckless, totally insensitive to gracefulness, and you
> absolutely cannot deal with creating a tmpfs just for the sake of s6,
> you may try to use a subdirectory of the devtmpfs in /dev as an
> early root-only read-write place.
>  You will now forget I suggested that. *flash*
>
That. Wow. That's amazingly bad.
>
>> 4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
>> to keep ps output non-ugly and to kind-of stick to the FHS)
>
>
>  Eh, the FHS doesn't say that /dev should be a real directory. It can
> be a symlink all right. I checked. :P
>  Most Linux people will use udev, though, and for them /dev will be a
> devtmpfs: a real directory, and a mountpoint.
>
Mostly it's to keep the output of ps non-mangled when you ssh in. A
tty of pts/XX doesn't mess up the column output, a tty of
/mnt/tmpfs/dev/pts/XX definitely does. That said, I'm a bit surprised
that the FHS doesn't care beyond needing the name present.
>
>  The order in which init-stage2 starts services and interleaves them
> with one-shot commands should mirror your dependency graph. This is
> where a dependency management system would come in handy; I plan to
> work on a program that takes a dependency graph as its input (format
> TBD) and outputs a suitable init-stage2 script.
>
It would. In my case though, I knew the se

Re: thoughts on rudimentary dependency handling

2015-01-06 Thread Laurent Bercot


 I'm not sure exactly in what context your message needs to be taken
- is that about a tool you have written or are writing, or something
else ? - but if you're going to work on dependency management, it's
important that you get it right. It's complex stuff that needs
planning and thought.



* implement a ./needs directory.  This would have symlinks to any
definitions that would be required to run before the main definition can
run.  For instance, Debian's version of lightdm requires that dbus be
running, or it will abort.  Should a ./needs not be met, the current
definition will receive a ./down file, write out a message indicating what
service blocked it from starting, and then will send a "down service" to
itself.


 For instance, I'm convinced that the approach you're taking here actually
takes away from reliability. Down files are dangerous: they break the
supervision chain guarantee. If the supervisor dies and is respawned by
its parent, it *will not* restart the service if there's a down file.
You want down files to be very temporary, for debugging or something,
you don't want them to be a part of your normal operation.

 I firmly believe that in order to keep boot and shutdown procedures fast
and simple, and avoid reinventing the kitchen sink, any dependency
management on top of a supervision system should work *offline*. Keep the
dependency manager out of the supervisor's way in normal operation; just
use it to generate state change scripts.

 If your dependency manager works online, you *will* bring services down
when you don't want to. You *will* have more headaches making things work
than if you had no dependency manager at all. I guarantee it.

 I don't know how to design such a beast. I'm not there yet, I haven't
given it any thought. But a general principle applies: don't do more, do
less. If something is unnecessary, don't do it. What a supervision
framework needs is a partial order on how to bring services up or down
at boot time and shutdown time, and other global state changes; not
instructions on what to do in normal operation. Stay out of the way of
the supervisor outside of a global state change.

--
 Laurent



thoughts on rudimentary dependency handling

2015-01-06 Thread Avery Payne
I'm asking for some input.  I obviously have limitations with regard to
what I can do for all three frameworks in the project, so if this seems
limited in scope, you're right, it is.  My thoughts are:

* incorporate it into the base template.  I'm trying to minimize the number
of templates around, otherwise, why bother if there are 100's of
templates?  I might as well write all of them as unique stand-alone
scripts, and that defeats the purpose of templates in the first place.
* implement a ./wants directory.  This would have symlinks to any
definitions that would be nice to have running, but are not required.  For
instance, smbd runs fine by itself, but it would be nice to have nmbd and
winbind also running as well; if they don't start, it won't block smbd from
starting.
* implement a ./needs directory.  This would have symlinks to any
definitions that would be required to run before the main definition can
run.  For instance, Debian's version of lightdm requires that dbus be
running, or it will abort.  Should a ./needs not be met, the current
definition will receive a ./down file, write out a message indicating what
service blocked it from starting, and then will send a "down service" to
itself.
* implement a ./conflicts directory.  Any service that would conflict with
either the primary definition, or any definitions in ./needs or ./wants
would be symlinked here.  There would be a simple probe that asks "is the
service up"?  If so, it simply aborts the current definition and warns the
administrator about the conflict.

The ./needs and ./wants are pretty straightforward, and are easily
implemented.  I don't see any real issues with them.
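
For instance, the run-script behaviour described above might look roughly like
this (a sketch under assumed names and a runit-style layout; the real templates
may well differ):

#!/bin/sh
# ./wants: nice to have, failures are ignored.  ./needs: required; on failure,
# leave a ./down marker, log why, and take this service down instead of flapping.
for want in ./wants/*; do
    [ -e "$want" ] && sv start "$want"
done
for need in ./needs/*; do
    [ -e "$need" ] || continue
    sv start "$need" && continue
    touch ./down
    echo "blocked: $(basename "$need") could not be started" >&2
    sv down "$(pwd)"
    exit 1
done
exec my-daemon 2>&1        # placeholder for the actual service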

For ./conflicts I do not want to force the conflicting service down - there
may be other services depending on it, and bringing down the conflicting
service may cause problems.

Here's where I need input.  The ./conflicts concept is somewhat racy - you
could in theory have a service starting while it is being probed, the probe
coming first (and reporting the service as down), and then the service
starting.  I'm not sure there's an easy way to trap for this condition.  Worst
case, I jettison the ./conflicts directory altogether and let services
"naturally fail" due to file name or port number collisions, etc.

Suggestions on how to better improve ./conflicts?  Or should it be
abandoned within the context of how these scripts are currently written?
Keep in mind, these scripts try not to assume too much, and that includes C
source code, etc., so I'm not looking to write a special program that
handles race conditions like this.


Re: s6 init-stage1

2015-01-06 Thread Avery Payne
On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot 
wrote:
>
>  I very much dislike having / read-write. In desktops or other systems
> where /etc is not really static, it is unfortunately unavoidable
> (unless symlinks to /var are made, for instance /etc/resolv.conf should
> be a symlink to /var/etc/resolv.conf or something, but you cannot store,
> for instance, /etc/passwd on /var...)
>

What if /etc were a mount overlay?  I don't know if other *nix systems
support the concept, but under Linux, mounting a file system onto an
existing directory simply "blocks" the original directory contents
"underneath", exposing only the file system "on top", and all writes go to
the "top" filesystem.  This would allow you to cook up a minimalist /etc
that could be left read-only, but when the system comes up, /etc is
remounted as read-write with a different filesystem to capture read-write
data.  Dismounting /etc would occur along with all the other dismounts at
the tail-end of shutdown.  The only issue I could see is /etc/passwd having
a password set for root, which would be needed to secure the console in the
event that the startup failed somehow and /etc isn't mounted yet. This
implies a possible de-sync between the read-only /etc/passwd and the
read-write /etc/passwd; the former is fixed in stone, the latter can change.
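
(One concrete, Linux-specific way to get that "read-only base, writable top"
behaviour is an overlay mount; illustrative only, the paths are assumptions,
and overlayfs only landed in mainline around 3.18:)

#!/bin/sh
# Read-only /etc baked into the root image, writes captured on a tmpfs-backed
# upper layer (and therefore lost on reboot unless the upper dir lives on /var).
mkdir -p /run/etc/upper /run/etc/work
mount -t overlay overlay \
    -o lowerdir=/etc,upperdir=/run/etc/upper,workdir=/run/etc/work /etc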

 But on servers and embedded systems, / should definitely be read-only.
> Having it read-write makes it susceptible to filesystem corruption,
> which kills the guarantee that your machine will boot to at least a
> debuggable state. A read-only / saves you the hassle of having a
> recovery system.
>

Interesting concept.


Re: s6 init-stage1

2015-01-06 Thread Laurent Bercot

On 06/01/2015 13:12, Peter Pentchev wrote:

Even better: most modern systems have a tsort(1) utility for this kind of
topological sorting; BSD-derived systems have had it for ages.


 Interesting. Thanks for the heads-up - I had heard of tsort, but didn't
know exactly what it does.

 However, I'd like a tool that knows what steps it can parallelize.
A sequential output is great for function names in a piece of code,
but for services, the point is to start as many as possible in
parallel, and minimize the amount of synchronization points.

For instance, given
1 2
3 4
meaning 2 should happen after 1, and 4 should happen after 3,
tsort gives
1
3
2
4
but instead, I need something like
1 3
2 4
because 1 and 3 can happen in parallel, and same for 2 and 4.

 AFAICT, tsort cannot do that. (make might not be able to either,
but since it's more complex, it's harder to tell.)
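
(For what it's worth, make does seem able to express this; a rough sketch, with
"sv start" standing in for whatever actually brings a service up:)

#!/bin/sh
# Encode "2 after 1, 4 after 3" as prerequisites; "make -j" then runs
# independent targets in parallel: 1 and 3 together, then 2 and 4.
{
  printf '.PHONY: 1 2 3 4\n'
  printf '2: 1\n4: 3\n'
  printf '1 2 3 4:\n\tsv start $@\n'   # recipe lines must start with a tab
} > /tmp/services.mk
make -j -f /tmp/services.mk 2 4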

--
 Laurent


Re: s6 init-stage1

2015-01-06 Thread Peter Pentchev
On Tue, Jan 06, 2015 at 01:02:46PM +0100, Laurent Bercot wrote:
> On 06/01/2015 09:00, Colin Booth wrote:
[snip]
> >5. I made a few more classes of services for init-stage2 to copy into
> >the service directory. Specifically for things that I wanted running
> >ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
> >and ucspilogd), klogd, cron, and udev. Mostly that was because I
> >needed udev running (and supervised) before bringing up dbus, and I
> >wanted to make sure /dev/log had a reader before I started bringing
> >anything up that might not want to talk to stdout instead (openssh,
> >I'm looking at you).
> 
>  The order in which init-stage2 starts services and interleaves them
> with one-shot commands should mirror your dependency graph. This is
> where a dependency management system would come in handy; I plan to
> work on a program that takes a dependency graph as its input (format
> TBD) and outputs a suitable init-stage2 script.
> 
>  (Crazy idea brewing. Dependency graph management is a solved problem:
> it's exactly what "make" does. So my program could simply translate
> the service dependency graph into a Makefile, and make would
> output the script. This requires more thought.)

Even better: most modern systems have a tsort(1) utility for this kind of
topological sorting; BSD-derived systems have had it for ages.

G'luck,
Peter

-- 
Peter Pentchev  r...@ringlet.net r...@freebsd.org p.penc...@storpool.com
PGP key:http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13




Re: s6 init-stage1

2015-01-06 Thread Laurent Bercot

On 06/01/2015 09:00, Colin Booth wrote:


1. Depending on your initramfs and your on-disk layout you can skip
mounting proc and sys. I know this is the case with Debian, probably
true elsewhere as well.


 It all depends on the assumptions that init-stage2 makes, but yes,
now that you're mentioning it, mounting /proc and /sys may be
delayed, as long as none of the very early services need them.
Make sure the login process and interactive root shell do not need
them either, because if init-stage2 fails very early, being able to
log in will make debugging/recovery a lot easier.



2. If you aren't starting udev until init-stage2, you'll need to
manually mknod null and console devices before the "Reopen
stdin/stdout/stderr" comment.


 That only applies to people who want a static /dev. Most people
will run some flavour of udev, and will probably want to keep the
devtmpfs mounted on /dev, in which case the kernel exports
/dev/null and /dev/console itself. (Probably with the wrong rights,
but they're functional enough to get by until udev runs.)



3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
loaded. Otherwise the catch-all logger won't be able to do its thing
as written. Same deal with /service, though that one is documented and
expected.


 Actually, neither of those 3 things are needed for /tmp. :)
 What *is* needed is a writable-by-root-only directory, to store the
information init needs:
 - The scan directory, which must be rw
 - rw places to store the supervise/ and event/ subdirectories of
the service directories, or a copy of the service directories
themselves
 - a rw place for the catch-all logger to run

 /tmp is not ideal for this, for several reasons. One of which is
as soon as stage 2 begins and user stuff runs on the system, creating
files in /tmp isn't absolutely secure anymore, because filenames can
be predicted and DoSsed. Another reason is conceptual: the information
we need to store is not exactly temporary, it's not the throwaway
stuff you'd expect to see in /tmp - on the contrary, it's vital to the
system. So it's very unsightly to put it in /tmp.

 I very much dislike having / read-write. In desktops or other systems
where /etc is not really static, it is unfortunately unavoidable
(unless symlinks to /var are made, for instance /etc/resolv.conf should
be a symlink to /var/etc/resolv.conf or something, but you cannot store,
for instance, /etc/passwd on /var...)
 But on servers and embedded systems, / should definitely be read-only.
Having it read-write makes it susceptible to filesystem corruption,
which kills the guarantee that your machine will boot to at least a
debuggable state. A read-only / saves you the hassle of having a
recovery system.
 So, it should be the admin's choice, and I do not want s6 to force
the admin to mount / rw.

 That is why I'm saying that s6 needs a tmpfs, distinct from /tmp,
made in stage 1. Having a "private" tmpfs allows init to store the
scan directory, the copies of service directories, and the catch-all
logger directory, without impacting the rest of the system.
 Since that tmpfs is needed anyway, /tmp might as well be a symlink
to a public (mode 1777) subdirectory of it: it makes /proc/mounts
cleaner. But it's not a requirement, and /tmp may be mounted as a
separate tmpfs at some point in stage 2.
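
(To make that concrete, a rough sketch of the corresponding stage-1 steps -
every path here is an assumption, not the canonical s6 example:)

#!/bin/sh
mount -t tmpfs -o mode=0755 tmpfs /run     # the "private" root-only tmpfs
mkdir -p /run/service /run/uncaught-logs   # scan dir + catch-all logger dir
cp -a /etc/s6/service/. /run/service/      # rw copies of the service directories
# /tmp in the root image can simply be a symlink pointing at this 1777 subdir
mkdir -m 1777 /run/tmp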

 If you are reckless, totally insensitive to gracefulness, and you
absolutely cannot deal with creating a tmpfs just for the sake of s6,
you may try to use a subdirectory of the devtmpfs in /dev as an
early root-only read-write place.
 You will now forget I suggested that. *flash*



4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
to keep ps output non-ugly and to kind-of stick to the FHS)


 Eh, the FHS doesn't say that /dev should be a real directory. It can
be a symlink all right. I checked. :P
 Most Linux people will use udev, though, and for them /dev will be a
devtmpfs: a real directory, and a mountpoint.



5. I made a few more classes of services for init-stage2 to copy into
the service directory. Specifically for things that I wanted running
ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
and ucspilogd), klogd, cron, and udev. Mostly that was because I
needed udev running (and supervised) before bringing up dbus, and I
wanted to make sure /dev/log had a reader before I started bringing
anything up that might not want to talk to stdout instead (openssh,
I'm looking at you).


 The order in which init-stage2 starts services and interleaves them
with one-shot commands should mirror your dependency graph. This is
where a dependency management system would come in handy; I plan to
work on a program that takes a dependency graph as its input (format
TBD) and outputs a suitable init-stage2 script.

 (Crazy idea brewing. Dependency graph management is a solved problem:
it's exactly what "make" does. So my program could simply translate
the service dependency graph into a Makefile, and make would
output the script. This requires more thought.)

Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Mon, Jan 5, 2015 at 5:03 PM, James Powell  wrote:
> The initial init bootscript that I'm currently drafting is in execline using 
> the template provided by Laurent. I was going to take the advice on using 
> /bin/sh rather than /bin/execlineb but I recanted that decision due to the 
> fact I wanted to use the FIFO handling execline provides.
>
> My question about stage 1 is as follows for a target system of a PC desktop:
>
> If I am reading things correctly, assumingly, init-stage1 using the template, 
> I only need to correct the paths and include any mounts of virtual kernel 
> file systems not listed as well as get cgroups ready, and stage any core 
> one-time services like copying core service scripts to the service scan 
> directory on the tmpfs, correct, before passing off to init-stage2 to load 
> drivers, start daemons, etc.?
>
Laurent's answers are all great. Here's a few other things that I ran
into when adapting s6-init to run on my laptop (distro kernel and a
desire to not trash my root directory too badly), mostly in the
gotchas category:

1. Depending on your initramfs and your on-disk layout you can skip
mounting proc and sys. I know this is the case with Debian, probably
true elsewhere as well.
2. If you aren't starting udev until init-stage2, you'll need to
manually mknod null and console devices before the "Reopen
stdin/stdout/stderr" comment.
3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
loaded. Otherwise the catch-all logger won't be able to do its thing
as written. Same deal with /service, though that one is documented and
expected.
4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
to keep ps output non-ugly and to kind-of stick to the FHS) you'll
need to make sure to manually create /dev/pts after you initially
mount a tmpfs or devtmpfs into /dev. This needs to get done before
starting your hotplug manager. udev mounts a devpts there for you when
started, but if you're running mdev you'll need to mount it yourself
(see the sketch after this list).
5. I made a few more classes of services for init-stage2 to copy into
the service directory. Specifically for things that I wanted running
ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
and ucspilogd), klogd, cron, and udev. Mostly that was because I
needed udev running (and supervised) before bringing up dbus, and I
wanted to make sure /dev/log had a reader before I started bringing
anything up that might not want to talk to stdout instead (openssh,
I'm looking at you).
6. Lastly, since this was an init replacement on a distro-based
system, I made a script called "oneshots" that init-stage2 runs that
fired off all the fake daemons that get started when you bring up a
Debian system. These are things like checking if you're booting while on
batteries, clearing old sudo privileges, and setting the hostname.
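
(A rough sketch of the early-device steps from items 2 and 4 - the device
numbers are the standard Linux ones, everything else is illustrative:)

#!/bin/sh
mount -t tmpfs -o mode=0755 tmpfs /dev    # or use devtmpfs, which ships the nodes
mknod -m 600 /dev/console c 5 1
mknod -m 666 /dev/null    c 1 3
mkdir -p /dev/pts
mount -t devpts -o gid=5,mode=620 devpts /dev/pts   # gid 5 = tty on most distros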

The first four are all things that blew up in my face in one way or
another, usually as early-boot kernel panics but sometimes as just a
lot of junk logged to the console while I was trying to log in.
Everything between the fdclose line and reopening stdin is super
fragile, and since we've unmounted /dev, it's impossible to boot
half-way and then start a shell to find out what exactly went wrong.

Good luck. Barring some experiments back in the summer I never
switched any of my daily-use systems to s6-init. I have virtuals that
are s6 top-to-bottom, but that doesn't particularly count.

> Thanks,
> James
>

Cheers!

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake