Re: Could s6-svscan ignore non-servicedir folders?

2015-01-21 Thread Laurent Bercot

On 21/01/2015 18:24, Olivier Brunel wrote:

I'll have to set up some scripts for different init stages, using
s6-svscan as stage 2, as you've described elsewhere. But I also want to
have a system to start (and stop) services in order. I see this whole
idea of order/dependency is something that is being talked about, but
currently not supported.


 Yes. A dependency manager is something fundamentally different from
a process supervisor, and I definitely do not want to mix the two.
They can be tightly coupled, but I'm not going to make the process
supervisor more complex to accommodate the dependency manager: all the
necessary hooks are already there.

 nosh gets it right. I don't agree with all of nosh's design principles
- especially running the system manager as process 1, which needlessly
makes it more complex and increases the height of the supervision tree -
but I absolutely agree with its concept of system manager changing the
global state by, among others, sending commands to the supervision tree
(which nosh calls service manager). This is the right way to perform
dependency management with a supervision infrastructure.

 And it's also a real project, not something I can add to s6 in a week.
It needs a lot of work, and it's in my long-term plans, but not in my
short-term ones.



Furthermore, I want this system of mine to include other kinds of
services, that is one-time processes/scripts that need to be run once (on
boot), and that's it. And to make things simpler, I want to have it all
work together, mixing longrun services (s6 supervised) and oneshot
services when it comes to dependency/order definition.

So I'll have servicedirs of sorts for oneshot services. And I'm planning
on having one folder, which I tend to call the runtime repository, that
would also be the scandir for s6-svscan.

Obviously though, those aren't servicedirs in the s6 meaning, they
shouldn't be supervised, so I'd like for s6-svscan to check if a folder
does in fact have a file run, and if not to simply skip/ignore it.


 Sorry, but it's going to be a frank no.
 The scandir is the place where s6-svscan does its stuff. It belongs to
s6-svscan. Private property. Don't use it for other purposes, and don't
mess with it, or you're going to get into trouble.

 That's intentional, for several reasons.

 * s6-svscan is a very, very vital and very, very low-level piece of
infrastructure. Which means:
   - The simpler it is, the better. Every line of extra code in it is
potentially a bug in your process 1. It's not the place to handle corner
cases or convenience issues. s6-svscan and s6-supervise are the two
programs where I'm *the most* paranoid about feature creep; everything
I can take out of them and put somewhere else instead, I will.
   - You should not mix interfaces. Clear separation between what is
lower-level and what is higher-level is the key to good architecture.
s6-svscan should not have to know or care about what a one-shot is.
The fact that one-shots and supervisable services should both be
handled by the same system manager entity is of no concern to the
supervision tree.

 * Don't mix config-time and run-time.
   Your one-shot scripts, or directories, belong to a collection of scripts
that will be called whenever the global state changes, for instance at
boot time or shutdown time. They will only be used when the state changes,
not in normal system operation. As such, they belong in your service
repository, along with the entirety of your service directories, even
those that are not active. That's read-only configuration information.
 The scandir, on the other hand, is a snapshot of what is currently live;
it's run-time information. That's not the same thing, so it doesn't
belong in the same place. (That is also why I advise making a live copy
of the active service directories: to separate runtime data from
offline data.)

 * Clarity.
   - In a scandir, it's easy to identify the servicedirs: those are
the non-hidden subdirectories of the scandir. Period. If you change
that, it will be more complex to detect what is a servicedir and what
is a one-shot. ls won't tell you at a glance. Remember how much of
a pain it was, figuring out what exactly happened when entering a SysV
runlevel, which scripts were one-shots and which ones spawned a daemon ?
I don't want to make the same mistake. Especially with a supervision
infrastructure, that has the concept of live data that SysV didn't have.
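
A minimal sketch of the rule in the Clarity point above (servicedirs are the non-hidden subdirectories of the scandir, nothing else). The function name is hypothetical and the script is only an illustration of the rule, not part of s6:

```shell
#!/bin/sh
# List what counts as a servicedir under the rule above: every
# non-hidden subdirectory (or symlink to a directory) of the scandir.
# list_servicedirs is a hypothetical helper name.
list_servicedirs() {
    scandir=$1
    for entry in "$scandir"/*; do      # the * glob skips dot-files
        [ -d "$entry" ] || continue    # plain files are not servicedirs
        printf '%s\n' "${entry##*/}"   # print the name without the path
    done
}
```

Calling `list_servicedirs` on the scandir path prints one name per line. No look inside each directory is needed, which is the point being made.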



So, what do you think of this? Would you be willing to have s6-svscan
ignore folders not containing a run file?


 No. Keep out of the scandir. I swear that your system, and also the
software you're developing, and its users, will thank you.

 Directories are not a scarce resource: you can always make a deeper
hierarchy to properly categorize what you're doing. In your case, I
would make at least 4 subdirectories of my main work directory:
* .../service, the live scandir, containing only symlinks
* .../services, which contains the 

Re: Could s6-scscan ignore non-servicedir folders?

2015-01-21 Thread Steve Litt
On Wed, 21 Jan 2015 18:24:58 +0100
Olivier Brunel j...@jjacky.com wrote:

 Hi Laurent,
 
 So you mentioned breaking compatibility recently, and I figure that
 might be a good time for me to mention something. I'd like to set up
 my system around s6, and have been working on this lately.
 
 I'll have to set up some scripts for different init stages, using
 s6-svscan as stage 2, as you've described elsewhere. But I also want
 to have a system to start (and stop) services in order. I see this
 whole idea of order/dependency is something that is being talked
 about, but currently not supported.

Can't you do something like this:

http://smarden.org/runit/faq.html#depends

 
 Furthermore, I want this system of mine to include other kinds of
 services, that is one-time processes/scripts that need to be run once
 (on boot), and that's it. And to make things simpler, I want to have
 it all work together, mixing longrun services (s6 supervised) and
 oneshot services when it comes to dependency/order definition.

I do too. If you have a run-once thing that quickly returns, couldn't
you just not exec the thing in the run script, and then have the last
statement in your run script write a down file to the service? I'm
assuming that s6 does down files the same way as daemontools.

Then, all that remains is to have stage 1 of your boot delete all the
down files that were put there to achieve run-once.

If it really is a daemon, but you don't want it respawned, couldn't you
have the finish script (they have those in runit, I don't know about
s6) write the down file?

You know what you could do? You could make a $servicedir/oneshot
shellscript that does something like this:

touch ./down
echo "rm $scriptname/down" >> "$whatever/enable_oneshots.sh"

Then all you'd need to do is run $whatever/enable_oneshots.sh in stage
1, and right after that truncate it and make it executable again
($whatever/enable_oneshots.sh).

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance



Re: Could s6-scscan ignore non-servicedir folders?

2015-01-21 Thread Olivier Brunel
On 01/21/15 19:03, Steve Litt wrote:
 On Wed, 21 Jan 2015 18:24:58 +0100
 Olivier Brunel j...@jjacky.com wrote:
 
 Hi Laurent,

 So you mentioned breaking compatibility recently, and I figure that
 might be a good time for me to mention something. I'd like to set up
 my system around s6, and have been working on this lately.

 I'll have to set up some scripts for different init stages, using
 s6-svscan as stage 2, as you've described elsewhere. But I also want
 to have a system to start (and stop) services in order. I see this
 whole idea of order/dependency is something that is being talked
 about, but currently not supported.
 
 Can't you do something like this:
 
 http://smarden.org/runit/faq.html#depends

No. By which I mean, I do not want to use the run script to handle
dependency/order, for a few reasons.

For one, I don't think that's the run script's place. It also makes things
harder to tweak/change; I'll simply use files in subfolders
needs/wants/after/before. And it produces things I don't like, where a
service is seen as up because its run script is indeed running, but the
service hasn't even begun to start yet, as the run script is just
waiting for another service to be up.
I don't like that; I'd rather have something (i.e. an external tool) start
one service, then wait until it's up to start the other one.

I feel a run script should only set things up (limits, environ, etc) and
exec into the actual service/daemon, nothing else. It should be as quick
as possible, much like the finish script is supposed to be (in s6, 5s
until it's killed).
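
To illustrate the kind of run script meant here, a rough sketch of a helper that writes one into a servicedir. The helper name, the limit value, and the daemon command used in the usage note are all made up for the example; only the shape (set things up, then exec) comes from the paragraph above:

```shell
#!/bin/sh
# make_run_script is a hypothetical helper: it writes a run script that
# only sets things up and then execs into the daemon, nothing else.
make_run_script() {
    servicedir=$1
    shift
    mkdir -p "$servicedir"
    cat > "$servicedir/run" <<EOF
#!/bin/sh
# send stderr to the same place as stdout
exec 2>&1
# example limit only; adjust or drop as needed
ulimit -n 1024
exec $*
EOF
    chmod +x "$servicedir/run"
}
```

For instance, `make_run_script /tmp/svc mydaemon --foreground` (a made-up daemon) produces a run script whose last line is `exec mydaemon --foreground`: no waiting on other services, no dependency logic.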


 Furthermore, I want this system of mine to include other kinds of
 services, that is one-time processes/scripts that need to be run once
 (on boot), and that's it. And to make things simpler, I want to have
 it all work together, mixing longrun services (s6 supervised) and
 oneshot services when it comes to dependency/order definition.
 
 I do too. If you have a run-once thing that quickly returns, couldn't
 you just not exec the thing in the run script, and then have the last
 statement in your run script write a down file to the service? I'm
 assuming that s6 does down files the same way as daemontools.
 
 Then, all that remains is to have stage 1 of your boot delete all the
 down files that were put there to achieve run-once.

Well, it might work, but it feels much hackier to me. Not to mention a
down file means the service is meant to be down, whereas the oneshot
service would actually be meant to be up (and be active, as in it did
start and completed its run); Also that means a supervise running all
the time for no good reason for each oneshot service...

I don't like that.


 If it really is a daemon, but you don't want it respawned, couldn't you
 have the finish script (they have those in runit, I don't know about
 s6) write the down file?
 
 You know what you could do? You could make a $servicedir/oneshot
 shellscript that does something like this:
 
 touch ./down
 echo "rm $scriptname/down" >> "$whatever/enable_oneshots.sh"
 
 Then all you'd need to do is run $whatever/enable_oneshots.sh in stage
 1, and right after that truncate and make executable
 $whatever/enable_oneshots.
 
 SteveT
 
 Steve Litt*  http://www.troubleshooters.com/
 Troubleshooting Training  *  Human Performance
 



Re: [PATCH 0/4] Add info on why process is down to statusfile

2015-01-21 Thread Olivier Brunel
On 01/21/15 00:27, Laurent Bercot wrote:
 On 21/01/2015 00:20, Olivier Brunel wrote:
 Cool; I see you've also added a tainstamp into the ready file, that's
 good but you forgot to update the doc of s6-notifywhenup as well, which
 still talks of empty file.
 
  Fixed in current git.

So in the doc you mention how format and contents of this file are
subject to change; not to be a PITA but in that case I think it would
be a good idea to have functions to read (and write?) that file in
libs6, similar to what is available for the statusfile.
So if external tools wanted to use/read the ready file, they could do it
(properly).



Re: Could s6-svscan ignore non-servicedir folders?

2015-01-21 Thread Laurent Bercot

On 21/01/2015 21:47, Olivier Brunel wrote:

The thing is that you've referred to services & oneshots, whereas I
would refer to longrun services & oneshot services resp., i.e. I see
those as two types of services.


 They are both services from a system management, higher-level, point
of view. From s6-svscan's point of view, longruns are services, oneshots
are something it's entirely irrelevant for.



In fact, a reason I want/need one folder with everything (servicedirs
for both oneshot & longrun) is to avoid issues of two services with the
same name or such complications when resolving dependencies/ordering.


 Refer to services as oneshot/foo or longrun/foo, from your system
manager's main directory. Now users can name their oneshots and their
longruns the same thing, and your dependency manager doesn't care. :)



To maybe explain/detail a bit more how what I'm planning would work: my
service manager is asked to start some services, it pulls dependencies
and ordering from subfolders (of servicedirs) needs/wants/after/before.
It then starts everything in order, waiting on some services to be
started before others can be. Starting a oneshot service would mean
running the start script, and waiting for it; Starting a longrun service
would mean sending the command to its supervise process, and waiting for
the event on its fifodir.

So while the service manager does start services, when it comes to
longrun ones it always delegates to s6, as it should.
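
A rough sketch of the scheme just described, under several assumptions: each servicedir has an optional needs/ subfolder whose entries are named after the services it depends on, a oneshot is recognised by an executable start script, and the function names and the S6_SVC / S6_SVWAIT override variables are hypothetical. s6-svc -u and s6-svwait -u are the s6 commands for bringing a supervised service up and waiting until it is up, as I understand them:

```shell
#!/bin/sh
# Print the services of a repository in an order where every dependency
# comes before the services that need it. tsort reads "before after"
# pairs; a pair of identical names only declares that the name exists.
start_order() {
    repo=$1
    for svc in "$repo"/*/; do
        [ -d "$svc" ] || continue
        name=$(basename "$svc")
        echo "$name $name"
        for dep in "$svc"needs/*; do
            [ -e "$dep" ] || continue
            echo "$(basename "$dep") $name"
        done
    done | tsort
}

# Start one service and wait for it.
start_one() {
    dir=$1
    if [ -x "$dir/start" ]; then
        # oneshot: run the start script and wait for it to finish
        "$dir/start"
    else
        # longrun: delegate to s6, then block until the service is up
        "${S6_SVC:-s6-svc}" -u "$dir" || return 1
        "${S6_SVWAIT:-s6-svwait}" -u "$dir"
    fi
}
```

A caller would loop over the output of `start_order` and call `start_one` on each directory in turn. Handling of wants/after/before and of failures is left out on purpose.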


 That sounds perfectly reasonable. Make sure to also have a stop script
in your oneshot directories, for when you deactivate the service.



(To be more precise, all longrun servicedirs will include a down file
when created (into /run), to make sure they're started as/when needed.
So starting a longrun service actually also includes removing the down
file once the event has been received (or when sending the start
command, I'm not sure yet which is best) to reflect the change of state.)


 Yeah, that's ugly, but that's because down files are inherently ugly.
I'd say a better solution for the initial boot (which is the only time
where you'd need down files) would be to start with an empty, or almost
empty, scandir, and have your dependency manager auto-populate it if the
longrun service it wants to start isn't being supervised yet.
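
As a sketch of that auto-populate step, assuming the scandir holds only symlinks to servicedirs kept elsewhere: the function name and the S6_SVSCANCTL override variable are hypothetical, while s6-svscanctl -a is the s6 command that asks s6-svscan to rescan its scandir right away.

```shell
#!/bin/sh
# activate_longrun is a hypothetical helper: link a longrun servicedir
# into the live scandir if it is not there yet, then trigger a rescan.
# Pass an absolute servicedir path so that the symlink resolves.
activate_longrun() {
    servicedir=$1
    scandir=$2
    name=$(basename "$servicedir")
    # Refuse anything without a run script, so that a oneshot directory
    # cannot end up in the scandir by accident.
    [ -x "$servicedir/run" ] || return 1
    # Already present in the scandir: nothing to do.
    [ -e "$scandir/$name" ] && return 0
    ln -s "$servicedir" "$scandir/$name" || return 1
    "${S6_SVSCANCTL:-s6-svscanctl}" -a "$scandir"
}
```

With something like this, no down file is needed at boot: a longrun simply does not exist for s6-svscan until the dependency manager decides to start it.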



I understand what you're saying regarding the scandir though. Now I'm
thinking I would have a folder with all servicedirs, oneshots &
longruns, and use a folder .scandir that would only contain symlinks for
all the longrun servicedirs.
That way, the scandir only has the servicedirs that need to be
supervised, and the service manager still has a single place for all
services.


 That can work, but your ls output will still be confusing, and there
*will* come a day when you'll accidentally link a oneshot directory into
the scandir. And that will be fun.
 What do you have against subdirectories ?



(I might also say that I'm planning on having this folder of
servicedirs be created on boot into /run, for all enabled services. So
it is all runtime information.)


 Well, you only need the oneshot information during state changes, i.e.
mostly boot and shutdown, and on admin intervention in any case. There's
no live, automatic state to maintain, so why have oneshot directories eat
precious tmpfs space when they could just sleep on your rootfs ?

--
 Laurent



Re: Could s6-scscan ignore non-servicedir folders?

2015-01-21 Thread Avery Payne


On 1/21/2015 7:19 PM, post-sysv wrote:


I'm not sure what effective and worthwhile ways there are to express 
service *relationships*,
however, or what that would exactly entail. I think service conflicts 
and service bindings might
be flimsy to express without a formal system, though I don't think 
it's anything that pre-start
conditional checks and finish checks can't emulate, perhaps less 
elegantly?


This brings to mind the discussion from Jan. 8 about ./provides, where 
defining a daemon implies:


* the service that it actually provides (SMTP, IMAP, database, etc.); 
think of it as the doing, the piece that performs work


* a data transport (pipe, file, fifo, socket, IPv4, etc.); think of it 
as how you connect to it


* a protocol (HTTP, etc.); think of it as a grammar for conversing with 
the service, with vertical/specific applications like MySQL having their 
own grammars, i.e. MySQL-3, MySQL-4, MySQL-5, etc. for each generation 
that the grammar changes.


I'm sure there are other bits and pieces missing.  With regard to 
relationships, if you had a mapping of these, it would be a start 
towards a set of formal (although incomplete) definitions.  From that 
you could say "I need a database that speaks MySQL-4 over a file socket" 
and you could, in theory, have a separate program bring up MySQL 4.01 
over a file socket when needed.


But do we really need this?


Re: Could s6-scscan ignore non-servicedir folders?

2015-01-21 Thread post-sysv

On 01/21/2015 06:09 PM, Wayne Marshall wrote:

4) in general, folks here are letting their panties get far too twisted
with the dependency problem.  Actual material dependencies are
relatively few and can be easily (and best) accommodated directly in the
runscript of the dependent service.  See the perpok(8) utility for a
way to handle dependencies that is suitable in practice for most all
installations:


I'd like to second this notion, as well.

The core issue, the way I interpret it, is that dependencies within 
the context of services
and of libraries, are quite different. A library will at the least need 
a stub that exports the
expected symbols to resolve a dependency. In contrast, at its most 
primitive, services simply
need to be started in an order that descendingly satisfies the 
dependency chain.


Thus, if a dependency system is too weak, then it becomes scarcely more
than an idealized way of expressing startup ordering, one with a little
less administrator effort, but making the feature conceptually
uninteresting and of little use. But if it is too powerful, then it incurs
a maintenance and complexity cost, ends up requiring complicated
scheduling semantics, and thus the whole design starts to suffer.

I'm not sure what effective and worthwhile ways there are to express
service *relationships*, however, or what that would exactly entail. I
think service conflicts and service bindings might be flimsy to express
without a formal system, though I don't think it's anything that pre-start
conditional checks and finish checks can't emulate, perhaps less elegantly?