Re: Preliminary version of s6-rc available

2015-08-25 Thread Laurent Bercot

On 25/08/2015 20:15, Guillermo wrote:

OK, today's commit  fc91cc6cd1384a315a1f33bc83e6d6e9926fc791, which I
noticed after sending that last message, had already fixed that.


 Yes, another bug-report was sent to me privately, I committed a fix;
that bug was the cause of the "-T 3" behaviour you were observing.
 Cheers!

--
 Laurent


Re: Preliminary version of s6-rc available

2015-08-25 Thread Guillermo
2015-08-25 14:01 GMT-03:00 Guillermo:
>
> But the thing is, no matter what I do, strace shows the '-T 3' never
> changes. Even when I explicitly put a 'timeout-up' file in the service
> definition directory, either containing '0' or a long enough timeout,
> or when I give s6-rc change a -t option.

OK, today's commit  fc91cc6cd1384a315a1f33bc83e6d6e9926fc791, which I
noticed after sending that last message, had already fixed that. I git
cloned again, rebuilt, and no longer have that problem :)

Thanks,
G.


Re: Preliminary version of s6-rc available

2015-08-25 Thread Guillermo
2015-08-25 11:01 GMT-03:00 Laurent Bercot:
>
>  I can't reproduce that one, can you please send me a "strace -vf -s 256"
> output of the s6-rc command that gives you these errors ?

OK, this is a tricky one. While looking at the strace output you asked
for, I realized what was going on. The s6-svc process that 's6-rc
change' spawns has options '-uwu -T 3', and my test system seems to be
slow enough (don't ask :) ) that a 3-millisecond timeout turns out to
be too small. A manual s6-svc invocation with that timeout
produces the same errors I saw when using s6-rc. And trying different
-T options, it takes over 30 ms to actually start the service I was
performing the tests with, and over 70 ms to do so without
s6-svlisten1 error messages. None of my previous manual s6-svc
invocations specified a timeout, so I didn't notice until now.

But the thing is, no matter what I do, strace shows the '-T 3' never
changes. Even when I explicitly put a 'timeout-up' file in the service
definition directory, either containing '0' or a long enough timeout,
or when I give s6-rc change a -t option. 's6-rc-db timeout' does show
the timeout specified by me, and correctly shows 0 when there is no
'timeout-up' file. So the real issue here appears to be that s6-svc is
always told to wait no longer than 3 ms, and, in particular, you
wouldn't be able to make s6-rc wait forever in the "no readiness
notification" case.

Thanks,
G.


Re: Preliminary version of s6-rc available

2015-08-25 Thread Laurent Bercot

On 25/08/2015 02:21, Guillermo wrote:

The s6-rc-fdholder-filler issue is fixed indeed, thank you. But I
still have the 's6-ftrigrd: fatal: unable to sync with client: Broken
pipe' one (the second one from my previous message). In fact, I made
further tests, and it happened consistently with longruns that did not
have a 'notification-fd' file in their service definition directories.
It didn't matter if they had pipes to other longruns or not.


 I can't reproduce that one, can you please send me a "strace -vf -s 256"
output of the s6-rc command that gives you these errors ?
 Thanks,

--
 Laurent



Re: Preliminary version of s6-rc available

2015-08-24 Thread Guillermo
Hi,

2015-08-24 10:52 GMT-03:00 Laurent Bercot:
>
>  Should be fixed in the latest git, thanks !

The s6-rc-fdholder-filler issue is fixed indeed, thank you. But I
still have the 's6-ftrigrd: fatal: unable to sync with client: Broken
pipe' one (the second one from my previous message). In fact, I made
further tests, and it happened consistently with longruns that did not
have a 'notification-fd' file in their service definition directories.
It didn't matter if they had pipes to other longruns or not.

Thanks,
G.


Re: Preliminary version of s6-rc available

2015-08-24 Thread Laurent Bercot

 Should be fixed in the latest git, thanks !

--
 Laurent


Re: Preliminary version of s6-rc available

2015-08-23 Thread Guillermo
Hello,

I have new issues with the current s6-rc git head (after yesterday's
bugfixes), discovered with the following scenario: a service database
with only two longruns, "producersvc" and "loggersvc", the latter
being the former's logger. Loggersvc's service definition directory
had only 'consumer-for', 'run' and 'type' files, the run script being:

#!/bin/execlineb -P
redirfd -w 1 /home/test/logfile
s6-log t 1

This means no readiness notification for this service.

So the issues:

* s6-rc-fdholder-filler appears to have a bug when creating
identifiers for the writing end of the pipe between producersvc and
loggersvc:

$ s6-fdholder-list /s6rc-fdholder/s
pipe:s6rc-r-loggersvc
pipe:s6rc-w-loggersvc5\0xdaU

I also saw this with longer pipelines; identifiers for the reading
ends were OK, identifiers for the writing ends ended with random
characters. I didn't try to start producersvc, since I expected it to
fail trying to retrieve the nonexistent "pipe:s6rc-w-loggersvc" file
descriptor.

* s6-rc was unable to start loggersvc. More specifically, 's6-rc -v3
change loggersvc' produced this output:

s6-rc: info: bringing selected services up
s6-rc: info: processing service s6rc-fdholder: already up
s6-rc: warning: unable to access /scandir/loggersvc/notification-fd: No such file or directory
s6-rc: info: processing service loggersvc: starting
s6-ftrigrd: fatal: unable to sync with client: Broken pipe
s6-svlisten1: fatal: unable to ftrigr_startf: Connection timed out
s6-rc: warning: unable to start service loggersvc: command exited 111

However, a manual 's6-svc -uwu /loggersvc' successfully
started the service, and the following test showed that it worked:

$ execlineb -c 's6-fdholder-retrieve /s6rc-fdholder/s
"pipe:s6rc-w-loggersvc5\0xdaU" fdmove 1 0 echo Test message'

(3 times)

$ cat logfile | s6-tai64nlocal
2015-08-23 18:09:10.822137309  Test message
2015-08-23 18:09:16.871541383  Test message
2015-08-23 18:09:18.219259082  Test message

So I'd have to conclude the problem is in s6-rc, although I didn't see
anything obvious that could launch an s6-svlisten1 process.

Thanks,
G.


Re: Preliminary version of s6-rc available

2015-08-22 Thread Laurent Bercot

On 22/08/2015 08:26, Colin Booth wrote:

I run my s6 stuff in slashpackage configuration so I missed the
s6-fdholder-filler issue. The slashpackage puts full paths in for all
generated run scripts so I'm a little surprised it isn't doing that
for standard FHS layouts.


 FHS doesn't guarantee absolute paths. If you don't
--enable-slashpackage, the build system doesn't use absolute paths
and simply assumes your executables are reachable via PATH search.

 Unexported executables are a problem for FHS: by definition, they
must not be accessible via PATH, so they have to be called with an
absolute path anyway. This is a problem when using staging
directories, but FHS can't do any better.

 Here, I had simply forgotten to give the correct prefix to the
s6-fdholder-filler invocation, so the PATH search failed as it is
supposed to.

--
 Laurent



Re: Preliminary version of s6-rc available

2015-08-22 Thread Laurent Bercot

 Should be all fixed, thanks!

--
 Laurent


Re: Preliminary version of s6-rc available

2015-08-21 Thread Colin Booth
On Fri, Aug 21, 2015 at 6:36 PM, Guillermo  wrote:
> Hello,
>
> I have the following issues with the current s6-rc git head (last
> commit 8bdcc09f699a919b500885f00db15cd0764cebe1):
(snip)
>

I run my s6 stuff in slashpackage configuration so I missed the
s6-fdholder-filler issue. The slashpackage puts full paths in for all
generated run scripts so I'm a little surprised it isn't doing that
for standard FHS layouts.

I've verified that all the uid/gid stuff fails in the same ways. I'd
also expect the gid directories either to be their own directories
rather than symlinks, or to have a single "access" directory that both
the uid and gid entries link to. I also don't know s6-fdholder's rules
well enough: does it treat uid 0 specially, or, if you specify a
non-root uid, do you also need to specify root?

Lastly, I appear to have never run `s6-rc-db pipeline longrun'. From
the source it's failing in the if (buffer_flush(buffer_1)) call. I may
be wrong, but I think removing the if test and just forcing out the
flush is what you want.

Cheers!

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake


Re: Preliminary version of s6-rc available

2015-08-21 Thread Guillermo
Hello,

I have the following issues with the current s6-rc git head (last
commit 8bdcc09f699a919b500885f00db15cd0764cebe1):

* s6-rc-compile doesn't copy the 'nosetsid' file in the service
definition directory of a longrun to the compiled database directory.

* s6-rc-compile produces an error if it is given the -u option with
more than one user ID. More precisely, if it was called with '-u
uid1,uid2,uid3,...', the error is 's6-rc-compile: fatal: unable to
symlink  to /servicedirs/s6rc-fdholder/data/rules/uid/: File exists'.

* s6-rc-compile produces an error if it is given the -g option without
the -u option, and produces rules directories that look wrong to me
otherwise (or I didn't understand them). More precisely, if it was
called with '-g gid1,gid2,gid3,...' and no -u option, the error is
's6-rc-compile: fatal: unable to mkdir /servicedirs/s6rc-fdholder/data/rules/uid/0/env: No such file or
directory'. And if it was called with '-u user -g gid1,gid2,gid3,...',
then:

  + Both s6rc-fdholder and s6rc-oneshot-runner have gid1, gid2, gid3,
... directories, but showing up in data/rules/uid, and

  + s6rc-fdholder has symlinks gid1, gid2, gid3, ... under
data/rules/gid, pointing to data/rules/uid/, but
s6rc-oneshot-runner has an empty data/rules/gid.

* 's6-rc-db pipeline' displays the expected result, but outputs a
's6-rc-db: fatal: unable to write to stdout: Success' message at the
end.

* Starting s6rc-fdholder produces an 's6-ipcclient: fatal: unable to
exec s6-rc-fdholder-filler: No such file or directory' error. I guess
because it exists in the libexecdir, which isn't normally in
s6-svscan's PATH, so the run script should probably use the full path?

Thanks,
G.


Re: Preliminary version of s6-rc available

2015-07-21 Thread Laurent Bercot


 Please try with the latest git versions of { skalibs, execline, s6, s6-rc }.
 It should fix all the issues you reported. Also, the -u/-g switches to
s6-rc-compile allow you to specify which users can operate the database
being built.
 If you have other suggestions, please send them !

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-19 Thread Colin Booth
On Fri, Jul 17, 2015 at 10:13 AM, Claes Wallin (韋嘉誠)
 wrote:
> On 17-Jul-2015 12:49 am, "Colin Booth"  wrote:
>
> Depending on your cron, users might be able to simply put an @reboot
> s6-svscan in their user crontab. I don't see many drawbacks with that.
>
There's nothing managing the per-user s6-svscan if it dies during
normal system runtime, which defeats the entire purpose of using a
supervision framework in the first place. With process supervision, at
some point your supervision tree must have PID 1 bringing the tree
back up (be it an inittab entry, s6-svscan running as init, runit
managing runsvdir and so on) otherwise you're only playing tricks with
daemonization. Using @reboot crontab entries is a clever way around
the reboot case, but like I said above, it doesn't protect the
supervision root process outside of that event.

I actually think that systemd based systems can have a correctly
supervised non-privileged supervision tree through the use of loginctl
enable-linger and daemon-ish unit files. So you could bring up your
supervision tree that way, or just forego the process supervisor and
write directly against systemd. I however don't have any systemd hosts
laying around to test that on, and even if I did s6-rc and systemd
both cover the same operational space.

Cheers!



Re: Preliminary version of s6-rc available

2015-07-19 Thread Laurent Bercot

On 19/07/2015 20:13, Guillermo wrote:

Well, I haven't been very lucky with oneshots. First, the "#!execline"
shebang with no absolute path doesn't work on my system, even if the
execlineb program can be found via the PATH environment variable.
Neither does "#!bash", "#!python", or any similar construct. If I run
a script from the shell with such a shebang line I get a "bad
interpreter: No such file or directory" message.


 Looks like your kernel can't do PATH searches.
 The "#!execline" shebang worked on Linux 3.10.62 and 3.19.1. But yeah,
it's not standard, so I'll find a way to put absolute paths there, no
big deal.



/path-to/live/servicedirs/s6rc-oneshot-runne: No such file or
directory
s6-rc: warning: unable to start service : command exited 111

"/path-to/live/" represents here what was the full path of the live
state directory, and the "" was really a string of random
characters. I suppose this was meant to be the path to
s6rc-oneshot-runner's local socket, but somehow ended up being
gibberish instead. So oneshots still don't work for me :(


 I committed a few quick changes lately, I probably messed up some
string copying/termination. I'll investigate and fix this.



* It looks like s6-rc-compile ignores symbolic links to service
definition directories in the source directories specified in the
command line; they seem to have to be real subdirectories. I don't
know if this is deliberate or not, but I'd like symlinks to be allowed
too, just like s6-svscan allows symbolic links to service directories
in its scan directory.


 It was deliberate because I didn't want to read the same subdirectory
twice if there's a symlink to a subdirectory in the same source
directory. But you're right, this is not a good reason, I will remove
the check. Symlinks to a subdirectory in the same place will cause a
"duplicate service definition" error, though.



* I'm curious about why is it required to also have a "producer" file
pointing back from the logger, instead of just a "logger" file in the
producer's service definition directory. Is it related to the "parsing
sucks" issue?


 It's just so that if the compiler encounters the logger before the
producer, it knows right away that it is involved in a logged service
and doesn't have to do a special pass later on to adjust service
directory names.
 It also doubles up as a small database consistency check, and
clarity for the reader of the source.

 

* It doesn't really bother me that much, but it might be worth making
"down" files optional for oneshots, with an absent file being the same
as one containing "exit", just like "finish" files are optional for
longruns.


 Right. You can have empty "down" files already for this purpose; I guess
I could make them entirely optional.



The user checked against the
data/rules rulesdir would be the one s6-rc was run as, right? So it
defines which user is allowed to run oneshots?


 Yes. And indeed, allowing s6-rc to be run by normal users implies
changing the configuration on s6rc-oneshot-runner. I'll work on it.



And finally, for the record, it appears that OpenRC doesn't mount /run
as noexec, so at least Gentoo in the non-systemd configuration, and
probably other [GNU/]Linux distributions with OpenRC as part of their
"init systems", won't have any problems with service directories under
/run.


 That's good news !

 Thanks a lot for the feedback ! I have a nice week of work ahead of me...

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-19 Thread Guillermo
2015-07-12 2:59 GMT-03:00 Laurent Bercot:
>
>  s6-rc is available to play with.
> [...] I decided to publish what's already there, so you can test it and
> give feedback while I'm working on the rest. You can compile service
> definition directories, look into the compiled database, and run the
> service manager. It works on my machines, all that's missing is the
> live update capability.

Hi,

Well, I haven't been very lucky with oneshots. First, the "#!execline"
shebang with no absolute path doesn't work on my system, even if the
execlineb program can be found via the PATH environment variable.
Neither does "#!bash", "#!python", or any similar construct. If I run
a script from the shell with such a shebang line I get a "bad
interpreter: No such file or directory" message. And s6-supervise
fails too:

s6-supervise (child): fatal: unable to exec run: No such file or directory
s6-supervise : warning: unable to spawn ./run -
waiting 10 seconds

And because s6rc-oneshot-runner has a run script with an "#!execline"
shebang, it cannot start, and therefore oneshots don't work :)
However, I was able to work around this in two ways: either by just
modifying s6rc-oneshot-runner's run script in the servicedirs/
subdirectory of the live state directory, or by using Linux
binfmt_misc magic[1]. So now I'm really curious about how the
"#!execline" shebang worked on your test systems.

But once I could get s6rc-oneshot-runner to start, I ran into another
problem. "s6-rc change" then failed to run my test oneshot with this
message:

s6-ipcclient: fatal: unable to connect to
/path-to/live/servicedirs/s6rc-oneshot-runne: No such file or
directory
s6-rc: warning: unable to start service : command exited 111

"/path-to/live/" represents here what was the full path of the live
state directory, and the "" was really a string of random
characters. I suppose this was meant to be the path to
s6rc-oneshot-runner's local socket, but somehow ended up being
gibberish instead. So oneshots still don't work for me :(

Longruns without a logger work for me as expected, and I haven't tried
loggers, bundles and dependencies yet.

Now some other general comments:

* It looks like s6-rc-compile ignores symbolic links to service
definition directories in the source directories specified in the
command line; they seem to have to be real subdirectories. I don't
know if this is deliberate or not, but I'd like symlinks to be allowed
too, just like s6-svscan allows symbolic links to service directories
in its scan directory.

* I'm curious about why is it required to also have a "producer" file
pointing back from the logger, instead of just a "logger" file in the
producer's service definition directory. Is it related to the "parsing
sucks" issue?

* It doesn't really bother me that much, but it might be worth making
"down" files optional for oneshots, with an absent file being the same
as one containing "exit", just like "finish" files are optional for
longruns.

* I second this:

2015-07-14 13:23 GMT-03:00 Colin Booth:
>
> s6-rc-init: remove the uid 0 restriction to allow non-privileged
> accounts to set up supervision trees.

I test new versions of s6 on an entirely non-root supervision tree,
with services that can be run by that user, separate of the
"system-wide" (privileged) supervision tree, if any. And it is also
the way I'm testing s6-rc now. But, independently of any potential
use-cases, I really see it this way: s6-svscan and s6-supervise are
already installed with mode 0755 and can therefore happily run as any
user besides root. So it is possible to build a non-root supervision
tree, and if some services refuse to run because of "permission
denied" errors, they will be gracefully dealt with just like with any
other failure mode; the user will know via the supervision tree logs,
and no harm is done. So if a non-root supervision tree is allowed, why
not a service manager on top of it, too?

2015-07-16 19:16 GMT-03:00 Laurent Bercot:
>
>  I understand. I guess I can make s6-rc-init and s6-rc 0755 while
> keeping them in /sbin, where Joe User isn't supposed to find them.

It would be nice if s6rc-oneshot-runner's data/rules directory (for
s6-ipcserver-access on the local socket) could also be changed, so it
doesn't allow only root. For example, allow the user s6-rc-init ran as
instead (or in addition to root), or allow the specification of an
allowed user, or a complete rulesdir / rulesfile, with an -u, -i or -x
option to s6-rc-compile or s6-rc-init. The user checked against the
data/rules rulesdir would be the one s6-rc was run as, right? So it
defines which user is allowed to run oneshots?

And finally, for the record, it appears that OpenRC doesn't mount /run
as noexec, so at least Gentoo in the non-systemd configuration, and
probably other [GNU/]Linux distributions with OpenRC as part of their
"init systems", won't have any problems with service directories under
/run.

Cheers!
G.

[1] http://www.kernel.org/doc/Documenta

Re: Preliminary version of s6-rc available

2015-07-17 Thread 韋嘉誠
On 17-Jul-2015 12:49 am, "Colin Booth"  wrote:
> On Thu, Jul 16, 2015 at 3:16 PM, Laurent Bercot 
wrote:

> >  Oh, absolutely. It's just that a full setuidgid subtree isn't very
> > common - but for your use case, a full user service database makes
> > perfect sense.
> >
> Yup, my use case is very, very rare. Though it's a use case that I'd
> really like to have be less rare because abusing start-stop-daemon,
> backgrounded nohup, and other daemonization tricks to keep stuff
> running after you've logged off is... kinda wrong in my book.

Depending on your cron, users might be able to simply put an @reboot
s6-svscan in their user crontab. I don't see many drawbacks with that.

-- 
   /c


Re: Preliminary version of s6-rc available

2015-07-17 Thread Laurent Bercot

On 17/07/2015 09:26, Rafal Bisingier wrote:

So I run them as a service with "sleep BIG" in the
finish script (it's usually unimportant whether this runs at the same
hours every day). I can have this sleep in the main process itself,
but it isn't really its job


 I also use a supervision infrastructure as a cron-like tool. In those
cases, I put everything in the run script:
 if { periodic-task } sleep $BIG

 periodic-task's run time is usually more or less negligible compared
to $BIG, and I'm not expecting to be controlling it with signals anyway
- but I like being able to kill the sleep if I want to run
periodic-task again earlier for some reason. So I don't mind executing
a short-lived (even if it takes an hour or so) process in a child, and
then having the run script exec into the sleep. And since
periodic-task exits before the sleep, it doesn't block resources
needlessly.

 Whereas if your sleep is running in the finish script, you have no
way to control it. You stay in a limbo state for $BIG and your service
is basically unresponsive that whole time; it's reported as down (or
"finish" with runit) but it's still the normal, running state. I find
this ugly.

 What do you think ? Is putting your periodic-task in a child an
envisionable solution for you, or do you absolutely need to exec into
the interpreters ?

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-16 Thread Colin Booth
On Thu, Jul 16, 2015 at 3:16 PM, Laurent Bercot  wrote:
[... bunch of stuff about finish ...]
Fully in agreement. I'm not convinced that (not) killing finish as the
default is the right thing either. I'm sure there's some weird
database out there that shuts down dirty and then needs a follow-on
task to clean up the dataset, but that's definitely not going to be
the general case.
>
>  I think the only satisfactory answer would be to leave it to the user :
> keep killing ./finish scripts on a short timer by default, but have
> a configuration option to change the timer or remove it entirely. And
> with such an option, a "burial notification" when ./finish ends becomes
> a possibility.
>
Like I said above, it'll be some rare and bizarre daemon, probably
custom and in-house, that does house cleaning post shutdown to get the
state right. And that rare case isn't enough to change the defaults.
And people running something like that will know and will be able to
set the right flags if they are provided. The rest of us should be ok
with the current behavior :)
>
>
>  I understand. I guess I can make s6-rc-init and s6-rc 0755 while
> keeping them in /sbin, where Joe User isn't supposed to find them.
>
Actually s6-rc is 0755 already. s6-rc-init and s6-rc-update are the
ones installed 0700. Keeping those in /sbin is fine, since anyone who
is making private supervision trees is probably doing the manipulation
in a management script and can write six extra characters. In fact,
/sbin is the right place for those scripts since they are needed for
system initialization as well.
>
>
>  Oh, absolutely. It's just that a full setuidgid subtree isn't very
> common - but for your use case, a full user service database makes
> perfect sense.
>
Yup, my use case is very, very rare. Though it's a use case that I'd
really like to have be less rare because abusing start-stop-daemon,
backgrounded nohup, and other daemonization tricks to keep stuff
running after you've logged off is... kinda wrong in my book.

Cheers!




Re: Preliminary version of s6-rc available

2015-07-16 Thread Laurent Bercot

On 16/07/2015 19:22, Colin Booth wrote:

You're right, ./run is up, and being in ./finish doesn't count as up.
At work we use a lot of runit and have a lot more services that do
cleanup in their ./finish scripts so I'm more used to the runit
handling of down statuses (up for ./run, finish for ./finish, and down
for not running). My personal setup, which is pretty much all on s6
(though migrated from runit), only has informational logging in the
./finish scripts so it's rare for my services to ever be in that
interim state for long enough for anything to notice.


 I did some analysis back in the day, and my conclusion was that
admins really wanted to know whether their service was up as opposed
to... not up; and the finish script is clearly "not up". I did not
foresee a situation like a service manager, where you would need to
wait for a "really down" event.



As for notification, maybe 'd' for when ./run dies, and 'D' for when
./finish ends. Though since s6-supervise SIGKILLs long-running
./finish scripts, it encourages people to do their cleanup elsewhere
and as such removes the main reason why you'd want to be notified on
when your service is really down. If the s6-supervise timer wasn't
there, I'd definitely suggest sending some message when ./finish went
away.


 Yes, I've gotten some flak for the decision to put a hard time limit
on ./finish execution, and I'm not 100% convinced it's the right
decision - but I'm almost 100% convinced it's less wrong than just
allowing ./finish to block forever.

 ./finish is a destroyer, just like close() or free(). It is nigh
impossible to define sensical semantics that allow a destroyer to fail,
because if it does, then what do you do ? void free() is the right
prototype; int close() is a historical mistake.
 Same with ./finish ; and nobody tests ./finish's exit code and that's
okay, but since ./finish is a user-provided script, it has many more
failure modes than just exiting nonzero - in particular, it can hang
(or simply run for ages). The problem is that while it's alive, the
service is still down, and that's not what the admin wants.
Long-running ./finish scripts are almost always a mistake. And that's
why s6-supervise kills ./finish scripts so brutally.

 I think the only satisfactory answer would be to leave it to the user :
keep killing ./finish scripts on a short timer by default, but have
a configuration option to change the timer or remove it entirely. And
with such an option, a "burial notification" when ./finish ends becomes
a possibility.



Ah, gotcha. I was sending explicit timeout values in my s6-rc commands,
not using timeout-up and timeout-down files. Assuming -tN is the
global value, then passing that along definitely makes sense, if
nothing else than to bring its behavior in-line with the behavior of
timeout-up and timeout-down.


 Those pesky little s6-svlisten1 processes will get nerfed.



Part of my job entails dealing with development servers where
automatic deploys happen pretty frequently but service definitions
don't change too often. So having non-privileged access to a subsection
of the supervision tree is more important than having non-privileged
access to the pre- and post- compiled offline stuff.


 I understand. I guess I can make s6-rc-init and s6-rc 0755 while
keeping them in /sbin, where Joe User isn't supposed to find them.



By the way, that's less secure than running a full non-privileged
subtree.


 Oh, absolutely. It's just that a full setuidgid subtree isn't very
common - but for your use case, a full user service database makes
perfect sense.

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-16 Thread Colin Booth
On Thu, Jul 16, 2015 at 1:40 AM, Laurent Bercot  wrote:
> On 14/07/2015 18:23, Colin Booth wrote:
>>
... bunch of fixes ...
Looks good.
>
>
>  Well, that's the fundamental asymmetry of run and finish scripts.
> The service is considered up as long as ./run is alive, but that's all:
> as soon as ./run is dead, the service is considered down, whether or
> not ./finish exists and no matter how long it takes to run.
>
>  It may be useful for s6-supervise to report a "./finish exited" event,
> and to have an option for s6-svc to wait for that event, but I believe
> this should be different from the 'd' event - 'd' should definitely be
> for when ./run dies.  What do you think ?
>
You're right, ./run is up, and being in ./finish doesn't count as up.
At work we use a lot of runit and have a lot more services that do
cleanup in their ./finish scripts so I'm more used to the runit
handling of down statuses (up for ./run, finish for ./finish, and down
for not running). My personal setup, which is pretty much all on s6
(though migrated from runit), only has informational logging in the
./finish scripts so it's rare for my services to ever be in that
interim state for long enough for anything to notice.

As for notification, maybe 'd' for when ./run dies, and 'D' for when
./finish ends. Though since s6-supervise SIGKILLs long-running
./finish scripts, it encourages people to do their cleanup elsewhere
and as such removes the main reason why you'd want to be notified on
when your service is really down. If the s6-supervise timer wasn't
there, I'd definitely suggest sending some message when ./finish went
away.
>
>
>  s6-rc passes the "timeout-up" or "timeout-down" value to the forked
> s6-svc. But yes, when there's no service-specific timeout, it would
> probably be a good idea to pass along the global timeout value. Or
> to pass along the min in every case.
>
Ah, gotcha. I was sending explicit timeout values in my s6-rc commands,
not using timeout-up and timeout-down files. Assuming -tN is the
global value, then passing that along definitely makes sense, if
nothing else than to bring its behavior in-line with the behavior of
timeout-up and timeout-down.
>
>
>  Those are actually the same. :)
>  s6-svc has no timeout management itself. When called with a -U|-D
> option, it rewrites itself into a s6-svlisten1 command that calls
> s6-svc without the option. This is what you're seeing.
>
Cool. I did my close reading on s6 commands before s6-svlisten was a
thing so I missed (well, forgot) the bit where s6-svc execs into
s6-svlisten1.
>
>
>  It's contrary to getopt() to allow an option to either be argless or
> take an arg. Think of the default dry-run option as "-n0", not "-n". :)
>
Noted. Not too big a deal since it IS only a one-character difference after all.
>
>
>  I don't think removing the uid 0 restriction on s6-rc-init would
> accomplish what you want. It would mean that some user has access to his
> own private supervision tree along with his own complete service
> database, and manages his own sets of services with s6-rc, including
> a private instance of s6rc-oneshot-runner - in short, duplicating the
> whole s6-rc infrastructure at the user level. It's possible, but
> expensive, and I'm not convinced it would be useful.
>
That's actually pretty much what I was talking about; I'll expand
on it a bit. You have a custom service that you want to run under
supervision, that gets regular updates from developers, and for
various reasons your setup has an application user that is allowed
only a very limited scope of interaction (primarily putting code on a
host and then running it). It's trivially
easy to run a setuidgid sub-tree as a service of the main tree, which
allows your application user to make changes to their
services without leaking privileges for the system at large. What
isn't easy is all the stuff that supervision is historically bad at,
and that s6-rc (especially the one-shot stuff) is working on fixing.

At work we have the above setup under runit, with a collection of
mid-weight shell scripts to handle interaction between deploy scripts
and runsv. The notification and listen parts of s6 duplicate about
80% of what we have, and s6-rc provides both a convenient wrapper
around s6-svc and an in-supervisor method of dealing with oneshots.
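The setuidgid sub-tree pattern described above boils down to a single run
script in the main tree. A sketch under stated assumptions: "appuser" and
the scandir path are illustrative, not from this thread:

```shell
#!/bin/sh
# ./run for a "user-tree" longrun hosted in the main supervision tree:
# drop privileges to the application user, then supervise their private
# scandir. The user can add/remove service directories there without
# touching the root-owned tree.
exec s6-setuidgid appuser s6-svscan /home/appuser/service
```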

>
>  Users can write their own source directories for service definitions,
> and the admin can take them into account by including them in the
> s6-rc-compile command line. It's not very flexible, but it's secure;
> is there some more flexible functionality that you would like to see ?
>
Part of my job entails dealing with development servers where
automatic deploys happen pretty frequently but service definitions
don't change too often. So having non-privileged access to a subsection
of the supervision tree is more important than having non-privileged
access to the pre- and post- compiled offline stuff.

By the way, that's less secure

Re: Preliminary version of s6-rc available

2015-07-16 Thread Laurent Bercot

On 14/07/2015 18:23, Colin Booth wrote:

And s6-rc shouldn't be responsible for handling the creation and
mounting of its tmpfs, system specific or not. That's the
responsibility of the system administrator or the package maintainer.


 Obviously.



s6-rc-db: [-d] dependencies servicename exits 1 if you pass it a
bundle. Interestingly, all-dependencies servicename shows the full
dependency tree if you pass it a bundle, and the docs make no special
mention of bundles so I'm guessing that the failure when checking
dependencies of bundles is a bug and that the docs are correct.


 Fixed.



s6-rc-init.html: "Typical usage" could be misread by someone who
hasn't been working with s6 for a while as suggesting that s6-rc-init
should be run before the catch-all logger is set up.
index.html: Discussion location listed twice.
s6-rc.html: longrun transitions for non-notification-supporting
services should say that the service is considered to be up as soon as
s6-supervise is forked and ./run is executed. This deals with an
ambiguity case for non-supervision experts who may not think of the
run script as part of the service. This might be talked about in the
s6 docs, but it's important and should be repeated if that is the
case.

s6-rc.html: note that s6-rc will block indefinitely when starting
services with notification support unless a timeout is set. Similar to
the above, dry-running commands will tell you what's going on under
the hood, but otherwise it's a bit of a black box.


 Fixed.



s6-rc: if you run `s6-rc -utN change service' and the timeout occurs,
s6-rc -da list still reports the service down (as per the docs) but
subsequent runs of `s6-rc -u change service' complain about not being
able to remove the down file.


 Fixed.



s6-svc: -Dd doesn't seem to take finish scripts into account. Not a
bug per se, but somewhat surprising since a run script is considered
to be part of the service. Initially I thought this was an s6-rc
timeout bug, which is why I noticed it here originally.


 Well, that's the fundamental asymmetry of run and finish scripts.
The service is considered up as long as ./run is alive, but that's all:
as soon as ./run is dead, the service is considered down, whether or
not ./finish exists and no matter how long it takes to run.

 It may be useful for s6-supervise to report a "./finish exited" event,
and to have an option for s6-svc to wait for that event, but I believe
this should be different from the 'd' event - 'd' should definitely be
for when ./run dies.  What do you think ?



s6-rc: Unless there's a really good reason not to, -tN should pass
along its timeout value to the forked s6-svc and s6-svlisten1
processes. If for no other reason than it'll keep impatient
administrators with misbehaving processes and too-low shutdown
timeouts from spawning tons and tons of orphaned s6-svlisten1
processes.


 s6-rc passes the "timeout-up" or "timeout-down" value to the forked
s6-svc. But yes, when there's no service-specific timeout, it would
probably be a good idea to pass along the global timeout value. Or
to pass along the min in every case.



s6-rc: dryrun shows inaccurate commands when timeouts are involved:
Shown:
# s6-rc -l s6-rc-live -d -t 1000 -n0 change sleeper
s6-rc-dryrun: /package/admin/s6/command/s6-svc -Dd -T 0 --
s6-rc-live/servicedirs/sleeper
actual when running the above:
package/admin/s6-2.1.5.0/command/s6-svlisten1 -d --
s6-rc-live/servicedirs/sleeper
/package/admin/s6-2.1.5.0/command/s6-svc -d --
s6-rc-live/servicedirs/sleeper
Not sure where this is going wrong, but I bet it's related to the
previous issue as well.


 Those are actually the same. :)
 s6-svc has no timeout management itself. When called with a -U|-D
option, it rewrites itself into a s6-svlisten1 command that calls
s6-svc without the option. This is what you're seeing.
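Putting the dry-run transcript and this explanation side by side, the
rewrite looks roughly like this. A sketch only: the service path is
illustrative, and the exact option letters (notably s6-svlisten1's -t)
should be checked against your s6 version's docs:

```shell
# What you type (or what s6-rc's dry run prints): down the service,
# wait for the "really down" event, with a 3000 ms timeout.
s6-svc -Dd -T 3000 -- /run/service/foo

# What actually runs: s6-svc execs into s6-svlisten1, which waits for
# the down event around a plain, option-less s6-svc -d.
s6-svlisten1 -d -t 3000 -- /run/service/foo s6-svc -d -- /run/service/foo
```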



Functionality requests:
s6-rc: it'd be nice if omitting a timeout for -n didn't throw an error
and instead passed -t0 to s6-rc-dryrun.


 It's contrary to getopt() to allow an option to either be argless or
take an arg. Think of the default dry-run option as "-n0", not "-n". :)



s6-rc-init: remove the uid 0 restriction to allow non-privileged
accounts to set up supervision trees. There are occasional situations
where you have a service that you want to supervise but want to have a
non-privileged user be able to make adjustments to that service
without allowing that account sudoers access to your entire
supervision tree.


 I don't think removing the uid 0 restriction on s6-rc-init would
accomplish what you want. It would mean that some user has access to his
own private supervision tree along with his own complete service
database, and manages his own sets of services with s6-rc, including
a private instance of s6rc-oneshot-runner - in short, duplicating the
whole s6-rc infrastructure at the user level. It's possible, but
expensive, and I'm not convinced it would be useful.

 Users can write their own source directories for service defi

Re: Preliminary version of s6-rc available

2015-07-14 Thread Colin Booth
On Mon, Jul 13, 2015 at 3:20 PM, Laurent Bercot  wrote:
>  Ah, so that's why you didn't like the "must not exist yet" requirement.
> OK, got it.
>  Yeah, mounting another tmpfs inside the noexec tmpfs can work, thanks
> for the idea. It's still ugly, but a bit less ugly than the other choices.
> I don't see anything inherently bad in nesting tmpfses either, it's just a
> small waste of resources - and distros that insist on having /run noexec
> are probably not the ones that care about thrifty resource management.
>
It's the (least) ugly option that I can think of. Like I said, not
great but better than the alternative. It does give some nice per-user
isolation as well if you're running multiple sub-trees.
>
>  s6-rc obviously won't mount a tmpfs itself, since the operation is
> system-specific. I will simply document that some distros like to have
> /run noexec and suggest that workaround.
>
And s6-rc shouldn't be responsible for handling the creation and
mounting of its tmpfs, system specific or not. That's the
responsibility of the system administrator or the package maintainer.
>
>
>  Yes, I'm going to change that. "absent" was to ensure that s6-rc-init
> was really called early at boot time in a clean tmpfs, but "absent|empty"
> should be fine too.
>
A fresh, empty tmpfs is probably cleaner than a freshly created
directory in a dirty tmpfs (like /run can be), at least if you're
running s6-svscan in non-pid1 mode.
>
>  Landmines indeed. Services aren't guaranteed to keep the same numbers
> from one compiled to another, so you may well have shuffled the live
> state without noticing, and your next s6-rc change could have very
> unexpected results.
>
Everything seemed to work out ok but live-updating stuff without
adjusting the state file seemed dicey.
>
>  But yes, bundle and dependency changes are easy. The hard part is when
> atomic services change, and that's when I need a whiteboard with tables
> and flowcharts everywhere to keep track of what to do in every case.
>
Yeah, that'll be a bit harder. Good luck with your whiteboarding.
>
>
>  Please mention them. If you're having trouble with the tools, so will
> other people.
>
Most of the stuff has been handled with my closer reading of s6-rc -a,
plus the changes to s6-rc list. Plus simply familiarizing myself with
the tools and their output has helped a lot. I did find a few bugs,
documentation or otherwise:

s6-rc-db: [-d] dependencies servicename exits 1 if you pass it a
bundle. Interestingly, all-dependencies servicename shows the full
dependency tree if you pass it a bundle, and the docs make no special
mention of bundles so I'm guessing that the failure when checking
dependencies of bundles is a bug and that the docs are correct.

s6-rc-init.html: "Typical usage" could be misread by someone who
hasn't been working with s6 for a while as suggesting that s6-rc-init
should be run before the catch-all logger is set up.
index.html: Discussion location listed twice.

s6-rc.html: longrun transitions for non-notification-supporting
services should say that the service is considered to be up as soon as
s6-supervise is forked and ./run is executed. This deals with an
ambiguity case for non-supervision experts who may not think of the
run script as part of the service. This might be talked about in the
s6 docs, but it's important and should be repeated if that is the
case.

s6-rc.html: note that s6-rc will block indefinitely when starting
services with notification support unless a timeout is set. Similar to
the above, dry-running commands will tell you what's going on under
the hood, but otherwise it's a bit of a black box.

s6-rc: if you run `s6-rc -utN change service' and the timeout occurs,
s6-rc -da list still reports the service down (as per the docs) but
subsequent runs of `s6-rc -u change service' complain about not being
able to remove the down file. I'd expect a service that timed out on
startup to have the down file since s6-rc-compile.html notes that down
files are used to mark services that s6-rc considers to be down. Maybe
make the removal of the down file the last thing the startup routine
does instead of the first, since I'd consider interrupting or killing a
call to s6-rc the same as timing out (and as such it shouldn't change the
reported state). -dtN has the same behavior (putting the down file in
place before calling s6-svc) but in that case erring on the side of
down feels correct.

s6-svc: -Dd doesn't seem to take finish scripts into account. Not a
bug per se, but somewhat surprising since a run script is considered
to be part of the service. Initially I thought this was an s6-rc
timeout bug, which is why I noticed it here originally.

s6-rc: Unless there's a really good reason not to, -tN should pass
along its timeout value to the forked s6-svc and s6-svlisten1
processes. If for no other reason than it'll keep impatient
administrators with misbehaving processes and too-low shutdown
timeouts from spawning tons and tons of orphaned s6-svlisten

Re: Preliminary version of s6-rc available

2015-07-13 Thread Laurent Bercot

On 13/07/2015 17:35, Colin Booth wrote:

Those options are all bad. My workaround was to mount a new tmpfs
inside of run (that wasn't noexec) but that made using s6-rc annoying
due to the no directory requirement. I don't think there's anything
inherently bad about nesting mounts in this way though I could be
mistaken.


 Ah, so that's why you didn't like the "must not exist yet" requirement.
OK, got it.
 Yeah, mounting another tmpfs inside the noexec tmpfs can work, thanks
for the idea. It's still ugly, but a bit less ugly than the other choices.
I don't see anything inherently bad in nesting tmpfses either, it's just a
small waste of resources - and distros that insist on having /run noexec
are probably not the ones that care about thrifty resource management.

 s6-rc obviously won't mount a tmpfs itself, since the operation is
system-specific. I will simply document that some distros like to have
/run noexec and suggest that workaround.
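The documented workaround could look something like this in an early boot
script. Mount options, sizes, and paths here are illustrative assumptions,
and s6-rc-init's option letters should be verified against its own page:

```shell
# The distro mounts /run noexec; give s6-rc a small exec-capable tmpfs
# nested inside it instead of fighting the distro default.
mkdir -p /run/s6-rc-exec
mount -t tmpfs -o exec,nosuid,nodev,mode=0755,size=4m \
    tmpfs /run/s6-rc-exec

# Then point s6-rc-init's live directory inside the new mount
# (compiled db path and scandir are illustrative).
s6-rc-init -c /etc/s6-rc/compiled -l /run/s6-rc-exec/live /run/service
```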



My suggestion is for one of: changing the s6-rc-init behavior to
accept an empty or absent directory as a valid target instead of just
absent


 Yes, I'm going to change that. "absent" was to ensure that s6-rc-init
was really called early at boot time in a clean tmpfs, but "absent|empty"
should be fine too.



Hm, either the documentation or my reading skills need work (and I'm
not really sure which).


 When in doubt, I'll improve the doc: a good doc should be understandable
even by people with uncertain reading skills. :)



Actually, assuming you're only making bundle and dependency changes,
it looks like swapping out db, n, and resolve,cdb from under s6-rc's
nose works. I'd be unsurprised if there were some landmines in doing
that but it worked for hot-updating my service sequence.


 Landmines indeed. Services aren't guaranteed to keep the same numbers
from one compiled to another, so you may well have shuffled the live
state without noticing, and your next s6-rc change could have very
unexpected results.

 But yes, bundle and dependency changes are easy. The hard part is when
atomic services change, and that's when I need a whiteboard with tables
and flowcharts everywhere to keep track of what to do in every case.



Glad to hear it. So far s6-rc feels like what I'd expect from a
supervision-oriented rc system. There are some issues that I haven't
mentioned but I'm pretty sure those are mostly due to unfamiliarity
with the tools more than anything else.


 Please mention them. If you're having trouble with the tools, so will
other people.

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-13 Thread Colin Booth
On Mon, Jul 13, 2015 at 1:40 AM, Laurent Bercot  wrote:
> On 12/07/2015 23:53, Colin Booth wrote:
>>
>  Ah, now *that* will be a problem. Service directories are copied
> to /run/s6-rc/servicedirs and run scripts are executed from there,
> and that will not work if /run is noexec.
>  This is annoying. There are good reasons to run a daemon on a live
> copy of its service directory instead of on its stock service
> directory, which could be on a read-only filesystem.
>  An ugly alternative would be to put dangling symlinks for supervise/
> and event/ in the database's service directories, but this would
> require s6-rc-compile to know about the live directory location, and
> I really don't like that.
>  Another ugly alternative would be to symlink, not copy, run and
> finish scripts only, but then other scripts in the service directory,
> callable from run and finish, won't work, which defeats expectations.
>
Those options are all bad. My workaround was to mount a new tmpfs
inside of run (that wasn't noexec) but that made using s6-rc annoying
due to the no directory requirement. I don't think there's anything
inherently bad about nesting mounts in this way though I could be
mistaken.
>
>  I'm inclined to simply document that /run should *not* be mounted
> noexec - it's no more dangerous to exec stuff on a tmpfs than on another
> read-write filesystem, and there's no particular reason to enforce
> noexec on /run unless you have an embedded appliance with a read-only
> firmware and no user programs at all, in which case s6-rc is probably not
> useful anyway. But I'm pretty sure distro maintainers would bitch and
> whine and dismiss s6-rc, because, you know, "security".
>
>  I'm open to any ideas to solve this.
>
My suggestion is for one of: changing the s6-rc-init behavior to
accept an empty or absent directory as a valid target instead of just
absent, adding a flag to s6-rc-init to ignore the mkdir portion of its
setup, or adding a --rc-root=DIR configure option. The first option is
my own preference since it's the most flexible but I can see arguments
for the others.
>
>
>> s6-rc -da list doesn't seem to work right. At least, it doesn't work
>> like I'd expect, which is to show all services that are down.
>
>
>  Hmm. Interesting.
>
>  It's not how it works logically.
>  "s6-rc -a list" just puts your up services in the selection, then
> prints the selection - and the -u/-d flag has no influence on it,
> because the selection is the same anyway. Contrary to "listall",
> where dependencies are closed, and s6-rc needs to know whether you're
> going up or down to perform the correct closure.
>
>  However, your expectation makes sense; it needs specialcasing the
> -u|-d flag for "list", but at a human level, it's more intuitive than
> the purely logical behaviour. I changed the behaviour and documented
> it, tell me what you think.
>
The changed behavior is exactly what I'd expected initially. The
original behavior makes sense for the logical case, but it's nice
being able to actually see the state of the world from the rc
system's perspective.
>
>
>  Yes. -a doesn't select everything, it only selects what's up. So
> "s6-rc -ua change" is expected to do nothing, because you're changing
> the state to the current state. :)
>
Clearly, I can't read.
>
>  Contrary to the "list" above, however, I think this is intuitive,
> because -u, -d and -a always have the same meaning. Does -a need to be
> documented more clearly ?
>
Hm, either the documentation or my reading skills need work (and I'm
not really sure which). I think what threw me was that I'd been
assuming that -a was shorthand for "all services" instead of "all
active services" which was then modified by the -u/-d flags. In my
reading, I would have gotten the following option interactions:
-da list = show all down services
-ua list = show all up services
-da change = bring down all services
-ua change = bring up all services
-pua change / -pda change = swap all up and down services (while the
selection logic is reversed, the behavior is the same).
>
>  If you think it's necessary, I can add a -A option that would mean
> SELECT ALL THE THINGS, but I'm afraid it would be misused, and it's
> easy enough for administrators to make a bundle containing everything
> if they so choose.
>
No. I'd been assuming that -a and the hypothetical -A were the same. I
still prefer my mis-read over the actual functionality of -a, but I
can get over it. You are right that people would misuse it but people
will misuse an all services bundle just as quickly.
>
>> Lastly, I know you're working on it but s6-rc-update will be much
>> appreciated. Having to tear down the entire supervision tree, delete
>> the compiled and live directories, and then re-initialize everything
>> with s6-rc-init is awkward to say the least.
>
>  Oh yes, s6-rc-update is absolutely necessary for s6-rc to be more
> than just a toy. :)
>
Actually, assuming you're only making bundle and dependency changes,
it looks l

Re: Preliminary version of s6-rc available

2015-07-13 Thread Laurent Bercot

On 12/07/2015 23:53, Colin Booth wrote:

The requirement for the s6-rc-init live directory to not exist is
awkward if trying to go with the defaults on a distro system


 Why ? I don't think any distro will create /run/s6-rc at boot
time; they'll mount /run, but /run/s6-rc won't exist. Or is /run
persistent on some systems ?



since /run is mounted noexec.


 Ah, now *that* will be a problem. Service directories are copied
to /run/s6-rc/servicedirs and run scripts are executed from there,
and that will not work if /run is noexec.
 This is annoying. There are good reasons to run a daemon on a live
copy of its service directory instead of on its stock service
directory, which could be on a read-only filesystem.
 An ugly alternative would be to put dangling symlinks for supervise/
and event/ in the database's service directories, but this would
require s6-rc-compile to know about the live directory location, and
I really don't like that.
 Another ugly alternative would be to symlink, not copy, run and
finish scripts only, but then other scripts in the service directory,
callable from run and finish, won't work, which defeats expectations.

 I'm inclined to simply document that /run should *not* be mounted
noexec - it's no more dangerous to exec stuff on a tmpfs than on another
read-write filesystem, and there's no particular reason to enforce
noexec on /run unless you have an embedded appliance with a read-only
firmware and no user programs at all, in which case s6-rc is probably not
useful anyway. But I'm pretty sure distro maintainers would bitch and
whine and dismiss s6-rc, because, you know, "security".

 I'm open to any ideas to solve this.



It'd be nice if s6-rc-db contents printed a newline after the last
service for single-item bundles:


 Fixed.



s6-rc -da list doesn't seem to work right. At least, it doesn't work
like I'd expect, which is to show all services that are down.


 Hmm. Interesting.

 It's not how it works logically.
 "s6-rc -a list" just puts your up services in the selection, then
prints the selection - and the -u/-d flag has no influence on it,
because the selection is the same anyway. Contrary to "listall",
where dependencies are closed, and s6-rc needs to know whether you're
going up or down to perform the correct closure.

 However, your expectation makes sense; it needs specialcasing the
-u|-d flag for "list", but at a human level, it's more intuitive than
the purely logical behaviour. I changed the behaviour and documented
it, tell me what you think.



`s6-rc -ua change' also doesn't seem to do what I'd expect. `s6-rc -da
change' brings down all running services, `s6-rc -pda change' brings
down all running services and then starts all stopped services.
Following that logic I'd expect `s6-rc -ua change' to start all
stopped services, however it instead appears to do nothing. My guess
is that it's related to the issues with -a above and that -a is only
ever returning the things in the "up" group.


 Yes. -a doesn't select everything, it only selects what's up. So
"s6-rc -ua change" is expected to do nothing, because you're changing
the state to the current state. :)
 Contrary to the "list" above, however, I think this is intuitive,
because -u, -d and -a always have the same meaning. Does -a need to be
documented more clearly ?
 If you think it's necessary, I can add a -A option that would mean
SELECT ALL THE THINGS, but I'm afraid it would be misused, and it's
easy enough for administrators to make a bundle containing everything
if they so choose.
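Such an "everything" bundle is just another source definition directory.
A minimal sketch using the two gettys from this thread (the ./source
path is illustrative):

```shell
# A bundle is a definition directory whose "type" file says bundle and
# whose "contents" file lists its members, one per line.
mkdir -p ./source/everything
echo bundle > ./source/everything/type
printf '%s\n' getty-5 getty-6 > ./source/everything/contents
# Then compile it in with the rest of the source directories, and
# "s6-rc -u change everything" starts the lot.
```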



Not exactly a bug, but the docs are wrong: the index page points to
s6-rc-upgrade when it should point to s6-rc-update.


 Fixed.



Lastly, I know you're working on it but s6-rc-update will be much
appreciated. Having to tear down the entire supervision tree, delete
the compiled and live directories, and then re-initialize everything
with s6-rc-init is awkward to say the least.


 Oh yes, s6-rc-update is absolutely necessary for s6-rc to be more
than just a toy. :)



That's everything I've found in an hour or two of messing around. I
haven't done anything with oneshots or larger dependency trees yet, so
far it's just been a few getty processes and some wrapper bundles.


 Thanks a lot for your comments ! Much, much appreciated, and very
useful.

 (Grrr, looks like BSD has troubles with fdopendir. Incoming more
portability hacks, yay. Please tell your BSD friends that they would have
more followers if they paid a little more attention to POSIX.)

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-12 Thread Colin Booth
On Sat, Jul 11, 2015 at 10:59 PM, Laurent Bercot
 wrote:
>
>
>  So I decided to publish what's already there, so you can test it and
> give feedback while I'm working on the rest. You can compile service
> definition directories, look into the compiled database, and run the
> service manager. It works on my machines, all that's missing is the
> live update capability.
>
The requirement for the s6-rc-init live directory to not exist is
awkward if trying to go with the defaults on a distro system since
/run is mounted noexec. It's pretty easy to work around but then the
defaults are broken on distro systems.

It'd be nice if s6-rc-db contents printed a newline after the last
service for single-item bundles:
root@radon:/run/s6# s6-rc-db -l /run/s6/s6-rc/ contents 1
getty-5root@radon:/run/s6# s6-rc-db -l /run/s6/s6-rc/ contents 2
getty-6root@radon:/run/s6# s6-rc-db -l /run/s6/s6-rc/ contents 3
getty-5
getty-6
root@radon:/run/s6#
>
>  Bug-reports more than welcome: they are in demand!
>
s6-rc -da list doesn't seem to work right. At least, it doesn't work
like I'd expect, which is to show all services that are down. Given
two longruns getty-5 and getty-6, with getty-5 up and getty-6 down,
I'd expect s6-rc -da list to show getty-6 (and s6-rc -ua list to only
show getty-5). Currently -da list and -ua list show the same thing:
root@radon:/run/s6# s6-rc -l /run/s6/s6-rc -da list
getty-5
root@radon:/run/s6# s6-rc -l /run/s6/s6-rc -ua list
getty-5

s6-svstat shows the correct status of the world:
root@radon:/run/s6# s6-svstat service/getty-5/
up (pid 8944) 301 seconds
root@radon:/run/s6# s6-svstat service/getty-6/
down (signal SIGTERM) 206 seconds

`s6-rc -ua change' also doesn't seem to do what I'd expect. `s6-rc -da
change' brings down all running services, `s6-rc -pda change' brings
down all running services and then starts all stopped services.
Following that logic I'd expect `s6-rc -ua change' to start all
stopped services, however it instead appears to do nothing. My guess
is that it's related to the issues with -a above and that -a is only
ever returning the things in the "up" group.

Not exactly a bug, but the docs are wrong: the index page points to
s6-rc-upgrade when it should point to s6-rc-update.

> --
>  Laurent

Lastly, I know you're working on it but s6-rc-update will be much
appreciated. Having to tear down the entire supervision tree, delete
the compiled and live directories, and then re-initialize everything
with s6-rc-init is awkward to say the least. Especially with the above
issues involving s6-rc-init and not being able to overwrite the
contents of directories if they exist.

That's everything I've found in an hour or two of messing around. I
haven't done anything with oneshots or larger dependency trees yet, so
far it's just been a few getty processes and some wrapper bundles.

Cheers!

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake


Preliminary version of s6-rc available

2015-07-11 Thread Laurent Bercot


 Hello,

 s6-rc is available to play with.

 There is still a huge piece missing: the s6-rc-update program,
which allows admins to update their current service database without
basically rebooting. This program is vital for distributions, for
instance: if a distribution was using s6-rc, a command such as
"apt-get upgrade" would need to call s6-rc-update after compiling a
new service database.
 But s6-rc-update is not trivial to write, I'm still wrapping my head
around a few design issues - I'm certain all those issues are solvable,
but it may take some time, and I'm already behind schedule.

 So I decided to publish what's already there, so you can test it and
give feedback while I'm working on the rest. You can compile service
definition directories, look into the compiled database, and run the
service manager. It works on my machines, all that's missing is the
live update capability.

 Later on, I plan to add support for things like instantiation, but
I'm not sure yet how to proceed and didn't want to add complexity in
the early stages.

 No official release yet, but you can download s6-rc via git:

 git://git.skarnet.org/s6-rc
 https://github.com/skarnet/s6-rc

 Enjoy,
 Bug-reports more than welcome: they are in demand!

--
 Laurent