execline changes

2013-11-01 Thread Laurent Bercot


 Hello,

 execline-1.3.0 will be out soon. Some changes will be made to the UI:

 * The deprecated execline program will be removed: the only launcher
will be execlineb.
 * The LASTPID and LASTEXITCODE environment variables will be replaced
with "!" and "?" respectively, for more consistency with the shell and
the general execline naming policy - and less namespace pollution. (See
the sketch below.) The old behaviour will still be available for some
time via a compilation switch, but it is strongly discouraged.
 * Substitution commands with -E | -e options will switch their default
behaviour to -e: the substituted value will be stored into an environment
variable, not on the command line.
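
 For instance, under the new scheme, reading a child's pid and exit code
goes like this (a minimal sketch; only the "!" and "?" names come from
the announcement above, the surrounding commands are illustrative):

   background { sleep 5 }
   importas pid !                # pid of the sleep, formerly $LASTPID
   foreground { kill -0 ${pid} }
   importas code ?               # exit code of kill, formerly $LASTEXITCODE
   echo ${code}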

 Please update your scripts accordingly.
 Thanks,

--
 Laurent


Re: ARM none gnueabi sysdeps

2014-01-22 Thread Laurent Bercot


 Hi Vincent,
 You want sysdeps, so I assume you're cross-compiling.
 The easiest way for you to get sysdeps is to actually compile skalibs in
a *native* ARM environment, and fetch the sysdeps from there. If you don't
have a development environment on your real target, qemu can definitely
help : you can use Aboriginal's native-compiler-arm* toolchain inside
a qemu disk to compile skalibs, and then look at the sysdeps file.

 I've stopped collecting sysdeps sets for different architectures,
because it's too much maintenance and unreliable ; and now that qemu is
widely deployed, packaged and everything, and you can set up a small
Aboriginal development environment on a virtual host in a matter of
minutes, there's just no reason anymore to keep a repository of sysdeps.

--
 Laurent


announce: s6-networking-0.0.4

2014-03-05 Thread Laurent Bercot


 Hi,
 s6-networking-0.0.4 is out.

 This release fixes a bug in s6-tcpserver-access and
s6-ipcserver-access, where user-supplied environment
was ignored.
 Thanks to Vallo Kallaste for the bug-report.

 http://skarnet.org/software/s6-networking/

 Enjoy,
 More bug-reports welcome.

--
 Laurent


[announce] s6-networking-0.0.5

2014-03-07 Thread Laurent Bercot


 Hi,
 s6-networking-0.0.5 is out.
 (Because you always find bugs *right after* submitting a release.)

 This release fixes a bug in s6-tcpserver-access which occasionally
caused improper TCPREMOTEHOST resolution.

 http://skarnet.org/software/s6-networking/

 Enjoy,
 Bug-reports still welcome.

--
 Laurent


[announce] skalibs-1.5.1, s6-portable-utils-1.0.3

2014-03-27 Thread Laurent Bercot


 Hello,
 skalibs-1.5.1 is out.
 It fixes a build bug when cross-compiling. Thanks to Vincent De Ribou
for the report.
 It also makes the default conf-cc not use debug mode, duh.

 http://skarnet.org/software/skalibs/


 s6-portable-utils-1.0.3 is out.
 It fixes a s6-mkfifo bug where the umask was incorrectly respected.
Thanks to Vallo Kallaste for the report.

 http://skarnet.org/software/s6-portable-utils/

 Enjoy,
 Keep those reports comin'.
 
--

 Laurent


Re: backtick -C

2014-04-03 Thread Laurent Bercot



On 04/03/2014 12:27 PM, Vallo Kallaste wrote:

Hi

I noticed that backtick -C does not work for preceding spaces, but
import -C does. Is it intentional?


 -C | -c | -d | -s only make sense when backtick is performing the 
substitution itself, i.e. with the -E option (which is deprecated).


 In your examples, backtick puts the value into the environment, and
you perform the substitution explicitly with import. Value
transformations are only done at substitution time, so backtick's -C
option doesn't do anything.
 I should probably print a warning or error message when value 
transformations are used without -E, but then again, -E is going away soon.
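
 In other words, the transformation options belong on the substituting
command (a minimal sketch, with hypothetical names):

   backtick VAR { cat data }    # raw output goes into the environment
   import -s -C -d " " VAR      # crunching happens here, at substitution time
   echo ${VAR}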




This isn't exactly related, but how can I get rid of preceding
space(s) or delimiters generally? Using pipeline with sed is one way
of course but... well, any better ways?


 It depends on what you're using the strings for, and your definition
of "better" :)
 Value transformations, including crunching, were made so that execline
could easily process words coming from files or programs' output; in
that context, it made sense to handle delimiters *after* words.


 I'd say the ways to handle preceding delimiters would be:
 - fix your input so you don't have them XD
 - if the input is meant to be split, as with multidefine or
forbacktickx, then just ignore the first word if it is empty
 - else, pipeline { sed ... } is the most generic way (see the sketch
below).
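
 A minimal sketch of the sed route (file and variable names are
illustrative):

   backtick -n VAR
   {
     pipeline { cat input } sed "s/^ *//"
   }
   import VAR
   echo ${VAR}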

--
 Laurent


Re: [announce] 2014 spring release

2014-05-14 Thread Laurent Bercot



Is  -DEXECLINE_OLD_VARNAMES still possible when compiling execline?
If so, maybe it should be mentioned (I find the old names more
readable, and besides there are scripts that would need to be changed,
always a possible source of mayhem when the scripts perform critical
functions!)


 It is still possible, and since it's just a bit of preprocessor work
and doesn't add to the code, I have no technical reason to remove it.
However, I prefer not to support it explicitly, because that would
create script incompatibilities. So it's still there, and it's
probably going to remain there, but it's a hack, and you should not
rely on it or distribute scripts that use that feature.



Maybe the new behaviour of s6-setuidgid should be optional, via some
command line flag? Note also that the first paragraph of Notes in
http://skarnet.org/software/s6/s6-envuidgid.html is now wrong.


 Ah, thanks for the report. Documentation updated.
 I don't think GIDLIST should be optional: a user's identity is not
only its uid and primary group id, it's also a list of groups the user
belongs to; without supplementary groups, Unix rights lose a lot of
power. The original daemontools utilities were lacking that, and I
simply hadn't noticed until I added myself to some group in
/etc/group and what I wanted to do didn't work because the processes
behind s6-setuidgid didn't pick it up.

 You can see it as a bugfix, not as an additional feature.



The documentation in http://skarnet.org/software/conf-compile.html
suggests that conf-home should not be touched; but the default makes
the existence of /package/... mandatory, which is somewhat unexpected
and shocking when using statically compiled binaries. A few comments
about this would make it easy to use in less standard setups.


 I'm afraid I don't understand your point. What are you trying to do
that requires manually setting conf-home ? How does that relate to
using slashpackage and to using statically compiled binaries ?

--
 Laurent


Re: slashpackage again

2014-05-16 Thread Laurent Bercot

Without a conf-home file, s6-svscan requires s6-supervise in
/package/admin/s6-1.1.3.1/command/
defined in s6-config.h as S6_BINPREFIX

Why isn't it enough to find the binary in /package/admin/s6/command/ ?
(This was what I asked hours ago)


 Because it's an intra-package dependency, not a cross-package dependency.
You want commands to call their dependencies from the same version of the
package, not from the version that happens to be the current one,
because the API could have changed in-between. Remember your issue with
execlineb and foreground being incompatible ? That's exactly the kind of
problem versioned paths are preventing.



Maybe setting conf-home to /package/admin/s6 would make it?


 That, again, would solve your immediate problem in a hackish way that is
not guaranteed to work.
 The right way would be to have /package/admin/s6 be a symlink to
/package/admin/s6-1.1.3.1, which would be the directory containing your
command symlink.
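
 A sketch of that layout, with the version from your example:

   /package/admin/s6-1.1.3.1/command/s6-svscan   # versioned: what binaries call
   /package/admin/s6 -> s6-1.1.3.1               # non-versioned convenience symlink
   /command/s6-svscan -> /package/admin/s6/command/s6-svscan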



I suppose I don't understand slashpackage...


 You are just trying to use it in an environment where it's not suited, so
of course it's making your life harder.
Slashpackage is about:
 - easy package management (having all your package data under a single
directory, easy versioning, etc.)
 - fixed full pathname guarantees, both versioned and non-versioned.

 For an initramfs, you need neither of those, and the framework, as light
as it is, is still too heavy for your needs. The price to pay is the forest
of symlinks.

--
 Laurent



Re: s6-portable-utils build problem

2014-05-27 Thread Laurent Bercot

In s6-portable-utils, why copy 'library.so/s6-memoryhog' when, I
think, it should be 'command/s6-memoryhog'?


 Actually, it should be /usr/libexec/s6-memoryhog .
 You must have changed conf-compile/conf-install-libexec to /library.so ;
this is wrong: library.so is for dynamically linked libraries, not
executables (even if .so files are indeed executables, they are generally
not supposed to be used as such).

 Unexported commands are a pain without slashpackage. FHS has no way
to specify that a command should be available to commands belonging to
the same package, but not externally. Historically, /usr/libexec was
used for those commands, but that's just a poor way of doing what
slashpackage is doing (guaranteed access paths to executables). Some
packages use the /usr/lib/$package hierarchy to store their data,
including unexported commands, and this is another poor way of doing
the same thing.
 The place where binaries are stored and the way they are accessed is
ultimately the admin's responsibility. Slashpackage helps. For people
who don't want to use it, conf-compile/conf-install-* are the files to
customize.

 Anyway, s6-memoryhog is undocumented and dangerous, I only wrote it
for testing purposes, and that's why it's unexported in the first place.
Maybe I should remove it entirely.

--
 Laurent



Re: s6-envdir scope

2014-08-22 Thread Laurent Bercot

On 22/08/2014 23:32, John Vogel wrote:

I'm hoping this is right and also all by design


 You are correct, and it is by design indeed, but not my design -
it's simply Unix.

 As a very rough rule of thumb, execline blocks represent processes.
A sequence of commands in the same block will run with the same PID;
the environment set with s6-envdir will then propagate to the end
of the block. A new process will be spawned to run an inner block,
and it will inherit that environment. And when a block ends, the
process dies, and the outer block, an ancestor, has no idea of the
environment that was set in the inner block.

 There are many exceptions to the block = process rule, but if
you're only concerned with environment scoping, it always behaves
according to that rule. When you set an environment variable, its
scope is always from the point where you set it to the end of the
current block, including all the inner blocks.

 There is *one* exception: ifthenelse -s. Here, environment will
leak out of the then-block or the else-block into the remainder part.
But it's such an ugly hack, so out of place with the rest of the
execline design, that I'm not even sure I should keep that option
available in 2.0.

--
 Laurent


Re: superstrip

2014-09-15 Thread Laurent Bercot

Hi Jorge,


Your site has superstrip.c as a single file, but I have superstrip-0.12sp,
which I think I downloaded from your site long ago. Can you clarify?


 superstrip.c is indeed a single file, and it did not justify the hassle
and overhead of fully packaging - either for me or for users. So I just
superstripped the thing down to the essentials :P

 gcc -I/package/prog/skalibs/include -L/package/prog/skalibs/library \
   -o superstrip superstrip.c -lstddjb
 should work.

--
 Laurent


Re: [skalibs-2.0][PATCH] posting new clean patch (from supervision lists)

2014-12-28 Thread Laurent Bercot


 Hi Toki,

 * Please don't send binaries to the list. If a file is too big for you
to send it, then put it on your favorite pastebin-like service and send
the URL instead.

 * Contrary to what you are saying, there is no problem with libdir -
I just tried again, to make sure. When you specify --libdir as an option,
the value you specify overrides the default. When you do not,
$prefix/usr/lib/<package name> is used instead, for whatever value of
$prefix you give. And this is true for every --*dir option.  There is
no hardcoded path in the configure scripts, everything is configurable.

 * The install.sh script is there for a reason, as well as the distinction
between dynlibdir and libdir. Please stop suggesting changes before you
understand why the design is as it is. If you have trouble understanding
some design choices, ask specific questions and I will answer them; then
we can discuss. But so far you have not shown me any need to change
anything.

 * To make it abundantly clear: the autoconf-generated installation
directories, as well as the pkgconfig format, *are not* well-designed.
Static libraries, which are build-time dependencies, and should be
packaged in a 'development' package with the header files, have no
business being installed in the same directory as shared libraries,
which are run-time dependencies. This confusion is one of the banes
of GNU, and one of the reasons why autotools and pkgconfig suck.  I am
not going to pay tribute to it, and it irritates me when people
suggest I do so, because it means they do not understand the point of
the skarnet.org project.

--
 Laurent



execline feature: import -u

2014-12-31 Thread Laurent Bercot


 I find myself writing "import VAR unexport VAR" all the time
in execline scripts, because some environment variables are just
used for substitution, and keeping them would only pollute the
rest of the script.

 For convenience, I have added a new -u option to import and
importas. import -u VAR substitutes VAR in the script and unexports
it at the same time. The option can also be used in multisubstitute
import directives.
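
 A minimal before/after sketch (prog and VAR are placeholders):

   # before
   import VAR
   unexport VAR
   prog ${VAR}

   # now
   import -u VAR
   prog ${VAR}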

 The feature is available in the latest git://git.skarnet.org/execline ;
please test it if you're interested (I've performed some basic tests
but might have missed some corner cases).

 And that wraps up 2014. Happy New Year !

--
 Laurent


Re: skalibs ./configure args of form VAR=VALUE ignored

2015-01-02 Thread Laurent Bercot

On 02/01/2015 21:22, Patrick Mahoney wrote:

In skalibs, ./configure --help says:

   To assign environment variables (e.g., CC, CFLAGS...), specify them as
   VAR=VALUE.  See below for descriptions of some of the useful variables.

Though specifying CC=something seems to have no effect.


 Ah, indeed. Thanks for the report.
 I removed the variable assignment on the command line because there needed
to be a specific line in the script for each single variable handled this
way; so, some variables would have an effect, and some would not. I didn't
like it, so I scrapped it all. But I forgot to update the --help message to
mention it. The --help message is now fixed in the current gits.



Of course, the easy workaround of exporting the desired vars before running
./configure does have the intended effect.


 Yes, the intention was to support *environment* variables all along.
Specific VAR=value treatment in the configure arguments isn't worth the loss
of clarity, I think, since putting variables in the environment - or, with
most shells, simply specifying VAR=value before ./configure on the command
line - is so easy.
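
 For instance, with any Bourne-like shell (values are illustrative):

   CC=arm-linux-gnueabi-gcc CFLAGS=-Os ./configure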

--
 Laurent



Re: skarnet software packaged in nixpkgs

2015-01-17 Thread Laurent Bercot

 Thanks Patrick!

--
 Laurent


Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:55, Olivier Brunel wrote:

But isn't the whole "anything >= 128 will be reported as 128, and
anything higher is actually 128+signum" also a convention that both need
to agree upon?


 Sure, but most commands exit < 128, so that's reliable enough, and it's
a lot easier to follow than the whole pipe shebang. It's much, much
simpler to exit with a given code than to write stuff to a pipe (what
do you do if it blocks ? what do you do if you're fd-constrained ?
what do you do if setting up the plumbing in the parent fails for
whatever reason ? etc. etc.)



Noting that shells do not actually clamp the exit code to 128.


 Indeed, but it comes at the price of uncertainty - you get
accurate information if you're lucky, and complete misinformation
if you're not. It works for shells most of the time because you
don't manually nest shells - it's much riskier for execline.



Just the difference shall probably be pointed out/documented.


 Definitely.

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:20, Peter Pentchev wrote:

[roam@straylight ~]$ perl -e 'die("foo!\n");'; echo $?
foo!
255


 I think you should be ok, for the same reason why a shell is ok:
if you're using Perl, you're most likely writing your whole script
with it, especially control flow and error/crash checking.
You're not playing with an inner interpreter reporting a code to an
outer interpreter. So the weird 255 should not be a problem in
practice.

 If I'm wrong and your use case precisely involves a perl script
running as P or C with G being an execline command, please mention
it! Just because I'd be curious. :)

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 11:58, Peter Pentchev wrote:

OK, so the "not using the whole range of valid exit codes" point rules
out my obvious reply - do what the shell does - exit 128 + signum.


 Well, the shell is happily ignoring the problem, but that doesn't mean
it has solved it. The shell reserves a few exit codes, then does some
best effort, hoping its invoked commands do not step on its feet.
It works because most commands will avoid exiting something > 125,
but it's still a convention, and most importantly, the shell itself
does not follow that convention (it obviously cannot!)
 So, something like sh -c 'sh -c foobar' does not report errors
properly: for 126 and 127, there's no way to know if the code belongs
to the inner shell or the outer shell, and for 128+, there's no way
to know if the inner shell or the foobar process got killed.

 Shells get away with it because when they're nested, it's usually
auto-subshell magic and users don't want to know about the inner
shell; but here, I'm trying to solve the problem for execline commands,
and those tend to be nested a lot - so I definitely cannot reserve codes
for the outer command, because the inner command may very well use the
same ones too.
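
 A quick illustration of the ambiguity with a typical shell (SIGHUP
is signal 1 on most systems):

   $ sh -c 'exit 129' ; echo $?
   129
   $ sh -c 'kill -HUP $$' ; echo $?
   129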



Now the question is, do you want to solve this problem in general, or do
you want to solve it for a particular combination of programs, even if
new programs may be added to that combination in the future, but only
under certain rules?  If it's the former (in general), then, sorry, I
don't have a satisfactory answer for you, and the fact that the POSIX
shell still keeps the exit 128 + signum behavior mostly means that
nobody else has come up with a better one, either (or it might be
available at least as some kind of an option).


 It just means that nobody cares about shell exit codes. Error handling,
if any, is done inside of shell scripts anyway; and in most scripts, a
random signal killing a running command isn't even something people think
about, and I'm sure there are hilarious behaviours hiding in dark corners
of very popular shell scripts, that fortunately remain asleep to this day.

 For execline, however, I cannot use the same casual approach. Execline
scripts live and die by proper exit code reporting, and carelessness may
lead to very obvious breakage.



Personally, I quite like the idea of some kind of a pipe (be it a
pipe(2) pair of file descriptors or an AF_UNIX/PF_UNSPEC socketpair or
some other kind of communication channel based on file descriptors),
even if it is only unidirectional:


 Oh, don't get me wrong, I'm a fan of child-to-parent communication via
pipes, and I use it wherever applicable. Unfortunately, the child may
be anything here, so I need something generic.

 Thanks for your input !

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:04, Olivier Brunel wrote:

I don't follow, what's wrong with using a fd?


 It needs a convention between G and P. And I can't do that, because
G and P are not necessarily both execline commands. They are normal
Unix programs, and the whole point of execline is to have commands
that work transparently in any environment, with only the Unix argv
and envp as conventions.



Cause that was my idea as well: return the exit code or 255.


 I was considering it for a while, then figured that the signal number
is an interesting piece of information to have, if G remotely cares
about C crashing. I prefer to reserve the whole range of 128+ for
"something went very wrong, most likely a crash at some point", and
if you get 129+, it was directly below you and you get the signal
number.



Though if you want shell compatibility you could also have an option
to return exit code, or 128+signum when signaled, and similarly one
would either be fine with that, or have to use the fd for full/complete
info.


 Programs that can establish a convention between one another are easy
to deal with. If I remember to document the convention (finish scripts
*whistle*)

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot


 I'm leaning more and more towards the following approach:

 - child crashed: exit 128 + signal number
 - child exited with 128 or more: exit 128
 - else: exit the child's exit code.

 Assuming normal commands never exit more than 127, that
reports the whole information to the immediate parent, and
correct information, if incomplete, higher up. That should
be enough to make things work in all cases.
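
 In C, for a parent that has just collected its child's wait status,
that policy reads (a minimal sketch; names are illustrative):

   #include <sys/wait.h>
   #include <unistd.h>

   void report_and_exit (int wstat)
   {
     if (WIFSIGNALED(wstat))
       _exit(128 + WTERMSIG(wstat)) ;  /* child crashed */
     else if (WEXITSTATUS(wstat) >= 128)
       _exit(128) ;                    /* clamp: something crashed below */
     else
       _exit(WEXITSTATUS(wstat)) ;     /* pass the exit code through */
   }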

 Thoughts ?

--
 Laurent


Re: s6, execline, skalibs in FreeBSD

2015-02-01 Thread Laurent Bercot


 Thanks Colin !

--
 Laurent


Re: Feature requests for execline s6

2015-01-26 Thread Laurent Bercot



- execline: I'd like the addition of a new command, e.g. readvar, that
would allow to read the content/first line of a file into a variable.
IOW, something similar to importas (including an optional default
value), only instead of specifying an environment variable one would
give a file name to load the value from (in a similar manner as to what
s6-envdir does, only for one var/file and w/out actually using
environment variable).


 How about
   backtick -n SOMETHING { redirfd -r 0 FILE cat }
   import -u SOMETHING
 ?

 Sure it's longer, but it's exactly what the readvar command would do.
You can even program the readvar command in execline. ;)

 If for some reason it's inconvenient for you, I can envision writing
something similar to readvar, but it *is* going to use the environment:
I've been trying to reduce the number of commands that perform
substitution themselves, so that passing state via the environment is
prioritized over rewriting the argv, and import is kind of the
one-stop-shop when you need substitution. I don't want to go back on
that intent and add another substituting command. Is it a problem for
you to use the environment ?
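
 For the record, a hedged sketch along those lines, with an illustrative
file name (head instead of cat if you really only want the first line):

   backtick -n VALUE { redirfd -r 0 /etc/hostname head -n 1 }
   import -u VALUE
   echo ${VALUE}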



- s6-log: I think a new action to write to a file could be useful.


 The problem with that is the whole design of s6-log revolves around
controlling the size of your logging space: no uncontrolled file
growth. Writing to a file would forfeit that.

 What I'm considering, though, is to add a "spawn a given program"
action, like a !processor but run every time a line is selected and
triggers that action. It would answer your need, and it would also
answer Patrick's need without fswatch. However, it would be dangerous:
a misconfiguration could uncontrollably spawn children. I'll do that
when I find a safe way to proceed.



Now, shouldn't those 2 simply be 1s, since the NUL is already
accounted for with sizeof? Or am I missing/misunderstanding something here?


 I'm looking, I'm thinking, and I can't find a good reason why those
shouldn't be 1s.
 ... That means... all this time, s6-supervise has been reading from an
invalid memory location (the byte after the end of /supervise/status) ?
Ouch. That one could have seriously hurt. Thanks for the bug-report.

 (And valgrind never said anything. Bad, bad valgrind.)

--
 Laurent



Re: Fwd: [skalibs-2.0][PATCH] posting new clean patch (from supervision lists)

2015-01-05 Thread Laurent Bercot

On 05/01/2015 23:01, Paul Jarc wrote:

Is there any autoconf-equivalent processing that needs to be done to a
fresh git clone, or is it already in the same state as a released
tarball?


 No processing necessary, fresh git clones should be usable as is.
Tarballs are just made from tagged git snapshots.

--
 Laurent


Re: tai confusion

2015-01-07 Thread Laurent Bercot

On 07/01/2015 08:40, Paul Jarc wrote:

I'm finally digging into a long-standing bug exhibited by runwhen
(rw-match computes a timestamp 10 seconds too early), and I think the
problem is in skalibs.  tai_from_sysclock() adds 10 seconds depending
on whether skalibs is configured with --enable-tai-clock.  But
tai_from_timeval doesn't, so they're inconsistent.


 Actually, they're not; what is inconsistent is the naming. I probably
should have paid more attention to that, and may change it in the
future (yay API changes).

 In tai_from_sysclock, "tai" means: what will be stored in that
structure is a real, absolute TAI time. It's the TAI time corresponding
to the sysclock time. It's the same whether the clock is TAI-10 (in
which case you simply add 10 seconds) or whether it's UTC (in which case
you add 10 seconds plus the leap seconds). The tai_from_utc function,
which tai_from_sysclock resolves to when --enable-tai-clock is not given,
does add the 10 seconds too.

 In tai_from_timeval, "tai" means: store the same information in a
tai_t. It's a simple format conversion function - the struct timeval
could hold anything, a TAI-10 time, a UTC time, or something else, as
long as it's absolute. No time conversion operation is done here.
Yes, it's confusing. My bad.



actually both wrong: the correct behavior for both should be to
unconditionally add 10 seconds, and conditionally add leap seconds
depending on --enable-tai-clock


 That is what happens for tai_from_sysclock.
 That is not what happens for tai_from_timeval, on purpose.

 I suspect your problem in runwhen is that you are calling
tai_from_timeval(), or any other simple format conversion function,
while expecting a time conversion function.
 You should always be using tain_now() to get the current time,
it will give you TAI no matter what your setup is. You should not
generally use tai_from_timeval() yourself.
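
 In code, getting the current time thus looks like this (a sketch
against the skalibs 2.x API):

   #include <skalibs/tai.h>
   #include <skalibs/strerr2.h>

   int main (void)
   {
     tain_t now ;
     PROG = "example" ;
     if (!tain_now(&now))
       strerr_diefu1sys(111, "read the system clock") ;
     /* now holds absolute TAI, whatever the clock configuration */
     return 0 ;
   }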



With a POSIX clock and no --enable-tai-clock, we need to add the
appropriate amount of leap seconds or else the tai_t values we
generate will differ from those simultaneously generated on a system
using TAI-10 and --enable-tai-clock.


 Yes, that is exactly what happens. When you call tai_sysclock(), the
TAI value you get is the same whether your clock is set to TAI-10 and
you have --enable-tai-clock, or your clock is set to UTC and you have
--disable-tai-clock.
 The tain_sysclock() function, which is what tain_now() normally
calls (unless you asked --enable-monotonic), goes like this:

* sysclock_get() gets the time from the system clock, no matter its
format, into a tain_t. tain_from_timespec or tain_from_timeval are
just struct conversions, they're time-agnostic.

* tai_from_sysclock() assumes that the time it is given comes directly
from the system clock in its native format, and converts it into TAI:
  - by adding 10 seconds if the system clock is TAI-10
  - by calling tai_from_utc() otherwise
+ tai_from_utc() adds 10 seconds plus the leap seconds

 so what you get in the end is always TAI.



 (This means that on a POSIX
system, converting future times to TAI will give you wrong results
after the time when the as-yet-unknown next leap second will be
added.)


 That's unfortunately unavoidable, and a limitation of POSIX.
Time arithmetic can only be performed correctly with linear time,
which TAI is and UTC is not. That is why skalibs uses TAI for all
its time computations. But with a UTC clock, you do need an accurate
leap second table to make the correct conversions, and if you're
computing past the last known leap second, tough luck.

 The alternative with a UTC clock, which basically every non-skalibs-
based system uses, is to perform time arithmetic with UTC, which
gives you wrong results whether far into the future or not, and they
simply don't care because it would be too hard. Thanks POSIX.

--
 Laurent



Re: Typo in http://skarnet.org/software/s6-portable-utils/upgrade.html

2015-01-08 Thread Laurent Bercot

On 08/01/2015 16:32, Vallo Kallaste wrote:

in 2.0.0.1
skalibs dependency bumped to 2.0.0.0
 ^^^ 2.1.0.0



 Fixed. Thanks.

--
 Laurent


Re: s6-tcpclient read/write FDs

2015-03-17 Thread Laurent Bercot

On 17/03/2015 16:52, Vincent de RIBOU wrote:

Hi all,
I assume that the separate read and write FDs (6 and 7) are only present
for compliance with other tools which produce 2 really different FDs.
But on TCP they are not really different.
I've made a TLS client over s6-tcpclient with wolfSSL. This lib takes
only 1 FD for its context. Since FDs 6 and 7 are the same link with
s6-tcpclient, I used FD 6 for ctx building and it works.
How should I use 2 really different FDs in that context?? May I use some
internal unix concept to mux 2 FDs on 1??


 You can simply ignore the second fd, which is just a copy of the first one.
 You can even close(7) at the start of your client and use 6 everywhere, it
will work.
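
 In an execline chain, for instance (the client program name is
illustrative):

   s6-tcpclient example.com 443
   fdclose 7
   my-tls-client

 my-tls-client then talks on fd 6 for both directions.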

--
 Laurent


Re: s6-devd, stdin stdout

2015-03-07 Thread Laurent Bercot

On 07/03/2015 18:37, Olivier Brunel wrote:

Hi,

I have a question regarding s6-devd: why does it set its stdin & stdout
to /dev/null on start?


 Hi Olivier,
 The original purpose of s6-devd was actually to emulate the behaviour
of /proc/sys/kernel/hotplug, using the netlink to serialize calls instead
of having them all started in parallel.
 A helper program called by /proc/sys/kernel/hotplug would start with
stdin and stdout - and even stderr - set to /dev/null. That's where the
redirection in s6-devd comes from.

 Changing that behaviour means that a program that's used with s6-devd
is not guaranteed to be usable as a /proc/sys/kernel/hotplug
helper if it performs I/O.

 But that's probably not important, so I can remove the stdout redirection,
it shouldn't be a problem - stderr is already non-redirected anyway.
 The stdin redirection, though, should stay: you don't want a hotplug
helper to depend on a state defined by streamed userland input.



Specifically, the doc for s6-setuidgid says:

If account contains a colon, it is interpreted as uid:gid, else it is
interpreted as a username and looked up by name in the account database.

This doesn't seem to be true (anymore?).


 Gah. I totally forgot about that change when rewriting s6-setuidgid.
Now that s6-applyuidgid exists, I really want to get rid of that
quirkiness... but it's my fault, so rather than removing the bit of
documentation, I'll reimplement the feature. Duh.

--
 Laurent



[announce] skalibs-2.3.2.0, s6-2.1.3.0

2015-03-13 Thread Laurent Bercot


 Hello,

 * skalibs-2.3.2.0 is out.
 It fixes bugs reported by altell.ru's static analyzer. Thanks Altell!
 It also adds the gid0_scan() macro.

 http://skarnet.org/software/skalibs/
 git://git.skarnet.org/skalibs


 * s6-2.1.3.0 is out.
 It features new options to s6-envuidgid.

 http://skarnet.org/software/s6/
 git://git.skarnet.org/s6

 Enjoy,
 More bug-reports welcome.

--
 Laurent


Re: NULL pointer dereference in skalibs's mininetstring_write()

2015-03-13 Thread Laurent Bercot

On 13/03/2015 15:50, Roman Khimov wrote:

11    if (!w)


 That one should be: if (!*w)



It's obvious that if 'w' is NULL there will be NULL pointer dereference on
line 19 or 20. What's not so obvious is how to properly fix that.


 Actually, w is never supposed to be NULL. Calling mininetstring_write()
with a NULL w is a programming error, and testing w instead of *w was
a bug.
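
 The pattern, in a hedged sketch (not the real skalibs prototype):

   #include <unistd.h>

   /* w points to caller-held progress state and must never be NULL;
      the state, not the pointer, is what gets tested. */
   static int write_with_state (int fd, char const *s, size_t len, size_t *w)
   {
     ssize_t r ;
     if (!*w)   /* correct; "if (!w)" was the bug */
     {
       /* start-of-record work goes here */
     }
     r = write(fd, s + *w, len - *w) ;
     if (r < 0) return 0 ;
     *w += r ;
     return *w == len ;
   }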

 Thanks for the report!

--
 Laurent



Re: [PATCH 0/7] static analysis fixes for skalibs

2015-03-13 Thread Laurent Bercot

On 13/03/2015 15:24, Roman I Khimov wrote:

Hello.

Here at Altell we daily pass all of our project's software (and that is kinda
whole distribution) through special 'static analysis' build that doesn't
actually produce any output other than reports from two (currently) tools:
cppcheck and Clang's scan-build.

As we've added skalibs into our project, we've immediately received reports
for skalibs. It's nice overall, but there are some things that can be fixed
or improved, thus this patch series for your review and (probably) merge.


 Hi Roman,

 Thanks a lot for the reports! This is very interesting. I'll try and see
if I can use cppcheck and scan-build myself in the future, and whether
they can detect errors like the ones you submitted.

 My comments on your patches:

 1/7: I incremented 's' for clarity, because that's what I always do in
scanning functions. Normally the compiler ignores the useless increments
and this does not worsen the resulting code.
 Do you think the increment actually takes away from clarity ? Or does clang
emit a warning about it ? (gcc does not.) I think it's harmless, but if you
disagree, I don't really care either way.

 2/7: applied.

 3/7: applied.

 4/7: I've actually tried going the opposite way lately: reducing the amount
of parentheses I'm using. I think it's better to ensure your C programmers
know their operators' precedence than to defensively add parentheses whenever
there's uncertainty. Uncertainty is a bad thing - if you're not sure, read
the language spec. Besides, usually, the language's precedence makes sense.
So I'm not going to apply that one; is there a way to silence that type of
report in the static analyzer ?

 5/7: applied.

 6/7: I'm surprised your tools detected that one, but not the zillion other
cases in skalibs. There are lots of functions that do not modify their
arguments but do not declare them as const... I basically never bother using
const qualifiers in function arguments - force of habit; and I think compilers
are able to themselves deduce the const status of those arguments, so the
code isn't worse for it. At some point, when OCD overcomes laziness, I may
make a complete pass over all of my code and fix that, but I don't think
it's needed, and changing it in one function only doesn't really make sense.

 7/7: applied.

 I'll commit when I've made sense of the mininetstring_write thing. ;)

 Thanks again!

--
 Laurent



Re: [PATCH 0/7] static analysis fixes for skalibs

2015-03-13 Thread Laurent Bercot

On 13/03/2015 16:47, Roman Khimov wrote:

Both scan-build and cppcheck complain here. Sure, it's not an error, just a
harmless dead code, but well, tools don't like dead code and I personally
don't like it either, so IMO it's better to drop it if there are no valid
reasons for it to stay.


 Fine, I removed it. *shrug*



Speaking of dead code, cppcheck also sees some in src/sysdeps/trycmsgcloexec.c
and src/sysdeps/trygetpeerucred.c, but from what I see those are currently
stubs, so I didn't touch them.


 Yes, some of the code in src/sysdeps/ is not supposed to be run, but
only compiled and/or linked. It's there to test for features of the host
system.



It's purely stylistic thing, so if you as a git master owner think it doesn't
make sense, I'm fine with it.


 Oh, it makes sense, but I don't like this approach. It smells too much of
defensive programming, in which you do things "just to be sure". Well,
"when in doubt, add parentheses" is the wrong approach to me; the right
approach is "when in doubt, RTFM and remove the doubt".



Well, this one is from me personally: while fixing 5/7 and 7/7 I wasn't
sure that nothing changes 'n', because child_spawn() is not a 10-line
function and 'n' is not fun to search for. Making it const easily ensures
that 'n' is the same 'n' in the error handler as it was at the beginning
of the function.


 Oh, OK. I understand now. And you're right, n isn't modified in child_spawn().

 Fixes committed, new release ready. Thanks again!

--
 Laurent


[announce] Minor releases

2015-03-30 Thread Laurent Bercot


 Hello,

 A series of small releases.


 * skalibs-2.3.3.0
   ---------------

 - A bugfix in buffer_get, which returned an error on a short read
instead of simply returning the number of bytes read. (For errors
on short reads, buffer_getall() is where it's at.)
 - A sha512 implementation, skalibs/sha512.h

 http://skarnet.org/software/skalibs/
 git://git.skarnet.org/skalibs


 * execline-2.1.1.1
   ----------------

 - The execline parser is now a library function, el_parse().

 http://skarnet.org/software/execline/
 git://git.skarnet.org/execline


 * s6-dns-2.0.0.3
   --------------

 - A bugfix in s6dns_engine that sometimes performed a
double close in case of a read error.

 http://skarnet.org/software/s6-dns/
 git://git.skarnet.org/s6-dns


 * s6-networking-2.1.0.1
   ---------------------

 - A regression fix: s6-tcpclient and s6-tcpserver-access did not
read /etc/resolv.conf, leading to incorrect DNS resolution.

 http://skarnet.org/software/s6-networking/
 git://git.skarnet.org/s6-networking


 GitHub is currently the target of a DoS attack (apparently from
the Chinese censorship authorities); I had trouble pushing the
changes to GitHub. Don't be surprised if you have trouble pulling
from it. When in doubt, always pull from git.skarnet.org, which
probably won't be under a political Chinese attack in the foreseeable
future.

 Enjoy,
 Bug-reports welcome.

--
 Laurent


[announce] execline-2.1.1.0, s6-portable-utils-2.0.3.0

2015-03-03 Thread Laurent Bercot


 Hello,

 * execline-2.1.1.0 is out.
 It adds a new command: forstdin, which splits its standard input and
spawns a program for every element.
 The forbacktickx command is now a wrapper around pipeline and forstdin.

 http://skarnet.org/software/execline/
 git://git.skarnet.org/execline


 * s6-portable-utils-2.0.3.0 is out.
 It adds a new command: s6-dumpenv, which dumps its whole environment
into an envdir.

 http://skarnet.org/software/s6-portable-utils/
 git://git.skarnet.org/s6-portable-utils


 Enjoy,
 Bug-reports welcome.

--
 Laurent


GitHub mirrors

2015-02-28 Thread Laurent Bercot


 I finally caved in and set up GitHub mirrors for all the
skarnet.org packages.

 https://github.com/skarnet

(Yes, the picture is ugly. I may get a better one in a few
months. :P)

 So, if you wanted a web interface to browse the source, here
you go. I don't like GitHub much, but if it saves me headaches
with cgit, all the better.

--
 Laurent


Re: s6 readiness support

2015-02-25 Thread Laurent Bercot

On 25/02/2015 22:29, Patrick Mahoney wrote:

The loopwhilex keeps the pump primed, so to speak, so /service/s can
be stopped and started many times with readiness reporting working.
Otherwise, I'd need to restart /service/s/log as well as /service/s.

On the other hand, I have mostly idle backtick and head commands hanging
around.


 How about

   pipeline -w
   {
     cd ..
     forbacktickx -d "\n" i { cat } s6-notifywhenup -f echo
   }
   ...

 ?

 If you don't mind using a shell, you can even have a single shell
lying around instead of both a forbacktickx and a cat process:
   /bin/sh -c 'while read i ; do s6-notifywhenup -f echo ; done'

 (But really, "s6-ftrig-notify event U" is less hackish than
"s6-notifywhenup echo".)

 And this need for a cat process makes me think forbacktickx is
badly designed. It should parse its own stdin instead of spawning
a command; forbacktickx functionality can be achieved by combining
the parse-stdin program with pipeline.

--
 Laurent



Re: wait but kill if a max. time was exceeded

2015-04-23 Thread Laurent Bercot

On 23/04/2015 17:41, Gorka Lertxundi wrote:

I have a very simple question, is it possible in execline to wait up to a
maximum amount of time to finish a background program execution? And if it
didn't finish, kill it forcibly?


 Does this help ?
 http://skarnet.org/software/s6-portable-utils/s6-maximumtime.html

--
 Laurent


s6-rc design ; comparison with anopa

2015-04-23 Thread Laurent Bercot


 So, I've been planning to write s6-rc, a complete startup/shutdown script
system based on s6, with complete dependency management, and of course
optimal parallelization - a real init system done right.
 I worked on the design, and I think I have it more or less down; and I
started coding.

 Then Olivier released anopa: http://jjacky.com/anopa/

 anopa is pretty close to my vision. It's well-designed. It's good.
There *are* essential differences with s6-rc, though, and some of them
are important enough that I don't want to immediately stop writing s6-rc
and start endorsing anopa instead.

 This post tries to explain how s6-rc is supposed to work, and how it
differs from anopa, and why I find the differences important. What I hope
to achieve is a design discussion, with Olivier of course, but also other
people interested in the subject, on how an ideal init system should work.

 My goals are to:
 - reach a decision point: should I keep writing s6-rc or drop it ?
Dropping it can probably only happen if Olivier agrees on making a few
modifications to anopa, based on the present discussion, but I don't
think it will be the case because some of those modifications are
pretty hardcore.
 - if I keep writing s6-rc: benefit from this discussion and from
Olivier's experience to avoid pitfalls or designs that would not stand
the test of real-life situations.

 So, on to it.


 Three kinds of services
 -----------------------

 Like anopa, s6-rc works internally with two kinds of services: longrun,
which is simply defined by a service directory that will be directly
managed by s6, and oneshot, which is defined by a directory containing
data (a start script, a stop script, and some optional stuff).

 s6-rc allows the user to provide a third kind of service: a bundle.
A bundle is simply a set of other services. Starting a bundle means
starting all the services contained in the bundle.
 A bundle can be used to emulate a SysV runlevel: the user can put all
the services he needs into a single bundle, then tell s6-rc to change
the machine state to exactly that bundle.
 Bundles can of course contain other bundles.

 A oneshot or a longrun is called an atomic service, as opposed to a
bundle, which is not atomic.
 Bundles are useful for the user, because oneshot and longrun are
often too small a granularity. For instance, the Samba service is made
of two longruns, smbd and nmbd, but it's still a single service. So,
samba would be a bundle containing smbd and nmbd.

 Also, the smbd daemon itself could want its own logger, smbd-log.
Correct daemon operation depends on the existence of a logger (a daemon
cannot start if its logger isn't working). So smbd would actually be a
bundle of two longruns, smbd-run (which is the smbd process itself) and
smbd-log (which is the logger process), and smbd-run would depend on
smbd-log.

 Users who want to start Samba don't want to deal with smbd-run, smbd-log,
nmbd-run and nmbd-log manually, so they would just start samba, and
s6-rc would resolve samba to the proper set of atomic services.
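
 For illustration - the exact source format below is an assumption, as
this post predates the final one - a bundle definition can be as simple
as a directory:

   samba/
     type       # a file containing the single line: bundle
     contents   # a file listing smbd and nmbd, one per line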


 Source, compiled and live
 -------------------------

 Unlike anopa, s6-rc does not operate directly at run-time on the
user-provided service definitions. Why ? Because user-provided data is
error-prone, and boot time is a horrible time for debugging. Also, s6-rc
uses a complete graph of all services for dependency management, and
generating that graph at run-time is costly.

 Instead, s6-rc provides a s6-rc-compile utility that takes the
user-provided service definitions, the "source", and compiles them into
binary form in a place in the root filesystem, the "compiled".

 At run-time, s6-rc ignores the source, but reads its data from the
compiled, which can be on a read-only filesystem. It also needs a
read-write place to maintain information about its state; this place is
called the "live". Unlike the compiled, the live is small: it can reside
in RAM.

 The point of this separation is multifold: efficiency (all checks,
parsing and graph generation performed at compile-time), safety (the
compiled can be write-protected), and clarity (separation of user-
modifiable data, current configuration data, and current live data).

 Atomic services can be very small. It can be a single line of shell
for a oneshot, for instance. I fully expect package developers to
produce source definitions with multiple atomic services (and dependencies
between those services) and a bundle representing the whole package.
I expect the total number of atomic services on a typical reasonably
loaded machine to be around a thousand. Yes, it can grow very fast -
so having a compiled database isn't a luxury.


 Run-time
 --------

 At run-time, s6-rc only works in *stage 2*.
 That is important, and one of the few things I do not like in anopa:
stage 1 should be completely off-limits to any tool.

 s6-rc only wants a machine with a s6-svscan running on a scandir. It does
not care what happened before. It does not care whether s6-svscan is

Re: s6-rc design ; comparison with anopa

2015-04-23 Thread Laurent Bercot

On 23/04/2015 23:26, Joan Picanyol i Puig wrote:

I'd really expect a ui that can diff compiled & live vs. source (and
obviously, to inspect compiled & live).


 There will definitely be a ui to inspect compiled + live.

 As for diffing the current state vs. source, I think it will be too
complex, because it would amount more or less to performing the work
of the compiler again. What can be done is something that compares
two compiled databases, so to diff against the source, you would
compile the source into a temp database then compare the temp to the
current compiled.



I find ./down so convenient that I would like to have support for it in
the source format.


 The thing is, s6-rc already makes use of ./down internally. When you
run s6-rc -d this-longrun-service, it first brings down everything
that depends on this-longrun-service, then creates ./down in
this-longrun-service's service directory in live, then calls
s6-svc -d on this-longrun-service. It says that the service is supposed
to be down and remain that way, because that is the state you want to
see enforced.
 s6-rc is a global state manager. If you use it, you delegate all your
service management to it.



Any heuristics will face unsolvable situations. I'd aim at getting the
"patch" (dual of "diff" above) action right all the time first.


 That can be done, but with or without heuristics, there will still need
to be a tool to actually update the live state. "diff" is easier than
"patch"; the details of "patch" are what I'm interested in.

--
 Laurent



Re: wait but kill if a max. time was exceeded

2015-04-24 Thread Laurent Bercot

On 24/04/2015 13:28, Peter Pentchev wrote:

Oof, thanks a LOT for taking away the opportunity for me to advertise
http://devel.ringlet.net/sysutils/timelimit/ :P


 Sorry about that. :P
 It's not a very original idea anyway. busybox timeout, for instance,
does the same thing. I'm sure there are plenty of other implementations
too.

--
 Laurent


Re: s6-rc design ; comparison with anopa

2015-04-26 Thread Laurent Bercot

On 25/04/2015 21:38, Colin Booth wrote:

This actually brings up another question: is there any provision for
automatic bundling? If sshd requires sshd-log and a oneshot to create a
chroot directory, does s6-rc-compile also create a bundle to represent
that relationship, or do we need to define those namings ourselves? This
is the inverse of my question about loggers being implicit dependencies
of services.


 I'll probably add an automatic bundling feature for a daemon and its
logger; however, a oneshot to create a chroot directory? That's too
specific. That's not even guaranteed portable :) (chroot isn't in Single
Unix.) If you want a change from the default daemon+logger configuration,
you'll have to manually set up your own bundle.



It depends on the person's setup. In my case it's a user-supplied
dependency, since there's nothing intrinsic to dnsmasq or hostapd that
requires them to run together, which I currently get around by polling
for dnsmasq's status from within the hostapd run script. All in all it's
a semantic difference between dependencies that are needed to start
(dependencies), and dependencies that are needed for correct functioning
but are not needed to run (orderings). The nice part is that while there
is a slim difference between the two, all the mechanisms for handling
dependencies can also handle orderings, as long as the dependency tracker
handles user-supplied dependencies. Handling user-supplied dependencies
also simplifies the system conceptually, since people won't have to track
multiple types of requirements.


 What do you mean by "user-supplied dependencies" ? Every dependency is
basically user-supplied - in the case of a distro, the user will be the
packager, but s6-rc won't deduce dependencies from a piece of software
or anything: the source will be entirely made by people. So I don't
understand the distinction here.



One last thing and I'm not sure if this has been mentioned earlier, but
how easy is it to introspect the compiled form without using the
supplied tools? A binary dumper is probably good enough, but I'd hate to
be in a situation where the only way to debug a dependency ordering
issue that the compiler introduced is from within the confines of the
s6-rc-* ecosystem.


 The s6-rc tool includes switches to print resolved bundles and resolved
dependency graphs. I will make it evolve depending on what is actually
needed. What kind of functionality would you like to see ?
 (There is also a library to load the compiled form into memory, so writing
a specific binary dumper shouldn't be too hard.)

--
 Laurent



Re: s6-rc design ; comparison with anopa

2015-04-26 Thread Laurent Bercot

On 25/04/2015 11:24, Joan Picanyol i Puig wrote:

What I'd like is the ability to have some services ready-to-run, but not
up by default. Some of them might be there for contingency purposes (so
that an operator can start a failover), some of them might have to go up
(and down) at certain times only.


 If a service S doesn't need to be up for other services to run, then
simply don't set dependencies on S. Don't include S in your main bundle
of services. Start your main bundle; then, when you want to start S,
s6-rc -u S. When you want to stop it, s6-rc -d S. Since nothing
depends on S, s6-rc -d S will only stop S.
 While S is considered down from the s6-rc standpoint, its service
directory will include a ./down file.



I lack the knowledge & experience to attempt to provide details, so I'll
just handwave, pointing out that the first concept to have clear is that
of identity: is this servicedir a new service definition or a
modification of an existing one? It should then be feasible to compute
the modifications needed to the live DAG (inserting/removing nodes as
well as restarting them).


 Yes. Identity is simply defined by the service name. The hard part is
what to do when dependencies change for a same name.

--
 Laurent


Re: s6-rc design ; comparison with anopa

2015-04-24 Thread Laurent Bercot

On that note, one thing you've apparently done/planned is auto-stopping,
whereas there is no such thing in anopa. This is because I always felt
like while auto-starting can be easily predictable/have expected
behavior, things aren't the same when it comes to stopping.

That is, start httpd and it will auto-start dependency php which will
auto-start dependency sqld; fine. But I'm not sure stopping sqld should
stop php, despite the dependency. Maybe one just wants to shut down
sqld, not the whole webserver.


 If php depends on sqld, it means php isn't functional when sqld is down.
So, if you shut down sqld without shutting down php, your webserver will
be unreliable.
 If you really want to do that, bypass s6-rc and run
s6-svc -d $live/servicedirs/sqld. s6-rc won't notice when you manually
down a service. You can keep it down, or manually bring back up later.

 It's a case of "if you do this, you're supposed to know what you are
doing", and the tool won't try to change it behind your back.



Then there's also the case of, imagine you start foobar & web. web is a
bundle with httpd, php & sqld; foobar is a service w/ a dependency on
sqld. Now you stop web; should sqld be stopped? What about foobar? What
is the expected behavior here?


 If you stop web, it means you stop httpd, php and sqld, and also foobar
since foobar depends on sqld. If you don't want to do that, don't stop web:
stop httpd and php instead. You'll keep sqld and foobar.

 Better, make a bundle web = httpd+php and web+sqld = httpd+php+sqld.
Start web+sqld, stop web. sqld and foobar are untouched.

 Bundles are great. You can define a bundle for every possible combination
of services if you're so inclined (it's still 2^n, so don't overdo it, but
yeah). Bundle resolution is fast: it's a name lookup in a directory. With
ext4, you could have a million bundle names before you'd start noticing
slowdowns in resolutions.



I'm not sure there's one, different
people/scenario will have different expectations... it always seemed too
complicated so I went with no auto-stopping at all.


 I don't think auto-stopping is any more complex than auto-starting. It's
exactly symmetrical. The difficult part is designing the right dependency
graph, and that's a job for packagers. Yes, they will screw it up, but
please let's fix the world one step at a time.



Just so I understand: why are you talking about smbd-log as a separate
service, and not the logger of service smbd, as usual with s6? Or is
that what you're referring to here with smbd-log?


 A service and its logger are defined as separate longrun services for
s6-rc. The compiler will recognize that smbd-log is a logger for smbd-run
and create the proper service directory with a log/ subdirectory, and
register smbd-run as the service directory for the daemon and
smbd-run/log as the service directory for the logger. But in the
compiled database, those are two different services.

 It makes sense, because unlike systemd, we don't want to start daemons
before their loggers are operational. And loggers are not a given: they
may depend on the oneshot service mount /var/log, for instance. It would
be silly to have the daemon itself depend on /var/log.



I'm not sure where you put the scandir in this? I believe original
servicedirs where taken from the source and put into the compiled, and
s6-rc will create the actual servicedirs in the scandir with s6-rc-init,
correct? So that would be part of live?
How about the oneshot scripts?


 s6-rc-init is supposed to run in stage 2, so it assumes there is an
operational scandir already (probably empty or almost empty). You give
the location of the scandir as an argument to s6-rc-init.
 s6-rc-init creates live, makes $live/scandir a symlink to the scandir
given in argument (so s6-rc always knows how to find it), and creates
all service directories under $live/servicedirs with down files. Then
it symlinks them all into the scandir and calls s6-svscanctl -a.

 The oneshot scripts are read-only, they don't need anything in live,
so they're a part of compiled.
 Of course, there's a $live/state file that maintains the state of
all atomic services, oneshots as well as longruns.



Also, with anopa I wanted to fill what's missing in s6, and that
included a full init system, so stage 1. You originally said s6-rc was
meant to be a full init system, but now you're saying you have another
tool/package in mind for that, s6-init. That's fine, but with anopa I
wanted the whole thing, yes.


 Which is great! Making stage 1 more accessible is arguably more urgent
or important than managing services, but it's also harder to do in a
portable way, that's why I started working on s6-rc first. It's more or
less the only reason.



One thing though, your compilation process only does copy servicedirs,
or is there more (besides the whole packing them into a binary form
alongside dependency graphs)?


 It actually creates service directories depending on the information
that has been given in the 

Re: s6-rc design ; comparison with anopa

2015-04-25 Thread Laurent Bercot

On 25/04/2015 09:35, Colin Booth wrote:

I've been having a hard time thinking about bundles the right way. At
first they seemed like first-class services along with longruns and
oneshots, but it sounds more like they are more of a shorthand to
reference a collection of atomic services than a service in their own
right. Especially since bundles can't depend on anything it clarifies
things greatly to think of them as a shorthand for a collection of
atomic services instead of a service itself.


 Yes, that's exactly what a bundle is: a shorthand for a collection of
atomic services. Sorry if I didn't make it clear enough.



As long as A depends on B depends on C, if you ask s6-rc (or whatever)
to shutdown A, the dependency manager should be able to walk the graph,
find C as a terminal point, and then unwind C to B then finally A. While
packagers will screw up their dependency graph, they'll screw it up (and
fix it) in the instantiation direction.


 If A depends on B depends on C, and you ask s6-rc to shutdown A, then it
will *only* shutdown A. However, if you ask it to shutdown C, then it will
shutdown A first, then B, then C. For shutdowns, s6-rc uses the graph of
reverse dependencies (which is computed at compile time).
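
 To illustrate, with hypothetical invocations in the shorthand syntax
used in this thread:

   s6-rc -d A     # brings down A, and nothing else
   s6-rc -d C     # brings down A, then B, then C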



Will s6-rc make loggers implicit dependencies of services, or will we
need to define that? In other words, if we have a bundle 'ssh-daemon' that
contains the longruns sshd and sshd/log, will the dependency compiler
correctly link those so that when you ask the bundle to start it brings
up sshd/log first, and when you ask it to stop it brings down the
logger last.


 Unless there's a compelling reason not to, if there's a sshd service
and a sshd-log service and there's an annotation somewhere in the
definition of sshd or sshd-log that sshd-log is the logger for
sshd, then s6-rc-compile will automatically create a dependency from
sshd to sshd-log, and ensure that $live/servicedirs/sshd is the service
directory for sshd and $live/servicedirs/sshd/log is the service
directory for sshd-log.
 So yes, when you start the ssh-daemon bundle, s6-rc will start the
logger first and then the daemon, or stop the daemon first and then the
logger.
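
 As a sketch, the source definitions could look like this - the name
of the annotation file below is a placeholder, since that part of the
format isn't settled yet:

   sshd/type          (contains: longrun)
   sshd/run           (the daemon's run script)
   sshd-log/type      (contains: longrun)
   sshd-log/run       (the logger's run script, execing s6-log)
   sshd-log/logger-of (placeholder annotation, naming sshd)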



Are oneshots assumed (required) to be idempotent? Or does $live/state
track if a oneshot has already been fired and no-op if that is the case?


 $live/state tracks oneshots, of course. :)



Not to be too pedantic, but how often do daemons ask for user input? I
don't know about you, but being required to pass user input on boot is a
big no-no in my book.


 Tell that to Olivier. :)
 ISTR some encrypted filesystems require a passphrase to be given at
mount time, so this is a real use case. I don't intend to add terminal
support to s6-rc; if the problem comes up, i.e. if several terminal-
using services are started in parallel and conflicts arise, I'll think
of a specific solution in time.



So components of a bundle can fail and the bundle is still considered to
be functional? This makes sense only if bundles are really tag sets or
some other loose grouping.


 Yes, that's what they are. The atomic services that can be started are
still started; however, the "s6-rc -u bundle" invocation will exit
nonzero, since some atomic services failed. It is then possible to use
some UI to list running atomic services and see what has succeeded and
what has failed.



If you have a wireless router running hostapd (to handle monitor mode on
your radios) and dnsmasq (for dhcp and dns) you're going to want an
ordering where dnsmasq starts before hostapd is allowed to run. There
isn't anything in hostapd that explicitly requires dnsmasq to be running
(so no dependency in the classic sense) but you do need those started in
the right order to avoid a race between a wireless connection and the
ability to get an ip address.


 Hmmm.
 If hostapd dies and is restarted while dnsmasq is down, the race condition
will also occur, right ?
 Since hostapd, like any longrun process, may die at any time, I would
argue that there's a real dependency from hostapd to dnsmasq. If dnsmasq
is down, then hostapd is at risk of being nonfunctional. I still doubt
ordering without dependency is a thing.



Shouldn't this be handled prior to Stage-2 handoff? At the tail-end of
Stage-1 you try to start a getty along with the catch-all logger. If
that fails we bring up our debug shell. Assuming it doesn't fail,
Stage-2 starts, s6-rc does its work, and brings up the `gettys' bundle
which no-ops on getty-1 (already started) and brings up 2-N. As long as
s6-rc is constrained to Stage-2 and multiple start attempts on a service
remain idempotent, we should be able to use the same shell escape-hatch
mechanisms in the early stages without having to add extra logic
handling to the application.


 Yes, for the initial getty or debug shell, this can be directly handled
in stage 1. For other conditional executions, I think conditionally
running different s6-rc invocations will be enough. In any case, it can
be done 

github and dropbear

2015-04-22 Thread Laurent Bercot


 Since a few days ago (but I haven't tried committing anything for a
long time before that, so I'm not sure when it started) I've had
trouble pushing commits to the github mirror of my packages.
I push via git over SSH, with the dropbear SSH client, dbclient,
which reports:
 
dbclient: Connection to git@github.com:22 exited: Integrity error


i.e. it interprets github packets as corrupted ones.

I can't pull from github either: same symptoms.

 Could anyone who uses github and dropbear (I figure there are
such people on this list :)) try pulling from or pushing to
github and tell me if they experience the same, or if it is
just me ?
 Could anyone using git over SSH with another SSH client try
pulling from or pushing to github and tell me if it works for
them ?

 Thanks,

--
 Laurent


Re: s6-rc design ; comparison with anopa

2015-04-27 Thread Laurent Bercot

On 27/04/2015 07:59, Colin Booth wrote:

OpenSSH, at least on Linux and *BSD, chroots into an empty directory
after forking for your login. That was an example but I think the
question is still valid: if you have a logical grouping of longrun foo,
longrun foo-log, and a oneshot helper bar, where foo depends on foo-log
and bar, does s6-rc automatically create a bundle contaning foo,
foo-log, and bar?


 No, because it cannot tell where the logical grouping will end. Chances
are that bar, or foo, or foo-log, depends on something else; you don't
run services, or even service groups, in a void - see below.



In hindsight, this question could probably have been better asked as
follows: does s6-rc automatically create bundles for each complete
dependency chain?


 No, and not only because of the naming - just because it would not make
sense to shutdown such a bundle.
 If foo-log depends on a mountvarlog oneshot that mounts /var/log
because foo-log logs into /var/log/foo, and mountvarlog depends on a
mountvar oneshot, then an automatic foo-bundle would include foo,
foo-log, mountvarlog, and mountvar. Shutting down foo-bundle would
shut down everything in the bundle, so it would bring down everything
that depends on /var and /var/log, then unmount /var/log, then unmount
/var. That is probably not what you want. :)

 A bundle is only a good idea when it makes sense to bring all its
contents up or down at the same time. This is the case for a daemon and
its logger; this is the case for a runlevel. This is not the case for a
complete dependency chain, which you don't need to bundle since s6-rc
will properly act on dependencies anyway.



It's mostly a distinction of "does service foo start but maybe not
do the right thing if bar isn't running" (httpd and php-fpm) vs. "does
service foo crash if bar isn't running" (basically everything that
depends on dbus).


 Call me uncompromising, but I don't think "start but maybe not do the
right thing if bar isn't running" is a useful category. Either foo can
work without bar or it cannot; the unwillingness to make a decision
about it is unjustified cowardice - uncertainty and grey areas bring
nothing but suckiness and pain to a software system.

 If the user decides that foo can work without bar, then fine: he won't
declare a dependency from foo to bar. If he decides otherwise, he will
declare such a dependency. But s6-rc will not provide support for
wishy-washiness.



Also, by no means was I trying to imply that s6-rc should deduce
anything. If anything I was saying that, as an SA, being able to take
implicit dependencies that mostly exist in the form of polling loops in
run scripts and other such hackery (such as my wireless setup) and
rewrite them as explicit dependencies for the state manager to manage
sounds great and is probably the part of s6-rc that I'm most excited
about.


 Well, yes, that's the only reason why I think a real state manager can
be useful :)



Low-surprise interoperability with standard unix tools mostly. Assuming
the compiled format isn't human readable, having the functionality to do
a human-readable dump to stdout (so people can diff, etc) is totally
fine.


 Yes, that's planned.



If we can hit the compiled db directly with the same tools and get
meaningful results, than all the better.


 The db is in binary form, so you'll need the dumping tool.

--
 Laurent



Re: [PATCH] devd: Fix invalid option used for s6-uevent-listener

2015-04-27 Thread Laurent Bercot

 Ah, good catch. Patch applied, thanks. It's available in the
current git.

 Note that I still can't push to github because they've broken
their sshd's compatibility with dropbear. I've reported the
issue, but it hasn't been fixed yet. Until they fix it, the
GitHub mirror for skarnet.org packages will be stale.
 
--

 Laurent



Re: Very basic question, regarding redirects

2015-05-11 Thread Laurent Bercot

On 11/05/2015 13:52, Scott Mebberson wrote:

I'm working on an addition to the s6-overlay project. I want to make it
super easy to create environment variables within a Docker container.


 IIRC, /var/run/s6/container_environment is meant to hold the variables
that the container is actually started with; outside interaction with
the container may rely on that (i.e. commands started with with-contenv
may not work correctly if that environment has been modified). I'm not
sure it's safe to do that.

 But if that's what you want, you can just type

 redirfd -w 1 /var/run/s6/container_environment/env_var_name
 s6-echo -- env_var_value

and you're done. I'm not sure it justifies a specific script to
do that, unless you want to add some more processing.
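
 (To check the result from inside the container, something like

  with-contenv env

should then show env_var_name - assuming the image provides an env
binary, which the overlay itself cannot rely on, but an interactive
check can.)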

 Be aware that with-contenv reads the environment variables
verbatim from the files, so you have to manually strip newlines and
otherwise sanitize env_var_value before storing it.



 tr a-z A-Z


 You can't do that in the overlay, because there's no guarantee
that every image will have a tr binary. That's the reason why
everything in the overlay is written in execline and uses
s6-* binaries: no external dependencies at all, so it works with
every possible image.
 Just trust the user to use the correct filename. Also, it is valid
to use lowercase in environment variables - upper case is just a
convention used to work around a deficiency of the shell, i.e. treating
internal shell variables and environment variables the same way (which
makes it harder to tell the difference, that's why people usually
reserve full upper case for the environment).



But I can't get it to work. I guess redirection isn't supported? I've got
no idea really, I'm lost!


 http://skarnet.org/software/execline/redirfd.html :)

 Good luck,

--
 Laurent



Small spring releases

2015-05-05 Thread Laurent Bercot


 Hello,

 Just a few small releases to keep you waiting until s6-rc is ready. :P


 skalibs-2.3.4.0
 ---------------

 Cleanups and bugfixes. New stat_at() and lstat_at() functions.


 http://skarnet.org/software/skalibs/
 git://git.skarnet.org/skalibs


 execline-2.1.2.0
 ----------------

 New command: trap. It does what it sounds like it does.

 http://skarnet.org/software/execline/
 git://git.skarnet.org/execline


 s6-portable-utils-2.0.5.0
 -------------------------

 - s6-sort fixed to work with skalibs-2.3.4.0.
 - New command: s6-seq. It also does what it sounds like it does.
This is for use in environments where seq is not guaranteed
available - for instance container overlays.

 http://skarnet.org/software/s6-portable-utils/
 git://git.skarnet.org/s6-portable-utils


 Enjoy,
 Bug-reports welcome.

--
 Laurent


Re: execline's pipeline (and forbacktickx) with closed stdin

2015-05-09 Thread Laurent Bercot

On 10/05/2015 00:07, Guillermo wrote:

I ran into this while experimenting with the example / template stage
1 and 3 init scripts that come with s6's source code. Both of them do
an early fdclose 0 to ignore input. Wouldn't that be tempting the
demons to fly through your nose, then? :)


 Know that I precisely audited the whole series of programs running
with 0, 1 and 2 closed in those stages before answering you. And
there's no risk there, it works. :)
 (In stage 3, you can replace "fdclose 0" with "redirfd -r 0 /dev/null"
and it will be conformant. No such easy way out in stage 1, though,
if you want to change /dev after the kernel has mounted it. Note that
you don't have to do that if you're using devtmpfs and keeping it.)

 

Anyway, I had either replaced the early fdclose 0 with redirfd -r 0
/dev/null (and also realized that worked by accident, because I
somehow have a nonempty /dev at startup) or delayed it a bit. I
suppose that's good enough...


 Did you really manage to umount /dev (maybe) and mount a tmpfs
over it (for sure) with fds still open to the old /dev ? Without
an EBUSY error ? If it's the case, and you're using Linux, then the
kernel's behaviour changed.

--
 Laurent



Re: execline's pipeline (and forbacktickx) with closed stdin

2015-05-10 Thread Laurent Bercot

On 10/05/2015 20:14, Guillermo wrote:

I mostly followed the example init scripts, but I did deviate, among
other things to delegate tasks to OpenRC (as a bundle of oneshot
services in s6-rc terminology, hehe). And now that you remind me they
were originally there, I don't have the fdclose 1 and fdclose 2 in
stage 1 either. I wanted the stage 2 init to have open FDs to
/dev/console, to show OpenRC's output.


 Aren't you redirecting all the logs to a catch-all logger, the
one I call s6-svscan-log in the examples/ subdirectory ?
 It's entirely possible to do without a catch-all logger, but
if there's ever something wrong with the supervision tree
(e.g. one of your services cannot start), your /dev/console
will get spammed.



No kernel complaints
when sysvinit is process 1 (I don't know what it does with its open
FDs), no kernel complaints with the custom stage 1 init having devfs
start with open FDs to /dev/console and /dev/null.
There is no unmounting /dev (maybe that's specifically what would
trigger an EBUSY?). The kernel has CONFIG_DEVTMPFS_MOUNT=n, so no
devtmpfs is mounted after the rootfs. And when I boot with
init=/bin/bash, mount shows nothing but / and a manually mounted
/proc. However, /dev is not empty. I don't know where the device nodes
came from, but they are there. I didn't create them, maybe they were
put in there during Gentoo's installation, I don't know. In fact, the
after boot /dev looks quite different (and I do see a devtmpfs
mounted). A mount --bind / /tmp shows me the original boot time
/dev, so I suppose the root FS has actual static device nodes.


 Yeah, that must be it: static device nodes in your rootfs, but they
are overriden by the ones in your devtmpfs when you mount it. But
when I tried that a few years ago, mounting a new /dev when there were
still fds open to it simply didn't work. They must have changed that.
And if the behaviour you're observing can be consistently relied upon,
that's pretty good news.

--
 Laurent


Re: execline's pipeline (and forbacktickx) with closed stdin

2015-05-09 Thread Laurent Bercot

On 09/05/2015 01:13, Guillermo wrote:

Are we not supposed to use pipeline or forbacktickx with a closed
stdin, or is this something that needs fixing?


 Honestly... both. It's Complicated (tm).

 I read your mail yesterday, shortly after you wrote it, but it
opened a rabbit hole in more than one way. And the correct answer
is: both.

 I consider it a bug, because there are cases where I do need to
run programs with fds 0, 1 or 2 closed, and I generally try to
pay attention to this. So I've pushed a fix to the current execline
git, please tell me if it works for you.

 However, POSIX considers that UB is acceptable when you run a
program with 0, 1 or 2 closed: look for "If file descriptor 0" in
http://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html
So, stricto sensu, it's a case of "don't do that" - it's acceptable
for pipeline, and other programs, to fly demons through your nose
when you run it with stdin closed.

 And Unix primitives do not make it easy to support that case
without bugs. I have run into that problem before, and your report
is just another incarnation of the problem, and I'm sure there are
other similar ones hiding in my code.

 Whenever you open a file, Unix guarantees that the descriptor you get
is the smallest unused number. So if you run a program with 0 closed,
when the program opens something, or creates a pipe, or anything that
requires a descriptor, descriptor 0 will be used. If you then need to
exec into a program with 0 redirected, you should pay attention to
not overwrite your (internal) descriptor 0 when you dup2() into it.
And dup2() when both descriptors are the same does not clear the
close-on-exec flag, which leads to the problem you observed.
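
 As a concrete sketch of the don't-do-that case (all tools assumed
reachable via PATH):

   fdclose 0
   pipeline { s6-echo hello } s6-cat

With 0 closed on entry, the pipe created by pipeline gets descriptor
0 - the smallest unused number - so the dup2() meant to put the
reading end on the second program's stdin is a same-fd no-op, and the
close-on-exec flag stays set.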

 I'll try to support the case as much as I can, and squash those bugs
whenever they're found, but still, don't do that - Big Bad POSIX will
bite you if you do.

--
 Laurent



Re: skalibs-2.3.5.0 leapsecs.dat

2015-06-08 Thread Laurent Bercot

On 08/06/2015 12:21, Vallo Kallaste wrote:

The leapsecs.dat from skalibs-2.3.5.0.tar.gz matches with old leapsecs.dat
file on the three old systems I tried. Has it been updated?


 Has it been 3 years already ? My, how time flies. Or leaps.
 Sorry, I completely missed the new leap second announcement. I should
check on those things more often, thanks for the reminder!

 skalibs-2.3.5.1 should be available now with an updated leap second table.

--
 Laurent


Re: [PATCH] forstdin: Fix possible error/hanging w/ -p

2015-06-09 Thread Laurent Bercot

On 09/06/2015 20:17, Olivier Brunel wrote:

+  if (pids.s) sig_block(SIGCHLD) ;
(...)
+sig_unblock(SIGCHLD) ;


 Gah. Of course that's it - the noob mistake of looping around fork() without
blocking SIGCHLD. That's so, so bad - I'm really ashamed. :(

 That's what happens when you rely on selfpipes all the time: you forget how
to properly deal with signals without them!

 I did some tests after changing the final waiting logic to sig_pause(), and
didn't get any errors, so I figured it was good - but it obviously wasn't.

 Thanks for the report and the fix! Applied in current git, new release soonish.

 I'll still keep the sig_pause() part: it's actually more ad-hoc work to remove
the signal handler and enter a blocking wait() loop than to simply let the
signal handler do its job until there's nothing left. I find the latter more elegant,
even if it didn't work as a fix for the race condition.

--
 Laurent


Re: how can I install runit on oracle linux?

2015-06-09 Thread Laurent Bercot

On 09/06/2015 23:09, Amin Rasooli wrote:

For the past few days I have been banging my head against the wall trying to
figure out how to install runit on Oracle Linux.
Could someone give me a working repository ? (All the ones on the internet seem
to be unavailable.)


 Hi Amin,
 Wrong list! I didn't write runit. Gerrit Pape did. :)
 runit is discussed on the supervision mailing-list, not the skaware one.

 That said, have you tried the tarball from the original site? That should
work on any flavour of Linux. http://smarden.org/runit/install.html

 Of course, you could always switch to s6. :P

--
 Laurent


Re: how can I install runit on oracle linux?

2015-06-09 Thread Laurent Bercot

On 09/06/2015 23:37, Amin Rasooli wrote:

Sorry for emailing the wrong mailing list, do you happen to have a
repository that I can add ? The tar installation doesn't seem to be
working for me.


 That's weird. You may want to report a bug to Gerrit or to the
supervision mailing-list, with the exact error messages you are
getting.

 Sorry, I don't know about runit repositories.

--
 Laurent


Re: [announce] s6-linux-init-0.0.1.0

2015-06-20 Thread Laurent Bercot

On 2015-06-19 19:19, Les Aker wrote:

Looks like s6-linux-init 0.0.1.0 pulls s6 in as a build-time
dependency. Not a huge issue, but might be worth updating the docs to
clarify that until the next release removes that? I've learned to
trust your docs and build tools enough that I spent a while hunting
for what I was doing wrong :)


 Actually, the docs provided with the 0.0.1.0 tarball would tell you
that :)
 I changed the dependencies in a later git commit, and updated the
online docs to match that commit, but it means they don't match the
0.0.1.0 release anymore. My apologies for the misdirection here.

 That's an interesting question: should the docs match the latest
versioned release or the latest git commit ? There are pros and cons
for both.

 Anyway, please use the latest git, it comes with a few other fixes
in addition to removing almost all build-time deps.



Also, I'm in favor of making the shebang use bindir


 Also done in the latest git.

--
 Laurent


[announce] s6-2.1.5.0, s6-linux-init-0.0.1.1

2015-06-25 Thread Laurent Bercot


 Hello,

 s6-2.1.5.0 is out.

 It adds support for SIGHUP to s6-log for a clean exit even when the
-p option is present.
 This is useful when bringing down a supervision tree logging its
output into a s6-log -p instance: the instance survives, but the
.s6-svscan/finish script still has a fd open to it, so the logger
is still alive - which is intentional, but it's good to have a way
to cleanly kill the logger nonetheless.

 http://skarnet.org/software/s6/
 git://git.skarnet.org/s6


 s6-linux-init-0.0.1.1 is out.

 It has no build-time dependencies except skalibs, as documented;
it uses bindir in every shebang in order to work with kernels that
don't do PATH resolution. It also fixes a bug where the finish script
could hang forever in some cases (the fix relies on the s6-log change
above).

 http://skarnet.org/software/s6-linux-init/
 git://git.skarnet.org/s6-linux-init

 Enjoy,
 Bug-reports welcome.

--
 Laurent


Re: [announce] s6-2.1.5.0, s6-linux-init-0.0.1.1

2015-06-25 Thread Laurent Bercot

On 25/06/2015 20:16, Les Aker wrote:

I'm not sure if this is intentional given your latest update, but it
looks like the GitHub mirrors for s6 and s6-linux-init don't have the
new versions.


 Because I forgot to tag them. Fixed now. :)
 However, don't grab those - better versions are coming out tonight.

--
 Laurent


[announce] s6-2.1.6.0, s6-linux-init-0.0.1.2

2015-06-25 Thread Laurent Bercot


 Hello,

 (It's always like that. No matter how many checks you perform before
hitting the release button, you always discover the worst bugs right
afterwards.)


 s6-2.1.6.0 is out.

 It adds a -X option to s6-svc, which is like -x except it makes s6-supervise
instantly close its stdin/stdout/stderr before priming for exit.
 This is used in the latest version of s6-linux-init.

 http://skarnet.org/software/s6/
 git://git.skarnet.org/s6


 s6-linux-init-0.0.1.2 is out.

 Oh boy. The race condition that all the crafty fifo shenanigans were
supposed to avoid was still there, sneakily hidden in the ugliness of
the aforementioned shenanigans. Well, now it's fixed. Also, the
possible hang in stage 3 has been fixed too, and the catch-all logger
should exit cleanly as soon as nothing is writing to it anymore, but
not too early.
 I'm reasonably confident that this version works. :)

 http://skarnet.org/software/s6-linux-init/
 git://git.skarnet.org/s6-linux-init

 Enjoy,
 Bug-reports welcome.

--
 Laurent


Re: [PATCH 1/2] s6dns_resolve_parse: always clean up dt, prevent fd leaking

2015-06-11 Thread Laurent Bercot

On 11/06/2015 16:06, Roman I Khimov wrote:

It was noted that with no servers in resolv.conf s6-dns always leaks an fd
after s6dns_resolve_parse_g() usage. I wasn't able to trace it deeper, but
always cleaning up in s6dns_resolve_parse() won't hurt.


 Thanks for the report!
 However, this isn't the correct fix. I have committed the correct one in the
latest git head. (s6dns_resolve_core is supposed to recycle dt itself if it
fails.)
 I also have applied your second patch, thanks :)
 New release coming soon.

--
 Laurent



Re: [PATCH] s6-uevent-spawner: Fix possibly delaying uevents

2015-06-14 Thread Laurent Bercot

On 14/06/2015 21:57, Olivier Brunel wrote:

That is, in your test now you're using x[1] even though it might not
have been used in the iopause call before, so while I guess this isn't
random memory, it doesn't really feel right either.


 You're right, of course, that's why the else was there in the first
place, and removing it can't be done thoughtlessly.

 I've committed something closer to your patch. It's still simpler
because I eliminated redundant tests.



and we're gonna block in handle_stdin.


 That was actually another bug... stdin should be non-blocking.
 Fixed.

--
 Laurent


Re: [PATCH] s6-uevent-spawner: Fix possibly delaying uevents

2015-06-14 Thread Laurent Bercot

On 14/06/2015 14:37, Olivier Brunel wrote:

Because of the buffered IO, the following scenario could occur:
- netlink uevents (plural) occur, i.e. data ready on stdin
- iopause triggered, handle_stdin() called. The first uevent is processed, child
   launched, we're waiting for a signal
- SIGCHLD occurs, we're back to iopausing on stdin again, only it's not ready
   yet, because we've read it all already and still have unprocessed data
   (uevents) on our own internal buffer (buffer_0)


 Right, thanks for the catch. I usually avoid that trap, but meh.
 I committed a simpler change than your patch, please tell me if it fixes
things for you.

--
 Laurent


[announce] s6-2.1.4.0

2015-06-17 Thread Laurent Bercot


 Hello,

 s6-2.1.4.0 is out. It features:

 - Direct readiness notification support in s6-supervise (and consequently
deprecation of the s6-notifywhenup binary).
 - Optimization of the service respawn delay by s6-supervise: the
security delay is now one second between two successive ./run
executions, instead of one second after the service is down. In
other words: if a service that just died had been running for more
than one second beforehand, s6-supervise will restart it immediately.
 - Support for SIGUSR1 in s6-svscan, traditionally meaning poweroff
the machine.
 - Easier stage 1 init support for Linux users via a new package,
s6-linux-init. (See the next announcement on the skaware mailing-list.)

 http://skarnet.org/software/s6/
 git://git.skarnet.org/s6

 Enjoy,
 Bug-reports welcome.

--
 Laurent


Re: [announce] s6-linux-init-0.0.1.0

2015-06-18 Thread Laurent Bercot

On 18/06/2015 05:12, Guillermo wrote:

I did a quick run and found out that in generated execline scripts
except the stage 1 init, the shebang line starts with #!execlineb.


 Yes (unless you use slashpackage).
 And on the machines where I tested, it's not a problem as long as
execlineb is in the PATH - the kernel still finds the binary. This
is certainly nonstandard, and surprised me, but I saw it work, so
since the package is Linux-specific anyway, I didn't change it.

 I can change it to bindir if it doesn't work on some configurations,
though.

--
 Laurent


[announce] s6-linux-init-0.0.1.0

2015-06-17 Thread Laurent Bercot


 Hello,

 s6-linux-init-0.0.1.0 is out.
 It is a new package that, for now, only contains one program,
and more documentation than source code. :)

 Its goal is to automate the creation of stage 1 (/sbin/init) binaries
for people who want to run s6-svscan as process 1.

 Unfortunately, it has to be system-specific; this is the Linux version
because it's the OS I know the best. However, it is very possible to
run it, examine the created scripts, and adjust them to another system's
idiosyncrasies.

 http://skarnet.org/software/s6-linux-init/
 git://git.skarnet.org/s6-linux-init
 Mirrored on github as well.

 Enjoy,
 Bug-reports and suggestions welcome, especially since it's still brand
new and probably rough around the edges.

--
 Laurent


Re: Native readiness notification support in s6-supervise

2015-06-16 Thread Laurent Bercot

On 16/06/2015 04:47, Guillermo wrote:

In the examples/ROOT/img/services-local/syslogd-linux subdirectory,
there is an implementation of the syslogd service for Linux, using
s6-ipcserver with the -1 option and s6-notifywhenup for readiness
notification. Maybe you could modify it in the s6 git head to use
the new s6-supervise feature, too.


 Right. Fixed.

--
 Laurent



Native readiness notification support in s6-supervise

2015-06-15 Thread Laurent Bercot


 When you use s6-notifywhenup, or any readiness notification helper
that is not the service's direct supervisor, there is still a small
race condition - which can only bite in a very, very pathological
case, when the stars align in an incredibly evil way and your
system's scheduler decides that it really hates you. But it's
theoretically still there.

 I don't like it.

 So I bit the bullet and implemented readiness support directly in
s6-supervise. Come at me, pathological cases.

 Now, instead of using s6-notifywhenup, you write a fd number into
a notification-fd file in the service directory. s6-supervise
picks that file up when starting the service, and will read
notification messages (i.e. everything up to the first newline)
that the daemon writes to the specified file descriptor.

 Quick upgrade HOWTO:
 if you were using "s6-notifywhenup -d FD -- foobard" before,
run "echo FD > notification-fd" in your service directory, and
just use "foobard" in your run script now.
 Special annoying case: if FD is 1, i.e. the default, and your
service is logged, you'll want to do the following instead:
"echo 3 > notification-fd", and then use "fdmove 1 3 foobard" in
your run script. (Because the notification pipe will conflict with
the logging pipe otherwise.)

 The feature is available in the latest s6 git head, which is also
a release candidate for 2.1.4.0. Please test the feature and send
your comments. If no bugs or immediate improvements are found, I'll
make the release soon.

--
 Laurent


Re: Packaging skaware for Guix

2015-07-06 Thread Laurent Bercot

On 06/07/2015 12:13, Laurent Bercot wrote:

may not find it without the proper --with-libdir option.


 I meant --with-lib, of course.

--
 Laurent


Re: Packaging skaware for Guix

2015-07-06 Thread Laurent Bercot


 Hi Claes !



For me execline fails to build from source because -lskarnet is listed
as a dependency instead of in EXTRA_LIBS. Is this on purpose?


 Yes, this is intentional. EXTRA_LIBS is only used for things like
-lrt or -lsocket which are needed for some binaries on some systems
(and I don't think any execline binary needs them).

 -lskarnet works as a dependency because GNU make will translate it
to libskarnet.a or libskarnet.so depending on what you have and the
configure options you've given. Are you using GNU make 4.0 or later ?
What's the exact configure and make command line you are using, and
the exact error message you're getting ?

 Bear in mind that libskarnet.a is installed in /usr/lib/skalibs/
by default, and if you change anything in the execline config, it
may not find it without the proper --with-libdir option.



I patched tools/gen-deps.sh (recognize ${LIB,*} and -l.* as libraries,
in addition to ${.*_LIB}) and added a phase to generate a new
package/deps.mak before doing the compilation, and then it worked.


 You should not have to patch anything. You can always get the
behaviour you want by providing the appropriate options to configure.



I'm surprised nobody else seems to be having this problem. In
particular, the Nix definition is just a plain build from source and
apparently Just Works. What am I missing?


 I don't know, but your error messages will tell. :)

--
 Laurent



Re: Packaging skaware for Guix

2015-07-06 Thread Laurent Bercot

On 06/07/2015 14:07, Claes Wallin (韋嘉誠) wrote:

./configure: error: target x86_64-unknown-linux-gnu does not match the
contents of 
/gnu/store/hynkavlnn6j0x6aifrawx9d27j6vmzb1-skalibs-2.3.5.1/lib/skalibs/sysdeps/target


 Weird. What's the content of
/gnu/store/hynkavlnn6j0x6aifrawx9d27j6vmzb1-skalibs-2.3.5.1/lib/skalibs/sysdeps/target ?
This error usually means you used a sysdeps directory meant to represent
another architecture, which is of course not good - but I've never seen
it occur if you're *not* cross-compiling.

--
 Laurent


Re: Packaging skaware for Guix

2015-07-06 Thread Laurent Bercot

Interesting, never heard of this make feature before. Does that only
work with static libs? Because the .so is in the search path.


 It's supposed to work with both static and shared libs.
 The defaults for skarnet.org package installations specify *different*
directories for static libraries and shared libraries. It is a misdesign
of autotools, or other build systems, to ensure that those installation
directories are the same: static and shared libraries are very different
objects and should not be handled the same way or stored at the same
place. (Shared libraries are a run-time object; static libraries are a
compile-time object, only used for development.)



Or is it the case that ld/gcc understand LIBRARY_PATH but make doesn't?


 Yes. CPATH and LIBRARY_PATH are gcc-specific constructs. The equivalent
of LIBRARY_PATH for make is named vpath, and needs to be declared in the
Makefile. The configure script builds such a vpath with the contents of the
given --with-lib options.
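
 For instance (path illustrative), building against a skalibs
installed in a non-default place:

   ./configure --with-lib=/opt/skalibs/lib/skalibs

makes that directory part of the vpath, so make can resolve
-lskarnet from it.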



--enable-fast-install


 The configure scripts are not generated by autotools. This flag won't do
anything. Please see ./configure --help to see what flags are supported.



skalibs lib and includes are found using CPATH and LIBRARY_PATH.


 Yeah, that will work with gcc but make will not find libraries.
As you could see for yourself, it works with --with-lib. :)

--
 Laurent



Gentoo building bug: should be fixed

2015-08-12 Thread Laurent Bercot

 Hi,
 Guillermo reported a bug discovered here:
 https://bugs.gentoo.org/show_bug.cgi?id=541092

 The latest git versions of skarnet.org packages should fix
the issue (with the wonderful magic of XYZZY!), so the
Gentoo workaround should now be unnecessary.

 The latest versions also rework how shared libraries are
built: now they are linked against their dependencies. This
may be ugly in some cases, but is the safe option, and
non-totally-braindead systems shouldn't see too much
ugliness.

 I'll do some more testing tomorrow, and if everything appears
to work fine, I'll do the official releases.

--
 Laurent


Re: Preliminary version of s6-rc available

2015-08-22 Thread Laurent Bercot

 Should be all fixed, thanks!

--
 Laurent


Re: Preliminary version of s6-rc available

2015-08-22 Thread Laurent Bercot

On 22/08/2015 08:26, Colin Booth wrote:

I run my s6 stuff in slashpackage configuration so I missed the
s6-fdholder-filler issue. The slashpackage puts full paths in for all
generated run scripts so I'm a little surprised it isn't doing that
for standard FHS layouts.


 FHS doesn't guarantee absolute paths. If you don't
--enable-slashpackage, the build system doesn't use absolute paths
and simply assumes your executables are reachable via PATH search.

 Unexported executables are a problem for FHS: by definition, they
must not be accessible via PATH, so they have to be called with an
absolute path anyway. This is a problem when using staging
directories, but FHS can't do any better.

 Here, I had simply forgotten to give the correct prefix to the
s6-fdholder-filler invocation, so the PATH search failed as it is
supposed to.

--
 Laurent



Re: skaware tests?

2015-08-21 Thread Laurent Bercot

On 21/08/2015 22:05, Buck Evan wrote:

Is there any kind of test suite for skalibs/execline/s6 ?


 I would love it if there were one. :)
 See
 http://skarnet.org/cgi-bin/archive.cgi?1:mss:276:201502:mndjpngghogjemeljjac
 and the ensuing thread, also available at
 https://www.mail-archive.com/skaware@list.skarnet.org/msg00270.html

--
 Laurent


Re: skaware manpages?

2015-08-21 Thread Laurent Bercot

On 21/08/2015 22:10, Buck Evan wrote:

@Laurent: What's your take on man pages?


 Short version: I like them, as long as I don't have to write them
or move a finger to generate them.

 Long version:

 I honestly believe man pages are obsolete. They were cool in the
90's when they were all we had; but today, *everyone* has a web
browser, and can look at HTML documentation. Even if they don't have
Internet access.

 I still find myself typing "man" sometimes. It's a reflex because
I'm a dinosaur. But if it doesn't work, I don't mind: the documentation
*is* somewhere, I just have to grab my browser.
 GNU people never write man pages. They write info pages. That blows,
and I'd rather look at the source code to understand what it does
than install and run an info client. Fortunately, the documentation is
also available in HTML, so I go read the doc on the web. When I was
writing my build system, I was very, very glad that the make manual
was available in HTML; I spent hours on that document, with several
tabs open at various places - browsers are user-friendly. Much more
so than xterms running a rich text visualizer.

 So, info2html, man2html, or SGML/DocBook source, and so on?
 Well, as much as I love Unix, one aspect of it that I really dislike
is the proliferation of markup languages. nroff is one, info is
another one, pod is one, and so on; I've stopped counting the number
of initiatives aiming to produce rich text. I've always managed to
avoid learning those languages. I've only learned LaTeX and HTML;
I quickly forgot the former as soon as I was out of academia and
didn't need it anymore, and I only memorized the latter because it's
ubiquitously useful.  Markup, or markdown, languages, are really
not my cup of tea; and if I didn't learn nroff in 1995, when there
actually was a serious use case for it, I'm definitely not going
to learn it today.

 I'll keep providing HTML docs, and only HTML docs. If you want to
provide man pages, you're very welcome to it, as long as I don't
have to do anything. :P

 Since I don't believe in the future of man pages, I even think
that only providing stub man pages would be perfectly acceptable:
in the man page, only have a link to the relevant HTML document,
on the local machine as well as on the Web.

 If you don't like stubs, heinous scripts should produce more
acceptable results than you think. I try to keep a reasonably
regular format for the doc pages of executables; I don't mind
enforcing the regularity a bit more seriously if it makes your
scripts easier or more accurate.

--
 Laurent



Re: s6-rc - odd warn logging and a best practices question

2015-08-20 Thread Laurent Bercot

On 20/08/2015 16:43, Colin Booth wrote:

Yeah, this is for the special case where you have a daemon that
doesn't do readiness notification but also has a non-trivial amount of
initialization work before it starts. For most things, doing the
oneshot/longrun split talked about below is best, but sometimes you need to
run that initialization every time (data validators are the most
obvious example).


 In that case, yes,
 "if { init } if { notification } daemon" is probably the best. It
represents service readiness almost correctly, if "service" includes
the initialization.
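
 Concretely, with the notification-fd mechanism, a sketch of such a
run script could be (init-task and daemon are stand-ins, and the
notification-fd file is assumed to contain 3):

   if { init-task }
   foreground { fdmove 1 3 s6-echo }  # writes a newline to the notification fd
   fdclose 3
   daemon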



It does provide notification, but only if you're running under
systemd. At least according to the sd_notify() docs. I'll see about
faking up the environment so sd_notify() is happy and report back.


 systemd's notification API is a pain. It forces you to have a daemon
listening on a Unix socket. So basically you'd have to have a
notification receiver service, communicating with the supervisors -
which eventually makes it a lot simpler to integrate everything into
a single binary.
 This API was made to make systemd look like the only possible design
for a service manager. That's political design to the utmost, and I
hate that with a passion.

 I have a wrapper to make things work the other way (i.e. using
s6-like daemons under systemd), but a wrapper that would actually
understand sd_notify() notifications would be much more painful to
write.


Actually, the more I think about it, the less s6-rc-update will help
me avoid reboots in the short term since part of what I need to get
back is a pristine post-boot environment.


 What do you have in that post-boot environment that would be different
from what you have after shutting down all your s6-rc services and
wiping the live directory ?

--
 Laurent



Re: s6-rc - odd warn logging and a best practices question

2015-08-20 Thread Laurent Bercot

On 20/08/2015 10:57, Laurent Bercot wrote:

s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholder/notification-fdpost
addition of notification-fd

  Looks like a missing/wrong string terminator. Thanks for the report,
I'll look for it.


 I can't grep the word "addition" in my current git, either s6 or s6-rc.
Are you sure it's not a message you wrote? Can you please give me the
exact line you're running and the exact output you're getting?
 Thanks,

--
 Laurent



Re: [announce] s6-2.2.0.0

2015-07-28 Thread Laurent Bercot

On 28/07/2015 16:59, Patrick Mahoney wrote:

If I understand correctly, any 'readiness' reporting mechanism must originate
from the run script (to inherit notification-fd).


 Yes.



Do you have any suggestions for adding readiness support to a daemon *without
modifying that daemon*?


 It's still possible, and exactly as hackish as before. :)



Previously, I had been using s6-log to match a particular line from the
daemon's log output (e.g. "$service is ready"), sending the matched line to
something like 'cd .. s6-notifywhenup echo' (note: a child process of run/log,
not run).


 Yes, I remember that case.



To support the same through the notification-fd, I can imagine a rough scheme
such as:

in run:

   background { if { s6-ftrig-wait fifodir U } fdmove 1 3 echo }

   ... run the daemon

in log/run

   pipeline -w
   {
  forstdin -d"\n" i s6-ftrig-notify ../fifodir U
   }

   s6-log
 - +"daemon is ready" 1
 + t n20 !"gzip -nq9" logdir


 Yes, something like that would work. No need for a fifodir: a simple
named pipe would do, just make sure only your logger writes to it
and only your service reads from it.
 My take would be something like:

./run:
fdmove -c 2 1   # also send our stderr to the logger
foreground { mkfifo readiness-fifo }   # failure is ignored if the fifo exists
background -d
{
  # wait for the ready line from the logger, forward it to fd 3,
  # the notification fd
  fdmove 1 3
  redirfd -r 0 readiness-fifo
  head -n 1
}
fdclose 3   # the daemon must not inherit the notification fd
daemon

./log/run:
redirfd -w 1 ../readiness-fifo   # our stdout feeds the fifo
s6-log - +"daemon is ready" 1
  rest-of-your-logging-script

--
 Laurent



Re: Bug in ucspilogd v2.2.0.0

2015-08-09 Thread Laurent Bercot

On 09/08/2015 09:27, Colin Booth wrote:

I haven't yet dug into the skalibs code to see what changed between
those tags, or started bisecting it to find out which commit broke.


 The git diff between 2.3.5.1 and current HEAD is pretty small, and
there's really nothing that changed in the graph of functions
accessed by skagetlnsep(), the failing entry point.



Functional:
[pid 19388] readv(0, [{"<13>Aug  9 07:26:07 cathexis: wo"..., 8191},
{NULL, 0}], 2) = 34
(...)
Dysfunctional:
[pid 31983] readv(0, [{"<13>Aug  8 23:46:57 cathexis: wo"..., 8191},
{NULL, 0}], 2) = 33


 The path leading to the first invocation of readv() hasn't changed,
but readv() gives different results. My first suspicion is that logger
isn't sending the last character (newline or \0) in the second case
before exiting, which skagetlnsep() interprets as "I was unable to
read a full line before EOF happened" and reports as EPIPE.
 Are you using the same version of logger on both machines ?

 Grrr.  If logger starts sending incomplete lines, I may have to change
the ucspilogd code to accommodate it.

--
 Laurent



Re: s6-rc plans

2015-08-13 Thread Laurent Bercot

On 13/08/2015 19:55, Colin Booth wrote:

Makes sense. In this case can we get a --livedir=DIR buildtime option
so us suckers using a noexec-mounted /run can relocate things easily
without having to type -l livepath every time we want to interact
with s6-rc?


 Unless I encounter a strong reason not to, sure, no problem.

--
 Laurent


s6-rc plans (was: Build break in s6-rc)

2015-08-13 Thread Laurent Bercot


 Oh, and btw, I'll have to change s6-rc-init and go back to the
"the directory must not exist" model, and you won't be able to
use a tmpfs as live directory - you'll have to use a subdirectory
of your tmpfs.

 The reason: as it is now, it's too hard to handle all the failure
cases when updating live. It's much easier to build another live
directory, and atomically change what live points to - by renaming
a symlink. And that can't be done if live is a mount point.

--
 Laurent



Re: Build Break in s6-rc

2015-08-13 Thread Laurent Bercot

On 13/08/2015 18:05, Laurent Bercot wrote:

  If you're going to pull from git head, then you should pull from
the git head of *every* project, including dependencies. Which you
didn't for execline. :)


 I'm not lying! I'm just chronologically challenged sometimes. See,
if you had pulled from the execline head from the future, i.e. from
now, your build wouldn't have broken. Really!

--
 Laurent



Re: Build Break in s6-rc

2015-08-14 Thread Laurent Bercot

On 14/08/2015 01:25, Colin Booth wrote:

I'm not sure how I feel about having the indestructibility guarantee
residing in a service that isn't the root of the supervision tree. I
haven't done much with s6-fdholderd but unless there's some extra
magic going on in s6rc-fdholderd, if it goes down it won't be able to
re-establish its control over the overall communications state due to
it creating a fresh socket. I know, I know, it should be fine, but
accidents happen.


 I've thought about it for a while, and finally decided that the
advantages overshadowed the drawbacks.

 First, the only time this makes a qualitative difference is when
the pipe maintainer cannot die at all. In one setup, you lose your
pipe when s6-svscan dies; in the other setup, you lose your pipes
when s6-fdholderd dies. The only way to prevent that is to forbid
your pipe maintainer from dying entirely.

 Second, the only way to do that is to put the pipe maintainer as
process 1; but I don't think putting things in process 1 to make
them indestructible is the answer. It's the systemd way: "We're
process 1, so we cannot die, and we can do everything on the system
that needs reliability."
 Granted, it's a nice thing to have, and I do advocate the use of
s6-svscan as process 1, but not because it's a pipe maintainer. I
use s6-svscan as process 1 because it's the natural place for the
root of a supervision tree; and everything else is a bonus.

 The logged service feature of s6-svscan is a direct legacy of
daemontools. It was very cool at the time because we had nothing
else; and I keep it because there's a large daemontools user base,
and breaking compatibility would not make sense because the code
that handles logged services isn't complex enough to be a
maintenance burden. (And still, it is one of the very few places
where I had to write a detailed comment labelled "BLACK MAGIC",
because there *is* some complexity to it.)
 So it's not going away any time soon, but it's still a legacy
ad-hoc functionality. If I was writing s6-svscan today, I would
not implement this feature; I would advertise the use of a
dedicated fd-holder instead. And that would cut the code size of
s6-svscan by a non-negligible amount, getting it closer to the
ideal of the minimal process 1.

 The correct approach to reliability is not to try and force
processes not to die; and it's not to cram more stuff into the
only process that cannot die. It's to make sure it's not a
serious problem when processes die. And that, btw, is exactly
what supervision is about in the first place.

 So, let's make sure it's not a problem when the pipe maintainer
dies. In this case, let's add a watcher for s6-fdholderd.
Instead of oneshots that store pipes into the s6-fdholderd, how
about filling up s6-fdholderd at start time with all the pipes
it needs ? The processes in a pipeline will keep using the old
pipes until one of them dies, at which point the old pipe will
close, propagating the EOF or EPIPE to the other processes in
the pipeline; eventually all the processes in the pipeline will
restart, and fetch the new set of pipes from s6-fdholderd.
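
 A sketch of that filling up, for one pipe (identifiers and paths
hypothetical, and assuming s6-fdholder-store's -d option to pick the
stored descriptor):

   piperw 6 7
   if { s6-fdholder-store -d 6 /run/s6rc-fdholder/s pipe:foo-r }
   s6-fdholder-store -d 7 /run/s6rc-fdholder/s pipe:foo-w

The producer would then retrieve pipe:foo-w for its stdout, and the
consumer pipe:foo-r for its stdin, via s6-fdholder-retrieve.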

 That sounds reliable to me, and even cleaner than the current
approach, where the services can't reliably restart if
s6-fdholderd has died; and it doesn't need additional
autogenerated oneshots. (Thanks for the rubber duck debugging!
That's a huge part of why I like design discussions.)

 So yeah, if s6-fdholderd dies, and one process in a pipeline
dies, then the whole pipeline will restart. I think it's an
acceptable price to pay, and it's the best we can do without
involving process 1.

--
 Laurent



Re: Bug in ucspilogd v2.2.0.0

2015-08-11 Thread Laurent Bercot

On 12/08/2015 04:45, Guillermo wrote:

I don't know about syslog on /dev/log, but for syslog over a network
there is this:


 Yeah, I know about the syslog RFCs. The mild way to put it is that
they're about as useful, well-engineered and enticing as a steaming
pile of donkey shit. And donkey shit can at least be used as manure.

 Logs are data, if they need to be transported over the network,
there's no lack of complex, over-engineered and insecure ways to
transport data over the network - no need to come up with yet
another one specifically for logs, with its own quirks and
idiosyncratic formatting that peeks into user content when it
has no business doing so. You want to standardize a universal
format for logs (good luck with that), then write an RFC about a universal
format for logs, don't mix that with a network protocol, like,
duh.

 The only part of syslog that is worth normalizing is the
interaction between syslog() and syslogd, on the *local* machine,
because there's a lot of code using syslog() that doesn't care
about the network, and several implementations of syslogd. And,
of course, that's exactly the part those RFCs do not talk about.

 It shouldn't come as a surprise when you know that Eric Allman,
of sendmail shame, is the original syslog designer, and the author
of RFC 5424, Rainer Gerhards, is also the main author of rsyslogd.
Do these people actually get *respect* for what they do? Geez this
community lacks critical thinking.



Or using
ucspilogd with optional datagram mode sockets, which would also make
musl syslog() work.


 It's more complicated than that. A datagram syslogd server cannot
listen() and accept(); it receives messages from every process that
uses syslog(). A datagram /dev/log socket enforces the fan-in,
enforces a single instance of syslogd that has to analyze and
authenticate every single log message from the whole machine, which
is precisely what I want to avoid; ucspilogd makes no sense in this
case, you have to use a complete (and big and inefficient) syslogd
implementation.

 ucspilogd relies on the fact that there's a SOCK_STREAM super-server
above it to fork an instance per openlog() connection, and that its
stdin is private to this connection. That's what allows it to be so
simple - and not having the syslog() client try talking to a
SOCK_STREAM socket completely defeats it.



And GNU libc syslog() works fine using ucspilogd
with current stream mode sockets using non-transparent framing with
NUL as trailer character behaviour :P


 ucspilogd doesn't care about the chosen trailer character. It will
treat \0 and \n equally as line terminators - which is the only
sensible choice when logging to a text file and prepending every
line with some data.

 glibc syslog() works because it does some ugly, ugly things like
trying with SOCK_DGRAM, and retrying with SOCK_STREAM if it failed.
In the absence of normalization for syslog(), I'm afraid this is
the only possible behaviour, though; I've swallowed my tears and
submitted a feature request to musl.

--
 Laurent



Re: [s6] debian packaging

2015-08-12 Thread Laurent Bercot


(Please follow up this part of the thread on the skaware mailing-list.)

On 12/08/2015 08:37, Buck Evan wrote:

- https://github.com/bukzor/s6-packaging/blob/dockerize/execline/debian/patches/02_link_against_libskarnet.patch
- https://github.com/bukzor/s6-packaging/blob/dockerize/s6/debian/patches/75_dot_so_link_skarlib.patch


Again this is because the build derps without them, but I forget the exact
failure mode.
I'll track down details upon request.


 The parts for binaries and static libraries are clearly invalid. If
something breaks while building those, then there's a problem with the
way the build is invoked, or the options to configure.
 For static libraries, -lskarnet is nonsense. For binaries, -lskarnet
is already listed in the requirements ($^) and should be translated
to a .a or .so by vpath resolution, so it is incorrect to list it
again. Something is definitely wrong if the package builds with them
while it won't build without.

 I'm still unsure about the shared libraries parts. I don't think
it should be needed, but my test suite isn't up to par and I need to
update it to test the problematic cases and understand exactly what
is happening.

 In the meantime, please find the problem with your build and fix it.
Chances are you won't need the shared libraries patch either once
you've done that. :)



It seems likely to me that you'll want to figure out and fix these two
issues given your response to the above patch.
Is that right?


 Yes, and now you have work to do too. :P

--
 Laurent



[announce] skalibs-2.3.6.0, execline-2.1.3.0

2015-07-27 Thread Laurent Bercot


 Hello,

 skalibs-2.3.6.0 is out.

 A couple bugfixes (including a possible crash in socket_local46)
and a new openreadnclose_nb() function.

 http://skarnet.org/software/skalibs/
 git://git.skarnet.org/skalibs


 execline-2.1.3.0 is out.

 A new configure option, --shebangdir, to specify the absolute
path to the execlineb binary for use in shebang lines in execline
scripts.

 Enjoy,
 Bug-reports welcome.

--
 Laurent


Re: Preliminary version of s6-rc available

2015-07-16 Thread Laurent Bercot

On 16/07/2015 19:22, Colin Booth wrote:

You're right, ./run is up, and being in ./finish doesn't count as up.
At work we use a lot of runit and have a lot more services that do
cleanup in their ./finish scripts so I'm more used to the runit
handling of down statuses (up for ./run, finish for ./finish, and down
for not running). My personal setup, which is pretty much all on s6
(though migrated from runit), only has informational logging in the
./finish scripts so it's rare for my services to ever be in that
interim state for long enough for anything to notice.


 I did some analysis back in the day, and my conclusion was that
admins really wanted to know whether their service was up as opposed
to... not up; and the finish script is clearly not up. I did not
foresee a situation like a service manager, where you would need to
wait for a "really down" event.



As for notification, maybe 'd' for when ./run dies, and 'D' for when
./finish ends. Though since s6-supervise SIGKILLs long-running
./finish scripts, it encourages people to do their cleanup elsewhere
and as such removes the main reason why you'd want to be notified on
when your service is really down. If the s6-supervise timer wasn't
there, I'd definitely suggest sending some message when ./finish went
away.


 Yes, I've gotten some flak for the decision to put a hard time limit
on ./finish execution, and I'm not 100% convinced it's the right
decision - but I'm almost 100% convinced it's less wrong than just
allowing ./finish to block forever.

 ./finish is a destroyer, just like close() or free(). It is nigh
impossible to define sensical semantics that allow a destroyer to fail,
because if it does, then what do you do ? void free() is the right
prototype; int close() is a historical mistake.
 Same with ./finish ; and nobody tests ./finish's exit code and that's
okay, but since ./finish is a user-provided script, it has many more
failure modes than just exiting nonzero - in particular, it can hang
(or simply run for ages). The problem is that while it's alive, the
service is still down, and that's not what the admin wants.
Long-running ./finish scripts are almost always a mistake. And that's
why s6-supervise kills ./finish scripts so brutally.

 I think the only satisfactory answer would be to leave it to the user :
keep killing ./finish scripts on a short timer by default, but have
a configuration option to change the timer or remove it entirely. And
with such an option, a burial notification when ./finish ends becomes
a possibility.



Ah, gotcha. I was sending explicit timeout values in my s6-rc commands,
not using timeout-up and timeout-down files. Assuming -tN is the
global value, then passing that along definitely makes sense, if
nothing else than to bring its behavior in-line with the behavior of
timeout-up and timeout-down.


 Those pesky little s6-svlisten1 processes will get nerfed.



Part of my job entails dealing with development servers where
automatic deploys happen pretty frequently but service definitions
don't change too often. So having non-privileged access to a subsection
of the supervision tree is more important than having non-privileged
access to the pre- and post- compiled offline stuff.


 I understand. I guess I can make s6-rc-init and s6-rc 0755 while
keeping them in /sbin, where Joe User isn't supposed to find them.



By the way, that's less secure than running a full non-privileged
subtree.


 Oh, absolutely. It's just that a full setuidgid subtree isn't very
common - but for your use case, a full user service database makes
perfect sense.

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-17 Thread Laurent Bercot

On 17/07/2015 09:26, Rafal Bisingier wrote:

So I run them as a service with sleep BIG in the
finish script (it's usually unimportant if this runs at the same hours
every day). I can have this sleep in the main process itself, but it
isn't really its job


 I also use a supervision infrastructure as a cron-like tool. In those
cases, I put everything in the run script:
 if { periodic-task } sleep $BIG

 periodic-task's run time is usually more or less negligible compared
to $BIG, and I'm not expecting to be controlling it with signals anyway
- but I like being able to kill the sleep if I want to run
periodic-task again earlier for some reason. So I don't mind executing
a short-lived (even if it takes an hour or so) process in a child, and
then having the run script exec into the sleep. And since
periodic-task exits before the sleep, it doesn't block resources
needlessly.
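
 For concreteness, a complete run script following this pattern could
look like this (the execlineb path and the one-day period are
illustrative):

   #!/usr/bin/execlineb -P
   # run the task in a child; when it exits, exec into a long sleep
   if { periodic-task }
   sleep 86400

 Killing the sleep (e.g. with s6-svc -t) makes the supervisor restart
the run script, which runs periodic-task again immediately.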

 Whereas if your sleep is running in the finish script, you have no
way to control it. You stay in a limbo state for $BIG and your service
is basically unresponsive that whole time; it's reported as down (or
finish with runit) but it's still the normal, running state. I find
this ugly.

 What do you think ? Is putting your periodic-task in a child an
envisionable solution for you, or do you absolutely need to exec into
the interpreters ?

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-19 Thread Laurent Bercot

On 19/07/2015 20:13, Guillermo wrote:

Well, I haven't been very lucky with oneshots. First, the #!execline
shebang with no absolute path doesn't work on my system, even if the
execlineb program can be found via the PATH environment variable.
Neither does #!bash, #!python, or any similar construct. If I run
a script from the shell with such a shebang line I get a bad
interpreter: No such file or directory message.


 Looks like your kernel can't do PATH searches.
 The #!execline shebang worked on Linux 3.10.62 and 3.19.1. But yeah,
it's not standard, so I'll find a way to put absolute paths there, no
big deal.



/path-to/live/servicedirs/s6rc-oneshot-runne: No such file or
directory
s6-rc: warning: unable to start service oneshot name: command exited 111

/path-to/live/ represents here what was the full path of the live
state directory, and the  was really a string of random
characters. I suppose this was meant to be the path to
s6rc-oneshot-runner's local socket, but somehow ended up being
gibberish instead. So oneshots still don't work for me :(


 I committed a few quick changes lately, I probably messed up some
string copying/termination. I'll investigate and fix this.



* It looks like s6-rc-compile ignores symbolic links to service
definition directories in the source directories specified in the
command line; they seem to have to be real subdirectories. I don't
know if this is deliberate or not, but I'd like symlinks to be allowed
too, just like s6-svscan allows symbolic links to service directories
in its scan directory.


 It was deliberate because I didn't want to read the same subdirectory
twice if there's a symlink to a subdirectory in the same source
directory. But you're right, this is not a good reason, I will remove
the check. Symlinks to a subdirectory in the same place will cause a
duplicate service definition error, though.



* I'm curious about why is it required to also have a producer file
pointing back from the logger, instead of just a logger file in the
producer's service definition directory. Is it related to the parsing
sucks issue?


 It's just so that if the compiler encounters the logger before the
producer, it knows right away that it is involved in a logged service
and doesn't have to do a special pass later on to adjust service
directory names.
 It also doubles up as a small database consistency check, and
clarity for the reader of the source.

 

* It doesn't really bother me that much, but it might be worth making
down files optional for oneshots, with an absent file being the same
as one containing exit, just like finish files are optional for
longruns.


 Right. You can have empty down files already for this purpose; I guess
I could make them entirely optional.
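
 For instance, with the current behaviour, a startup-only oneshot can
already be declared like this (names and the up command are
illustrative):

   mkdir -p src/mysvc
   echo oneshot > src/mysvc/type
   echo my-init-command > src/mysvc/up
   : > src/mysvc/down   # empty file, parsed by execlineb as "exit 0"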



The user checked against the
data/rules rulesdir would be the one s6-rc was run as, right? So it
defines which user is allowed to run oneshots?


 Yes. And indeed, allowing s6-rc to be run by normal users implies
changing the configuration on s6rc-oneshot-runner. I'll work on it.



And finally, for the record, it appears that OpenRC doesn't mount /run
as noexec, so at least Gentoo in the non-systemd configuration, and
probably other [GNU/]Linux distributions with OpenRC as part of their
init systems, won't have any problems with service directories under
/run.


 That's good news !

 Thanks a lot for the feedback ! I have a nice week of work ahead of me...

--
 Laurent



Re: Preliminary version of s6-rc available

2015-07-13 Thread Laurent Bercot

On 13/07/2015 17:35, Colin Booth wrote:

Those options are all bad. My workaround was to mount a new tmpfs
inside of /run (one that wasn't noexec) but that made using s6-rc
annoying due to the "no directory" requirement. I don't think there's
anything inherently bad about nesting mounts in this way though I
could be mistaken.


 Ah, so that's why you didn't like the "must not exist yet" requirement.
OK, got it.
 Yeah, mounting another tmpfs inside the noexec tmpfs can work, thanks
for the idea. It's still ugly, but a bit less ugly than the other choices.
I don't see anything inherently bad in nesting tmpfses either, it's just a
small waste of resources - and distros that insist on having /run noexec
are probably not the ones that care about thrifty resource management.

 s6-rc obviously won't mount a tmpfs itself, since the operation is
system-specific. I will simply document that some distros like to have
/run noexec and suggest that workaround.



My suggestion is for one of: changing the s6-rc-init behavior to
accept an empty or absent directory as a valid target instead of just
absent


 Yes, I'm going to change that. absent was to ensure that s6-rc-init
was really called early at boot time in a clean tmpfs, but absent|empty
should be fine too.



Hm, either the documentation or my reading skills need work (and I'm
not really sure which).


 When in doubt, I'll improve the doc: a good doc should be understandable
even by people with uncertain reading skills. :)



Actually, assuming you're only making bundle and dependency changes,
it looks like swapping out db, n, and resolve.cdb from under s6-rc's
nose works. I'd be unsurprised if there were some landmines in doing
that but it worked for hot-updating my service sequence.


 Landmines indeed. Services aren't guaranteed to keep the same numbers
from one compiled database to another, so you may well have shuffled the live
state without noticing, and your next s6-rc change could have very
unexpected results.

 But yes, bundle and dependency changes are easy. The hard part is when
atomic services change, and that's when I need a whiteboard with tables
and flowcharts everywhere to keep track of what to do in every case.



Glad to hear it. So far s6-rc feels like what I'd expect from a
supervision-oriented rc system. There are some issues that I haven't
mentioned but I'm pretty sure those are mostly due to unfamiliarity
with the tools more than anything else.


 Please mention them. If you're having trouble with the tools, so will
other people.

--
 Laurent



Re: [announce] New skarnet.org release, with relaxed requirements.

2015-10-22 Thread Laurent Bercot

On 23/10/2015 00:57, Guillermo wrote:

So, I don't know if the handler scripts for diverted signals that the
new version of s6-linux-init-maker generates are intended to be
compatible with BusyBox. But if that's the intention, then the ones
for SIGUSR1 and SIGUSR2 are inverted: I think that the signal sent by
'busybox halt' to process 1 is SIGUSR1, so its handler should be the
one calling s6-svscanctl -0 $tmpfsdir/service, and the signal sent by
'busybox poweroff' is SIGUSR2, so its handler should be the one
calling s6-svscanctl -7 $tmpfsdir/service.


 Ah, this is unfortunate. I don't think there's a universal convention
for those signals; I looked at suckless init, which uses USR1 for poweroff
(and doesn't have a signal for halt). I'm more interested in supporting
busybox init than sinit, though (because sinit is incorrect: it lacks
supervision of at least one process) - so I'll reverse the signals in
s6-linux-init-maker. Thanks for the report.
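
 In other words, after the fix the generated handlers should map like
this (a sketch; the actual scripts emitted by s6-linux-init-maker may
differ in detail):

   # SIGUSR1 handler, matching "busybox halt":
   s6-svscanctl -0 $tmpfsdir/service
   # SIGUSR2 handler, matching "busybox poweroff":
   s6-svscanctl -7 $tmpfsdir/service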



And speaking of s6-linux-init-maker, the -e VAR=VALUE option generates
a $basedir/env/VAR file that doesn't have a trailing newline after
VALUE, although I don't know if s6-envdir cares.


 s6-envdir does not care.

--
 Laurent



Re: s6-ftrig-wait is not relocatable

2015-10-14 Thread Laurent Bercot

On 14/10/2015 02:58, Buck Evan wrote:

The packaging system I'm targeting (pypi wheels =X) ships built binaries
that are relocated arbitrarily, so the "run-time installation path" is
entirely unknown at compile time.


 I don't understand. How is that even supposed to work? If packages
want to install files in /etc, they can't? If you need to access
binaries, how do you know what to put in your PATH?

 If the packaging system can't provide answers to those questions,
it's not "weird". It's "broken".



Making everything fully static makes everything but this one bit work
though.


 Not really: execline and s6 binaries expect to be able to spawn other
execline and s6 binaries, some of which are found via PATH search, some
of which are in /usr/libexec/s6 or something, depending on your
./configure options. If you don't know your run-time installation paths,
things will *appear* to work, but be subtly broken.



As a nasty hack, --enable-tai-clock seems like it will disable the use of
leapsecs.dat.


 Please don't do that unless you know that your system clock is TAI-10.

--
 Laurent



Re: daemontools tai64n is unbuffered, s6-tai64n is fully buffered

2015-10-20 Thread Laurent Bercot

On 20/10/2015 02:16, Buck Evan wrote:

My canonical slowly-printing example is:

 yes hello world | pv -qL 10 | tai64n

Under daemontools classic you'll see the output gradually appear character
by character, with timestamps.
Under s6, this seems to hang and I ctrl-c it. I'm sure if I waited a good
long while it would print, but this shows the difference in usability.


 s6-tai64n flushes its stdout before going back to read its stdin again.
It will never keep unflushed logs in memory.

 You are very likely using a version of s6-tai64n linked against a shared
libskarnet.so.2.3.7.0 or earlier, which sometimes flushes stdout
incorrectly. Please grab the latest skalibs and recompile. (Or use the
static version of libskarnet, which does not exhibit the bug.)

--
 Laurent



Re: daemontools tai64n is unbuffered, s6-tai64n is fully buffered

2015-10-20 Thread Laurent Bercot

On 20/10/2015 23:36, Buck Evan wrote:

Is it expected that it's line-buffered?


 It's not line-buffered. It's optimally buffered, i.e. the buffer
is flushed whenever it's full (obviously) or whenever the loop
goes back to reading with a chance of blocking. When you test
with a loop around echo, you send lines one by one, so the
behaviour appears to be line buffering, but that's only an
artifact of your test.
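
 In code, the pattern looks roughly like this - a minimal C sketch of
the idea, not skalibs' actual implementation:

   #include <stdio.h>
   #include <unistd.h>

   int main (void)
   {
     char buf[4096] ;
     for (;;)
     {
       ssize_t r ;
       fflush(stdout) ;  /* about to block on read: flush pending output */
       r = read(0, buf, sizeof buf) ;
       if (r <= 0) break ;
       fwrite(buf, 1, r, stdout) ;  /* buffered, auto-flushed when full */
     }
     return 0 ;
   }

 The reader never waits on output that is sitting in a buffer: every
time the loop may block, everything written so far has been flushed.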

--
 Laurent



Re: s6-rc shutdown timing issue

2015-09-13 Thread Laurent Bercot

On 13/09/2015 09:08, Colin Booth wrote:

I've been digging into managing a system completely under s6 and I
can't seem to find the right time to run `s6-rc -da change'. Run it
before sending s6-svscan the shutdown/reboot/halt commands and can end
up with a situation where your read/write drive has been set read-only
before your services have come down.


 This is the right way to proceed:
 * First s6-rc -da change
 * Then s6-svscanctl -t /run/service

 I don't understand the issue you're having: why would your rw filesystem
be set read-only before the services come down ?
 - Your non-root filesystems should be unmounted via the corresponding
"down" scripts of the oneshot services you're using to mount them (see
the sketch below), so they will be unmounted in order.
 - Your root filesystem will be remounted read-only in stage 3, i.e.
when everything is down.
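
 To illustrate that first point, a mount oneshot could be declared like
this in the s6-rc source format (device, filesystem type and mountpoint
are illustrative):

   # mnt-data/type:
   oneshot
   # mnt-data/up:
   mount -t ext4 /dev/sdb1 /data
   # mnt-data/down:
   umount /data

 A service depending on mnt-data will then be stopped before /data is
unmounted, since "down" scripts run in reverse dependency order.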

 If your dependencies are correct, there should be no problem. What
sequence of events is happening to you ?



One other question that doesn't really belong here but doesn't need
its own thread. If I have a oneshot that only does any work on
shutdown, can I get away with having the required ./up script be
empty


 Yes, an empty ./up script will work. (Oneshot scripts are interpreted
at compile time by the execlineb parser, which treats an empty script
as "exit 0".)

--
 Laurent



Re: s6-rc shutdown timing issue

2015-09-13 Thread Laurent Bercot

On 13/09/2015 20:25, Colin Booth wrote:

My current issue is that I'm initially remounting my root filesystem
as r/w as one of the first steps for s6-rc, which means that if I'm
doing everything correctly, s6-rc attempts to remount root as
read-only as part of its shutdown.


 Yeah, indeed, that won't work in all cases, because the situation is not
symmetrical. Remounting your rootfs rw at boot time will always succeed,
but remounting it ro before killing everything may fail.

 You have absolutely no way of ensuring that nothing will attempt to write
to your rootfs before you nuke everything. That's the fundamental difference
between startup and shutdown: during shutdown, and until your nuke, other
stuff may still be running that you have no control over.

 If it's only about your rootfs, I'd simply keep the "remount it rw/ro" part
out of s6-rc.
 If it's about a more complex fs infrastructure and you may still have
processes with open handles to the mounted filesystems, it's more annoying,
and I don't have a perfect solution. The simplest thing is to do all the
unmounts outside of s6-rc, but it's asymmetrical and doesn't feel right.
Another possibility is to have a nuke in a ./down script of a oneshot that
depends on all your filesystem-mounting services, so you're sure to have
already killed everything when you get to unmounting; but that's not great
either, especially if you switch databases and s6-rc-update computes that
it has to remount one of your filesystems - oops, it just killed everything
on your machine and can't complete its work. That shouldn't happen if your
"mount stuff" services are very low-level and always up, but if you give
users a set of teeth, they will manage to bite themselves.

 I'm afraid there's no real solution to the stragglers problem, and the
only safe approach is to keep everything mounted all the time and only
perform the unmounts in stage 3 after everything else has been done and
the processes have been killed.
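
 To spell out that safe approach: stage 3 runs after s6-rc has brought
everything down and the supervision tree is gone. A shell sketch, with
illustrative commands and no dispatch between halt/reboot/poweroff:

   #!/bin/sh
   # stage 3: we are the last process standing
   kill -s TERM -- -1 ; sleep 2 ; kill -s KILL -- -1  # nuke any stragglers
   sync
   umount -a                      # non-root filesystems
   mount -o remount,ro /          # root goes read-only last
   poweroff -f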



Cool, thanks. That's what I thought but I wasn't sure to what degree
execlineb cared about script validity and I don't have a terribly
great test methodology for oneshots figured out yet.


 If you run s6-rc -v3, it should show you exactly what commands it's
running.

 In other news, I'm now in the process of testing s6-rc-update. I've
finished brooming the obvious bugs, now is the most annoying part: so
many test cases, so many things that can go wrong, and I have to try
them one by one with specifically crafted evil databases. Ugh. I said
I'd release that thing in September, but that won't happen if I die of
boredom first.
 If you're totally crazy, you can try running it, but unless you're
doing something trivial such as switching to the exact same database,
chances are that something will blow up in your face - in which case
please let me analyze the smoke and ashes.

--
 Laurent



Re: s6-rc-update initial findings

2015-09-14 Thread Laurent Bercot

On 15/09/2015 00:40, Colin Booth wrote:

Ok, did some more testing and it looks like the contents of $SVCDIR
end up being the additive delta between current and new. When
initializing, there are no s6-rc managed services in $SVCDIR so of
course the delta will be all new services. When adding a new longrun,
your contents of $SVCDIR will only be the new service. It's probably
safe since giving s6-svscan SIGALRM only adds services (never
removes), and s6-rc brings down services by directly sending s6-svc
-wD -dx to the service. Not sure if this was a design decision, but I
still prefer having $SVCDIR be representative of my run state. At
least I now know what's going on.


 Yeah, that's not normal. s6-rc-update should remove the links when it
brings the old services down, and should also add the links when it
brings the new services up. I don't have an exact picture of what is
actually happening in all cases; I didn't have the time today, but I'll
do more testing on that tomorrow.

--
 Laurent


Re: readiness notification from non-subprocess

2015-09-29 Thread Laurent Bercot

On 29/09/2015 00:08, Buck Evan wrote:

If it's not good for s6, I'm not sure it's good for my framework either.


 Not necessarily. s6-supervise is extremely paranoid; depending on its
use cases, your framework doesn't have to be. Also, if you control both
ends of a named pipe and can reasonably assume that nobody's going to
mess with them, that's a lot safer than having a named pipe as part of
a public interface.



Would this be a good candidate for the ftrig* stuff?
It does seem like event notification.


 fifodirs internally use named pipes ;) That would probably make your
interface more complex than it needs to be.



Or: How do you pass a file descriptor such that it can be used by a parent
process?


 By fd-passing, which is black magic over a Unix domain socket. You need
your client (child of ./run) and your server (your framework) to
communicate over a Unix socket (have the server listen and accept on
some socket in the filesystem, have your client connect to it), and
there are functions to magically copy a descriptor from one to the other.
 Google "fd-passing".
 I have functions in skalibs to make it easier to transfer fds along
with text over a stream socket: unixmessage.h
See s6-sudoc.c and s6-sudod.c for a relatively simple usage example.
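
 For reference, the core of the trick looks like this - a minimal
SCM_RIGHTS sketch with error handling omitted; the unixmessage functions
wrap this more robustly:

   #include <string.h>
   #include <sys/socket.h>

   /* Send a copy of fd over a connected Unix socket, with 1 dummy byte. */
   int send_fd (int sock, int fd)
   {
     char dummy = 'x' ;
     char cbuf[CMSG_SPACE(sizeof(int))] ;
     struct iovec iov = { .iov_base = &dummy, .iov_len = 1 } ;
     struct msghdr msg = { 0 } ;
     struct cmsghdr *cmsg ;
     msg.msg_iov = &iov ;
     msg.msg_iovlen = 1 ;
     msg.msg_control = cbuf ;
     msg.msg_controllen = sizeof cbuf ;
     cmsg = CMSG_FIRSTHDR(&msg) ;
     cmsg->cmsg_level = SOL_SOCKET ;
     cmsg->cmsg_type = SCM_RIGHTS ;  /* kernel duplicates the fd for the peer */
     cmsg->cmsg_len = CMSG_LEN(sizeof(int)) ;
     memcpy(CMSG_DATA(cmsg), &fd, sizeof(int)) ;
     return sendmsg(sock, &msg, 0) < 0 ? -1 : 0 ;
   }

 The receiver does the mirror operation with recvmsg() and reads the
new descriptor out of the cmsg payload.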



I'd need to document "you need to write a ./ready script *and* prefix your
service with my-poll-helper".
If I can get this down to just one thing that people have to do right, it's
going to work better.


 'Your run script must be called "./run.framework" no matter what'
 'If you want poll support, write a ./ready script'

 Then automate the creation of ./run files that will just exec
"my-poll-helper ./run.framework". And my-poll-helper tests for ./ready
scripts and does what it needs, just executing into its argument if
there's no valid ./ready script. You don't even need to connect
my-poll-helper to the framework in that case: you can just have it
do its ./ready polling in the background and write a newline to
notification-fd when it succeeds, as we talked about earlier.

 Wrapping ./run gives you a lot of freedom.
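
 Concretely, the generated ./run can be a two-liner (execlineb path
assumed):

   #!/usr/bin/execlineb -P
   my-poll-helper ./run.framework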



An svc --hey-im-ready option would continue that trend.


 Using s6-supervise's control pipe as the mechanism to report
readiness. Yes, that's a possibility, and I thought about it, but
ultimately rejected it, because s6-svc sends orders to control the
service, not to fake its state.
 s6-supervise tries to always report the *real* state, not an
arbitrary state defined by the user. You have to realize that all
this discussion is about an attempt to make s6-supervise report a
fake state when the daemon doesn't have support: no wonder it's
difficult! I didn't make it intentionally difficult, but I didn't
design the thing around making it easy to fake states, either.
A s6-svc option to fake a state would make it a bit *too* easy for
my taste.

 Wrap the goddamn run script. It's by far the easiest way to get
what you want.

--
 Laurent



Re: readiness notification from non-subprocess

2015-09-29 Thread Laurent Bercot

On 29/09/2015 00:15, Olivier Brunel wrote:

[2] https://github.com/jjk-jacky/anopa/blob/master/src/utils/aa-setready.c


 Yeah, the problem is, aa-setready is prone to the same race condition as
s6-notifywhenup was, which is the reason why I scrapped s6-notifywhenup and
made the daemon report readiness to s6-supervise over a dedicated fd instead.

 If the service dies while aa-setready does its thing, s6-supervise will
modify the status file and send a fifodir event to report service death,
and depending on the scheduler's whim, the status file may get incorrect
information, and the fifodir events may be sent in the wrong order.

 I hated it when I realized it, but the only way to prevent that is to
make the supervisor the only authority on the service state - only the
supervisor should modify the status file and send fifodir events. So,
from the service's point of view, only the notification-fd is a safe
channel to use.

--
 Laurent


