execline changes
Hello,

execline-1.3.0 will be out soon. Some changes will be made to the UI:

* The deprecated execline program will be removed: the only launcher will be execlineb.
* The LASTPID and LASTEXITCODE environment variables will be replaced with ! and ? respectively, for more consistency with the shell and the general execline naming policy - and less namespace pollution. The old behaviour will still be available for some time via a compilation switch, but it is strongly discouraged.
* Substitution commands with -E | -e options will switch their default behaviour to -e: the substituted value will be stored in an environment variable, not on the command line. Please update your scripts accordingly.

Thanks,

-- Laurent
Re: ARM none gnueabi sysdeps
Hi Vincent,

You want sysdeps, so I assume you're cross-compiling. The easiest way for you to get sysdeps is to actually compile skalibs in a *native* ARM environment, and fetch the sysdeps from there. If you don't have a development environment on your real target, qemu can definitely help: you can use Aboriginal's native-compiler-arm* toolchain inside a qemu disk to compile skalibs, and then look at the sysdeps file.

I've stopped collecting sysdeps sets for different architectures, because it's too much maintenance and too unreliable; and now that qemu is widely deployed, packaged and everything, and you can set up a small Aboriginal development environment on a virtual host in a matter of minutes, there's just no reason anymore to keep a repository of sysdeps.

-- Laurent
announce: s6-networking-0.0.4
Hi, s6-networking-0.0.4 is out. This release fixes a bug in s6-tcpserver-access and s6-ipcserver-access, where user-supplied environment was ignored. Thanks to Vallo Kallaste for the bug-report. http://skarnet.org/software/s6-networking/ Enjoy, More bug-reports welcome. -- Laurent
[announce] s6-networking-0.0.5
Hi, s6-networking-0.0.5 is out. (Because you always find bugs *right after* submitting a release.) This release fixes a bug in s6-tcpserver-access which occasionally caused improper TCPREMOTEHOST resolution. http://skarnet.org/software/s6-networking/ Enjoy, Bug-reports still welcome. -- Laurent
[announce] skalibs-1.5.1, s6-portable-utils-1.0.3
Hello,

skalibs-1.5.1 is out. It fixes a build bug when cross-compiling. Thanks to Vincent De Ribou for the report. It also makes the default conf-cc a non-debug one, duh. http://skarnet.org/software/skalibs/

s6-portable-utils-1.0.3 is out. It fixes an s6-mkfifo bug where the umask was incorrectly respected. Thanks to Vallo Kallaste for the report. http://skarnet.org/software/s6-portable-utils/

Enjoy,
Keep those reports comin'.

-- Laurent
Re: backtick -C
On 04/03/2014 12:27 PM, Vallo Kallaste wrote:

> Hi. I noticed that backtick -C does not work for preceding spaces, but import -C does. Is it intentional?

-C | -c | -d | -s only make sense when backtick is performing the substitution itself, i.e. with the -E option (which is deprecated). In your examples, backtick puts the value into the environment, and you perform the substitution explicitly with import. Value transformations are only done at substitution time, so backtick's -C option doesn't do anything. I should probably print a warning or error message when value transformations are used without -E, but then again, -E is going away soon.

> This isn't exactly related, but how can I get rid of preceding space(s), or delimiters in general? Using pipeline with sed is one way of course but.. well, any better ways?

It depends on what you're using the strings for, and on your definition of "better". :) Value transformations, including crunching, were made so that execline could easily process words coming from files or programs' output; in that context, it made sense to handle delimiters *after* words. I'd say the way to handle preceding delimiters would be:
- fix your input so you don't have them XD
- if the input is meant to be split, as with multidefine or forbacktickx, then just ignore the first word if it is empty
- else, pipeline { sed ... } is the most generic way.

-- Laurent
Re: [announce] 2014 spring release
> Is -DEXECLINE_OLD_VARNAMES still possible when compiling execline? If so, maybe it should be mentioned (I find the old names more readable, and besides there are scripts that would need to be changed, always a possible source of mayhem when the scripts perform critical functions!)

It is still possible, and since it's just a bit of preprocessor work and doesn't add to the code, I have no technical reason to remove it. However, I prefer not to support it explicitly, because that would create script incompatibilities. So it's still there, and it's probably going to remain there, but it's a hack, and you should not rely on it or distribute scripts that use that feature.

> Maybe the new behaviour of s6-setuidgid should be optional, via some command line flag? Note also that the first paragraph of Notes in http://skarnet.org/software/s6/s6-envuidgid.html is now wrong.

Ah, thanks for the report. Documentation updated.

I don't think GIDLIST should be optional: a user's identity is not only its uid and primary group id, it's also a list of groups the user belongs to; without supplementary groups, Unix rights lose a lot of power. The original daemontools utilities were lacking that, and I simply hadn't noticed until I added myself to some group in /etc/group and what I wanted to do didn't work, because the processes behind s6-setuidgid didn't pick it up. You can see it as a bugfix, not as an additional feature.

> The documentation in http://skarnet.org/software/conf-compile.html suggests that conf-home should not be touched; but the default makes the existence of /package/... mandatory, which is somewhat unexpected and shocking when using statically compiled binaries. A few comments about this would make it easy to use in less standard setups.

I'm afraid I don't understand your point. What are you trying to do that requires manually setting conf-home? How does that relate to using slashpackage and to using statically compiled binaries?

-- Laurent
Re: slashpackage again
> Without a conf-home file, s6-svscan requires s6-supervise in /package/admin/s6-1.1.3.1/command/, defined in s6-config.h as S6_BINPREFIX. Why isn't it enough to find the binary in /package/admin/s6/command/ ? (This was what I asked hours ago)

Because it's an intra-package dependency, not a cross-package dependency. You want commands to call their dependencies from the same version of the package, not from the version that happens to be the current one, because the API could have changed in-between. Remember your issue with execlineb and foreground being incompatible? That's exactly the kind of problem versioned paths are preventing.

> Maybe setting conf-home to /package/admin/s6 would make it?

That, again, would solve your immediate problem in a hackish way that is not guaranteed to work. The right way would be to have /package/admin/s6 be a symlink to /package/admin/s6-1.1.3.1, which would be the directory containing your command symlink.

> I suppose I don't understand slashpackage...

You are just trying to use it in an environment where it's not suited, so of course it's making your life harder. Slashpackage is about:
- easy package management (having all your package data under a single directory, easy versioning, etc.)
- fixed full pathname guarantees, both versioned and non-versioned.

For an initramfs, you need neither of those, and the framework, as light as it is, is still too heavy for your needs. The price to pay is the forest of symlinks.

-- Laurent
Re: s6-portable-utils build problem
> In the s6-portable-utils software, why copy 'library.so/s6-memoryhog' when, I think, it should be 'command/s6-memoryhog'?

Actually, it should be /usr/libexec/s6-memoryhog. You must have changed conf-compile/conf-install-libexec to /library.so; this is wrong: library.so is for dynamically linked libraries, not executables (even if .so files are indeed executables, they are generally not supposed to be used as such).

Unexported commands are a pain without slashpackage. The FHS has no way to specify that a command should be available to commands belonging to the same package, but not externally. Historically, /usr/libexec was used for those commands, but that's just a poor way of doing what slashpackage is doing (guaranteed access paths to executables). Some packages use the /usr/lib/$package hierarchy to store their data, including unexported commands, and this is another poor way of doing the same thing.

The place where binaries are stored and the way they are accessed is ultimately the admin's responsibility. Slashpackage helps. For people who don't want to use it, the conf-compile/conf-install-* files are the ones to customize.

Anyway, s6-memoryhog is undocumented and dangerous; I only wrote it for testing purposes, and that's why it's unexported in the first place. Maybe I should remove it entirely.

-- Laurent
Re: s6-envdir scope
On 22/08/2014 23:32, John Vogel wrote:

> I'm hoping this is right and also all by design

You are correct, and it is by design indeed - but not my design: it's simply Unix.

As a very rough rule of thumb, execline blocks represent processes. A sequence of commands in the same block will run with the same PID; the environment set with s6-envdir will then propagate to the end of the block. A new process will be spawned to run an inner block, and it will inherit that environment. And when a block ends, the process dies, and the outer block, an ancestor, has no idea of the environment that was set in the inner block.

There are many exceptions to the "block = process" rule, but if you're only concerned with environment scoping, it always behaves according to that rule. When you set an environment variable, its scope is always from the point where you set it to the end of the current block, including all the inner blocks.

There is *one* exception: ifthenelse -s. Here, environment will leak out of the then-block or the else-block into the remainder part. But it's such an ugly hack, so out of place with the rest of the execline design, that I'm not even sure I should keep that option available in 2.0.

-- Laurent
Re: superstrip
Hi Jorge,

> Your site has superstrip.c as a single file, but I have superstrip-0.12sp, which I think I downloaded from your site long ago. Can you clarify?

superstrip.c is indeed a single file, and it did not justify the hassle and overhead of full packaging - either for me or for users. So I just superstripped the thing down to the essentials :P

  gcc -I/package/prog/skalibs/include -L/package/prog/skalibs/library -o superstrip superstrip.c -lstddjb

should work.

-- Laurent
Re: [skalibs-2.0][PATCH] posting new clean patch (from supervision list)
Hi Toki,

* Please don't send binaries to the list. If a file is too big for you to send, then put it on your favorite pastebin-like service and send the URL instead.

* Contrary to what you are saying, there is no problem with libdir - I just tried again, to make sure. When you specify --libdir as an option, the value you specify overrides the default. When you do not, $prefix/usr/lib/package name is used instead, for whatever value of $prefix you give. And this is true for every --*dir option. There is no hardcoded path in the configure scripts; everything is configurable.

* The install.sh script is there for a reason, as is the distinction between dynlibdir and libdir. Please stop suggesting changes before you understand why the design is as it is. If you have trouble understanding some design choices, ask specific questions and I will answer them; then we can discuss. But so far you have not shown me any need to change anything.

* To make it abundantly clear: the autoconf-generated installation directories, as well as the pkgconfig format, *are not* well designed. Static libraries, which are build-time dependencies and should be packaged in a 'development' package with the header files, have no business being installed in the same directory as shared libraries, which are run-time dependencies. This confusion is one of the banes of GNU, and one of the reasons why autotools and pkgconfig suck. I am not going to pay tribute to it, and it irritates me when people suggest I do so, because it means they do not understand the point of the skarnet.org project.

-- Laurent
execline feature: import -u
I find myself writing

  import VAR
  unexport VAR

all the time in execline scripts, because some environment variables are just used for substitution, and keeping them would only pollute the rest of the script. For convenience, I have added a new -u option to import and importas: import -u VAR substitutes VAR in the script and unexports it at the same time. The option can also be used in multisubstitute import directives.

The feature is available in the latest git://git.skarnet.org/execline ; please test it if you're interested (I've performed some basic tests but might have missed some corner cases).

And that wraps up 2014. Happy New Year!

-- Laurent
Re: skalibs ./configure args of form VAR=VALUE ignored
On 02/01/2015 21:22, Patrick Mahoney wrote:

> In skalibs, ./configure --help says:
>   To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables.
> Though specifying CC=something seems to have no effect.

Ah, indeed. Thanks for the report. I removed variable assignment on the command line because there needed to be a specific line in the script for each single variable handled this way; so some variables would have an effect, and some would not. I didn't like it, so I scrapped it all. But I forgot to update the --help message to mention it. The --help message is now fixed in the current gits.

> Of course, the easy workaround of exporting the desired vars before running ./configure does have the intended effect.

Yes, the intention was to support *environment* variables all along. Specific VAR=value treatment in the configure arguments isn't worth the loss of clarity, I think, since putting variables in the environment - or, with most shells, simply specifying VAR=value before ./configure on the command line - is so easy.

-- Laurent
Re: skarnet software packaged in nixpkgs
Thanks Patrick! -- Laurent
Re: How to report a death by signal ?
On 18/02/2015 14:55, Olivier Brunel wrote:

> But isn't the whole "anything >= 128 will be reported as 128, and anything higher is actually 128+signum" also a convention that both need to agree upon?

Sure, but most commands exit < 128, so that's reliable enough, and it's a lot easier to follow than the whole pipe shebang. It's much, much simpler to exit with a given code than to write stuff to a pipe (what do you do if it blocks? what do you do if you're fd-constrained? what do you do if setting up the plumbing in the parent fails for whatever reason? etc. etc.)

> (Noting that shells do not actually clamp the exit code to 128.)

Indeed, but it comes at the price of uncertainty - you get accurate information if you're lucky, and complete misinformation if you're not. It works for shells most of the time because you don't manually nest shells - it's much riskier for execline.

> (Just the difference shall probably be pointed out/documented.)

Definitely.

-- Laurent
Re: How to report a death by signal ?
On 18/02/2015 14:20, Peter Pentchev wrote:

> [roam@straylight ~]$ perl -e 'die("foo!\n");'; echo $?
> foo!
> 255

I think you should be OK, for the same reason why a shell is OK: if you're using Perl, you're most likely writing your whole script with it, especially control flow and error/crash checking. You're not playing with an inner interpreter reporting a code to an outer interpreter. So the weird 255 should not be a problem in practice.

If I'm wrong and your use case precisely involves a perl script running as P or C with G being an execline command, please mention it! Just because I'd be curious. :)

-- Laurent
Re: How to report a death by signal ?
On 18/02/2015 11:58, Peter Pentchev wrote:

> OK, so the "not using the whole range of valid exit codes" point rules out my obvious reply - do what the shell does - exit 128 + signum.

Well, the shell is happily ignoring the problem, but that doesn't mean it has solved it. The shell reserves a few exit codes, then does some best effort, hoping its invoked commands do not step on its feet. It works because most commands will avoid exiting with something > 125, but it's still a convention, and most importantly, the shell itself does not follow that convention (it obviously cannot!). So something like sh -c "sh -c foobar" does not report errors properly: for 126 and 127, there's no way to know if the code belongs to the inner shell or the outer shell, and for 128+, there's no way to know if the inner shell or the foobar process got killed. Shells get away with it because when they're nested, it's usually auto-subshell magic and users don't want to know about the inner shell; but here, I'm trying to solve the problem for execline commands, and those tend to be nested a lot - so I definitely cannot reserve codes for the outer command, because the inner command may very well use the same ones too.

> Now the question is, do you want to solve this problem in general, or do you want to solve it for a particular combination of programs, even if new programs may be added to that combination in the future, but only under certain rules? If it's the former (in general), then, sorry, I don't have a satisfactory answer for you, and the fact that the POSIX shell still keeps the "exit 128 + signum" behavior mostly means that nobody else has come up with a better one, either (or it might be available at least as some kind of an option).

It just means that nobody cares about shell exit codes. Error handling, if any, is done inside of shell scripts anyway; and in most scripts, a random signal killing a running command isn't even something people think about. I'm sure there are hilarious behaviours hiding in dark corners of very popular shell scripts, that fortunately remain asleep to this day. For execline, however, I cannot use the same casual approach: execline scripts live and die by proper exit code reporting, and carelessness may lead to very obvious breakage.

> Personally, I quite like the idea of some kind of a pipe (be it a pipe(2) pair of file descriptors or an AF_UNIX/PF_UNSPEC socketpair or some other kind of communication channel based on file descriptors), even if it is only unidirectional:

Oh, don't get me wrong, I'm a fan of child-to-parent communication via pipes, and I use it wherever applicable. Unfortunately, the child may be anything here, so I need something generic.

Thanks for your input!

-- Laurent
Re: How to report a death by signal ?
On 18/02/2015 14:04, Olivier Brunel wrote:

> I don't follow, what's wrong with using a fd?

It needs a convention between G and P. And I can't do that, because G and P are not necessarily both execline commands. They are normal Unix programs, and the whole point of execline is to have commands that work transparently in any environment, with only the Unix argv and envp as conventions.

> Cause that was my idea as well: return the exit code, or 255.

I was considering it for a while, then figured that the signal number is an interesting piece of information to have, if G remotely cares about C crashing. I prefer to reserve the whole range of 128+ for "something went very wrong, most likely a crash at some point", and if you get 129+, it was directly below you and you get the signal number.

> Though if you want shell compatibility you could also have an option to return the exit code, or 128+signum when signaled; and similarly one would either be fine with that, or have to use the fd for full/complete info.

Programs that can establish a convention between one another are easy to deal with. If I remember to document the convention (finish scripts *whistle*)...

-- Laurent
Re: How to report a death by signal ?
I'm leaning more and more towards the following approach:
- child crashed: exit 128 + signal number
- child exited with 128 or more: exit 128
- else: exit the child's exit code.

Assuming normal commands never exit with more than 127, that reports the whole information to the immediate parent, and correct, if incomplete, information higher up. That should be enough to make things work in all cases.

Thoughts?

-- Laurent
Re: s6, execline, skalibs in FreeBSD
Thanks Colin ! -- Laurent
Re: Feature requests for execline & s6
> - execline: I'd like the addition of a new command, e.g. readvar, that would allow one to read the content/first line of a file into a variable. IOW, something similar to importas (including an optional default value), only instead of specifying an environment variable one would give a file name to load the value from (in a similar manner to what s6-envdir does, only for one var/file and without actually using an environment variable).

How about

  backtick -n SOMETHING { redirfd -r 0 FILE cat }
  import -U SOMETHING

? Sure, it's longer, but it's exactly what the readvar command would do. You can even program the readvar command in execline. ;)

If for some reason it's inconvenient for you, I can envision writing something similar to readvar, but it *is* going to use the environment: I've been trying to reduce the number of commands that perform substitution themselves, so that passing state via the environment is prioritized over rewriting the argv, and import is kind of the one-stop shop when you need substitution. I don't want to go back on that intent and add another substituting command. Is it a problem for you to use the environment?

> - s6-log: I think a new action to write to a file could be useful.

The problem with that is that the whole design of s6-log revolves around controlling the size of your logging space: no uncontrolled file growth. Writing to a file would forfeit that. What I'm considering, though, is to add a "spawn a given program" action, like a !processor, but run every time a line is selected and triggers that action. It would answer your need, and it would also answer Patrick's need without fswatch. However, it would be dangerous: a misconfiguration could uncontrollably spawn children. I'll do that when I find a safe way to proceed.

> Now, shouldn't those 2 simply be 1-s, since the NUL is already accounted for with sizeof? Or am I missing/misunderstanding something here?

I'm looking, I'm thinking, and I can't find a good reason why those shouldn't be 1s. ... That means... all this time, s6-supervise has been reading from an invalid memory location (the byte after the end of /supervise/status)? Ouch. That one could have seriously hurt. Thanks for the bug-report. (And valgrind never said anything. Bad, bad valgrind.)

-- Laurent
Re: Fwd: [skalibs-2.0][PATCH] posting new clean patch (from supervision lits)
On 05/01/2015 23:01, Paul Jarc wrote:

> Is there any autoconf-equivalent processing that needs to be done to a fresh git clone, or is it already in the same state as a released tarball?

No processing necessary: fresh git clones should be usable as is. Tarballs are just made from tagged git snapshots.

-- Laurent
Re: tai confusion
On 07/01/2015 08:40, Paul Jarc wrote:

> I'm finally digging into a long-standing bug exhibited by runwhen (rw-match computes a timestamp 10 seconds too early), and I think the problem is in skalibs. tai_from_sysclock() adds 10 seconds depending on whether skalibs is configured with --enable-tai-clock. But tai_from_timeval doesn't, so they're inconsistent.

Actually, they're not; what is inconsistent is the naming. I probably should have paid more attention to that, and may change it in the future (yay, API changes).

In tai_from_sysclock, "tai" means: what will be stored in that structure is a real, absolute TAI time. It's the TAI time corresponding to the sysclock time. It's the same whether the clock is TAI-10 (in which case you simply add 10 seconds) or whether it's UTC (in which case you add 10 seconds plus the leap seconds). The tai_from_utc function, which tai_from_sysclock resolves to when --enable-tai-clock is not given, does add the 10 seconds too.

In tai_from_timeval, "tai" means: store the same information in a tai_t. It's a simple format conversion function - the struct timeval could hold anything, a TAI-10 time, a UTC time, or something else, as long as it's absolute. No time conversion operation is done here.

Yes, it's confusing. My bad.

> actually both wrong: the correct behavior for both should be to unconditionally add 10 seconds, and conditionally add leap seconds depending on --enable-tai-clock

That is what happens for tai_from_sysclock. That is not what happens for tai_from_timeval, on purpose. I suspect your problem in runwhen is that you are calling tai_from_timeval(), or some other simple format conversion function, while expecting a time conversion function. You should always be using tain_now() to get the current time; it will give you TAI no matter what your setup is. You should not generally use tai_from_timeval() yourself.

> With a POSIX clock and no --enable-tai-clock, we need to add the appropriate amount of leap seconds or else the tai_t values we generate will differ from those simultaneously generated on a system using TAI-10 and --enable-tai-clock.

Yes, that is exactly what happens. When you call tai_sysclock(), the TAI value you get is the same whether your clock is set to TAI-10 and you have --enable-tai-clock, or your clock is set to UTC and you have --disable-tai-clock.

The tain_sysclock() function, which is what tain_now() normally calls (unless you asked for --enable-monotonic), goes like this:
* sysclock_get() gets the time from the system clock, no matter its format, into a tain_t. tain_from_timespec and tain_from_timeval are just struct conversions; they're time-agnostic.
* tai_from_sysclock() assumes that the time it is given comes directly from the system clock in its native format, and converts it into TAI:
  - by adding 10 seconds if the system clock is TAI-10
  - by calling tai_from_utc() otherwise; tai_from_utc() adds 10 seconds plus the leap seconds

So what you get in the end is always TAI. (This means that on a POSIX system, converting future times to TAI will give you wrong results after the time when the as-yet-unknown next leap second will be added.) That's unfortunately unavoidable, and a limitation of POSIX. Time arithmetic can only be performed correctly with linear time, which TAI is and UTC is not. That is why skalibs uses TAI for all its time computations. But with a UTC clock, you do need an accurate leap second table to make the correct conversions, and if you're computing past the last known leap second, tough luck.

The alternative with a UTC clock, which basically every non-skalibs-based system uses, is to perform time arithmetic with UTC, which gives you wrong results whether far into the future or not; and they simply don't care, because it would be too hard. Thanks, POSIX.

-- Laurent
Re: Typo in http://skarnet.org/software/s6-portable-utils/upgrade.html
On 08/01/2015 16:32, Vallo Kallaste wrote:

> in 2.0.0.1 skalibs dependency bumped to 2.0.0.0
>                                         ^^^ 2.1.0.0

Fixed. Thanks.

-- Laurent
Re: s6-tcpclient read/write FDs
On 17/03/2015 16:52, Vincent de RIBOU wrote:

> Hi all, I assume that the separate read and write FDs (6 and 7) are only present for compliance with other tools which produce 2 really different FDs, but on TCP they are not really different. I've made a TLS client over s6-tcpclient with wolfSSL. This lib takes only 1 FD for its context. Since FDs 6 and 7 are the same link with s6-tcpclient, I used FD 6 for ctx building and it works. How should I use 2 really different FDs in that context? May I use some Unix mechanism to mux 2 FDs into 1?

You can simply ignore the second fd, which is just a copy of the first one. You can even close(7) at the start of your client and use 6 everywhere; it will work.

-- Laurent
Re: s6-devd, stdin stdout
On 07/03/2015 18:37, Olivier Brunel wrote:

> Hi, I have a question regarding s6-devd: why does it set its stdin and stdout to /dev/null on start?

Hi Olivier,

The original purpose of s6-devd was actually to emulate the behaviour of /proc/sys/kernel/hotplug, using the netlink to serialize calls instead of having them all started in parallel. A helper program called by /proc/sys/kernel/hotplug would start with stdin and stdout - and even stderr - set to /dev/null. That's where the redirection in s6-devd comes from.

Changing that behaviour means that a program that's used with s6-devd is not guaranteed to be usable as a /proc/sys/kernel/hotplug helper if it performs I/O. But that's probably not important, so I can remove the stdout redirection; it shouldn't be a problem - stderr is already non-redirected anyway. The stdin redirection, though, should stay: you don't want a hotplug helper to depend on a state defined by streamed userland input.

> Specifically, the doc for s6-setuidgid says: "If account contains a colon, it is interpreted as uid:gid, else it is interpreted as a username and looked up by name in the account database." This doesn't seem to be true (anymore?).

Gah. I totally forgot about that change when rewriting s6-setuidgid. Now that s6-applyuidgid exists, I really want to get rid of that quirkiness... but it's my fault, so rather than removing the bit of documentation, I'll reimplement the feature. Duh.

-- Laurent
[announce] skalibs-2.3.2.0, s6-2.1.3.0
Hello, * skalibs-2.3.2.0 is out. It fixes bugs reported by altell.ru's static analyzer. Thanks Altell! It also adds the gid0_scan() macro. http://skarnet.org/software/skalibs/ git://git.skarnet.org/skalibs * s6-2.1.3.0 is out. It features new options to s6-envuidgid. http://skarnet.org/software/s6/ git://git.skarnet.org/s6 Enjoy, More bug-reports welcome. -- Laurent
Re: NULL pointer dereference in skalibs's mininetstring_write()
On 13/03/2015 15:50, Roman Khimov wrote:

> 11    if (!w)
>
> That one should be: if (!*w)
> It's obvious that if 'w' is NULL there will be a NULL pointer dereference on line 19 or 20. What's not so obvious is how to properly fix that.

Actually, w is never supposed to be NULL. Calling mininetstring_write() with a NULL w is a programming error, and testing w instead of *w was a bug. Thanks for the report!

-- Laurent
Re: [PATCH 0/7] static analysis fixes for skalibs
On 13/03/2015 15:24, Roman I Khimov wrote: Hello. Here at Altell we daily pass all of our project's software (and that is kinda whole distribution) through special 'static analysis' build that doesn't actually produce any output other than reports from two (currently) tools: cppcheck and Clang's scan-build. As we've added skalibs into our project we've immediately received reports for skalibs. It's nice overall, but there are some things that can fixed or improved, thus this patch series for your review and (probably) merge. Hi Roman, Thanks a lot for the reports! This is very interesting. I'll try and see if I can use cppcheck and scan-build myself in the future, if it can detect errors like the ones you submitted. My comments on your patches: 1/7: I incremented 's' for clarity, because that's I always do in scanning functions. Normally the compiler ignores the useless increments and this does not worsen the resulting code. Do you think the increment actually takes away from clarity ? Or does clang emit a warning about it ? (gcc does not.) I think it's harmless, but if you disagree, I don't really care either way. 2/7: applied. 3/7: applied. 4/7: I've actually tried going the opposite way lately: reducing the amount of parentheses I'm using. I think it's better to ensure your C programmers know their operators' precedence than to defensively add parentheses whenever there's uncertainty. Uncertainty is a bad thing - if you're not sure, read the language spec. Besides, usually, the language's precedence makes sense. So I'm not going to apply that one; is there a way to silence that type of report in the static analyzer ? 5/7: applied. 6/7: I'm surprised your tools detected that one, but not the zillion other cases in skalibs. There are lots of functions that do not modify their arguments but do not declare them as const... 
I basically never bother using const qualifiers on function arguments - force of habit; and I think compilers are able to deduce the const status of those arguments themselves, so the code isn't any worse for it. At some point, when OCD overcomes laziness, I may make a complete pass over all of my code and fix that, but I don't think it's needed, and changing it in one function only doesn't really make sense.

7/7: applied.

I'll commit when I've made sense of the mininetstring_write thing. ;)

Thanks again!

-- Laurent
Re: [PATCH 0/7] static analysis fixes for skalibs
On 13/03/2015 16:47, Roman Khimov wrote:
> Both scan-build and cppcheck complain here. Sure, it's not an error,
> just harmless dead code, but well, tools don't like dead code and I
> personally don't like it either, so IMO it's better to drop it if
> there are no valid reasons for it to stay.

Fine, I removed it. *shrug*

> Speaking of dead code, cppcheck also sees some in
> src/sysdeps/trycmsgcloexec.c and src/sysdeps/trygetpeerucred.c, but
> from what I see those are currently stubs, so I didn't touch them.

Yes, some of the code in src/sysdeps/ is not supposed to be run, but only compiled and/or linked. It's there to test for features of the host system.

> It's a purely stylistic thing, so if you, as the repository owner,
> think it doesn't make sense, I'm fine with it.

Oh, it makes sense, but I don't like this approach. It smells too much of defensive programming, in which you do things "just to be sure". "When in doubt, add parentheses" is the wrong approach to me; the right approach is "when in doubt, RTFM and remove the doubt".

> Well, this one is from me personally: fixing 5/7 and 7/7, I wasn't
> sure that nothing changes 'n', because child_spawn() is not a 10-line
> function and 'n' is not fun to search for. Making it const easily
> ensures that the 'n' in the error handler is the same 'n' as at the
> beginning of the function.

Oh, OK. I understand now. And you're right, n isn't modified in child_spawn().

Fixes committed, new release ready. Thanks again!

-- Laurent
[announce] Minor releases
Hello,

A series of small releases.

* skalibs-2.3.3.0
  - A bugfix in buffer_get, which returned an error on a short read instead of simply returning the number of bytes read. (For errors on short reads, buffer_getall() is where it's at.)
  - A sha512 implementation, skalibs/sha512.h
  http://skarnet.org/software/skalibs/
  git://git.skarnet.org/skalibs

* execline-2.1.1.1
  - The execline parser is now a library function, el_parse().
  http://skarnet.org/software/execline/
  git://git.skarnet.org/execline

* s6-dns-2.0.0.3
  - A bugfix in s6dns_engine, which sometimes performed a double close in case of a read error.
  http://skarnet.org/software/s6-dns/
  git://git.skarnet.org/s6-dns

* s6-networking-2.1.0.1
  - A regression fix: s6-tcpclient and s6-tcpserver-access did not read /etc/resolv.conf, leading to incorrect DNS resolution.
  http://skarnet.org/software/s6-networking/
  git://git.skarnet.org/s6-networking

GitHub is currently the target of a DoS attack (apparently from the Chinese censorship authorities); I had trouble pushing the changes to GitHub. Don't be surprised if you have trouble pulling from it. When in doubt, always pull from git.skarnet.org, which probably won't be under a political Chinese attack in the foreseeable future.

Enjoy,
Bug-reports welcome.

-- Laurent
[announce] execline-2.1.1.0, s6-portable-utils-2.0.3.0
Hello,

* execline-2.1.1.0 is out. It adds a new command: forstdin, which splits its standard input and spawns a program for every element. The forbacktickx command is now a wrapper around pipeline and forstdin.
  http://skarnet.org/software/execline/
  git://git.skarnet.org/execline

* s6-portable-utils-2.0.3.0 is out. It adds a new command: s6-dumpenv, which dumps its whole environment into an envdir.
  http://skarnet.org/software/s6-portable-utils/
  git://git.skarnet.org/s6-portable-utils

Enjoy,
Bug-reports welcome.

-- Laurent
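The split-stdin-and-spawn pattern that forstdin implements can be sketched in plain POSIX shell - this is an analogue for illustration, not the execline syntax:

```shell
# Read standard input element by element (here, line by line) and run
# a command for each element, roughly what forstdin does with its
# default newline delimiter.
printf 'alpha\nbeta\ngamma\n' |
while IFS= read -r element ; do
  echo "processing $element"
done
```

Piping a producer into this loop mirrors the pipeline+forstdin combination that the new forbacktickx wrapper is built on.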
GitHub mirrors
I finally caved in and set up GitHub mirrors for all the skarnet.org packages. https://github.com/skarnet (Yes, the picture is ugly. I may get a better one in a few months. :P) So, if you wanted a web interface to browse the source, here you go. I don't like GitHub much, but if it saves me headaches with cgit, all the better. -- Laurent
Re: s6 readiness support
On 25/02/2015 22:29, Patrick Mahoney wrote:
> The loopwhilex keeps the pump primed, so to speak, so /service/s can
> be stopped and started many times with readiness reporting working.
> Otherwise, I'd need to restart /service/s/log as well as /service/s.
> On the other hand, I have mostly idle backtick and head commands
> hanging around.

How about

  pipeline -w { cd .. forbacktickx -d\n i { cat } s6-notifywhenup -f echo } ...

? If you don't mind using a shell, you can even have a single shell lying around instead of both a forbacktickx and a cat process:

  /bin/sh -c "while read ; do s6-notifywhenup -f echo ; done"

(But really, "s6-ftrig-notify event U" is less hackish than "s6-notifywhenup echo".)

And this need for a cat process makes me think forbacktickx is badly designed. It should parse its own stdin instead of spawning a command; forbacktickx functionality can be achieved by combining the parse-stdin program with pipeline.

-- Laurent
Re: wait but kill if a max. time was exceeded
On 23/04/2015 17:41, Gorka Lertxundi wrote:
> I have a very simple question: is it possible in execline to wait up
> to a maximum amount of time for a background program execution to
> finish? And if it didn't finish, kill it forcibly?

Does this help ?
http://skarnet.org/software/s6-portable-utils/s6-maximumtime.html

-- Laurent
s6-rc design ; comparison with anopa
So, I've been planning to write s6-rc, a complete startup/shutdown script system based on s6, with complete dependency management, and of course optimal parallelization - a real init system done right. I worked on the design, and I think I have it more or less down; and I started coding.

Then Olivier released anopa: http://jjacky.com/anopa/

anopa is pretty close to my vision. It's well-designed. It's good. There *are* essential differences with s6-rc, though, and some of them are important enough that I don't want to immediately stop writing s6-rc and start endorsing anopa instead.

This post tries to explain how s6-rc is supposed to work, how it differs from anopa, and why I find the differences important. What I hope to achieve is a design discussion, with Olivier of course, but also with other people interested in the subject, on how an ideal init system should work.

My goals are to:
- reach a decision point: should I keep writing s6-rc or drop it ? Dropping it can probably only happen if Olivier agrees to make a few modifications to anopa, based on the present discussion, but I don't think that will be the case, because some of those modifications are pretty hardcore.
- if I keep writing s6-rc: benefit from this discussion and from Olivier's experience to avoid pitfalls or designs that would not stand the test of real-life situations.

So, on to it.

Three kinds of services
-----------------------

Like anopa, s6-rc works internally with two kinds of services: longrun, which is simply defined by a service directory that will be directly managed by s6, and oneshot, which is defined by a directory containing data (a start script, a stop script, and some optional stuff).

s6-rc allows the user to provide a third kind of service: a bundle. A bundle is simply a set of other services. Starting a bundle means starting all the services contained in the bundle.
A bundle can be used to emulate a SysV runlevel: the user can put all the services he needs into a single bundle, then tell s6-rc to change the machine state to exactly that bundle. Bundles can of course contain other bundles.

A oneshot or a longrun is called an atomic service, as opposed to a bundle, which is not atomic.

Bundles are useful because oneshots and longruns are often too small a granularity. For instance, the Samba service is made of two longruns, smbd and nmbd, but it's still a single service; so samba would be a bundle containing smbd and nmbd. Also, the smbd daemon itself could want its own logger, smbd-log. Correct daemon operation depends on the existence of a logger (a daemon cannot start if its logger isn't working). So smbd would actually be a bundle of two longruns, smbd-run (the smbd process itself) and smbd-log (the logger process), and smbd-run would depend on smbd-log. Users who want to start Samba don't want to deal with smbd-run, smbd-log, nmbd-run and nmbd-log manually, so they would just start samba, and s6-rc would resolve samba to the proper set of atomic services.

Source, compiled and live
-------------------------

Unlike anopa, s6-rc does not operate directly at run-time on the user-provided service definitions. Why ? Because user-provided data is error-prone, and boot time is a horrible time for debugging. Also, s6-rc uses a complete graph of all services for dependency management, and generating that graph at run-time is costly.

Instead, s6-rc provides a s6-rc-compile utility that takes the user-provided service definitions, the "source", and compiles them into binary form in a place in the root filesystem, the "compiled". At run-time, s6-rc ignores the source and reads its data from the compiled, which can be on a read-only filesystem. It also needs a read-write place to maintain information about its state; this place is called the "live". Unlike the compiled, the live is small: it can reside in RAM.
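To make the source side concrete, here is a hypothetical sketch of definitions for the Samba example above, built as a small shell script. Every file name here (type, dependencies, contents) is illustrative: this post predates any finalized source format.

```shell
# Hypothetical s6-rc source definitions for the Samba example.
# One directory per service; file names are illustrative only.
mkdir -p src/smbd-log src/smbd-run src/nmbd-run src/samba

echo longrun > src/smbd-log/type            # the logger process

echo longrun > src/smbd-run/type            # the smbd process itself
echo smbd-log > src/smbd-run/dependencies   # daemon needs its logger up

echo longrun > src/nmbd-run/type

echo bundle > src/samba/type                # a bundle is just a set
printf '%s\n' smbd-run nmbd-run > src/samba/contents

# A compiler pass would resolve "samba" to its atomic services and
# know to bring up smbd-log before smbd-run.
cat src/samba/contents
```

Starting "samba" would then start smbd-log, smbd-run and nmbd-run in dependency order, without the user ever naming the atomic services.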
The point of this separation is multifold: efficiency (all checks, parsing and graph generation are performed at compile time), safety (the compiled can be write-protected), and clarity (separation of user-modifiable data, current configuration data, and current live data).

Atomic services can be very small - a single line of shell for a oneshot, for instance. I fully expect package developers to produce source definitions with multiple atomic services (and dependencies between those services) and a bundle representing the whole package. I expect the total number of atomic services on a typical, reasonably loaded machine to be around a thousand. Yes, it can grow very fast - so having a compiled database isn't a luxury.

Run-time
--------

At run-time, s6-rc only works in *stage 2*. That is important, and one of the few things I do not like in anopa: stage 1 should be completely off-limits to any tool. s6-rc only wants a machine with an s6-svscan running on a scandir. It does not care what happened before. It does not care whether s6-svscan is
Re: s6-rc design ; comparison with anopa
On 23/04/2015 23:26, Joan Picanyol i Puig wrote:
> I'd really expect a UI that can diff compiled/live vs. source (and
> obviously, to inspect compiled/live).

There will definitely be a UI to inspect compiled + live. As for diffing the current state vs. the source, I think it would be too complex, because it would amount more or less to performing the work of the compiler again. What can be done is something that compares two compiled databases: to diff against the source, you would compile the source into a temporary database, then compare that to the current compiled.

> I find ./down so convenient that I would like support for it in the
> source format.

The thing is, s6-rc already makes use of ./down internally. When you run "s6-rc -d this-longrun-service", it first brings down everything that depends on this-longrun-service, then creates ./down in this-longrun-service's service directory in live, then calls "s6-svc -d" on this-longrun-service. It says that the service is supposed to be down and remain that way, because that is the state you want to see enforced. s6-rc is a global state manager: if you use it, you delegate all your service management to it.

> Any heuristics will face unsolvable situations. I'd aim at getting the
> patch (dual of diff above) action right all the time first.

That can be done, but with or without heuristics, there will still need to be a tool to actually update the live state. diff is easier than patch; the details of patch are what I'm interested in.

-- Laurent
Re: wait but kill if a max. time was exceeded
On 24/04/2015 13:28, Peter Pentchev wrote:
> Oof, thanks a LOT for taking away the opportunity for me to advertise
> http://devel.ringlet.net/sysutils/timelimit/ :P

Sorry about that. :P It's not a very original idea anyway. busybox timeout, for instance, does the same thing. I'm sure there are plenty of other implementations too.

-- Laurent
Re: s6-rc design ; comparison with anopa
On 25/04/2015 21:38, Colin Booth wrote:
> This actually brings up another question: is there any provision for
> automatic bundling? If sshd requires sshd-log and a oneshot to create
> a chroot directory, does s6-rc-compile also create a bundle to
> represent that relationship, or do we need to define those namings
> ourselves? This is the inverse of my question about loggers being
> implicit dependencies of services.

I'll probably add an automatic bundling feature for a daemon and its logger; however, a oneshot to create a chroot directory? That's too specific. That's not even guaranteed portable :) (chroot isn't in Single Unix.) If you want a change from the default daemon+logger configuration, you'll have to manually set up your own bundle.

> It depends on the person's setup. In my case it's a user-supplied
> dependency, since there's nothing intrinsic to dnsmasq or hostapd that
> requires them to run together, which I currently get around by polling
> for dnsmasq's status from within the hostapd run script. All in all
> it's a semantic difference between dependencies that are needed to
> start (dependencies), and dependencies that are needed for correct
> functioning but are not needed to run (orderings). The nice part is
> that while there is a slim difference between the two, all the
> mechanisms for handling dependencies can also handle orderings, as
> long as the dependency tracker handles user-supplied dependencies.
> Handling user-supplied dependencies also simplifies the system
> conceptually, since people won't have to track multiple types of
> requirements.

What do you mean by user-supplied dependencies ? Every dependency is basically user-supplied - in the case of a distro, the user will be the packager, but s6-rc won't deduce dependencies from a piece of software or anything: the source will be entirely made by people. So I don't understand the distinction here.
> One last thing, and I'm not sure if this has been mentioned earlier,
> but how easy is it to introspect the compiled form without using the
> supplied tools? A binary dumper is probably good enough, but I'd hate
> to be in a situation where the only way to debug a dependency ordering
> issue that the compiler introduced is from within the confines of the
> s6-rc-* ecosystem.

The s6-rc tool includes switches to print resolved bundles and resolved dependency graphs. I will make it evolve depending on what is actually needed. What kind of functionality would you like to see ? (There is also a library to load the compiled form into memory, so writing a specific binary dumper shouldn't be too hard.)

-- Laurent
Re: s6-rc design ; comparison with anopa
On 25/04/2015 11:24, Joan Picanyol i Puig wrote:
> What I'd like is the ability to have some services ready-to-run, but
> not up by default. Some of them might be there for contingency
> purposes (so that an operator can start a failover), some of them
> might have to go up (and down) at certain times only.

If a service S doesn't need to be up for other services to run, then simply don't set dependencies on S, and don't include S in your main bundle of services. Start your main bundle; then, when you want to start S: s6-rc -u S. When you want to stop it: s6-rc -d S. Since nothing depends on S, "s6-rc -d S" will only stop S. While S is considered down from the s6-rc standpoint, its service directory will contain a ./down file.

> I lack the knowledge and experience to provide details, so I'll just
> handwave, pointing out that the first concept to get clear is that of
> identity: is this servicedir a new service definition or a
> modification of an existing one? It should then be feasible to compute
> the modifications needed to the live DAG (inserting/removing nodes as
> well as restarting them).

Yes. Identity is simply defined by the service name. The hard part is what to do when dependencies change for the same name.

-- Laurent
Re: s6-rc design ; comparison with anopa
> On that note, one thing you've apparently done/planned is
> auto-stopping, whereas there is no such thing in anopa. This is
> because I always felt that while auto-starting can easily be
> predictable/have expected behavior, things aren't the same when it
> comes to stopping. That is, start httpd and it will auto-start
> dependency php, which will auto-start dependency sqld; fine. But I'm
> not sure stopping sqld should stop php, despite the dependency. Maybe
> one just wants to shut down sqld, not the whole webserver.

If php depends on sqld, it means php isn't functional when sqld is down. So, if you shut down sqld without shutting down php, your webserver will be unreliable. If you really want to do that, bypass s6-rc and run "s6-svc -d $live/servicedirs/sqld". s6-rc won't notice when you manually down a service; you can keep it down, or manually bring it back up later. It's a case of "if you do this, you're supposed to know what you are doing", and the tool won't try to change it behind your back.

> Then there's also the case of: imagine you start "foobar web". web is
> a bundle with httpd, php and sqld; foobar is a service with a
> dependency on sqld. Now you stop web; should sqld be stopped? What
> about foobar? What is the expected behavior here?

If you stop web, it means you stop httpd, php and sqld - and also foobar, since foobar depends on sqld. If you don't want to do that, don't stop web: stop httpd and php instead. You'll keep sqld and foobar.

Better: make a bundle web = httpd+php and a bundle web+sqld = httpd+php+sqld. Start web+sqld, stop web: sqld and foobar are untouched. Bundles are great. You can define a bundle for every possible combination of services if you're so inclined (it's still 2^n, so don't overdo it, but yeah). Bundle resolution is fast: it's a name lookup in a directory. With ext4, you could have a million bundle names before you'd start noticing slowdowns in resolution.

> I'm not sure there's one; different people/scenarios will have
> different expectations...
> it always seemed too complicated, so I went with no auto-stopping at
> all.

I don't think auto-stopping is any more complex than auto-starting: it's exactly symmetrical. The difficult part is designing the right dependency graph, and that's a job for packagers. Yes, they will screw it up, but please, let's fix the world one step at a time.

> Just so I understand: why are you talking about smbd-log as a separate
> service, and not the logger of service smbd, as usual with s6? Or is
> that what you're referring to here with smbd-log?

A service and its logger are defined as separate longrun services for s6-rc. The compiler will recognize that smbd-log is the logger for smbd-run, create the proper service directory with a log/ subdirectory, and register smbd-run as the service directory for the daemon and smbd-run/log as the service directory for the logger. But in the compiled database, those are two different services. It makes sense, because unlike systemd, we don't want to start daemons before their loggers are operational. And loggers are not a given: they may depend on the oneshot service "mount /var/log", for instance. It would be silly to have the daemon itself depend on /var/log.

> I'm not sure where you put the scandir in this? I believe the original
> servicedirs were taken from the source and put into the compiled, and
> s6-rc will create the actual servicedirs in the scandir with
> s6-rc-init, correct? So that would be part of live? How about the
> oneshot scripts?

s6-rc-init is supposed to run in stage 2, so it assumes there is an operational scandir already (probably empty or almost empty). You give the location of the scandir as an argument to s6-rc-init. s6-rc-init creates live, makes $live/scandir a symlink to the scandir given as an argument (so s6-rc always knows how to find it), and creates all the service directories under $live/servicedirs, with down files. Then it symlinks them all into the scandir and calls s6-svscanctl -a.
The oneshot scripts are read-only; they don't need anything in live, so they're part of the compiled. Of course, there's a $live/state file that maintains the state of all atomic services, oneshots as well as longruns.

> Also, with anopa I wanted to fill in what's missing in s6, and that
> included a full init system, so stage 1. You originally said s6-rc was
> meant to be a full init system, but now you're saying you have another
> tool/package in mind for that, s6-init. That's fine, but with anopa I
> wanted the whole thing, yes.

Which is great! Making stage 1 more accessible is arguably more urgent or important than managing services, but it's also harder to do in a portable way; that's why I started working on s6-rc first. It's more or less the only reason.

> One thing though: does your compilation process only copy servicedirs,
> or is there more (besides packing the whole thing into a binary form
> alongside dependency graphs)?

It actually creates service directories depending on the information that has been given in the
Re: s6-rc design ; comparison with anopa
On 25/04/2015 09:35, Colin Booth wrote:
> I've been having a hard time thinking about bundles the right way. At
> first they seemed like first-class services along with longruns and
> oneshots, but it sounds like they are more of a shorthand to reference
> a collection of atomic services than a service in their own right.
> Especially since bundles can't depend on anything, it clarifies things
> greatly to think of them as a shorthand for a collection of atomic
> services instead of a service itself.

Yes, that's exactly what a bundle is: a shorthand for a collection of atomic services. Sorry if I didn't make it clear enough.

> As long as A depends on B depends on C, if you ask s6-rc (or whatever)
> to shut down A, the dependency manager should be able to walk the
> graph, find C as a terminal point, and then unwind C to B then finally
> A. While packagers will screw up their dependency graph, they'll screw
> it up (and fix it) in the instantiation direction.

If A depends on B, which depends on C, and you ask s6-rc to shut down A, then it will *only* shut down A. However, if you ask it to shut down C, then it will shut down A first, then B, then C. For shutdowns, s6-rc uses the graph of reverse dependencies (which is computed at compile time).

> Will s6-rc make loggers implicit dependencies of services, or will we
> need to define that? In other words, if we have a bundle 'ssh-daemon'
> that contains the longruns sshd and sshd/log, will the dependency
> compiler correctly link those so that when you ask the bundle to
> start, it brings up sshd/log first, and when you ask it to stop, it
> brings down the logger last?
Unless there's a compelling reason not to: if there's an sshd service and an sshd-log service, and there's an annotation somewhere in the definition of sshd or sshd-log saying that sshd-log is the logger for sshd, then s6-rc-compile will automatically create a dependency from sshd to sshd-log, and ensure that $live/servicedirs/sshd is the service directory for sshd and $live/servicedirs/sshd/log is the service directory for sshd-log. So yes, when you start the ssh-daemon bundle, s6-rc will start the logger first and then the daemon, and stop the daemon first and then the logger.

> Are oneshots assumed (required) to be idempotent? Or does $live/state
> track whether a oneshot has already been fired, and no-op if that is
> the case?

$live/state tracks oneshots, of course. :)

> Not to be too pedantic, but how often do daemons ask for user input? I
> don't know about you, but being required to pass user input on boot is
> a big no-no in my book.

Tell that to Olivier. :) ISTR some encrypted filesystems require a passphrase to be given at mount time, so this is a real use case. I don't intend to add terminal support to s6-rc; if the problem comes up, i.e. if several terminal-using services are started in parallel and conflicts arise, I'll think of a specific solution in time.

> So components of a bundle can fail and the bundle is still considered
> to be functional? This makes sense only if bundles are really tag sets
> or some other loose grouping.

Yes, that's what they are. The atomic services that can be started are still started; however, the "s6-rc -u bundle" invocation will exit nonzero, since some atomic services failed. It is then possible to use some UI to list running atomic services and see what has succeeded and what has failed.

> If you have a wireless router running hostapd (to handle monitor mode
> on your radios) and dnsmasq (for dhcp and dns), you're going to want
> an ordering where dnsmasq starts before hostapd is allowed to run.
> There isn't anything in hostapd that explicitly requires dnsmasq to be
> running (so no dependency in the classic sense), but you do need those
> started in the right order to avoid a race between a wireless
> connection and the ability to get an IP address.

Hmmm. If hostapd dies and is restarted while dnsmasq is down, the race condition will also occur, right ? Since hostapd, like any longrun process, may die at any time, I would argue that there's a real dependency from hostapd on dnsmasq: if dnsmasq is down, then hostapd is at risk of being nonfunctional. I still doubt "ordering without dependency" is a thing.

> Shouldn't this be handled prior to the stage-2 handoff? At the tail
> end of stage 1, you try to start a getty along with the catch-all
> logger. If that fails, we bring up our debug shell. Assuming it
> doesn't fail, stage 2 starts, s6-rc does its work and brings up the
> 'gettys' bundle, which no-ops on getty-1 (already started) and brings
> up 2-N. As long as s6-rc is constrained to stage 2 and multiple start
> attempts on a service remain idempotent, we should be able to use the
> same shell escape-hatch mechanisms in the early stages without having
> to add extra logic handling to the application.

Yes, for the initial getty or debug shell, this can be directly handled in stage 1. For other conditional executions, I think conditionally running different s6-rc invocations will be enough. In any case, it can be done
github and dropbear
Since a few days ago (though I hadn't tried committing anything for a long time before that, so I'm not sure when it started), I've had trouble pushing commits to the GitHub mirrors of my packages. I push via git over SSH, with the dropbear SSH client, dbclient, which reports:

  dbclient: Connection to g...@github.com:22 exited: Integrity error

i.e. it interprets GitHub's packets as corrupted. I can't pull from GitHub either: same symptoms.

Could anyone who uses GitHub and dropbear (I figure there are such people on this list :)) try pulling from or pushing to GitHub and tell me whether they experience the same, or whether it is just me ? Could anyone using git over SSH with another SSH client try pulling from or pushing to GitHub and tell me whether it works for them ?

Thanks,

-- Laurent
Re: s6-rc design ; comparison with anopa
On 27/04/2015 07:59, Colin Booth wrote:
> OpenSSH, at least on Linux and *BSD, chroots into an empty directory
> after forking for your login. That was an example, but I think the
> question is still valid: if you have a logical grouping of longrun
> foo, longrun foo-log, and a oneshot helper bar, where foo depends on
> foo-log and bar, does s6-rc automatically create a bundle containing
> foo, foo-log, and bar?

No, because it cannot tell where the logical grouping ends. Chances are that bar, or foo, or foo-log, depends on something else; you don't run services, or even service groups, in a void - see below.

> In hindsight, this question could probably have been better asked as
> follows: does s6-rc automatically create bundles for each complete
> dependency chain?

No, and not only because of the naming - simply because it would not make sense to shut down such a bundle. If foo-log depends on a mountvarlog oneshot that mounts /var/log (because foo-log logs into /var/log/foo), and mountvarlog depends on a mountvar oneshot, then an automatic foo-bundle would include foo, foo-log, mountvarlog and mountvar. Shutting down foo-bundle would shut down everything in the bundle, so it would bring down everything that depends on /var and /var/log, then unmount /var/log, then unmount /var. That is probably not what you want. :)

A bundle is only a good idea when it makes sense to bring all its contents up or down at the same time. This is the case for a daemon and its logger; this is the case for a runlevel. It is not the case for a complete dependency chain, which you don't need to bundle anyway, since s6-rc will properly act on dependencies.

> It's mostly a distinction between "does service foo start but maybe
> not do the right thing if bar isn't running" (httpd and php-fpm) vs.
> "does service foo crash if bar isn't running" (basically everything
> that depends on dbus).

Call me uncompromising, but I don't think "starts but maybe doesn't do the right thing if bar isn't running" is a useful category.
Either foo can work without bar or it cannot; the unwillingness to make a decision about it is unjustified cowardice - uncertainty and grey areas bring nothing but suckiness and pain to a software system. If the user decides that foo can work without bar, then fine: he won't declare a dependency from foo on bar. If he decides otherwise, he will declare such a dependency. But s6-rc will not provide support for wishy-washiness.

> Also, by no means was I trying to imply that s6-rc should deduce
> anything. If anything, I was saying that, as an SA, being able to take
> implicit dependencies that mostly exist in the form of polling loops
> in run scripts and other such hackery (such as my wireless setup) and
> rewrite them as explicit dependencies for the state manager to manage
> sounds great, and is probably the part of s6-rc that I'm most excited
> about.

Well, yes, that's the only reason why I think a real state manager can be useful :)

> Low-surprise interoperability with standard Unix tools, mostly.
> Assuming the compiled format isn't human-readable, having the
> functionality to do a human-readable dump to stdout (so people can
> diff, etc.) is totally fine.

Yes, that's planned.

> If we can hit the compiled db directly with the same tools and get
> meaningful results, then all the better.

The db is in binary form, so you'll need the dumping tool.

-- Laurent
Re: [PATCH] devd: Fix invalid option used for s6-uevent-listener
Ah, good catch. Patch applied, thanks. It's available in the current git.

Note that I still can't push to GitHub, because they've broken their sshd's compatibility with dropbear. I've reported the issue, but it hasn't been fixed yet. Until they fix it, the GitHub mirror for skarnet.org packages will be stale.

-- Laurent
Re: Very basic question, regarding redirects
On 11/05/2015 13:52, Scott Mebberson wrote:
> I'm working on an addition to the s6-overlay project. I want to make
> it super easy to create environment variables within a Docker
> container.

IIRC, /var/run/s6/container_environment is meant to hold the variables that the container was actually started with; outside interaction with the container may rely on that (i.e. commands started with with-contenv may not work correctly if that environment has been modified). I'm not sure it's safe to do that.

But if that's what you want, you can just type

  redirfd -w 1 /var/run/s6/container_environment/env_var_name
  s6-echo -- env_var_value

and you're done. I'm not sure it justifies a specific script, unless you want to add some more processing. Be aware that with-contenv reads the environment variables verbatim from the files, so you have to manually strip newlines and otherwise sanitize env_var_value before storing it.

> tr a-z A-Z

You can't do that in the overlay, because there's no guarantee that every image will have a tr binary. That's the reason why everything in the overlay is written in execline and uses s6-* binaries: no external dependencies at all, so it works with every possible image. Just trust the user to use the correct filename.

Also, it is valid to use lowercase in environment variables - upper case is just a convention used to work around a deficiency of the shell, i.e. treating internal shell variables and environment variables the same way (which makes it harder to tell the difference; that's why people usually reserve full upper case for the environment).

> But I can't get it to work. I guess redirection isn't supported? I've
> got no idea really, I'm lost!

http://skarnet.org/software/execline/redirfd.html :)

Good luck,

-- Laurent
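The "strip newlines" caveat above matters because the shell's usual tools append one. A quick shell illustration - the env/ directory name here is just a sketch, not the real overlay path:

```shell
# Storing a value in an envdir-style file, one file per variable.
mkdir -p env

# echo appends a newline, which would end up inside the stored value:
echo 'some value' > env/BAD_VAR

# printf '%s' writes the value verbatim, with no trailing newline:
printf '%s' 'some value' > env/GOOD_VAR

wc -c < env/BAD_VAR    # 11 bytes: the 10-byte value plus '\n'
wc -c < env/GOOD_VAR   # 10 bytes: the value only
```

A tool that reads the file verbatim, as with-contenv is described to do, would see the stray '\n' in BAD_VAR as part of the variable's value.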
Small spring releases
Hello, Just a few small releases to keep you waiting until s6-rc is ready. :P skalibs-2.3.4.0: Cleanups and bugfixes. New stat_at() and lstat_at() functions. http://skarnet.org/software/skalibs/ git://git.skarnet.org/skalibs execline-2.1.2.0: New command: trap. It does what it sounds like it does. http://skarnet.org/software/execline/ git://git.skarnet.org/execline s6-portable-utils-2.0.5.0: - s6-sort fixed to work with skalibs-2.3.4.0. - New command: s6-seq. It also does what it sounds like it does. This is for use in environments where seq is not guaranteed available - for instance container overlays. http://skarnet.org/software/s6-portable-utils/ git://git.skarnet.org/s6-portable-utils Enjoy, Bug-reports welcome. -- Laurent
Re: execline's pipeline (and forbacktickx) with closed stdin
On 10/05/2015 00:07, Guillermo wrote: I ran into this while experimenting with the example / template stage 1 and 3 init scripts that come with s6's source code. Both of them do an early fdclose 0 to ignore input. Wouldn't that be tempting the demons to fly through your nose, then? :) Know that I precisely audited the whole series of programs running with 0, 1 and 2 closed in those stages before answering you. And there's no risk there, it works. :) (In stage 3, you can replace fdclose 0 with redirfd -r 0 /dev/null and it will be conformant. No such easy way out in stage 1, though, if you want to change /dev after the kernel has mounted it. Note that you don't have to do that if you're using devtmpfs and keeping it.) Anyway, I had either replaced the early fdclose 0 with redirfd -r 0 /dev/null (and also realized that worked by accident, because I somehow have a nonempty /dev at startup) or delayed it a bit. I suppose that's good enough... Did you really manage to umount /dev (maybe) and mount a tmpfs over it (for sure) with fds still open to the old /dev ? Without an EBUSY error ? If it's the case, and you're using Linux, then the kernel's behaviour changed. -- Laurent
Re: execline's pipeline (and forbacktickx) with closed stdin
On 10/05/2015 20:14, Guillermo wrote: I mostly followed the example init scripts, but I did deviate, among other things to delegate tasks to OpenRC (as a bundle of oneshot services in s6-rc terminology, hehe). And now that you remind me they were originally there, I don't have the fdclose 1 and fdclose 2 in stage 1 either. I wanted the stage 2 init to have open FDs to /dev/console, to show OpenRC's output. Aren't you redirecting all the logs to a catch-all logger, the one I call s6-svscan-log in the examples/ subdirectory ? It's entirely possible to do without a catch-all logger, but if there's ever something wrong with the supervision tree (e.g. one of your services cannot start), your /dev/console will get spammed. No kernel complaints when sysvinit is process 1 (I don't know what it does with its open FDs), no kernel complaints with the custom stage 1 init having devfs start with open FDs to /dev/console and /dev/null. There is no unmounting /dev (maybe that's specifically what would trigger an EBUSY?). The kernel has CONFIG_DEVTMPFS_MOUNT=n, so no devtmpfs is mounted after the rootfs. And when I boot with init=/bin/bash, mount shows nothing but / and a manually mounted /proc. However, /dev is not empty. I don't know where the device nodes came from, but they are there. I didn't create them, maybe they were put in there during Gentoo's installation, I don't know. In fact, the after-boot /dev looks quite different (and I do see a devtmpfs mounted). A mount --bind / /tmp shows me the original boot time /dev, so I suppose the root FS has actual static device nodes. Yeah, that must be it: static device nodes in your rootfs, but they are overridden by the ones in your devtmpfs when you mount it. But when I tried that a few years ago, mounting a new /dev when there were still fds open to it simply didn't work. They must have changed that. And if the behaviour you're observing can be consistently relied upon, that's pretty good news. -- Laurent
Re: execline's pipeline (and forbacktickx) with closed stdin
On 09/05/2015 01:13, Guillermo wrote: Are we not supposed to use pipeline or forbacktickx with a closed stdin, or is this something that needs fixing? Honestly... both. It's Complicated (tm). I read your mail yesterday, shortly after you wrote it, but it opened a rabbit hole in more than one way. And the correct answer is: both. I consider it a bug, because there are cases where I do need to run programs with fds 0, 1 or 2 closed, and I generally try to pay attention to this. So I've pushed a fix to the current execline git, please tell me if it works for you. However, POSIX considers that UB is acceptable when you run a program with 0, 1 or 2 closed: look for "If file descriptor 0" in http://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html So, stricto sensu, it's a case of "don't do that" - it's acceptable for pipeline, and other programs, to fly demons through your nose when you run them with stdin closed. And Unix primitives do not make it easy to support that case without bugs. I have run into that problem before, and your report is just another incarnation of the problem, and I'm sure there are other similar ones hiding in my code. Whenever you open a file, Unix guarantees that the descriptor you get is the smallest unused number. So if you run a program with 0 closed, when the program opens something, or creates a pipe, or anything that requires a descriptor, descriptor 0 will be used. If you then need to exec into a program with 0 redirected, you should pay attention to not overwrite your (internal) descriptor 0 when you dup2() into it. And dup2() when both descriptors are the same does not clear the close-on-exec flag, which leads to the problem you observed. I'll try to support the case as much as I can, and squash those bugs whenever they're found, but still, don't do that - Big Bad POSIX will bite you if you do. -- Laurent
Re: skalibs-2.3.5.0 leapsecs.dat
On 08/06/2015 12:21, Vallo Kallaste wrote: The leapsecs.dat from skalibs-2.3.5.0.tar.gz matches with old leapsecs.dat file on the three old systems I tried. Has it been updated? Has it been 3 years already ? My, how time flies. Or leaps. Sorry, I completely missed the new leap second announcement. I should check on those things more often, thanks for the reminder! skalibs-2.3.5.1 should be available now with an updated leap second table. -- Laurent
Re: [PATCH] forstdin: Fix possible error/hanging w/ -p
On 09/06/2015 20:17, Olivier Brunel wrote: + if (pids.s) sig_block(SIGCHLD) ; (...) +sig_unblock(SIGCHLD) ; Gah. Of course that's it - the noob mistake of looping around fork() without blocking SIGCHLD. That's so, so bad - I'm really ashamed. :( That's what happens when you rely on selfpipes all the time: you forget how to properly deal with signals without them! I did some tests after changing the final waiting logic to sig_pause(), and didn't get any errors, so I figured it was good - but it obviously wasn't. Thanks for the report and the fix! Applied in current git, new release soonish. I'll still keep the sig_pause() part: it's actually more ad-hoc work to remove the signal handler and enter a blocking wait() loop than to simply let the signal handler do its job until there's nothing left. I find the latter more elegant, even if it didn't work as a fix for the race condition. -- Laurent
Re: how can I install runit on oracle linux?
On 09/06/2015 23:09, Amin Rasooli wrote: For the past few days I have been banging my head against the wall to figure out how to install runit on Oracle Linux. Could someone give me a working repository? (all the ones on the internet seem to be unavailable) Hi Amin, Wrong list! I didn't write runit. Gerrit Pape did. :) runit is discussed on the supervision mailing-list, not the skaware one. That said, have you tried the tarball from the original site? That should work on any flavour of Linux. http://smarden.org/runit/install.html Of course, you could always switch to s6. :P -- Laurent
Re: how can I install runit on oracle linux?
On 09/06/2015 23:37, Amin Rasooli wrote: Sorry for emailing the wrong mailing list. Do you happen to have a repository that I can add? The tar installation doesn't seem to be working for me. That's weird. You may want to report a bug to Gerrit or to the supervision mailing-list, with the exact error messages you are getting. Sorry, I don't know about runit repositories. -- Laurent
Re: [announce] s6-linux-init-0.0.1.0
On 2015-06-19 19:19, Les Aker wrote: Looks like s6-linux-init 0.0.1.0 pulls s6 in as a build-time dependency. Not a huge issue, but might be worth updating the docs to clarify that until the next release removes that? I've learned to trust your docs and build tools enough that I spent a while hunting for what I was doing wrong :) Actually, the docs provided with the 0.0.1.0 tarball would tell you that :) I changed the dependencies in a later git commit, and updated the online docs to match that commit, but it means they don't match the 0.0.1.0 release anymore. My apologies for the misdirection here. That's an interesting question: should the docs match the latest versioned release or the latest git commit ? There are pros and cons for both. Anyway, please use the latest git, it comes with a few other fixes in addition to removing almost all build-time deps. "Also, I'm in favor of making the shebang use bindir" Also done in the latest git. -- Laurent
[announce] s6-2.1.5.0, s6-linux-init-0.0.1.1
Hello, s6-2.1.5.0 is out. It adds SIGHUP support to s6-log for a clean exit even when the -p option is present. This is useful when bringing down a supervision tree logging its output into an s6-log -p instance: the instance survives, but the .s6-svscan/finish script still has a fd open to it, so the logger is still alive - which is intentional, but it's good to have a way to cleanly kill the logger nonetheless. http://skarnet.org/software/s6/ git://git.skarnet.org/s6 s6-linux-init-0.0.1.1 is out. It has no build-time dependencies except skalibs as documented; it uses bindir in every shebang in order to work with kernels that don't do PATH resolution. It also fixes a bug where the finish script could hang forever in some cases (the fix relies on the s6-log change above). http://skarnet.org/software/s6-linux-init/ git://git.skarnet.org/s6-linux-init Enjoy, Bug-reports welcome. -- Laurent
Re: [announce] s6-2.1.5.0, s6-linux-init-0.0.1.1
On 25/06/2015 20:16, Les Aker wrote: I'm not sure if this is intentional given your latest update, but it looks like the GitHub mirrors for s6 and s6-linux-init don't have the new versions. Because I forgot to tag them. Fixed now. :) However, don't grab those - better versions are coming out tonight. -- Laurent
[announce] s6-2.1.6.0, s6-linux-init-0.0.1.2
Hello, (It's always like that. No matter how many checks you perform before hitting the release button, you always discover the worst bugs right afterwards.) s6-2.1.6.0 is out. It adds a -X command to s6-svc, that is like -x except it makes s6-supervise instantly close its stdin/stdout/stderr before priming for exit. This is used in the latest version of s6-linux-init. http://skarnet.org/software/s6/ git://git.skarnet.org/s6 s6-linux-init-0.0.1.2 is out. Oh boy. The race condition that all the crafty fifo shenanigans were supposed to avoid was still there, sneakily hidden in the ugliness of the aforementioned shenanigans. Well, now it's fixed. Also, the possible hang in stage 3 has been fixed too, and the catch-all logger should exit cleanly as soon as nothing is writing to it anymore, but not too early. I'm reasonably confident that this version works. :) http://skarnet.org/software/s6-linux-init/ git://git.skarnet.org/s6-linux-init Enjoy, Bug-reports welcome. -- Laurent
Re: [PATCH 1/2] s6dns_resolve_parse: always clean up dt, prevent fd leaking
On 11/06/2015 16:06, Roman I Khimov wrote: It was noted that with no servers in resolv.conf s6-dns always leaks an fd after s6dns_resolve_parse_g() usage. I wasn't able to trace it deeper, but always cleaning up in s6dns_resolve_parse() won't hurt. Thanks for the report! However, this isn't the correct fix. I have committed the correct one in the latest git head. (s6dns_resolve_core is supposed to recycle dt itself if it fails.) I also have applied your second patch, thanks :) New release coming soon. -- Laurent
Re: [PATCH] s6-uevent-spawner: Fix possibly delaying uevents
On 14/06/2015 21:57, Olivier Brunel wrote: That is, in your test now you're using x[1] even though it might not have been used in the iopause call before, so while I guess this isn't random memory, it doesn't really feel right either. You're right, of course, that's why the else was there in the first place, and removing it can't be done thoughtlessly. I've committed something closer to your patch. It's still simpler because I eliminated redundant tests. "and we're gonna block in handle_stdin." That was actually another bug... stdin should be non-blocking. Fixed. -- Laurent
Re: [PATCH] s6-uevent-spawner: Fix possibly delaying uevents
On 14/06/2015 14:37, Olivier Brunel wrote: Because of the buffered IO, the possible scenario could occur: - netlink uevents (plural) occur, i.e. data ready on stdin - iopause triggered, handle_stdin() called. The first uevent is processed, child launched, we're waiting for a signal - SIGCHLD occurs, we're back to iopausing on stdin again, only it's not ready yet; Because we've read it all already and still have unprocessed data (uevents) on our own internal buffer (buffer_0) Right, thanks for the catch. I usually avoid that trap, but meh. I committed a simpler change than your patch, please tell me if it fixes things for you. -- Laurent
[announce] s6-2.1.4.0
Hello, s6-2.1.4.0 is out. It features: - Direct readiness notification support in s6-supervise (and consequently deprecation of the s6-notifywhenup binary). - Optimization of the service respawn delay by s6-supervise: the security delay is now one second between two successive ./run executions, instead of one second after the service is down. In other words: if a service that just died had been running for more than one second beforehand, s6-supervise will restart it immediately. - Support for SIGUSR1 in s6-svscan, traditionally meaning poweroff the machine. - Easier stage 1 init support for Linux users via a new package, s6-linux-init. (See the next announcement on the skaware mailing-list.) http://skarnet.org/software/s6/ git://git.skarnet.org/s6 Enjoy, Bug-reports welcome. -- Laurent
Re: [announce] s6-linux-init-0.0.1.0
On 18/06/2015 05:12, Guillermo wrote: I did a quick run and found out that in generated execline scripts except the stage 1 init, the shebang line starts with #!execlineb. Yes (unless you use slashpackage). And on the machines where I tested, it's not a problem as long as execlineb is in the PATH - the kernel still finds the binary. This is certainly nonstandard, and surprised me, but I saw it work, so since the package is Linux-specific anyway, I didn't change it. I can change it to bindir if it doesn't work on some configurations, though. -- Laurent
[announce] s6-linux-init-0.0.1.0
Hello, s6-linux-init-0.0.1.0 is out. It is a new package that, for now, only contains one program, and more documentation than source code. :) Its goal is to automate the creation of stage 1 (/sbin/init) binaries for people who want to run s6-svscan as process 1. Unfortunately, it has to be system-specific; this is the Linux version because it's the OS I know the best. However, it is very possible to run it, examine the created scripts, and adjust them to another system's idiosyncrasies. http://skarnet.org/software/s6-linux-init/ git://git.skarnet.org/s6-linux-init Mirrored on github as well. Enjoy, Bug-reports and suggestions welcome, especially since it's still brand new and probably rough around the edges. -- Laurent
Re: Native readiness notification support in s6-supervise
On 16/06/2015 04:47, Guillermo wrote: In the examples/ROOT/img/services-local/syslogd-linux subdirectory, there is an implementation of the syslogd service for Linux, using s6-ipcserver with the -1 option and s6-notifywhenup for readiness notification. Maybe you could modify it in the s6 git head to use the new s6-supervise feature, too. Right. Fixed. -- Laurent
Native readiness notification support in s6-supervise
When you use s6-notifywhenup, or any readiness notification helper that is not the service's direct supervisor, there is still a small race condition - which can only bite in a very, very pathological case, when the stars align in an incredibly evil way and your system's scheduler decides that it really hates you. But it's theoretically still there. I don't like it. So I bit the bullet and implemented readiness support directly in s6-supervise. Come at me, pathological cases. Now, instead of using s6-notifywhenup, you write a fd number into a notification-fd file in the service directory. s6-supervise picks that file up when starting the service, and will read notification messages (i.e. everything up to the first newline) that the daemon writes to the specified file descriptor. Quick upgrade HOWTO: if you were using s6-notifywhenup -d FD -- foobard before, run echo FD > notification-fd in your service directory, and just use foobard in your run script now. Special annoying case: if FD is 1, i.e. the default, and your service is logged, you'll want to do the following instead: echo 3 > notification-fd, and then use fdmove 1 3 foobard in your run script. (Because the notification pipe will conflict with the logging pipe otherwise.) The feature is available in the latest s6 git head, which is also a release candidate for 2.1.4.0. Please test the feature and send your comments. If no bugs or immediate improvements are found, I'll make the release soon. -- Laurent
Re: Packaging skaware for Guix
On 06/07/2015 12:13, Laurent Bercot wrote: may not find it without the proper --with-libdir option. I meant --with-lib, of course. -- Laurent
Re: Packaging skaware for Guix
Hi Claes ! For me execline fails to build from source because -lskarnet is listed as a dependency instead of in EXTRA_LIBS. Is this on purpose? Yes, this is intentional. EXTRA_LIBS is only used for things like -lrt or -lsocket which are needed for some binaries on some systems (and I don't think any execline binary needs them). -lskarnet works as a dependency because GNU make will translate it to libskarnet.a or libskarnet.so depending on what you have and the configure options you've given. Are you using GNU make 4.0 or later ? What's the exact configure and make command line you are using, and the exact error message you're getting ? Bear in mind that libskarnet.a is installed in /usr/lib/skalibs/ by default, and if you change anything in the execline config, it may not find it without the proper --with-libdir option. I patched tools/gen-deps.sh (recognize ${LIB,*} and -l.* as libraries, in addition to ${.*_LIB}) and added a phase to generate a new package/deps.mak before doing the compilation, and then it worked. You should not have to patch anything. You can always get the behaviour you want by providing the appropriate options to configure. I'm surprised nobody else seems to be having this problem. In particular, the Nix definition is just a plain build from source and apparently Just Works. What am I missing? I don't know, but your error messages will tell. :) -- Laurent
Re: Packaging skaware for Guix
On 06/07/2015 14:07, Claes Wallin (éŸ‹å˜‰èª ) wrote: ./configure: error: target x86_64-unknown-linux-gnu does not match the contents of /gnu/store/hynkavlnn6j0x6aifrawx9d27j6vmzb1-skalibs-2.3.5.1/lib/skalibs/sysdeps/target Weird. What's the content of /gnu/store/hynkavlnn6j0x6aifrawx9d27j6vmzb1-skalibs-2.3.5.1/lib/skalibs/sysdeps/target ? This error usually means you used a sysdeps directory meant to represent another architecture, which is of course not good - but I've never seen it occur if you're *not* cross-compiling. -- Laurent
Re: Packaging skaware for Guix
Interesting, never heard of this make feature before. Does that only work with static libs? Because the .so is in the search path. It's supposed to work with both static and shared libs. The defaults for skarnet.org package installations specify *different* directories for static libraries and shared libraries. It is a misdesign of autotools, and of other build systems, that those installation directories are the same by default: static and shared libraries are very different objects and should not be handled the same way or stored at the same place. (Shared libraries are a run-time object; static libraries are a compile-time object, only used for development.) Or is it the case that ld/gcc understand LIBRARY_PATH but make doesn't? Yes. CPATH and LIBRARY_PATH are gcc-specific constructs. The equivalent of LIBRARY_PATH for make is named vpath, and needs to be declared in the Makefile. The configure script builds such a vpath with the contents of the given --with-lib options. --enable-fast-install The configure scripts are not generated by autotools. This flag won't do anything. Please see ./configure --help to see what flags are supported. skalibs lib and includes are found using CPATH and LIBRARY_PATH. Yeah, that will work with gcc but make will not find libraries. As you could see for yourself, it works with --with-lib. :) -- Laurent
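The vpath mechanism described here can be sketched as a GNU make fragment (the paths and target are examples, not the actual generated Makefile):

```make
# GNU make resolves a "-lskarnet" prerequisite through .LIBPATTERNS
# (lib%.so, then lib%.a, by default), searching vpath directories.
# The configure script emits a vpath from the --with-lib options:
vpath lib%.a /usr/lib/skalibs
vpath lib%.so /usr/lib

# So -lskarnet works as a real dependency: the target is relinked
# whenever the library it resolved to changes.
prog: prog.o -lskarnet
	$(CC) -o $@ $^
```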
Gentoo building bug: should be fixed
Hi, Guillermo reported a bug discovered here: https://bugs.gentoo.org/show_bug.cgi?id=541092 The latest git versions of skarnet.org packages should fix the issue (with the wonderful magic of XYZZY!), so the Gentoo workaround should now be unnecessary. The latest versions also rework how shared libraries are built: now they are linked against their dependencies. This may be ugly in some cases, but is the safe option, and non-totally-braindead systems shouldn't see too much ugliness. I'll do some more testing tomorrow, and if everything appears to work fine, I'll do the official releases. -- Laurent
Re: Preliminary version of s6-rc available
Should be all fixed, thanks! -- Laurent
Re: Preliminary version of s6-rc available
On 22/08/2015 08:26, Colin Booth wrote: I run my s6 stuff in slashpackage configuration so I missed the s6-fdholder-filler issue. The slashpackage puts full paths in for all generated run scripts so I'm a little surprised it isn't doing that for standard FHS layouts. FHS doesn't guarantee absolute paths. If you don't --enable-slashpackage, the build system doesn't use absolute paths and simply assumes your executables are reachable via PATH search. Unexported executables are a problem for FHS: by definition, they must not be accessible via PATH, so they have to be called with an absolute path anyway. This is a problem when using staging directories, but FHS can't do any better. Here, I had simply forgotten to give the correct prefix to the s6-fdholder-filler invocation, so the PATH search failed as it is supposed to. -- Laurent
Re: skaware tests?
On 21/08/2015 22:05, Buck Evan wrote: Is there any kind of test suite for skalibs/execline/s6 ? I would love it if there were one. :) See http://skarnet.org/cgi-bin/archive.cgi?1:mss:276:201502:mndjpngghogjemeljjac and the ensuing thread, also available at https://www.mail-archive.com/skaware@list.skarnet.org/msg00270.html -- Laurent
Re: skaware manpages?
On 21/08/2015 22:10, Buck Evan wrote: @Laurent: What's your take on man pages? Short version: I like them, as long as I don't have to write them or move a finger to generate them. Long version: I honestly believe man pages are obsolete. They were cool in the 90's when they were all we had; but today, *everyone* has a web browser, and can look at HTML documentation. Even if they don't have an Internet access. I still find myself typing man sometimes. It's a reflex because I'm a dinosaur. But if it doesn't work, I don't mind: the documentation *is* somewhere, I just have to grab my browser. GNU people never write man pages. They write info pages. That blows, and I'd rather look at the source code to understand what it does than install and run an info client. Fortunately, the documentation is also available in HTML, so I go read the doc on the web. When I was writing my build system, I was very, very glad that the make manual was available in HTML; I spent hours on that document, with several tabs open at various places - browsers are user-friendly. Much more so than xterms running a rich text visualizer. So, info2html, man2html, or SGML/DocBook source, and so on? Well, as much as I love Unix, one aspect of it that I really dislike is the proliferation of markup languages. nroff is one, info is another one, pod is one, and so on; I've stopped counting the number of initiatives aiming to produce rich text. I've always managed to avoid learning those languages. I've only learned LaTeX and HTML; I quickly forgot the former as soon as I was out of academia and didn't need it anymore, and I only memorized the latter because it's ubiquitously useful. Markup, or markdown, languages, are really not my cup of tea; and if I didn't learn nroff in 1995, when there actually was a serious use case for it, I'm definitely not going to learn it today. I'll keep providing HTML docs, and only HTML docs. 
If you want to provide man pages, you're very welcome to it, as long as I don't have to do anything. :P Since I don't believe in the future of man pages, I even think that only providing stub man pages would be perfectly acceptable: in the man page, only have a link to the relevant HTML document, on the local machine as well as on the Web. If you don't like stubs, heinous scripts should produce more acceptable results than you think. I try to keep a reasonably regular format for the doc pages of executables; I don't mind enforcing the regularity a bit more seriously if it makes your scripts easier or more accurate. -- Laurent
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 16:43, Colin Booth wrote: Yeah, this is for the special case where you have a daemon that doesn't do readiness notification but also has a non-trivial amount of initialization work before it starts. For most things, doing the oneshot/longrun split discussed below is best, but sometimes you need to run that initialization every time (data validators are the most obvious example). In that case, yes, if { init } if { notification } daemon is probably the best. It represents service readiness almost correctly, if "service" includes the initialization. It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back. systemd's notification API is a pain. It forces you to have a daemon listening on a Unix socket. So basically you'd have to have a notification receiver service, communicating with the supervisors - which eventually makes it a lot simpler to integrate everything into a single binary. This API was made to make systemd look like the only possible design for a service manager. That's political design to the utmost, and I hate that with a passion. I have a wrapper to make things work the other way (i.e. using s6-like daemons under systemd), but a wrapper that would actually understand sd_notify() notifications would be much more painful to write. Actually, the more I think about it, the less s6-rc-update will help me avoid reboots in the short term since part of what I need to get back is a pristine post-boot environment. What do you have in that post-boot environment that would be different from what you have after shutting down all your s6-rc services and wiping the live directory ? -- Laurent
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 10:57, Laurent Bercot wrote: s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholder/notification-fdpost addition of notification-fd Looks like a missing/wrong string terminator. Thanks for the report, I'll look for it. I can't grep the word addition in my current git, either s6 or s6-rc. Are you sure it's not a message you wrote? Can you please give me the exact line you're running and the exact output you're getting? Thanks, -- Laurent
Re: [announce] s6-2.2.0.0
On 28/07/2015 16:59, Patrick Mahoney wrote: If I understand correctly, any 'readiness' reporting mechanism must originate from the run script (to inherit notification-fd). Yes. Do you have any suggestions for adding readiness support to a daemon *without modifying that daemon*? It's still possible, and exactly as hackish as before. :) Previously, I had been using s6-log to match a particular line from the daemon's log output (e.g. $service is ready), sending the matched line to something like 'cd .. s6-notifywhenup echo' (note: a child process of run/log, not run). Yes, I remember that case. To support the same through the notification-fd, I can imagine a rough scheme such as: in run: background { if { s6-ftrig-wait fifodir U } fdmove 1 3 echo } ... run the daemon in log/run pipeline -w { forstdin -d\n i s6-ftrig-notify ../fifodir U } s6-log - +daemon is ready 1 + t n20 !gzip -nq9 logdir Yes, something like that would work. No need for a fifodir: a simple named pipe would do, just make sure only your logger writes to it and only your service reads from it. My take would be something like: ./run: fdmove -c 2 1 foreground { mkfifo readiness-fifo } background -d { fdmove 1 3 redirfd -r 0 readiness-fifo head -n 1 } fdclose 3 daemon ./log/run: redirfd -w 1 ../readiness-fifo s6-log - +daemon is ready 1 rest-of-your-logging-script -- Laurent
Re: Bug in ucspilogd v2.2.0.0
On 09/08/2015 09:27, Colin Booth wrote: I haven't yet dug into the skalibs code to see what changed between those tags, or started bisecting it to find out which commit broke. The git diff between 2.3.5.1 and current HEAD is pretty small, and there's really nothing that changed in the graph of functions accessed by skagetlnsep(), the failing entry point. Functional: [pid 19388] readv(0, [{13Aug 9 07:26:07 cathexis: wo..., 8191}, {NULL, 0}], 2) = 34 (...) Dysfunctional: [pid 31983] readv(0, [{13Aug 8 23:46:57 cathexis: wo..., 8191}, {NULL, 0}], 2) = 33 The path leading to the first invocation of readv() hasn't changed, but readv() gives different results. My first suspicion is that logger isn't sending the last character (newline or \0) in the second case before exiting, which skagetlnsep() interprets as I was unable to read a full line before EOF happened and reports as EPIPE. Are you using the same version of logger on both machines ? Grrr. If logger starts sending incomplete lines, I may have to change the ucspilogd code to accommodate it. -- Laurent
Re: s6-rc plans
On 13/08/2015 19:55, Colin Booth wrote: Makes sense. In this case can we get a --livedir=DIR buildtime option so us suckers using a noexec-mounted /run can relocate things easily without having to type -l livepath every time we want to interact with s6-rc? Unless I encounter a strong reason not to, sure, no problem. -- Laurent
s6-rc plans (was: Build break in s6-rc)
Oh, and btw, I'll have to change s6-rc-init and go back to the "the directory must not exist" model, and you won't be able to use a tmpfs as live directory - you'll have to use a subdirectory of your tmpfs. The reason: as it is now, it's too hard to handle all the failure cases when updating live. It's much easier to build another live directory, and atomically change what live points to - by renaming a symlink. And that can't be done if live is a mount point. -- Laurent
Re: Build Break in s6-rc
On 13/08/2015 18:05, Laurent Bercot wrote: If you're going to pull from git head, then you should pull from the git head of *every* project, including dependencies. Which you didn't for execline. :) I'm not lying! I'm just chronologically challenged sometimes. See, if you had pulled from the execline head from the future, i.e. from now, your build wouldn't have broken. Really! -- Laurent
Re: Build Break in s6-rc
On 14/08/2015 01:25, Colin Booth wrote: I'm not sure how I feel about having the indestructibility guarantee residing in a service that isn't the root of the supervision tree. I haven't done much with s6-fdholderd but unless there's some extra magic going on in s6rc-fdholderd, if it goes down it won't be able to re-establish its control over the overall communications state due to it creating a fresh socket. I know, I know, it should be fine, but accidents happen. I've thought about it for a while, and finally decided that the advantages outweighed the drawbacks. First, the only time this makes a qualitative difference is when the pipe maintainer cannot die at all. In one setup, you lose your pipe when s6-svscan dies; in the other setup, you lose your pipes when s6-fdholderd dies. The only way to prevent that is to forbid your pipe maintainer from dying entirely. Second, the only way to do that is to put the pipe maintainer as process 1; but I don't think putting things in process 1 to make them indestructible is the answer. It's the systemd way: "We're process 1, so we cannot die, and we can do everything on the system that needs reliability." Granted, it's a nice thing to have, and I do advocate the use of s6-svscan as process 1, but not because it's a pipe maintainer. I use s6-svscan as process 1 because it's the natural place for the root of a supervision tree; and everything else is a bonus. The logged service feature of s6-svscan is a direct legacy of daemontools. It was very cool at the time because we had nothing else; and I keep it because there's a large daemontools user base, and breaking compatibility would not make sense because the code that handles logged services isn't complex enough to be a maintenance burden. (And still, it is one of the very few places where I had to write a detailed comment labelled BLACK MAGIC, because there *is* some complexity to it.) So it's not going away any time soon, but it's still a legacy ad-hoc functionality.
If I were writing s6-svscan today, I would not implement this feature; I would advertise the use of a dedicated fd-holder instead. And that would cut the code size of s6-svscan by a non-negligible amount, getting it closer to the ideal of the minimal process 1. The correct approach to reliability is not to try and force processes not to die; and it's not to cram more stuff into the only process that cannot die. It's to make sure it's not a serious problem when processes die. And that, btw, is exactly what supervision is about in the first place. So, let's make sure it's not a problem when the pipe maintainer dies. In this case, let's add a watcher for s6-fdholderd. Instead of oneshots that store pipes into s6-fdholderd, how about filling up s6-fdholderd at start time with all the pipes it needs ? The processes in a pipeline will keep using the old pipes until one of them dies, at which point the old pipe will close, propagating the EOF or EPIPE to the other processes in the pipeline; eventually all the processes in the pipeline will restart, and fetch the new set of pipes from s6-fdholderd. That sounds reliable to me, and even cleaner than the current approach, where the services can't reliably restart if s6-fdholderd has died; and it doesn't need additional autogenerated oneshots. (Thanks for the rubber duck debugging! That's a huge part of why I like design discussions.) So yeah, if s6-fdholderd dies, and one process in a pipeline dies, then the whole pipeline will restart. I think it's an acceptable price to pay, and it's the best we can do without involving process 1. -- Laurent
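As a rough illustration of that pre-filling scheme (all names, fd numbers and the socket path here are hypothetical, and the exact option semantics of the s6-fdholder-* tools should be checked against the s6 documentation), the filling and retrieval could look like this in execline:

```sh
# Filling step, run once after s6-fdholderd is up: create a pipe on
# fds 6 (read) and 7 (write), then store a copy of each end under
# agreed-upon identifiers. s6-fdholder-store stores its stdin.
piperw 6 7
foreground { fdmove 0 7 s6-fdholder-store /run/fdholder/s pipe:mysvc-w }
fdmove 0 6
s6-fdholder-store /run/fdholder/s pipe:mysvc-r

# Consumer's run script, later: retrieve the reading end (it arrives
# as stdin) and exec into the log processor.
s6-fdholder-retrieve /run/fdholder/s pipe:mysvc-r
mysvc-log-processor
```

A producer would symmetrically retrieve pipe:mysvc-w and fdmove it to its stdout before execing into the daemon.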
Re: Bug in ucspilogd v2.2.0.0
On 12/08/2015 04:45, Guillermo wrote: I don't know about syslog on /dev/log, but for syslog over a network there is this: Yeah, I know about the syslog RFCs. The mild way to put it is that they're about as useful, well-engineered and enticing as a steaming pile of donkey shit. And donkey shit can at least be used as manure. Logs are data, if they need to be transported over the network, there's no lack of complex, over-engineered and insecure ways to transport data over the network - no need to come up with yet another one specifically for logs, with its own quirks and idiosyncratic formatting that peeks into user content when it has no business doing so. You want to standardize a universal format for logs (good luck with that), then write an RFC about a universal format for logs, don't mix that with a network protocol, like, duh. The only part of syslog that is worth normalizing is the interaction between syslog() and syslogd, on the *local* machine, because there's a lot of code using syslog() that doesn't care about the network, and several implementations of syslogd. And, of course, that's exactly the part those RFCs do not talk about. It shouldn't come as a surprise when you know that Eric Allman, of sendmail shame, is the original syslog designer, and the author of RFC 5424, Rainer Gerhards, is also the main author of rsyslogd. Do these people actually get *respect* for what they do? Geez this community lacks critical thinking. Or using ucspilogd with datagram mode sockets, which would also make musl syslog() work. It's more complicated than that. A datagram syslogd server cannot listen() and accept(); it receives messages from every process that uses syslog().
A datagram /dev/log socket enforces the fan-in, enforces a single instance of syslogd that has to analyze and authenticate every single log message from the whole machine, which is precisely what I want to avoid; ucspilogd makes no sense in this case, you have to use a complete (and big and inefficient) syslogd implementation. ucspilogd relies on the fact that there's a SOCK_STREAM super-server above it to fork an instance per openlog() connection, and that its stdin is private to this connection. That's what allows it to be so simple - and having the syslog() client talk to anything other than a SOCK_STREAM socket completely defeats it. And GNU libc syslog() works fine using ucspilogd with the current "stream mode sockets using non-transparent framing with NUL as trailer character" behaviour :P ucspilogd doesn't care about the chosen trailer character. It will treat \0 and \n equally as line terminators - which is the only sensible choice when logging to a text file and prepending every line with some data. glibc syslog() works because it does some ugly, ugly things like trying with SOCK_DGRAM, and retrying with SOCK_STREAM if it failed. In the absence of normalization for syslog(), I'm afraid this is the only possible behaviour, though; I've swallowed my tears and submitted a feature request to musl. -- Laurent
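The SOCK_DGRAM fan-in versus SOCK_STREAM per-connection distinction above can be seen in a few lines of Python (an illustration of the socket semantics only, nothing s6-specific; the socket path is just a temp file):

```python
# Illustration of the Unix-socket semantics described above.
import os
import socket
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log")

# SOCK_DGRAM: one bound socket; every client's messages fan in to the
# single reader, which must parse/authenticate everything itself.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
srv.bind(path)
for payload in (b"from client A", b"from client B"):
    c = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    c.sendto(payload, path)
    c.close()
fanin = {srv.recv(256) for _ in range(2)}
srv.close()
os.unlink(path)

# SOCK_STREAM: listen()/accept() hands out one private connection per
# client, so a super-server can fork a small handler per openlog(),
# with the connection as the handler's stdin - the ucspilogd model.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(5)
c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
c.connect(path)
conn, _ = srv.accept()
c.sendall(b"private line\n")
private = conn.recv(256)
for s in (c, conn, srv):
    s.close()
print(sorted(fanin), private)
```

In the datagram case there is no per-client descriptor to hand to a forked child, which is why a datagram /dev/log forces a monolithic server.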
Re: [s6] debian packaging
(Please follow-up this part of the thread to the skaware mailing-list.) On 12/08/2015 08:37, Buck Evan wrote: - https://github.com/bukzor/s6-packaging/blob/dockerize/execline/debian/patches/02_link_against_libskarnet.patch - https://github.com/bukzor/s6-packaging/blob/dockerize/s6/debian/patches/75_dot_so_link_skarlib.patch Again this is because the build derps without them, but I forget the exact failure mode. I'll track down details upon request. The parts for binaries and static libraries are clearly invalid. If something breaks while building those, then there's a problem with the way the build is invoked, or the options to configure. For static libraries, -lskarnet is nonsense. For binaries, -lskarnet is already listed in the requirements ($^) and should be translated to a .a or .so by vpath resolution, so it is incorrect to list it again. Something is definitely wrong if the package builds with them but won't build without them. I'm still unsure about the shared libraries parts. I don't think it should be needed, but my test suite isn't up to par and I need to update it to test the problematic cases and understand exactly what is happening. In the meantime, please find the problem with your build and fix it. Chances are you won't need the shared libraries patch either once you've done that. :) It seems likely to me that you'll want to figure out and fix these two issues given your response to the above patch. Is that right? Yes, and now you have work to do too. :P -- Laurent
[announce] skalibs-2.3.6.0, execline-2.1.3.0
Hello, skalibs-2.3.6.0 is out. A couple bugfixes (including a possible crash in socket_local46) and a new openreadnclose_nb() function. http://skarnet.org/software/skalibs/ git://git.skarnet.org/skalibs execline-2.1.3.0 is out. A new configure option, --shebangdir, to specify the absolute path to the execlineb binary for use in shebang lines in execline scripts. Enjoy, Bug-reports welcome. -- Laurent
Re: Preliminary version of s6-rc available
On 16/07/2015 19:22, Colin Booth wrote: You're right, ./run is up, and being in ./finish doesn't count as up. At work we use a lot of runit and have a lot more services that do cleanup in their ./finish scripts so I'm more used to the runit handling of down statuses (up for ./run, finish for ./finish, and down for not running). My personal setup, which is pretty much all on s6 (though migrated from runit), only has informational logging in the ./finish scripts so it's rare for my services to ever be in that interim state for long enough for anything to notice. I did some analysis back in the day, and my conclusion was that admins really wanted to know whether their service was up as opposed to... not up; and the finish script is clearly not up. I did not foresee a situation like a service manager, where you would need to wait for a really down event. As for notification, maybe 'd' for when ./run dies, and 'D' for when ./finish ends. Though since s6-supervise SIGKILLs long-running ./finish scripts, it encourages people to do their cleanup elsewhere and as such removes the main reason why you'd want to be notified on when your service is really down. If the s6-supervise timer wasn't there, I'd definitely suggest sending some message when ./finish went away. Yes, I've gotten some flak for the decision to put a hard time limit on ./finish execution, and I'm not 100% convinced it's the right decision - but I'm almost 100% convinced it's less wrong than just allowing ./finish to block forever. ./finish is a destroyer, just like close() or free(). It is nigh impossible to define sensical semantics that allow a destroyer to fail, because if it does, then what do you do ? void free() is the right prototype; int close() is a historical mistake. Same with ./finish ; and nobody tests ./finish's exit code and that's okay, but since ./finish is a user-provided script, it has many more failure modes than just exiting nonzero - in particular, it can hang (or simply run for ages). 
The problem is that while it's alive, the service is still down, and that's not what the admin wants. Long-running ./finish scripts are almost always a mistake. And that's why s6-supervise kills ./finish scripts so brutally. I think the only satisfactory answer would be to leave it to the user : keep killing ./finish scripts on a short timer by default, but have a configuration option to change the timer or remove it entirely. And with such an option, a burial notification when ./finish ends becomes a possibility. Ah, gotcha. I was sending explicit timeout values in my s6-rc commands, not using timeout-up and timeout-down files. Assuming -tN is the global value, then passing that along definitely makes sense, if nothing else than to bring its behavior in-line with the behavior of timeout-up and timeout-down. Those pesky little s6-svlisten1 processes will get nerfed. Part of my job entails dealing with development servers where automatic deploys happen pretty frequently but service definitions don't change too often. So having non-privileged access to a subsection of the supervision tree is more important than having non-privileged access to the pre- and post-compiled offline stuff. I understand. I guess I can make s6-rc-init and s6-rc 0755 while keeping them in /sbin, where Joe User isn't supposed to find them. By the way, that's less secure than running a full non-privileged subtree. Oh, absolutely. It's just that a full setuidgid subtree isn't very common - but for your use case, a full user service database makes perfect sense. -- Laurent
Re: Preliminary version of s6-rc available
On 17/07/2015 09:26, Rafal Bisingier wrote: So I run them as a service with sleep BIG in finish script (it's usually unimportant if this runs on same hours every day). I can have this sleep in the main process itself, but it isn't really its job I also use a supervision infrastructure as a cron-like tool. In those cases, I put everything in the run script: if { periodic-task } sleep $BIG periodic-task's run time is usually more or less negligible compared to $BIG, and I'm not expecting to be controlling it with signals anyway - but I like being able to kill the sleep if I want to run periodic-task again earlier for some reason. So I don't mind executing a short-lived (even if it takes an hour or so) process in a child, and then having the run script exec into the sleep. And since periodic-task exits before the sleep, it doesn't block resources needlessly. Whereas if your sleep is running in the finish script, you have no way to control it. You stay in a limbo state for $BIG and your service is basically unresponsive that whole time; it's reported as down (or finish with runit) but it's still the normal, running state. I find this ugly. What do you think ? Is putting your periodic-task in a child an envisionable solution for you, or do you absolutely need to exec into the interpreters ? -- Laurent
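Written out as a full run script, the pattern above can be as small as this (a sketch: the shebang path and the periodic-task name are placeholders, and the duration is hardcoded because execline does not expand $BIG by itself):

```sh
#!/command/execlineb -P
# Run the task in a child; when it exits, exec into a long sleep.
# Sending the service a signal (e.g. s6-svc -t) kills the sleep, so
# the supervisor reruns the script - and the task - early.
if { periodic-task }
sleep 86400
```

Because the run script execs into sleep, the service's reported pid is the sleep itself, and killing it is exactly the "run it again earlier" control described above.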
Re: Preliminary version of s6-rc available
On 19/07/2015 20:13, Guillermo wrote: Well, I haven't been very lucky with oneshots. First, the #!execline shebang with no absolute path doesn't work on my system, even if the execlineb program can be found via the PATH environment variable. Neither does #!bash, #!python, or any similar construct. If I run a script from the shell with such a shebang line I get a "bad interpreter: No such file or directory" message. Looks like your kernel can't do PATH searches. The #!execline shebang worked on Linux 3.10.62 and 3.19.1. But yeah, it's not standard, so I'll find a way to put absolute paths there, no big deal. /path-to/live/servicedirs/s6rc-oneshot-runne: No such file or directory s6-rc: warning: unable to start service oneshot name: command exited 111 /path-to/live/ represents here what was the full path of the live state directory, and the last component was really a string of random characters. I suppose this was meant to be the path to s6rc-oneshot-runner's local socket, but somehow ended up being gibberish instead. So oneshots still don't work for me :( I committed a few quick changes lately, I probably messed up some string copying/termination. I'll investigate and fix this. * It looks like s6-rc-compile ignores symbolic links to service definition directories in the source directories specified in the command line; they seem to have to be real subdirectories. I don't know if this is deliberate or not, but I'd like symlinks to be allowed too, just like s6-svscan allows symbolic links to service directories in its scan directory. It was deliberate because I didn't want to read the same subdirectory twice if there's a symlink to a subdirectory in the same source directory. But you're right, this is not a good reason, I will remove the check. Symlinks to a subdirectory in the same place will cause a duplicate service definition error, though.
* I'm curious about why it is required to also have a producer file pointing back from the logger, instead of just a logger file in the producer's service definition directory. Is it related to the "parsing sucks" issue? It's just so that if the compiler encounters the logger before the producer, it knows right away that it is involved in a logged service and doesn't have to do a special pass later on to adjust service directory names. It also doubles up as a small database consistency check, and clarity for the reader of the source. * It doesn't really bother me that much, but it might be worth making down files optional for oneshots, with an absent file being the same as one containing exit, just like finish files are optional for longruns. Right. You can have empty down files already for this purpose; I guess I could make them entirely optional. The user checked against the data/rules rulesdir would be the one s6-rc was run as, right? So it defines which user is allowed to run oneshots? Yes. And indeed, allowing s6-rc to be run by normal users implies changing the configuration on s6rc-oneshot-runner. I'll work on it. And finally, for the record, it appears that OpenRC doesn't mount /run as noexec, so at least Gentoo in the non-systemd configuration, and probably other [GNU/]Linux distributions with OpenRC as part of their init systems, won't have any problems with service directories under /run. That's good news ! Thanks a lot for the feedback ! I have a nice week of work ahead of me... -- Laurent
Re: Preliminary version of s6-rc available
On 13/07/2015 17:35, Colin Booth wrote: Those options are all bad. My workaround was to mount a new tmpfs inside of run (that wasn't noexec) but that made using s6-rc annoying due to the "no directory" requirement. I don't think there's anything inherently bad about nesting mounts in this way though I could be mistaken. Ah, so that's why you didn't like the "must not exist yet" requirement. OK, got it. Yeah, mounting another tmpfs inside the noexec tmpfs can work, thanks for the idea. It's still ugly, but a bit less ugly than the other choices. I don't see anything inherently bad in nesting tmpfses either, it's just a small waste of resources - and distros that insist on having /run noexec are probably not the ones that care about thrifty resource management. s6-rc obviously won't mount a tmpfs itself, since the operation is system-specific. I will simply document that some distros like to have /run noexec and suggest that workaround. My suggestion is for one of: changing the s6-rc-init behavior to accept an empty or absent directory as a valid target instead of just absent Yes, I'm going to change that. "Absent" was to ensure that s6-rc-init was really called early at boot time in a clean tmpfs, but absent|empty should be fine too. Hm, either the documentation or my reading skills need work (and I'm not really sure which). When in doubt, I'll improve the doc: a good doc should be understandable even by people with uncertain reading skills. :) Actually, assuming you're only making bundle and dependency changes, it looks like swapping out db, n, and resolve.cdb from under s6-rc's nose works. I'd be unsurprised if there were some landmines in doing that but it worked for hot-updating my service sequence. Landmines indeed. Services aren't guaranteed to keep the same numbers from one compiled database to another, so you may well have shuffled the live state without noticing, and your next s6-rc change could have very unexpected results.
But yes, bundle and dependency changes are easy. The hard part is when atomic services change, and that's when I need a whiteboard with tables and flowcharts everywhere to keep track of what to do in every case. Glad to hear it. So far s6-rc feels like what I'd expect from a supervision-oriented rc system. There are some issues that I haven't mentioned but I'm pretty sure those are mostly due to unfamiliarity with the tools more than anything else. Please mention them. If you're having trouble with the tools, so will other people. -- Laurent
Re: [announce] New skarnet.org release, with relaxed requirements.
On 23/10/2015 00:57, Guillermo wrote: So, I don't know if the handler scripts for diverted signals that the new version of s6-linux-init-maker generates are intended to be compatible with BusyBox. But if that's the intention, then the ones for SIGUSR1 and SIGUSR2 are inverted: I think that the signal sent by 'busybox halt' to process 1 is SIGUSR1, so its handler should be the one calling s6-svscanctl -0 $tmpfsdir/service, and the signal sent by 'busybox poweroff' is SIGUSR2, so its handler should be the one calling s6-svscanctl -7 $tmpfsdir/service. Ah, this is unfortunate. I don't think there's a universal convention for those signals; I looked at suckless init, which uses USR1 for poweroff (and doesn't have a signal for halt). I'm more interested in supporting busybox init than sinit, though (because sinit is incorrect: it lacks supervision of at least one process) - so I'll reverse the signals in s6-linux-init-maker. Thanks for the report. And speaking of s6-linux-init-maker, the -e VAR=VALUE option generates a $basedir/env/VAR file that doesn't have a trailing newline after VALUE, although I don't know if s6-envdir cares. s6-envdir does not care. -- Laurent
Re: s6-ftrig-wait is not relocatable
On 14/10/2015 02:58, Buck Evan wrote: The packaging system I'm targeting (pypi wheels =X) are built binaries that are relocated arbitrarily, so the "run-time installation path" is entirely unknown at compile time. I don't understand. How is that even supposed to work? If packages want to install files in /etc, they can't? If you need to access binaries, how do you know what to put in your PATH? If the packaging system can't provide answers to those questions, it's not "weird". It's "broken". Making everything fully static makes everything but this one bit work though. Not really: execline and s6 binaries expect to be able to spawn other execline and s6 binaries, some of which are found via PATH search, some of which are in /usr/libexec/s6 or something, depending on your ./configure options. If you don't know your run-time installation paths, things will *appear* to work, but be subtly broken. As a nasty hack, --enable-tai-clock seems like it will disable the use of leapsecs.dat. Please don't do that unless you know that your system clock is TAI-10. -- Laurent
Re: daemontools tai64n is unbuffered, s6-tai64n is fully buffered
On 20/10/2015 02:16, Buck Evan wrote: My canonical slowly-printing example is: yes hello world | pv -qL 10 | tai64n Under daemontools classic you'll see the output gradually appear character by character, with timestamps. Under s6, this seems to hang and I ctrl-c it. I'm sure if I waited a good long while it would print, but this shows the difference in usability. s6-tai64n flushes its stdout before going back to read its stdin again. It will never keep unflushed logs in memory. You are very likely using a version of s6-tai64n linked against a shared libskarnet.so.2.3.7.0 or earlier, which sometimes flushes stdout incorrectly. Please grab the latest skalibs and recompile. (Or use the static version of libskarnet, which does not exhibit the bug.) -- Laurent
Re: daemontools tai64n is unbuffered, s6-tai64n is fully buffered
On 20/10/2015 23:36, Buck Evan wrote: Is it expected that it's line-buffered? It's not line-buffered. It's optimally buffered, i.e. the buffer is flushed whenever it's full (obviously) or whenever the loop goes back to reading with a chance of blocking. When you test with a loop around echo, you send lines one by one, so the behaviour appears to be line buffering, but that's only an artifact of your test. -- Laurent
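The flush-before-blocking policy described above can be illustrated with a toy filter in Python (this is not s6-tai64n, and the stamp is a fixed placeholder rather than a real TAI64N timestamp):

```python
# Toy timestamping filter showing the buffering policy: write into
# the buffer freely, but flush before any read that might block.
import io

def stamp_lines(infile, outfile, stamp="@4000000055d4b3e912345678"):
    for line in infile:                   # this read may block...
        outfile.write(stamp + " " + line)
        outfile.flush()                   # ...so flush before looping back

dst = io.StringIO()
stamp_lines(io.StringIO("hello\nworld\n"), dst)
print(dst.getvalue())
```

With this policy, a slowly-printing producer never sees its lines held hostage in the filter's buffer: output only waits while the filter has a real chance of doing more work without blocking.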
Re: s6-rc shutdown timing issue
On 13/09/2015 09:08, Colin Booth wrote: I've been digging into managing a system completely under s6 and I can't seem to find the right time to run `s6-rc -da change'. Run it before sending s6-svscan the shutdown/reboot/halt commands and you can end up with a situation where your read/write drive has been set read-only before your services have come down. This is the right way to proceed: * First s6-rc -da change * Then s6-svscanctl -t /run/service I don't understand the issue you're having: why would your rw filesystem be set read-only before the services come down ? - Your non-root filesystems should be unmounted via the corresponding "down" scripts of the oneshot services you're using to mount them, so they will be unmounted in order. - Your root filesystem will be remounted read-only in stage 3, i.e. when everything is down. If your dependencies are correct, there should be no problem. What sequence of events is happening to you ? One other question that doesn't really belong here but doesn't need its own thread. If I have a oneshot that only does any work on shutdown, can I get away with having the required ./up script be empty? Yes, an empty ./up script will work. (Oneshot scripts are interpreted at compile time by the execlineb parser, which treats an empty script as "exit 0".) -- Laurent
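The two-step ordering above, as a minimal shutdown-script sketch (the scandir path /run/service is taken from the example in the message; adjust to your layout):

```sh
#!/bin/sh -e
# 1. Bring every s6-rc-managed service down, in dependency order.
s6-rc -da change
# 2. Only when everything is down, tell s6-svscan to exit to stage 3,
#    where the root filesystem is remounted read-only.
s6-svscanctl -t /run/service
```

The point is strictly sequential: no read-only remount can happen before s6-rc has finished taking services down, because the remount lives in stage 3, after s6-svscan exits.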
Re: s6-rc shutdown timing issue
On 13/09/2015 20:25, Colin Booth wrote: My current issue is that I'm initially remounting my root filesystem as r/w as one of the first steps for s6-rc, which means that if I'm doing everything correctly, s6-rc attempts to remount root as read-only as part of its shutdown. Yeah, indeed, that won't work in all cases, because the situation is not symmetrical. Remounting your rootfs rw at boot time will always succeed, but remounting it ro before killing everything may fail. You have absolutely no way of ensuring that nothing will attempt to write to your rootfs before you nuke everything. That's the fundamental difference between startup and shutdown: during shutdown, and until your nuke, other stuff may still be running that you have no control on. If it's only about your rootfs, I'd simply keep the "remount it rw/ro" part out of s6-rc. If it's about a more complex fs infrastructure and you may still have processes with open handles to the mounted filesystems, it's more annoying, and I don't have a perfect solution. The simplest thing is to do all the unmounts outside of s6-rc, but it's asymmetrical and doesn't feel right. Another possibility is to have a nuke in a ./down script of a oneshot that depends on all your filesystem-mounting services, so you're sure to have already killed everything when you get to unmounting; but that's not great either, especially if you switch databases and s6-rc-update computes that it has to remount one of your filesystems - oops, it just killed everything on your machine and can't complete its work. That shouldn't happen if your "mount stuff" services are very low-level and always up, but if you give users a set of teeth, they will manage to bite themselves. I'm afraid there's no real solution to the stragglers problem, and the only safe approach is to keep everything mounted all the time and only perform the unmounts in stage 3 after everything else has been done and the processes have been killed. Cool, thanks.
That's what I thought but I wasn't sure to what degree execlineb cared about script validity and I don't have a terribly great test methodology for oneshots figured out yet. If you run s6-rc -v3, it should show you exactly what commands it's running. In other news, I'm now in the process of testing s6-rc-update. I've finished brooming the obvious bugs, now is the most annoying part: so many test cases, so many things that can go wrong, and I have to try them one by one with specifically crafted evil databases. Ugh. I said I'd release that thing in September, but that won't happen if I die of boredom first. If you're totally crazy, you can try running it, but unless you're doing something trivial such as switching to the exact same database, chances are that something will blow up in your face - in which case please let me analyze the smoke and ashes. -- Laurent
Re: s6-rc-update initial findings
On 15/09/2015 00:40, Colin Booth wrote: Ok, did some more testing and it looks like the contents of $SVCDIR end up being the additive delta between current and new. When initializing, there are no s6-rc managed services in $SVCDIR so of course the delta will be all new services. When adding a new longrun, your contents of $SVCDIR will only be the new service. It's probably safe since giving s6-svscan SIGALRM only adds services (never removes), and s6-rc brings down services by directly sending s6-svc -wD -dx to the service. Not sure if this was a design decision, but I still prefer having $SVCDIR be representative of my run state. At least I now know what's going on. Yeah, that's not normal. s6-rc-update should remove the links when it brings the old services down, and should also add the links when it brings the new services up. I don't have an exact picture of what is actually happening in all cases; I didn't have the time today, but I'll do more testing on that tomorrow. -- Laurent
Re: readiness notification from non-subprocess
On 29/09/2015 00:08, Buck Evan wrote: If it's not good for s6, I'm not sure it's good for my framework either. Not necessarily. s6-supervise is extremely paranoid; depending on its use cases, your framework doesn't have to be. Also, if you control both ends of a named pipe and can reasonably assume that nobody's going to mess with them, that's a lot safer than having a named pipe as part of a public interface. Would this be a good candidate for the ftrig* stuff? It does seem like event notification. fifodirs internally use named pipes ;) That would probably make your interface more complex than it needs to be. Or: How do you pass a file descriptor such that it can be used by a parent process? By fd-passing, which is black magic over a Unix domain socket. You need your client (child of ./run) and your server (your framework) to communicate over a Unix socket (have the server listen on some socket in the filesystem and accept connections, have your client connect to it), and there are functions to magically copy a descriptor from one to the other. Google "fd-passing". I have functions in skalibs to make it easier to transfer fds along with text over a stream socket: unixmessage.h. See s6-sudoc.c and s6-sudod.c for a relatively simple usage example. I'd need to document "you need to write a ./ready script *and* prefix your service with my-poll-helper". If I can get this down to just one thing that people have to do right, it's going to work better. 'Your run script must be called "./run.framework" no matter what' 'If you want poll support, write a ./ready script' Then automate the creation of ./run files that will just exec "my-poll-helper ./run.framework". And my-poll-helper tests for ./ready scripts and does what it needs, just executing into its argument if there's no valid ./ready script.
You don't even need to connect my-poll-helper to the framework in that case: you can just have it do its ./ready polling in the background and write a newline to notification-fd when it succeeds, as we talked about earlier. Wrapping ./run gives you a lot of freedom. An svc --hey-im-ready option would continue that trend. Using s6-supervise's control pipe as the mechanism to report readiness. Yes, that's a possibility, and I thought about it, but ultimately rejected it, because s6-svc sends orders to control the service, not to fake its state. s6-supervise tries to always report the *real* state, not an arbitrary state defined by the user. You have to realize that all this discussion is about an attempt to make s6-supervise report a fake state when the daemon doesn't have support: no wonder it's difficult! I didn't make it intentionally difficult, but I didn't design the thing around making it easy to fake states, either. A s6-svc option to fake a state would make it a bit *too* easy for my taste. Wrap the goddamn run script. It's by far the easiest way to get what you want. -- Laurent
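For reference, the fd-passing black magic mentioned above (SCM_RIGHTS, the mechanism that skalibs wraps in unixmessage.h) looks like this in Python 3.9+, with both ends simulated in one process for brevity:

```python
# Sketch of fd-passing over a Unix stream socket: the kernel
# duplicates a descriptor from the sender into the receiver.
import os
import socket

server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# "Server" side: create a pipe and send its read end along with text.
r, w = os.pipe()
socket.send_fds(server_end, [b"here is an fd"], [r])
os.close(r)  # the kernel duplicated it into the message; our copy can go

# "Client" side: receive the text and the duplicated descriptor.
msg, fds, flags, addr = socket.recv_fds(client_end, 1024, 1)

# Prove the passed fd really is the pipe's read end.
os.write(w, b"hello through the passed fd")
os.close(w)
data = os.read(fds[0], 1024)
os.close(fds[0])
server_end.close()
client_end.close()
print(msg, data)
```

In the real setup, server and client are separate processes connected through a socket in the filesystem, but the sendmsg/recvmsg mechanics are identical.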
Re: readiness notification from non-subprocess
On 29/09/2015 00:15, Olivier Brunel wrote: [2] https://github.com/jjk-jacky/anopa/blob/master/src/utils/aa-setready.c Yeah, the problem is, aa-setready is prone to the same race condition as s6-notifywhenup was, which is the reason why I scrapped s6-notifywhenup and switched to a notification fd reporting to s6-supervise instead. If the service dies while aa-setready does its thing, s6-supervise will modify the status file and send a fifodir event to report service death, and depending on the scheduler's whim, the status file may get incorrect information, and the fifodir events may be sent in the wrong order. I hated it when I realized it, but the only way to prevent that is to make the supervisor the only authority on the service state - only the supervisor should modify the status file and send fifodir events. So, from the service's point of view, only the notification-fd is a safe channel to use. -- Laurent