Re: [PATCH] Fix typo
Fixed, thanks! (I assume you meant in the s6 package. :)) -- Laurent
Re: [announce] small skarnet.org Spring 2024 update
Thank you! Sorry for the rather bare initial report - was very much one of trying to work out what had gone wrong initially! It's all good - I'm supposed to catch these things and I failed, so the next best thing is to get them fixed as quickly as possible :) -- Laurent
Re: [announce] small skarnet.org Spring 2024 update
I can confirm that the patch worked: Thanks, execline-2.9.5.1 is out now. -- Laurent
Re: [announce] small skarnet.org Spring 2024 update
Running backtick with gdb reveals that the crash is caused by the `memcpy' at line 63 of src/libexecline/el_modifs_and_exec.c Thanks for doing my work for me :D (these are the bugs I usually catch before release, but, laziness.) The latest execline git head should fix it. If it works for you, I'll cut the 2.9.5.1 release. -- Laurent
Re: [announce] small skarnet.org Spring 2024 update
backtick -E A_LONGISH_NAME { s6-echo foo } It fails with: Huh. I must have missed something. Thanks for the report, will investigate and fix. -- Laurent
[announce] small skarnet.org Spring 2024 update
Hello,

New versions of some skarnet.org packages are available. A very light update this time, just keeping the lights on.

skalibs-2.14.1.1 (release)
execline-2.9.5.0 (minor)
s6-2.12.0.4 (release)
tipidee-0.0.4.0 (minor)

skalibs and s6 get tiny bugfixes.

Alongside minor bugfixes as well, the execline release makes backtick add its child's exit code to the ? environment variable (as foreground does) when used with an option making it continue when the child fails.

tipidee can now be used with sites whose root is served via a unique CGI script. The Server: header is now overridable, for people who don't want to broadcast the exact version of their web server in an HTTP response. And there's a new ls.cgi program, that can be used as an index.cgi to serve a list of all the files in a directory.

Enjoy,
Bug-reports welcome.

-- Laurent
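As a sketch of the new backtick behaviour (assuming execline >= 2.9.5.0; check backtick's documentation for the exact option letters - -D here is the option that makes backtick continue with a default value when the child fails, and some-failing-command is a placeholder):

```
#!/bin/execlineb -P
# -E autoimports RESULT; -D makes backtick survive a failing child.
# With a continuing option, ? now carries the child's exit code.
backtick -E -D fallback RESULT { some-failing-command }
importas -u code ?
foreground { echo "child exited ${code}, RESULT is ${RESULT}" }
```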
Re: Update: s6 and utmps rpm package
I would like to package an example service for s6. Could you suggest one?

tipidee would be a good one. I plan to release tipidee-0.0.4.0 very soon and have an Alpine package for it early next week, if you want to have example scripts.

So, is s6-rc a good candidate for an rpm package? I am preparing to build the s6-rc package in the next few days.

Having an s6-rc package won't hurt. I'm not sure it would be useful, because there's no point in having an alternative service manager when you already have systemd, but it's not doing any harm.

Thanks for your suggestion. When I see the content of utmp-prepare and utmp-init, I have the same question: will this conflict with the existing utmp/wtmp service?

Yes, this will conflict. Anything that's in the utmps package is made to work with a utmps installation, not with a regular glibc utmp installation. On Fedora, utmp works with glibc as is, so you shouldn't add anything. -- Laurent
Re: Update: s6 and utmps rpm package
1. Run btmpd, utmpd, wtmpd as s6 services. But this option will add s6 as an extra dependency.
2. Run btmpd, utmpd, wtmpd as systemd services. The dependency is minimal: only s6-ipcserver.

On Alpine, s6-ipcserver is in a separate package because Alpine is very careful about disk space, so much that they wanted me to make utmps available without the bulk of s6. (Yes, I find this pretty hypocritical given other decisions they make, but I was tired of arguing with them.) On RedHat, you will not have the same concern: s6 is a drop in the bucket compared to the amount of disk space you need to boot anyway. So it does not make sense to separate s6 from s6-ipcserver, and I suggest making the utmps package depend on the s6 package anyway.

This is a separate question from running the [uwb]tmpd services under s6-svscan or systemd. Both approaches have advantages.

Running the utmps services under systemd:
- they start earlier
- you can make any systemd service depend on them

Running the utmps services under s6:
- independence from systemd, can be portable anywhere
- shows an example of how to run a service under s6

3. Run btmpd, utmpd, wtmpd as s6-rc services. This adds two more dependencies: s6 and s6-rc.

That option, on the other hand, isn't a good one. There is an argument for running an s6 supervision tree under systemd, but there is little argument for running s6-rc and having a parallel service manager ecosystem - this probably adds more complexity than it's worth. (Unless it's for transitional purposes, but transitioning Fedora out of systemd isn't happening.)

...

All of that being said, however, my opinion is that you *should not* package utmps for Fedora. utmp management is a distro-wide decision: the utmp database is unique and accessed by several components in the system. Fedora uses glibc, glibc has its own utmp implementation, and all the existing Fedora packages expect utmp to be managed by the glibc implementation.
Adding utmps, and packages that will use utmps, will introduce conflicts and break things. (The utmp databases won't have the correct permissions, glibc will access the files directly without the locking that utmps does and concurrent access will cause file corruption, etc.)

utmps isn't something that you can add like this and have some packages depend on it and others not. It has to be a concerted effort by the whole distribution, to decide whether to switch to it or not. Alpine uses it because musl doesn't provide a real utmp implementation; the transition could be done incrementally without conflicting. glibc-based distros are another story; a transition would need to be done atomically. And unless you submit a proposal to Fedora and it is discussed and accepted by the Powers That Be, it's not happening. -- Laurent
Re: Update: rpm package for utmps, skalibs.
Please note that this list isn't meant for real-time debugging. If you want real-time help, please join IRC (#s6 on OFTC); that's what it is for.

Apr 10 22:06:53 rpm-builder s6.systemd-boot[15235]: s6-supervise s6-svscan-log: warning: unable to spawn ./run (waiting 60 seconds): No such file or directory

Looks like your scandir isn't empty. Again, do not use s6-svscanboot or anything similar. Start s6-svscan directly. -- Laurent
Re: Update: rpm package for utmps, skalibs.
I prefer this way. Some packages prefer s6 as their process supervisor, some packages prefer systemd. With the help of the s6 rpm package, other rpm packages that depend on s6 can install their services in s6's service directory. We just pave the way for the community; the choice is in their hands.

All right. Then, as Guillaume says, you cannot use /run/service as your scandir, because everything under /run will be wiped at boot time. You should pick a permanent place, such as (for instance) /var/lib/s6/service, and make it your scandir. All the packages that have services they want s6 to manage will need to create their own service directories, and symlink them into the scandir.

Could you tell me how to verify that s6-svscan works well, as there is nothing in the log?

Did you start the process? If you did, it's running, and it's working. If it isn't, it's a bug, either in the way you started it, or in the code. Do not runtime-check the result of your policies. If your policy says that s6-svscan is started, then you should assume it is working. If it isn't, it's not something that should be handled at run time, it's something that should be handled before anything ships. -- Laurent
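A minimal sketch of that layout, with illustrative stand-in paths so the commands can run anywhere (a real package would use /var/lib/s6/service directly, and "myservice"/"my-daemon" are hypothetical names):

```shell
# Persistent scandir plus a package-provided service directory.
# Paths are illustrative; a real setup would use /var/lib/s6/service.
scandir="$PWD/demo-scandir"                # stand-in for /var/lib/s6/service
svcdir="$PWD/demo-services/myservice"      # hypothetical service directory
mkdir -p "$scandir" "$svcdir"

# Every service directory needs an executable ./run script:
printf '%s\n' '#!/bin/execlineb -P' 'fdmove -c 2 1' 'my-daemon' > "$svcdir/run"
chmod +x "$svcdir/run"

# A package activates its service by symlinking it into the scandir;
# s6-svscan picks it up on its next scan (or when told via s6-svscanctl -a).
ln -s "$svcdir" "$scandir/myservice"
```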
Re: Update: rpm package for utmps, skalibs.
Since your s6-svscan doesn't run as pid 1, you don't need a finish or a crash script. Not creating the .s6-svscan directory at all is good: the default behaviour is suitable for running s6-svscan as a normal service.

The answer to the rest of your questions implies policy decisions. In other words, what do you want the package to do, exactly, and how do you want it to interact with other services that you want to run under s6?
- Do you want to start an empty supervision tree at boot, and then have the service manager (probably systemd, here) populate it with various symlinks to service directories as it brings services up?
- Do you want to start a pre-populated supervision tree that survives across reboots, and have packages install their services there and not be handled by the service manager at all?
- Do you want another behaviour?

If you're going to make a package, you first need to think about exactly what it is you're trying to accomplish. In accurate, painful technical detail. -- Laurent
Re: Update: rpm package for utmps, skalibs.
I notice s6-svscanboot is the start script for s6-svscan. I am not an execline expert, but I can see that s6-svscanboot prepares log directories and starts s6-svscan. If systemd provides a log service for s6-svscan, do we need s6-svscanboot for the rpm package?

No, you don't. As I said in a previous mail: you probably want to throw out everything that isn't the original sources, and make a fresh start. This includes throwing out s6-svscanboot, which sets up a catch-all logger because sysvinit/busybox init + openrc doesn't provide one. Since systemd has its own catch-all logger, you don't need s6-svscanboot.

The next question is about the s6.pre-install and s6.pre-upgrade scripts. They do the same thing (set up the catchlog user/group); why do we need the s6.pre-upgrade script if the previous installation already set up the catchlog user/group?

As Hoël said, it's a legacy script, for very old installations that need to upgrade. It could probably be removed. In any case, in a new rpm, you don't need it. And since you're not setting up a catch-all logger, you don't need a user for it either, so you can do away with the install scripts entirely. -- Laurent
Re: Update: rpm package for utmps, skalibs.
One last question: do we need the s6-openrc rpm package? I know systemd is more popular on RedHat and Fedora. Any suggestion?

I doubt anyone is going to run openrc on Fedora. If you're going to package s6 for a given distribution, you should integrate it properly with that distribution, not copy what is done on other distributions. To that end, you should forget about openrc, and probably assume systemd is running. That means:
- removing all files other than the sources.
- making a suitable unit file to start s6-svscan.
- taking advantage of the fact that systemd, unlike openrc, has a logging mechanism, so you don't need to set up a catch-all logger for the supervision tree - on the contrary, you can just let all the logs fall through to stderr.

I would help you do all that, except I have no experience with rpm (except from 20 years ago, that is). -- Laurent
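As a sketch, a unit file along those lines might look like this (names and paths are assumptions, not a tested Fedora unit; stdout/stderr go to the journal by default, so no catch-all logger is needed):

```
# Hypothetical /usr/lib/systemd/system/s6-svscan.service
[Unit]
Description=s6 supervision tree
After=local-fs.target

[Service]
ExecStart=/usr/bin/s6-svscan /var/lib/s6/service
Restart=always
# s6-svscan and its children log to stderr, which systemd sends to the journal.
```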
Re: Update: rpm package for utmps, skalibs.
RHEL and Fedora have an alternatives system: * https://docs.fedoraproject.org/en-US/packaging-guidelines/Alternatives/ * https://www.linux.org/docs/man8/alternatives.html Then it looks like the correct way to proceed, if Eric can coordinate with the maintainers of the filesystem and bash packages. -- Laurent
Re: Update: rpm package for utmps, skalibs.
There have been some discussions, starting at Fedora, about unifying the bin and sbin directories: https://fedoraproject.org/wiki/Changes/Unify_bin_and_sbin

Ha. 25 years later, they understand that the separation makes no sense, and *just* when we were going to use that silly separation to work around an even sillier idiosyncrasy. Talk about timing.

Also, even apart from unifying the directories, there are various people who have expressed concern about having different programs with the same name in /usr/bin and /usr/sbin, thus making it something of a potluck which one will be invoked depending on the user's search path. I have to admit that I am kind of in agreement with that: different binaries with the same name in directories that are both meant to be in the search path seems... a bit fishy to me, and, yeah, with the potential for problems if the directories are reordered (I have seen arguments for both sides: "things in /sbin are more important, so it should come before /bin"; "things in /bin are used much more often, so it should come before /sbin").

I agree with all this. In principle, /usr/bin and /usr/sbin should not be distinct, for all these reasons. The thing is, we're not in the realm of "good design" here. We're in the realm of "work around the braindeadness and use the cracks to uglyhack something that works". If rpm doesn't have an alternatives system to get the useless binaries out of the way, and if /usr/sbin is unusable, then there's nothing left but "add another directory to the global PATH", which is super invasive. -- Laurent
Re: s6-rc user services on Gentoo
2) The presence of a notification-fd file tells s6 that dbus-daemon can be somehow coerced into producing an s6-style readiness notification using file descriptor 3 without changing its code, are you sure that's the case with this script? My service definition for the system-wide message bus polls for readiness using s6-notifyoncheck and a dbus-send command... "dbus-daemon --print-address=3" should produce a suitable notification. The address of the bus can only be printed once the bus exists. :) -- Laurent
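For the record, a run script along those lines might look like this (a sketch, not a tested service definition; the option names are dbus-daemon's own, and the service directory would also contain a notification-fd file holding "3" so s6-supervise knows where the readiness line arrives):

```
#!/bin/execlineb -P
fdmove -c 2 1
dbus-daemon --system --nofork --nopidfile --print-address=3
```

The trick is exactly what the quote describes: dbus-daemon writes the bus address, newline-terminated, to descriptor 3 once the bus exists, and s6 treats that line as the readiness notification.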
Re: Update: rpm package for utmps, skalibs.
After checking the installed execline package on Alpine, I chose to install the main part of execline to /usr/bin, create the /usr/sbin directory, and create relative symbolic links for cd, umask and wait to /usr/bin/execline.

Does that mean you're using --enable-multicall? You can, it's just surprising for a distribution that uses glibc and doesn't care much about size. :) And yes, I think you have it right. Put the binary and most of the symlinks in /usr/bin, and only use /usr/sbin for the cd, umask and wait symlinks. -- Laurent
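A sketch of that layout, run here in a scratch directory so it is harmless (the real target is /, and the applet list is abbreviated):

```shell
# Illustrative multicall layout for execline on a bin/sbin split system.
root="$PWD/demo-root"                      # stand-in for the real /
mkdir -p "$root/usr/bin" "$root/usr/sbin"
: > "$root/usr/bin/execline"               # stand-in for the multicall binary

# Most applets are symlinks next to the binary in /usr/bin:
for name in execlineb foreground importas; do
  ln -s execline "$root/usr/bin/$name"
done

# The three names that conflict with the bash/filesystem packages go to
# /usr/sbin, as relative links pointing back into /usr/bin:
for name in cd umask wait; do
  ln -s ../bin/execline "$root/usr/sbin/$name"
done
```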
Re: Update: rpm package for utmps, skalibs.
And yes, since the execline-provided cd, umask and wait, when called via a PATH search (not that a shell will ever do that, but execvp() can), will substitute themselves for the Fedora-provided POSIX binaries, it is necessary to build execline with --enable-pedantic-posix in order to prevent trouble with whatever pathological case Fedora could come up with. -- Laurent
Re: Update: rpm package for utmps, skalibs.
[packager@rpm-builder etc]$ env | grep PATH
PATH=/home/packager/.local/bin:/home/packager/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

I guess /usr/local/bin or /usr/local/sbin is our first choice? Do we need --enable-pedantic-posix for /usr/local/bin or /usr/local/sbin?

No; /usr/local is reserved, as the name implies, for local installations: packaged software cannot use it. If the default PATH has /usr/sbin before /usr/bin for all users, then the best thing is probably to install cd, umask and wait into /usr/sbin. It's not exactly clean, but at this point we're not trying to be clean, we're trying to make things work. And it wouldn't be the first time a binary that's available to all users gets installed in /usr/sbin. -- Laurent
Re: Update: rpm package for utmps, skalibs.
In my (admittedly ugly) package, I simply delete execline's `cd' and `umask'; `wait' is renamed to `execline-wait', just like `execline-cd' and `execline-umask' (which are not conflicting and so not deleted). This means that your execline package cannot run execline scripts that use cd, umask or wait. It may work for you, but it is not suitable as a general audience package. -- Laurent
Re: Update: rpm package for utmps, skalibs.
file /usr/bin from install of execline-2.9.4.0-1.fc39.x86_64 conflicts with file from package filesystem-3.18-6.fc39.x86_64
file /usr/bin/cd from install of execline-2.9.4.0-1.fc39.x86_64 conflicts with file from package bash-5.2.26-1.fc39.x86_64
file /usr/bin/umask from install of execline-2.9.4.0-1.fc39.x86_64 conflicts with file from package bash-5.2.26-1.fc39.x86_64
file /usr/bin/wait from install of execline-2.9.4.0-1.fc39.x86_64 conflicts with file from package bash-5.2.26-1.fc39.x86_64

Oh, Red Hat, never change.

The correct answer to this problem is that these binaries from the "filesystem" and "bash" packages should not exist. They are just never called - any instance of "cd", "umask" or "wait" in a shell script calls a shell builtin, and they *have to* be builtins - they cannot work when called as external tools. The only reason these binaries exist is to comply with a broad statement in POSIX that every builtin must also be provided as an external tool, even those for which it does not make sense. Red Hat-based distributions are the only ones that do this. Other ones have understood that these binaries are useless.

But obviously you cannot remove these binaries unless you're the "filesystem" and "bash" maintainer, so workarounds must be found. The best workaround is to use an alternatives system if available. I don't know if rpm provides one. The idea is to offer execline as an alternative source for the cd, umask and wait binaries. If you build execline with --enable-pedantic-posix (which you should do in this case), the binaries provided by execline are fully compatible with the POSIX requirement, and can replace the default Fedora binaries entirely; they also provide actually useful functionality when used in ways not explicitly covered by POSIX (i.e. when chain-loading, which is how they're used in execline scripts). You may need to work with the filesystem and bash maintainers for this.
Short of that, the only possible workaround is to find a place that appears *before* /usr/bin in the default PATH, and install, or link, the execline binaries there. This may be difficult, because /usr/bin is generally one of the first locations in PATH. If you cannot find one, then the only way is to install the execline binaries in their own directory (e.g. /var/lib/execline/bin) *and* add that directory to the default PATH of every user, before /usr/bin, which is a lot more invasive.

(If there is no policy that forbids creation of subdirectories in /, you could consider building skaware with --enable-slashpackage, and adding /command at the top of the user PATH. execline would have its binaries in /package/admin/execline/command, accessible via /command; nothing would conflict with stuff living in the FHS, and as long as /command is before /usr/bin in PATH, things would work. That is what I do on my machines. But unfortunately, most distributions can be pretty anal about /package and /command - which is hypocritical considering they have no problem with /media and /srv, but that's a fight for another day - so it's doubtful you can do that.)

If everything else fails, document somewhere that execline, *and* packages that depend on execline, will not be usable unless that directory is added *at the top of* the PATH. It's not only about finding the binaries, it's about making sure that the correct cd, umask and wait binaries are found; if a binary from "filesystem" or "bash" is found instead, it will break execline scripts.

Note that what some distros did, i.e. putting the execline binaries in /var/lib/execline/bin and adding a /usr/bin/execlineb wrapper that prepends PATH with /var/lib/execline/bin before executing /var/lib/execline/bin/execlineb, is explicitly NOT correct. execline binaries aren't supposed to be accessible only when called by execlineb; they're supposed to be accessible via the default PATH, and some parts of skaware will break if they're not. Putting them in /var/lib/execline/bin is fine, but then /var/lib/execline/bin needs to be in the default PATH, and *before* /usr/bin, instead of being activated by a wrapper. -- Laurent
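If the separate-directory route is taken, the PATH change could be shipped as a profile.d drop-in - something like the following sketch (written to a scratch file here; a package would install it as e.g. /etc/profile.d/execline.sh, and /var/lib/execline/bin is just the example location from the text):

```shell
# Generate a hypothetical /etc/profile.d/execline.sh that puts the
# execline directory at the top of PATH, once, for every login shell.
cat > demo-execline.sh <<'EOF'
case ":$PATH:" in
  *:/var/lib/execline/bin:*) ;;            # already present, do nothing
  *) PATH=/var/lib/execline/bin:$PATH ;;   # prepend, so it wins over /usr/bin
esac
export PATH
EOF

# Simulate what a login shell would do:
. ./demo-execline.sh
```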
Re: Update: rpm package for utmps, skalibs.
Yes, skalibs and execline are different projects. The GitHub site is just a central and temporary place to hold the spec files. For the skalibs project, I build 4 rpm packages: skalibs, skalibs-devel, skalibs-devel-static, skalibs-doc. skalibs-devel depends on skalibs. I just follow the dependency rules of the aports counterpart packages. The rpmbuild tool supports building different packages in one directory.

OK, thanks for the clarification. -- Laurent
Re: Update: rpm package for utmps, skalibs.
I haven't looked in detail, but I'm not sure why you want everything in one single RPM. skalibs, utmps, execline and s6 are different projects. A package should be one project, not a set of projects. A package manager will handle dependencies between packages and install all the rpms that are needed by a given project. Or isn't that how the rpm packaging system handles dependencies? -- Laurent
Re: is there any rpm package for utmps, skalibs?
my first question is: does skalibs support glibc? Alpine only supports musl.

Yes. skalibs supports everything that makes a good attempt to be POSIX-conformant, so that includes glibc. -- Laurent
Re: is there any rpm package for utmps, skalibs?
Hi Wang, Your e-mail client seems to be broken. It sends HTML entities as text/plain, and it makes the content of your mail unreadable. Please fix this, if you can. From what I can understand, you're looking for rpm packages for skalibs and utmps. I don't know if there are any; I haven't had any contact from a Fedora maintainer or user. If anyone's interested in making rpm packages for skaware, or in helping Wang make them, please show yourself :) -- Laurent
Re: version information output
there is no version information option (like say "-V") for the s6 utils. such a command line option should make the tool output its version number and terminate. it would be nice if such an option could be added to the tools.

It would also add boilerplate to every single binary, which would make them bigger, as well as longer and more annoying to write. Most of the time, the version information is available elsewhere; typically, in your package manager. Or in the filesystem if you're using slashpackage. (That's one of the issues with FHS: it requires an additional system, such as a package manager, to retain the meta-information it loses by having binaries in fixed directories.)

Binaries are not the place to store meta-information. There is nothing you can programmatically do with version information; if you require a specific minimal version of a tool, then by policy you should have it on your system, and you should assume that your requirements are met (and it's a bug if they are not). "true --help" and "true --version" are often mentioned for laughs; there is a reason for that. -- Laurent
Re: [announce] skalibs-2.14.1.0, tipidee-0.0.3.0, shibari-0.0.1.0
Additionally, the shibari documentation has been ported: * https://git.sr.ht/~flexibeast/shibari-man-pages/refs/v0.0.1.0.1 (For those wondering, porting the two man pages for shibari took me roughly an hour.) You are awesome.
Re: [announce] skalibs-2.14.1.0, tipidee-0.0.3.0, shibari-0.0.1.0
The difference in UDP is that not having a connection makes it harder to model with the stdin/stdout method of UCSPI, right? Yes. A super-server model makes sense for TCP because you can spawn one server to handle one stream; not so much for UDP, because there is no stream, only packets, and you don't want to spawn a process for every packet. A UDP server doesn't have to deal with the complexity of multiplexing exchanges with different clients, either, since it only needs to respond to every packet in sequence. No parallelism needed, it's all very straightforward and simple to write. djbdns does the exact same: axfrdns is spawned by a super-server, but tinydns binds to its UDP socket itself. -- Laurent
[announce] skalibs-2.14.1.0, tipidee-0.0.3.0, shibari-0.0.1.0
Hello,

New versions of some skarnet.org packages are available. This is mostly a bugfix release, with some new features.

skalibs-2.14.1.0 (minor)
s6-2.12.0.3 (release)
s6-dns-2.3.7.1 (release)
s6-networking-2.7.0.1 (release)
tipidee-0.0.3.0 (minor)
shibari-0.0.1.0 (new!)

* skalibs-2.14.1.0
---
Despite the minor bump, which was necessary for one of the bug fixes, this is still a bugfix release - but an important one. All users should upgrade. The upgrade breaks the build of old s6 and s6-networking versions, despite not being a major upgrade. This is intentional; the 'broken' functionality actually never worked, and the old interfaces *could* never work, so, better to get rid of them and expose problems at build time rather than run time. The new versions of s6 and s6-networking use the new, working interfaces. Other packages are not impacted.
https://skarnet.org/software/skalibs/
git://git.skarnet.org/skalibs

* tipidee-0.0.3.0
---
tipidee now supports ranges! And also a new XXX_no_translate configuration option, which - as the name implies - is pretty dangerous: it disables path translations and interprets the requested URI as is, which allows symlinks to documents located outside of the server's root.
https://skarnet.org/software/tipidee/
git://git.skarnet.org/tipidee

* shibari-0.0.1.0
---
A brand new project, because I clearly don't have enough on my plate. shibari is a suite of DNS tools, a successor to s6-dns. Eventually it will fully replace s6-dns, but for now it simply depends on it. This first version of shibari comes with two DNS server programs (one for UDP and one for TCP), which are more or less drop-in replacements for djb's tinydns and axfrdns. (I wrote them because it was a better long-term solution than adding a patch to fix a bug in axfrdns.)
https://skarnet.org/software/shibari/
git://git.skarnet.org/shibari

Enjoy,
Bug-reports welcome - but I probably won't be working much on them during the end of the year.
Merry Christmas, happy holidays, and happy new year! -- Laurent
Re: A define program using blocks for execline?
Yes, it can be done with current execline tools through options like -s in define and importas, but I feel something like this would be clearer: block-define var { 1 2 3 } printf "%s\n" "This is ${var}" Does this already exist? Not really, but that sounds like a possible addition, the model sounds sane. Thanks for the suggestion, I'll think about it. -- Laurent
Re: How does s6-linux-init-shutdownd.c contact pid 1?
I've been trying to find out why my "finish" script is not working (or perhaps it is working but not printing output anywhere I can see)

The ways of shutdown are mysterious. :)

However, I don't think kill(-1, n) *works* for pid 1.

Indeed, it does not. That kill is supposed to be sent to every process *except* s6-svscan, which will survive and restart the supervisor for s6-linux-init-shutdownd, which will restart s6-linux-init-shutdownd, which will then execute stage 4. If you're not in a container, stage 4 will unmount the filesystems then hard halt/poweroff/reboot the machine.
https://git.skarnet.org/cgi-bin/cgit.cgi/s6-linux-init/tree/src/shutdown/s6-linux-init-shutdownd.c#n193

s6-linux-init-shutdownd never tells s6-svscan to exit, so if you're running s6-linux-init, it's normal that your .s6-svscan/finish script is not executed. The place where you want to hack things is /etc/rc.shutdown.final, which is run by the stage 4 script right before the hard reboot. At this point nothing is mounted anymore and the process tree is only

s6-svscan
 \_ s6-supervise s6-linux-init-shutdownd
     \_ foreground { rc.shutdown.final } reboot
         \_ rc.shutdown.final

so you can do dirty stuff like "rm -f /run/service/s6-linux-init-shutdownd && s6-svc -dxwD /run/service/s6-linux-init-shutdownd && s6-svscanctl -b /run/service", which should clean up the s6-supervise and the foreground, and give control to .s6-svscan/finish. Start your finish script with "wait { }" because s6-svscan will probably exec into it before rc.shutdown.final dies, and you don't want a zombie hanging around. -- Laurent
[announce] s6-2.12.0.2
Hello, I don't normally spam all of you for bugfix releases, but this one is important. You definitely want to grab the 2.12.0.2 version of s6, not the 2.12.0.1 one. The bug could prevent a shutdown from completing. https://skarnet.org/software/s6/ git://git.skarnet.org/s6 Sorry about that, -- Laurent
[announce] but what about *second* skarnet.org November 2023 release?
Hello,

New versions of some skarnet.org packages are available. This is mostly a bugfix release, addressing the problems that were reported since the big release two weeks ago. Despite that, s6-dns got a minor version bump because the fixes needed an additional interface; and s6-networking got a major bump, because it needed an interface change. Nothing that *should* impact you, the changes are pretty innocuous; but see below.

skalibs-2.14.0.1 (release)
s6-2.12.0.1 (release)
s6-dns-2.3.7.0 (minor)
s6-networking-2.7.0.0 (major)
tipidee-0.0.2.0 (minor)

* skalibs-2.14.0.1
---
This release is important if you want the fixes in s6-dns: the ipv6 parsing code has been revamped.
https://skarnet.org/software/skalibs/
git://git.skarnet.org/skalibs

* s6-2.12.0.1
---
It's only a bugfix, but you want to grab this version, because the bug was impactful (s6-svscanctl -an not working as intended).
https://skarnet.org/software/s6/
git://git.skarnet.org/s6

* s6-dns-2.3.7.0
---
- The parsing of /etc/hosts now ignores link-local addresses instead of refusing to process the whole file.
- New interface to only process /etc/hosts if a client requires it.
https://skarnet.org/software/s6-dns/
git://git.skarnet.org/s6-dns

* s6-networking-2.7.0.0
---
- s6-tlsc-io has changed interfaces; now it's directly usable from a terminal. This change should be invisible unless you were using s6-tlsc-io without going through s6-tlsc (which, until now, there was no reason to do).
- s6-tcpserverd now logs "accept" and "reject" instead of "allow" and "deny", this terminology now being reserved to s6-tcpserver-access.
- The -h option to s6-tcpclient and s6-tcpserver-access has changed semantics. Previously it was used to require a DNS lookup, and was hardly ever specified since it was the default (with -H disabling DNS lookups). Now it means that DNS lookups must be preceded by a lookup in the hosts database.
- A new pair of options, -J|-j, is accepted by s6-tlsc-io and s6-tlsd-io, and by extension the whole TLS chain of tools. -J means that s6-tls[cd]-io should exit nonzero with an error message if the peer fails to send a close_notify before closing the connection; -j, which is the default, means ignore it and exit normally.
- The TLS tunnels work as intended in more corner cases and pathological situations.
https://skarnet.org/software/s6-networking/
git://git.skarnet.org/s6-networking

* tipidee-0.0.2.0
---
- Bugfixes.
- New configuration options: "log x-forwarded-for", to log the contents of the X-Forwarded-For header, if any, along with the request; and "global executable_means_cgi", to treat any executable file as a CGI script (which is useful when you control the document hierarchy, but dangerous when it's left to third-party content manager programs).
https://skarnet.org/software/tipidee/
git://git.skarnet.org/tipidee

Enjoy,
As always, bug-reports welcome.

-- Laurent
Re: [announce] skarnet.org November 2023 release
Minor issue, the version linked from the web page (https://skarnet.org/software/skalibs/) needs a bump Whoops. Fixed. -- Laurent
Re: tipidee - uri parse when port missing
Hi Vincent, I'm not sure if you're testing with the released version of tipidee or not. Please make sure to only report bugs against the released version or the git head. In any case, the absence of a port in the Host field is certainly not the reason why tipidee would answer a 400. There has to be something else in the request coming from the proxy that it doesn't like. Can you post the full request? -- Laurent
[announce] skarnet.org November 2023 release
Hello,

New versions of all the skarnet.org packages are available. This is a big one, fixing a lot of small bugs, optimizing a lot behind the scenes, adding some functionality. Some major version bumps were necessary, which means compatibility with previous versions is not guaranteed; updating the whole stack is strongly recommended. Also, tipidee is out! If you've been looking for a small inetd-like Web server that is still standards-compliant and fast, you should definitely check it out.

skalibs-2.14.0.0 (major)
nsss-0.2.0.4 (release)
utmps-0.1.2.2 (release)
execline-2.9.4.0 (minor)
s6-2.12.0.0 (major)
s6-rc-0.5.4.2 (release)
s6-linux-init-1.1.2.0 (minor)
s6-portable-utils-2.3.0.3 (release)
s6-linux-utils-2.6.2.0 (minor)
s6-dns-2.3.6.0 (minor)
s6-networking-2.6.0.0 (major)
mdevd-0.1.6.3 (release)
smtpd-starttls-proxy-0.0.1.3 (release)
bcnm-0.0.1.7 (release)
dnsfunnel-0.0.1.6 (release)
tipidee-0.1.0.0 (new!)

* skalibs-2.14.0.0

This version of skalibs adds a lot of new sysdeps, a lot of new functions, and changes to existing functions, in order to support the new features in other packages. The most important change is the new cspawn() function, providing an interface to posix_spawn() with support for most of its options, and a fork() fallback for systems that do not have it. What this means is that on systems supporting posix_spawn(), the number of calls to fork() in the whole skarnet.org stack has been significantly reduced. This is important for programs where spawning a new process is in a hot path - typically s6-tcpserver. Updating skalibs is a prerequisite for updating any other part of the skarnet.org stack. Once you've updated skalibs, you probably don't *have to* update the rest; old versions of packages should generally build with the new skalibs as is, and if indeed they do, nothing should break. But it is a major update, so there are no guarantees; please update to the latest versions at your convenience.

https://skarnet.org/software/skalibs/
git://git.skarnet.org/skalibs

* execline-2.9.4.0

- execlineb now has a dummy -e option (it does nothing). This is so it can be used as a replacement for a shell in more environments. Also, execline programs use fork() a lot less, so overall execline script performance is better.
- The multicall setup did not properly install symbolic links for execline programs; this is fixed, as it is in the other packages supporting a multicall setup (s6-portable-utils and s6-linux-utils).

https://skarnet.org/software/execline/
git://git.skarnet.org/execline

* s6-2.12.0.0

- s6 programs use fork() less.
- New -s option to s6-svc, to send a signal by name or number.
- s6-svscan has been entirely rewritten, in order to handle logged services in a more logical, less ad-hoc way. It should also be more performant when running as init for a system with lots of s6-supervise processes (improved reaping routine).
- The obsolete (and clunky) s6lockd subsystem has been deleted. s6-setlock now implements timed locking in a much simpler way.

https://skarnet.org/software/s6/
git://git.skarnet.org/s6

* s6-linux-init-1.1.2.0

- New -v option to s6-linux-init-maker, setting the boot verbosity.
- Several small bugfixes, one of them being crucial: now your systems shut down one second faster!

https://skarnet.org/software/s6-linux-init/
git://git.skarnet.org/s6-linux-init

* s6-linux-utils-2.6.2.0

- Support for the minflt and majflt fields in s6-ps.

https://skarnet.org/software/s6-linux-utils/
git://git.skarnet.org/s6-linux-utils

* s6-dns-2.3.6.0

- Support for on-demand /etc/hosts data in s6-dnsip and s6-dnsname. It is achieved by first processing /etc/hosts into a cdb, then looking up data in the cdb. You can, if you so choose, perform this processing in advance via a new binary: s6-dns-hosts-compile.

https://skarnet.org/software/s6-dns/
git://git.skarnet.org/s6-dns

* s6-networking-2.6.0.0

This is the package that has undergone the biggest changes.

- No more s6-tcpserver{4,6}[d]. IPv4 and IPv6 are now handled by the same program, s6-tcpserver, which chainloads into a unique long-lived one, s6-tcpserverd.
- s6-tcpserver now exports TCPLOCALIP and TCPLOCALPORT without the need to invoke s6-tcpserver-access.
- s6-tcpserver-access does not hardcode a warning when it is invoked without a ruleset. It can now just be used for additional data gathering (such as TCPREMOTEHOST) without jumping through hoops.
- s6-tcpserverd has been thoroughly optimized for performance. It will handle as heavy a load as the underlying system will allow.
- Yes, this means you can now use s6-tcpserver to serve
Re: tipidee/s6-tlsserver crash with tls launch
Fixed in latest s6-networking git head. It was an invocation of tls_error() with the wrong context. When run with the fixed version, s6-tlsd-io prints this error: s6-tlsd-io: fatal: unable to tls_configure: failed to read private key which means there's an issue with your fd.key file, probably the format. But this shows it's important to test error paths too! Thanks for that :) -- Laurent
Re: [announce] tipidee is now in open beta!
The release date of tipidee is approaching. Since the last announcement, there have been some significant changes to tipidee, including:
- a more flexible logging configuration
- custom error pages (by domain)
- custom headers
as well as many bugfixes, thanks to everyone's reports. These changes were the last important ones. There will be no more major additions before the release. But the new features need testing! In particular, custom error pages and custom headers are new enough and complex enough that they could use more people stress-testing them. So please, if you can, grab tipidee's git head, and check that it works for you and your twisted mind. (Some parts of the /etc/tipidee.conf syntax have changed; you will need to perform minor edits to your configuration and run tipidee-config again.) Thanks a lot in advance! The more testing, the sooner we can put a number on it. :) (Alexis, the documentation is in a good place now and I do not expect any important changes to it before the release. There will probably be minor tweaks and rephrasing, but that's it.) -- Laurent
Re: tipidee/s6-tlsserver crash with tls launch
Just tried with latest s6-networking HEAD (and deps) and also libressl 3.7.3. Unfortunately same issue. I hope it was not due to my certs. Those are generated with openssl 3rd book (self signed certs). LibreSSL hardcodes its list of trusted anchors so it won't be able to *verify* self-signed certs, but it should be able to *serve* them, which is what you are doing. And in any case, it shouldn't crash. I'm going to need your full sysdeps file (from skalibs), as well as your kernel and libc versions, please, to see if I can reproduce this. -- Laurent
Re: Any chainloading program to call prctl(PR_SET_PDEATHSIG, signal)
It turns out there is this Linux-specific syscall (prctl(PR_SET_PDEATHSIG, signal)) to set the saner behavior of actually being informed if your parent dies, and to react to it so s6 is able to bring the service back up; but it's opt-in. Is there any tool in the s6 ecosystem or otherwise that I can use to call it before exec'ing to the service itself? I couldn't find one in s6-linux-utils, and I would guess it's not part of the portable tools, being Linux-specific. Is there a portable equivalent? Is there any interest in receiving a patch, alternatively? The short answer is: no, not yet, sorry. The long answer is: prctl is complicated, and I do intend to do something with it in the long run because there are useful things in there, but it requires thought and planning. Typically, what I'd want is to check what functionality is available on other systems than Linux and how reasonably easy it is to implement, then add that functionality as part of a portable program suite, and add the rest, or a useful subset of it, as part of s6-linux-utils or something. Also, I'd like to see whether there isn't a better abstraction level than blindly implementing all the options and flags in prctl. So, I don't want to rush this, especially not while focused on other things. For your use case, however, you can certainly implement a prctl(PR_SET_PDEATHSIG, SIGTERM) wrapper yourself, that doesn't sound difficult. Or you could use cgroups: have your run script put itself in a dedicated process cgroup, and a finish script ensure that all the processes in the cgroup are killed when the service goes down. -- Laurent
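The core of such a wrapper really is a few lines of C. Here is a hypothetical sketch (illustration only, not part of any skarnet.org package) of the chainloading step: set the parent-death signal, then exec into the rest of the command line.

```c
/* Hypothetical sketch of a PR_SET_PDEATHSIG chainloading step.
   Linux-only; not part of any skarnet.org package. */
#include <signal.h>
#include <unistd.h>
#include <sys/prctl.h>

/* Returns only on failure; on success execvp() replaces the
   process image and this function never returns. */
int exec_with_pdeathsig (int sig, char *const *argv)
{
  /* Ask the kernel to send us `sig` when our parent dies. */
  if (prctl(PR_SET_PDEATHSIG, sig) == -1) return -1 ;
  /* Guard against the race where the parent died before the prctl
     call: the signal would never be delivered, so check whether we
     have already been reparented. */
  if (getppid() == 1) raise(sig) ;
  execvp(argv[0], argv) ;
  return -1 ;
}
```

A run script would call this right before the service's own command line. Two caveats from the prctl(2) documentation: the setting is cleared on execve of set-user-ID or set-group-ID binaries, and it tracks the death of the parent *thread*, not just the parent process.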
Re: tipidee/s6-tlsserver crash with tls launch
I used Libressl 3.8.1 with all official releases (skalibs, execline, s6, s6-networking, s6-dns and s6-portable-utils). Except for tipidee with skalibs all on HEAD. Thanks. I cannot reproduce the crash with the s6-networking git head. Can you please test with it? (Even if there's a bug in s6-networking-2.5.1.3, the git head is where it would be fixed; and the next release is close.) You'll need to build skalibs and s6 from git in order for s6-networking to build. And if s6-tlsd-io from git still crashes, can you please test with libressl-3.7.3 (the stable libressl release) as well? 3.8.1 is a development release; it has worked for me so far, but you never know. -- Laurent
Re: tipidee/s6-tlsserver crash with tls launch
In that situation it produces a SIGSEGV during s6-tlsd-io execution. In the attachment are 2 strace log outputs (pid 14138 for the caller of s6-tlsd-io, pid 14141 for s6-tlsd-io itself). Why is this s6-tlsd-io always crashing (some credential/ID issue)? It's a libtls crash during the preparation of the tunnel (before the handshake). What version of LibreSSL are you using? And what version of s6-networking are you using? There was such a crash in an old version of s6-networking, that should be fixed in 2.5.1.3 (and is of course also fixed in the git head). In general, if you're going to run a piece of skaware from git - and thank you for that! Beta testing is helpful! - then you need to build the whole skaware stack from the git head. -- Laurent
Re: [announce] tipidee is now in open beta!
- is it possible to customise error pages as static pages? Currently I think not but is it forecasted? I initially wanted to *specifically* avoid this, because some Web servers return a 200 status when serving their customized error page, which is a terrible idea. But then I realized you didn't have to do that, and serve a customized page while still returning the proper error code. So, yeah, why not - maybe not right now, but in a later version, sure. I'll add that to the future.html documentation page. - in other words tipidee does only return error codes and no content linked to? There currently _is_ content returned with the 404 response. It's just so minimal you get the impression there is none. :) -- Laurent
Re: [announce] tipidee is now in open beta!
An mdoc(7) port of the documentation is now also available: https://sr.ht/~flexibeast/tipidee-man-pages/ Thanks a lot - what speed! :D But please be aware that everything can still be very much in flux until the official release. Doc is getting fixed, completed, reworded, as much as code is. ;) -- Laurent
[announce] tipidee is now in open beta!
Hi folks, For those who don't know, I've been working on a very normal, very sane project, not rabbit-holey or scope-creepy *at all*: a web server. It's named tipidee, and I just made the switch - it is now serving the skarnet.org site. It's in a good enough place that I can now declare it's in beta, which means stable enough for other people to download and test it. I've done all the debugging I could, now I need other people to misuse it in ways I would never have imagined. Please download it, use it, abuse it, and send me all the bug-reports you can. https://skarnet.org/software/tipidee/ git://git.skarnet.org/tipidee/ When it's good enough for a numbered version, there will be a *huge* skarnet.org release, and then we'll move on to Good Things. Thanks a lot, -- Laurent
Re: [OT] djbdns + musl
$ cat conf-cc /opt/bin/musl-gcc -static -Os -march=x86-64 -fomit-frame-pointer -pipe -Wall -Wno-trampolines -Wno-maybe-uninitialized -Werror=overflow -mpreferred-stack-boundary=4 -falign-functions=1 -falign-jumps=1 -falign-loops=1 -fno-unwind-tables -fdata-sections -ffunction-sections -Wl,--gc-sections -fno-asynchronous-unwind-tables -fstrict-aliasing -Wstrict-aliasing=2 -Wno-unused-function -foptimize-sibling-calls -std=c89 -fno-pic -Wl,-z,noseparate-code -fPIE $ cat conf-ld /opt/bin/musl-gcc -s That's off-topic indeed, but the answer is easy: -static is a linking option, not a compilation option. Put -static in your conf-ld, not in your conf-cc. :) -- Laurent
Re: Question about s6-linux-init
While following the guide for the init part I noticed the init scripts seem to be shell scripts. Is there any particular reason they are not execline scripts? I’ve become much more fond of those while trying the waters before. Would it be sensible for me to rewrite them in execline? And, similarly, would you be interested in receiving patches if I do? Hi Mario, execline is best used in very specific situations: - when scripts are *very* simple - for automatically generated scripts (execline is easier to generate than shell) - for power users Scripts such as /etc/rc.init don't fit in these categories. They can contain a lot of stuff (look at s6-overlay's rc.init for an example of something generic enough to work in a Docker container), they're meant to be written manually, and by non-specialists of the s6 ecosystem. That makes execline a bad fit. I obviously like execline as much as you do, but it has shown to generally be an obstacle to s6 adoption, because people who are new to the ecosystem see it, rightly or wrongly, as an additional hurdle. There's a diehard piece of FUD about s6 that says you *have to* learn execline in order to use it; it has never been true, but the myth lingers nonetheless, and showing that you can write scripts in whatever language you want helps dispel it. For this reason, the init script examples are provided in shell, and will remain so. Of course, you're free to write your own scripts in execline, and encouraged to do so if you're comfortable with it and it makes them more efficient :) -- Laurent
Re: [PATCH] s6-tlsserver: actually pass on -Y to s6-tlsd
The -Y flag was being treated as if it means the default of not asking for a client cert. Thanks! Applied with a slightly different style. I should really have used a different name for the optional client certificate. As is, -Y/-y is asymmetrical between s6-tlsc and s6-tlsd, and that's ugly (and the reason for the bug, because I copied the template for s6-tlsserver from s6-tlsclient and failed to fix the -Y discrepancy). And yes, you may well be the first to use it. It's uncommon that a server requires a client certificate - generally only people with a serious PKI setup bother with this, which means big orgs, and those haven't switched to s6-tlsserver yet. ;) -- Laurent
Re: s6-svstat up, down and ready time not correct after system timestamp update.
This can be even worse than that: the timestamp from a GPS source can take several tens of seconds to stabilize, depending on the accuracy of your GPS system and available satellites. Until then, the system date can jump back and forth around the actual time. Ew. That's pretty bad indeed. Forward clock jumps aren't much of a problem, software is generally resilient to that; but *backward* clock jumps are the devil. They can mess up log sequentiality properties; they can make software unresponsive; they can introduce errors in file access/modification times; etc. Basically no important operation should be performed on a system until we know that the system clock isn't going to jump backwards. I don't know how existing GPS-based systems manage that. -- Laurent
Re: s6-svstat up, down and ready time not correct after system timestamp update.
I am setting up s6 to manage services on my Linux embedded system. Everything is fine. But I faced an issue related to system datetime changes. My system does not have an RTC, but it has a GNSS module (managed by gpsd). After GNSS gets the location and time, the chronyd service updates the system time. And there's your problem: you cannot rely on timestamp data across system clock changes, so any service you would need accurate timestamp data for would need to be started *after* chronyd first updates your system clock. I know it's a pain. I have modified s6, and all my software, to rely on CLOCK_REALTIME as little as possible. But sometimes it's unavoidable, and then you need to be able to trust your system clock, or only use it after it has been updated to something reasonable. I found that s6-svstat uses CLOCK_REALTIME and I think it will be more robust to use CLOCK_MONOTONIC. Unfortunately, no, it will not. The timestamps written by s6-supervise, and used by s6-svstat, are snapshots of the absolute date at a given moment: this is exactly what CLOCK_REALTIME is for. It's the same kind of time data that, for instance, s6-log prepends log lines with when you use the t directive: an absolute timestamp. It is really wallclock time, not stopwatch time. If stopwatch time (i.e. CLOCK_MONOTONIC) were to be used here, sure, you would get more stability across system clock changes for s6-svstat -o updownfor,readyfor ; but you would lose all meaning for s6-svstat -o updownsince,readysince as well as s6-svdt, s6-permafailon and maybe others. Stopwatch time can only be used to compute intervals, never to share absolute timestamps. There is no absolute point of reference for sharing CLOCK_MONOTONIC time across processes, unless you store an offset in the filesystem at boot time and by convention all your software accesses this offset. 
This isn't specified anywhere, because sharing absolute time across processes is already covered by CLOCK_REALTIME, which is significantly simpler - it only gets messy when the system performs a significant system clock change, which should only happen at most once and in the early life of a system. You are in this case and I know it's ugh, but there is nothing I can do about that. Is the patch correct? Maybe I miss something (maybe some other utils also need to be patched). Even if you wanted to use stopwatch time, no, the patch would not be correct, because CLOCK_MONOTONIC has no absolute meaning - and a tain containing a raw CLOCK_MONOTONIC value would be unusable. The function you are looking for is tain_stopwatch_read(): https://git.skarnet.org/cgi-bin/cgit.cgi/skalibs/tree/src/libstddjb/tain_stopwatch.c but it has to be paired with an initial call to tain_stopwatch_init(), which computes the offset so that you get reasonable absolute times - but this offset is only valid for the current process, and you should never share these absolute times with other processes, because they will be increasingly inaccurate with clock drift and difference between time of creation and time of use. In fact, s6-supervise *does* use CLOCK_MONOTONIC: there is an initial call to tain_now_set_stopwatch_g() here: https://git.skarnet.org/cgi-bin/cgit.cgi/s6/tree/src/supervision/s6-supervise.c#n836 This means that afterwards, timestamps obtained by calls to tain_now_g() will use CLOCK_MONOTONIC. But these timestamps are only used internally, so that the timeout computations for iopause() (the main event loop) are resilient to system clock changes - because s6-supervise, as you are experiencing, can be used very early, before the system clock is properly set, so there is a reason to use CLOCK_MONOTONIC here. The timestamps that are meant to be shared with other processes, however, are all obtained via tain_wallclock_read(), which uses CLOCK_REALTIME, and that is on purpose. 
I understand this is not the answer you were looking for, but it's the only one I've got. If you cannot live with the inaccurate report of s6-svstat -o updownfor,readyfor then my suggestion is to use s6-rc to delay the start of your sys-dbus (and friends) services until your initial system clock change has happened. -- Laurent
Re: [PATCH] configure: Catch all of variable values
-*=*) eval "$arg" ;; +*=*) eval "${arg%%=*}=\${arg#*=}" ;; I'm going to check, but that's probably correct. Thanks! -- Laurent
Re: posix_spawn (was: Bugs with execline/s6 documentation and skalibs functions using posix_spawn())
Actually I mean a *directory* that is guaranteed to exist (and meanwhile unexecutable): so /dev here. Indeed, /dev should work; but using it still makes me queasier than crafting a nonexistent path. The mkstemp thing works, so, not going to change it to save a couple of syscalls in a configure test. :) Well I was intending to suggest that we simply avoided posix_spawn*() where it disagreed with posix_spawn(3p); that is to say simply replacing all previous `#ifdef HASPOSIXSPAWN' conditions with `#if (defined HASPOSIXSPAWN) && (!defined SKALIBS_HASPOSIXSPAWNEARLYRETURN)'. After all it seems to me child_spawn*() is not used that prevalently, so the performance penalty is really minor; of course, feel free to correct me. Yes, falling back to fork+exec when posix_spawn is bad is an option, and I would probably have done just that if I hadn't been pointed to the existence of waitid() to achieve the "test whether a child is dead without reaping it" thing, without which there can be no workaround. But posix_spawn is more than a performance thing. The point of this interface is that its implementation doesn't have to be vfork+exec internally; it was precisely designed to allow spawning processes on nommu machines, where vfork and fork are basically impossible. So, using posix_spawn wherever possible helps with portability as well. Of course, it doesn't matter for glibc, and it doesn't matter for s6 which needs fork anyway. And chances are that platforms that implement posix_spawn() with internals that are *not* fork+exec will not make it return before the spawning has really succeeded. But still, it's nice to make sure it can be used wherever it exists. If you don't like the workaround, nobody's preventing you from using --with-sysdep-posixspawn=no manually. ;) -- Laurent
Re: posix_spawn (was: Bugs with execline/s6 documentation and skalibs functions using posix_spawn())
Fixes pushed to git, thanks! When given an unexecutable path, child_spawn() returns 0, but errno is unset... that's on purpose. Unfortunately, in the parent there is no way to know the child's execve() error code; all we have is the exit status, 127, and we cannot report the reason for the failure. Rather than set errno to something that may be wrong and prompt the caller to take inadequate measures, I'd rather set it to 0, which glibc reports as "success" but really means "no error information" except in a few, well-known contexts; and let the caller deal with the lack of more accurate reporting. I know it's not satisfying, but we can't do any better. I have realised that a simpler unexecutable path can be, for example, /etc (is it mandated in POSIX?); this can save the mkstemp() call in the sysdep test. POSIX doesn't mandate any path other than /dev/null and /dev/console and I'd rather not try executing them, who knows what weird permissions they may have on obscure OSes. It's a sysdep test, it's not performance-critical; I'd rather use mkstemp() to be *sure* we have a path that does not exist. (Of course the user could always race the program, but we're not trying to harden against stupidity here.) (And frankly I personally do not really find it much worthwhile to introduce this amount of complexity for the broken dependency of a quite minor performance optimisation...) I agree it's a lot of work for not much, but as you said, the behaviour is arguably conformant, and your experience proves that old glibcs are still around, so I'd rather make posix_spawn usable where it exists instead of placing the burden of --with-sysdep-posixspawn=no on users who have a bad version. As shown by the qemu bug I linked above, this impacts s6-svscan, which relies on correct child_spawn() reporting when running custom signal handlers, so not working around bad posix_spawn QoI may lead to buggy signal management in s6-svscan, and nobody wants that. 
A cursory web search appears to say that glibc-2.27 is when they fixed the posix_spawn QoI; 2.17 being bad is consistent with that. But I can't be bothered to go spelunk in glibc code to check and/or bisect, so if someone could confirm, thank you, otherwise, no big deal. -- Laurent
posix_spawn (was: Bugs with execline/s6 documentation and skalibs functions using posix_spawn())
I pushed a workaround to the skalibs git. Could you please try a build on a machine that exhibits the early return behaviour and tell me if - the behaviour is correctly detected by ./configure (the last sysdep) - the child_spawn*() family of functions now works properly even on this machine? Also, can you please tell me what version of glibc these distribution versions are running? Thanks! -- Laurent
Re: Bugs with execline/s6 documentation and skalibs functions using posix_spawn()
Actually I copied the fragment of posix_spawn(3) from a Devuan Chimaera machine, so the problem may not be specific to CentOS 7. I did not test CentOS 6 or other distros or versions, for example; but on Rocky Linux 8, which I unfortunately also need to support at work, the behaviour is as expected. Attached is a simple test. It may be a bug in some old glibc, then. If we assume posix_spawn(3) and posix_spawn(3p) were the only possible behaviours (which is frankly not that reliable, judging from how neither manpage noted the violation of conformance), then the two behaviours could be distinguished with the attached test. They're not the only possible behaviours: for instance, [1] shows that under some buggy qemu, posix_spawn() always returns early. But that behaviour can also be caught by the same workaround as the glibc behaviour you're observing, so it's fine. Since the bug is more widespread than "one old version of one distro", is visible in production environments used at large, and seems constrained to "posix_spawn succeeds even if exec fails", which is testable, I'll add a sysdep to detect it and a workaround in child_spawn*, but it will mean additional manual --with-sysdep-foobar=blah noise for skalibs cross-builds. [1]: https://skarnet.org/lists/skaware/1658.html -- Laurent
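The attached test is not reproduced here, but the distinguishing probe can be sketched as follows (a hedged illustration, not the actual skalibs sysdep test): spawn a path that cannot exist, and see whether posix_spawn() reports the exec failure synchronously, or "succeeds" and leaves a child that exits 127 (the conventional exec-failure status used by the early-returning implementations).

```c
/* Sketch of the distinguishing probe; not the actual skalibs test
   (which, among other things, crafts its nonexistent path safely). */
#include <sys/types.h>
#include <sys/wait.h>
#include <stddef.h>
#include <spawn.h>

extern char **environ ;

/* Returns 1 if posix_spawn() reports the exec failure synchronously,
   0 if it exhibits the early-return behaviour (child exits 127),
   -1 on anything unexpected. */
int posix_spawn_reports_exec_failure (void)
{
  pid_t pid ;
  int wstat ;
  char *argv[2] = { "/nonexistent/not-a-real-binary", NULL } ;
  if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
    return 1 ;  /* the error was reported directly to the caller */
  if (waitpid(pid, &wstat, 0) == -1) return -1 ;
  return (WIFEXITED(wstat) && WEXITSTATUS(wstat) == 127) ? 0 : -1 ;
}
```

Both outcomes are "success" for the probe's purposes; it is the second one that requires the waitid()-based workaround in child_spawn*().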
Re: Bugs with execline/s6 documentation and skalibs functions using posix_spawn()
Testing the behaviour may be challenging, however, because I suspect the CentOS 7 implementation of posix_spawn() is just racy, and they simply documented that they don't care. Thinking about it more, I'm afraid it's not a testable behaviour. Not only isn't there any way to force the race since it entirely happens inside a libc function, but also, the test would require running code on the build machine, which doesn't work for cross-builds and people would have to manually set the sysdep anyway. It seems like --with-sysdep-posixspawn=no, as you did, is the easiest workaround. -- Laurent
Re: Bugs with execline/s6 documentation and skalibs functions using posix_spawn()
As a more general fix, I think tryposixspawn.c should at least try spawning a probably unexecutable path (like the one above) as well, which corrects the sysdep on systems where the expected conformance is broken. Adding a sysdep to detect that case is a good idea indeed! Rather than pretending it doesn't exist, though, I'd rather add a different sysdep that tests its behaviour, so it can still be used with the proper workaround. Testing the behaviour may be challenging, however, because I suspect the CentOS 7 implementation of posix_spawn() is just racy, and they simply documented that they don't care. -- Laurent
Re: Bugs with execline/s6 documentation and skalibs functions using posix_spawn()
* In `trap.html', there is a reference to the removed `timeout' keyword. Fixed. * In `s6-svscan-not-1.html', the systemd unit (traumatic experience with it, as you may easily expect) lacks a `KillMode = process'. I believe the correct setting is actually KillMode=mixed; and the ExecStop= line is incorrect as well since ExecStop expects a synchronous command, not an asynchronous one. Better let systemd just send a SIGTERM to s6-svscan, wait for the supervision tree to exit on its own, and SIGKILL the stragglers. I pushed a fix accordingly. * The child_spawn*() family of functions, depending on using posix_spawn or not, exhibit different behaviours on CentOS 7 (trauma again), as posix_spawnp() may return 0 with argv pointing to unexecutable paths. This, for example, results in s6-svscan not exiting on SIGTERM when .s6-svscan/SIGTERM is absent. The behaviour of posix_spawnp() on CentOS 7 does not conform to posix_spawn(3p), but is documented in posix_spawn(3): "Even when these functions return a success status, the child process may still fail for a plethora of reasons related to its pre-exec() initialization. In addition, the exec(3) may fail." Yeah, well, tough for non-conforming systems. That said, I also pushed a change last week that should have fixed this issue as a side effect, so it's all good. If you feel like it, you can try the s6-svscan version in the latest s6 git. :) > --with-sysdep-devurandom Also fixed. Thanks a lot for these reports! -- Laurent
Re: utmps privilege
What's happening is that utmps-utmpd only checks the value of the *primary* gid of the client. It does not check supplementary groups. I agree that it's counter-intuitive, and will see if I can fix that. Unfortunately, no, that's not fixable. The credentials-passing mechanism used by s6-ipcserverd (the superserver for utmps-utmpd) only transmits the primary gid, not the supplementary groups; and I'm not aware of another reasonably portable credentials-passing mechanism, let alone one that transmits supplementary groups - except the suid mechanism, which, no. So you're going to have to keep setting your *primary* group to utmp if you want to modify the utmp database as a regular user. Sorry. -- Laurent
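On Linux, such a credentials-passing mechanism boils down to SO_PEERCRED (other systems offer getpeereid() or LOCAL_PEERCRED), and the struct it fills illustrates the limitation: one pid, one uid, one gid, nothing else. A minimal sketch (illustration only, not actual s6-ipcserverd code):

```c
/* Linux SO_PEERCRED illustration; not actual s6-ipcserverd code.
   struct ucred carries exactly pid, uid and one gid: the peer's
   primary gid. Supplementary groups are simply not transmitted. */
#define _GNU_SOURCE  /* for struct ucred on glibc */
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

int get_peer_primary_gid (int fd, gid_t *gid)
{
  struct ucred cred ;
  socklen_t len = sizeof(cred) ;
  if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) == -1)
    return -1 ;
  *gid = cred.gid ;
  return 0 ;
}
```

Whatever the server does with this gid, the kernel never hands it the client's supplementary group list, which is why setting the *primary* group to utmp is the only option.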
Re: utmps privilege
Please avoid using an HTML client; it looks like your converter is buggy and giving some garbled output (your top output is unreadable). What's happening is that utmps-utmpd only checks the value of the *primary* gid of the client. It does not check supplementary groups. I agree that it's counter-intuitive, and will see if I can fix that. Thanks for the report. -- Laurent
Re: s6-linux-init without virtual consoles
Thanks for the kind words, Oli :) It's all fine, really. In all fairness, yes, I *was* a little cheeky, because Esben sounded very dramatic about a harmless warning. But there's a legitimate UX takeaway here: the warning is indeed needlessly scary. So it will be changed in the next s6-l-i release. And yes, I suppose I can add the verbosity setting while I'm at it (it doesn't sound super useful, but it's not expensive either, so if people want it, why not.) -- Laurent
Re: s6-linux-init without virtual consoles
I think it would be fair to be able to configure s6-linux-init so that it does not rely on specific details about what hardware is available. Then I have some good news for you: s6-linux-init already does not rely on specific details about what hardware is available. Because if it did, and assumed that you have a virtual console, and you didn't, then it would crash. And you would be a very sad panda. And so would I. But it doesn't. What you're seeing is known as a run-time test: the existence of a /dev/tty0 device is tested. And if such a device exists, then s6-l-i attempts to support kbrequest on it. See? conditional support. It's nice and sweet and simple and has fewer failure cases (because the more configuration switches you have, the more you risk human error.) When you don't have a virtual console, s6-l-i works perfectly fine. If there was no warning message, you would never have noticed the extra system call, and you wouldn't be here asking for offline configuration where online configuration works. But there is a warning message, and that's what you don't like. So yes, the problem you have *is* the warning message per se, not the fact that s6-l-i performs one completely undetectable superfluous open() call on headless systems. So let's talk about the message. I agree it's not particularly elegant to print a warning on every boot in a normal configuration. So it could be refined: if devtmpfs can be relied on to always provide /dev/tty0 when a console exists, then when there's no such device, instead of "warning: missing device", s6-l-i could print "info: headless system detected". I think that would be less scary than a "warning", and users of headless and headful (?) systems could keep living together in peace and harmony. What do you think? -- Laurent
Re: 回复: lastlog support
I checked the shadow-utils site. It provides a lastlog CLI, but there is no lastlogd similar to utmpd/wtmpd. The lastlog file isn't managed by utmp, but by the login program, with or without assistance from PAM. It's an entirely different operation, and I don't understand why you'd want utmps to be involved. -- Laurent
Re: s6-linux-init without virtual consoles
While that might make sense when the system is expected to have a /dev/tty0 device, it is kind of messy to see that on systems that are not supposed to have /dev/tty0. Kernels and various parts of init systems print warning messages all the time for similar reasons (some operation failed because it's not supported in the current configuration), so I don't think it's fair to single out this one. I would prefer to do nothing. That said, if it's important for more users, I could probably add a verbosity setting, where -v0 would silence warning messages. The problem is that it would do so for *all* warning messages: you'd have no way to tell whether you missed a warning that was actually relevant to you. And no, I'm not adding a separate switch for every warning message in the program :P -- Laurent
Re: lastlog support
is there any plan to support lastlog in utmps project? lastlog uses a separate /var/log/lastlog file, so it's not directly tied to utmp. If anything, it *uses* utmp, so it's the other way around: the shadow-utils package should support utmps. -- Laurent
Re: s6-log not responding to signals
And the timeout is only going to start during exit, right? Naturally. :) -- Laurent
Re: s6-log not responding to signals
While that would make s6-log nicer to integrate with s6-rc, I still think that the current behavior of potentially blocking SIGTERM forever is undesirable, so some kind of timeout in s6-log could still be a good idea. That's why I was suggesting a timeout. And since logging a partial line as a complete line is always strictly better than dropping the partial line, once you have the timeout feature, you don't need anything else: set the timeout to 1 ms if you want to exit immediately even with partial lines. -- Laurent
Re: s6-log not responding to signals
The goal is to never write partial lines. So if the process is sent a signal to exit while a partial line has been received, simply exit without writing anything to the file. One of the goals is not to write a partial line if it can be avoided; but it defers to the more important goal of not losing any data. Your suggestion goes against that more important goal. I would vote for simply dropping it. And as we are shutting down, the whole thing is a kind of race anyway, so the first part of the line could just as well not have been received at all, so I think we can safely just throw it away without even waiting for it. Nope. Not happening. Certainly, on shutdown, it doesn't matter whether you get that last log line or not. But loggers don't only get killed on shutdown. There are other, good, reasons why you would want to kill (and restart) an s6-log process, and not losing any data is important in these cases. -- Laurent
Re: s6-log not responding to signals
How are you thinking changes to termination behaviour will interact with the existing -p option? There would be no specific interaction. -p only makes s6-log ignore SIGTERM. The signal is received, but does nothing. The new timeout option would make it wait on receipt of an exit signal, be it SIGTERM or SIGHUP. So, with -p, it would only trigger the new behaviour on SIGHUP, and keep doing nothing on SIGTERM. As suggested by the documentation, when s6-log is waiting for a newline to arrive, its behaviour could be influenced by a) EOF on stdin, b) termination signal. Are you thinking of adding the timeout only if there is a termination signal, but EOF has not yet been detected? There are two exit conditions for s6-log: 1. it reads EOF on stdin; 2. it receives a SIGTERM (unless -p), or a SIGHUP. On EOF, s6-log exits *immediately*. If it has a partial line in its buffer, it will process and log it as a full line before exiting. It does not wait because there is no reason to: the producer closed the data stream, so s6-log is never getting any more data to finish the line. On a termination signal, the producer isn't necessarily done sending logs; the signal comes from a third party (the administrator). s6-log's goal is to exit asap but without losing any data, and on a line boundary. If there's nothing in its buffer, it exits immediately, but if there's a partial line, it will wait for the producer to send it the rest of the line, process this line, and then exit without reading anything more. (If the producer has more to send, it can do so if the pipe to s6-log is being fd-held; the next s6-log incarnation then resumes where the old one has stopped. If the pipe isn't being fd-held, then the producer gets a broken pipe error, but knows exactly what it has successfully sent and what it has not: no data has been lost in a buffer.) 
My suggestion is to add a timeout in the only case s6-log doesn't exit immediately: when it gets a termination signal and there is a partial line in the buffer. The wait is meant to give some leeway for the producer to send the rest of the line before s6-log exits, but if no such rest of the line is coming, it would be better for s6-log not to wait forever. -- Laurent
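A minimal sketch of that proposed behaviour, using poll(): this is my illustration of the idea, not s6-log's actual code. The function name and buffer handling are assumptions, and a real implementation would recompute the remaining deadline across short reads instead of restarting it.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: after a termination signal, wait at most
   timeout_ms for the rest of a partial line, then give up and let
   the caller log whatever is buffered as a complete line. */
size_t drain_partial_line(int fd, char *buf, size_t len, size_t max,
                          int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    while (len < max && !memchr(buf, '\n', len))
    {
        int r = poll(&pfd, 1, timeout_ms);
        if (r <= 0) break;                 /* timeout or error: stop waiting */
        ssize_t n = read(fd, buf + len, max - len);
        if (n <= 0) break;                 /* EOF: nothing more is coming */
        len += (size_t)n;
    }
    return len;    /* caller processes buf[0..len) as a line, then exits */
}
```

With the timeout set very low (the 1 ms mentioned earlier in the thread), this degenerates into exiting almost immediately while still logging the partial line instead of dropping it.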
Re: s6-log not responding to signals
The problem is that until a new-line is received, s6-log will not respond to SIGHUP and SIGTERM. I assume this is not as expected. This is expected; the goal is to finish reading partial lines before exiting. This is useful with services that are writing a large amount of logs, where the buffer length does not necessarily align with a newline: after receiving the signal, the logger reads until the next newline, processes the line, then exits. No service should ever write a partial line at the end of its lifetime. However, I agree that the situation you're describing is not ideal and s6-log should be more robust. I'm thinking of adding a timeout: if s6-log hasn't received the end of a partial line n milliseconds after receiving a terminating signal, then it should process the partial line anyway and exit. What do you think? -- Laurent
Re: s6-linux-init-man-pages
An mdoc(7) port of the documentation for s6-linux-init is now available: https://git.sr.ht/~flexibeast/s6-linux-init-man-pages/archive/v1.1.1.0.1.tar.gz
Re: *-man-pages: s6-rc port, Makefile fix
An mdoc(7) port of the documentation for s6-rc is now available: That's awesome, thanks a lot Alexis! -- Laurent
[announce] April 2023 bugfix release
Hello, New versions of some skarnet.org packages are available. They fix a few visible bugs, so users are encouraged to upgrade. I usually do not announce bugfix releases. This e-mail is sent because two new functionalities were also in git when the bugfixes needed to be made, so they're now available: - A new -D option to execline's elgetopt, allowing customization of the value for the ELGETOPT_n variable when the -n option is given and does not expect an argument. - A new -R option to s6-linux-init-maker, allowing you to set hard resource limits for the system at boot time. skalibs-2.13.1.1 (release) execline-2.9.3.0 (minor) s6-2.11.3.2 (release) s6-linux-init-1.1.1.0 (minor) s6-portable-utils-2.3.0.2 (release) Enjoy, Bug-reports always welcome. -- Laurent
Re: [PATCH] Multicall improvements didn't improve trap
Sending signals to the trap process does nothing. If I revert 9d55d49dad0f4cb90e6ff2f9b1c3bc46a6fcf05f, trap works as expected. After some debugging I think that the pids array in trap.c contains garbage since it isn't initialized statically anymore (see the attached patch). Your diagnosis is correct indeed, the change from static to automatic removed the implicit initialization to 0. Patch applied, thanks. -- Laurent
Re: [execline] Conditional export
I'm going to regret this. ifthenelse -s { eltest -f ${FILE} } { export EXISTS ${FILE} } { } env This construct is purposefully not documented, because it breaks syntactic and logic assumptions that are true in the rest of execline. But it can simplify your life in a handful of cases, like this one. What it does: it will *prepend the rest of your script* with the contents of the second or the third block, depending on whether the test in the first block is true. Do not overuse it. Do not ask for support about it. If it makes your script easier to maintain, enjoy. If you start feeling like a sorcerer and are tempted to explore what kind of magical feats you can accomplish with it, don't - it will not end well. -- Laurent
Re: [PATCH] execline: multicall: make sort independent of locale
Can you please tell me what locale you're using, for testing purposes ? -- Laurent
Re: [PATCH] execline: multicall: make sort independent of locale
reset LC_ALL to avoid locale dependent sorting. This is critical to ensure bsort() works reliably. In my locale "execline-cd" was sorted after "execlineb" ... lol. Changing the sorting order for ASCII characters is probably the most insane misdesign in locales. Good catch! Thanks for the patch. Going to apply with a slight modification: making the change global to the whole script, for easier maintainability. -- Laurent
[announce] skarnet.org February 2023 release
Hello, New versions of some skarnet.org packages are available. It hasn't been long since the last release, but lots of small things have happened and it doesn't make much sense to let them rot in git. The main addition is a new multicall configuration for the execline, s6-portable-utils and s6-linux-utils packages. When you give the --enable-multicall option to configure, a single binary is compiled, and 'make install' installs this binary and creates symlinks to it. This is useful for setups that focus on saving disk space. Credit for this addition goes to Dominique Martinet, who nerd-sniped me into actually testing such a configuration; and it turned out the disk space gains were very impressive for execline (up to 87%!). I applied the idea to the s6-portable-utils and s6-linux-utils packages, which are also made of small, very simple, independent programs, to see whether it was viable in the general case; but as I suspected, the gains were not as impressive, and making it work required a significant refactoring effort. Since other skarnet.org packages would have an even worse gains/effort ratio, the experiment is stopping there. execline is an outlier, with a 177 kB amd64 static binary being able to replace a 1.3 MB set of binaries; that's much better than I thought it would be, so it's worth supporting. Enjoy. Other changes include mostly bugfixes and quality-of-life improvements. The new versions are the following: skalibs-2.13.1.0 (minor) nsss-0.2.0.3 (release) execline-2.9.2.0 (minor) s6-2.11.3.0 (minor) s6-rc-0.5.4.0 (minor) s6-linux-init-1.1.0.0 (major) s6-portable-utils-2.3.0.0 (major) s6-linux-utils-2.6.1.0 (minor) s6-networking-2.5.1.3 (release) mdevd-0.1.6.2 (release) Details of some of these package changes follow. * skalibs-2.13.1.0 - Bugfixes. - New function: sals, listing the contents of a directory in a stralloc. Straightforward, but a large-ish piece of code that was used in multiple places and needed to be factored.
https://skarnet.org/software/skalibs/ git://git.skarnet.org/skalibs * execline-2.9.2.0 - New --enable-multicall configure option. This is the big one for some distributions that don't want to spend 1 MB of disk space on execline binaries. (They already know my position on that.) https://skarnet.org/software/execline/ git://git.skarnet.org/execline * s6-2.11.3.0 - Bugfixes. - Instance-related internal changes. Instanced service directories need to be recreated with the new version of s6-instance-maker. - New s6-svc -Q command, instructing s6-supervise not to restart the service when it dies (like -O) and to additionally create a ./down file in the service directory. - s6-ioconnect will now always shutdown() socket endpoints at EOF time; the -0, -1, -6 and -7 options are still supported, but deprecated. https://skarnet.org/software/s6/ git://git.skarnet.org/s6 * s6-rc-0.5.4.0 - Bugfixes. In particular, s6-rc-update now conserves the existing instances in an instanced service, whether the service is currently active or not. In case of a live update, the current instances keep running, but will restart with the new template next time they die (which can be forced by an s6-instance-control -r invocation). - New s6-rc subcommands: start and stop, equivalent to "-u change" and "-d change" respectively. https://skarnet.org/software/s6-rc/ git://git.skarnet.org/s6-rc * s6-linux-init-1.1.0.0 - s6-linux-init-maker: -U option removed. No early utmpd script is created. The reason for this change is that distros using utmps need stage 2 utmp services anyway (because the wtmp database needs to be persistent, so wtmpd and btmpd can only be started after a log filesystem has been mounted), so utmp is unusable before that point no matter what. Distros should have an utmpd service started at the same time as wtmpd and btmpd; so utmp management goes entirely out of scope for s6-linux-init.
https://skarnet.org/software/s6-linux-init/ git://git.skarnet.org/s6-linux-init * s6-portable-utils-2.3.0.0 - s6-test removed, hence the major update. - New --enable-multicall configure option. https://skarnet.org/software/s6-portable-utils/ git://git.skarnet.org/s6-portable-utils * s6-linux-utils-2.6.1.0 - s6-mount option support updated. - New --enable-multicall configure option. https://skarnet.org/software/s6-linux-utils/ git://git.skarnet.org/s6-linux-utils Enjoy, Bug-reports welcome. -- Laurent
Re: s6 instanced services are "forgotten" after s6-rc-update
Am I missing something? Sorry, I had failed to push the changes. Fixed now.
Re: s6 instanced services are "forgotten" after s6-rc-update
On s6-instance-update, as a user I'd expect it to exist and be run automatically as part of s6-rc-update. Since restarting applies the new definition for longruns, having to do one extra step (or two if s6-instance-update isn't made) per instance of a templated longrun would be counterintuitive. It took a bit of creative juice, and significant refactoring, but I think I finally got it down. The current s6-rc git will remember your created instances from one compiled db to the next across s6-rc-update, no matter the state of the service at the time of the update. If the updated instanced service is live, then the new template will be copied to all the instances, but the instances will still be running on the old template until they're killed via s6-instance-control, at which point they will restart on the new template. This is in line with the principle of maximizing service uptime and waiting for admin input before killing processes. Please try it and tell me if it's working for you. You'll need to build against the latest git of skalibs and s6. -- Laurent
Re: single-binary for execline programs?
Yes, this is only possible because you did a very good job in the first place. Good work! This cannot be said enough. Thanks. I managed to de-global the arrays in trap.c, so now the only unavoidable global is in forstdin: a pointer to a structure accessed by a signal handler. You'd think with all the siginfo stuff, POSIX would have thought of mandating a void * auxiliary pointer you'd give to sigaction() and that would be stored and provided to the signal handler, but no, there's just no room to pass user data other than globally. Yet another example of wonderful, user-friendly design. But yeah, 8 bytes of bss/data for the whole thing is pretty good, the crt and the libc are basically the only static RAM users, so there's nothing more to do here. I was also curious about starting time and should have done that in my previous mail, it's a bit slower as expected. Yeah, a 0.2 ms difference is fine, I think. :P But I'm not sure if it's possible to get an accurate benchmark, because the cost of 4-5 strcmp()s is negligible before the cost of the execve's in the first place. I suspect at least half of the difference comes from mapping a bigger executable. I think the main reason to like shared libraries as a distribution is that if you upgrade it, you get the upgrade for all programs that depend on it -- which isn't really a problem for this. Oh, absolutely, and that's why it's hard to advocate static linking to distributions. It's a very reasonable argument for dynamic linking. At the risk of repeating myself, I'll be happy to help with anything related to this -- that's the least I can do given I brought it up. Thank you. I might seriously take you up on that offer further down the road. :) But really, since the "cat everything together" method works in this case, there's not much more to do except pay attention when writing or editing normal programs in the future.
I pushed "multicall-strip" and "multicall-install" targets in git, and documented the setup in the INSTALL file. As experimental, because although I *think* everything is working, there may still be some interaction I've missed. -- Laurent
Re: single-binary for execline programs?
allow you to link against a dynamic libexecline. Can you do it, see how much space you gain? That's a configuration I would definitely support, even if it's slower - people usually love shared libraries. I'm tired. This configuration is obviously already supported, and no need to patch. You just need to ./configure --disable-allstatic. Compared to fully static binaries with musl, a fully static multicall is a size gain of 87%. Compared to fully dynamic binaries, it's a gain of 69%. Still impressive. A fully dynamic multicall binary is only 80 kB on x86_64, but it's a pretty stupid configuration because it's the only user of libexecline so nothing is gained by sharing it. There still may be a case for sharing libskarnet; but toolchains make it so difficult to link against some libraries statically and some others dynamically that supporting that configuration is just not worth it. -- Laurent
Re: single-binary for execline programs?
Look, here's a trivial, suboptimal wrapper, far from pretty: > (...) (look, I said it wasn't pretty -- there are at least a dozen of problems with this, but nothing a day of work I offered to do can't fix; I wrote this because it was faster than talking without a concrete example to have some figures, and that took me less time than the rest of this mail) Damn you to the nine circles of Hell, one by one, slowly, then all of them at the same time. You piqued my curiosity, so I did it, and I spent the day making it work. The execline git now has a 'multicall' make target. It will make an "execline" binary that has *everything* in it. You can symlink it to the name of an execline program and it will do what you expect. You can also call the subcommand as argv[1]: "execline exit 3" will exit 3. No install targets, no automatic stripping, no symlinks, nothing. I don't want to officially support this configuration, because I *know* it will be a time sink - every ricer on the planet will want me to change something. So you get the binary for your own enjoyment, and that's it. Have fun. If it breaks, you get to keep both pieces. It's really rough: it only marginally improves on your model, fixing the most glaring problems. The only fancy thing it does is find the applet via bsearch(), because that's easy and it saves about 20 strcmp() per call. Apart from that, it's super dumb. That said, you were right: that's some pretty hefty saving of disk space. The execline binary is 169kB, statically linked against musl on x86_64. That's neat. I expected it to be at least twice bigger. And the data/bss isn't too bad either: only 2 pages. But that's because execline programs use very little global memory in the first place - the only places where globals are used is when state needs to be accessed by signal handlers, and there's nothing I can do about that in short-lived programs. (Long-lived programs use a selfpipe, so only one int of global data is ever needed for them.) 
So, all in all, much better results than I expected, it was a pleasant surprise. Still, concatenating all the code feels really clunky, and a real multicall program needs to be designed for this from the start, which won't happen for execline in the foreseeable future, so this is as much as you get for now. If you're interested in hacking the thing, the magic happens in tools/gen-multicall.sh. libexecline is statically linked, so these pages aren't shared afaik? That's right, I forgot it was always statically linked. If it helps, changing ${LIBEXECLINE} to -lexecline in the src/execline/deps-exe files, then running ./tools/gen-deps.sh > package/deps.mak, should allow you to link against a dynamic libexecline. Can you do it, see how much space you gain? That's a configuration I would definitely support, even if it's slower - people usually love shared libraries. I really don't see what's different between e.g. execline and coreutils, who apparently thought it was worth it; coreutils also thought it was worth it to implement true --help and true --version, so I'll leave to your imagination how much I value their technical judgment. The only way to know for sure whether it will be worth it is to stop speculating and start profiling, which is what I did. And it appears the results are interesting, so, that's great! Sigh. I shouldn't feel that way, and any potential improvement should be a source of joy, not dread - but really I wish the results weren't so good. Now Pandora's box has been opened and everyone will want to use the multicall exclusively, so at some point I'll have to support it, i.e. ensure it's actually correct and enhance its maintainability. And that means a lot more work. :( But, unfortunately for you, the full openrc suite is 2.2MB (5 on arm with bloated aarch64), which is a bit less than the s6 suite :-D No, that's fair. It's true that s6 takes a bit more disk space. Where OpenRC loses is RAM and CPU, because it does everything in shell scripts. 
And shell scripts definitely win on disk space. :) -- Laurent
Re: single-binary for execline programs?
I believe I did my homework looking first -- are there other discussion channels than this list that one should be aware of? The lists are definitely the only place you *should* be aware of, but there are a lot of informal spaces where discussions happen, because not everyone is as well-behaved as you are :) Github issues, webforums of other projects, IRC channels, etc. The important stuff normally only happens here, but I'm getting user feedback from several sources. I'd go out on a limb and say if you only support single-binary mode, some of the code could be simplified further by sharing some argument handling, but it's hard to do simpler than your exlsn_main wrapper so it'll likely be identical with individual programs not changing at all, with just an extra shim to wrap them all; it's not like busybox where individual binaries can be selected so a static wrapper would be dead simple. I doubt much sharing would be possible. The main problem I have with multicall is that the program's functionality changes depending on argv[0]. You need to first select on argv[0], and *then* you can parse options and handle arguments. Note that each exlsn_* function needs its own call to subgetopt_r(), despite the options being very similar, because they all fill an eltransforminfo_t structure. Having a shim over *all* the execline programs would be that, multiplied by the number of programs; at the source level, there would not be any significant refactoring, because each program is pretty much its own thing. An executable is its own atomic unit, more or less. If anything, execline is the package that's the *least* adapted to multicall because of this. There is no possible sharing between "if" and "piperw", for instance, because these are two small units with very distinct functionality. The only way to make execline suited to multicall would be to entirely refactor the code of the executables and make a giant library, à la busybox.
And I am familiar enough with analyzing and patching busybox that I certainly do not want to add that kind of maintenance nightmare to execline. Anything that can be shared in execline is pretty much already shared in libexecline. If you build execline with full shared libraries, you get as much code sharing as is reasonably accessible without a complete rearchitecture. Any significant disk space you would gain in a multicall binary compared to a bunch of dynamically linked executables would come from the deduplication of unavoidable ELF boilerplate and C run-time, and that's basically it. The "one unique binary" argument applies better to some of my other software; for instance, the latest s6-instance-* additions to s6. I considered making a unique "s6-instance" binary, with varying functionality depending on an argv[1] subcommand; I eventually decided against it because it would have broken UI consistency with the rest of s6, but it would have been a reasonable choice for this set of programs - which are already thin wrappers around library calls and share a lot of code. Same thing with s6-fdholder-*. execline binaries, by contrast, are all over the place, and *not* good candidates for multicall. Hmm, I'd need to do some measurements, but my impression would be that since the overall size is smaller it should pay off for any pipeline calling more than a handful of binaries, as you'll benefit from running the same binary multiple times rather than having to look through multiple binaries (even without optimizing the execs out). Yes, you might win a few pages by sharing the text, but I'm more concerned about bss and data. Although I take some care in minimizing globals, I know that in my typical small programs, it won't matter if I add an int global, because the amount of global data I need will never reach 4k, so it won't map an extra page. When you start aggregating applets, the cost of globals skyrockets. You need to pay extra attention to every piece of data. 
Let me bring the example of busybox again: vda, the maintainer, does an excellent job of keeping the bss/data overhead low (only 2 pages of global private/dirty), but that's at the price of keeping it front and center, always, when reviewing and merging patches, and nacking stuff that would otherwise be a significant improvement. It's *hard*, and hampers code agility in a serious way. I don't want that. Sure, you can say that globals are a bad idea anyway, but a lot of programs need *some* state, if local to a TU - and the C and ELF models make it so that TU-local variables still end up in the global data section. Even almost 1MB (the x86_64 version that doesn't have the problem, package currently 852KB installed size + filesystem overhead..) is still something I consider big for the systems I'm building, even without the binutils issue it's getting harder to fit in a complete rootfs in 100MB. I will never understand how disk space is an issue for execline and s6. RAM absolutely is, because
Re: s6 instanced services are "forgotten" after s6-rc-update
Agree on avoiding restarting old instances. If instances were atomic services, s6-rc-update wouldn't restart them either. OTOH, the template's files are copied, not symlinked, which means restarting old instances will use the old template. Does this call for an s6-instance-update program? The fix I currently have in git does exactly that: instances are now correctly transmitted across s6-rc-update, and not restarted; the new template is copied, but it's not copied to existing instances, it will only be used for new ones. To get the new template on an existing instance, you need s6-instance-delete + s6-instance-create. There may indeed be some value to an s6-instance-update program that would provide a new template to an existing instance, with an option to immediately restart the instance or not. I'll think about it some more, inputs welcome. -- Laurent
Re: single-binary for execline programs?
In particular there's a "feature" with recent binutils that makes every binary be at least 64KB on arm/aarch64[1], so the execline package is a whopping 3.41MB[2] there (... and still 852KB on x86_64[3]) -- whereas just doing a dummy sed to avoid conflict on main and bundling all .c together in a single binary yields just 148KB (x86_64 but should be similar on all archs -- we're talking x20 bloat from aarch64/armv7 sizes! Precious memory and disk space!) > (...) It should be fairly easy to do something like coreutils' --enable-single-binary without much modification The subject has come up a few times recently, so, at the risk of being blunt, I will make it very clear and definitive, for future reference: No. It will not happen. The fact that toolchains are becoming worse and worse is not imputable to execline, or to the way I write or package software. It has always been possible, and reasonable, to provide a lot of small binaries. Building a binary is not inherently more complicated today than it was 20 years ago. There is no fundamental reason why this should change; the only reason why people are even thinking this is that there is an implicit assumption that software always becomes better with time, and using the latest versions is always a good idea. I am guilty of this too. This assumption is true when it comes to bugs, but it becomes false if the main functionality of a project is impacted. If a newer version of binutils is unable to produce reasonably small binaries, to the point that it incites software developers to change their packaging to accommodate the tool, then it's not an improvement, it's a regression. And the place to fix it is binutils. The tooling should be at the service of programmers, not the other way around. It is a similar issue when glibc makes it expensive in terms of RAM to run a large number of copies of the same process.
Linux, like other Unix-like kernels, is very efficient at this, and shares everything that can be shared, but glibc performs *a lot* of private mappings that incur considerable overhead. (See the thread around this message: https://skarnet.org/lists/supervision/2804.html for an example.) Does that mean that running 100 copies of the same binary is a bad model? No, it just means that glibc is terrible at that and needs improvement. Back in the day when Solaris was relevant, it had an incredibly expensive implementation of fork(), which made it difficult, especially with the processing power of 1990s-era Sun hardware, to write servers that forked and still served a reasonable number of connections. It led to emerging "good practices", that were taught by my (otherwise wonderful) C/Unix programming teacher, and that were: fork as little as possible, use a single process to do everything. And that's how most userspace on Solaris worked indeed. It did a lot of harm to the ecosystem, turning programs into giant messes because people did not want to use the primitives that were available to them for fear of inefficiency, and jumping through hoops to work around it at the expense of maintainability. Switching to Linux and its efficient fork() was a relief. Multicall binaries have costs, mostly maintainability costs. Switching from a multiple binaries model to a multicall binary model because the tooling is making the multiple binaries model unusably expensive is basically moving the burden from the tooling to the maintainer. Here's a worse tool, do more effort to accommodate it! Additionally to maintainability costs, multicall binaries also have a small cost in CPU usage (binary starting time) and RAM usage (larger mappings, fewer memory optimizations) compared to multiple binaries. These costs are paid not by the maintainer, but by the users. Everyone loses. Well, no. 
If having a bunch of execline binaries becomes more expensive in disk space because of an "upgrade" in binutils, that is a binutils problem, and the place to fix it is binutils. In the long run this could also provide a workaround for conflicting names, cf. old 2016 thread[4], if we'd prefer either running the appropriate main directly or re-exec'ing into the current binary after setting argv[0] appropriately for "builtins". There have been no conflicts since "import". I do not expect more name conflicts in the future, and in any case, that is not an issue that multicall binaries can solve any better than multiple binaries. These are completely orthogonal things. (I assume you wouldn't like the idea of not installing the individual commands, but that'd become a possibility as well. I'm personally a bit uncomfortable having something in $PATH for 'if' and other commands that have historically been shell builtins, but have a different usage for execline...) You're not the only one who is uncomfortable with it, but it's really a perception thing. There has never been a problem caused by it. Shells don't get confused. External tools don't get confused. On this
Re: s6 instanced services are "forgotten" after s6-rc-update
I can provide an strace of s6-rc-update if needed. Looking into it, it seems s6-rc-update "uncritically" unlinks the live instance/ and instances/ folders and replaces them with brand-new copies from the compiled database. I can confirm that this happens and that it was an oversight; I'm now in the process of fixing it (which will involve a few changes to s6 ending up in a major update, I'm afraid). A question I have is: what should s6-rc-update do when the template has changed? The template will obviously be changed in the new service, but should the old instances stay alive, with the old template? My natural inclination is to say yes; if the user wants the service restarted they can say so explicitly in the conversion file. But maybe there are better alternatives I haven't thought about. -- Laurent
Re: s6 instanced services are "forgotten" after s6-rc-update
After having an instanced service definition for s6-rc, subsequent calls to s6-rc-update seem to clobber the instance/ and instances/ subfolders in a way that keeps the instances running, but makes it impossible to control them. After s6-rc-update, I get error messages like:

fatal: unable to open /run/service/agetties/instance/.s6-svscan/lock: No such file or directory
fatal: unable to control /run/service/agetties/instance/tty1: No such file or directory

Hmm, that's weird. Are you sure you're using the latest s6-rc version? Only 0.5.3.3 will correctly manage instances. If you are, I'll try to reproduce the issue to understand what's going on. -- Laurent
[announce] skarnet.org January 2023 release
Hello,

New versions of the skarnet.org packages are available. This release is overdue, sorry for the delay - but finally, happy new year everyone!

skalibs' strerr_* functions and macros, meant to provide shortcuts for error message composition and output, have been rewritten; they're no longer split between strerr.h and strerr2.h, but are all gathered in strerr.h - the skalibs/strerr2.h header is now deprecated. This is released as a major version upgrade to skalibs because some hardly ever used strerr macros have been outright removed; and the deprecation of strerr2.h also counts as an API change. However, unless you were using the deleted strerr macros (highly unlikely, as there was no reason to, which is why they're being deleted in the first place), your software should still build as is with the new skalibs, maybe with warnings.

The rest of the skarnet.org software stack has undergone at least a release bump, in order to build with the new skalibs with no warnings. Most packages also include several bugfixes, so upgrading the whole stack is recommended.

The new version of s6 includes a feature that has often been asked for: an implementation of dynamically instanced services. Six new commands allow you to create and manage dynamic instances of a given service directory template, parameterized by an argument you give to the run script. It also comes with a few quality-of-life changes, such as s6-log line prefixing, as well as a good number of minor bugfixes.

The "s6-test" program, formerly in s6-portable-utils, has migrated to the execline package, where it is named "eltest". It still exists in s6-portable-utils, but is deprecated and will be removed in a future release.
The new versions are the following:

skalibs-2.13.0.0 (major)
nsss-0.2.0.2 (release)
utmps-0.1.2.1 (release)
execline-2.9.1.0 (minor)
s6-2.11.2.0 (minor)
s6-rc-0.5.3.3 (release)
s6-linux-init-1.0.8.1 (release)
s6-portable-utils-2.2.5.1 (release)
s6-linux-utils-2.6.0.1 (release)
s6-dns-2.3.5.5 (release)
s6-networking-2.5.1.2 (release)
mdevd-0.1.6.1 (release)
smtpd-starttls-proxy-0.0.1.2 (release)
bcnm-0.0.1.6 (release)
dnsfunnel-0.0.1.5 (release)

Details of some of these package changes follow.

* skalibs-2.13.0.0
- Bugfixes.
- New functions: buffer_timed_put, buffer_timed_puts, for synchronous writes to a file descriptor with a time limit.
- strerr2.h deprecated. strerr.h entirely revamped. Every existing strerr interface is now a variable argument macro around the new strerr_warnv, strerr_warnvsys, strerr_diev and strerr_dievsys functions, which just print arrays of strings to stderr. This reduces the amount of adhocness in the strerr code considerably, allows calls without an upper bound on the number of strings, and should save some bytes in resulting binaries.

https://skarnet.org/software/skalibs/
git://git.skarnet.org/skalibs

* execline-2.9.1.0
- Bugfixes.
- New program: eltest. This is the program formerly available in s6-portable-utils as "s6-test", which has changed packages and been renamed. It's a quasi-implementation of the POSIX "test" utility, which was too useful in execline scripts to be off in a separate package. (Quasi because the exact spec is bad.) It understands -v, for testing the existence of a variable, and =~, for regular expression matching.

https://skarnet.org/software/execline/
git://git.skarnet.org/execline

* s6-2.11.2.0
- Bugfixes.
- The one-second service restart delay can now only be skipped when the service is ready. This prevents CPU hogging when a heavy service takes a long time to start and fails before reaching readiness.
- The name of the service is now passed as the first argument to ./run and as the third argument to ./finish.
- s6-log now understands a new directive: p. "pfoobar:" means that the current log line will be prepended with the "foobar: " prefix. This allows service differentiation in downstream log processing, which was an often requested feature.
- New commands available: s6-instance-maker, s6-instance-create, s6-instance-delete, s6-instance-control, s6-instance-status, s6-instance-list. They allow you to manage supervised sets of services created from the same templated service directory with only a parameter (the name of the instance) changing.

https://skarnet.org/software/s6/
git://git.skarnet.org/s6

* s6-portable-utils-2.2.5.1
- s6-test is now deprecated, replaced with the eltest program in the execline package.

https://skarnet.org/software/s6-portable-utils/
git://git.skarnet.org/s6-portable-utils

Enjoy,
Bug-reports welcome as always.

-- Laurent
Re: skabus: more related software
This is a complete message bus implementation https://codeberg.org/maandree/bus Perhaps it can be reused? Or at least mentioned under "Similar work" here https://skarnet.org/software/skabus/ I wouldn't say "complete", because depending on your definition of "bus" it's unfortunately not possible to implement one without a daemon on Unix - but yeah it's doing pubsub, similarly to s6's libftrig (but more efficiently via shmem). Added to the skabus "similar work" section, thanks for mentioning it! -- Laurent
Re: [PATCH] Document skalibs/siovec.h header
Thanks! Merged with some rewrites where the description wasn't accurate. Also wrote some doc for siovec_search() which is the one that required actual effort to come up with :P It allowed me to spot and fix a small bug, too. The release is coming soon, but I still need to document, test, and polish a new s6 feature, so it will be a few more days, sorry about that. -- Laurent
Re: (u)intN_bfmt macros use (u)intN0_fmt_base instead of (u)intN_fmt_base
In src/header/bits-template, line 22:

#define uint@BITS@_bfmt(s, b) uint@BITS@0_fmt_base(s, (b), 2)

and line 45:

#define int@BITS@_bfmt(s, b) int@BITS@0_fmt_base(s, (b), 2)

shouldn't have zeros

Good catch, thanks! Fixed in current git. -- Laurent
Re: Reading s6-rc database without root for completion
I'd like for the user to be able to complete `sudo s6-rc -u change some...` from a non-root terminal, but trying to use the output of `s6-rc-db list services` fails as it can't take a lock in /run/s6-rc/compiled. Hm, that's an oversight on my part. Reading the database should be possible by normal users, but the lock is currently taken O_RDWR (because the locking primitive is the same for reading and for writing), so it fails. I will fix that. Until then, sure, read your info from the source directory, but be aware it may not be in sync with the current live database. Thanks for the report! -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
To clarify, I'm referring to the ->target member (in srv) or the ->exchange member (in mx). Are those not the same as the input format for skadns_send? When parsed by s6dns_message_parse_answer_srv() and s6dns_message_parse_answer_mx(), the domains are obtained from the packet via s6dns_message_get_domain(), which calls s6dns_domain_decode(). In other words, when you obtain a s6dns_message_rr_srv_t or a s6dns_message_rr_mx_t, the domains in these structures are in string format. (Because usually they're destined to be returned to the application and displayed, not used in another packet right away.) So if you want to reuse these domains for another skadns_send() query, you need to re-encode them first via s6dns_domain_encode(). Thanks Guillermo for getting to the bottom of this! :) -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
However, the OS would still deliver them to skadnsd in a recv() / recvfrom() call, right? If my reading of the truss outputs is correct, the HardenedBSD system isn't getting a response at all,

That's right, which is why my hypothesis of the RD bit filter only applied to OmniOS, which did get responses but these got ignored by skadnsd. On HardenedBSD, 18 queries getting no answers from the caches is absolutely a different problem.

and whatever error happens with the program running on the OmniOS system, if any, does not involve the network

It involves the relevance test: https://github.com/skarnet/s6-dns/blob/master/src/libs6dns/s6dns_engine.c#L32 This function is called on every incoming message that is a potential response. If it returns 0, the message is deemed irrelevant to the current query, and ignored. When you see a recv() (or recvfrom()) from a UDP socket, but no answer is reported to the client and the socket is still polled until it times out, it means that the relevant() test failed. Until tonight, the "h.rd != (q[2] & 1)" test, i.e. "is the rd bit of the response different from the rd bit of the query", was performed outside of the "strict" guard. This caused some responses to be ignored as malformed, even though it's the cache that isn't following the RFC; it is quite possible that this is what happened on OmniOS here.

(I can't tell if skadnsd is delivering all received answers to the client).

After the first one, which is a connection/synchronization marker, every write() to the async pipe to the client (10 on HardenedBSD, 9 on OmniOS) is an answer or a sequence of answers. (skadnsd buffers the answers into a textmessage_sender, i.e. a bufalloc, which is flushed at the next ppoll() invocation.) Writes of length 7 are failures (4 bytes length, 2 bytes query id, 1 byte errno); writes of length 14 are 2 reports of failure, you can see it in the string.
28 is 4 failures; 95 and 140 are likely 1 success each (length, query id, 0 for success, then the response packet); 279 is likely two successes. At the end of the traces, we get EOF on 0 while there are still a lot of sockets being polled. That's the client exiting - or at least closing the skadns connection - while some queries are in-flight. The math checks out; it definitely looks like all received answers, positive and negative, have been delivered.

I feel that packet capture tools like tcpdump(1) or OmniOS' snoop(8) would be better suited for answering the questions that have been raised so far (malformed packets, ignored responses, lack of responses, etc.). strace has an option to print full strings. truss should have a similar option (if its display can be trusted...)

You're right that packet capture tools would be good to use in this situation, but since I personally loathe using them, I don't want to ask other people to use them, and I can work with what we have. On HardenedBSD at least, the traces are readable.

Also, aren't 18 outstanding queries in a short amount of time from one single host, like, a lot? Couldn't Shaw's caches think that they are being DoS'ed :P ?

That's definitely possible, and I would say likely, but I don't want to lay the blame on others before making sure we're in the clear. :) -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
Anyway. Pre-update `/package/web/s6-dns/command/s6-dnsip[46] perihelion.ultradian.club` returns the correct response on both machines, even if run after doing the SRV and MX lookups.

Wilder and wilder. Can you test s6-dnsip[46]-filter?

{ echo domain1.org ; echo domain2.org ; ... } | s6-dnsip4-filter

These do A and AAAA queries, but via skadns. If skadnsd is the culprit, the -filter programs should fail.

(side note: I'm realizing that my program makes duplicate queries. This shouldn't impact the accuracy of the responses, but it does mean the caches could be blocking me or something, but not blocking me when I use /package/web/s6-dns/command/s6-dnsip[46].)

Could be. We're trying to build a simple test case that fails. If our simple test cases all pass and your program fails, the cause may be in the way your program is spamming the cache - but you'd have to ask the cache administrators about querying policies to test that hypothesis. -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
s6dns_engine filters answers that do not seem relevant to in-flight queries. That includes malformed answers or ones that do not follow RFC 1035. I was made aware (thanks, Ermine) that some caches fail to set the RD bit in their responses to queries containing the RD bit; these answers were ignored. I just pushed a workaround to the s6-dns git, to only perform the RD check on answers when a "strict" flag is given, which it's not in any of the command-line wrappers or in skadnsd. Can you please try with the latest s6-dns git and see if the answers you're getting on OmniOS are accepted this time? -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
On OmniOS, all the DNS queries (apparently 58) received a response. On HardenedBSD, only the first 4 queries received a response; the next 18 timed out. They were retried 4 additional times, as expected, again timing out without receiving a response.

The fd of the async pipe to the client isn't the same in both outputs: it's 9 on OmniOS and 10 on HardenedBSD, which means the client uses one more fd on HardenedBSD for some reason. (Does OmniOS support signalfd()? That would explain it.)

On HardenedBSD, 4 queries received responses, which were properly reported to the client. The others were pending and retried with longer timeouts, but only 6 of them reported a full timeout to the client. The client exited while 12 queries were technically still in flight.

On OmniOS, I can't even make sense of some of the strings, typically in the async responses to the client. What is the endianness of this machine? A network byte order 32-bit number equal to 3 seems to be encoded as { 0, 0, 3, 0 }, which doesn't look right. (I did check my uint32_bswap() primitive.) If the client isn't complaining very loudly when it receives such strings, it means the strings are correct and the truss tool displays them incorrectly, which doesn't help me diagnose what's going on.

In any case the problems look unrelated to skadnsd and come from the interaction between the s6-dns library and the caches: either the packets are correct and the caches are not sending the responses they should, and that's not an s6-dns problem, or the packets are malformed and that's why the servers are ignoring them, and I need to fix that.

Amelia, could you do some tests (with the same caches) from s6-dns command-line clients such as s6-dnsip4? That will bypass the skadns layer, and will be easier to trace and understand. Thanks :) -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
Neither of those conditions actually apply - my network is up and my resolver is responding (albeit slowly - it takes about a second). I get the expected response on the first batch of queries I fire off, but then the second batch gets ENETUNREACH. This happens every time I run my program (albeit on special snowflake illumos; I have not tried on other OSes). If you think s6-dns is behaving incorrectly, please pastebin a strace (or local equivalent) of skadnsd somewhere, so we can check what it is doing. -- Laurent
Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?
i source spelunked and the story is that, if the error is coming from s6dns_engine_prepare, dt->protostate exceeds or equals 4. I chased that struct member around a few times and I couldn't figure out what it means to s6dns.

dt->protostate is used for two things:

- in UDP mode, to track how many times the query has been sent to the whole list of caches and all of them have failed to answer within a given timeout. The timeout increases for each round.
- in TCP mode, to track how many bytes of the query have been written and how many bytes of the answer have been received (a congested network may result in short writes or reads).

The error you got indeed happens when you're in UDP mode (the starting default for every query), dt->protostate has reached 4 and s6dns_engine_prepare() returns 0 ENETUNREACH, which s6dns_engine_timeout() stores into dt->status and skadnsd then sends back to your client.

What it means is that your query was sent in succession to every cache listed in dt->servers (most likely, the list of "nameserver" entries in your /etc/resolv.conf, unless you overrode it with the DNSCACHEIP environment variable), and every one of them failed to answer within 1 second, then within 3 seconds, then within 11 seconds, then within 45 seconds. That sounds like either your nameserver list is bad, or your own network is down; and s6-dns reports this as ENETUNREACH. -- Laurent