Introduction
------------

In [1], I (probably after many other people with similar ideas) came up
with the idea of a "language that somehow combines execline-like
chainloading and shell-like sequential execution".

[1] <http://skarnet.org/cgi-bin/archive.cgi?2:mss:1130:jgoekjpfpjmfnefbkcob>.

Last week, I set up a production server with s6 / s6-rc as its init / rc
system, and experimented with the above-mentioned idea.  The result is
very encouraging: apart from simple and clear (i.e. not kludgy) shell
scripts, there are now only 3 execline scripts that are not
machine-generated (i.e. automatically produced by s6-rc-compile etc):
stage 1, stage 3 and `.s6-svscan/finish'; and they have not been
rewritten in shell only because of limitations of (perhaps all) current
shell implementations.

This mail is organised in four parts: this introduction, a logical
reassessment of "Why not just use /bin/sh?" [2], a technical analysis of
what needs to be incorporated into a shell to make it also a comfortable
language for most currently execline-specific tasks, and an appendix.

[2] <http://skarnet.org/software/execline/dieshdiedie.html>.

Why shell again
---------------

(Readers not familiar with [2] are strongly advised to read it; even a
cursory skim will be instructive.)

In summary, nearly all the problems of sh(1) criticised in [2] are not
inherent problems of the shell language, but results of cruft
accumulated in the Bourne shell specification.  With a completely
redesigned shell like rc(1) [3] (cf. [4] for the original paper on its
design), these problems no longer exist:

[3] <https://swtch.com/plan9port/man/man1/rc.html>
[4] <http://doc.cat-v.org/plan_9/4th_edition/papers/rc>.

* Parsing: rc(1)'s parser is implemented with yacc; it does not rescan
  its input except when ordered to `eval'.  And you rarely need `eval',
  because `$$key=$val' is legitimate and nested functions are allowed.
  (For instance, in werc [5], also noted in [1], there is no `eval' in
  the rc(1) scripts.)

[5] <http://werc.cat-v.org/>.

* Quoting: the quoting rules in rc(1) are much simpler (one slightly odd
  point is that a literal single quote "'" is written "''" inside quoted
  strings, so a stray "'" needs to be written as "''''"); also, rc(1)'s
  variables are all lists, which makes the language much more
  self-consistent than sh(1).

* Simplicity: note how short [3] is, and how much can be done according
  to the specification; consider how long the manpages (if written by a
  similar writer) of all the execline programs needed to achieve the
  same things would be.

* Performance: in comparison with the CPU time of service programs, that
  of the shell is always a small part (and if not, YDIW; also I remember
  that in a slashdot comment on systemd's developers' tendency to
  consider the shell non-performant, the author notes that even in the
  times of mainframes with resources comparable to outdated mobile
  phones, the shell was rarely a performance bottleneck).  For the
  application scenarios I feel familiar with, I estimate that the
  speedup of the init phase from replacing busybox ash(1) with execline
  would be less than 20%, and usually less than 10%.

* Memory usage: on my server, I found
  > $ for i in $(seq 500); do loopwhilex /bin/true /bin/true & done
  > $ for i in $(seq 100); do rc -c 'while (/bin/true) /bin/true' & done
  and
  > $ for i in $(seq 100); do ash -c \
  > 'while /bin/true; do /bin/true; done' & done
  to take up about 80M, 170M and 110M respectively, so this advantage is
  not really as big as I imagined.  BTW, rc(1) taking the most memory is
  quite surprising; I believe it is because of implementation issues: if
  the code were written by the busybox developers, or by several
  programmers on this mailing list, the memory usage could be much
  lower.

* Unneeded features: do note that busybox is usually distributed as an
  all-in-one binary.  If "do one thing and do it well" at the level of
  binary aggregation were often a concern, separately distributed
  applets would be seen much more often, which is obviously far from
  reality.

* Portability: rc(1) for Unix has at least two variants: plan9port [6]
  and the port currently maintained by Toby Goodwin [7].  I do admit
  that there will be only one execline implementation, at least for a
  long time; but I think fragmentation of a programming language (for
  example the language I propose) will not be an issue as long as the
  community is sufficiently active and coherent.

[6] <https://swtch.com/plan9port/>.
[7] <http://tobold.org/article/rc>.

What do we need
---------------

In [1] there is the statement "the worst use case of execline is when
the shell is used as a programming language".  As I see it, execline is
best described as a language for precise construction of a process tree
by means of chainloading.  Thus, if execline's chainloading programs are
incorporated into the shell, what can be done with execline would also
be achievable with the shell, and usually in an easier manner.

The crux of the problem is that chainloading does not need to be
identified with exec(): an execline script
> modify_proc_attrib1 args1... # Think of `redirfd -w 1 /dev/null'.
> modify_proc_attrib2 args2...
> modify_proc_attrib3 args3...
> ...
is conceptually equivalent to a shell script
> modify_proc_attrib1 args1... # Think of `exec > /dev/null'.
> modify_proc_attrib2 args2...
> modify_proc_attrib3 args3...
> ...
except that:

* The shell script only does the modifications, and does not exec().
* The shell might not support all of `modify_proc_attrib*'.

The latter is the case for the three execline scripts mentioned above:

* stage 1 needs `setsid' and `redirfd -wnb'.
* `.s6-svscan/finish' and stage 3 need `wait -r' and `wait' (both also
  need to wait for processes not in the shell's job table).

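The equivalence described above can be sketched in plain POSIX sh.  This
is only an illustration under stated assumptions, not code from the
attached configuration: the scratch file created with mktemp(1) stands
in for /dev/null so that the effect can be observed, and the variable
FOO is hypothetical.  Each step changes an attribute of the current
process in place, and only the last line chainloads in the exec() sense.

```shell
#!/bin/sh
# Sketch: "chainloading" as in-place modification of process attributes.
# Assumptions: POSIX sh plus mktemp(1); the scratch file and FOO are
# illustrative names only.

out=$(mktemp)    # scratch file standing in for /dev/null

(
    # Step 1: change stdout of the current (sub)shell without exec()ing
    # any program; compare execline's `redirfd -w 1 file prog...'.
    exec > "$out"
    # Step 2: change the working directory; compare `cd dir prog...'.
    cd /
    # Step 3: change the environment; compare `export var value prog...'.
    FOO=bar
    export FOO
    # End of the chain: the final program inherits everything above.
    exec sh -c 'echo "cwd=$(pwd) FOO=$FOO"'
)

result=$(cat "$out")
rm -f "$out"
echo "$result"    # prints: cwd=/ FOO=bar
```

The subshell is there only so that the redirection does not leak into
the calling script; in a real stage script the modifications would apply
to the script's own process.
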
Unfortunately, I am not a system programmer, and my current schedule
does not allow me to spend enough time learning system programming
systematically; but I think that a language that combines the
advantages of shell and execline is not only a concept, but also a
feasible and rewarding goal, which is worth a Unix programmer's
efforts.  Thus I really hope that somebody who is interested in process
supervision and has the resources will try to realise this concept,
probably by incorporating execline utilities into rc(1).

Appendix
--------

Attached is a simplified tarball of the config of my Alpine server, plus
instructions for the setup; a VM is downloadable from [8].  The setup
can be used on production machines, but some issues require notice:

* Alpine's current opensmtpd installation script mkdir(1)s
  /var/spool/mail with 0777 permission [9], which should obviously be
  fixed.
* Symlinks are used extensively in the service definitions to implement
  code reuse (including instanced services); code for the same feature
  would be noticeably longer if written in execline instead of shell.
* Logging is always done as the `nobody' user.  If stronger security is
  pursued, it is advised to assign logging of different services to
  different users, just like OpenBSD and `example/' from the s6-rc
  source distribution do.

[8] <https://drive.google.com/file/d/0B3FGvKEMCkmXb2Q2NXNZNFJPMXM/view>.
[9] <https://bugs.alpinelinux.org/issues/6068>.

The scripts use the rc(1) variant in [7] as provided in the `rc' package
currently in Alpine's `testing' repo.  I used these features that the
plan9port variant does not have:

* `exec' without arguments but with redirections (this incompatibility
  is not mentioned in its manpage).
* `else' instead of the `if not' statement.

I also found some unsatisfying properties of rc(1):

* The [7] implementation has a bug: an `if' block must be enclosed in
  braces if it is to be followed by `else'.
* The lack of `return', `break' and `continue' is annoying (I really
  wonder why Tom Duff [4] considered them "redundant or only marginally
  useful").  The [7] implementation adds back `return' and `break', but
  still lacks `continue'.
* The lack of a PIPESTATUS counterpart (yes, this is also absent from
  busybox ash(1)).  Bash is admittedly bloated, but the functionality of
  PIPESTATUS is really hard to construct from other primitives IMHO.

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C