Re: process supervisor - considerations for docker
The idea is that with a docker-targeted s6 tarball, it should universally work on top of any/all base images.

Just to make things perfectly clear: I am not going to make a special version of s6 just for Docker. I like Docker, I think it's a useful tool, and I'm willing to support it, as in make adjustments to s6 if necessary to ease integration with Docker without impacting other uses; but I draw the line at adding Docker-specific code, because 1. it's yet another slippery slope I'm not treading on, and 2. it would defeat the purpose of Docker. AIUI, Docker images can, and should, be standard images - you should be able to run a Linux kernel with init=YOURENTRYPOINT and it should just run, give or take a few details such as /dev and more generally filesystems. (I don't know what assumptions a docker entrypoint can make: are /proc and /sys mounted? Is /dev mounted? Is / read-only? Can I create filesystems?)

If what Laurent says is true that the s6 packages are entirely self-contained.

It depends on what you mean by self-contained. If you compile and statically link against a libc such as musl, all you need is the execline and s6 binaries. Else, you need a libc.so. And for the build, you also need skalibs.

It's unclear to me what format a distributed docker image takes. Is it source, with instructions on how to build it? Is it a binary image? Or is it possible to distribute both? Binary images are the most practical, *but* they make assumptions about the target architecture. Having to provide an image per target architecture sounds contrary to the purpose of Docker.

* re-write /init from bash to execline
* add the argv[] support for CMD/ENTRYPOINT arguments

I can do that with Gorka, with his permission.

* /init (probably renamed to be 's6-init' or something like that)

Why? An init process is an init process is an init process, no matter the underlying tools. Also, please don't use the s6-init name.
I'm going to need it for a future package and I'd like to avoid possible confusion.

* Can probably start work on the first bullet point (convert /init to execline) during this weekend. Unless anyone else would rather jump in before me and do it. But it seems not.

Hold your horses, Ben Hur. People usually take a break during the weekend; give at least John and Gorka some time to answer. It's John's initial idea and Gorka's image; let them have the final say, and work on it themselves or delegate tasks as they see fit.

* If Laurent wants to push his core s6 releases (including the docker-specific one) onto Github. Then it would be great for him to make a github/s6 org with Gorka, as a new home for 's6', or else a git mirror of the official skarnet.org.

I'm not moving s6's home. I can set up a mirror for s6 on github, but I fail to see how this would be useful - it's not as if the skarnet.org server were overloaded. (The day it's overloaded with s6 pull requests will be a happy day for me.) If it's not about pulling, then what is it about?

(In case you can't tell: I'm not a github fan. Technically, it's a good piece of software, git, at the core, with 2 or 3 layers of all your standard bloated crap piled onto it - and it shows whenever you're trying to interoperate with it. Politically, the account creation procedure makes it very clear that github is about money first. The only thing I like about github is the presentation, which is directly inspired by Google. So, yeah, not much to save.)

-- Laurent
Re: process supervisor - considerations for docker
On 28/02/2015 11:58, Laurent Bercot wrote: (In case you can't tell: I'm not a github fan.)

Meh. At this time, publicity is a good thing for my software, even if 1. it's still a bit early, and 2. I have to use tools I'm not entirely comfortable with. So I set up mirrors of everything on github. https://github.com/s6 in particular. Pull to your heart's content and spread the word. Have fun.

-- Laurent
Re: process supervisor - considerations for docker
On Fri, Feb 27, 2015 at 10:19 AM, Gorka Lertxundi glertxu...@gmail.com wrote: Dreamcat4, pull requests are always welcome!

[snip: the remainder of this message quotes Laurent Bercot's design discussion in full; it appears unabridged later in this thread]
Re: process supervisor - considerations for docker
* Once there are 2+ similar s6 images.
* May be worth consulting Docker Inc employees about official / base image builds on the hub.

Here is an example of why we might benefit from seeking help from Docker Inc:

* Multiple FROM images (multiple inheritance). There should already be an open ticket for this feature (which does not exist in Docker). And it seems relevant to our situation.

Or they could make a feature called flavours as a way to tweak base images. Then that would save us some unnecessary duplication of work. For example:

FROM: ubuntu
FLAVOUR: s6

People could instead do:

FROM: alpine
FLAVOUR: s6

Where FLAVOUR: s6 is just a separate aufs layer (added on top of the base) at the time the image is built. So s6 is just the s6 part, kept independent and separated out from the various base images. Then we would only need to worry about maintaining an 's6' flavour, which is self-contained, bringing everything it needs with it - its own 'execline' and other needed s6 support tools. So not depending upon anything that may or may not be in the base image (including busybox).

Such help from Docker Inc would save us having to maintain many individual copies of various base images. So we should tell them about it, and let them know that! The missing capability of multiple FROM: base images (which I believe is how it is described in current open ticket(s) on docker/docker) is essentially exactly the same idea as this FLAVOUR keyword I have used above ^^. They are interchangeable concepts. I've just called it something else for the sake of being awkward / whatever.
Re: process supervisor - considerations for docker
On Fri, Feb 27, 2015 at 1:10 PM, Dreamcat4 dreamc...@gmail.com wrote:

[snip: quoted FLAVOUR proposal; it appears in full in the previous message]

Oh wait a minute: I'm being a bit slow. We can already use ADD for achieving that sort of thing. Just instead the entry would point to a github URL to get a single tarball from. Gorka is sort-of already doing this… just with 2 separate ones, without his /init included within, which is copied from a local directory etc.

[snip: rest of quoted proposal]
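The ADD approach above can be sketched as a Dockerfile fragment. This is a hypothetical illustration only: the tarball URL, its contents, and the /init entrypoint path are assumptions, not a real release artifact.

```dockerfile
# Overlay a self-contained s6 tarball onto an arbitrary base image.
# The URL below is illustrative; point it at a real release tarball.
FROM alpine
ADD https://example.com/s6-docker-linux-x86_64.tar.gz /
# The tarball is assumed to ship its own /init plus statically linked
# execline and s6 binaries, so nothing is required from the base image
# (not even busybox).
ENTRYPOINT ["/init"]
```

One caveat: Docker auto-extracts a *local* tarball given to ADD, but not one fetched from a remote URL, so a real image would either ADD a local tarball or fetch-and-extract it in a RUN step.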
Re: process supervisor - considerations for docker
Dreamcat4, pull requests are always welcome!

2015-02-27 0:40 GMT+01:00 Laurent Bercot ska-supervis...@skarnet.org:

[snip: Laurent's design discussion, quoted in full; it appears unabridged later in this thread. Gorka's one interjection, right after the quoted syslogd command line: I love puzzles.]
Re: process supervisor - considerations for docker
On Thu, Feb 26, 2015 at 11:40 PM, Laurent Bercot ska-supervis...@skarnet.org wrote:

On 26/02/2015 21:53, John Regan wrote: Besides, the whole idea here is to make an image that follows best practices, and best practices state we should be using a process supervisor that cleans up orphaned processes and stuff. You should be encouraging people to run their programs, interactively or not, under a supervision tree like s6.

The distinction between process and service is key here, and I agree with John.

[long design rant]

There's a lot of software out there that seems built on the assumption that a program should do everything within a single executable, and that processes that fail to address certain issues are incomplete and the program needs to be patched. Under Unix, this assumption is incorrect. Unix is mostly defined by its simple and efficient interprocess communication, so a Unix program is best designed as a *set* of processes, with the right communication channels between them, and the right control flow between those processes. Using Unix primitives the right way allows you to accomplish a task with minimal effort by delegating a lot to the operating system.

This is how I design and write software: to take advantage of the design of Unix as much as I can, to perform tasks with the lowest possible amount of code. This requires isolating basic building blocks, and providing those building blocks as binaries, with the right interface so users can glue them together on the command line.

Take the syslogd service. The rsyslogd way is to have one executable, rsyslogd, that provides the syslogd functionality. The s6 way is to combine several tools to implement syslogd; the functionality already exists, even if it's not immediately apparent.
This command line should do:

pipeline { s6-ipcserver-socketbinder /dev/log s6-envuidgid nobody s6-applyuidgid -Uz s6-ipcserverd ucspilogd } s6-envuidgid syslog s6-applyuidgid -Uz s6-log /var/log/syslogd

Yes, that's one unique command line. The syslogd implementation will take the form of two long-running processes: one listening on /dev/log (the syslogd socket) as user nobody, and spawning a short-lived ucspilogd process for every connection to syslog; and the other writing the logs to the /var/log/syslogd directory as user syslog and performing automatic rotation. (You can configure how and where things are logged by writing a real s6-log script at the end of the command line.)

Of course, in the real world, you wouldn't write that. First, because s6 provides some shortcuts for common operations so the real command lines would be a tad shorter, and second, because you'd want the long-running processes to be supervised, so you'd use the supervision infrastructure and write two short run scripts instead. (And so, to provide syslogd functionality to one client, you'd really have 1 s6-svscan process, 2 s6-supervise processes, 1 s6-ipcserverd process, 1 ucspilogd process and 1 s6-log process. Yes, 6 processes. This is not as insane as it sounds. Processes are not a scarce resource on Unix; the scarce resources are RAM and CPU. The s6 processes have been designed to take *very* little of those, so the total amount of RAM and CPU they all use is still smaller than the amount used by a single rsyslogd process.)

There are good reasons to program this way. Mostly, it amounts to writing as little code as possible. If you look at the source code for every single command that appears on the insane command line above, you'll find that it's pretty short, and short means maintainable - which is the most important quality to have in a codebase, especially when there's just one guy maintaining it.
Using high-level languages also reduces the source code's size, but it adds the interpreter's or run-time system's overhead, and a forest of dependencies. What is then run on the machine is not lightweight by any measure. (Plus, most of those languages are total crap.)

Anyway, my point is that it often takes several processes to provide a service, and that it's a good thing. This practice should be encouraged. So, yes, running a service under a process supervisor is the right design, and I'm happy that John, Gorka, Les and other people have figured it out.

s6 itself provides the process supervision service not as a single executable, but as a set of tools. s6-svscan doesn't do it all, and it's by design. It's just another basic building block. Sure, it's a bit special because it can run as process 1 and is the root of the supervision tree, but that doesn't mean it's a turnkey program - the key lies in how it's used together with other s6 and Unix tools. That's why starting s6-svscan directly as the entrypoint isn't such a good idea. It's much more flexible to run a script as the entrypoint that performs a few basic initialization steps then execs into s6-svscan. Just like you'd do for a real init.
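To make the "two short run scripts" above concrete, here is a hedged sketch of what a matching s6 service directory pair might look like. The service-directory names and the /etc/s6 scan directory are assumptions, and the scripts use #!/bin/sh shebangs for readability, though execline would be the more idiomatic choice:

```sh
# /etc/s6/syslogd/run  (hypothetical service directory)
#!/bin/sh
# Bind the /dev/log socket, drop to user nobody, and spawn one
# short-lived ucspilogd per connection.
exec s6-ipcserver-socketbinder /dev/log \
  s6-envuidgid nobody \
  s6-applyuidgid -Uz \
  s6-ipcserverd ucspilogd

# /etc/s6/syslogd/log/run  (the logger, supervised separately)
#!/bin/sh
# Write and auto-rotate logs in /var/log/syslogd as user syslog.
exec s6-envuidgid syslog \
  s6-applyuidgid -Uz \
  s6-log /var/log/syslogd
```

With the logger placed in the service's log/ subdirectory, the supervision tree connects the two processes over a pipe automatically, which is what the pipeline word did in the one-liner.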
Re: process supervisor - considerations for docker
I think you're better off with:

* Case 1: docker run --entrypoint= image commandline (with or without -ti depending on whether you need an interactive terminal)
* Case 2: docker run image
* Case 3: docker run image commandline (with or without -ti depending on whether you need an interactive terminal)

docker run --entrypoint= -ti image /bin/sh would start a shell without the supervision tree running; docker run -ti image /bin/sh would start a shell with the supervision tree up.

After reading your reasoning, I agree 100% - let -ti drive whether it's interactive, and --entrypoint drive whether there's a supervision tree.

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
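The three cases above turn on how docker composes ENTRYPOINT, CMD and the user's arguments. As a rough illustration, that composition rule can be modeled in a few lines of shell (the function name is mine, and this deliberately ignores exec-form vs. shell-form subtleties):

```sh
# Toy model of docker's rule: the effective command is ENTRYPOINT
# followed by the user's arguments, or by the default CMD when no
# arguments are given.  (Simplified; illustration only.)
effective_argv() {
  ep=$1; cmd=$2; shift 2
  if [ "$#" -gt 0 ]; then
    # user-supplied arguments replace CMD
    set -- $ep "$@"
  else
    # no arguments: fall back to the default CMD
    set -- $ep $cmd
  fi
  printf '%s\n' "$*"
}

effective_argv ''      '/bin/bash'        # null ENTRYPOINT: CMD runs
effective_argv '/init' ''                 # case 2: just /init
effective_argv '/init' '' commandline     # case 3: /init commandline
```

This is why, with ENTRYPOINT = /init and CMD = null, the user never has to type /init: whatever they pass is appended after the entrypoint.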
Re: process supervisor - considerations for docker
On Thu, Feb 26, 2015 at 02:37:23PM +0100, Laurent Bercot wrote:

On 26/02/2015 14:11, John Regan wrote: Just to clarify, docker run spins up a new container, so that wouldn't work for stopping a container. It would just spin up a new container running s6-svscanctl -t service. To stop, you run docker stop <container-id>.

Ha! Shows how much I know about Docker. I believe the idea is sound, though. And definitely implementable.

I figure this is also a good moment to go over ENTRYPOINT and CMD, since that's come up a few times in the discussion.

When you build a Docker image, the ENTRYPOINT is what program you want to run as PID 1 by default. It can be the path to a program, along with some arguments, or it can be null. CMD is really just arguments to your ENTRYPOINT, unless ENTRYPOINT is null, in which case it becomes your effective ENTRYPOINT. At build time, you can specify a default CMD, which is what gets run if no arguments are passed to docker run. When you do 'docker run imagename blah blah blah', the 'blah blah blah' gets passed as arguments to ENTRYPOINT. If you want to specify a different ENTRYPOINT at runtime, you need to use the --entrypoint switch.

So, for example: the default ubuntu image has a null ENTRYPOINT, and the default CMD is /bin/bash. If I run `docker run ubuntu`, then /bin/bash gets executed (and quits immediately, since it doesn't have anything to do). If I run `docker run ubuntu echo hello`, then /bin/echo is executed.

In my Ubuntu baseimage, I made the ENTRYPOINT s6-svscan /etc/s6. In hindsight, this probably wasn't the best idea. If the user types docker run jprjr/ubuntu-baseimage hey there, then the effective command becomes s6-svscan /etc/s6 hey there - which goes against how most other Docker images work.

So, if I pull up Laurent's earlier list of options for the client:

* docker run image commandline - Runs commandline in the image without starting the supervision environment.
* docker run image /init - Starts the supervision environment and lets it run forever.
* docker run image /init commandline - Runs commandline in the fully operational supervision environment. When commandline exits, the supervision environment is stopped and cleaned up.

I'm going to call these case 1 (command with no supervision environment), case 2 (default supervision environment), and case 3 (supervision environment with a provided command). Here's a breakdown of each case's invocation, given a set ENTRYPOINT and CMD:

ENTRYPOINT = null, CMD = /init
* Case 1: `docker run image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image /init commandline`

ENTRYPOINT = /init, CMD = null
* Case 1: `docker run --entrypoint= image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image commandline`

Now, something worth noting is that none of these commands run interactively - to run interactively, you run something like 'docker run -ti ubuntu /bin/sh'. -t allocates a TTY, and -i keeps STDIN open. So, I think the right thing to do is make /init check if there's a TTY allocated and STDIN is open, and if so, just exec into the passed arguments without starting the supervision tree. I'm not going to lie, I don't know the details of how to actually do that. Assuming that's possible, your use cases become this:

ENTRYPOINT = /init (with TTY/STDIN detection), CMD = null
* Case 1: `docker run -ti image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image commandline`

So basically, if you want to run your command interactively with no supervision environment, you just pass '-ti' to 'docker run' like you normally do. If you want it to run under the supervision tree, just don't pass the '-ti' flags. This makes the image work like pretty much every image ever, and the user doesn't ever need to type out /init.

Laurent, how hard is it to check if you're attached to a TTY or not? This is where we start getting into your area of expertise :)

-John
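The TTY check asked about above has a simple answer in shell: test(1)'s -t operator (the C equivalent is isatty(3)). A sketch of the decision logic, with function names of my own invention and an illustrative scan-directory path:

```sh
# Is stdin (fd 0) attached to a terminal?  test(1)'s -t operator
# answers exactly this; in C it would be isatty(0).
stdin_is_tty() {
  [ -t 0 ]
}

# Hypothetical /init dispatch: with a TTY and a command, skip the
# supervision tree and run the command directly; otherwise bring up
# the supervision tree.  (/etc/s6 path is illustrative.)
init_dispatch() {
  if stdin_is_tty && [ "$#" -gt 0 ]; then
    exec "$@"
  fi
  exec s6-svscan /etc/s6
}
```

With this, `docker run -ti image /bin/sh` (TTY allocated) would exec /bin/sh directly, while `docker run image` (no TTY) would fall through to the supervision tree.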
Re: process supervisor - considerations for docker
On Thu, Feb 26, 2015 at 08:23:47PM +, Dreamcat4 wrote: You CANNOT enforce specific ENTRYPOINT + CMD usages amongst docker users. It will never work because too many people use docker in too many different ways. And it does not matter from a technical perspective for the solution I have been quietly thinking of (but not had an opportunity to share yet). It's best to think of ENTRYPOINT (in conventional docker learning before throwing in any /init system) and being the interpreter such as the /bin/sh -c bit that sets up the environment. Like the shebang line. Or could be the python interpreter instead etc. I disagree, and I think your second paragraph actually supports my argument: if you think of ENTRYPOINT as the command for setting up the environment, then it makes sense to use ENTRYPOINT as the method for setting up a supervision tree vs not setting up a supervision tree, because those are two pretty different environments. People use Docker in tons of different ways, sure. But I'm completely able to say this is the entrypoint my image uses, and this is what it does. Besides, the whole idea here is to make an image that follows best practices, and best practices state we should be using a process supervisor that cleans up orphaned processes and stuff. You should be encouraging people to run their programs, interactively or not, under a supervision tree like s6. Heck, most people don't *care* about this kind of thing because they don't even know. So if you just make /init the ENTRYPOINT, 99% of people will probably never even realize what's happening. If they can run `docker run -ti imagename /bin/sh` and get a working, interactive shell, and the container exits when they type exit, then they're good to go! Most won't even question what the image is up to, they'll just continue on getting the benefits of s6 without even realizing it. My suggestion: * /init is launched by docker as the first argument. * init checks for $@. 
If there are any arguments:
* create (from a simple template) an s6 run script
* the run script launches $1 (first arg) as the command to run
* the run script template is written with the remaining args passed to $1
* proceed normally (inspect the s6 config directory as usual!)
* as there should be no breakage of any existing functionality
* providing there is no VOLUME sitting on top of the /etc/s6 config directory
* then the run script is temporary - it will only last while the container is running
* so it won't be there anymore to clean up on future 'docker run' invocations with different arguments

The main thing I'm concerned about is preserving proper shell quoting, because sometimes args can be like --flag='some thing'. One simple way to get proper quoting (in conventional shells like bash) may be to use 'set -x' to echo out the line, as the output is ensured by the interpreter to be re-executable. Although even if that takes care of the quotes, it would still not be good to have accidental variable expansion, interpretation of $, !, etc. Maybe I'm thinking a bit too far ahead. But we already know that Gorka's '/init' script is written in bash.

I think here, you're getting way more caught up in the details of your idea than you need to be. Shells, arguments, quoting, etc - you're overcomplicating some of this stuff.
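For what it's worth, the quoting worry largely disappears if the arguments are never re-serialized through the shell at all. Here is a hedged sketch of the run-script-from-template idea (the helper name and file layout are mine, not Gorka's /init): store one argument per line, and rebuild argv verbatim at run time, so --flag='some thing' survives untouched. Arguments containing embedded newlines are the one case this does not handle.

```sh
# Write a throwaway run script that replays "$@" exactly, with no
# quoting step and no variable expansion along the way.
# Limitation: arguments containing embedded newlines are not supported.
make_run_script() {
  dir=$1; shift
  mkdir -p "$dir"
  printf '%s\n' "$@" > "$dir/cmdline"   # one argument per line, verbatim
  cat > "$dir/run" <<'EOF'
#!/bin/sh
# Rebuild argv from the cmdline file, one line per argument.
set --
while IFS= read -r arg; do
  set -- "$@" "$arg"
done < "${0%/run}/cmdline"
exec "$@"
EOF
  chmod +x "$dir/run"
}
```

For example, make_run_script /tmp/svc echo --flag='some thing' produces a run script that execs echo with --flag=some thing as a single argument, spaces intact.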
Re: process supervisor - considerations for docker
On Thu, Feb 26, 2015 at 8:31 AM, Gorka Lertxundi glertxu...@gmail.com wrote:

Hi, My name is Gorka, not Gornak! It seems like suddenly I discovered that I was born in eastern Europe! hehe :) I'll answer both of you, mixed, so try not to get confused. Let's go.

But Gornak - I must say that your new ubuntu base image really seems *a lot* better than the phusion/baseimage one. It is fantastic and an excellent job you have done there, and you continue to update with new versions of s6, etc. Can't really say thank you enough for that.

Thanks!

I think if anybody were to start up a new baseimage project, Alpine is the way to go, hands-down. Tiny, efficient images.

Wow, I hadn't heard about Alpine Linux. What would differentiate it from, for example, busybox with opkg? https://github.com/progrium/busybox. Busybox is battle-tested, and having a package manager in it seems the right way. The problem with these 'not as mainstream as ubuntu' distros is the smaller community around them. That community discovers things that you probably wouldn't otherwise be aware of: bugfixes, fast security updates, ... . So my main concern about the image is not its size but keeping it easily up-to-date and secure, no matter its size. Even so, although you probably know that, docker stores images incrementally, so that the base image is stored just once and all app-specific images sit on top of it. It is always the result of a compromise between ease of use, size and maintainability.

Great work Gorka for providing these linux x86_64 binaries on Github releases. This was exactly the kind of thing I was hoping for / looking for in regards to that aspect.

As I said in my last email, I'll try to keep them updated.

Right, so I was half-expecting this kind of response (and from John Regan too). In my initial post I could not think of a concise-enough way to demonstrate and explain my reasoning behind that specific request.
At least not without entering into a whole other big long discussion that would have detracted / derailed from some of the other important considerations and discussion points in respect to docker. Basically, without that capability (which I am aware goes against convention for process supervisors that occupy PID 1), you are forcing docker users to choose an XOR (exclusive-OR) between either using s6 process supervision or the ability to specify command line arguments to their docker containers (via ENTRYPOINT and/or CMD). Which essentially is a breakage of those ENTRYPOINT and CMD features of docker. At least that is my understanding of how pretty much all of these process supervisors behave - it is not a criticism levelled at s6 alone, since you would not typically expect this feature anyway (before we had containerisation etc.). It is very docker-specific. Both of you seem to have stated effectively that you don't really see such a pressing reason why it is needed. So it is another thing entirely for me to explain and convince you that being able to use CMD and ENTRYPOINT for specifying command line arguments remains important after adding a process supervisor. There are actually many different reasons why that is desirable (that I can think of right now). But that's another discussion and case for me to make to you. I would be happy to go into that aspect further - perhaps off the mailing list is a better idea, and then to come back here again when that discussion is over and concluded, with a short summary. But I don't want to waste anyone's time, so please reply and indicate if you would really like me to go into more depth with better justifications for why we need that particular feature.

I don't think it must be one or another. With CMD [ /init ] you can:

^^ Okay, so this is what I have been trying to say, but Gorka has put it more elegantly here.
So you kinda have to try to support both.

* start your supervisor by default: docker run your-image
* get access to the container directly without any s6 process started: docker run your-image /bin/bash
* run a custom script and supervise it: docker run your-image /init /your-custom-script

Would appreciate coming back to how we can do this later on, after I have made a more convincing case for why it's actually needed. My naive assumption, not knowing any of s6 yet: simply passing on an argv[] array ought to be possible, and perhaps without too many extra hassles or hoops to jump through.

Would appreciate those use-cases! :-)

To make an overview:

* Containers that provide development tools / dev environments - often that category of docker images takes direct cmd line args.
* Here are some examples of complex single-shot commands that often take command line arguments:
* To run a complex build of something (which may spawn out
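The three invocation modes listed above can all be supported by a small dispatch at the top of the entrypoint. A hypothetical sketch (NOT Gorka's actual /init); it is wrapped in a function here for illustration, and the /etc/s6 scan directory is an assumption:

```shell
# Hypothetical /init dispatch sketch. With no arguments, start the
# supervision tree; with arguments, exec them directly in its place.
init_dispatch() {
  if [ "$#" -eq 0 ]; then
    # docker run your-image  ->  bring up the supervision tree
    exec s6-svscan /etc/s6
  fi
  # docker run your-image /bin/bash        ->  run that command instead
  # docker run your-image /init /my-script ->  re-enters /init with args
  exec "$@"
}
```

In a real image this would be the body of an /init shell (or execline) script, so that CMD and ENTRYPOINT arguments keep working after the supervisor is added.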
Re: process supervisor - considerations for docker
On 26/02/2015 21:53, John Regan wrote: Besides, the whole idea here is to make an image that follows best practices, and best practices state we should be using a process supervisor that cleans up orphaned processes and stuff. You should be encouraging people to run their programs, interactively or not, under a supervision tree like s6. The distinction between process and service is key here, and I agree with John. long design rant There's a lot of software out there that seems built on the assumption that a program should do everything within a single executable, and that processes that fail to address certain issues are incomplete and the program needs to be patched. Under Unix, this assumption is incorrect. Unix is mostly defined by its simple and efficient interprocess communication, so a Unix program is best designed as a *set* of processes, with the right communication channels between them, and the right control flow between those processes. Using Unix primitives the right way allows you to accomplish a task with minimal effort by delegating a lot to the operating system. This is how I design and write software: to take advantage of the design of Unix as much as I can, to perform tasks with the lowest possible amount of code. This requires isolating basic building blocks, and providing those building blocks as binaries, with the right interface so users can glue them together on the command line. Take the syslogd service. The rsyslogd way is to have one executable, rsyslogd, that provides the syslogd functionality. The s6 way is to combine several tools to implement syslogd; the functionality already exists, even if it's not immediately apparent. This command line should do: pipeline s6-ipcserver-socketbinder /dev/log s6-envuidgid nobody s6-applyuidgid -Uz s6-ipcserverd ucspilogd s6-envuidgid syslog s6-applyuidgid -Uz s6-log /var/log/syslogd Yes, that's one unique command line. 
The syslogd implementation will take the form of two long-running processes, one listening on /dev/log (the syslogd socket) as user nobody, and spawning a short-lived ucspilogd process for every connection to syslog; and the other writing the logs to the /var/log/syslogd directory as user syslog and performing automatic rotation. (You can configure how and where things are logged by writing a real s6-log script at the end of the command line.) Of course, in the real world, you wouldn't write that. First, because s6 provides some shortcuts for common operations so the real command lines would be a tad shorter, and second, because you'd want the long-running processes to be supervised, so you'd use the supervision infrastructure and write two short run scripts instead. (And so, to provide syslogd functionality to one client, you'd really have 1 s6-svscan process, 2 s6-supervise processes, 1 s6-ipcserverd process, 1 ucspilogd process and 1 s6-log process. Yes, 6 processes. This is not as insane as it sounds. Processes are not a scarce resource on Unix; the scarce resources are RAM and CPU. The s6 processes have been designed to take *very* little of those, so the total amount of RAM and CPU they all use is still smaller than the amount used by a single rsyslogd process.) There are good reasons to program this way. Mostly, it amounts to writing as little code as possible. If you look at the source code for every single command that appears on the insane command line above, you'll find that it's pretty short, and short means maintainable - which is the most important quality to have in a codebase, especially when there's just one guy maintaining it. Using high-level languages also reduces the source code's size, but it adds the interpreter's or run-time system's overhead, and a forest of dependencies. What is then run on the machine is not lightweight by any measure. (Plus, most of those languages are total crap.) 
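Laurent's "two short run scripts" could look something like the following sketch. The service directory paths, the execlineb location, and the s6-log options are assumptions, not his actual scripts; the second script lives in the service's log/ subdirectory, and the supervision tree connects the service's stdout to the logger's stdin:

```
# /service/syslogd/run  (assumed path; a sketch, not a published script)
#!/bin/execlineb -P
s6-ipcserver-socketbinder /dev/log
s6-envuidgid nobody
s6-applyuidgid -Uz
s6-ipcserverd ucspilogd

# /service/syslogd/log/run  (the logger, running as user syslog)
#!/bin/execlineb -P
s6-envuidgid syslog
s6-applyuidgid -Uz
s6-log /var/log/syslogd
```

This is exactly the 6-process picture described above: s6-svscan at the root, one s6-supervise per run script, then s6-ipcserverd spawning ucspilogd per connection on one side and s6-log on the other.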
Anyway, my point is that it often takes several processes to provide a service, and that it's a good thing. This practice should be encouraged. So, yes, running a service under a process supervisor is the right design, and I'm happy that John, Gorka, Les and other people have figured it out. s6 itself provides the process supervision service not as a single executable, but as a set of tools. s6-svscan doesn't do it all, and it's by design. It's just another basic building block. Sure, it's a bit special because it can run as process 1 and is the root of the supervision tree, but that doesn't mean it's a turnkey program - the key lies in how it's used together with other s6 and Unix tools. That's why starting s6-svscan directly as the entrypoint isn't such a good idea. It's much more flexible to run a script as the entrypoint that performs a few basic initialization steps then execs into s6-svscan. Just like you'd do for a real init. :) /long design rant Heck, most people don't *care* about this kind of thing because they don't even know. So if you just make /init the ENTRYPOINT, 99%
Re: process supervisor - considerations for docker
Hi Dreamcat4 - First things first - I can't stress enough how awesome it is to know people are using/talking about my Docker images, blog posts, and so on. Too cool! I've responded to your concerns/questions/etc throughout the email below. -John On Wed, Feb 25, 2015 at 11:32:37AM +, Dreamcat4 wrote: Thank you for moving my message Laurent. Sorry for the mixup re: the mailing lists. I have subscribed to the correct list now (for s6 specific). On Wed, Feb 25, 2015 at 11:30 AM, Laurent Bercot ska-skaw...@skarnet.org wrote: (Moving the discussion to the supervision@list.skarnet.org list. The original message is quoted below.) Hi Dreamcat4, Thanks for your detailed message. I'm very happy that s6 found an application in docker, and that there's such an interest for it! skaw...@list.skarnet.org is indeed the right place to reach me and discuss the software I write, but for s6 in particular and process supervisors in general, supervision@list.skarnet.org is the better place - it's full of people with process supervision experience. Your message gives a lot of food for thought, and I don't have time right now to give it all the attention it deserves. Tonight or tomorrow, though, I will; and other people on the supervision list will certainly have good insights. Cheers! -- Laurent On 25/02/2015 11:55, Dreamcat4 wrote: Hello, Now there is someone (John Regan) who has made s6 images for docker, and written a blog post about it. Which is a great effort - and the reason I've come here. But it gives me a taste of wanting more: something a bit more foolproof, and simpler, to work specifically inside of docker. From that blog post I get a general impression that s6 has many advantages, and may be a good candidate for docker. But I would be remiss not to ask the developers of s6 themselves to take some kind of personal interest in considering how s6 might best work inside of docker specifically.
I hope that this is the right mailing list to reach s6 developers / discuss such matters. Is this the correct mailing list for s6 dev discussions? I've read and read around the subject of process supervision inside docker. Various people explain how or why they use various different process supervisors in docker (not just s6). None of them really quite seems ideal; I would like to be wrong about that, but nothing has fully convinced me so far. Perhaps it is a fair criticism to say that I still have a lot more to learn in regards to process supervisors, but I have no interest in getting bogged down by that. To me, I already know more-or-less enough about how docker manages (or rather mis-manages!) its container processes to have an opinion about what is needed, from a docker-sided perspective. And I know enough that the docker project itself won't fix these issues - for one thing because of not owning what's running on the inside of containers, and also because of their single-process viewpoint on things. Anyway, that kind of political nonsense doesn't matter for our discussion. I just want to have a technical discussion about what is needed, and what might be the best way to solve the problem! MY CONCERNS ABOUT USING S6 INSIDE OF DOCKER In regard to s6 only, these are my currently perceived shortcomings when using it in docker: * it's not clear how to pass in program arguments via CMD and ENTRYPOINT in docker - in fact I have not seen ANY docker process supervisor solutions show how to do this (except perhaps phusion base image) To be honest, I just haven't really done that. I usually use environment variables to set up my services. For example, if I have a NodeJS service, I'll run something like `docker run -e NODEJS_SCRIPT=myapp.js some-nodejs-image`. Then in my NodeJS `run` script, I'd check if that environment variable is defined and use it as my argument to NodeJS.
I'm just making up this bit of shell code on the fly, it might have syntax errors, but you should get the idea:

```
if [ -n "$NODEJS_SCRIPT" ]; then
  exec node "$NODEJS_SCRIPT"
else
  printf 'NODEJS_SCRIPT undefined\n' >&2
  exit 1
fi
```

Another option is to write a script to use as an entrypoint that handles command arguments, then execs into s6-svscan. * it is not clear if ENV vars are preserved. That is also something essential for docker. In my experience, they are. If you use s6-svscan as your entrypoint (like I do in my images), then define environment variables via docker's -e switch, they'll be preserved and available in each service's `run` script, just like in my NodeJS example above. * s6 has many utilities s6-* - not clear which ones are actually required for making a docker process supervisor The only *required* programs are the ones in the main s6 and execline packages. * s6 not available yet as .deb or .rpm package -
Re: process supervisor - considerations for docker
On Wed, Feb 25, 2015 at 03:58:07PM +0100, Gorka Lertxundi wrote: Hello, After that great post of John's, I tried to solve exactly the same problems. I created my own base image, based primarily on John's and Phusion's base images. That's awesome - I get so excited when I hear somebody's actually read, digested, and taken action based on something I wrote. So cool! :) See my thoughts below. 2015-02-25 12:30 GMT+01:00 Laurent Bercot ska-skaw...@skarnet.org: [...]
Re: process supervisor - considerations for docker
(Moving the discussion to the supervision@list.skarnet.org list. The original message is quoted below.) Hi Dreamcat4, Thanks for your detailed message. I'm very happy that s6 found an application in docker, and that there's such an interest for it! skaw...@list.skarnet.org is indeed the right place to reach me and discuss the software I write, but for s6 in particular and process supervisors in general, supervision@list.skarnet.org is the better place - it's full of people with process supervision experience. Your message gives a lot of food for thought, and I don't have time right now to give it all the attention it deserves. Tonight or tomorrow, though, I will; and other people on the supervision list will certainly have good insights. Cheers! -- Laurent On 25/02/2015 11:55, Dreamcat4 wrote: Hello, Now there is someone (John Regan) who has made s6 images for docker, and written a blog post about it. Which is a great effort - and the reason I've come here. But it gives me a taste of wanting more: something a bit more foolproof, and simpler, to work specifically inside of docker. From that blog post I get a general impression that s6 has many advantages, and may be a good candidate for docker. But I would be remiss not to ask the developers of s6 themselves to take some kind of personal interest in considering how s6 might best work inside of docker specifically. I hope that this is the right mailing list to reach s6 developers / discuss such matters. Is this the correct mailing list for s6 dev discussions? I've read and read around the subject of process supervision inside docker. Various people explain how or why they use various different process supervisors in docker (not just s6). None of them really quite seems ideal; I would like to be wrong about that, but nothing has fully convinced me so far. Perhaps it is a fair criticism to say that I still have a lot more to learn in regards to process supervisors.
But I have no interest in getting bogged down by that. To me, I already know more-or-less enough about how docker manages (or rather mis-manages!) its container processes to have an opinion about what is needed, from a docker-sided perspective. And I know enough that the docker project itself won't fix these issues - for one thing because of not owning what's running on the inside of containers, and also because of their single-process viewpoint on things. Anyway, that kind of political nonsense doesn't matter for our discussion. I just want to have a technical discussion about what is needed, and what might be the best way to solve the problem! MY CONCERNS ABOUT USING S6 INSIDE OF DOCKER In regard to s6 only, these are my currently perceived shortcomings when using it in docker: * it's not clear how to pass in program arguments via CMD and ENTRYPOINT in docker - in fact I have not seen ANY docker process supervisor solutions show how to do this (except perhaps phusion base image) * it is not clear if ENV vars are preserved. That is also something essential for docker. * s6 has many utilities s6-* - not clear which ones are actually required for making a docker process supervisor * s6 not available yet as .deb or .rpm package - official packages are helpful because on different distros: + standard locations where to put config files and so on may differ. + to install man pages too, in the right place * s6 is not available as an official single pre-compiled binary file for download via wget or curl - which would be the most ideal way to install it into a docker container ^^ Some of these perceived shortcomings are more important / significant than others! Some are not in the remit of s6 development to be concerned about. Some are mild nit-picking, or the ignorance of not knowing, having not actually tried out s6 before.
But my general point is that it is not clear enough to me (from my perspective) whether s6 can actually satisfy all of the significant docker-specific considerations. Which I have not properly stated yet, so here they are listed below…

DOCKER-SPECIFIC CONSIDERATIONS FOR A PROCESS SUPERVISOR

A good process supervisor for docker should ideally:

* be a single pre-compiled binary program file, that can be downloaded by curl/wget (or can be installed from .deb or .rpm)
* be able to directly take a command and arguments, with argv[] like this: process_supervisor my_program_or_script my program or script arguments…
* pass on all ENV vars to my_program_or_script faithfully
* run as PID 1 inside the linux namespace
* where my_program_or_script may spawn BOTH child AND non-child (orphaned) processes
* when process_supervisor (e.g. s6 or whatever) receives a TERM signal:
  * it faithfully passes that signal to my_program_or_script
  * it also passes that signal to any orphaned non-child processes too
* when my_program_or_script dies, or exits:
  * clean up ALL remaining non-children
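As a point of comparison only (this is NOT s6, and not a real PID 1), the TERM-forwarding requirement above can be sketched in a few lines of plain sh; a real supervisor must also reap orphaned processes, which this sketch cannot do. The function form is for illustration:

```shell
# Sketch of the signal-forwarding requirement only. Runs "$@" as a child,
# forwards TERM/INT to it, and returns the child's exit status.
# (A robust version would loop and re-wait if wait is interrupted by the
# trap; orphan reaping needs real PID-1 machinery and is not attempted.)
run_and_forward() {
  "$@" &
  child=$!
  trap 'kill -TERM "$child" 2>/dev/null' TERM INT
  wait "$child"
  return $?
}
```

Faithfully preserving argv[] and the environment falls out for free here, since "$@" is exec'd untouched; it is the signal and reaping semantics that make a real supervisor necessary.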