Re: process supervisor - considerations for docker

2015-02-28 Thread Laurent Bercot

The idea is that with a docker-targeted s6 tarball, it should
universally work on top of any base image.


 Just to make things perfectly clear: I am not going to make a
special version of s6 just for Docker. I like Docker, I think it's
a useful tool, and I'm willing to support it, as in make adjustments
to s6 if necessary to ease integration with Docker without impacting
other uses; but I draw the line at adding specific code for it,
because 1. it's yet another slippery slope I'm not treading on, and
2. it would defeat the purpose of Docker. AIUI, Docker images can,
and should, be standard images - you should be able to run a Linux
kernel with init=YOURENTRYPOINT and it should just run, give or take
a few details such as /dev and more generally filesystems. (I don't
know what assumption a docker entrypoint can make: are /proc and
/sys mounted ? is /dev mounted ? is / read-only ? can I create
filesystems ?)
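
 A minimal way to answer those questions empirically is a throwaway
probe entrypoint. A sketch, assuming a POSIX sh and that mountpoint(1)
from util-linux or busybox is present:

```
#!/bin/sh
# Probe what a docker entrypoint can actually assume about its environment.
for d in /proc /sys /dev; do
    if mountpoint -q "$d" 2>/dev/null; then
        echo "$d: mounted"
    else
        echo "$d: not mounted"
    fi
done
# Is / writable? Try to create (then remove) a scratch file.
if touch /.probe.$$ 2>/dev/null; then
    rm -f "/.probe.$$"
    echo "/: writable"
else
    echo "/: read-only"
fi
```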



If what Laurent says is true, then the s6 packages are entirely
self-contained.


 It depends on what you mean by self-contained. If you compile and
statically link against a libc such as musl, all you need is the
execline and s6 binaries. Else, you need a libc.so. And for the
build, you also need skalibs.
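
 (For reference, a sketch of the from-source route, with default
configure options; the exact flags for static linking vary, so check
each package's own docs:)

```
# Build order matters: skalibs first, then execline, then s6.
for pkg in skalibs execline s6; do
    ( cd "$pkg" && ./configure && make && make install ) || exit 1
done
```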
 It's unclear to me what format a distributed docker image takes.
Is it source, with instructions how to build it ? Is it a binary
image ? Or is it possible to distribute both ?
 Binary images are the most practical, *but* they make assumptions
on the target architecture. Having to provide an image per target
architecture sounds contrary to the purpose of Docker.



* re-write /init from bash to execline
* add the argv[] support for CMD/ENTRYPOINT arguments


 I can do that with Gorka, with his permission.



   * /init (probably renamed to be 's6-init' or something like that)


 Why ? An init process is an init process is an init process, no
matter the underlying tools.
 Also, please don't use the s6-init name. I'm going to need it for
a future package and I'd like to avoid possible confusions.



* Can probably start work on the first bullet point (convert /init
to execline) during this weekend. Unless anyone else would rather jump
in before me and do it. But it seems not.


 Hold your horses, Ben Hur. People usually take a break during the
weekend; give at least John and Gorka some time to answer. It's
John's initial idea and Gorka's image, let them have the final say,
and work on it themselves or delegate tasks as they see fit.



* If Laurent wants to push his core s6 releases (including the docker
specific one) onto Github, then it would be great for him to make a
github/s6 org with Gorka, as a new home for 's6', or else a git mirror
of the official skarnet.org.


 I'm not moving s6's home. I can set up a mirror for s6 on github, but
I fail to see how this would be useful - it's not as if the skarnet.org
server was overloaded. (The day it's overloaded with s6 pull requests
will be a happy day for me.)
 If it's not about pulling, then what is it about ?

 (In case you can't tell: I'm not a github fan. Technically, it's a
good piece of software, git, at the core, with 2 or 3 layers of all your
standard bloated crap piled onto it - and it shows whenever you're
trying to interoperate with it. Politically, the account creation
procedure makes it very clear that github is about money first. The
only thing I like about github is the presentation, which is directly
inspired by Google. So, yeah, not much to save.)

--
 Laurent



Re: process supervisor - considerations for docker

2015-02-28 Thread Laurent Bercot

On 28/02/2015 11:58, Laurent Bercot wrote:

  (In case you can't tell: I'm not a github fan.


 Meh. At this time, publicity is a good thing for my software,
even if 1. it's still a bit early, and 2. I have to use tools
I'm not entirely comfortable with. So I set up mirrors of
everything on github.
 https://github.com/s6 in particular.
 Pull to your heart's content and spread the word. Have fun.

--
 Laurent



Re: process supervisor - considerations for docker

2015-02-27 Thread Dreamcat4
On Fri, Feb 27, 2015 at 10:19 AM, Gorka Lertxundi glertxu...@gmail.com wrote:
 Dreamcat4, pull requests are always welcome!

 2015-02-27 0:40 GMT+01:00 Laurent Bercot ska-supervis...@skarnet.org:

 [snip: quoted text of Laurent's design rant and Gorka's replies; the rant appears in full in Laurent's 2015-02-26 message below]

Re: process supervisor - considerations for docker

2015-02-27 Thread Dreamcat4
 * Once there are 2+ similar s6 images.
   * May be worth to consult Docker Inc employees about official / base
 image builds on the hub.

Here is an example of why we might benefit from seeking help from Docker Inc:

* Multiple FROM images (multiple inheritance).

There should already be an open ticket for this feature (which does
not exist in Docker). And it seems relevant to our situation.

Or they could make a feature called flavours as a way to tweak
base images. Then that would save us some unnecessary duplication of
work.

For example:

FROM: ubuntu
FLAVOUR: s6

People could instead do:

FROM: alpine
FLAVOUR: s6

Where FLAVOUR: s6 is just a separate aufs layer (added on top of the
base) at the time the image is built. So s6 is just the s6 part, kept
independent and separated out from the various base images.

Then we would only need to worry about maintaining an 's6' flavour,
which is self-contained, bringing everything it needs with it - its
own 'execline' and other needed s6 support tools. So it would not depend
upon anything that may or may not be in the base image (including
busybox).

Such help from Docker Inc would save us having to maintain many
individual copies of various base images. So we should tell them about
it, and let them know that!

The missing capability of multiple FROM: base images (which I believe
is how it is described in the current open ticket(s) on docker/docker) is
essentially the same idea as this FLAVOUR keyword I have used
above ^^. They are interchangeable concepts. I've just called it
something else for the sake of being awkward / whatever.


Re: process supervisor - considerations for docker

2015-02-27 Thread Dreamcat4
On Fri, Feb 27, 2015 at 1:10 PM, Dreamcat4 dreamc...@gmail.com wrote:
 [snip: quoted text of the FLAVOUR proposal from the previous message]

Oh wait a minute: I'm being a little slow. We can already use ADD
to achieve that sort of thing. The entry would instead point to a
github URL to get a single tarball from. Gorka is sort-of already
doing this… just with 2 separate tarballs, and without his /init
included within, which is copied from a local directory etc.
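
One wrinkle worth noting: docker's ADD auto-extracts local tar archives,
but it does not extract archives fetched from a remote URL, so the closer
equivalent of this idea is a RUN line carrying something like the
following (URL and tarball layout are hypothetical, for illustration
only):

```
# Fetch a prebuilt s6 tarball and unpack it over / (made-up URL).
curl -L https://github.com/example/s6-tarball/releases/download/v0.1/s6-linux-amd64.tar.gz \
    | tar -xz -C /
```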



Re: process supervisor - considerations for docker

2015-02-27 Thread Gorka Lertxundi
Dreamcat4, pull requests are always welcome!

2015-02-27 0:40 GMT+01:00 Laurent Bercot ska-supervis...@skarnet.org:

 [snip: quoted design rant, up to the one-line syslogd command; see Laurent's 2015-02-26 message below]


I love puzzles.


 [snip: remainder of the quoted rant; this message is truncated in the archive]

Re: process supervisor - considerations for docker

2015-02-27 Thread Dreamcat4
On Thu, Feb 26, 2015 at 11:40 PM, Laurent Bercot
ska-supervis...@skarnet.org wrote:
 [snip: Laurent's design rant, quoted in full; see his 2015-02-26 message below. Dreamcat4's reply itself is truncated in the archive]

Re: process supervisor - considerations for docker

2015-02-26 Thread John Regan


  I think you're better off with:

  * Case 1 : docker run --entrypoint= image commandline
(with or without -ti depending on whether you need an interactive
terminal)
  * Case 2 : docker run image
  * Case 3: docker run image commandline
(with or without -ti depending on whether you need an interactive
terminal)

  docker run --entrypoint= -ti image /bin/sh
would start a shell without the supervision tree running.

  docker run -ti image /bin/sh
would start a shell with the supervision tree up.


After reading your reasoning, I agree 100% - let -ti drive whether it's 
interactive, and --entrypoint drive whether there's a supervision tree. 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: process supervisor - considerations for docker

2015-02-26 Thread John Regan
On Thu, Feb 26, 2015 at 02:37:23PM +0100, Laurent Bercot wrote:
 On 26/02/2015 14:11, John Regan wrote:
 Just to clarify, docker run spins up a new container, so that wouldn't 
 work for stopping a container. It would just spin up a new container running 
 s6-svscanctl -t service
 
 To stop, you run docker stop with the container id.
 
  Ha! Shows how much I know about Docker.
  I believe the idea is sound, though. And definitely implementable.
 

I figure this is also a good moment to go over ENTRYPOINT and CMD,
since that's come up a few times in the discussion.

When you build a Docker image, the ENTRYPOINT is the program you want
to run as PID 1 by default. It can be the path to a program, along with some
arguments, or it can be null.

CMD is really just arguments to your ENTRYPOINT, unless ENTRYPOINT is
null, in which case it becomes your effective ENTRYPOINT.

At build-time, you can specify a default CMD, which is what gets run
if no arguments are passed to docker run.

When you do 'docker run imagename blah blah blah', the 'blah blah
blah' gets passed as arguments to ENTRYPOINT. If you want to specify a
different ENTRYPOINT at runtime, you need to use the --entrypoint
switch.

So, for example: the default ubuntu image has a null ENTRYPOINT, and
the default CMD is /bin/bash. If I run `docker run ubuntu`, then
/bin/bash gets executed (and quits immediately, since it doesn't have
anything to do).

If I run `docker run ubuntu echo hello`, then /bin/echo is executed.

In my Ubuntu baseimage, I made the ENTRYPOINT s6-svscan /etc/s6. In
hindsight, this probably wasn't the best idea. If the user types
docker run jprjr/ubuntu-baseimage hey there, then the effective
command becomes s6-svscan /etc/s6 hey there - which goes against how
most other Docker images work.

So, if I pull up Laurent's earlier list of options for the client:

 * docker run image commandline
   Runs commandline in the image without starting the supervision environment.
 * docker run image /init
   Starts the supervision environment and lets it run forever.
 * docker run image /init commandline
   Runs commandline in the fully operational supervision environment. When
commandline exits, the supervision environment is stopped and cleaned up.

I'm going to call these case 1 (command with no supervision environment), case
2 (default supervision environment), and case 3 (supervision environment with
a provided command).

Here's a breakdown of each case's invocation, given a set ENTRYPOINT and CMD:

ENTRYPOINT = null, CMD = /init

* Case 1: `docker run image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image /init commandline`

ENTRYPOINT = /init, CMD = null

* Case 1: `docker run --entrypoint= image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image commandline`

Now, something worth noting is that none of these commands run interactively -
to run interactively, you run something like 'docker run -ti ubuntu /bin/sh'.

-t allocates a TTY, and -i keeps STDIN open.

So, I think the right thing to do is make /init check if there's a TTY
allocated and that STDIN is open, and if so, just exec into the passed
arguments without starting the supervision tree.

I'm not going to lie, I don't know the details of how to actually do that.
Assuming that's possible, your use cases become this:

ENTRYPOINT = /init (with TTY/STDIN detection), CMD = null

* Case 1: `docker run -ti image commandline`
* Case 2: `docker run image`
* Case 3: `docker run image commandline`

So basically, if you want to run your command interactively with no supervision
environment, you just pass '-ti' to 'docker run' like you normally do. If
you want it to run under the supervision tree, just don't pass the '-ti'
flags. This makes the image work like pretty much every image ever, and
the user doesn't ever need to type out /init.
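
A rough sketch of that detection, assuming a POSIX-sh /init and the
/etc/s6 scan directory used elsewhere in this thread (it deliberately
ignores the supervised-command case for brevity):

```
#!/bin/sh
# If stdin is attached to a terminal (docker run -ti) and a command was
# given, run it directly with no supervision tree.
if [ -t 0 ] && [ "$#" -gt 0 ]; then
    exec "$@"
fi
# Otherwise, hand PID 1 over to the supervision tree.
exec s6-svscan /etc/s6
```

(test -t 0 is the portable way to ask whether a file descriptor is
attached to a terminal.)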

Laurent, how hard is it to check if you're attached to a TTY or not? This is
where we start getting into your area of expertise :)

-John


Re: process supervisor - considerations for docker

2015-02-26 Thread John Regan
On Thu, Feb 26, 2015 at 08:23:47PM +, Dreamcat4 wrote:
 You CANNOT enforce specific ENTRYPOINT + CMD usages amongst docker
 users. It will never work because too many people use docker in too
 many different ways. And it does not matter from a technical
 perspective for the solution I have been quietly thinking of (but not
 had an opportunity to share yet).
 
 It's best to think of ENTRYPOINT (in conventional docker learning,
 before throwing in any /init system) as being the interpreter, such
 as the /bin/sh -c bit that sets up the environment. Like the shebang
 line. Or it could be the python interpreter instead, etc.

I disagree, and I think your second paragraph actually supports my
argument: if you think of ENTRYPOINT as the command for setting up the
environment, then it makes sense to use ENTRYPOINT as the method for
setting up a supervision tree vs not setting up a supervision tree,
because those are two pretty different environments.

People use Docker in tons of different ways, sure. But I'm completely
able to say this is the entrypoint my image uses, and this is what it
does.

Besides, the whole idea here is to make an image that follows best
practices, and best practices state we should be using a process
supervisor that cleans up orphaned processes and stuff. You should be
encouraging people to run their programs, interactively or not, under
a supervision tree like s6.

Heck, most people don't *care* about this kind of thing because they
don't even know. So if you just make /init the ENTRYPOINT, 99% of
people will probably never even realize what's happening. If they can
run `docker run -ti imagename /bin/sh` and get a working, interactive
shell, and the container exits when they type exit, then they're
good to go! Most won't even question what the image is up to, they'll
just continue on getting the benefits of s6 without even realizing it.

 
 My suggestion:
 
 * /init is launched by docker as the first argument.
 * init checks for $@. If there are any arguments:
 
  * create (from a simple template) an s6 run script
* run script launches $1 (first arg) as the command to run
  * run script template is written with the remaining args passed to $1
 
  * proceed normally (inspect the s6 config directory as usual!)
* as there should be no breakage of all existing functionality
 
 * Providing there is no VOLUME sitting on top of the /etc/s6 config directory
 * Then the run script is temporary - it will only last while the
 container is running.
* So it won't be there anymore to clean up on any future 'docker run'
 invocations with different arguments.
 
 The main thing I'm concerned about is preserving proper shell
 quoting, because sometimes args can be like --flag='some thing'.
 
 One simple way to get proper quoting (in conventional shells like
 bash) may be to use 'set -x' to echo out the line, since the output is
 ensured by the interpreter to be re-executable. Although even if that
 takes care of the quotes, it would still not be good to have
 accidental variable expansion, interpretation of $ ! etc. Maybe I'm
 thinking a bit too far ahead. But we already know that Gorka's '/init'
 script is written in bash.

I think here, you're getting way more caught up in the details of your
idea than you need to be. Shells, arguments, quoting, etc, you're
overcomplicating some of this stuff.
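
For what it's worth, the quoting half of that idea is fairly mechanical.
A sketch, assuming bash (which Gorka's /init already uses) and a
hypothetical /etc/s6/cmd service directory:

```
#!/bin/bash
# If /init received arguments, generate a run script that re-execs them.
# printf %q re-quotes each argument (escaping spaces, $, !, quotes) so
# the generated line is safe to re-execute without accidental expansion.
if [ "$#" -gt 0 ]; then
    svc=/etc/s6/cmd
    mkdir -p "$svc"
    {
        echo '#!/bin/bash'
        printf 'exec'
        for arg in "$@"; do
            printf ' %q' "$arg"
        done
        echo
    } > "$svc/run"
    chmod +x "$svc/run"
fi
exec s6-svscan /etc/s6
```

Whether generating services on the fly is a good idea is a separate
question; the sketch only shows that the quoting concern has an answer.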


Re: process supervisor - considerations for docker

2015-02-26 Thread Dreamcat4
On Thu, Feb 26, 2015 at 8:31 AM, Gorka Lertxundi glertxu...@gmail.com wrote:
 Hi,

 My name is Gorka, not Gornak! It seems like I suddenly discovered that I was
 born in Eastern Europe! hehe :)

 I'll answer to both of you, mixed, so try not to get confused.

 Lets go,

  But Gornak - I must say that your new ubuntu base image really seems *a
  lot* better than the phusion/baseimage one. It is fantastic, an
  excellent job you have done there, and you continue to update it with new
  versions of s6, etc. Can't really say thank you enough for that.


 Thanks!

 I think if anybody were to start up a new baseimage project, Alpine is
 the way to go, hands-down. Tiny, efficient images.


 Wow, I hadn't heard about Alpine Linux. What would differentiate it from, for
 example, busybox with opkg? https://github.com/progrium/busybox. Busybox is
 battle-tested, and having a package manager in it seems the right way.

 The problem with these 'not as mainstream as ubuntu' distros is the smaller
 community around them. That community discovers things that you probably
 wouldn't be aware of: bugfixes, fast security updates, ... . So my main
 concern about the image is not its size but keeping it easily up to
 date and secure, no matter its size. Even so, although you
 probably know this, docker stores images incrementally, so that the
 base image is stored only once and all app-specific images sit on top of
 this image.

 It is always the result of a compromise between ease of use, size and
 maintainability.


  Great work Gorka for providing these linux x86_64 binaries on Github
  releases.
  This was exactly the kind of thing I was hoping for / looking for in
  regards to that aspect.


 As I said in my last email, I'll try to keep them updated.

  Right, so I was half-expecting this kind of response (and from John
  Regan too). In my initial post I could not think of a concise-enough
  way to demonstrate and explain my reasoning behind that specific
  request. At least not without entering into a whole other big long
  discussion that would have detracted / derailed from some of the other
  important considerations and discussion points in respect to docker.
 
  Basically, without that capability (which I am aware goes against
  convention for process supervisors that occupy pid 1), you are
  forcing docker users to choose an XOR (exclusive-OR) between either
  using s6 process supervision or the ability to specify command line
  arguments to their docker containers (via ENTRYPOINT and/or CMD).
  Which essentially is a breakage of those ENTRYPOINT and CMD
  features of docker. At least that is my understanding of how pretty
  much all of these process supervisors behave, and it is not a
  criticism levelled at s6 alone, since you would not typically expect
  this feature anyway (before we had containerisation etc.). It is very
  docker-specific.
 
  Both of you seem to have stated, effectively, that you don't really see
  such a pressing reason why it is needed.
 
  So then it's another thing entirely for me to explain why, and to
  convince you guys that there are good reasons why being able to
  continue to use CMD and ENTRYPOINT for specifying command line
  arguments remains important after adding a process supervisor.
  There are actually many different reasons why that is
  desirable (that I can think of right now). But that's another
  discussion and case for me to make to you.
 
  I would be happy to go into that aspect further. Perhaps off the
  mailing list is a better idea, to then come back here again when that
  discussion is over and concluded, with a short summary. But I don't
  want to waste anyone's time, so please reply and indicate if you would
  really like for me to go into more depth with better justifications
  for why we need that particular feature.


 I don't think it must be one or the other. With CMD ["/init"] you can:

^^ Okay, so this is what I have been trying to say, but Gorka has put it
more elegantly here. So you kinda have to try to support both.

 * start your supervisor by default: docker run your-image
 * get access to the container directly without any s6 process started:
 docker run your-image /bin/bash
 * run a custom script and supervise it: docker run your-image /init
 /your-custom-script


  Would appreciate coming back to how we can do this later on, after I
  have made a more convincing case for why it's actually needed. My
  naive assumption, not knowing any of s6 yet: it should be that simply
  passing on an argv[] array ought to be possible. And perhaps without
  too many extra hassles or hoops to jump through.


 Would appreciate those use-cases! :-)

To give an overview:

* Containers that provide development tools / dev environments - often
that category of docker images takes direct cmd line args.
  * Here are some examples of complex single-shot commands that often
take command line arguments:
* To run a complex build of something (which may spawn out 

Re: process supervisor - considerations for docker

2015-02-26 Thread Laurent Bercot

On 26/02/2015 21:53, John Regan wrote:

Besides, the whole idea here is to make an image that follows best
practices, and best practices state we should be using a process
supervisor that cleans up orphaned processes and stuff. You should be
encouraging people to run their programs, interactively or not, under
a supervision tree like s6.


 The distinction between process and service is key here, and I
agree with John.

long design rant
 There's a lot of software out there that seems built on the assumption that
a program should do everything within a single executable, and that processes
that fail to address certain issues are incomplete and the program needs to
be patched.

 Under Unix, this assumption is incorrect. Unix is mostly defined by its
simple and efficient interprocess communication, so a Unix program is best
designed as a *set* of processes, with the right communication channels
between them, and the right control flow between those processes. Using
Unix primitives the right way allows you to accomplish a task with minimal
effort by delegating a lot to the operating system.

 This is how I design and write software: to take advantage of the design
of Unix as much as I can, to perform tasks with the lowest possible amount
of code.
 This requires isolating basic building blocks, and providing those building
blocks as binaries, with the right interface so users can glue them
together on the command line.

 Take the syslogd service. The rsyslogd way is to have one executable,
rsyslogd, that provides the syslogd functionality. The s6 way is to combine
several tools to implement syslogd; the functionality already exists, even
if it's not immediately apparent. This command line should do:

 pipeline { s6-ipcserver-socketbinder /dev/log s6-envuidgid nobody s6-applyuidgid -Uz
s6-ipcserverd ucspilogd } s6-envuidgid syslog s6-applyuidgid -Uz s6-log
/var/log/syslogd

 Yes, that's one unique command line. The syslogd implementation will take
the form of two long-running processes, one listening on /dev/log (the
syslogd socket) as user nobody, and spawning a short-lived ucspilogd process
for every connection to syslog; and the other writing the logs to the
/var/log/syslogd directory as user syslog and performing automatic rotation.
(You can configure how and where things are logged by writing a real s6-log
script at the end of the command line.)

 Of course, in the real world, you wouldn't write that. First, because s6
provides some shortcuts for common operations so the real command lines
would be a tad shorter, and second, because you'd want the long-running
processes to be supervised, so you'd use the supervision infrastructure
and write two short run scripts instead.
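
 As a sketch, those two run scripts could look like the following (shell
shown for readability where real ones would likely be execline; the
/service scan directory and the syslogd service name are assumptions).
s6 connects a service's stdout to the stdin of the service in its log/
subdirectory, which replaces the explicit pipeline above:

```
#!/bin/sh
# /service/syslogd/run -- the listener; its stdout is piped by s6 into
# the log service below.
exec s6-ipcserver-socketbinder /dev/log \
     s6-envuidgid nobody \
     s6-applyuidgid -Uz \
     s6-ipcserverd ucspilogd
```

```
#!/bin/sh
# /service/syslogd/log/run -- writes the logs as user syslog, with
# automatic rotation handled by s6-log.
exec s6-envuidgid syslog \
     s6-applyuidgid -Uz \
     s6-log /var/log/syslogd
```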

 (And so, to provide syslogd functionality to one client, you'd really have
1 s6-svscan process, 2 s6-supervise processes, 1 s6-ipcserverd process,
1 ucspilogd process and 1 s6-log process. Yes, 6 processes. This is not as
insane as it sounds. Processes are not a scarce resource on Unix; the
scarce resources are RAM and CPU. The s6 processes have been designed to
take *very* little of those, so the total amount of RAM and CPU they all
use is still smaller than the amount used by a single rsyslogd process.)

 There are good reasons to program this way. Mostly, it amounts to writing
as little code as possible. If you look at the source code for every single
command that appears on the insane command line above, you'll find that it's
pretty short, and short means maintainable - which is the most important
quality to have in a codebase, especially when there's just one guy
maintaining it.
 Using high-level languages also reduces the source code's size, but it
adds the interpreter's or run-time system's overhead, and a forest of
dependencies. What is then run on the machine is not lightweight by any
measure. (Plus, most of those languages are total crap.)

 Anyway, my point is that it often takes several processes to provide a
service, and that it's a good thing. This practice should be encouraged.
So, yes, running a service under a process supervisor is the right design,
and I'm happy that John, Gorka, Les and other people have figured it out.

 s6 itself provides the process supervision service not as a single
executable, but as a set of tools. s6-svscan doesn't do it all, and it's
by design. It's just another basic building block. Sure, it's a bit special
because it can run as process 1 and is the root of the supervision tree,
but that doesn't mean it's a turnkey program - the key lies in how it's
used together with other s6 and Unix tools.
 That's why starting s6-svscan directly as the entrypoint isn't such a
good idea. It's much more flexible to run a script as the entrypoint
that performs a few basic initialization steps then execs into s6-svscan.
Just like you'd do for a real init. :)
/long design rant

 


Re: process supervisor - considerations for docker

2015-02-25 Thread John Regan
Hi Dreamcat4 -

First things first - I can't stress enough how awesome it is to know
people are using/talking about my Docker images, blog posts, and so
on. Too cool!

I've responded to your concerns/questions/etc throughout the email
below.

-John

On Wed, Feb 25, 2015 at 11:32:37AM +, Dreamcat4 wrote:
 Thank you for moving my message Laurent.
 
 Sorry for the mixup r.e. the mailing lists. I have subscribed to the
 correct list now (for s6 specific).
 
  [snip: quoted text of Laurent's reply and the opening of Dreamcat4's original message; both appear in full later in this digest]
 
  MY CONCERNS ABOUT USING S6 INSIDE OF DOCKER
 
  In regards of s6 only, these are my currently perceived
  shortcomings when using it in docker:
 
  * it's not clear how to pass in program arguments via CMD and
  ENTRYPOINT in docker
 - in fact I have not seen ANY docker process supervisor solutions
  show how to do this (except perhaps phusion base image)
 

To be honest, I just haven't really done that. I usually use
environment variables to set up my services. For example, if I have a
NodeJS service, I'll run something like

`docker run -e NODEJS_SCRIPT=myapp.js some-nodejs-image`

Then in my NodeJS `run` script, I'd check if that environment variable
is defined and use it as my argument to NodeJS. I'm just making up
this bit of shell code on the fly, it might have syntax errors, but
you should get the idea:

```
# quote expansions so the test doesn't break when the variable is unset
if [ -n "$NODEJS_SCRIPT" ]; then
    exec node "$NODEJS_SCRIPT"
else
    printf 'NODEJS_SCRIPT undefined\n'
    touch down   # ./down marks this service as not-to-be-started by default
    exit 1
fi
```

Another option is to write a script to use as an entrypoint that
handles command arguments, then execs into s6-svscan.
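
A bare-bones sketch of that option (the scan directory and the stash
path are assumptions):

```
#!/bin/sh
# Entrypoint sketch: record any arguments where a run script can pick
# them up, then hand PID 1 over to the supervision tree.
if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@" > /etc/s6/.cmdline   # one argument per line
fi
exec s6-svscan /etc/s6
```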

  * it is not clear if ENV vars are preserved. That is also something
  essential for docker.

In my experience, they are. If you use s6-svscan as your entrypoint (like
I do in my images) and define environment variables via docker's -e
switch, they'll be preserved and available in each service's `run` script,
just like in my NodeJS example above.

 
  * s6 has many utilities s6-*
   - not clear which ones are actually required for making a docker
  process supervisor

The only *required* programs are the ones in the main s6 and execline
packages.

 
  * s6 not available yet as .deb or .rpm package
   - 

Re: process supervisor - considerations for docker

2015-02-25 Thread John Regan
On Wed, Feb 25, 2015 at 03:58:07PM +0100, Gorka Lertxundi wrote:
 Hello,
 
 After John's great post, I tried to solve exactly these same problems. I
 created my own base image, based primarily on John's and Phusion's base
 images.

That's awesome - I get so excited when I hear somebody's actually
read, digested, and taken action based on something I wrote. So cool!
:)

 
 See my thoughts below.
 
  [snip: quoted text of Laurent's reply and Dreamcat4's original message; the remainder of this message is truncated in the archive. The original message appears in full at the end of this digest]

Re: process supervisor - considerations for docker

2015-02-25 Thread Laurent Bercot


 (Moving the discussion to the supervision@list.skarnet.org list.
The original message is quoted below.)

 Hi Dreamcat4,

 Thanks for your detailed message. I'm very happy that s6 found an
application in docker, and that there's such an interest for it!
skaw...@list.skarnet.org is indeed the right place to reach me and
discuss the software I write, but for s6 in particular and process
supervisors in general, supervision@list.skarnet.org is the better
place - it's full of people with process supervision experience.

 Your message gives a lot of food for thought, and I don't have time
right now to give it all the attention it deserves. Tonight or
tomorrow, though, I will; and other people on the supervision list
will certainly have good insights.

 Cheers!

-- Laurent


On 25/02/2015 11:55, Dreamcat4 wrote:

Hello,
Now there is someone (John Regan) who has made s6 images for docker.
And written a blog post about it. Which is a great effort - and the
reason I've come here. But it gives me a taste of wanting more.
Something a bit more foolproof, and simpler, to work specifically
inside of docker.

 From that blog post I get a general impression that s6 has many
advantages. And it may be a good candidate for docker. But I would be
remiss not to ask the developers of s6 themselves to take
some kind of personal interest in considering how s6 might best
work inside of docker specifically. I hope that this is the right
mailing list to reach s6 developers / discuss such matters. Is this
the correct mailing list for s6 dev discussions?

I've read and read around the subject of process supervision inside
docker. Various people explain how or why they use various different
process supervisors in docker (not just s6). None of them really quite
seem ideal. I would like to be wrong about that but nothing has fully
convinced me so far. Perhaps it is a fair criticism to say that I
still have a lot more to learn in regards to process supervisors. But
I have no interest in getting bogged down by that. To me, I already
know more-or-less enough about how docker manages (or rather
mis-manages!) its container processes to have an opinion about what
is needed, from a docker-sided perspective. And I know enough that the
docker project itself won't fix these issues: for one thing, because of
not owning what's running on the inside of containers, and also
because of their single-process viewpoint take on things. Anyway.
That kind of political nonsense doesn't matter for our discussion. I
just want to have a technical discussion about what is needed, and how
might be the best way to solve the problem!


MY CONCERNS ABOUT USING S6 INSIDE OF DOCKER

In regards of s6 only, these are my currently perceived
shortcomings when using it in docker:

* it's not clear how to pass in program arguments via CMD and
ENTRYPOINT in docker
   - in fact I have not seen ANY docker process supervisor solutions
show how to do this (except perhaps phusion base image)

* it is not clear if ENV vars are preserved. That is also something
essential for docker.

* s6 has many utilities s6-*
 - not clear which ones are actually required for making a docker
process supervisor

* s6 not available yet as .deb or .rpm package
 - official packages are helpful because on different distros:
+ standard locations where to put config files and so on may differ.
+ to install man pages too, in the right place

* s6 is not available as official single pre-compiled binary file for
download via wget or curl
- which would be the most ideal way to install it into a docker container


^^ Some of these perceived shortcomings are more important /
significant than others! Some are not in the remit of s6 development
to be concerned about. Some are mild nit-picking, or the ignorance of
not knowing, having not actually tried out s6 before.

But my general point is that it is not clear-enough to me (from my
perspective) whether s6 can actually satisfy all of the significant
docker-specific considerations. Which I have not properly stated yet.
So here they are listed below…


DOCKER-SPECIFIC CONSIDERATIONS FOR A PROCESS SUPERVISOR

A good process supervisor for docker should ideally:

* be a single pre-compiled binary program file. That can be downloaded
by curl/wget (or can be installed from .deb or .rpm).

* can take a command and arguments directly, with argv[] like this:
 process_supervisor my_program_or_script [arguments…]

* will pass on all ENV vars to my_program_or_script faithfully

* will run as PID 1 inside the linux namespace

* where my_program_or_script may spawn BOTH child AND non-child
(orphaned) processes

* when process_supervisor (e.g. s6 or whatever) receives a TERM signal
   * it faithfully passes that signal to my_program_or_script
   * it also passes that signal to any orphaned non-child processes too

* when my_program_or_script dies, or exits
   * clean up ALL remaining non-children