Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-21 Thread Shengjing Zhu
On Tue, Sep 22, 2020 at 10:30 AM El boulangero wrote:
>
> Then the issue must lie in this commit: 
> https://salsa.debian.org/docker-team/docker/-/commit/ad52cffa31359262a8e9d44daddf896c3e063dd2
>
> The docker.io package didn't build anymore, due to runc `1.0.0~rc92` which 
> landed in debian unstable. Shengjing Zhu came up with the patch to fix that, 
> but it was not a straightforward patch. The issue could be in this patch. Or 
> maybe there's more work required to make docker.io 19.03.x work with latest 
> runc (ie. more patching is needed, not less, sorry :/).
>
> Let me say it another way: when you install docker-ce from Docker's repo, you 
> also get the containerd.io package, that ships the runc binary. All of these 
> components are basically provided altogether by docker, and they are at 
> versions that were tested together. While in Debian, these separate 
> components (containerd, runc) are packaged independently, and these are not 
> the same versions as the ones shipped by Docker. So sometimes we hit this 
> kind of issues with the Debian package.
>
> And to be more correct: in Debian we actually bundle containerd within the 
> docker.io package, because nobody has the bandwidth to try to make docker 
> 19.03.x build-against / work-with containerd 1.4.x. So we build the version 
> of containerd that is vendored in the docker source tree, and ship it in the 
> docker.io package. But runc is NOT bundled in, it is provided independently 
> by the runc package, ie. version `1.0.0~rc92`.
>
> I hope that this clarifies a bit what is the issue here.
>

This is indeed caused by runc 1.0.0~rc92. The following patch fixes the
problem:
https://github.com/moby/moby/pull/41288
The patch missed 19.03.13 and will probably be in 19.03.14:
https://github.com/moby/moby/pull/41293

-- 
Shengjing Zhu



Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-21 Thread El boulangero
Then the issue must lie in this commit:
https://salsa.debian.org/docker-team/docker/-/commit/ad52cffa31359262a8e9d44daddf896c3e063dd2

The docker.io package stopped building once runc `1.0.0~rc92` landed in
Debian unstable. Shengjing Zhu came up with a patch to fix that, but it was
not a straightforward one, so the issue could be in this patch. Or maybe
more work is required to make docker.io 19.03.x work with the latest runc
(i.e. more patching is needed, not less, sorry :/).

Let me put it another way: when you install docker-ce from Docker's repo,
you also get the containerd.io package, which ships the runc binary. All of
these components are provided together by Docker, at versions that were
tested together. In Debian, by contrast, these components (containerd,
runc) are packaged independently, and their versions differ from the ones
shipped by Docker. So we sometimes hit this kind of issue with the Debian
package.

To be more precise: in Debian we actually bundle containerd within the
docker.io package, because nobody has the bandwidth to make docker 19.03.x
build against and work with containerd 1.4.x. So we build the version of
containerd that is vendored in the docker source tree and ship it in the
docker.io package. But runc is NOT bundled; it is provided independently by
the runc package, i.e. version `1.0.0~rc92`.
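
Since runc comes from its own package, it can help to check which runc
version is actually installed next to docker.io. The following is only a
sketch: the function names are made up for illustration, and it tests for
the rc92 series alone because that is the only version this thread
identifies as problematic.

```shell
#!/bin/sh
# Succeeds if the given runc package version belongs to the rc92 series.
is_rc92() {
    case "$1" in
        1.0.0~rc92*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the installed runc package version; fails if dpkg-query is not
# available or the runc package is unknown to dpkg.
installed_runc_version() {
    command -v dpkg-query >/dev/null 2>&1 || return 1
    dpkg-query -W -f='${Version}' runc 2>/dev/null
}

if version=$(installed_runc_version) && is_rc92 "$version"; then
    echo "runc $version installed: docker.io 19.03.x may hit Bug#970525"
fi
```

The script stays silent when runc is missing or is not an rc92 build.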

I hope this clarifies what the issue is here.

I have CCed Shengjing in case he knows more about this issue. I will also
have a look on my side.

In the meantime, I guess you can downgrade to docker.io version
19.03.12+dfsg1-3 and maybe use `apt-mark hold` to prevent any further
upgrade.
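
The downgrade and hold could be scripted roughly as follows. This is a
sketch under assumptions: the `run` helper, the `RUN` variable and the
function name are hypothetical, and it presumes the older version is still
installable from the configured APT sources (otherwise install a saved .deb
with `dpkg -i`, as the test case in this bug does). By default it only
prints the commands.

```shell
#!/bin/sh
# Last version reported as working in this bug.
GOOD_VERSION='19.03.12+dfsg1-3'

# Print the command instead of executing it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Downgrade docker.io, then hold it so later upgrades skip the package.
pin_docker_io() {
    run sudo apt-get install "docker.io=${GOOD_VERSION}"
    run sudo apt-mark hold docker.io
}

pin_docker_io
```

Once a fixed package is available, `sudo apt-mark unhold docker.io` releases
the hold again.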

Cheers,

  Arnaud




Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-21 Thread Tianon Gravi
On Mon, 21 Sep 2020 at 13:48, anomie wrote:
> On Sun, Sep 20, 2020 at 09:58:45AM +0700, El boulangero wrote:
> > Do you know what's special with the `tianon/true` image? On what OS/release
> > is it based?
>
> It's an image that contains only a single binary that returns 0. That
> binary uses no libraries, not even libc.
>
> It's intended as an extremely light-weight image for purposes that don't
> need a whole OS. See, for example,
> https://stackoverflow.com/questions/37120260/configure-docker-compose-override-to-ignore-hide-some-containers
>
> It seems that something changed between 19.03.12+dfsg1-3 and
> 19.03.12+dfsg1-4 that is somehow or other assuming the container
> contains more infrastructure. If you determine that the bug is upstream,
> feel free to forward it to them (and, ideally, revert whatever patch was
> added to 19.03.12+dfsg1-4 that caused the problem in the mean time to
> avoid breaking other software on the system).

I don't think this is an upstream bug -- I'm using their "docker-ce"
package (version "5:19.03.12~3-0~debian-buster") on a host I've got,
and here's the result of some tests there:

$ docker run --rm tianon/true && echo ok
ok

$ docker run --rm --user 0:0 tianon/true && echo ok
ok

$ docker run --rm --user 1000:1000 tianon/true && echo ok
ok
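
The three checks above can be wrapped in a small loop, sketched here. The
`smoke_cmd` helper and the `DOCKER_SMOKE` switch are hypothetical names;
without `DOCKER_SMOKE=1` the script only prints what it would run.

```shell
#!/bin/sh
# Build the docker command line for one test case.
# An empty argument means "no --user flag".
smoke_cmd() {
    if [ -n "$1" ]; then
        printf 'docker run --rm --user %s tianon/true\n' "$1"
    else
        printf 'docker run --rm tianon/true\n'
    fi
}

# Go through the same three cases as above and report each result.
for user in '' 0:0 1000:1000; do
    cmd=$(smoke_cmd "$user")
    if [ "${DOCKER_SMOKE:-0}" = 1 ]; then
        # Unquoted on purpose: the command string is split into words.
        if $cmd; then echo "ok: $cmd"; else echo "FAIL: $cmd"; fi
    else
        echo "would run: $cmd"
    fi
done
```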

♥,
- Tianon
  4096R / B42F 6819 007F 00F8 8E36  4FD4 036A 9C25 BF35 7DD4



Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-21 Thread anomie
On Sun, Sep 20, 2020 at 09:58:45AM +0700, El boulangero wrote:
> Do you know what's special with the `tianon/true` image? On what OS/release
> is it based?

It's an image that contains only a single binary that returns 0. That
binary uses no libraries, not even libc.

It's intended as an extremely light-weight image for purposes that don't
need a whole OS. See, for example,
https://stackoverflow.com/questions/37120260/configure-docker-compose-override-to-ignore-hide-some-containers

It seems that something changed between 19.03.12+dfsg1-3 and
19.03.12+dfsg1-4 that somehow assumes the container contains more
infrastructure. If you determine that the bug is upstream, feel free to
forward it to them (and, ideally, in the meantime revert whatever patch
added to 19.03.12+dfsg1-4 caused the problem, to avoid breaking other
software on the system).



Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-19 Thread El boulangero
Hi,

I can indeed reproduce the issue. Note that this doesn't happen with the
`debian` image, only with the image `tianon/true`.

$ sudo docker run --rm -it debian echo ok
ok

Do you know what's special about the `tianon/true` image? On what OS/release
is it based?

Additionally, did you try the docker package provided by docker.com?
See https://docs.docker.com/engine/install/debian/ . If you hit the same
problem, then you should report the issue upstream. If you don't, then
maybe there's something to investigate in the way we build the package for
Debian.

Cheers,

  Arnaud







Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-18 Thread anomie
Here's a simpler test case:

  $ sudo dpkg -i docker.io_19.03.12+dfsg1-4_amd64.deb 
  (Reading database ... 257350 files and directories currently installed.)
  Preparing to unpack docker.io_19.03.12+dfsg1-4_amd64.deb ...
  Unpacking docker.io (19.03.12+dfsg1-4) over (19.03.12+dfsg1-3) ...
  Setting up docker.io (19.03.12+dfsg1-4) ...
  insserv: Script sysstat has overlapping Default-Start and Default-Stop runlevels (2 3 4 5) and (2 3 4 5). This should be fixed.
  Processing triggers for systemd (246.5-1) ...
  Processing triggers for man-db (2.9.3-2) ...
  $ sudo systemctl restart docker
  $ sudo docker run tianon/true && echo "ok"
  docker: Error response from daemon: unable to find user 0: invalid argument.
  ERRO[] error waiting for container: context canceled 

versus

  $ sudo dpkg -i docker.io_19.03.12+dfsg1-3_amd64.deb 
  dpkg: warning: downgrading docker.io from 19.03.12+dfsg1-4 to 19.03.12+dfsg1-3
  (Reading database ... 257350 files and directories currently installed.)
  Preparing to unpack docker.io_19.03.12+dfsg1-3_amd64.deb ...
  Unpacking docker.io (19.03.12+dfsg1-3) over (19.03.12+dfsg1-4) ...
  Setting up docker.io (19.03.12+dfsg1-3) ...
  insserv: Script sysstat has overlapping Default-Start and Default-Stop runlevels (2 3 4 5) and (2 3 4 5). This should be fixed.
  Processing triggers for systemd (246.5-1) ...
  Processing triggers for man-db (2.9.3-2) ...
  $ sudo systemctl restart docker
  $ sudo docker run tianon/true && echo "ok"
  ok



Bug#970525: docker.io: Unable to start minikube/kubernetes containers: unable to find user 0: invalid argument

2020-09-17 Thread Matthew Gabeler-Lee
Package: docker.io
Version: 19.03.12+dfsg1-4
Severity: important

Something in the change(s) for 19.03.12+dfsg1-4 has broken using the
docker.io package with some minikube configurations (particularly the "none"
driver which runs the kubernetes containers directly in the host docker
instance). All of minikube/kubelet's attempts to start the pods result in
docker logging errors like so:

Sep 17 20:28:35 nigripes dockerd[2793900]: time="2020-09-17T20:28:35.211635346-04:00" level=error msg="Handler for POST /v1.40/containers/e847bf6564589a04ca7a9a54f77d09a1cf6300fbaebce0224d7e86fc9f900754/start returned error: unable to find user 0: invalid argument"

After downgrading just one release, to 19.03.12+dfsg1-3, it works again, so
there's a pretty narrow window for what could have broken this.
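
To check whether a host is hitting this error, the daemon log can be
searched for the message. A minimal sketch: the function name is made up,
and the journalctl line in the comment assumes dockerd logs to the systemd
journal under the unit name docker.service.

```shell
#!/bin/sh
# Count the lines on stdin that contain the error from this report.
count_user0_errors() {
    grep -c 'unable to find user 0: invalid argument'
}

# Self-check against a shortened copy of the log line above
# (the container ID is elided here).
sample='level=error msg="Handler for POST /v1.40/containers/.../start returned error: unable to find user 0: invalid argument"'
printf '%s\n' "$sample" | count_user0_errors

# On a live system:
#   journalctl -u docker.service --since today | count_user0_errors
```

Note that `grep -c` prints 0 and exits non-zero when nothing matches, which
matters if the function is used under `set -e`.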

-- System Information:
Debian Release: bullseye/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'stable-updates'), (500, 'stable'), (490, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.7.0-3-amd64 (SMP w/8 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages docker.io depends on:
ii  adduser              3.118
ii  init-system-helpers  1.58
ii  iptables             1.8.5-3
ii  libc6                2.31-3
ii  libdevmapper1.02.1   2:1.02.171-3
ii  libltdl7             2.4.6-14
ii  libnspr4             2:4.28-1
ii  libnss3              2:3.56-1
ii  libseccomp2          2.4.3-1+b1
ii  libsystemd0          246.5-1
ii  lsb-base             11.1.0
ii  runc                 1.0.0~rc92+dfsg1-5
ii  tini                 0.18.0-1+b1

Versions of packages docker.io recommends:
ii  ca-certificates      20200601
ii  cgroupfs-mount       1.4
ii  git                  1:2.28.0-1
ii  needrestart          3.5-1
ii  xz-utils             5.2.4-1+b1

Versions of packages docker.io suggests:
ii  aufs-tools           1:4.14+20190211-1
ii  btrfs-progs          5.7-1
ii  debootstrap          1.0.123
pn  docker-doc           <none>
ii  e2fsprogs            1.45.6-1
pn  rinse                <none>
ii  xfsprogs             5.6.0-1+b2
pn  zfs-fuse | zfsutils  <none>

-- no debconf information