Hi Nivedita,
Can you point me to the threads discussing this problem? I'm aware of
this one, https://github.com/moby/moby/issues/5618 and some very old
ones that were solved in the 4.15 kernel. I would like to try the
solutions and provide feedback.
Thanks!
** Bug watch added:
As several different forums are discussing this issue,
I'm using this LP bug to continue investigating the
current manifestation of this bug (after the 4.15 kernel).
I suspect it's in one of the other places not fixed, as
my colleague Dan stated a while ago.
--
You received this bug notification
We are definitely seeing a problem on kernels after 4.15.0-159-generic,
which is the last known good kernel. 5.3* kernels are affected, but I
do not have data on the most recent upstream.
--
Our servers are affected by this bug too on focal; it triggered today
on one of our servers.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
We have seen this in:
- Ubuntu 5.4.0-89.100-generic
- Ubuntu 5.4.0-88.99-generic
--
Is anyone still seeing a similar issue on current mainline?
** Tags added: sts
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1711407
Title:
unregister_netdevice: waiting for lo to become free
Bug also happens on 5.4.0-42-generic.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
--
4.15.0-54-generic did work great in the past.
We moved to 5.3.0-28-generic and this bug is noticeable...
Funny enough, one of our machines was able to recover... both have the
very same OS version.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
4.15.0-54-generic is bug free
5.3.0-28-generic has the bug
5.3.0-28-generic
maybe a downgrade to Ubuntu 18.04.2 is
Hey guys, we create our namespaces using ip netns add.
Each namespace has its own IPv6 and IPv4 address.
We notice that the IPv6 ping continues to respond, but the namespace at
/run/netns does not exist anymore.
Is it possible to find the namespace using the IPv6 info?
--
Hi! I'm on Debian and I'm getting this issue with the following kernel:
4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26)
I feel like this isn't the best place to report this, but can someone
point me to the right place? I can't find any
discussion on it from the past year but
** Tags added: cscc
--
Status in linux package in Ubuntu:
In Progress
** Changed in: linux (Ubuntu Trusty)
Assignee: Dan Streetman (ddstreet) => (unassigned)
** Changed in: linux (Ubuntu Bionic)
Assignee: Dan Streetman (ddstreet) => (unassigned)
--
unfortunately I've moved away from kernel work, for now at least, and no
longer have time to continue this bug; if anyone else would like to pick
up the work, I'm happy to answer questions.
** Changed in: linux (Ubuntu Xenial)
Assignee: Dan Streetman (ddstreet) => (unassigned)
** Changed
Dan,
We had been running 80-100 instances on the hotfix kernel and I was
declaring victory on the workaround, as it had been a month without this
issue popping up, but I just had an instance lock up on, e.g., "docker ps",
and dmesg shows:
[2924480.806202] unregister_netdevice: waiting for vethd078a31
Using:
linux-image-4.4.0-127-generic 4.4.0-127.153+hf1711407v20180524b3
--
Also seeing it on Ubuntu 18.04 on AWS:
# uname -a
Linux uni09.sys.timedoctor.com 4.15.0-1016-aws #16-Ubuntu SMP Wed Jul 18
09:20:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
--
I'm hitting this bug frequently on AWS with Ubuntu 16.04.5 LTS:
Linux uni02.sys.timedoctor.com 4.4.0-1063-aws #72-Ubuntu SMP Fri Jul 13
07:23:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
--
I'm using kernel version 4.9.73
4.9.73 #0 SMP PREEMPT Wed May 30 13:39:24 2018 aarch64 GNU/Linux
The problem reproduces for me as soon as I do FTP (vsftpd 3.0.3 server on the DUT)
to the DUT. After that it just hangs and keeps printing the message every 10 seconds:
waiting for lo to become free. Usage count = 1.
> So they should not be operating on the same sock concurrently.
But, of course, all this is caused by bugs, so yes it's certainly
possible a bug allowed what you said to happen, but you'll need to dig
in deeper to find where such a bug is (if any), since the locking design
is correct as far as I
> is it possible that sk->sk_dst_cache is overwritten?
maybe ;-)
> like in __sk_dst_check,
> when tcp timer tries to resend a packet, at the same time, tcp_close is called
tcp_close() locks the sock; tcp_retransmit_timer() is also called with
the sock locked. So they should not be operating
Is it possible that sk->sk_dst_cache is overwritten? Like in __sk_dst_check:
when the TCP timer tries to resend a packet and, at the same time, tcp_close is called,
a reset packet will be sent and ip_queue_xmit will be called concurrently;
(the original comment included a two-column "cpu 1" / "cpu 2" interleaving diagram here that did not survive formatting)
I have the same issue; I dumped the vmcore and found the dst cache, hope it
helps.
This leaked dst is in dst_busy_list; apart from dst_busy_list, I cannot find
it anywhere in
==
First case:
crash> rtable 0x880036fbba00 -x
struct rtable {
dst = {
callback_head
Many thanks, Dan! Going to roll it out to a few test boxes and try it
out over the next few days. Will report back as soon as I have some
feedback.
--
> We're using a newer kernel:
> 4.4.0-109-generic
ok, i backported the workaround to the xenial 4.4 kernel, and built it at the
same ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407
it's version 4.4.0-127.153+hf1711407v20180524b3
can you test with that and let me know if it works?
Thanks for the ping, Dan. No worries. Looking forward to it as this issue
seems to be biting us more frequently nowadays so really interested to see
if this kernel helps.
On Sat, May 19, 2018 at 7:19 AM, Dan Streetman
wrote:
> Would you be able to build a test kernel against this version?
very sorry, i've been quite busy the last couple weeks. I'll get a test
kernel built for you next week.
--
Hi Dan,
We're using a newer kernel:
4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 x86_64 x86_64
x86_64 GNU/Linux
Would you be able to build a test kernel against this version?
Thanks very much, Dan.
On Tue, May 1, 2018 at 12:14 AM, Dan Streetman
> The symptoms seen are a number of docker commands that just hang indefinitely
> (e.g.
> docker ps, spinning up new containers). Occasionally, after a long time, we
> have
> seen it come back and continue working OK (like whatever was holding the lock
> released it finally), but most of the
Hi Dan,
> There's actually been quite a lot of discussion/work in this bug since
> comment 2, so I don't
> actually need that info anymore (except for reproduction steps, that's always
> welcome).
Fair enough. I wish I had some reproduction steps but unfortunately it
just happens randomly and
Hi @gservat,
> To answer your questions:
There's actually been quite a lot of discussion/work in this bug since
comment 2, so I don't actually need that info anymore (except for
reproduction steps, that's always welcome).
> It just happens on its own after 1/2 weeks. The following shows in
Hi Dan,
We run a bunch of instances both on AWS and GCP, and we run a
significant number of containers on both. We've only ever seen this
problem on GCP and never on AWS (it's baffling!). The kernels are
as-close-as-possible and the rest (Ubuntu version / Docker version / etc)
are identical. To
> the CPU usage is zero, there is nothing popping up. The tasks causing the
> high load
> value are in D state
ok that seems expected then. This doesn't 'fix' the problem of the
leaked dst (and namespace), just prevents blocking further net namespace
creation/deletion.
> if I can't run out of
Hi Dan,
the CPU usage is zero, there is nothing popping up. The tasks causing
the high load value are in D state. If I do "echo w > /proc/sysrq-trigger",
I will get for every leaked NS:
[392149.562095] kworker/u81:40 D0 4290 2 0x8000
[392149.562112] Workqueue: netns cleanup_net
> The machine continue to run and the docker is still able to create new
> instances,
> i.e. there is no complete lock out.
Great!
> The machine continued to run further, and after an ~hour , 3 namespaces
> leaked.
> The machine now has load "3", which seems to be an artifact of this. Other
>
Hi Dan,
so here is the information for the latest kernel
(4.13.0-38.43+hf1711407v20180413b1):
The bug got reproduced:
[ 1085.626663] unregister_netdevice: possible namespace leak (netns
8d2dc3d34800), have been waiting for 60 seconds
[ 1085.626696] (netns 8d2dc3d34800): dst
> I have just reproduced it using the previous #43+hf1711407v20180412b2
kernel:
Excellent. And just to clarify again, I do *not* expect the kernel to
fix/avoid the problem - I fully expect it to happen - the change in
these kernels is strictly to make sure that the system is still usable
after
Hi there, great to hear the progress!
I have just reproduced it using the previous #43+hf1711407v20180412b2
kernel:
Apr 13 19:12:14 prg41-004 kernel: [13621.398319] unregister_netdevice: likely
namespace leak (netns 8c7a056f), have been waiting for 1973452 seconds
Apr 13 19:12:24
Jiri,
I just added a new artful version to the ppa, version
4.13.0-38.43+hf1711407v20180413b1; it's building right now. It should
avoid the hang just like the last version from yesterday; but the last
one blocked freeing any future net namespaces after one hangs; this one
fixes that so even
Just to re-summarize this:
The problem here is that, when a net namespace is being cleaned up, the thread
cleaning it up gets hung inside the net/core/dev.c netdev_wait_allrefs()
function. This is called every time anything calls rtnl_unlock(), which
happens a lot, including during net namespace cleanup.
Hi Jiri,
No, the "Won't Fix" is only for the Zesty release, as that's EOL now.
I actually have a test kernel building in the ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407/+packages
it's version 4.13.0-38.43+hf1711407v20180412b2; it hasn't finished
building yet, probably will
Hi Dan,
how should I read that it got to "Won't fix" state? No more time to
debug it?
Thanks
Jirka H.
--
Jiri,
I've looked at several possible ways to just work around the main issue,
of hanging netns creation/destruction, and just allow the netns to leak
when a dst leaks. However, none of the approaches I tried were usable.
I'm going to keep looking at it this week and I'll let you know when I
Hi Dan,
any luck in preparing another kernel with more debugging info? Looking
forward to trying another one.
Thanks for your effort!
Jiri Horky
--
> Hi, might be useful to find root-cause
"root-cause" for this is a dst object leak, and there have been many
kernel dst leak patches, there almost certainly are still dst leak(s) in
the kernel, and more dst leaks will be accidentally added later.
Unfortunately, the dst leaks all lead to an
Hi, might be useful to find root-cause: we kept running into this Kernel bug
triggered by docker somehow until deactivating IPv6. I have not seen it since
then.
Regards, Hamza
--
Hi Dan, thanks for commenting. I started to lose hope already ;) Looking
forward to the next kernel, will test it right away.
Jiri Horky
--
Sorry for the long delay - I haven't forgotten about this bug, but I have
to think about how best to add debug to track the dst leak(s) in a
generic way. I'll try to get back to this bug sometime this or next
week.
--
Hi again,
so for the second kernel (), I was able to reproduce it in about the same time. The
messages are:
Feb 13 23:04:23 prg41-004 kernel: [ 650.285711] unregister_netdevice: waiting
for lo (netns 943cfe8ce000) to become free. Usage count = 1
Feb 13 23:04:23 prg41-004 kernel: [ 650.285736]
Forgot to paste the second kernel version. The previous message was
obtained using 4.13.13-00941-gf382397cf315 kernel (the one with
ipsec/xfrm dst leak patch).
--
Hi Dan,
so for the first kernel (linux-
image-4.13.13-00940-gf16e2bbbddee_4.13.13-00940-gf16e2bbbddee-
24_amd64.deb), it failed after some 10 minutes. The outputs are:
Feb 13 22:35:54 prg41-004 kernel: [ 736.399342] unregister_netdevice: waiting
for lo (netns 9adbfbebb000) to become free.
Can you also test with this kernel please:
http://people.canonical.com/~ddstreet/lp1711407/linux-image-4.13.13-00941-gf382397cf315_4.13.13-00941-gf382397cf315-26_amd64.deb
It's the same as the last one, with a ipsec/xfrm dst leak patch.
--
Ok, here is a new debug kernel:
http://people.canonical.com/~ddstreet/lp1711407/linux-image-4.13.13-00940-gf16e2bbbddee_4.13.13-00940-gf16e2bbbddee-24_amd64.deb
I didn't use the PPA as this one will dump out quite a lot of debug in
the logs all the time. When you reproduce it (which hopefully
Hi Dan,
Glad to help, I thank you for digging into that! Here is the latest
output:
[ 912.489822] unregister_netdevice: waiting for lo (netns 8de5c10f3000) to
become free. Usage count = 1
[ 912.489927] (netns 8de5c10f3000): dst 8de5ba0ced00 expires 0 error
0 obsolete 2 flags 0
Here is another Artful debug kernel, with just slightly more debug info; it may
help a little while I work on adding more significant debug to track where the
dst refcnt is getting leaked. Same PPA:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407
New kernel version is
And we're on our way down the rabbit hole...
netns exit is blocked by reference from interface...
interface exit is blocked by reference from dst object...
dst object free is blocked by reference from ???...
I'll set up more dbg and have a new test kernel, probably by Monday.
Thanks!
--
Hi Dan,
here is the output:
Feb 2 18:50:14 prg41-004 kernel: [ 482.151773] unregister_netdevice: waiting
for lo (netns 8de11803e000) to become free. Usage count = 1
Feb 2 18:50:14 prg41-004 kernel: [ 482.151876] (netns
8de11803e000): dst 8dd905360300 expires 0 error 0
Hi Jirka,
I have a new debug kernel building (will take a few hours as usual);
please test when you have a chance. New kernel is still for Artful and
version is linux-4.13.0-32.35+hf1711407v20180202b1.
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407
--
Hi Dan,
to answer your questions:
1) Yes, it was the full output before the next "waiting for..." message
2) If I leave the box running (and stop generating the load), it actually does
not recover at all. It keeps outputting the "waiting for..." forever with
the same device:
# dmesg | grep
> Here is the output of the kernel when the "unregister_netdevice" bug
was hit
Thanks! And just to verify, that's the full output of all those lines
before the next "waiting..." line, right? If so, it looks like the
problem in your case is not a kernel socket holding the netns open, it
looks
Hi Dan,
Here is the output of the kernel when the "unregister_netdevice" bug was
hit for the first time on the reproducer machine.
I hope it helps. Let me know if I can provide you more debug output or
try to reproduce it once more.
Thanks
Jiri Horky
Jan 31 08:42:26 prg41-004 kernel: [
I uploaded a new debug kernel to the ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407
I've only updated the 4.13 kernel currently, on Artful, so if you need a
Xenial HWE 4.13 kernel let me know and I can build that as well. It
will likely take hours to finish building but should be
Ok, waiting for the kernel with dbg messages. I should be able to test
it right away.
--
> was able to get "unregister_netdevice: waiting for lo to become free. Usage
> count = 1"
> in less than 5 minutes :-(
well, that's unfortunate that you still see it but fortunate that you're
able to reproduce it quickly. The issue that I fixed - TCP kernel
socket staying open - is certainly
Hi Dan,
first of all, thanks for looking into this long-lasting, irritating
issue.
I have tried your kernel (4.13.0-30-generic #33+hf1711407v20180118b2-Ubuntu
SMP) on Ubuntu 17.10 and was able to get "unregister_netdevice: waiting for lo
to become free. Usage count = 1" in less than 5 minutes
I've updated the patch and submitted upstream:
https://marc.info/?l=linux-netdev&m=151631015108584&w=2
The PPA also contains the latest ubuntu kernels patched for the issue, if
anyone has time to test:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1711407
> How do we know when we hit the bug
Happy Holidays!
After some more investigation I've changed the patch to what I think is
a more appropriate fix, though it will only fix cases where the
container hang is due to a TCP kernel socket hanging while trying to
close itself (waiting for FIN close sequence, which will never complete
Hi Dan,
How do we know when we hit the bug with your patched kernel? Is it still
logging "waiting for lo to become free" or is there any other signature
we can use to detect the original problem and still say "Wow! it works,
Dan rocks"
I have been scavenging the internet looking for your linux
One note on the test kernels, I'm still updating the patch as the change
in the current test kernels will leave sockets open which leaks memory
due to the netns not fully closing/exiting. I'll update this bug when I
have an updated patch and test kernels.
I would not recommend the current test
Looks good so far. Docker is still usable after 5000 runs, no "waiting
for lo to become free" in syslog.
Linux docker 4.4.0-104-generic #127+hf1711407v20171220b1-Ubuntu SMP Wed
Dec 20 12:31:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ docker version
Client:
Version: 17.09.1-ce
API version:
nvm, forgot to configure https_proxy :)
--
$ sudo add-apt-repository ppa:ddstreet/lp1711407
Cannot add PPA: 'ppa:~ddstreet/ubuntu/lp1711407'.
ERROR: '~ddstreet' user or team does not exist.
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
--
Ok, waiting for the build to finish and then will test ASAP.
--
> the test kernel you were using had this one [1] included? It's
upstream commit f186ce61bb82
no that commit doesn't help, because the issue isn't one of dst refcount
leaks or delayed dst gc, the problem is kernel sockets that haven't
closed yet holding a reference to the netns lo interface
Hi Dan,
the test kernel you were using had this one [1] included? It's upstream
commit f186ce61bb82 ("Fix an intermittent pr_emerg warning about lo
becoming free."). Meaning, it's rather a dst leak somewhere in the latest
upstream as well?
Thanks,
Daniel
[1]
> So if there is something to test or debug, I would be happy to help
fixing this issue.
yes, I'm hoping to have a kernel build with some added debug to
track/list exactly what socket(s) are still holding lo device references
when the container is killed/shutdown, so testing with that would help
Hi,
In a test-setup I can easily reproduce this issue very quickly.
In my scenario it looks like the issue is only triggered if I change the IP
inside the docker container
(Needed to test StrongSwan's MOBIKE feature), send some data (ICMP) and then
kill the docker container.
So if there is
A simpler reproducer for this, using LXD instead of Docker:
1. install/setup LXD
$ sudo apt install lxd
$ sudo lxd init
2. create two containers
$ lxc launch ubuntu:xenial server
$ lxc launch ubuntu:xenial client
3. make the client privileged (so it can mount smb/cifs)
$ lxc config set client
> In both of our reproducers, we added "ip route flush table all;
that won't help
> ifconfig down;
that's not a valid cmdline
> sleep 10" before exiting from containers.
sleeping without closing existing open sockets doesn't help
what are you using for container mgmt? docker? lxc? lxd?
This information might be relevant.
We are able to reproduce the problem with unregister_netdevice: waiting
for lo to become free. Usage count = 1 with 4.14.0-rc3 kernel with
CONFIG_PREEMPT_NONE=y and running only on one CPU with following boot
kernel options:
BOOT_IMAGE=/boot/vmlinuz-4.14.0-rc3
** Changed in: linux (Ubuntu Artful)
Status: Confirmed => In Progress
** Changed in: linux (Ubuntu Bionic)
Status: New => In Progress
** Changed in: linux (Ubuntu Zesty)
Status: New => In Progress
** Changed in: linux (Ubuntu Xenial)
Status: New => In Progress
**
> Should separate Ubuntu bug reports be created (if they don't exist already)
> regarding
> the kernel crashes
This bug is only to fix the hang/delay in new container creation.
Please open a new bug for kernel crashes.
--
Ok, seems that I'm in the wrong ticket with my issues, then. :-)
Should separate Ubuntu bug reports be created (if they don't exist already)
regarding the kernel
crashes?
--
> Is there any way for a non-kernel-dev type to see exactly which resource
> unregister_netdevice is waiting for? (Or does it only keep track of usage
> count?)
not really, no. It's simply waiting for its reference count to drop to
0, and just broadcasts unregister events periodically hoping
> That the startup of containers is delayed is annoying.
> The much bigger issue is that it can reproducibly cause a kernel Oops and
> crash a whole machine.
Yes as you found this uncovers other bugs in the kernel, but those are
separate and looks like they are getting fixed.
--
My analysis so far of the problem:
1. container A has an outstanding TCP connection (thus, a socket and dst
which hold a reference on the "lo" interface from the container). When the
container is stopped, the TCP connection takes ~2 minutes to timeout
(with default settings).
2. when container A
According to https://github.com/moby/moby/issues/35068 the crash is
fixed by:
https://patchwork.ozlabs.org/patch/801533/
https://patchwork.ozlabs.org/patch/778449/
** Bug watch added: github.com/moby/moby/issues #35068
https://github.com/moby/moby/issues/35068
--
Hello Dan,
thanks for the analysis!
That the startup of containers is delayed is annoying.
The much bigger issue is that it can reproducibly cause a kernel Oops and crash
a whole machine.
--
Is there any way for a non-kernel-dev type to see exactly which resource
unregister_netdevice is waiting for? (Or does it only keep track of
usage count?)
--
** Also affects: linux (Ubuntu Artful)
Importance: Medium
Assignee: Dan Streetman (ddstreet)
Status: Confirmed
** Also affects: linux (Ubuntu Bb-series)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Trusty)
Importance: Undecided
Status: New
Hi everyone,
so this is an interesting problem. There are 2 parts to this:
1. in the container being destroyed, some socket(s) remain open for a
period of time, which prevents the container from fully exiting until all
its sockets have exited. While this happens you will see the 'waiting
for lo
Since xenial we have had this very annoying behavior (unregister_netdevice:
waiting for lo to become free. Usage count = 1) as well, which in turn
makes a container reboot take several minutes instead of seconds, as it
did e.g. on vivid (lxc-ls -f also hangs for that time). E.g.:
[ +10.244888]
We're seeing this on our build servers. These are VMWare virtual
machines which run docker containers in privileged mode. Inside the
containers are Jenkins agents and a build environment. Part of the build
process uses unshare to create these namespaces: ipc uts mount pid net.
If you have any
Thank you for the reproducer, I have been able to reproduce this using
docker-samba-loop scripts. This should help me to dig into the kernel
to see where the dst leak is coming from.
** Description changed:
This is a "continuation" of bug 1403152, as that bug has been marked
"fix released"
I could not reproduce the bug with the described method with kernel
4.4.0-81-generic, nor with 4.13.0-041300rc7-generic. 4.4.0-81
logged a hung task but did not Oops.
So the bug might have been reintroduced between 4.4.0-82 and 4.4.0-93
and 4.13 seems to contain a fix.
--
bug 1715660 seems related
( Lots of "waiting for lo to become free" message, then a "task ip:1358
blocked for more than 120 seconds." with a backtrace )
--
Attached are the logs for an Oops on Ubuntu 14.04 on kernel linux-
image-4.4.0-93-generic=4.4.0-93.116~14.04.1
** Attachment added: "kernoops-4.4.0-93.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/+attachment/4941963/+files/kernoops-4.4.0-93.txt
--
With https://github.com/fho/docker-samba-loop I was able to reproduce
kernel Oopses on a clean Ubuntu 16.04 installation with:
- linux-image-4.10.0-32-generic=4.10.0-32.36~16.04.1
- linux-image-4.11.0-14-generic=4.11.0-14.20~16.04.1
On 4.11.0-14 it was much harder to reproduce. Sometimes only a
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
cat /etc/apt/sources.list.d/docker.list
deb [arch=amd64] https://apt.dockerproject.org/repo ubuntu-xenial main
# deb-src [arch=amd64] https://apt.dockerproject.org/repo ubuntu-xenial main
cat
Can anyone experiencing this issue please provide details such as:
-what release are you using (trusty/xenial/zesty)?
-what kernel version are you using?
-do you have specific steps to reproduce the problem?
** Description changed:
This is a "continuation" of bug 1403152, as that bug has