We also backported [1] to 4.2 (linux-lts-wily) and deployed it to our
production OpenStack cloud.  We installed it only yesterday, and our MTBF
is between two and twenty days, so we won't know for a while whether it
has made any difference.

Some details about our configuration / failure mode:

Three OpenStack "Layer 3" hosts (A, B and C, each running
3.19.0-30-generic #34~14.04.1-Ubuntu) providing virtual
routers/VPNs/Metadata via network namespaces.

Our most recent failures occurred on hosts B and C (within 30 minutes of
each other, after having been fine for weeks) while removing routers
from A and re-creating them on B and C.


Our stack traces are slightly different from the ones posted above...

Dec 14 15:37:05 hostname kernel: [961050.119727] INFO: task ip:9865 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.126707]       Tainted: G         C     3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.135073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.144094] ip              D ffff88097e3e3de8     0  9865   9864 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.144098]  ffff88097e3e3de8 ffff880e982693a0 0000000000013e80 ffff88097e3e3fd8
Dec 14 15:37:05 hostname kernel: [961050.144100]  0000000000013e80 ffff88101a8993a0 ffff880e982693a0 0000000000000000
Dec 14 15:37:05 hostname kernel: [961050.144102]  ffffffff81cdb2a0 ffffffff81cdb2a4 ffff880e982693a0 00000000ffffffff
Dec 14 15:37:05 hostname kernel: [961050.144104] Call Trace:
Dec 14 15:37:05 hostname kernel: [961050.144109]  [<ffffffff817b2fa9>] schedule_preempt_disabled+0x29/0x70
Dec 14 15:37:05 hostname kernel: [961050.144111]  [<ffffffff817b4c95>] __mutex_lock_slowpath+0x95/0x100
Dec 14 15:37:05 hostname kernel: [961050.144115]  [<ffffffff811cfd66>] ? __kmalloc+0x226/0x280
Dec 14 15:37:05 hostname kernel: [961050.144117]  [<ffffffff816a14a1>] ? net_alloc_generic+0x21/0x30
Dec 14 15:37:05 hostname kernel: [961050.144120]  [<ffffffff817b4d23>] mutex_lock+0x23/0x37
Dec 14 15:37:05 hostname kernel: [961050.144122]  [<ffffffff816a1c75>] copy_net_ns+0x75/0x150
Dec 14 15:37:05 hostname kernel: [961050.144125]  [<ffffffff810943ad>] create_new_namespaces+0xfd/0x180
Dec 14 15:37:05 hostname kernel: [961050.144127]  [<ffffffff810945ba>] unshare_nsproxy_namespaces+0x5a/0xc0
Dec 14 15:37:05 hostname kernel: [961050.144130]  [<ffffffff8107439b>] SyS_unshare+0x15b/0x2e0
Dec 14 15:37:05 hostname kernel: [961050.144133]  [<ffffffff817b6e4d>] system_call_fastpath+0x16/0x1b
Dec 14 15:37:05 hostname kernel: [961050.144135] INFO: task ip:9896 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.151109]       Tainted: G         C     3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.159558] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.168551] ip              D ffff8804591cfde8     0  9896   9895 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.168556]  ffff8804591cfde8 ffff880814031d70 0000000000013e80 ffff8804591cffd8
Dec 14 15:37:05 hostname kernel: [961050.168558]  0000000000013e80 ffffffff81c1d4e0 ffff880814031d70 0000000000000000
Dec 14 15:37:05 hostname kernel: [961050.168560]  ffffffff81cdb2a0 ffffffff81cdb2a4 ffff880814031d70 00000000ffffffff
Dec 14 15:37:05 hostname kernel: [961050.168562] Call Trace:
Dec 14 15:37:05 hostname kernel: [961050.168568]  [<ffffffff817b2fa9>] schedule_preempt_disabled+0x29/0x70
Dec 14 15:37:05 hostname kernel: [961050.168571]  [<ffffffff817b4c95>] __mutex_lock_slowpath+0x95/0x100
Dec 14 15:37:05 hostname kernel: [961050.168573]  [<ffffffff817b4d23>] mutex_lock+0x23/0x37
Dec 14 15:37:05 hostname kernel: [961050.168577]  [<ffffffff816a1c75>] copy_net_ns+0x75/0x150
Dec 14 15:37:05 hostname kernel: [961050.168581]  [<ffffffff810943ad>] create_new_namespaces+0xfd/0x180
Dec 14 15:37:05 hostname kernel: [961050.168584]  [<ffffffff810945ba>] unshare_nsproxy_namespaces+0x5a/0xc0
Dec 14 15:37:05 hostname kernel: [961050.168587]  [<ffffffff8107439b>] SyS_unshare+0x15b/0x2e0
Dec 14 15:37:05 hostname kernel: [961050.168589]  [<ffffffff817b6e4d>] system_call_fastpath+0x16/0x1b
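
For anyone who wants to check their own hosts for the same symptom, here
is a rough sketch of how the hung-task headers above could be pulled out
of a kernel log.  It is not something we run in production; the function
name find_hung_tasks and the regular expression are my own, and it
assumes each kernel message sits on a single line (as it does in syslog
before a mail client wraps it).

```python
import re

# Matches the hung-task header the kernel prints, for example:
#   [961050.119727] INFO: task ip:9865 blocked for more than 120 seconds.
# The pattern is an illustrative assumption, not an official log format spec.
HUNG_TASK = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return one dict per hung-task header found in an iterable of log lines."""
    reports = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            reports.append({
                "uptime": float(match.group("uptime")),  # seconds since boot
                "comm": match.group("comm"),             # task name, e.g. "ip"
                "pid": int(match.group("pid")),
                "seconds": int(match.group("secs")),     # hung-task threshold
            })
    return reports
```

Passing an open log file (for example the kernel log on the affected
host, wherever your distribution keeps it) to find_hung_tasks gives one
entry per blocked task, which makes it easy to count how many "ip"
processes are stuck.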

[1] http://www.spinics.net/lists/netdev/msg351337.html

Cheers,
James Dempsey

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1403152

Title:
  unregister_netdevice: waiting for lo to become free. Usage count
