[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
It's in mainline since v4.20-rc6. Have you tried it?

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765980

Title:
  IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex
  deadlock upon LXD container shutdown

Status in linux package in Ubuntu:
  Expired
Status in linux source package in Bionic:
  Confirmed

Bug description:
  I've spent the last few days tracking down an issue where an attempt
  to shut down an LXD container after several hours of host uptime on
  Ubuntu Bionic (4.15.0-15.16-generic) would cause a kworker thread to
  start spinning on one CPU core and all subsequent container
  start/stop operations to fail.

  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex
  and therefore blocks any further container starts/stops. In my case
  this is triggered by receiving a fragmented IPv6 MDNS packet, but it
  could probably be triggered by any fragmented IPv6 traffic.

  The frag mem limit counter underflows because nf_ct_frag6_reasm
  deducts more from it than the sum of all previous nf_ct_frag6_queue
  calls added: pskb_expand_head (called through skb_unclone) adds a
  multiple of 64 to the SKB's truesize, because kmalloc_reserve
  allocates some additional slack space for the buffer.

  Removing this line:

      size = SKB_WITH_OVERHEAD(ksize(data));

  or making it conditional on nhead or ntail being nonzero works around
  the issue, but a proper fix for this seems complicated. There is
  already a comment saying "It is not generally safe to change
  skb->truesize." right above the offending modification of truesize,
  but the if statement guarding it apparently doesn't keep out all
  problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers
  of this area... ;)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765980/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
While digging into this I found the following commit which might
contain a fix:
https://github.com/torvalds/linux/commit/ebaf39e6032faf77218220707fc3fa22487784e0

** Changed in: linux (Ubuntu Bionic)
       Status: Expired => Confirmed
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
This has expired, but it affects me too. I'm using Proxmox, which bases
its kernel on Ubuntu's - currently at 4.15.18 - and I see the same
symptoms: specifically LXC, specifically at stop, with a kworker thread
using 100% of one core and preventing startup of other containers. The
issue is not present in Proxmox's own 4.15.10-1 kernel.
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
[Expired for linux (Ubuntu) because there has been no activity for 60
days.]

** Changed in: linux (Ubuntu)
       Status: Incomplete => Expired
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
[Expired for linux (Ubuntu Bionic) because there has been no activity
for 60 days.]

** Changed in: linux (Ubuntu Bionic)
       Status: Incomplete => Expired
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
So far I have not been able to reproduce it on the mainline kernel
linked above. However, given the intermittent nature of the problem,
I'm not convinced that this was actually fixed. The source code related
to the underlying root cause looks unchanged, and the symptoms may well
be hidden for my load pattern by unrelated changes resulting in
different kmalloc behavior.
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the
latest v4.17-rc2 kernel[0].

If this bug is fixed in the mainline kernel, please add the tag
'kernel-fixed-upstream'. If the mainline kernel does not fix this bug,
please add the tag 'kernel-bug-exists-upstream'. Once testing of the
upstream kernel is complete, please mark this bug as "Confirmed".
Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc2

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Tags added: bionic kernel-da-key

** Also affects: linux (Ubuntu Bionic)
   Importance: Medium
       Status: Confirmed

** Changed in: linux (Ubuntu Bionic)
       Status: Confirmed => Incomplete
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
** Description changed:

- There is already a comment saying "It is not generally safe to change
- skb->truesize." right about the offending modification of truesize,
- but the if statement guarding it apparently doesn't keep out all
- problematic cases.
+ There is already a comment saying "It is not generally safe to change
+ skb->truesize." right above the offending modification of truesize,
+ but the if statement guarding it apparently doesn't keep out all
+ problematic cases.
[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown
The cause of the issue is already understood, and the machine currently
isn't running an unmodified kernel for debugging reasons. Apport logs
won't help here. Contact me if you need specific information.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed