[Kernel-packages] [Bug 1765998] Re: FS access deadlock with btrfs quotas enabled

2019-03-01 Thread Michael Sparmann
Hm, it's been a while...
I think back then I made some btrfs developers aware of it on IRC, but never 
got around to sending it to the mailing list.
I'm running my own kernel builds for now (I had to do that to fix some other 
issues anyway) with the patch from comment #4 applied, which seems to reliably 
fix this issue.

I am very occasionally getting parent transid verify errors on the quota
tree, though, which I believe must originate from another bug introduced
at some point after I posted that patch here, because initially I didn't
see any of those for several months. It seems that those can be cleaned
up by temporarily disabling and re-enabling quota, so they are no big
deal to me right now, despite occasionally causing some annoying
downtime.
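
For reference, the cleanup I mean is just cycling quotas on the affected mount
point; the path below is only a placeholder, and (as far as I know) re-enabling
triggers a full quota rescan, which is presumably where the downtime comes from:

btrfs quota disable /path/to/filesystem
btrfs quota enable /path/to/filesystem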

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown

2018-04-24 Thread Michael Sparmann
So far I have not been able to reproduce it on the mainline kernel linked above.
However, given the intermittent nature of the problem, I'm not convinced that 
this was actually fixed.
The source code related to the underlying root cause looks unchanged, and the 
symptoms may well be hidden for my load pattern by unrelated changes that 
result in different kmalloc behavior.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765980

Title:
  IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock
  upon LXD container shutdown

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  I've spent the last few days tracking down an issue where an attempt
  to shutdown an LXD container after several hours of host uptime on
  Ubuntu Bionic (4.15.0-15.16-generic) would cause a kworker thread to
  start spinning on one CPU core and all subsequent container start/stop
  operations to fail.

  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex
  and therefore blocks any further container start/stops. That in turn
  is triggered by receiving a fragmented IPv6 MDNS packet in my
  instance, but it could probably be triggered by any fragmented IPv6
  traffic.

  The reason the frag mem limit counter underflows is that
  nf_ct_frag6_reasm deducts more from it than the sum of all previous
  nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
  skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
  kmalloc_reserve allocating some additional slack space to the buffer.

  Removing this line:
  size = SKB_WITH_OVERHEAD(ksize(data));
  or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
  There is already a comment saying "It is not generally safe to change 
skb->truesize." right above the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765980/+subscriptions



[Kernel-packages] [Bug 1765998] Re: FS access deadlock with btrfs quotas enabled

2018-04-23 Thread Michael Sparmann
** Changed in: linux (Ubuntu Bionic)
   Status: Incomplete => Confirmed

** Tags added: kernel-bug-exists-upstream

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765998] Re: FS access deadlock with btrfs quotas enabled

2018-04-23 Thread Michael Sparmann
I can confirm that the issue still exists in the mainline kernel build
linked above.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765998] Re: FS access deadlock with btrfs quotas enabled

2018-04-23 Thread Michael Sparmann
This patch seems to fix it for me (running that for several days now).

** Patch added: "0002-qgroup_accounting_fix.patch"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+attachment/5126022/+files/0002-qgroup_accounting_fix.patch

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765998] Re: FS access deadlock with btrfs quotas enabled

2018-04-21 Thread Michael Sparmann
I cannot run the affected (production) system using a broken kernel, as it 
will lock up within seconds of booting.
If necessary, I can provide additional information or testing upon request.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown

2018-04-21 Thread Michael Sparmann
** Description changed:

  I've spent the last few days tracking down an issue where an attempt to
  shutdown an LXD container after several hours of host uptime on Ubuntu
  Bionic (4.15.0-15.16-generic) would cause a kworker thread to start
  spinning on one CPU core and all subsequent container start/stop
  operations to fail.
  
  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex and
  therefore blocks any further container start/stops. That in turn is
  triggered by receiving a fragmented IPv6 MDNS packet in my instance, but
  it could probably be triggered by any fragmented IPv6 traffic.
  
  The reason the frag mem limit counter underflows is that
  nf_ct_frag6_reasm deducts more from it than the sum of all previous
  nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
  skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
  kmalloc_reserve allocating some additional slack space to the buffer.
  
  Removing this line:
  size = SKB_WITH_OVERHEAD(ksize(data));
  or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
- There is already a comment saying "It is not generally safe to change 
skb->truesize." right about the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
+ There is already a comment saying "It is not generally safe to change 
skb->truesize." right above the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765980

Title:
  IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock
  upon LXD container shutdown

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I've spent the last few days tracking down an issue where an attempt
  to shutdown an LXD container after several hours of host uptime on
  Ubuntu Bionic (4.15.0-15.16-generic) would cause a kworker thread to
  start spinning on one CPU core and all subsequent container start/stop
  operations to fail.

  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex
  and therefore blocks any further container start/stops. That in turn
  is triggered by receiving a fragmented IPv6 MDNS packet in my
  instance, but it could probably be triggered by any fragmented IPv6
  traffic.

  The reason the frag mem limit counter underflows is that
  nf_ct_frag6_reasm deducts more from it than the sum of all previous
  nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
  skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
  kmalloc_reserve allocating some additional slack space to the buffer.

  Removing this line:
  size = SKB_WITH_OVERHEAD(ksize(data));
  or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
  There is already a comment saying "It is not generally safe to change 
skb->truesize." right above the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765980/+subscriptions



[Kernel-packages] [Bug 1765980] Re: IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown

2018-04-21 Thread Michael Sparmann
The cause of the issue is already understood, and the machine currently isn't 
running an unmodified kernel for debugging reasons. Apport logs won't help here.
Contact me if you need specific information.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765980

Title:
  IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock
  upon LXD container shutdown

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I've spent the last few days tracking down an issue where an attempt
  to shutdown an LXD container after several hours of host uptime on
  Ubuntu Bionic (4.15.0-15.16-generic) would cause a kworker thread to
  start spinning on one CPU core and all subsequent container start/stop
  operations to fail.

  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex
  and therefore blocks any further container start/stops. That in turn
  is triggered by receiving a fragmented IPv6 MDNS packet in my
  instance, but it could probably be triggered by any fragmented IPv6
  traffic.

  The reason the frag mem limit counter underflows is that
  nf_ct_frag6_reasm deducts more from it than the sum of all previous
  nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
  skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
  kmalloc_reserve allocating some additional slack space to the buffer.

  Removing this line:
  size = SKB_WITH_OVERHEAD(ksize(data));
  or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
  There is already a comment saying "It is not generally safe to change 
skb->truesize." right above the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765980/+subscriptions



[Kernel-packages] [Bug 1765998] [NEW] FS access deadlock with btrfs quotas enabled

2018-04-21 Thread Michael Sparmann
Public bug reported:

I'm running into an issue on Ubuntu Bionic (but not Xenial) where
shortly after boot, under heavy load from many LXD containers starting
at once, access to the btrfs filesystem that the containers are on
deadlocks.

The issue is quite hard to reproduce on other systems, quite likely
related to the size of the filesystem involved (4 devices with a total
of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
and the access pattern from many LXD containers at once. It definitely
goes away when disabling btrfs quotas though. Another prerequisite to
trigger this bug may be the container subvolumes sharing extents (from
their parent image or due to deduplication).

I can only reliably reproduce it on a production system that I can only do very 
limited testing on; however, I have been able to gather the following 
information:
- Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
- There always seem to be (at least) two threads executing rmdir syscalls which 
are creating the circular dependency: One of them is in btrfs_cow_block => ... 
=> btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants to 
acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir (see the sketch below).
- Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/
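
To make the shape of that circular dependency easier to see, here is a minimal
userspace sketch (plain pthreads; this is only an analogy, not btrfs code, and
the names are just labels for the two tree-root locks involved): each "rmdir"
thread holds the lock of one tree root, as btrfs_search_slot does, and then
tries to take the other root's lock from inside the qgroup/backref walk.

/* deadlock_demo.c - userspace analogy of the reported AB-BA deadlock.
 * Compile with: gcc -pthread deadlock_demo.c -o deadlock_demo
 * Running it hangs, which is the point of the demonstration. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t root_a = PTHREAD_MUTEX_INITIALIZER; /* "tree root A" */
static pthread_mutex_t root_b = PTHREAD_MUTEX_INITIALIZER; /* "tree root B" */

static void *rmdir_in_a(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&root_a);   /* like btrfs_search_slot locking its root  */
    sleep(1);                      /* widen the race window                    */
    puts("rmdir A: holding root A, waiting for root B (backref walk)");
    pthread_mutex_lock(&root_b);   /* like the qgroup path needing the other   */
    pthread_mutex_unlock(&root_b);
    pthread_mutex_unlock(&root_a);
    return NULL;
}

static void *rmdir_in_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&root_b);   /* same pattern, opposite lock order        */
    sleep(1);
    puts("rmdir B: holding root B, waiting for root A (backref walk)");
    pthread_mutex_lock(&root_a);
    pthread_mutex_unlock(&root_a);
    pthread_mutex_unlock(&root_b);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, rmdir_in_a, NULL);
    pthread_create(&b, NULL, rmdir_in_b, NULL);
    pthread_join(a, NULL);         /* never returns: both threads are stuck    */
    pthread_join(b, NULL);
    return 0;
}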

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765998

Title:
  FS access deadlock with btrfs quotas enabled

Status in linux package in Ubuntu:
  New

Bug description:
  I'm running into an issue on Ubuntu Bionic (but not Xenial) where
  shortly after boot, under heavy load from many LXD containers starting
  at once, access to the btrfs filesystem that the containers are on
  deadlocks.

  The issue is quite hard to reproduce on other systems, quite likely
  related to the size of the filesystem involved (4 devices with a total
  of 8TB, millions of files, ~20 subvolumes with tens of snapshots each)
  and the access pattern from many LXD containers at once. It definitely
  goes away when disabling btrfs quotas though. Another prerequisite to
  trigger this bug may be the container subvolumes sharing extents (from
  their parent image or due to deduplication).

  I can only reliably reproduce it on a production system that I can only do 
very limited testing on; however, I have been able to gather the following 
information:
  - Many threads are stuck, trying to acquire locks on various tree roots, which 
are never released by their current holders.
  - There always seem to be (at least) two threads executing rmdir syscalls 
which are creating the circular dependency: One of them is in btrfs_cow_block 
=> ... => btrfs_qgroup_trace_extent_post => ... => find_parent_nodes and wants 
to acquire a lock that was already acquired by btrfs_search_slot of the other 
rmdir.
  - Reverting this patch seems to prevent it from happening: 
https://patchwork.kernel.org/patch/9573267/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765998/+subscriptions



[Kernel-packages] [Bug 1765980] [NEW] IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock upon LXD container shutdown

2018-04-21 Thread Michael Sparmann
Public bug reported:

I've spent the last few days tracking down an issue where an attempt to
shutdown an LXD container after several hours of host uptime on Ubuntu
Bionic (4.15.0-15.16-generic) would cause a kworker thread to start
spinning on one CPU core and all subsequent container start/stop
operations to fail.

The underlying issue is that a kworker thread (executing cleanup_net)
spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
become zero, which never happens because it has underflowed to some
negative multiple of 64. That kworker thread keeps holding net_mutex and
therefore blocks any further container start/stops. That in turn is
triggered by receiving a fragmented IPv6 MDNS packet in my instance, but
it could probably be triggered by any fragmented IPv6 traffic.

The reason the frag mem limit counter underflows is that
nf_ct_frag6_reasm deducts more from it than the sum of all previous
nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
kmalloc_reserve allocating some additional slack space to the buffer.

Removing this line:
size = SKB_WITH_OVERHEAD(ksize(data));
or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
There is already a comment saying "It is not generally safe to change 
skb->truesize." right about the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)
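
For illustration, here is a small userspace model of that accounting mismatch
(the numbers and the add/sub helpers are made up for the example; this is not
kernel code): the counter gains each fragment's truesize at queue time, but the
truesize subtracted at reassembly has grown by the slab slack that
pskb_expand_head folds in via size = SKB_WITH_OVERHEAD(ksize(data)), so the sum
ends up negative and inet_frags_exit_net spins forever waiting for it to reach
zero.

/* frag_underflow_model.c - userspace model of the counter underflow. */
#include <stdio.h>

static long frag_mem = 0;                  /* models sum_frag_mem_limit(nf)     */

static void add_frag_mem(long truesize) { frag_mem += truesize; }
static void sub_frag_mem(long truesize) { frag_mem -= truesize; }

int main(void)
{
    long truesize_at_queue = 768;          /* what nf_ct_frag6_queue accounted  */
    add_frag_mem(truesize_at_queue);

    /* During reassembly, skb_unclone -> pskb_expand_head reallocates the head
     * and re-derives size from ksize() of the new buffer, so slab rounding
     * adds some multiple of 64 to truesize before it is subtracted again. */
    long slab_slack = 64;                  /* hypothetical kmalloc rounding     */
    long truesize_at_reasm = truesize_at_queue + slab_slack;
    sub_frag_mem(truesize_at_reasm);

    printf("frag_mem = %ld\n", frag_mem);  /* -64: never reaches zero, so the
                                              cleanup loop never terminates     */
    return 0;
}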

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765980

Title:
  IPv6 fragments with nf_conntrack_reasm loaded cause net_mutex deadlock
  upon LXD container shutdown

Status in linux package in Ubuntu:
  New

Bug description:
  I've spent the last few days tracking down an issue where an attempt
  to shutdown an LXD container after several hours of host uptime on
  Ubuntu Bionic (4.15.0-15.16-generic) would cause a kworker thread to
  start spinning on one CPU core and all subsequent container start/stop
  operations to fail.

  The underlying issue is that a kworker thread (executing cleanup_net)
  spins in inet_frags_exit_net, waiting for sum_frag_mem_limit(nf) to
  become zero, which never happens because it has underflowed to some
  negative multiple of 64. That kworker thread keeps holding net_mutex
  and therefore blocks any further container start/stops. That in turn
  is triggered by receiving a fragmented IPv6 MDNS packet in my
  instance, but it could probably be triggered by any fragmented IPv6
  traffic.

  The reason the frag mem limit counter underflows is that
  nf_ct_frag6_reasm deducts more from it than the sum of all previous
  nf_ct_frag6_queue calls added, due to pskb_expand_head (called through
  skb_unclone) adding a multiple of 64 to the SKB's truesize, due to
  kmalloc_reserve allocating some additional slack space to the buffer.

  Removing this line:
  size = SKB_WITH_OVERHEAD(ksize(data));
  or making it conditional on nhead or ntail being nonzero works around the 
issue, but a proper fix for this seems complicated.
  There is already a comment saying "It is not generally safe to change 
skb->truesize." right about the offending modification of truesize, but the if 
statement guarding it apparently doesn't keep out all problematic cases.
  I'll leave figuring out the proper way to fix this to the maintainers of this 
area... ;)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765980/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp