We used BPF/BCC to trace all refcount changes on all of the relevant
mounts, and for each (struct mount*, stack) pair, kept track of the
number of references acquired/released by that pair. Including the
struct mount* in the key in our map made the map quite large, but
allowed us to pare the output down to a manageable size when we hit this
issue by filtering out all of the refcount changes except those related
to the one affected filesystem.
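
For readers who want to see the shape of that bookkeeping, here is a
minimal userspace sketch in Python. It is not the BPF program we ran; it
only models the map described above. The event tuples, the mount
addresses, and the stack frame names are all made up for illustration.

```python
from collections import defaultdict

def tally(events):
    """Count acquires/releases per (mount, stack) key.

    events: iterable of (mount_addr, stack, delta), where stack is a tuple
    of frame names and delta is +1 for an acquire, -1 for a release.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [acquired, released]
    for mount, stack, delta in events:
        if delta > 0:
            counts[(mount, stack)][0] += delta
        else:
            counts[(mount, stack)][1] += -delta
    return counts

def report_for(counts, mount):
    """Keep only the entries for one mount, dropping everything else."""
    return {stack: tuple(c) for (m, stack), c in counts.items() if m == mount}

def net_refs(counts, mount):
    """Acquires minus releases across all stacks for one mount."""
    return sum(a - r for (m, _), (a, r) in counts.items() if m == mount)

# Hypothetical trace: two mounts, placeholder frame names.
events = [
    (0xA, ("open_frame",), +1),
    (0xA, ("close_frame",), -1),
    (0xA, ("cache_frame",), +1),   # acquired, never released
    (0xB, ("open_frame",), +1),
    (0xB, ("close_frame",), -1),
]

counts = tally(events)
affected = report_for(counts, 0xA)
for stack, (acq, rel) in affected.items():
    print(stack, "acquired:", acq, "released:", rel)
print("net references on mount 0xA:", net_refs(counts, 0xA))
```

Keying on the mount as well as the stack is what makes the final filter
possible: without it, the counts for the affected filesystem would be
mixed in with those of every other mount.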

Looking at all the reference count changes for a particular filesystem
affected by this issue, we can see that the extra reference which is
acquired but never released is being taken by code related to the NFSD
export cache. In particular, the cache maintains a queue of requests
which rpc.mountd is supposed to read and process. After processing a
request, rpc.mountd is supposed to issue a downcall with the info
requested by the kernel, at which point the item can be removed from the
queue, and can be flushed when it expires, releasing the reference the
item holds on the mount.

rpc.mountd is waiting for requests, as it should be, but the kernel is
not notifying mountd that there are requests in the queue. We have more
investigation to do to understand how the kernel got into this state.
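
The lifecycle described above can be sketched as a toy model. This is
not kernel code and does not reflect the real data structures; the class
and method names are hypothetical, and the only behaviour it encodes is
what is stated in this comment: an item pins the mount, waits in a queue
for a downcall, and drops its reference only when flushed after that.

```python
class ToyExportCache:
    """Toy model of a cache whose items each hold one mount reference."""

    def __init__(self):
        self.queue = []       # requests waiting to be read by the helper
        self.items = {}       # key -> "pending" or "valid"
        self.mount_refs = 0   # references held on the (single) mount

    def request(self, key):
        # Creating an item takes a reference and queues a request.
        self.mount_refs += 1
        self.items[key] = "pending"
        self.queue.append(key)

    def downcall(self, key):
        # The helper answered: the request leaves the queue.
        self.queue.remove(key)
        self.items[key] = "valid"

    def expire(self, key):
        # In this model, only an answered item can be flushed,
        # which is when its reference is released.
        if self.items.get(key) == "valid":
            del self.items[key]
            self.mount_refs -= 1

# Healthy path: the helper is notified, reads the request, and answers.
ok = ToyExportCache()
ok.request("export-1")
ok.downcall("export-1")
ok.expire("export-1")

# Stuck path: no notification reaches the helper, so no downcall happens.
stuck = ToyExportCache()
stuck.request("export-1")
stuck.expire("export-1")

print("healthy refs:", ok.mount_refs)
print("stuck refs:", stuck.mount_refs, "queued:", stuck.queue)
```

In the stuck case the request stays queued and the reference stays held,
which matches the symptom of a filesystem that cannot be unmounted.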

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832384/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
