We used BPF/BCC to trace all refcount changes on all of the relevant mounts, and for each (struct mount*, stack) pair kept track of the number of references acquired and released by that pair. Including the struct mount* in the map key made the map quite large, but it allowed us to pare the output down to a manageable size when we hit this issue, by filtering out all of the refcount changes except those related to the one affected filesystem.
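The bookkeeping described above can be sketched in plain Python. This is not the BPF program that was used, only a user-space model of the same accounting, assuming each traced event carries a mount pointer, a stack identifier, and a signed refcount delta. The class name RefTracker, the addresses, and the stack labels are all hypothetical.

```python
from collections import defaultdict


class RefTracker:
    """Per-(mount, stack) reference accounting; a hypothetical model only."""

    def __init__(self):
        # Key: (mount pointer, stack id). Value: [acquired, released].
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, mnt, stack, delta):
        """Account one refcount change: delta > 0 acquires, delta < 0 releases."""
        entry = self.counts[(mnt, stack)]
        if delta > 0:
            entry[0] += delta
        else:
            entry[1] += -delta

    def for_mount(self, mnt):
        """Keep only the entries for one mount: {stack: (acquired, released)}."""
        return {
            stack: (acq, rel)
            for (m, stack), (acq, rel) in self.counts.items()
            if m == mnt
        }

    def net(self, mnt):
        """References on this mount that were acquired but not yet released."""
        return sum(acq - rel for acq, rel in self.for_mount(mnt).values())


# Hypothetical events: (mount pointer, stack label, delta).
events = [
    (0x1000, "stack_a", +1),
    (0x1000, "stack_b", -1),
    (0x1000, "stack_c", +1),  # acquired here and never released
    (0x2000, "stack_a", +1),
    (0x2000, "stack_b", -1),
]

tracker = RefTracker()
for mnt, stack, delta in events:
    tracker.record(mnt, stack, delta)

# Filter down to the one affected mount and show what each stack did.
for stack, (acq, rel) in sorted(tracker.for_mount(0x1000).items()):
    print(f"{stack}: acquired={acq} released={rel}")
print("net references held:", tracker.net(0x1000))
```

Keying on the mount pointer as well as the stack multiplies the number of map entries, which is the size cost mentioned above, but it is what makes the per-filesystem filter in for_mount possible.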
Looking at all the reference count changes for a particular filesystem affected by this issue, we can see that the extra reference, which is acquired but never released, is taken by code related to the NFSD export cache. In particular, the cache maintains a queue of requests which rpc.mountd is supposed to read and process. After processing a request, rpc.mountd is supposed to issue a downcall with the info requested by the kernel, at which point the item can be removed from the queue and can be flushed when it expires, releasing the reference the item holds on the mount. rpc.mountd is waiting for requests, as it should be, but the kernel is not notifying mountd that there are requests in the queue. We have more investigation to do to understand how the kernel got into this state.

https://bugs.launchpad.net/bugs/1832384
Title: Unable to unmount apparently unused filesystem