Public bug reported:
== Summary ==
automount leaks Unix socketpairs per mount helper invocation. The parent
daemon does not close its end of the socketpair after the mount helper
subprocess exits. On active systems this causes steady fd accumulation
over days to weeks, eventually exhausting the per-process fd limit
(default 20,480 on Ubuntu 24.04), at which point autofs stops servicing
all NFS mount requests.
== Environment ==
Fleet of 500+ Ubuntu 24.04 and SLES compute hosts running a batch job
scheduler. Autofs uses SSSD as the automount backend:
$ grep ^automount /etc/nsswitch.conf
automount: files sss
$ grep autofs_provider /etc/sssd/sssd.conf
autofs_provider = ldap
Automount maps are LDAP-backed. The fd leak is observed regardless of
SSSD/LDAP activity (see "What was ruled out" below).
== Symptoms ==
- ls -la /proc/$(pidof automount)/fd | wc -l grows monotonically over time
- Rate: 0.45-10 fds/hour depending on NFS mount workload
- After exhaustion, autofs stops mounting NFS paths entirely
- systemctl restart autofs clears all accumulated fds immediately, confirming
fds are held by automount, not the kernel
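The growth rate quoted above can be measured from two timestamped samples
of the fd count. The following is a minimal sketch, not part of autofs;
count_fds and leak_rate_per_hour are hypothetical helper names. It lists
/proc/<pid>/fd directly, which avoids the extra "total", "." and ".."
lines that ls -la | wc -l includes in its count.

```python
import os


def count_fds(pid):
    """Return the number of entries in /proc/<pid>/fd (one per open fd)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def leak_rate_per_hour(first, second):
    """Compute fds/hour from two (unix_time_seconds, fd_count) samples."""
    (t0, n0), (t1, n1) = first, second
    if t1 <= t0:
        raise ValueError("second sample must be later than the first")
    return (n1 - n0) * 3600.0 / (t1 - t0)
```

Reading another user's /proc/<pid>/fd normally requires root, so the
sampler would have to run with the same privileges as the lsof commands
in this report.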
== Evidence ==
--- 1. lsof on affected host (202-day uptime, no restarts) ---
Package: 5.1.9-1ubuntu4
Kernel: 6.14.0-33-generic
$ lsof -p $(pidof automount) | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn
2082 unix
57 REG
51 FIFO
27 DIR
4 CHR
Total fds: 2165 / 20480 (10%)
Unix sockets: 2082 of 2165 (96%) — all anonymous CONNECTED, no bound path.
Fresh-restart baseline: 80 fds. The 2,082 are the accumulated leak.
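On hosts without lsof, a similar breakdown can be approximated from the
symlink targets in /proc/<pid>/fd. This is a sketch under assumptions:
classify and fd_breakdown are hypothetical names, and the "socket" bucket
covers every socket family, not only Unix sockets, so it is coarser than
the lsof TYPE column.

```python
import os
from collections import Counter


def classify(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:["):
        return "socket"
    if target.startswith("pipe:["):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    return "path"  # regular files, directories, character devices


def fd_breakdown(pid):
    """Count the open fds of a process by category."""
    counts = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        counts[classify(target)] += 1
    return counts
```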
--- 2. Dead-peer verification ---
Sampled 20 socket inodes from automount's fd table:
for inode in <sample>; do grep " $inode " /proc/net/unix; done
All 20 inodes absent from /proc/net/unix.
In Linux, both ends of a connected Unix socketpair appear in
/proc/net/unix while both are open. Absence proves the peer (mount
helper) has exited and closed its end. Automount holds 2,082 orphaned
socket ends with dead peers.
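The lookup described above can be scripted so that it matches on the
Inode column only, rather than on any field that happens to contain the
number. This is a minimal sketch; all function names are hypothetical,
and it assumes the usual /proc/net/unix layout in which Inode is the
seventh whitespace-separated field.

```python
import os
import re


def socket_inode(link_target):
    """Extract the inode from a 'socket:[N]' fd symlink target, else None."""
    match = re.fullmatch(r"socket:\[(\d+)\]", link_target)
    return int(match.group(1)) if match else None


def socket_inodes_of(pid):
    """Collect the inodes of all sockets held open by a process."""
    inodes = []
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            inode = socket_inode(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            continue  # fd disappeared while scanning
        if inode is not None:
            inodes.append(inode)
    return inodes


def unix_inodes(table_text):
    """Parse the text of /proc/net/unix into a set of inode numbers."""
    inodes = set()
    for line in table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 7:
            inodes.add(int(fields[6]))
    return inodes


def missing_from_table(sample, table_text):
    """Return the sampled inodes that do not appear in the table."""
    present = unix_inodes(table_text)
    return [inode for inode in sample if inode not in present]
```

The table is per network namespace, so it should be read from the
namespace automount runs in, for example via /proc/<pid>/net/unix.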
--- 3. strace — mount helper call chain ---
Captured with: strace -ff -yy -e trace=socketpair,close,clone,fork,execve,exit_group
Each mount request triggers a 3-process chain:
automount dispatch thread (PID A)
  close(4<UNIX-STREAM:[inode1]>) = 0 <- closes inherited socket pair ends
  close(4<UNIX-STREAM:[inode2]>) = 0
  close(4<UNIX-STREAM:[inode3]>) = 0
  close(4<UNIX-STREAM:[inode4]>) = 0
  clone() = PID B (/bin/mount)
    clone() = PID C (/sbin/mount.nfs)
      close(3<UNIX-STREAM:[inode5]>) = 0 <- grandchild closes its ends
      close(3<UNIX-STREAM:[inode6]>) = 0
      exit_group(0)
Representative dispatch thread trace:
22:58:42.556386 close(4<UNIX-STREAM:[92517884]>) = 0
22:58:42.556594 close(4<UNIX-STREAM:[92517885]>) = 0
22:58:42.556811 close(4<UNIX-STREAM:[92517886]>) = 0
22:58:42.557013 close(4<UNIX-STREAM:[92517887]>) = 0
22:58:42.642343 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, ...) = PID_B
22:58:42.705608 +++ exited with 0 +++
Representative /sbin/mount.nfs trace:
22:58:42.649905 execve("/sbin/mount.nfs", ["/sbin/mount.nfs", "nas-server:/exports/home/user1", "/home/user1", "-o", "rw"], ...) = 0
22:58:42.654225 close(3<UNIX-STREAM:[92499747]>) = 0
22:58:42.654348 close(3<UNIX-STREAM:[92499748]>) = 0
22:58:42.659118 close(3<UNIX-STREAM:[92523559]>) = 0
22:58:42.659341 close(3<UNIX-STREAM:[92523560]>) = 0
22:58:42.703484 exit_group(0) = ?
The dispatch thread closes four inherited socket ends at startup. These
belong to pairs created by the parent before the fork, and no
corresponding close() for those inodes appears in the parent trace. The
socketpair() call occurs in an automount pthread and could not be
captured directly because of limits on tracing threads that share a PID,
but the inherited-and-closed pattern in the child indicates that the
parent created the pairs pre-fork and retains its ends after the helper
exits.
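The general pattern the trace points to can be reproduced outside
autofs. The sketch below is not autofs code and makes no claim about
where in the daemon the pairs are created; run_helper and open_fd_count
are hypothetical names. A parent creates a socketpair, forks a child that
closes its inherited copies and exits, and then either closes or keeps
its own end.

```python
import os
import socket


def open_fd_count():
    """Count this process's open fds via /proc/self/fd."""
    return len(os.listdir("/proc/self/fd"))


def run_helper(close_parent_end):
    """Fork a short-lived child around a socketpair.

    Returns the parent's socket if it was left open (the leak case),
    otherwise None.
    """
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: close both inherited copies and exit, like the helper
        # processes in the trace.
        parent_end.close()
        child_end.close()
        os._exit(0)
    # Parent: the child's end is never needed here.
    child_end.close()
    os.waitpid(pid, 0)
    if close_parent_end:
        parent_end.close()  # the missing step in the leak case
        return None
    return parent_end  # still open: one fd retained per invocation
```

Calling run_helper(False) repeatedly and keeping the returned sockets
raises open_fd_count() by one per call, while run_helper(True) leaves it
unchanged.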
--- 4. Fleet-wide impact ---
Scan of 500+ Ubuntu 24.04 + SLES hosts:
CRIT (>=90%): 0
WARN (>=75%): 1 (hostB: 85%, 17510/20480 fds, 75-day uptime)
Elevated: 6 (22%-47%)
7 hosts required manual autofs restart over a 2-day period.
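The percentages in the scan are open fds divided by the process fd
limit. A minimal sketch of that calculation follows, using the CRIT and
WARN thresholds listed above; the function names are hypothetical, and
the cutoff for "Elevated" is not stated in this report, so it is not
modeled. The limit is read from the "Max open files" row of
/proc/<pid>/limits rather than hard-coded.

```python
def soft_nofile_limit(limits_text):
    """Return the soft 'Max open files' value from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: "Max", "open", "files", soft, hard, units
            return int(line.split()[3])
    raise ValueError("'Max open files' row not found")


def fd_usage_percent(open_fds, limit):
    """Percentage of the fd limit currently in use."""
    return 100.0 * open_fds / limit


def severity(percent):
    """Classify usage with the thresholds used in the fleet scan."""
    if percent >= 90:
        return "CRIT"
    if percent >= 75:
        return "WARN"
    return "below WARN"
```

For hostB, severity(fd_usage_percent(17510, 20480)) evaluates to "WARN".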
== What was ruled out ==
Package 5.1.9-1ubuntu4.1: changelog (LP: #2074003) confirms this release
fixes only a Kerberos ticket renewal bug in modules/cyrus-sasl.c — no
fd-related changes. Both versions accumulate at identical rates (~1
fd/hr at equivalent workload). The apparent difference in fleet scans is
explained entirely by uptime and restart history.
Kerberos/LDAP reconnect storms: blocked LDAP ports 636/389 via iptables
on two hosts for 20 minutes while monitoring fds. Zero accumulation
observed. The SSSD/LDAP path is not involved.
== Workaround ==
systemctl restart autofs
Resets to ~80 fds. Must be repeated periodically on active hosts.
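How often the restart has to be repeated depends on the remaining
headroom and the leak rate. The sketch below is a rough estimate that
assumes a constant rate; hours_until_exhaustion is a hypothetical helper,
not an autofs tool.

```python
def hours_until_exhaustion(current_fds, limit, rate_per_hour):
    """Estimate hours until the fd limit is reached at a constant leak rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    remaining = max(limit - current_fds, 0)
    return remaining / rate_per_hour
```

For example, starting from the ~80 fd baseline with the 20480 limit and
the upper reported rate of 10 fds/hour,
hours_until_exhaustion(80, 20480, 10) returns 2040.0, i.e. 85 days.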
== Suggested fix area ==
The parent should close its end of the socketpair after the mount helper
exits (or after dispatching). Likely location: mount subprocess dispatch
path in daemon/spawn.c, daemon/direct.c, or daemon/indirect.c.
== Version ==
autofs 5.1.9-1ubuntu4 (Ubuntu 24.04 Noble)
Also reproduced: SUSE Linux Enterprise 15 SP6, autofs 5.1.9-150600.1.4
Kernel: 6.14.0-33-generic
** Affects: autofs (Ubuntu)
Importance: Undecided
Status: New
** Tags: autofs fd-leak nfs noble
https://bugs.launchpad.net/bugs/2152277
Title:
automount leaks Unix socketpairs per mount helper invocation — fd
exhaustion after days/weeks of uptime