On Wed, 27 May 2026 16:21:50 -0300 Ricardo Robaina wrote:
> When auditd is bottlenecked (e.g., by slow disk I/O), kauditd blocks
> on the netlink socket wait queue (nlk->wait). If the wait timeout
> fully expires (timeo == 0), netlink mistakenly interprets the zeroed
> timeout as a non-blocking request on its next retry iteration. It
> then triggers netlink_overrun that drops the event and poisons the
> socket with ENOBUFS. This bypasses the audit subsystem's internal
> retry backlog and falsely returns an error to user-space:
>
>   auditd[]: Error receiving audit netlink packet (No buffer space available)
>
> Unlike standard netlink users, the audit subsystem has a hard
> requirement to never silently drop security records. It uses a short
> finite socket timeout (sk_sndtimeo = HZ/10) to escape a stalled
> auditd and safely requeue the message internally. However, once
> netlink_overrun() executes, the ENOBUFS state is set on the
> receiving socket, and the audit subsystem has no mechanism to
> intercept or clear this from the outside.
This provides no improvement over v2; let's keep the discussion on the v2 thread.

