On Wed, 13 May 2026 14:24:43 -0300 Ricardo Robaina wrote:
> When auditd is bottlenecked (e.g., by slow disk I/O), kauditd blocks on
> the netlink socket.

Holding socket lock during slow IO sounds very wrong.
One could say - that's abuse of the socket lock?

> If the wait timeout fully expires (timeo == 0),
> netlink mistakenly interprets the zeroed timeout as a non-blocking
> request. It then triggers netlink_overrun that drops the event,
> completely bypassing the audit subsystem's internal retry queue, and
> falsely returns ENOBUFS to user-space, resulting in the following error:
>
> auditd[]: Error receiving audit netlink packet (No buffer space available)
>
> Fix this by detecting when a blocking sender's timeout has expired
> (timeo == 0 && !nonblock) in netlink_unicast(). In this case, instead
> of retrying with timeo=0 (which would incorrectly trigger netlink_overrun
> on the next iteration), safely free the skb and return -EAGAIN, allowing
> the audit subsystem to gracefully enqueue the pending event into its
> internal backlog.

The socket _is_ the queue, normally.
Please explore fixing this in audit?
--
pw-bot: cr

