Hello,

On Thursday, May 28, 2026 7:29:01 PM Eastern Daylight Time Jakub Kicinski wrote:
> On Thu, 28 May 2026 18:40:44 -0400 Steve Grubb wrote:
> > > > (3) A new NETLINK_F_RECV_NO_ENOBUFS socket flag doesn't exist in
> > > > stable kernels where this bug is actively impacting users
> > >
> > > Which commit are you referring to? Isn't that flag itself ancient?
> >
> > You're right, it is. I see how this flag would fix the pathological
> > behavior that was reported. But as I have looked at this suggestion,
> > there seems to be one wrinkle. User space should not need to know that
> > the audit code in the kernel has this retry mechanism.
>
> It's not about the retry mechanism, at least in my mind - I read
> your reply as "user space should not know that there was congestion".
> Why?
In the audit case, it is not useful. I know there can be an endless supply
and there's not much that can be done except dequeueing what's next.

> It's not very useful, I get that, but user space can just clear
> the congestion signal and keep going.

How? The recvfrom man page doesn't even discuss ENOBUFS, which is one of
the strongest arguments for a kernel-side patch. The fact that there exists
a socket option to declare that you do not want ENOBUFS on netlink sockets
is esoteric knowledge. The netlink(7) man page does cover the flag. But even
where it discusses ENOBUFS, it does not mention that this is preventable by
setting a socket option. I do appreciate this being pointed out. But getting
from the recvfrom man page to a solution is not obvious.

> > It seems like the audit subsystem should set the flag on auditd's
> > socket at registration time in auditd_set(). The kernel is the right
> > place for this because it's the kernel that manages the retry/hold
> > queues and sets the sk_sndtimeo that triggers the overrun path -
> > auditd has no knowledge of these internals.
>
> We have to carry this code somewhere, either in user space or in
> the kernel. I'd prefer not to carry it in the kernel.

I can put this in the audit daemon. But whoever else writes a similar app
will have to independently discover the same solution when faced with the
pathologically bad behavior. A kernel-side fix would have made it easier for
future app developers to be successful.

-Steve
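
For reference, a minimal sketch of what the user-space side of this
could look like. It is not taken from auditd; the helper name
disable_enobufs() is hypothetical. It assumes the socket option is
exposed to user space as NETLINK_NO_ENOBUFS at level SOL_NETLINK, as
netlink(7) describes, with NETLINK_F_RECV_NO_ENOBUFS being the
kernel-internal name for the same flag. The SOL_NETLINK fallback value
is an assumption for older libc headers that do not define it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Assumption: older libc headers may lack SOL_NETLINK; 270 is the kernel's value. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/*
 * Hypothetical helper: ask the kernel not to report ENOBUFS to readers
 * of this netlink socket.
 * Returns 0 on success, or -1 with errno set, exactly as setsockopt() does.
 */
int disable_enobufs(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_NETLINK, NETLINK_NO_ENOBUFS, &one, sizeof(one));
}
```

A daemon would presumably call this once on its netlink socket, right
after socket() and before it starts receiving, and treat a failure as
non-fatal since the socket still works without the option.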

