On Mon, May 18, 2026 at 8:04 AM Simon Horman <[email protected]> wrote:
>
> From: 'Simon Horman' <[email protected]>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> netlink, audit: prevent false ENOBUFS on timeout expiry
>
> This commit modifies netlink_unicast() to return -EAGAIN when a blocking
> sender's timeout expires. This bypasses the netlink_overrun() call to
> allow the audit subsystem to gracefully enqueue the pending event into its
> internal backlog without falsely returning ENOBUFS to user-space.
>
> > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> > --- a/net/netlink/af_netlink.c
> > +++ b/net/netlink/af_netlink.c
> > @@ -1351,8 +1351,18 @@ int netlink_unicast(struct sock *ssk, struct sk_buff
> > *skb,
> >       }
> >
> >       err = netlink_attachskb(sk, skb, &timeo, ssk);
> > -     if (err == 1)
> > +     if (err == 1) {
> > +             /* timeo may have been zeroed by schedule_timeout inside
> > +              * netlink_attachskb. If the caller is a timed-blocking sender
> > +              * (not genuinely nonblocking), don't re-enter with timeo=0 as
> > +              * that would misfire netlink_overrun on the next iteration.
> > +              */
> > +             if (timeo == 0 && !nonblock) {
> > +                     kfree_skb(skb);
> > +                     return -EAGAIN;
>
> Could this regression cause silent packet drops in non-audit subsystems?
>
> While this change prevents false ENOBUFS for the audit subsystem, other
> subsystems like InfiniBand's IWPM use rdma_nl_unicast_wait(), which sets
> nonblock=0 and operates on a kernel socket with a timeout.
>
> If the timeout expires because the receiving user-space daemon is
> bottlenecked, this new path will free the skb and return -EAGAIN without
> ever invoking netlink_overrun().
>
> Without netlink_overrun(), the receiver's socket error is never set to
> ENOBUFS. Would this prevent user-space from being notified of the dropped
> message, leading to permanent state desynchronization between the kernel
> and user-space?
>
> > +             }
> >               goto retry;
> > +     }
> >       if (err)
> >               return err;
>
Hi Simon,

Thanks for reviewing this patch!

You are correct that the timeo == 0 && !nonblock heuristic in v2 relies
on an implicit assumption about finite sk_sndtimeo. While RDMA/IWPM with
MAX_SCHEDULE_TIMEOUT would never reach this path in practice, your
concern correctly identifies that the heuristic is not surgical enough.

I've submitted v3 [1] with an explicit NETLINK_UNICAST_TIMED constant
(value 2). Callers must explicitly opt into this contract, leaving IWPM
and all other subsystems completely untouched:

    if (timeo == 0 && nonblock == NETLINK_UNICAST_TIMED)

This ensures zero risk of silent drops or state desynchronization in
other subsystems.

Does this address your concern?

[1] https://lore.kernel.org/audit/[email protected]/T/#u

Best regards,
Ricardo
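
For readers following the thread, the difference between the v2 heuristic
and the v3 opt-in check can be modelled in user space roughly as below.
This is only a sketch of the two conditions quoted in this thread, not the
kernel code: the enum and both function names are hypothetical, and it
assumes that nonblock values 0 and 1 keep their existing blocking and
nonblocking meanings while 2 is the new NETLINK_UNICAST_TIMED opt-in.

```c
/* Value taken from the v3 description in this thread. */
#define NETLINK_UNICAST_TIMED 2

/* Hypothetical labels for what happens after netlink_attachskb()
 * returns 1, i.e. after the sender has waited for receive-queue space. */
enum after_wait {
    RETRY_MAY_OVERRUN, /* goto retry; a later pass may call netlink_overrun() */
    BAIL_EAGAIN        /* free the skb and return -EAGAIN, no overrun */
};

/* v2 heuristic: every blocking sender whose timeout ran down to zero
 * takes the early-return path, whether or not it asked for it. */
enum after_wait v2_after_wait(long timeo, int nonblock)
{
    if (timeo == 0 && !nonblock)
        return BAIL_EAGAIN;
    return RETRY_MAY_OVERRUN;
}

/* v3 check: only callers that passed NETLINK_UNICAST_TIMED take the
 * early-return path; plain blocking (0) and nonblocking (1) callers
 * keep the retry path they had before the patch. */
enum after_wait v3_after_wait(long timeo, int nonblock)
{
    if (timeo == 0 && nonblock == NETLINK_UNICAST_TIMED)
        return BAIL_EAGAIN;
    return RETRY_MAY_OVERRUN;
}
```

In this model, a blocking caller with nonblock=0 whose finite timeout has
expired gets BAIL_EAGAIN from v2_after_wait() but RETRY_MAY_OVERRUN from
v3_after_wait(), which is the case the review asked about. A caller that
passes NETLINK_UNICAST_TIMED gets BAIL_EAGAIN from v3_after_wait() once
its timeout reaches zero.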

