On Fri, 29 May 2026 13:41:10 +0100 Gary Guo <[email protected]> wrote:
> On Fri May 29, 2026 at 1:29 PM BST, Onur Özkan wrote: > > On Fri, 29 May 2026 13:07:18 +0100 > > Gary Guo <[email protected]> wrote: > > > >> On Fri May 29, 2026 at 7:57 AM BST, Onur Özkan wrote: > >> >> >> > +#[pinned_drop] > >> >> >> > +impl PinnedDrop for Srcu { > >> >> >> > + fn drop(self: Pin<&mut Self>) { > >> >> >> > + let ptr = self.inner.get(); > >> >> >> > + > >> >> >> > + // SAFETY: By the type invariants, `self` contains a > >> >> >> > valid and pinned `struct srcu_struct` > >> >> >> > + // and `srcu_readers_active()` only checks the active > >> >> >> > reader count. > >> >> >> > + if unsafe { bindings::srcu_readers_active(ptr) } { > >> >> >> > + crate::pr_warn!( > >> >> >> > + "Leaked `Guard` detected while dropping SRCU; > >> >> >> > drop will block forever.\n" > >> >> >> > + ); > >> >> > >> >> I think this could be a `warn_on` similar to how cleanup_srcu_struct > >> >> handle the > >> >> condition. > >> > > >> > We also call cleanup_srcu_struct below. The idea was to provide > >> > additional > >> > information, we don't need to call warn_on twice. > >> > >> If the code blocks on `synchronize_srcu` then there's no call to > >> `cleanup_srcu_struct`. > > > > > > Ah right. I can do that in this case but honestly it's still more > > informative > > with the current way. It explicitly tells you what the problem is. > > While the error message itself is not that informative (there's some potential > to improve this, see > https://lore.kernel.org/all/[email protected]/), the stack > trace produced by a `warn_on` would have all the information and is more > useful > to troubleshoot than a `pr_warn` which doesn't tell you which `Srcu` being > dropped is causing the issue. Cool, I will use it in the next version. Thanks! > > >> >> > > >> >> > Actually, now I am now thinking about whether we can come up with a > >> >> > better > >> >> > approach when we detect leaked guards. Initially I came up with the > >> >> > synchronize_srcu() solution because it would handle leaked guards > >> >> > automatically > >> >> > without requiring any additional checks. But now that we can actually > >> >> > detect > >> >> > whether guards are leaked the question becomes: > >> >> > > >> >> > "Is there a better option than effectively sleeping forever > >> >> > when leaked > >> >> > guards are detected?" > >> >> > > >> >> > I have no plans for tomorrow other than finalizing this series > >> >> > including the > >> >> > question above. > >> >> > >> >> The best solution is to proceed cleanups anyway, given Rust rules > >> >> ensure that > >> >> these are actual leaks and not just srcu read-side critical section > >> >> that failed > >> >> to synchronize with the destruction of SRCU. > >> >> > >> >> This obviously require changes to the SRCU code though. > >> > > >> > > >> > The issue is difficult to fix purely from the C side. Once drop returns > >> > Rust > >> > is free to destroy srcu_struct. If srcu still has pending callback > >> > associated > >> > with that srcu_struct, for example from a future call_srcu() wrapper then > >> > returning from drop while readers are active can turn into a UAF. There > >> > is also > >> > no way to handle callbacks in a reasonable way in cleanup logic while > >> > there are > >> > active readers. > >> > >> Callbacks should be flushed during the drop due to srcu_barrier. Am I > >> missing > >> something? > > > > No. Callbacks can only be invoked once the grace period has completed [1], > > which > > can never happen while there is an active reader. > > > > [1]: > > https://elixir.bootlin.com/linux/v7.1-rc5/source/kernel/rcu/srcutree.c#L1452-L1454 > > Well, then srcu_barrier will not return. When `srcu_barrier` returns all > in-flight SRCU callbacks must have been executed. Yes, or there weren't any callbacks posted. > > Best, > Gary > > > > >> > >> I'm pretty sure that, if we disregard potential misuses from C side, > >> removing > >> all "leak it" paths would be fine and won't leak to UAF if all users are > >> from > >> Rust side. > >> > >> To be very clear, I am not advocating to actually implement this way. I > >> agree > >> with your conclusion below that this is broken code and a warning + > >> blocking is > >> good enough. This is really just my thoughts on your "is there a better > >> option" > >> question, and I think it's better in ideal world, but I think blocking is a > >> good pragmatic choice. > > > > I see. Maybe I should have phrased the question like "Is there a better > > option > > with similar complexity" to be more clear.

