On Fri May 29, 2026 at 7:57 AM BST, Onur Özkan wrote:
>> >> > +#[pinned_drop]
>> >> > +impl PinnedDrop for Srcu {
>> >> > +    fn drop(self: Pin<&mut Self>) {
>> >> > +        let ptr = self.inner.get();
>> >> > +
>> >> > +        // SAFETY: By the type invariants, `self` contains a valid and pinned `struct srcu_struct`
>> >> > +        // and `srcu_readers_active()` only checks the active reader count.
>> >> > +        if unsafe { bindings::srcu_readers_active(ptr) } {
>> >> > +            crate::pr_warn!(
>> >> > +                "Leaked `Guard` detected while dropping SRCU; drop will block forever.\n"
>> >> > +            );
>> 
>> I think this could be a `warn_on`, similar to how cleanup_srcu_struct handles
>> the condition.
>
> We also call cleanup_srcu_struct below. The idea was to provide additional
> information; we don't need to call warn_on twice.

If the code blocks on `synchronize_srcu` then there's no call to
`cleanup_srcu_struct`.
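
To make the control flow concrete, here is a minimal userspace sketch. It is
only a model, not the kernel code: `Step` and `drop_path` are hypothetical
names, `active_readers` stands in for what `srcu_readers_active()` would
observe, and it assumes that every reader still active at drop time is a
leaked guard that will never unlock.

```rust
/// Steps of the modelled `drop` path, in the order they are reached.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Step {
    Warn,
    SynchronizeBlocksForever,
    SynchronizeReturns,
    Cleanup,
}

/// Hypothetical model of the `drop` in the quoted patch.
/// Assumption: any active reader is a leaked guard that never unlocks.
fn drop_path(active_readers: usize) -> Vec<Step> {
    let mut steps = Vec::new();
    if active_readers > 0 {
        // The extra warning from the patch fires here.
        steps.push(Step::Warn);
        // The grace period can never end, so nothing after this point runs,
        // including the cleanup call and whatever warning it would emit.
        steps.push(Step::SynchronizeBlocksForever);
        return steps;
    }
    // No readers: the grace period ends and cleanup proceeds normally.
    steps.push(Step::SynchronizeReturns);
    steps.push(Step::Cleanup);
    steps
}
```

Under those assumptions the warning path and the cleanup path are mutually
exclusive, so only one warning can ever be emitted for a given drop.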

>
>> 
>> >> > +        }
>> >> > +
>> >> > +        // `cleanup_srcu_struct()` may return early if readers are still active. Because `Srcu`
>> >> > +        // owns the embedded `srcu_struct`, returning from `drop` in that state could free memory
>> >> > +        // that is still referenced by the C side.
>> >> > +        //
>> >> > +        // Wait for all readers to complete first. If any `Guard` was leaked, `synchronize_srcu()`
>> >> > +        // will sleep forever.
>> >> > +        //
>> >> > +        // SAFETY: By the type invariants, `self` contains a valid and pinned `struct srcu_struct`.
>> >> > +        unsafe { bindings::synchronize_srcu(ptr) };
>> >> 
>> >> Sashiko has a good point here: call synchronize_srcu() only if there are
>> >> active readers. That's a nice low-effort improvement we can have in the
>> >> next version.
>> >> 
>> >> Onur
>> >
>> > Actually, I am now thinking about whether we can come up with a better
>> > approach when we detect leaked guards. Initially I came up with the
>> > synchronize_srcu() solution because it would handle leaked guards
>> > automatically without requiring any additional checks. But now that we can
>> > actually detect whether guards are leaked, the question becomes:
>> >
>> >    "Is there a better option than effectively sleeping forever when leaked
>> >     guards are detected?"
>> >
>> > I have no plans for tomorrow other than finalizing this series, including
>> > the question above.
>> 
>> The best solution is to proceed with the cleanup anyway, given that Rust's
>> rules ensure that these are actual leaks and not just SRCU read-side critical
>> sections that failed to synchronize with the destruction of the SRCU.
>> 
>> This obviously requires changes to the SRCU code though.
>
>
> The issue is difficult to fix purely from the C side. Once drop returns, Rust
> is free to destroy the srcu_struct. If SRCU still has a pending callback
> associated with that srcu_struct, for example from a future call_srcu()
> wrapper, then returning from drop while readers are active can turn into a
> UAF. There is also no reasonable way to handle callbacks in the cleanup logic
> while there are active readers.

Callbacks should be flushed during the drop due to srcu_barrier. Am I missing
something?

I'm pretty sure that, if we disregard potential misuse from the C side, removing
all "leak it" paths would be fine and won't lead to a UAF if all users are on
the Rust side.

To be very clear, I am not advocating actually implementing it this way. I agree
with your conclusion below that this is broken code and that a warning plus
blocking is good enough. These are really just my thoughts on your "is there a
better option" question: I think it would be better in an ideal world, but
blocking is a good pragmatic choice.
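
For reference, the "active readers at drop time must be leaks" argument can be
illustrated with a small userspace model. This is only a sketch: `ModelSrcu`,
`ModelGuard` and `leak_one_guard` are hypothetical names, a plain counter
stands in for the SRCU reader state, and the model assumes the guard borrows
the SRCU object it was created from.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical stand-in for `Srcu`: only tracks the number of active readers.
struct ModelSrcu {
    readers: Cell<usize>,
}

/// Hypothetical stand-in for the read-side `Guard`. Because it borrows the
/// `ModelSrcu`, the borrow checker rejects dropping the `ModelSrcu` while a
/// live guard still exists.
struct ModelGuard<'a> {
    srcu: &'a ModelSrcu,
}

impl ModelSrcu {
    fn new() -> Self {
        ModelSrcu { readers: Cell::new(0) }
    }

    /// Enter a read-side critical section (models the read lock).
    fn read_lock(&self) -> ModelGuard<'_> {
        self.readers.set(self.readers.get() + 1);
        ModelGuard { srcu: self }
    }

    /// Models the reader check done at the start of `drop`.
    fn readers_active(&self) -> bool {
        self.readers.get() > 0
    }
}

impl Drop for ModelGuard<'_> {
    /// Leave the read-side critical section (models the read unlock).
    fn drop(&mut self) {
        self.srcu.readers.set(self.srcu.readers.get() - 1);
    }
}

/// Leaks a guard: `mem::forget` skips `Drop`, so the count is never decremented.
fn leak_one_guard(srcu: &ModelSrcu) {
    mem::forget(srcu.read_lock());
}
```

In this model, a guard that goes out of scope normally brings the count back to
zero, while a forgotten guard leaves `readers_active()` true for the rest of
the object's life. Since a live guard keeps the object borrowed, a non-zero
count observed when the object is finally dropped can only come from a guard
that was leaked.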

Best,
Gary

>
> I mean, in theory this could be fixed in the C code, but that would require
> rewriting the SRCU core/semantics for this special case. The $clean_something
> helper would need to know that the active readers are abandoned and will never
> unlock, and it would also need to decide what to do with the pending
> callbacks, which is also a big problem (as the GP will never complete, the
> callbacks will never run).
>
> It's also worth noting that calling mem::forget on the SRCU guard is WRONG
> CODE and very easy to catch in review (by us and also Sashiko/any LLM). So
> finding a solution that doesn't add too much complexity should be a key
> consideration here. With that in mind, keeping the synchronize_srcu() is not
> really a bad solution. Sleeping forever is a bad failure mode, but it is
> better than a potential UAF, and either case requires sending a fix patch for
> the leaked guard anyway.

