On Thu, Sep 18, 2025, Michael S. Tsirkin wrote:
> On Thu, Sep 18, 2025 at 09:52:19AM -0700, Sean Christopherson wrote:
> > On Thu, Sep 18, 2025, Michael S. Tsirkin wrote:
> > > On Thu, Sep 18, 2025 at 09:04:07AM -0700, Sean Christopherson wrote:
> > > > On Thu, Sep 18, 2025, Sebastian Andrzej Siewior wrote:
> > > > > On 2025-09-18 11:09:05 [-0400], Michael S. Tsirkin wrote:
> > > > > > So how about switching to this approach then?
> > > > > > Instead of piling up fixes like we seem to do now ...
> > > > 
> > > > I don't have a strong preference for 6.17, beyond landing a fix of some kind.
> > > > I think there are three options for 6.17, in order of "least likely to break
> > > > something":
> > > > 
> > > >  1. Sebastian's get_task_struct() fix
> > > 
> > > 
> > > I am just a bit apprehensive that we don't create a situation
> > > where we leak the task struct somehow, given the limited
> > > testing time. Can you help me get convinced that risk is 0?
> > 
> > I doubt it, I share similar concerns about lack of testing.  So I guess
> > thinking about this again, #2 is probably safer since it'd only impact KVM?
> 
> I can't say I understand completely how we get that state though?
> Why did the warning trigger if it's not a UAF?

It's purely a flaw in the sanity check itself due to the ordering in vhost_task_fn().

As is, vhost_task_fn() marks the task KILLED before invoking ->handle_sigkill(),
i.e. before vhost_worker_killed() is guaranteed to complete, and thus before
worker->killed is set.  As a result, vhost can keep waking workers that have
KILLED set, but haven't actually exited.  That's perfectly fine as UAF won't
occur until do_exit() is called, and that won't happen until ->handle_sigkill()
completes.
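
To make the window concrete, here is a rough single-threaded userspace model
of that ordering.  It is only a sketch, not the kernel code: every type and
function name below is hypothetical, and the "racing" wake is injected by hand
at the point where a concurrent waker could run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vhost task; not the kernel's struct. */
struct model_task {
	bool killed_flag;	/* models the KILLED flag on the task */
	bool exited;		/* models do_exit() having been called */
};

/* Hypothetical stand-in for the vhost worker. */
struct model_worker {
	struct model_task task;
	bool killed;			/* models worker->killed */
	int wakes_with_killed_flag;	/* wakes seen while KILLED was set */
};

/* Models a wake: the waker only backs off once worker->killed is set. */
static bool model_wake(struct model_worker *w)
{
	if (w->killed)
		return false;

	/* The task must still be alive here, otherwise this would be a UAF. */
	assert(!w->task.exited);

	/* A KILLED sanity check at this point would fire spuriously. */
	if (w->task.killed_flag)
		w->wakes_with_killed_flag++;
	return true;
}

/* Models ->handle_sigkill(), with a racing wake before worker->killed. */
static void model_handle_sigkill(struct model_worker *w)
{
	model_wake(w);		/* KILLED is set, worker->killed is not */
	w->killed = true;
}

/* Models the tail of the task function: KILLED, then sigkill, then exit. */
static void model_task_exit_path(struct model_worker *w)
{
	w->task.killed_flag = true;
	model_handle_sigkill(w);
	w->task.exited = true;	/* only now could a wake become a UAF */
}

/*
 * Returns the number of wakes that saw KILLED set on a live task, or -1 if
 * a wake was still attempted after worker->killed was set.
 */
static int model_run(void)
{
	struct model_worker w = { .killed = false };

	model_task_exit_path(&w);
	return model_wake(&w) ? -1 : w.wakes_with_killed_flag;
}
```

In this model, model_run() counts one wake that observed KILLED on a task that
had not exited, and the assert in model_wake() never fires, which is the same
"check trips, but no UAF" situation described above.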

> > > >  2. This series, without the KILLED sanity check in __vhost_task_wake()
> > > >  3. This series, with my fixup (with which syzbot was happy)
> 
