On Thu, Aug 07, 2025 at 09:05:25AM +0000, Bernd Schubert wrote:
> Hi Brian,
> 
> sorry for the late replies. I'm totally swamped with work this week,
> and I'll be off all of next week.
> 
> On 8/5/25 06:11, Brian Song wrote:
> > 
> > 
> > On 2025-08-04 7:33 a.m., Bernd Schubert wrote:
> >> Hi Brian,
> >>
> >> sorry for my late reply, I'm just back from vacation and fighting
> >> through my emails.
> >>
> >> On 8/4/25 01:33, Brian Song wrote:
> >>>
> >>>
> >>> On 2025-08-01 12:09 p.m., Brian Song wrote:
> >>>> Hi Bernd,
> >>>>
> >>>> We are currently working on implementing termination support for fuse-
> >>>> over-io_uring in QEMU, and right now we are focusing on how to clean up
> >>>> in-flight SQEs properly. Our main question is about how well the kernel
> >>>> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
> >>>> actually implement cancellation beyond destroying the io_uring queue?
> >>>> [...]
> >>>
> >>
> >> I have to admit I'm confused about why you can't use umount; isn't
> >> that the most graceful way to shut down a connection?
> >>
> >> If you need another, custom way for some reason, we probably need
> >> to add it.
> >>
> >>
> >> Thanks,
> >> Bernd
> > 
> > Hi Bernd,
> > 
> > Thanks for your insights!
> > 
> > I think umount doesn't cancel any pending SQEs, right? From what I
> > see, the only way to cancel all pending SQEs and transition all
> > entries to the FRRS_USERSPACE state (making them unavailable for
> > further FUSE requests) in the kernel is by calling
> > io_uring_files_cancel in do_exit, or io_uring_task_cancel in
> > begin_new_exec.
> 
> There are two forms of umount:
> 
> - Forced umount: immediately aborts the connection and its requests,
> which also immediately releases pending SQEs.
> 
> - Normal umount: destroys the connection and completes the SQEs at the
> end of umount.
> 
> > 
> > From my understanding, QEMU follows an event-driven model. So if we
> > don't cancel the SQEs submitted by a connection when it ends, then
> > any CQE that arrives after the connection is closed and the
> > associated FUSE data structures have been freed (but before QEMU
> > exits) will cause QEMU to invoke a CQE handler that has already been
> > deleted, leading to a segfault.
> > 
> > So if the only way to make all pending entries unavailable in the
> > kernel is to call do_exit or begin_new_exec, I think we will need a
> > workaround in QEMU.
> 
> I guess if we find a good argument for why QEMU needs to complete SQEs
> before umount completes, a kernel patch would be accepted. It doesn't
> sound that difficult to create a patch for that, at least for entries
> that are in the FRRS_AVAILABLE state. I can prepare a patch, but at the
> earliest between Saturday and Monday.

Hi Bernd,
QEMU quiesces I/O at certain points, such as when the block driver
graph is reconfigured (similar to changing the device-mapper table in
the kernel) or when threads are reconfigured. The same mechanism is
used during termination to stop accepting new I/O and to wait until
in-flight I/O has completed.
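
To sketch the shape of that quiesce step (illustrative names only, not
actual QEMU code; the per-CQE dispatch is elided):

#include <liburing.h>
#include <stdbool.h>

/* Stop accepting new I/O, then block until every in-flight request has
 * completed before tearing down state. export_state is a hypothetical
 * stand-in for the per-export bookkeeping. */
struct export_state {
    struct io_uring *ring;
    bool quiescing;      /* no new requests are accepted while set */
    unsigned in_flight;  /* incremented at submit, decremented per CQE */
};

static void quiesce(struct export_state *s)
{
    struct io_uring_cqe *cqe;

    s->quiescing = true;
    while (s->in_flight > 0) {
        if (io_uring_wait_cqe(s->ring, &cqe) < 0)
            break;
        /* ...dispatch cqe to its handler here... */
        io_uring_cqe_seen(s->ring, cqe);
        s->in_flight--;
    }
}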

Ideally, io_uring's ASYNC_CANCEL would work on in-flight
FUSE-over-io_uring uring_cmd requests. The REGISTER or COMMIT_AND_FETCH
uring_cmds would complete with -ECANCELED, and future FUSE requests
would be queued in the kernel until FUSE-over-io_uring becomes ready
again.
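
For the cancellation side, something like this liburing sketch is what
we have in mind (io_uring_prep_cancel_fd() is existing liburing API;
the part that doesn't exist yet is the kernel honoring it for these
uring_cmds):

#include <errno.h>
#include <liburing.h>

/* Cancel every pending request on the FUSE device fd and reap the
 * resulting CQEs. nr_pending is the number of in-flight REGISTER /
 * COMMIT_AND_FETCH uring_cmds we expect to fail with -ECANCELED. */
static int cancel_fuse_uring_cmds(struct io_uring *ring, int fuse_fd,
                                  unsigned int nr_pending)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    struct io_uring_cqe *cqe;
    unsigned int seen = 0;

    if (!sqe)
        return -EAGAIN;

    io_uring_prep_cancel_fd(sqe, fuse_fd, IORING_ASYNC_CANCEL_ALL);
    io_uring_submit(ring);

    /* One CQE for the cancel itself plus one -ECANCELED CQE per
     * canceled uring_cmd. */
    while (seen <= nr_pending && io_uring_wait_cqe(ring, &cqe) == 0) {
        io_uring_cqe_seen(ring, cqe);
        seen++;
    }
    return 0;
}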

If and when userspace becomes ready again, it submits REGISTER
uring_cmds again and queued FUSE requests are then delivered to
userspace.
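
Re-arming after the quiesce would then just resubmit the REGISTER
uring_cmds. A rough sketch, assuming a ring created with
IORING_SETUP_SQE128 | IORING_SETUP_CQE32 and with the entry's payload
buffer setup elided (field names follow struct fuse_uring_cmd_req from
<linux/fuse.h>):

#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <liburing.h>
#include <linux/fuse.h>

static int resubmit_register(struct io_uring *ring, int fuse_fd,
                             uint16_t qid)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    struct fuse_uring_cmd_req *req;

    if (!sqe)
        return -EAGAIN;

    /* Big-SQE uring_cmd: opcode in the sqe, FUSE command in
     * sqe->cmd_op, per-command payload in the sqe's cmd area. */
    io_uring_prep_rw(IORING_OP_URING_CMD, sqe, fuse_fd, NULL, 0, 0);
    sqe->cmd_op = FUSE_IO_URING_CMD_REGISTER;

    req = (struct fuse_uring_cmd_req *)sqe->cmd;
    memset(req, 0, sizeof(*req));
    req->qid = qid;      /* which FUSE ring queue this entry re-arms */

    return io_uring_submit(ring);
}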

Thanks for your help!

Stefan
