On Fri, 7 Mar 2025 10:23:02 +0100 Christian Schoenebeck <qemu_...@crudebyte.com> wrote:
> This patch fixes two different bugs in v9fs_reclaim_fd(): > > 1. Reduce latency: > > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each > one of the calls adds two thread hops (between main thread and a fs driver > background thread). Each thread hop adds latency, which sums up in > function's loop to a significant duration. > > Reduce overall latency by open coding what v9fs_co_close() and > v9fs_co_closedir() do, executing those and the loop itself altogether in > only one background thread block, hence reducing the total amount of > thread hops to only two. > > 2. Fix file descriptor leak: > > The existing code called v9fs_co_close() and v9fs_co_closedir() to close > file descriptors. Both functions check right at the beginning if the 9p > request was cancelled: > > if (v9fs_request_cancelled(pdu)) { > return -EINTR; > } > > So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir() > returned without having closed the file descriptor and v9fs_reclaim_fd() > subsequently freed the FID without its file descriptor being closed, hence > leaking those file descriptors. > > This 2nd bug is fixed by this patch as well by open coding v9fs_co_close() > and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the > v9fs_request_cancelled(pdu) check there. > > Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support') > Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation') > Signed-off-by: Christian Schoenebeck <qemu_...@crudebyte.com> > --- Reviewed-by: Greg Kurz <gr...@kaod.org> > hw/9pfs/9p.c | 29 ++++++++++++++++++++--------- > 1 file changed, 20 insertions(+), 9 deletions(-) > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > index 4f9c2dde9c..80b190ff5b 100644 > --- a/hw/9pfs/9p.c > +++ b/hw/9pfs/9p.c > @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > V9fsFidState *f; > GHashTableIter iter; > gpointer fid; > + int err; > + int nclosed = 0; > > /* prevent multiple coroutines running this function simultaniously */ > if (s->reclaiming) { > @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > QSLIST_HEAD(, V9fsFidState) reclaim_list = > QSLIST_HEAD_INITIALIZER(reclaim_list); > > + /* Pick FIDs to be closed, collect them on reclaim_list. */ > while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) { > /* > - * Unlink fids cannot be reclaimed. Check > - * for them and skip them. Also skip fids > + * Unlinked fids cannot be reclaimed, skip those, and also skip fids > * currently being operated on. > */ > if (f->ref || f->flags & FID_NON_RECLAIMABLE) { > @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu) > } > } > /* > - * Now close the fid in reclaim list. Free them if they > - * are already clunked. > + * Close the picked FIDs altogether on a background I/O driver thread. Do > + * this all at once to keep latency (i.e. amount of thread hops between > main > + * thread <-> fs driver background thread) as low as possible. > */ > + v9fs_co_run_in_worker({ > + QSLIST_FOREACH(f, &reclaim_list, reclaim_next) { > + err = (f->fid_type == P9_FID_DIR) ? > + s->ops->closedir(&s->ctx, &f->fs_reclaim) : > + s->ops->close(&s->ctx, &f->fs_reclaim); > + if (!err) { > + /* total_open_fd must only be mutated on main thread */ > + nclosed++; > + } > + } > + }); > + total_open_fd -= nclosed; > + /* Free the closed FIDs. */ > while (!QSLIST_EMPTY(&reclaim_list)) { > f = QSLIST_FIRST(&reclaim_list); > QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next); > - if (f->fid_type == P9_FID_FILE) { > - v9fs_co_close(pdu, &f->fs_reclaim); > - } else if (f->fid_type == P9_FID_DIR) { > - v9fs_co_closedir(pdu, &f->fs_reclaim); > - } > /* > * Now drop the fid reference, free it > * if clunked. -- Greg