On Tue, 30 Jun 2020 20:00:08 +0200
Christian Schoenebeck <qemu_...@crudebyte.com> wrote:

> On Dienstag, 30. Juni 2020 18:39:57 CEST Greg Kurz wrote:
> > On Tue, 30 Jun 2020 17:16:40 +0200
> > 
> > Christian Schoenebeck <qemu_...@crudebyte.com> wrote:
> > > On Montag, 29. Juni 2020 18:39:02 CEST Greg Kurz wrote:
> > > > On Wed, 03 Jun 2020 19:16:08 +0200
> > > > 
> > > > Christian Schoenebeck <qemu_...@crudebyte.com> wrote:
> > > > > On Sonntag, 19. April 2020 17:06:17 CEST Christian Schoenebeck wrote:
> > > > > > Make top half really top half and bottom half really bottom half:
> > > > > > 
> > > > > > Handling of each T_readdir request is currently hopping between
> > > > > > threads (main I/O thread and background I/O driver threads) several
> > > > > > times for every individual directory entry, which sums up to huge
> > > > > > latencies for handling just a single T_readdir request.
> > > > > > 
> > > > > > Instead of doing that, now collect all required directory entries
> > > > > > (including all potentially required stat buffers for each entry) in
> > > > > > one rush on a background I/O thread from the fs driver by calling
> > > > > > the previously added function v9fs_co_readdir_many() instead of
> > > > > > v9fs_co_readdir(), then assemble the entire resulting network
> > > > > > response message for the readdir request on the main I/O thread.
> > > > > > The fs driver still aborts the directory entry retrieval loop (on
> > > > > > the background I/O thread inside of v9fs_co_readdir_many()) as soon
> > > > > > as it would exceed the client's requested maximum R_readdir response
> > > > > > size, so this will not introduce a performance penalty at the other
> > > > > > end.
> > > > > > 
> > > > > > Signed-off-by: Christian Schoenebeck <qemu_...@crudebyte.com>
> > > > > > ---
> > > > > > 
> > > > > >  hw/9pfs/9p.c | 122 +++++++++++++++++++++++----------------------------
> > > > > >  1 file changed, 55 insertions(+), 67 deletions(-)
> > > > > 
> > > > > Ping. Anybody?
> > > > > 
> > > > > I would like to roll out this patch set soon and this is the only
> > > > > patch in this series still missing a review.
> > > > 
> > > > Hi Christian,
> > > 
> > > Hi Greg,
> > > 
> > > > Sorry for getting back to this only now :-\
> > > > 
> > > > So I still have some concerns about the locking of the directory stream
> > > > pointer of a fid. It was initially introduced to avoid concurrent
> > > > accesses by multiple threads to the corresponding internal glibc object,
> > > > as recommended in the readdir(3) manual page. Now, this patch
> > > > considerably extends the critical section to also contain the calls to
> > > > telldir() and all the _many_ readdir()... so I'm not sure exactly what
> > > > the purpose of that mutex is right now. Please provide more details.
> > > 
> > > The intention of this lock is to prevent concurrent r/w (i.e. telldir()/
> > > readdir()) of the dir stream position by two or more active readdir
> > > requests handled in parallel, provided that they use the same FID.
> > > Because of that latter requirement we already agreed that this is not
> > > something that would usually happen with a Linux 9p client, but there is
> > > nothing from a protocol point of view that would prohibit a client from
> > > doing it, so the resulting undefined behaviour should be prevented, which
> > > is what this lock does.
> > 
> > Makes sense. The dir stream position is no different from glibc's internal
> > dirent in that respect: it shouldn't be used by concurrent threads.
> 
> Exactly, it is a conceptual issue in general, independent of whether glibc
> is used and independent of the fs driver.
> 
> > > What's your precise concern?
> > 
> > My overall concern is still the same. The patches are big and I've
> > been away for too long, so I need some more discussion to reassemble
> > my thoughts and the puzzle :)
> 
> Np, if you have questions, just come along.
> 
> > But now that I'm starting to understand why it's a good thing to
> > extend the critical section, I realize it should be extended
> > even more: we also have calls to seekdir() and rewinddir() in
> > v9fs_readdir() and friends that could _theoretically_ cause the
> > very same kind of undefined behavior you're mentioning.
> 
> You are talking about the "old" code now. Yes, it is wrong and I did not 

No, I'm talking about code that isn't changed by this series:

    if (initial_offset == 0) {
        v9fs_co_rewinddir(pdu, fidp);
    } else {
        v9fs_co_seekdir(pdu, fidp, initial_offset);
    }
    count = v9fs_do_readdir(pdu, fidp, max_count);

Leaving these outside the critical section seems to negate
your arguments... please clarify.

> bother to fix the "old" implementation, since this optimized readdir 
> implementation fixed this issue as well in one rush, but also note ...
> 
> > I think the change is important enough that it deserves its
> > own patch. I expect that moving the locking to the top-level
> > handler might also simplify the existing code, and thus your
> > series as well.
> 
> ... please reconsider if you really want to open that box. The "old" readdir
> implementation on the main thread is inherently wrong, and handling the lock
> and the error paths there is extremely cumbersome and error prone.
> 

There are indeed several issues with the current readdir:

1) potential inconsistency on concurrent Treaddir requests

2) excessive dispatching to worker threads

So we agreed that 1) should never happen with a regular Linux
client (we could even drop the lock, actually), but we want to
make it clear in the code anyway that actions on the directory
stream are serialized.
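
To make that part of it concrete, here's roughly what I have in mind
for the top-level handler, i.e. taking the fid's readdir lock around
the whole rewinddir/seekdir + readdir sequence I quoted above. Just a
sketch, assuming the per-entry locking that v9fs_do_readdir() does
itself today would be dropped at the same time (error handling
omitted):

    v9fs_readdir_lock(&fidp->fs.dir);

    if (initial_offset == 0) {
        v9fs_co_rewinddir(pdu, fidp);
    } else {
        v9fs_co_seekdir(pdu, fidp, initial_offset);
    }
    count = v9fs_do_readdir(pdu, fidp, max_count);

    v9fs_readdir_unlock(&fidp->fs.dir);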

2) basically means moving some logic from the frontend to the
backend, i.e. called from v9fs_co_run_in_worker(). This implies
that a very long request (i.e. one that would span many calls
to readdir()) cannot be interrupted by the client anymore, as
opposed to what we have now, BTW.

I tend to think that addressing both issues in one "rush" is
just as "error prone".

> Also: it does not make sense for this series to move the locking from the fs
> driver back to the main I/O thread. Atomicity must be retained on the driver
> side, not in the top half.
> 

Then the whole seekdir/rewinddir/telldir/readdir sequence should
be called with a single v9fs_co_run_in_worker() invocation, in
which case the locking could just be something like:

int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
                                      struct V9fsDirEnt **entries,
                                      int32_t maxsize, bool dostat)
{
    int err = 0;

    if (v9fs_request_cancelled(pdu)) {
        return -EINTR;
    }

    v9fs_readdir_lock(&fidp->fs.dir);

    v9fs_co_run_in_worker({
        err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
    });

    v9fs_readdir_unlock(&fidp->fs.dir);
    return err;
}

This would technically leave the locking in the main I/O thread,
but it isn't conceptually different from locking at the beginning
of do_readdir_many() and unlocking before returning. My concern
is that I don't know how coroutine mutexes behave with multiple
worker threads...
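
For reference, if I remember correctly the lock helpers are just thin
CoMutex wrappers along these lines (paraphrased from hw/9pfs/9p.h, so
the details may be slightly off):

typedef struct V9fsDir {
    DIR *stream;
    CoMutex readdir_mutex;    /* serializes accesses to the dir stream */
} V9fsDir;

static inline void v9fs_readdir_lock(V9fsDir *dir)
{
    qemu_co_mutex_lock(&dir->readdir_mutex);
}

static inline void v9fs_readdir_unlock(V9fsDir *dir)
{
    qemu_co_mutex_unlock(&dir->readdir_mutex);
}

So in the sketch above the CoMutex would only ever be taken and
released from the coroutine on the main I/O thread; the worker thread
itself would never touch it. But I may well be missing a subtlety
there.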

> Best regards,
> Christian Schoenebeck
> 
> 

Cheers,

--
Greg
