On 05.12.2014 at 19:15, Paolo Bonzini wrote:
> On 05/12/2014 17:06, Kevin Wolf wrote:
> > If the queue array for io_submit() is already full, but a new request
> > arrives, we cannot add it to that queue anymore. We can, however, use a
> > CoQueue, which is implemented as a list and can therefore queue as many
> > requests as we want.
> >
> > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > ---
> >  block/linux-aio.c | 31 ++++++++++++++++++++++++++-----
> >  1 file changed, 26 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/linux-aio.c b/block/linux-aio.c
> > index 373ec4b..8e6328b 100644
> > --- a/block/linux-aio.c
> > +++ b/block/linux-aio.c
> > @@ -44,6 +44,7 @@ typedef struct {
> >      int plugged;
> >      unsigned int size;
> >      unsigned int idx;
> > +    CoQueue waiting;
> >  } LaioQueue;
> >
> >  struct qemu_laio_state {
> > @@ -160,6 +161,8 @@ static void ioq_init(LaioQueue *io_q)
> >      io_q->size = MAX_QUEUED_IO;
> >      io_q->idx = 0;
> >      io_q->plugged = 0;
> > +
> > +    qemu_co_queue_init(&io_q->waiting);
> >  }
> >
> >  static int ioq_submit(struct qemu_laio_state *s)
> > @@ -201,15 +204,29 @@ static int ioq_submit(struct qemu_laio_state *s)
> >                  s->io_q.idx * sizeof(s->io_q.iocbs[0]));
> >      }
> >
> > +    /* Now there should be room for some more requests */
> > +    if (!qemu_co_queue_empty(&s->io_q.waiting)) {
> > +        if (qemu_in_coroutine()) {
> > +            qemu_co_queue_next(&s->io_q.waiting);
> > +        } else {
> > +            qemu_co_enter_next(&s->io_q.waiting);
>
> We should get better performance by wrapping these with
> plug/unplug. Trivial for the qemu_co_enter_next case, much less for
> qemu_co_queue_next...

We can probably just use qemu_co_enter_next() everywhere. The only
reason why I put a qemu_co_queue_next() there was that it saves a
coroutine switch - probably premature optimisation anyway...

> This exposes what I think is the main wrinkle in these patches: I'm not
> sure linux-aio is a great match for the coroutine architecture. You
> introduce some infrastructure duplication with block.c to track
> coroutines, and I don't find the coroutine code to be an improvement
> over Ming Lei's asynchronous one---in fact I actually find it more
> complicated.

Really? I found the callback-based one that introduces new BHs and an
additional state for a queue that is being aborted (which must be
considered everywhere) really ugly, and the resulting code from this
coroutine-based series rather clean. I honestly expected that people
would debate whether it does the right thing, but that nobody would
disagree that it looks nicer - but maybe it's a matter of taste.

Also note that this specific patch is doing an additional step that
isn't part of Ming's series: Ming's series simply lets requests fail if
the queue is full.

Also, regardless of that (though I find readability important), my
benchmarks seem to suggest that without this conversion, the other
optimisations in the queue don't work that well. The fastest performance
I've seen so far - including both coroutine and callback based versions -
has this conversion applied (measured without patches 4-6 yet, though).

Kevin