On 11/21/2016 11:31 AM, Kevin Wolf wrote:
> This converts the quorum block driver from implementing callback-based
> interfaces for read/write to coroutine-based ones. This is the first
> step that will allow us further simplification of the code.
>
> Signed-off-by: Kevin Wolf <kw...@redhat.com>
> ---
>  block/quorum.c | 192
> ++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 115 insertions(+), 77 deletions(-)
>
> @@ -174,14 +162,14 @@ static bool quorum_64bits_compare(QuorumVoteValue *a,
> QuorumVoteValue *b)
>  static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
>                                     QEMUIOVector *qiov,
>                                     uint64_t sector_num,
> -                                   int nb_sectors,
> -                                   BlockCompletionFunc *cb,
> -                                   void *opaque)
> +                                   int nb_sectors)
>  {
>      BDRVQuorumState *s = bs->opaque;
> -    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
> +    QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);

Worth using g_new0() here...

>      int i;
>
> +    acb->co = qemu_coroutine_self();
> +    acb->bs = bs;
>      acb->sector_num = sector_num;
>      acb->nb_sectors = nb_sectors;
>      acb->qiov = qiov;
> @@ -191,6 +179,7 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
>      acb->rewrite_count = 0;
>      acb->votes.compare = quorum_sha256_compare;
>      QLIST_INIT(&acb->votes.vote_list);
> +    acb->has_completed = false;
>      acb->is_read = false;
>      acb->vote_ret = 0;

...to eliminate 0-assignments here? Not a show-stopper to leave it
as-is, though.

> -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb);
> +static int read_fifo_child(QuorumAIOCB *acb);
>
>  static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
>  {
> @@ -272,14 +261,14 @@ static void quorum_report_bad_acb(QuorumChildRequest
> *sacb, int ret)
>      QuorumAIOCB *acb = sacb->parent;
>      QuorumOpType type = acb->is_read ? QUORUM_OP_TYPE_READ :
>                                         QUORUM_OP_TYPE_WRITE;
>      quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
> -                      sacb->aiocb->bs->node_name, ret);
> +                      sacb->bs->node_name, ret);
>  }
>
> -static void quorum_fifo_aio_cb(void *opaque, int ret)
> +static int quorum_fifo_aio_cb(void *opaque, int ret)
>  {
>      QuorumChildRequest *sacb = opaque;
>      QuorumAIOCB *acb = sacb->parent;
> -    BDRVQuorumState *s = acb->common.bs->opaque;
> +    BDRVQuorumState *s = acb->bs->opaque;
>
>      assert(acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO);
>
> @@ -288,8 +277,7 @@ static void quorum_fifo_aio_cb(void *opaque, int ret)
>
>      /* We try to read next child in FIFO order if we fail to read */
>      if (acb->children_read < s->num_children) {
> -        read_fifo_child(acb);
> -        return;
> +        return read_fifo_child(acb);
>      }

Question unrelated to this patch: in FIFO mode, are we doing work
sequentially or in parallel? That is, does the quorum code kick off all
children simultaneously, then wait until the first child answers with
success (and abort all remaining children) or failure (at which point
moving to the second child may already have an answer)? Or does it only
kick off the first child, wait for a response, and not start the second
child until after the first child fails? I guess one way has more
potentially wasted work (and a stress test of our ability to cancel work
on secondary children), while the other has higher latencies, so maybe
it is something that a future quorum patch may want to make
configurable?

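To make the sequential option in the question concrete, here is a rough sketch of "start one child, and only fall through to the next on failure". FakeChild, fake_child_read() and fifo_read_sequential() are hypothetical names invented for illustration and are not QEMU code; the loop only mirrors the "try to read next child in FIFO order if we fail to read" comment in the quoted hunk, and returning the last error when every child fails is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one quorum child; not a QEMU type. */
typedef struct {
    int ret;      /* result this child's read reports: 0 or a negative errno */
    int started;  /* set to 1 once a read has been issued to this child */
} FakeChild;

/* Hypothetical stand-in for a blocking per-child read. */
static int fake_child_read(FakeChild *child)
{
    child->started = 1;
    return child->ret;
}

/*
 * Sequential FIFO read: issue the read to one child at a time, in order,
 * and only move on to the next child when the current one fails.
 * Returns 0 on the first success, or the last error if every child fails
 * (assumed behaviour for this sketch).
 */
static int fifo_read_sequential(FakeChild *children, size_t num_children)
{
    int ret = -1;
    size_t n;

    assert(num_children > 0);
    for (n = 0; n < num_children; n++) {
        ret = fake_child_read(&children[n]);
        if (ret == 0) {
            break;  /* later children are never started */
        }
    }
    return ret;
}
```

With this shape no work is wasted on later children, at the cost of each failure adding a full round trip before the next child is tried.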
>
> -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
> +static int read_fifo_child(QuorumAIOCB *acb)
>  {
> -    BDRVQuorumState *s = acb->common.bs->opaque;
> +    BDRVQuorumState *s = acb->bs->opaque;
>      int n = acb->children_read++;
> +    int ret;
>
> -    acb->qcrs[n].aiocb = bdrv_aio_readv(s->children[n], acb->sector_num,
> -                                        acb->qiov, acb->nb_sectors,
> -                                        quorum_fifo_aio_cb, &acb->qcrs[n]);
> +    acb->qcrs[n].bs = s->children[n]->bs;
> +    ret = bdrv_co_preadv(s->children[n], acb->sector_num * BDRV_SECTOR_SIZE,
> +                         acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
> +    ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);

somewhat answering myself - it looks like the current fifo approach is
high-latency rather than parallel, in that at most one child is being
run at a time.

The conversion itself looks sane;
Reviewed-by: Eric Blake <ebl...@redhat.com>

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org