On Fri, Oct 28, 2011 at 11:02 AM, Zhi Yong Wu <wu...@linux.vnet.ibm.com> wrote: > +static void bdrv_io_limits_skip_set(void *opaque, > + BlockAPIType co_type, > + bool cb_skip, > + bool limit_skip) { > + RwCo *rwco; > + BlockDriverAIOCBCoroutine *aioco; > + > + if (co_type == BDRV_API_SYNC) { > + rwco = opaque; > + rwco->limit_skip = limit_skip; > + } else if (co_type == BDRV_API_ASYNC) { > + aioco = opaque; > + aioco->cb_skip = cb_skip; > + aioco->limit_skip = limit_skip; > + } else { > + abort(); > + } > +}
The main question I have about this series is why have different cases for sync, aio, and coroutines? Perhaps I'm missing something but this should all be much simpler. All read/write requests are processed in a coroutine (bdrv_co_do_readv()/bdrv_co_do_writev()). That is the place to perform I/O throttling. Throttling should not be aware of sync, aio, vs coroutines. Since all requests have coroutines you could use CoQueue and the actual queue waiting code in bdrv_co_do_readv()/bdrv_co_do_writev() becomes: if (bdrv_exceeds_io_limit(bs, sector_num, qiov, is_write)) { qemu_co_queue_wait(&bs->throttled_reqs); /* Wait until this request is allowed to start */ while (bdrv_exceeds_io_limit(bs, sector_num, qiov, is_write)) { /* Re-inserting at the head of the CoQueue is equivalent to * the queue->flushing/queue->limit_exceeded behavior in your * patch. */ qemu_co_queue_wait_insert_head(&bs->throttled_reqs); } } I think block/blk-queue.h isn't needed if you use the existing CoQueue structure. This is the main point I want to raise, just a few minor comments below which may not be relevant if you can drop BlockQueue. > +static int qemu_block_queue_handler(BlockQueueAIOCB *request) > +{ > + int ret; > + > + BlockDriverState *bs = request->common.bs; > + assert(bs); > + > + if (bs->io_limits_enabled) { I'm not sure why the BlockQueue needs to reach into BlockDriverState. Now BlockDriverState and BlockQueue know about and depend on each other. It's usually nicer to keep the relationship unidirectional, if possible. > + ret = request->handler(request->common.bs, request->sector_num, > + request->nb_sectors, request->qiov, > + request, request->co_type); > + } else { > + if (request->co_type == BDRV_API_CO) { > + qemu_coroutine_enter(request->co, request->cocb); > + } else { > + printf("%s, req: %p\n", __func__, (void *)request); Debug output should be removed. > + bdrv_io_limits_issue_request(request, request->co_type); > + } > + > + ret = BDRV_BLKQ_DEQ_PASS; > + } > + > + return ret; > +} > + > +void qemu_block_queue_submit(BlockQueue *queue, BlockDriverCompletionFunc > *cb) > +{ > + BlockQueueAIOCB *request; > + queue->flushing = true; > + > + /*QTAILQ_FOREACH_SAFE(request, &queue->requests, entry, next) {*/ Commented out code should be removed. > + while (!QTAILQ_EMPTY(&queue->requests)) { > + int ret = 0; > + > + request = QTAILQ_FIRST(&queue->requests); > + QTAILQ_REMOVE(&queue->requests, request, entry); > + queue->limit_exceeded = false; > + ret = qemu_block_queue_handler(request); > + if (ret == -EIO) { > + cb(request, -EIO); > + break; > + } else if (ret == BDRV_BLKQ_ENQ_AGAIN) { > + QTAILQ_INSERT_HEAD(&queue->requests, request, entry); > + break; > + } else if (ret == BDRV_BLKQ_DEQ_PASS) { > + cb(request, 0); > + } What if ret is not -EIO, BDRV_BLKQ_ENQ_AGAIN, or BDRV_BLKQ_DEQ_PASS? I think the -EIO case should be the default that calls cb(request, ret). > + } > + > + printf("%s, leave\n", __func__); Debug code should be removed. Stefan