Re: [Qemu-devel] [PATCH v4] blockjob: Fix hang in block_job_finish_sync

2016-02-03 Thread Stefan Hajnoczi
On Tue, Feb 02, 2016 at 10:12:24AM +0800, Fam Zheng wrote:
> With a mirror job running on a virtio-blk dataplane disk, sending "q" to
> HMP causes an infinite loop in block_job_finish_sync.
> 
> This is because aio_poll() only processes the AIO context of bs, which
> has no more work to do, while the main loop BH that is scheduled to set
> the job->completed flag is never processed.
> 
> Fix this by adding a flag to the BlockJob structure that tracks which
> context to poll for the block job to make progress. It is set to true in
> block_job_defer_to_main_loop(), cleared in the BH before the deferred
> function runs, and checked in block_job_finish_sync to determine which
> context to poll.
> 
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Fam Zheng 
> ---
>  blockjob.c               | 6 +++++-
>  include/block/blockjob.h | 5 +++++
>  2 files changed, 10 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan




[Qemu-devel] [PATCH v4] blockjob: Fix hang in block_job_finish_sync

2016-02-01 Thread Fam Zheng
With a mirror job running on a virtio-blk dataplane disk, sending "q" to
HMP causes an infinite loop in block_job_finish_sync.

This is because aio_poll() only processes the AIO context of bs, which
has no more work to do, while the main loop BH that is scheduled to set
the job->completed flag is never processed.

Fix this by adding a flag to the BlockJob structure that tracks which
context to poll for the block job to make progress. It is set to true in
block_job_defer_to_main_loop(), cleared in the BH before the deferred
function runs, and checked in block_job_finish_sync to determine which
context to poll.

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 blockjob.c               | 6 +++++-
 include/block/blockjob.h | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)
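
For readers who want to see the hang in isolation, here is a minimal
standalone sketch of the idea. It is not QEMU code: every Toy* type and
toy_* function is a hypothetical stand-in, the "contexts" hold at most one
pending callback, and the poll is non-blocking with a bounded retry count
so that the buggy variant returns -1 instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for QEMU types; all names here are hypothetical. */
typedef void ToyCallback(void *opaque);

typedef struct ToyContext {
    ToyCallback *pending;   /* at most one scheduled callback (a "BH") */
    void *opaque;
} ToyContext;

typedef struct ToyJob {
    ToyContext *job_ctx;    /* models bdrv_get_aio_context(bs) */
    ToyContext *main_ctx;   /* models qemu_get_aio_context() */
    bool completed;
    bool deferred_to_main_loop;
} ToyJob;

/* Run the pending callback of ctx, if any. Returns true on progress. */
static bool toy_poll(ToyContext *ctx)
{
    ToyCallback *cb = ctx->pending;

    if (!cb) {
        return false;
    }
    ctx->pending = NULL;
    cb(ctx->opaque);
    return true;
}

/* The deferred work: clear the flag, then mark the job completed. */
static void toy_complete_bh(void *opaque)
{
    ToyJob *job = opaque;

    job->deferred_to_main_loop = false;
    job->completed = true;
}

/* Schedule completion in the main context and record that fact. */
static void toy_defer_to_main_loop(ToyJob *job)
{
    assert(job->main_ctx->pending == NULL);
    job->main_ctx->pending = toy_complete_bh;
    job->main_ctx->opaque = job;
    job->deferred_to_main_loop = true;
}

/*
 * Poll until the job completes. With use_flag == false only the job's own
 * context is polled (the old behaviour). Returns the number of polls that
 * were needed, or -1 if the job did not complete within max_polls.
 */
static int toy_finish_sync(ToyJob *job, bool use_flag, int max_polls)
{
    for (int polls = 0; polls < max_polls; polls++) {
        if (job->completed) {
            return polls;
        }
        toy_poll(use_flag && job->deferred_to_main_loop ? job->main_ctx
                                                        : job->job_ctx);
    }
    return job->completed ? max_polls : -1;
}

/* Set up a job that has already deferred its completion, then wait. */
int toy_run(bool use_flag)
{
    ToyContext job_ctx = { NULL, NULL };
    ToyContext main_ctx = { NULL, NULL };
    ToyJob job = { &job_ctx, &main_ctx, false, false };

    toy_defer_to_main_loop(&job);
    return toy_finish_sync(&job, use_flag, 1000);
}
```

In this model toy_run(false) never sees the callback queued in main_ctx and
gives up, while toy_run(true) picks the main context because the flag is
set and finishes after a single poll.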

diff --git a/blockjob.c b/blockjob.c
index 80adb9d..b15df93 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -304,7 +304,9 @@ static int block_job_finish_sync(BlockJob *job,
         return -EBUSY;
     }
     while (!job->completed) {
-        aio_poll(bdrv_get_aio_context(bs), true);
+        aio_poll(job->deferred_to_main_loop ? qemu_get_aio_context() :
+                                              bdrv_get_aio_context(bs),
+                 true);
     }
     ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
     block_job_unref(job);
@@ -478,6 +480,7 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
     aio_context = bdrv_get_aio_context(data->job->bs);
     aio_context_acquire(aio_context);
 
+    data->job->deferred_to_main_loop = false;
     data->fn(data->job, data->opaque);
 
     aio_context_release(aio_context);
@@ -497,6 +500,7 @@ void block_job_defer_to_main_loop(BlockJob *job,
     data->aio_context = bdrv_get_aio_context(job->bs);
     data->fn = fn;
     data->opaque = opaque;
+    job->deferred_to_main_loop = true;
 
     qemu_bh_schedule(data->bh);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index d84ccd8..8bedc49 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -130,6 +130,11 @@ struct BlockJob {
      */
     bool ready;
 
+    /**
+     * Set to true when the job has deferred work to the main loop.
+     */
+    bool deferred_to_main_loop;
+
     /** Status that is published by the query-block-jobs QMP API */
     BlockDeviceIoStatus iostatus;
 
-- 
2.4.3