Re: [Qemu-block] [Qemu-devel] [PATCH v3 01/10] qapi: Add transaction support to block-dirty-bitmap operations

2015-05-18 Thread Kevin Wolf
Am 11.05.2015 um 15:10 hat Stefan Hajnoczi geschrieben:
 On Fri, May 08, 2015 at 08:29:09AM -0600, Eric Blake wrote:
  On 05/08/2015 07:14 AM, Stefan Hajnoczi wrote:
  
   No it doesn't.  Actions have to appear atomic to the qmp_transaction
   caller.  Both approaches achieve that so they are both correct in
   isolation.
   
   The ambiguity is whether commit the changes for .commit() means
   changes take effect or discard stashed state, making undo
   impossible.
   
   I think the discard stashed state, making undo impossible
   interpretation is good because .commit() is not allowed to fail.  That
   function should only do things that never fail.
   
   That's going to get hard to maintain as we add more transactions.
   
   Yes, we need to be consistent and stick to one of the interpretations in
   order to guarantee ordering.
   
   Unfortunately, there is already an inconsistency:
   
   1. internal_snapshot - snapshot taken in .prepare()
   2. external_snapshot - BDS node appended in .commit()
   3. drive_backup - block job started in .prepare()
   4. blockdev_backup - block job started in .prepare()
   
   external_snapshot followed by internal_snapshot acts like the reverse
   ordering!
  
  Is that fatal, though?
 
 Yes, ordering is critical when add-bitmap or clear-bitmap are combined
 with drive-backup.  Typically the drive-backup must happen after
 add-bitmap or clear-bitmap.

Is there a reason to include add-bitmap/clear-bitmap in the transaction
rather than preparing everything before the transaction and then only
starting the backup (possibly of multiple disks) in a transaction?

The original assumption was that there are no interdependencies between
different transaction commands and the order is undefined.

Kevin


pgpq00KvxklsK.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v2 0/2] block: Fix error code for bdrv_getlength when the image is too big

2015-05-18 Thread Kevin Wolf
Am 15.05.2015 um 10:36 hat Fam Zheng geschrieben:
 v2: Correct detection of overflow. [Markus, Berto]
 
 If the image has a huge enough virtual size,
 
   $ qemu-img info afl9.img
   qemu-img: Can't get size of device 'image': Unknown error -512
 
 It's because of the multiplication overflow in the return statement in
 bdrv_getlength (the big nagetive value is later truncated to 0x200). Fix it to
 return -EFBIG:
 
   qemu-img: Can't get size of device 'image': File too large
 
 Bug reported by Richard Jones in:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1221499

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH v4 08/11] qmp: Add an implementation wrapper for qmp_drive_backup

2015-05-18 Thread John Snow


On 05/18/2015 10:42 AM, Stefan Hajnoczi wrote:
 On Mon, May 11, 2015 at 07:04:23PM -0400, John Snow wrote:
 @@ -2900,9 +2917,16 @@ void qmp_drive_backup(const char *device,
 const char *target, } }
 
 +/* If we are not supplied with callback override info, use
 our defaults */ +if (cb == NULL) { +cb =
 block_job_cb; +} +if (opaque == NULL) { +opaque =
 bs; +}
 
 Why assign opaque separately, it raises the question what happens
 if a custom cb is given but the caller really wants opaque to be
 NULL?
 
 The following might be clearer:
 
 if (cb == NULL) { cb = block_job_cb; opaque = bs; }
 

It just wasn't a consideration when I was writing it, since the
transaction system won't ever want to pass NULL here.

It's easy enough to fix, though.



Re: [Qemu-block] [PATCH v2] qemu-iotests: Make debugging python tests easier

2015-05-18 Thread Kevin Wolf
Am 18.05.2015 um 03:39 hat Fam Zheng geschrieben:
 Adding -d option. The output goes to tee so it appears in your
 console. Also, raise the verbosity of unnitest runner.
 
 When testing a topic branch, it's possible that a bug introduced by a
 code change makes the python test case hang, with debug output, it is
 much easier to locate the problem.
 
 This can also be helpful if you want to watch the progress of a python
 test, it offers you a way to sense the speed of each test case method
 you're writing.
 
 Note: because there is no easy way to get *both* the verbose output and
 the output expected by ./check comparison, the case would always fail
 with an output mismatch. The sole purpose of using this option is
 giving developers a quick way to debug when things go wrong.
 
 Signed-off-by: Fam Zheng f...@redhat.com

Thanks, applied to the block branch.

Next step: Log all qemu-img/qemu-io output and QMP traffic when -d is
specified. :-)

Kevin



Re: [Qemu-block] [PATCH v4 09/11] block: drive_backup transaction callback support

2015-05-18 Thread Stefan Hajnoczi
On Mon, May 11, 2015 at 07:04:24PM -0400, John Snow wrote:
 +static void drive_backup_cb(BlkActionState *common)
 +{
 +BlkActionCallbackData *cb_data = common-cb_data;
 +BlockDriverState *bs = cb_data-opaque;
 +DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 +
 +assert(state-bs == bs);
 +if (bs-job) {
 +assert(state-job == bs-job);
 +}

What is the purpose of the if statement?

Why is it not okay for a new job to have started?

 +
 +state-aio_context = bdrv_get_aio_context(bs);
 +aio_context_acquire(state-aio_context);

The bs-job access above should be protected by aio_context_acquire().


pgp7FkEx19Y09.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v4 01/11] qapi: Add transaction support to block-dirty-bitmap operations

2015-05-18 Thread Max Reitz

On 12.05.2015 01:04, John Snow wrote:

This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
  block.c   |  19 +++-
  blockdev.c| 114 +-
  docs/bitmaps.md   |   6 +--
  include/block/block.h |   1 -
  include/block/block_int.h |   3 ++
  qapi-schema.json  |   6 ++-
  6 files changed, 139 insertions(+), 10 deletions(-)


Reviewed-by: Max Reitz mre...@redhat.com



[Qemu-block] [PATCH 3/3] qcow2: reorder fields in Qcow2CachedTable to reduce padding

2015-05-18 Thread Alberto Garcia
Changing the current ordering saves 8 bytes per cache entry in x86_64.

Signed-off-by: Alberto Garcia be...@igalia.com
---
 block/qcow2-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index a215f5b..43590ff 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -31,9 +31,9 @@
 
 typedef struct Qcow2CachedTable {
 int64_t  offset;
-bool dirty;
 uint64_t lru_counter;
 int  ref;
+bool dirty;
 } Qcow2CachedTable;
 
 struct Qcow2Cache {
-- 
2.1.4




Re: [Qemu-block] [PATCH v3 04/12] virtio-blk: Move complete_request to 'ops' structure

2015-05-18 Thread Max Reitz

On 15.05.2015 08:04, Fam Zheng wrote:

Should more ops be added to differentiate code between dataplane and
non-dataplane, the new saved_ops approach will be cleaner than messing
with N pointers.

Signed-off-by: Fam Zheng f...@redhat.com
---
  hw/block/dataplane/virtio-blk.c | 13 -
  hw/block/virtio-blk.c   |  8 ++--
  include/hw/virtio/virtio-blk.h  |  9 +++--
  3 files changed, 21 insertions(+), 9 deletions(-)


Reviewed-by: Max Reitz mre...@redhat.com



Re: [Qemu-block] [PATCH v3 03/12] block-backend: Add blk_op_blocker_add_notifier

2015-05-18 Thread Max Reitz

On 15.05.2015 08:04, Fam Zheng wrote:

Forward the call to bdrv_op_blocker_add_notifier.

Signed-off-by: Fam Zheng f...@redhat.com
---
  block/block-backend.c  | 6 ++
  include/sysemu/block-backend.h | 2 ++
  2 files changed, 8 insertions(+)


Reviewed-by: Max Reitz mre...@redhat.com



Re: [Qemu-block] [PATCH v3 02/12] block: Add op blocker notifier list

2015-05-18 Thread Max Reitz

On 15.05.2015 08:04, Fam Zheng wrote:

BDS users can register a notifier and get notified about op blocker
changes.

Signed-off-by: Fam Zheng f...@redhat.com
---
  block.c   | 35 +++
  include/block/block.h |  8 
  include/block/block_int.h |  3 +++
  3 files changed, 46 insertions(+)

diff --git a/block.c b/block.c
index 7904098..178e186 100644
--- a/block.c
+++ b/block.c
@@ -3375,6 +3375,19 @@ struct BdrvOpBlocker {
  QLIST_ENTRY(BdrvOpBlocker) list;
  };
  
+/* Add a notifier which will be notified when the first blocker of some type is

+ * added to bs, or when the last blocker is removed. The notifier handler
+ * should receive a BlockOpEvent data.
+ *
+ * Caller must hold bs-aio_context; the notifier handler is also called with
+ * it hold.


*held (I think)

With that fixed:

Reviewed-by: Max Reitz mre...@redhat.com

(Although I'm missing a place where the notifiers are freed (on 
bdrv_delete()/bdrv_close()), but maybe that's done in a different patch)



+ */
+void bdrv_op_blocker_add_notifier(BlockDriverState *bs,
+  Notifier *notifier)
+{
+notifier_list_add(bs-op_blocker_notifiers, notifier);
+}
+
  bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
  {
  BdrvOpBlocker *blocker;
@@ -3391,26 +3404,48 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, 
BlockOpType op, Error **errp)
  return false;
  }
  
+static void bdrv_op_blocker_notify(BlockDriverState *bs, BlockOpType op,

+   Error *reason, bool blocking)
+{
+BlockOpEvent event = (BlockOpEvent) {
+op = op,
+reason = reason,
+blocking = blocking,
+};
+
+notifier_list_notify(bs-op_blocker_notifiers, event);
+}
+
  void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
  {
+bool blocked;
  BdrvOpBlocker *blocker;
  assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
  
  blocker = g_new0(BdrvOpBlocker, 1);

  blocker-reason = reason;
+blocked = !QLIST_EMPTY(bs-op_blockers[op]);
  QLIST_INSERT_HEAD(bs-op_blockers[op], blocker, list);
+if (!blocked) {
+bdrv_op_blocker_notify(bs, op, reason, true);
+}
  }
  
  void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)

  {
+bool blocked;
  BdrvOpBlocker *blocker, *next;
  assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
+blocked = !QLIST_EMPTY(bs-op_blockers[op]);
  QLIST_FOREACH_SAFE(blocker, bs-op_blockers[op], list, next) {
  if (blocker-reason == reason) {
  QLIST_REMOVE(blocker, list);
  g_free(blocker);
  }
  }
+if (blocked  QLIST_EMPTY(bs-op_blockers[op])) {
+bdrv_op_blocker_notify(bs, op, reason, false);
+}
  }
  
  void bdrv_op_block_all(BlockDriverState *bs, Error *reason)

diff --git a/include/block/block.h b/include/block/block.h
index 906fb31..3420b2c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -163,6 +163,12 @@ typedef enum BlockOpType {
  BLOCK_OP_TYPE_MAX,
  } BlockOpType;
  
+typedef struct {

+BlockOpType type;
+Error *reason;
+bool blocking;
+} BlockOpEvent;
+
  void bdrv_iostatus_enable(BlockDriverState *bs);
  void bdrv_iostatus_reset(BlockDriverState *bs);
  void bdrv_iostatus_disable(BlockDriverState *bs);
@@ -491,6 +497,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
  void bdrv_ref(BlockDriverState *bs);
  void bdrv_unref(BlockDriverState *bs);
  
+void bdrv_op_blocker_add_notifier(BlockDriverState *bs,

+  Notifier *notifier);
  bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
  void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
  void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..29d1c84 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -418,6 +418,9 @@ struct BlockDriverState {
  /* operation blockers */
  QLIST_HEAD(, BdrvOpBlocker) op_blockers[BLOCK_OP_TYPE_MAX];
  
+/* Callback when one list in op_blockers has empty status change. */

+NotifierList op_blocker_notifiers;
+
  /* long-running background operation */
  BlockJob *job;
  





Re: [Qemu-block] [PATCH 0/5] Misc fixes and testing of qcow[2] encryption

2015-05-18 Thread Kevin Wolf
Am 12.05.2015 um 18:09 hat Daniel P. Berrange geschrieben:
 I realize that qcow[2] encryption is a feature we have deprecated
 and will remove support for running it with the QEMU system
 emulators in this cycle. We do still need to make sure it continues
 to work for the sake of letting people run qemu-img convert to
 retrieve their data though.
 
 Some of the other patches I'm working on which introduce a cypto
 cipher API touch this qcow2 code, thus I wanted to be able to test
 that it doesn't break anything.
 
 I found that qemu-iotests didn't have any coverage of the qcow2
 encryption code. For added fun, I then discovered that qemu-io
 doesn't check if an encryption key is required, so ends up
 writing plain text to the files instead of cipher, and returning
 cipher text for reads, instead of plain text. IOW qemu-io will
 corrupt encrypted qcow2 files on write.
 
 This series adds some asserts that will protect against this kind
 of mistake, adds support for getting passwords to qemu-io (in the
 same manner that qemu-img supports), and finally adds a test case
 for reading/writing encrypted qcow2.

Thanks, applied to the block branch.

Kevin



[Qemu-block] [PATCH 1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty()

2015-05-18 Thread Alberto Garcia
After having emptied the cache, the data in the cache tables is no
longer useful, so we can tell the kernel that we are done with it. In
Linux this frees the resources associated with it.

The effect of this can be seen in the block-commit operation: it moves
data from the top to the base image (and fills both caches), then it
empties the top image. At this point the data in that cache is no
longer needed so it's just wasting memory.

Signed-off-by: Alberto Garcia be...@igalia.com
---
 block/qcow2-cache.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index ed92a09..ed14a92 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -22,8 +22,10 @@
  * THE SOFTWARE.
  */
 
+#include sys/mman.h
 #include block/block_int.h
 #include qemu-common.h
+#include qemu/osdep.h
 #include qcow2.h
 #include trace.h
 
@@ -60,6 +62,22 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState 
*bs,
 return idx;
 }
 
+static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
+  int i, int num_tables)
+{
+#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
+BDRVQcowState *s = bs-opaque;
+void *t = qcow2_cache_get_table_addr(bs, c, i);
+long align = sysconf(_SC_PAGESIZE);
+size_t mem_size = (size_t) s-cluster_size * num_tables;
+size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
+size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
+if (length  0) {
+qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
+}
+#endif
+}
+
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 {
 BDRVQcowState *s = bs-opaque;
@@ -237,6 +255,8 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
 c-entries[i].lru_counter = 0;
 }
 
+qcow2_cache_table_release(bs, c, 0, c-size);
+
 c-lru_counter = 0;
 
 return 0;
-- 
2.1.4




[Qemu-block] [PATCH 0/3] Clean unused entries in the qcow2 L2/refcount cache

2015-05-18 Thread Alberto Garcia
These patches use MADV_DONTNEED to clean unused cache entries. Under
Linux it frees the memory used by those pages.

Berto

Alberto Garcia (3):
  qcow2: mark the memory as no longer needed after qcow2_cache_empty()
  qcow2: add option to clean unused cache entries after some time
  qcow2: reorder fields in Qcow2CachedTable to reduce padding

 block/qcow2-cache.c  | 57 +++-
 block/qcow2.c| 56 +++
 block/qcow2.h|  4 
 qapi/block-core.json |  6 +-
 4 files changed, 121 insertions(+), 2 deletions(-)

-- 
2.1.4




[Qemu-block] [PATCH 2/3] qcow2: add option to clean unused cache entries after some time

2015-05-18 Thread Alberto Garcia
This adds a new 'cache-clean-interval' option that cleans all qcow2
cache entries that haven't been used in a certain interval, given in
seconds.

This allows setting a large L2 cache size so it can handle scenarios
with lots of I/O and at the same time use little memory during periods
of inactivity.

This feature currently relies on MADV_DONTNEED to free that memory, so
it is not useful in systems that don't follow that behavior.

Signed-off-by: Alberto Garcia be...@igalia.com
---
 block/qcow2-cache.c  | 35 
 block/qcow2.c| 56 
 block/qcow2.h|  4 
 qapi/block-core.json |  6 +-
 4 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index ed14a92..a215f5b 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -43,6 +43,7 @@ struct Qcow2Cache {
 booldepends_on_flush;
 void   *table_array;
 uint64_tlru_counter;
+uint64_tcache_clean_lru_counter;
 };
 
 static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs,
@@ -78,6 +79,40 @@ static void qcow2_cache_table_release(BlockDriverState *bs, 
Qcow2Cache *c,
 #endif
 }
 
+static inline bool can_clean_entry(Qcow2Cache *c, int i)
+{
+Qcow2CachedTable *t = c-entries[i];
+return t-ref == 0  !t-dirty  t-offset != 0 
+t-lru_counter = c-cache_clean_lru_counter;
+}
+
+void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c)
+{
+int i = 0;
+while (i  c-size) {
+int to_clean = 0;
+
+/* Skip the entries that we don't need to clean */
+while (i  c-size  !can_clean_entry(c, i)) {
+i++;
+}
+
+/* And count how many we can clean in a row */
+while (i  c-size  can_clean_entry(c, i)) {
+c-entries[i].offset = 0;
+c-entries[i].lru_counter = 0;
+i++;
+to_clean++;
+}
+
+if (to_clean  0) {
+qcow2_cache_table_release(bs, c, i - to_clean, to_clean);
+}
+}
+
+c-cache_clean_lru_counter = c-lru_counter;
+}
+
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 {
 BDRVQcowState *s = bs-opaque;
diff --git a/block/qcow2.c b/block/qcow2.c
index b9a72e3..f6ed39f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -468,6 +468,11 @@ static QemuOptsList qcow2_runtime_opts = {
 .type = QEMU_OPT_SIZE,
 .help = Maximum refcount block cache size,
 },
+{
+.name = QCOW2_OPT_CACHE_CLEAN_INTERVAL,
+.type = QEMU_OPT_NUMBER,
+.help = Clean unused cache entries after this time (in seconds),
+},
 { /* end of list */ }
 },
 };
@@ -483,6 +488,49 @@ static const char 
*overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = {
 [QCOW2_OL_INACTIVE_L2_BITNR]= QCOW2_OPT_OVERLAP_INACTIVE_L2,
 };
 
+static void cache_clean_timer_cb(void *opaque)
+{
+BlockDriverState *bs = opaque;
+BDRVQcowState *s = bs-opaque;
+qcow2_cache_clean_unused(bs, s-l2_table_cache);
+qcow2_cache_clean_unused(bs, s-refcount_block_cache);
+timer_mod(s-cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+  s-cache_clean_interval * 1000);
+}
+
+static void cache_clean_timer_init(BlockDriverState *bs, AioContext *context)
+{
+BDRVQcowState *s = bs-opaque;
+if (s-cache_clean_interval  0) {
+s-cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL,
+ SCALE_MS, cache_clean_timer_cb,
+ bs);
+timer_mod(s-cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+  s-cache_clean_interval * 1000);
+}
+}
+
+static void cache_clean_timer_del(BlockDriverState *bs)
+{
+BDRVQcowState *s = bs-opaque;
+if (s-cache_clean_timer) {
+timer_del(s-cache_clean_timer);
+timer_free(s-cache_clean_timer);
+s-cache_clean_timer = NULL;
+}
+}
+
+static void qcow2_detach_aio_context(BlockDriverState *bs)
+{
+cache_clean_timer_del(bs);
+}
+
+static void qcow2_attach_aio_context(BlockDriverState *bs,
+ AioContext *new_context)
+{
+cache_clean_timer_init(bs, new_context);
+}
+
 static void read_cache_sizes(QemuOpts *opts, uint64_t *l2_cache_size,
  uint64_t *refcount_cache_size, Error **errp)
 {
@@ -838,6 +886,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags,
 ret = -ENOMEM;
 goto fail;
 }
+s-cache_clean_interval =
+qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, 0);
+cache_clean_timer_init(bs, bdrv_get_aio_context(bs));
 
 s-cluster_cache = g_malloc(s-cluster_size);
 /* one more sector for decompressed data alignment */
@@ -1004,6 +1055,7 @@ static int 

Re: [Qemu-block] [PATCH v4 06/11] block: add refcount to Job object

2015-05-18 Thread Stefan Hajnoczi
On Mon, May 11, 2015 at 07:04:21PM -0400, John Snow wrote:
 If we want to get at the job after the life of the job,
 we'll need a refcount for this object.
 
 This may occur for example if we wish to inspect the actions
 taken by a particular job after a transactional group of jobs
 runs, and further actions are required.
 
 Signed-off-by: John Snow js...@redhat.com
 Reviewed-by: Max Reitz mre...@redhat.com
 ---
  blockjob.c   | 18 --
  include/block/blockjob.h | 21 +
  2 files changed, 37 insertions(+), 2 deletions(-)

I think the only reason for this refcount is so that
backup_transaction_complete() can be called.  It accesses
BackupBlockJob-sync_bitmap so the BackupBlockJob instance needs to be
alive.

The bitmap refcount is incremented in blockdev.c, not block/backup.c, so
it is fine to drop backup_transaction_complete() and decrement the
bitmap refcount in blockdev.c instead.

If you do that then there is no need to add a refcount to block job.
This would simplify things.


pgpooeh7qM9TI.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v4 09/11] block: drive_backup transaction callback support

2015-05-18 Thread John Snow


On 05/18/2015 11:35 AM, Stefan Hajnoczi wrote:
 On Mon, May 11, 2015 at 07:04:24PM -0400, John Snow wrote:
 +static void drive_backup_cb(BlkActionState *common) +{ +
 BlkActionCallbackData *cb_data = common-cb_data; +
 BlockDriverState *bs = cb_data-opaque; +DriveBackupState
 *state = DO_UPCAST(DriveBackupState, common, common); + +
 assert(state-bs == bs); +if (bs-job) { +
 assert(state-job == bs-job); +}
 
 What is the purpose of the if statement?
 
 Why is it not okay for a new job to have started?
 

Hmm, maybe it's fine -- It was just my thought that it probably
/shouldn't/ occur under normal circumstances.

I think my assumption was that we want to impose an ordering that job
cleanup occurs before another job launches, in general.

I think, though, that you wanted to start allowing non-conflicting
jobs to run concurrently, though, so I'll just eye over this series
again to make sure it's okay for cleanup to happen after another job
starts ...

...Provided the second job does not fiddle with bitmaps, of course. We
should clean those up before another bitmap job starts, definitely.

 + +state-aio_context = bdrv_get_aio_context(bs); +
 aio_context_acquire(state-aio_context);
 
 The bs-job access above should be protected by
 aio_context_acquire().
 

Thanks,
--js



Re: [Qemu-block] [PATCH v4 11/11] qmp-commands.hx: Update the supported 'transaction' operations

2015-05-18 Thread Max Reitz

On 12.05.2015 01:04, John Snow wrote:

From: Kashyap Chamarthy kcham...@redhat.com

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

 drive-backup
 blockdev-backup
 blockdev-snapshot-internal-sync
 abort
 block-dirty-bitmap-add
 block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy kcham...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
  qmp-commands.hx | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 7506774..363126a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,14 @@ SQMP
  transaction
  ---
  
-Atomically operate on one or more block devices.  The only supported operations

-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported: drive-backup, blockdev-backup,
+blockdev-snapshot-sync, blockdev-snapshot-internal-sync, abort,
+block-dirty-bitmap-add, block-dirty-bitmap-clear


Hm, seven operations... Worth making it a real list?

Max


(refer to the
+qemu/qapi-schema.json file for minimum required QEMU versions for these
+operations).  A list of dictionaries is accepted, that contains the
+actions to be performed.  If there is any failure performing any of the
+operations, all operations for the group are abandoned.
  
  For external snapshots, the dictionary contains the device, the file to use for

  the new snapshot, and the format.  The default format, if not specified, is





Re: [Qemu-block] [PATCH 1/1] docs: document how to configure the qcow2 L2/refcount caches

2015-05-18 Thread Alberto Garcia
On Mon 18 May 2015 06:28:03 PM CEST, Max Reitz mre...@redhat.com wrote:

 +This work is licensed under the terms of the GNU GPL, version 2 or
 +later. See the COPYING file in the top-level directory.

 Well, GPL is strange for documentation but I guess it works (also,
 it's the implicit license for any documentation withou an explicit
 license, so it has to work...).

I don't have a strong preference here as long as it's free so I decided
to stick to what the majority of QEMU documents are using.

Berto



Re: [Qemu-block] [PATCH v3 05/12] virtio-blk: Don't handle output when there is device IO op blocker

2015-05-18 Thread Max Reitz

On 15.05.2015 08:04, Fam Zheng wrote:

virtio-blk now listens to op blocker change of the associated block
backend.

Up on setting op blocker on BLOCK_OP_TYPE_DEVICE_IO:

   non-dataplane:
1) Set VirtIOBlock.paused
2) In virtio_blk_handle_output, do nothing if VirtIOBlock.paused

   dataplane:
1) Clear the host event notifier
2) In handle_notify, do nothing if VirtIOBlock.paused

Up on removing the op blocker:

   non-dataplane:
1) Clear VirtIOBlock.paused
2) Schedule a BH on the AioContext of the backend, which calls
virtio_blk_handle_output, so that the previous unhandled kicks can
make progress

   dataplane:
1) Set the host event notifier
2) Notify the host event notifier so that unhandled events are
processed

Signed-off-by: Fam Zheng f...@redhat.com
---
  hw/block/dataplane/virtio-blk.c | 25 +++-
  hw/block/virtio-blk.c   | 63 +++--
  include/hw/virtio/virtio-blk.h  |  8 +-
  3 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index e287e09..a5e8e35 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -87,8 +87,28 @@ static void complete_request_vring(VirtIOBlockReq *req, 
unsigned char status)
  qemu_bh_schedule(s-bh);
  }
  
+static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)

+{
+VirtIOBlockDataPlane *s = vblk-dataplane;
+
+event_notifier_test_and_clear(s-host_notifier);
+aio_set_event_notifier(s-ctx, s-host_notifier, NULL);
+}
+
+static void handle_notify(EventNotifier *e);
+static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
+{
+VirtIOBlockDataPlane *s = vblk-dataplane;
+
+aio_set_event_notifier(s-ctx, s-host_notifier, handle_notify);
+
+event_notifier_set(s-host_notifier);
+}
+
  static const VirtIOBlockOps virtio_blk_data_plane_ops = {
-.complete_request = complete_request_vring,
+.complete_request   = complete_request_vring,
+.pause  = virtio_blk_data_plane_pause,
+.resume = virtio_blk_data_plane_resume,
  };
  
  static void handle_notify(EventNotifier *e)

@@ -98,6 +118,9 @@ static void handle_notify(EventNotifier *e)
  VirtIOBlock *vblk = VIRTIO_BLK(s-vdev);
  
  event_notifier_test_and_clear(s-host_notifier);

+if (vblk-paused) {
+return;
+}
  blk_io_plug(s-conf-conf.blk);
  for (;;) {
  MultiReqBuffer mrb = {};
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f4a9d19..4bc1b2a 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,8 +59,38 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
  virtio_notify(vdev, s-vq);
  }
  
+typedef struct {

+QEMUBH *bh;
+VirtIOBlock *s;
+} VirtIOBlockResumeData;
+
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq);
+static void virtio_blk_resume_bh_cb(void *opaque)
+{
+VirtIOBlockResumeData *data = opaque;
+qemu_bh_delete(data-bh);
+virtio_blk_handle_output(VIRTIO_DEVICE(data-s), data-s-vq);
+}
+
+static void virtio_blk_pause(VirtIOBlock *vblk)
+{
+/* TODO: stop ioeventfd */
+}
+
+static void virtio_blk_resume(VirtIOBlock *vblk)
+{
+VirtIOBlockResumeData *data = g_new(VirtIOBlockResumeData, 1);
+data-bh = aio_bh_new(blk_get_aio_context(vblk-blk),
+virtio_blk_resume_bh_cb, data);
+data-s = vblk;
+data-s-paused = false;
+qemu_bh_schedule(data-bh);
+}
+
  static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
-.complete_request = virtio_blk_complete_request,
+.complete_request   = virtio_blk_complete_request,
+.pause  = virtio_blk_pause,
+.resume = virtio_blk_resume,
  };
  
  static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)

@@ -597,6 +627,9 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
  VirtIOBlockReq *req;
  MultiReqBuffer mrb = {};
  
+if (s-paused) {

+return;
+}
  /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
   * dataplane here instead of waiting for .set_status().
   */
@@ -787,7 +820,7 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
  
  virtio_save(vdev, f);

  }
-
+
  static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
  {
  VirtIOBlock *s = VIRTIO_BLK(vdev);
@@ -875,6 +908,29 @@ static void virtio_blk_migration_state_changed(Notifier 
*notifier, void *data)
  }
  }
  
+static void virtio_blk_op_blocker_changed(Notifier *notifier, void *opaque)

+{
+BlockOpEvent *event = opaque;
+VirtIOBlock *s = container_of(notifier, VirtIOBlock,
+  op_blocker_notifier);
+bool pause;
+
+if (event-type != BLOCK_OP_TYPE_DEVICE_IO) {
+return;
+}
+pause = event-blocking || 

Re: [Qemu-block] [PATCH v3 06/12] nbd-server: Clear can_read when device io blocker is set

2015-05-18 Thread Max Reitz

On 15.05.2015 08:04, Fam Zheng wrote:

So that NBD export cannot submit IO during bdrv_drain_all().

Signed-off-by: Fam Zheng f...@redhat.com
---
  nbd.c | 18 ++
  1 file changed, 18 insertions(+)


But can_read is not cleared until nbd_update_can_read() is called, which 
is only in nbd_request_get(), nbd_request_put(), and 
nbd_co_receive_request(), which are all called from nbd_trip() - so the 
one request which clears can_read will most probably be itself processed.


Maybe we should call nbd_update_can_read() in nbd_op_blocker_changed(), 
for all clients of the export? Or just once somewhere at the beginning 
of processing a request, and if io_blocked is set, just yield...


Max



[Qemu-block] [PATCH v3 07/13] Change qemu_set_fd_handler2(..., NULL, ...) to qemu_set_fd_handler

2015-05-18 Thread Fam Zheng
Done with following Coccinelle semantic patch, plus manual cosmetic changes in
net/*.c.

@@
expression E1, E2, E3, E4;
@@
-   qemu_set_fd_handler2(E1, NULL, E2, E3, E4);
+   qemu_set_fd_handler(E1, E2, E3, E4);

Signed-off-by: Fam Zheng f...@redhat.com
---
 blockdev-nbd.c |  4 ++--
 main-loop.c|  3 +--
 migration/exec.c   |  6 +++---
 migration/fd.c |  4 ++--
 migration/rdma.c   |  7 +++
 migration/tcp.c|  6 +++---
 migration/unix.c   |  6 +++---
 net/l2tpv3.c   |  8 
 net/netmap.c   |  8 
 net/socket.c   |  8 
 net/tap.c  |  8 
 ui/vnc-auth-sasl.c |  2 +-
 ui/vnc-auth-vencrypt.c |  2 +-
 ui/vnc-ws.c|  6 +++---
 ui/vnc.c   | 27 ---
 util/qemu-sockets.c|  8 +++-
 16 files changed, 53 insertions(+), 60 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 85cda4c..0d9df47 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -43,7 +43,7 @@ void qmp_nbd_server_start(SocketAddress *addr, Error **errp)
 
 server_fd = socket_listen(addr, errp);
 if (server_fd != -1) {
-qemu_set_fd_handler2(server_fd, NULL, nbd_accept, NULL, NULL);
+qemu_set_fd_handler(server_fd, nbd_accept, NULL, NULL);
 }
 }
 
@@ -129,7 +129,7 @@ void qmp_nbd_server_stop(Error **errp)
 }
 
 if (server_fd != -1) {
-qemu_set_fd_handler2(server_fd, NULL, NULL, NULL, NULL);
+qemu_set_fd_handler(server_fd, NULL, NULL, NULL);
 close(server_fd);
 server_fd = -1;
 }
diff --git a/main-loop.c b/main-loop.c
index 981bcb5..82875a4 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -100,8 +100,7 @@ static int qemu_signal_init(void)
 
 fcntl_setfl(sigfd, O_NONBLOCK);
 
-qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL,
- (void *)(intptr_t)sigfd);
+qemu_set_fd_handler(sigfd, sigfd_handler, NULL, (void *)(intptr_t)sigfd);
 
 return 0;
 }
diff --git a/migration/exec.c b/migration/exec.c
index 4790247..8406d2b 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -49,7 +49,7 @@ static void exec_accept_incoming_migration(void *opaque)
 {
 QEMUFile *f = opaque;
 
-qemu_set_fd_handler2(qemu_get_fd(f), NULL, NULL, NULL, NULL);
+qemu_set_fd_handler(qemu_get_fd(f), NULL, NULL, NULL);
 process_incoming_migration(f);
 }
 
@@ -64,6 +64,6 @@ void exec_start_incoming_migration(const char *command, Error 
**errp)
 return;
 }
 
-qemu_set_fd_handler2(qemu_get_fd(f), NULL,
-exec_accept_incoming_migration, NULL, f);
+qemu_set_fd_handler(qemu_get_fd(f), exec_accept_incoming_migration, NULL,
+f);
 }
diff --git a/migration/fd.c b/migration/fd.c
index 129da99..3e4bed0 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -62,7 +62,7 @@ static void fd_accept_incoming_migration(void *opaque)
 {
 QEMUFile *f = opaque;
 
-qemu_set_fd_handler2(qemu_get_fd(f), NULL, NULL, NULL, NULL);
+qemu_set_fd_handler(qemu_get_fd(f), NULL, NULL, NULL);
 process_incoming_migration(f);
 }
 
@@ -84,5 +84,5 @@ void fd_start_incoming_migration(const char *infd, Error 
**errp)
 return;
 }
 
-qemu_set_fd_handler2(fd, NULL, fd_accept_incoming_migration, NULL, f);
+qemu_set_fd_handler(fd, fd_accept_incoming_migration, NULL, f);
 }
diff --git a/migration/rdma.c b/migration/rdma.c
index 77e3444..171c23f 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2834,7 +2834,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 }
 }
 
-qemu_set_fd_handler2(rdma-channel-fd, NULL, NULL, NULL, NULL);
+qemu_set_fd_handler(rdma-channel-fd, NULL, NULL, NULL);
 
 ret = rdma_accept(rdma-cm_id, conn_param);
 if (ret) {
@@ -3331,9 +3331,8 @@ void rdma_start_incoming_migration(const char *host_port, 
Error **errp)
 
 trace_rdma_start_incoming_migration_after_rdma_listen();
 
-qemu_set_fd_handler2(rdma-channel-fd, NULL,
- rdma_accept_incoming_migration, NULL,
-(void *)(intptr_t) rdma);
+qemu_set_fd_handler(rdma-channel-fd, rdma_accept_incoming_migration,
+NULL, (void *)(intptr_t)rdma);
 return;
 err:
 error_propagate(errp, local_err);
diff --git a/migration/tcp.c b/migration/tcp.c
index 91c9cf3..ae89172 100644
--- a/migration/tcp.c
+++ b/migration/tcp.c
@@ -65,7 +65,7 @@ static void tcp_accept_incoming_migration(void *opaque)
 c = qemu_accept(s, (struct sockaddr *)addr, addrlen);
 err = socket_error();
 } while (c  0  err == EINTR);
-qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
+qemu_set_fd_handler(s, NULL, NULL, NULL);
 closesocket(s);
 
 DPRINTF(accepted migration\n);
@@ -98,6 +98,6 @@ void tcp_start_incoming_migration(const char *host_port, 
Error **errp)
 return;
 }
 
-qemu_set_fd_handler2(s, NULL, 

[Qemu-block] [PATCH v3 05/13] net/socket: Drop net_socket_can_send

2015-05-18 Thread Fam Zheng
This callback is called by main loop before polling s-fd, if it returns
false, the fd will not be polled in this iteration.

This is redundant with checks inside read callback. After this patch,
the data will be sent to peer when it arrives. If the device can't
receive, it will be queued to incoming_queue, and when the device status
changes, this queue will be flushed.

If the peer is not ready, disable the read poll until send completes.

Signed-off-by: Fam Zheng f...@redhat.com
---
 net/socket.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index c30e03f..82fa175 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -51,18 +51,9 @@ typedef struct NetSocketState {
 static void net_socket_accept(void *opaque);
 static void net_socket_writable(void *opaque);
 
-/* Only read packets from socket when peer can receive them */
-static int net_socket_can_send(void *opaque)
-{
-NetSocketState *s = opaque;
-
-return qemu_can_send_packet(s-nc);
-}
-
 static void net_socket_update_fd_handler(NetSocketState *s)
 {
-qemu_set_fd_handler2(s-fd,
- s-read_poll  ? net_socket_can_send : NULL,
+qemu_set_fd_handler2(s-fd, NULL,
  s-read_poll  ? s-send_fn : NULL,
  s-write_poll ? net_socket_writable : NULL,
  s);
@@ -142,6 +133,15 @@ static ssize_t net_socket_receive_dgram(NetClientState 
*nc, const uint8_t *buf,
 return ret;
 }
 
+static void net_socket_send_completed(NetClientState *nc, ssize_t len)
+{
+NetSocketState *s = DO_UPCAST(NetSocketState, nc, nc);
+
+if (!s-read_poll) {
+net_socket_read_poll(s, true);
+}
+}
+
 static void net_socket_send(void *opaque)
 {
 NetSocketState *s = opaque;
@@ -211,9 +211,13 @@ static void net_socket_send(void *opaque)
 buf += l;
 size -= l;
 if (s-index = s-packet_len) {
-qemu_send_packet(s-nc, s-buf, s-packet_len);
 s-index = 0;
 s-state = 0;
+if (qemu_send_packet_async(s-nc, s-buf, size,
+   net_socket_send_completed) == 0) {
+net_socket_read_poll(s, false);
+break;
+}
 }
 break;
 }
@@ -234,7 +238,10 @@ static void net_socket_send_dgram(void *opaque)
 net_socket_write_poll(s, false);
 return;
 }
-qemu_send_packet(s-nc, s-buf, size);
+if (qemu_send_packet_async(s-nc, s-buf, size,
+   net_socket_send_completed) == 0) {
+net_socket_read_poll(s, false);
+}
 }
 
 static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, struct 
in_addr *localaddr)
-- 
2.4.1




[Qemu-block] [PATCH v3 09/13] alsaaudio: Remove unused error handling of qemu_set_fd_handler

2015-05-18 Thread Fam Zheng
The function cannot fail, so the check is superfluous.

Signed-off-by: Fam Zheng f...@redhat.com
---
 audio/alsaaudio.c | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/audio/alsaaudio.c b/audio/alsaaudio.c
index 74ead97..ed7655d 100644
--- a/audio/alsaaudio.c
+++ b/audio/alsaaudio.c
@@ -266,31 +266,19 @@ static int alsa_poll_helper (snd_pcm_t *handle, struct 
pollhlp *hlp, int mask)
 
 for (i = 0; i  count; ++i) {
 if (pfds[i].events  POLLIN) {
-err = qemu_set_fd_handler (pfds[i].fd, alsa_poll_handler,
-   NULL, hlp);
+qemu_set_fd_handler (pfds[i].fd, alsa_poll_handler, NULL, hlp);
 }
 if (pfds[i].events  POLLOUT) {
 if (conf.verbose) {
 dolog (POLLOUT %d %d\n, i, pfds[i].fd);
 }
-err = qemu_set_fd_handler (pfds[i].fd, NULL,
-   alsa_poll_handler, hlp);
+qemu_set_fd_handler (pfds[i].fd, NULL, alsa_poll_handler, hlp);
 }
 if (conf.verbose) {
 dolog (Set handler events=%#x index=%d fd=%d err=%d\n,
pfds[i].events, i, pfds[i].fd, err);
 }
 
-if (err) {
-dolog (Failed to set handler events=%#x index=%d fd=%d err=%d\n,
-   pfds[i].events, i, pfds[i].fd, err);
-
-while (i--) {
-qemu_set_fd_handler (pfds[i].fd, NULL, NULL, NULL);
-}
-g_free (pfds);
-return -1;
-}
 }
 hlp-pfds = pfds;
 hlp-count = count;
-- 
2.4.1




[Qemu-block] [PATCH v3 08/13] main-loop: Drop qemu_set_fd_handler2

2015-05-18 Thread Fam Zheng
All users are converted to qemu_set_fd_handler now, drop
qemu_set_fd_handler2 and IOHandlerRecord.fd_read_poll.

Signed-off-by: Fam Zheng f...@redhat.com
---
 include/block/aio.h  |  2 +-
 include/qemu/main-loop.h | 49 +---
 iohandler.c  | 26 +
 stubs/set-fd-handler.c   |  9 -
 4 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index d2bb423..b46103e 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -241,7 +241,7 @@ bool aio_dispatch(AioContext *ctx);
 bool aio_poll(AioContext *ctx, bool blocking);
 
 /* Register a file descriptor and associated callbacks.  Behaves very similarly
- * to qemu_set_fd_handler2.  Unlike qemu_set_fd_handler2, these callbacks will
+ * to qemu_set_fd_handler.  Unlike qemu_set_fd_handler, these callbacks will
  * be invoked when using aio_poll().
  *
  * Code that invokes AIO completion functions should rely on this function
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 62c68c0..7da1d63 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -96,8 +96,7 @@ AioContext *qemu_get_aio_context(void);
  * that the main loop waits for.
  *
  * Calling qemu_notify_event is rarely necessary, because main loop
- * services (bottom halves and timers) call it themselves.  One notable
- * exception occurs when using qemu_set_fd_handler2 (see below).
+ * services (bottom halves and timers) call it themselves.
  */
 void qemu_notify_event(void);
 
@@ -172,52 +171,6 @@ typedef void IOReadHandler(void *opaque, const uint8_t 
*buf, int size);
 typedef int IOCanReadHandler(void *opaque);
 
 /**
- * qemu_set_fd_handler2: Register a file descriptor with the main loop
- *
- * This function tells the main loop to wake up whenever one of the
- * following conditions is true:
- *
- * 1) if @fd_write is not %NULL, when the file descriptor is writable;
- *
- * 2) if @fd_read is not %NULL, when the file descriptor is readable.
- *
- * @fd_read_poll can be used to disable the @fd_read callback temporarily.
- * This is useful to avoid calling qemu_set_fd_handler2 every time the
- * client becomes interested in reading (or dually, stops being interested).
- * A typical example is when @fd is a listening socket and you want to bound
- * the number of active clients.  Remember to call qemu_notify_event whenever
- * the condition may change from %false to %true.
- *
- * The callbacks that are set up by qemu_set_fd_handler2 are level-triggered.
- * If @fd_read does not read from @fd, or @fd_write does not write to @fd
- * until its buffers are full, they will be called again on the next
- * iteration.
- *
- * @fd: The file descriptor to be observed.  Under Windows it must be
- * a #SOCKET.
- *
- * @fd_read_poll: A function that returns 1 if the @fd_read callback
- * should be fired.  If the function returns 0, the main loop will not
- * end its iteration even if @fd becomes readable.
- *
- * @fd_read: A level-triggered callback that is fired if @fd is readable
- * at the beginning of a main loop iteration, or if it becomes readable
- * during one.
- *
- * @fd_write: A level-triggered callback that is fired when @fd is writable
- * at the beginning of a main loop iteration, or if it becomes writable
- * during one.
- *
- * @opaque: A pointer-sized value that is passed to @fd_read_poll,
- * @fd_read and @fd_write.
- */
-int qemu_set_fd_handler2(int fd,
- IOCanReadHandler *fd_read_poll,
- IOHandler *fd_read,
- IOHandler *fd_write,
- void *opaque);
-
-/**
  * qemu_set_fd_handler: Register a file descriptor with the main loop
  *
  * This function tells the main loop to wake up whenever one of the
diff --git a/iohandler.c b/iohandler.c
index cca614f..d361cf2 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -33,7 +33,6 @@
 #endif
 
 typedef struct IOHandlerRecord {
-IOCanReadHandler *fd_read_poll;
 IOHandler *fd_read;
 IOHandler *fd_write;
 void *opaque;
@@ -46,14 +45,10 @@ typedef struct IOHandlerRecord {
 static QLIST_HEAD(, IOHandlerRecord) io_handlers =
 QLIST_HEAD_INITIALIZER(io_handlers);
 
-
-/* XXX: fd_read_poll should be suppressed, but an API change is
-   necessary in the character devices to suppress fd_can_read(). */
-int qemu_set_fd_handler2(int fd,
- IOCanReadHandler *fd_read_poll,
- IOHandler *fd_read,
- IOHandler *fd_write,
- void *opaque)
+int qemu_set_fd_handler(int fd,
+IOHandler *fd_read,
+IOHandler *fd_write,
+void *opaque)
 {
 IOHandlerRecord *ioh;
 
@@ -75,7 +70,6 @@ int qemu_set_fd_handler2(int fd,
 QLIST_INSERT_HEAD(io_handlers, ioh, next);
 found:
 ioh-fd = fd;
-ioh-fd_read_poll = 

[Qemu-block] [PATCH v3 13/13] iohandler: Change return type of qemu_set_fd_handler to void

2015-05-18 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 include/qemu/main-loop.h | 8 
 iohandler.c  | 9 -
 stubs/set-fd-handler.c   | 8 
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 7da1d63..0f4a0fd 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -198,10 +198,10 @@ typedef int IOCanReadHandler(void *opaque);
  *
  * @opaque: A pointer-sized value that is passed to @fd_read and @fd_write.
  */
-int qemu_set_fd_handler(int fd,
-IOHandler *fd_read,
-IOHandler *fd_write,
-void *opaque);
+void qemu_set_fd_handler(int fd,
+ IOHandler *fd_read,
+ IOHandler *fd_write,
+ void *opaque);
 
 #ifdef CONFIG_POSIX
 /**
diff --git a/iohandler.c b/iohandler.c
index d361cf2..826f713 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -45,10 +45,10 @@ typedef struct IOHandlerRecord {
 static QLIST_HEAD(, IOHandlerRecord) io_handlers =
 QLIST_HEAD_INITIALIZER(io_handlers);
 
-int qemu_set_fd_handler(int fd,
-IOHandler *fd_read,
-IOHandler *fd_write,
-void *opaque)
+void qemu_set_fd_handler(int fd,
+ IOHandler *fd_read,
+ IOHandler *fd_write,
+ void *opaque)
 {
 IOHandlerRecord *ioh;
 
@@ -77,7 +77,6 @@ int qemu_set_fd_handler(int fd,
 ioh-deleted = 0;
 qemu_notify_event();
 }
-return 0;
 }
 
 void qemu_iohandler_fill(GArray *pollfds)
diff --git a/stubs/set-fd-handler.c b/stubs/set-fd-handler.c
index a895e62..a8481bc 100644
--- a/stubs/set-fd-handler.c
+++ b/stubs/set-fd-handler.c
@@ -1,10 +1,10 @@
 #include qemu-common.h
 #include qemu/main-loop.h
 
-int qemu_set_fd_handler(int fd,
-IOHandler *fd_read,
-IOHandler *fd_write,
-void *opaque)
+void qemu_set_fd_handler(int fd,
+ IOHandler *fd_read,
+ IOHandler *fd_write,
+ void *opaque)
 {
 abort();
 }
-- 
2.4.1




[Qemu-block] [PATCH v4 13/13] block/mirror: Block device IO during mirror exit

2015-05-18 Thread Fam Zheng
When block job mirror is finished, the source and target is synced. But
we call bdrv_swap() later in main loop bh. If the guest write before
that, target will not get the new data.

Reported-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Fam Zheng f...@redhat.com
---
 block/mirror.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..d96403b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -320,6 +320,8 @@ static void mirror_drain(MirrorBlockJob *s)
 
 typedef struct {
 int ret;
+/* Use to pause device IO during mirror exit */
+Error *blocker;
 } MirrorExitData;
 
 static void mirror_exit(BlockJob *job, void *opaque)
@@ -328,6 +330,8 @@ static void mirror_exit(BlockJob *job, void *opaque)
 MirrorExitData *data = opaque;
 AioContext *replace_aio_context = NULL;
 
+bdrv_op_unblock(s-common.bs, BLOCK_OP_TYPE_DEVICE_IO, data-blocker);
+error_free(data-blocker);
 if (s-to_replace) {
 replace_aio_context = bdrv_get_aio_context(s-to_replace);
 aio_context_acquire(replace_aio_context);
@@ -563,8 +567,10 @@ immediate_exit:
 bdrv_release_dirty_bitmap(bs, s-dirty_bitmap);
 bdrv_iostatus_disable(s-target);
 
-data = g_malloc(sizeof(*data));
+data = g_malloc0(sizeof(*data));
 data-ret = ret;
+error_setg(data-blocker, mirror job exiting);
+bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, data-blocker);
 block_job_defer_to_main_loop(s-common, mirror_exit, data);
 }
 
-- 
2.4.1




[Qemu-block] [PATCH v4 06/13] virtio-scsi-dataplane: Add device IO op blocker listener

2015-05-18 Thread Fam Zheng
When a disk is attached to scsi-bus, virtio_scsi_hotplug will take care
of protecting the block device with op blockers. Currently we haven't
enabled block jobs (like what's done in virtio_blk_data_plane_create),
but it is necessary to honor device IO op blocker first before we do.
This is useful to make sure that guest IO requests are paused during qmp
transactions (such as multi-disk snapshot or backup).

A counter is added to the virtio-scsi device, which keeps track of
currently blocked disks. If it goes from 0 to 1, the ioeventfds are
disabled; when it goes back to 0, they are re-enabled.

Also in device initialization, push the enabling of ioeventfds to before
return, so the virtio_scsi_clear_aio is not needed there. Rename it,
pair with an enabling variant, fix one coding style issue, then use it
in the device pause points.

Signed-off-by: Fam Zheng f...@redhat.com
---
 hw/scsi/virtio-scsi-dataplane.c | 82 +++--
 hw/scsi/virtio-scsi.c   |  3 ++
 include/hw/virtio/virtio-scsi.h |  3 ++
 3 files changed, 68 insertions(+), 20 deletions(-)

diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 5575648..e220c12 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -40,7 +40,6 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread 
*iothread)
 
 static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
VirtQueue *vq,
-   EventNotifierHandler *handler,
int n)
 {
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
@@ -60,7 +59,6 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 r = g_slice_new(VirtIOSCSIVring);
 r-host_notifier = *virtio_queue_get_host_notifier(vq);
 r-guest_notifier = *virtio_queue_get_guest_notifier(vq);
-aio_set_event_notifier(s-ctx, r-host_notifier, handler);
 
 r-parent = s;
 
@@ -71,7 +69,6 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 return r;
 
 fail_vring:
-aio_set_event_notifier(s-ctx, r-host_notifier, NULL);
 k-set_host_notifier(qbus-parent, n, false);
 g_slice_free(VirtIOSCSIVring, r);
 return NULL;
@@ -104,6 +101,9 @@ void virtio_scsi_vring_push_notify(VirtIOSCSIReq *req)
 }
 }
 
+static void virtio_scsi_start_ioeventfd(VirtIOSCSI *s);
+static void virtio_scsi_stop_ioeventfd(VirtIOSCSI *s);
+
 static void virtio_scsi_iothread_handle_ctrl(EventNotifier *notifier)
 {
 VirtIOSCSIVring *vring = container_of(notifier,
@@ -111,6 +111,7 @@ static void virtio_scsi_iothread_handle_ctrl(EventNotifier 
*notifier)
 VirtIOSCSI *s = VIRTIO_SCSI(vring-parent);
 VirtIOSCSIReq *req;
 
+assert(!s-pause_counter);
 event_notifier_test_and_clear(notifier);
 while ((req = virtio_scsi_pop_req_vring(s, vring))) {
 virtio_scsi_handle_ctrl_req(s, req);
@@ -124,6 +125,7 @@ static void virtio_scsi_iothread_handle_event(EventNotifier 
*notifier)
 VirtIOSCSI *s = vring-parent;
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
+assert(!s-pause_counter);
 event_notifier_test_and_clear(notifier);
 
 if (!(vdev-status  VIRTIO_CONFIG_S_DRIVER_OK)) {
@@ -143,6 +145,7 @@ static void virtio_scsi_iothread_handle_cmd(EventNotifier 
*notifier)
 VirtIOSCSIReq *req, *next;
 QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
 
+assert(!s-pause_counter);
 event_notifier_test_and_clear(notifier);
 while ((req = virtio_scsi_pop_req_vring(s, vring))) {
 if (virtio_scsi_handle_cmd_req_prepare(s, req)) {
@@ -155,8 +158,56 @@ static void virtio_scsi_iothread_handle_cmd(EventNotifier 
*notifier)
 }
 }
 
+void virtio_scsi_dataplane_blocker_notify(Notifier *notifier,
+  void *data)
+{
+VirtIOSCSI *s = container_of(notifier, VirtIOSCSI, blocker_notifier);
+BlockOpEvent *event = data;
+
+if (event-type != BLOCK_OP_TYPE_DEVICE_IO) {
+return;
+}
+if (event-blocking) {
+s-pause_counter++;
+if (s-pause_counter == 1) {
+virtio_scsi_stop_ioeventfd(s);
+}
+} else {
+s-pause_counter--;
+if (s-pause_counter == 0) {
+virtio_scsi_start_ioeventfd(s);
+}
+}
+assert(s-pause_counter = 0);
+}
+
 /* assumes s-ctx held */
-static void virtio_scsi_clear_aio(VirtIOSCSI *s)
+static void virtio_scsi_start_ioeventfd(VirtIOSCSI *s)
+{
+VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
+int i;
+
+if (!s-dataplane_started || s-dataplane_stopping) {
+return;
+}
+if (s-ctrl_vring) {
+aio_set_event_notifier(s-ctx, s-ctrl_vring-host_notifier,
+   virtio_scsi_iothread_handle_ctrl);
+}
+if (s-event_vring) {
+aio_set_event_notifier(s-ctx, s-event_vring-host_notifier,
+   

[Qemu-block] [PATCH v4 11/13] blockdev: Block device IO during blockdev-backup transaction

2015-05-18 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 blockdev.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index ae52d27..bd28183 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1649,6 +1649,7 @@ typedef struct BlockdevBackupState {
 BlockDriverState *bs;
 BlockJob *job;
 AioContext *aio_context;
+Error *blocker;
 } BlockdevBackupState;
 
 static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
@@ -1685,6 +1686,10 @@ static void blockdev_backup_prepare(BlkTransactionState 
*common, Error **errp)
 }
 aio_context_acquire(state-aio_context);
 
+state-bs = bs;
+error_setg(state-blocker, blockdev-backup in progress);
+bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
+
 qmp_blockdev_backup(backup-device, backup-target,
 backup-sync,
 backup-has_speed, backup-speed,
@@ -1696,7 +1701,6 @@ static void blockdev_backup_prepare(BlkTransactionState 
*common, Error **errp)
 return;
 }
 
-state-bs = bs;
 state-job = state-bs-job;
 }
 
@@ -1715,6 +1719,10 @@ static void blockdev_backup_clean(BlkTransactionState 
*common)
 {
 BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, 
common);
 
+if (state-bs) {
+bdrv_op_unblock(state-bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
+error_free(state-blocker);
+}
 if (state-aio_context) {
 aio_context_release(state-aio_context);
 }
-- 
2.4.1




[Qemu-block] [PATCH v4 03/13] block-backend: Add blk_op_blocker_add_notifier

2015-05-18 Thread Fam Zheng
Forward the call to bdrv_op_blocker_add_notifier.

Signed-off-by: Fam Zheng f...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 block/block-backend.c  | 6 ++
 include/sysemu/block-backend.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 93e46f3..16efe60 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -802,6 +802,12 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
 bdrv_op_unblock_all(blk-bs, reason);
 }
 
+void blk_op_blocker_add_notifier(BlockBackend *blk,
+ Notifier *notifier)
+{
+bdrv_op_blocker_add_notifier(blk-bs, notifier);
+}
+
 AioContext *blk_get_aio_context(BlockBackend *blk)
 {
 return bdrv_get_aio_context(blk-bs);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b4a4d5e..cde9651 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -136,6 +136,8 @@ int blk_get_flags(BlockBackend *blk);
 int blk_get_max_transfer_length(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_blockalign(BlockBackend *blk, size_t size);
+void blk_op_blocker_add_notifier(BlockBackend *blk,
+ Notifier *notifier);
 bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp);
 void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason);
 void blk_op_block_all(BlockBackend *blk, Error *reason);
-- 
2.4.1




[Qemu-block] [PATCH v3 02/13] qemu-nbd: Switch to qemu_set_fd_handler

2015-05-18 Thread Fam Zheng
Achieved by:

- Remembering the server fd with a global variable, in order to access
  it from nbd_client_closed.

- Checking nbd_can_accept() and updating server_fd handler whenever
  client connects or disconnects.

Signed-off-by: Fam Zheng f...@redhat.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-nbd.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 7e690ff..5af6d11 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -53,6 +53,7 @@ static int persistent = 0;
 static enum { RUNNING, TERMINATE, TERMINATING, TERMINATED } state;
 static int shared = 1;
 static int nb_fds;
+static int server_fd;
 
 static void usage(const char *name)
 {
@@ -340,7 +341,7 @@ out:
 return (void *) EXIT_FAILURE;
 }
 
-static int nbd_can_accept(void *opaque)
+static int nbd_can_accept(void)
 {
 return nb_fds  shared;
 }
@@ -351,19 +352,21 @@ static void nbd_export_closed(NBDExport *exp)
 state = TERMINATED;
 }
 
+static void nbd_update_server_fd_handler(int fd);
+
 static void nbd_client_closed(NBDClient *client)
 {
 nb_fds--;
 if (nb_fds == 0  !persistent  state == RUNNING) {
 state = TERMINATE;
 }
+nbd_update_server_fd_handler(server_fd);
 qemu_notify_event();
 nbd_client_put(client);
 }
 
 static void nbd_accept(void *opaque)
 {
-int server_fd = (uintptr_t) opaque;
 struct sockaddr_in addr;
 socklen_t addr_len = sizeof(addr);
 
@@ -380,12 +383,22 @@ static void nbd_accept(void *opaque)
 
 if (nbd_client_new(exp, fd, nbd_client_closed)) {
 nb_fds++;
+nbd_update_server_fd_handler(server_fd);
 } else {
 shutdown(fd, 2);
 close(fd);
 }
 }
 
+static void nbd_update_server_fd_handler(int fd)
+{
+if (nbd_can_accept()) {
+qemu_set_fd_handler(fd, nbd_accept, NULL, (void *)(uintptr_t)fd);
+} else {
+qemu_set_fd_handler(fd, NULL, NULL, NULL);
+}
+}
+
 int main(int argc, char **argv)
 {
 BlockBackend *blk;
@@ -761,8 +774,8 @@ int main(int argc, char **argv)
 memset(client_thread, 0, sizeof(client_thread));
 }
 
-qemu_set_fd_handler2(fd, nbd_can_accept, nbd_accept, NULL,
- (void *)(uintptr_t)fd);
+server_fd = fd;
+nbd_update_server_fd_handler(fd);
 
 /* now when the initialization is (almost) complete, chdir(/)
  * to free any busy filesystems */
-- 
2.4.1




[Qemu-block] [PATCH v3 00/13] main-loop: Get rid of fd_read_poll and qemu_set_fd_handler2

2015-05-18 Thread Fam Zheng
v3: Replace previous 13 with a simple return type conversion patch.
Drop RFC.

This carries out the mandate in the comment of qemu_set_fd_handler2 and removes
fd_read_poll from the code base, because it will make the work easier to
convert ppoll to epoll in main loop.  Also, the aio interface doesn't have a
read poll callback, which means this conversion woule be necessary if we want
to move iohandler fds from main loop to AioContext.

There are five users of the read poll callback now: qemu-nbd, l2tpv3, netmap,
socket and tap, which are all covered.

Patch 1 adds a stub for qemu_set_fd_handler which will be referenced in coming
patches.

Patch 2 converts qemu-nbd which compares two global numbers in the fd_read_poll
callback.

Patches 3~6 converts the four net devices, all of which checks
qemu_can_send_packet() in the callback.

Patch 7 and 8 finally removes the function.

The rest patches are cleanup for the unuseful return value of
qemu_set_fd_handler.

Please review!



Fam Zheng (13):
  stubs: Add qemu_set_fd_handler
  qemu-nbd: Switch to qemu_set_fd_handler
  l2tpv3: Drop l2tpv3_can_send
  netmap: Drop netmap_can_send
  net/socket: Drop net_socket_can_send
  tap: Drop tap_can_send
  Change qemu_set_fd_handler2(..., NULL, ...) to qemu_set_fd_handler
  main-loop: Drop qemu_set_fd_handler2
  alsaaudio: Remove unused error handling of qemu_set_fd_handler
  oss: Remove unused error handling of qemu_set_fd_handler
  xen_backend: Remove unused error handling of qemu_set_fd_handler
  event-notifier: Always return 0 for posix implementation
  iohandler: Change return type of qemu_set_fd_handler to void

 audio/alsaaudio.c   | 16 ++---
 audio/ossaudio.c| 14 ++-
 blockdev-nbd.c  |  4 ++--
 hw/xen/xen_backend.c|  4 +---
 include/block/aio.h |  2 +-
 include/qemu/main-loop.h| 57 -
 iohandler.c | 21 ++---
 main-loop.c |  3 +--
 migration/exec.c|  6 ++---
 migration/fd.c  |  4 ++--
 migration/rdma.c|  7 +++---
 migration/tcp.c |  6 ++---
 migration/unix.c|  6 ++---
 net/l2tpv3.c| 17 --
 net/netmap.c| 20 
 net/socket.c| 37 +
 net/tap.c   | 28 +++---
 qemu-nbd.c  | 21 +
 stubs/set-fd-handler.c  |  3 +--
 ui/vnc-auth-sasl.c  |  2 +-
 ui/vnc-auth-vencrypt.c  |  2 +-
 ui/vnc-ws.c |  6 ++---
 ui/vnc.c| 27 ++---
 util/event_notifier-posix.c |  3 ++-
 util/qemu-sockets.c |  8 +++
 25 files changed, 120 insertions(+), 204 deletions(-)

-- 
2.4.1




[Qemu-block] [PATCH v3 03/13] l2tpv3: Drop l2tpv3_can_send

2015-05-18 Thread Fam Zheng
This callback is called by main loop before polling s-fd, if it returns
false, the fd will not be polled in this iteration.

This is redundant with checks inside read callback. After this patch,
the data will be copied from s-fd to s-msgvec when it arrives. If the
device can't receive, it will be queued to incoming_queue, and when the
device status changes, this queue will be flushed.

Signed-off-by: Fam Zheng f...@redhat.com
---
 net/l2tpv3.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/net/l2tpv3.c b/net/l2tpv3.c
index 8c598b0..8eed06b 100644
--- a/net/l2tpv3.c
+++ b/net/l2tpv3.c
@@ -133,14 +133,12 @@ typedef struct NetL2TPV3State {
 
 } NetL2TPV3State;
 
-static int l2tpv3_can_send(void *opaque);
 static void net_l2tpv3_send(void *opaque);
 static void l2tpv3_writable(void *opaque);
 
 static void l2tpv3_update_fd_handler(NetL2TPV3State *s)
 {
-qemu_set_fd_handler2(s-fd,
- s-read_poll ? l2tpv3_can_send : NULL,
+qemu_set_fd_handler2(s-fd, NULL,
  s-read_poll ? net_l2tpv3_send : NULL,
  s-write_poll ? l2tpv3_writable : NULL,
  s);
@@ -169,13 +167,6 @@ static void l2tpv3_writable(void *opaque)
 qemu_flush_queued_packets(s-nc);
 }
 
-static int l2tpv3_can_send(void *opaque)
-{
-NetL2TPV3State *s = opaque;
-
-return qemu_can_send_packet(s-nc);
-}
-
 static void l2tpv3_send_completed(NetClientState *nc, ssize_t len)
 {
 NetL2TPV3State *s = DO_UPCAST(NetL2TPV3State, nc, nc);
-- 
2.4.1




[Qemu-block] [PATCH v4 09/13] blockdev: Block device IO during external snapshot transaction

2015-05-18 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 blockdev.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7f763d9..923fc90 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1404,6 +1404,7 @@ typedef struct ExternalSnapshotState {
 BlockDriverState *old_bs;
 BlockDriverState *new_bs;
 AioContext *aio_context;
+Error *blocker;
 } ExternalSnapshotState;
 
 static void external_snapshot_prepare(BlkTransactionState *common,
@@ -1525,6 +1526,9 @@ static void external_snapshot_prepare(BlkTransactionState 
*common,
 if (ret != 0) {
 error_propagate(errp, local_err);
 }
+
+error_setg(state-blocker, snapshot in progress);
+bdrv_op_block(state-old_bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
 }
 
 static void external_snapshot_commit(BlkTransactionState *common)
@@ -1541,8 +1545,6 @@ static void external_snapshot_commit(BlkTransactionState 
*common)
  * don't want to abort all of them if one of them fails the reopen */
 bdrv_reopen(state-new_bs, state-new_bs-open_flags  ~BDRV_O_RDWR,
 NULL);
-
-aio_context_release(state-aio_context);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -1552,6 +1554,17 @@ static void external_snapshot_abort(BlkTransactionState 
*common)
 if (state-new_bs) {
 bdrv_unref(state-new_bs);
 }
+}
+
+static void external_snapshot_clean(BlkTransactionState *common)
+{
+ExternalSnapshotState *state =
+ DO_UPCAST(ExternalSnapshotState, common, common);
+
+if (state-old_bs) {
+bdrv_op_unblock(state-old_bs, BLOCK_OP_TYPE_DEVICE_IO, 
state-blocker);
+error_free(state-blocker);
+}
 if (state-aio_context) {
 aio_context_release(state-aio_context);
 }
@@ -1716,6 +1729,7 @@ static const BdrvActionOps actions[] = {
 .prepare  = external_snapshot_prepare,
 .commit   = external_snapshot_commit,
 .abort = external_snapshot_abort,
+.clean = external_snapshot_clean,
 },
 [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
 .instance_size = sizeof(DriveBackupState),
-- 
2.4.1




[Qemu-block] [PATCH v4 08/13] blockdev: Block device IO during internal snapshot transaction

2015-05-18 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 blockdev.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..7f763d9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1262,6 +1262,7 @@ typedef struct InternalSnapshotState {
 BlockDriverState *bs;
 AioContext *aio_context;
 QEMUSnapshotInfo sn;
+Error *blocker;
 } InternalSnapshotState;
 
 static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1300,6 +1301,10 @@ static void 
internal_snapshot_prepare(BlkTransactionState *common,
 state-aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state-aio_context);
 
+state-bs = bs;
+error_setg(state-blocker, internal snapshot in progress);
+bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
+
 if (!bdrv_is_inserted(bs)) {
 error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
 return;
@@ -1354,9 +1359,6 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
  name, device);
 return;
 }
-
-/* 4. succeed, mark a snapshot is created */
-state-bs = bs;
 }
 
 static void internal_snapshot_abort(BlkTransactionState *common)
@@ -1387,6 +1389,10 @@ static void internal_snapshot_clean(BlkTransactionState 
*common)
 InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
  common, common);
 
+if (state-bs) {
+bdrv_op_unblock(state-bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
+error_free(state-blocker);
+}
 if (state-aio_context) {
 aio_context_release(state-aio_context);
 }
-- 
2.4.1




[Qemu-block] [PATCH v4 10/13] blockdev: Block device IO during drive-backup transaction

2015-05-18 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 blockdev.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 923fc90..ae52d27 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1575,6 +1575,7 @@ typedef struct DriveBackupState {
 BlockDriverState *bs;
 AioContext *aio_context;
 BlockJob *job;
+Error *blocker;
 } DriveBackupState;
 
 static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
@@ -1599,6 +1600,9 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 state-aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state-aio_context);
 
+state-bs = bs;
+error_setg(state-blocker, drive-backup in progress);
+bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
 qmp_drive_backup(backup-device, backup-target,
  backup-has_format, backup-format,
  backup-sync,
@@ -1613,7 +1617,6 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 return;
 }
 
-state-bs = bs;
 state-job = state-bs-job;
 }
 
@@ -1632,6 +1635,10 @@ static void drive_backup_clean(BlkTransactionState 
*common)
 {
 DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
+if (state-bs) {
+bdrv_op_unblock(state-bs, BLOCK_OP_TYPE_DEVICE_IO, state-blocker);
+error_free(state-blocker);
+}
 if (state-aio_context) {
 aio_context_release(state-aio_context);
 }
-- 
2.4.1




[Qemu-block] [PATCH v3 11/13] xen_backend: Remove unused error handling of qemu_set_fd_handler

2015-05-18 Thread Fam Zheng
The function cannot fail, so the check is superfluous.

Signed-off-by: Fam Zheng f...@redhat.com
---
 hw/xen/xen_backend.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index b2cb22b..2510e2e 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -714,9 +714,7 @@ int xen_be_init(void)
 return -1;
 }
 
-if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) 
 0) {
-goto err;
-}
+qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL);
 
 if (xen_xc == XC_HANDLER_INITIAL_VALUE) {
 /* Check if xen_init() have been called */
-- 
2.4.1




[Qemu-block] [PATCH v3 10/13] oss: Remove unused error handling of qemu_set_fd_handler

2015-05-18 Thread Fam Zheng
The function cannot fail, so the check is superfluous.

Signed-off-by: Fam Zheng f...@redhat.com
---
 audio/ossaudio.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/audio/ossaudio.c b/audio/ossaudio.c
index 4db2ca6..b9c6b30 100644
--- a/audio/ossaudio.c
+++ b/audio/ossaudio.c
@@ -138,18 +138,18 @@ static void oss_helper_poll_in (void *opaque)
 audio_run (oss_poll_in);
 }
 
-static int oss_poll_out (HWVoiceOut *hw)
+static void oss_poll_out (HWVoiceOut *hw)
 {
 OSSVoiceOut *oss = (OSSVoiceOut *) hw;
 
-return qemu_set_fd_handler (oss-fd, NULL, oss_helper_poll_out, NULL);
+qemu_set_fd_handler (oss-fd, NULL, oss_helper_poll_out, NULL);
 }
 
-static int oss_poll_in (HWVoiceIn *hw)
+static void oss_poll_in (HWVoiceIn *hw)
 {
 OSSVoiceIn *oss = (OSSVoiceIn *) hw;
 
-return qemu_set_fd_handler (oss-fd, oss_helper_poll_in, NULL, NULL);
+qemu_set_fd_handler (oss-fd, oss_helper_poll_in, NULL, NULL);
 }
 
 static int oss_write (SWVoiceOut *sw, void *buf, int len)
@@ -634,7 +634,8 @@ static int oss_ctl_out (HWVoiceOut *hw, int cmd, ...)
 va_end (ap);
 
 ldebug (enabling voice\n);
-if (poll_mode  oss_poll_out (hw)) {
+if (poll_mode) {
+oss_poll_out (hw);
 poll_mode = 0;
 }
 hw-poll_mode = poll_mode;
@@ -828,7 +829,8 @@ static int oss_ctl_in (HWVoiceIn *hw, int cmd, ...)
 poll_mode = va_arg (ap, int);
 va_end (ap);
 
-if (poll_mode  oss_poll_in (hw)) {
+if (poll_mode) {
+oss_poll_in (hw);
 poll_mode = 0;
 }
 hw-poll_mode = poll_mode;
-- 
2.4.1




[Qemu-block] [PATCH v3 04/13] netmap: Drop netmap_can_send

2015-05-18 Thread Fam Zheng
This callback is called by main loop before polling s-fd, if it returns
false, the fd will not be polled in this iteration.

This is redundant with checks inside read callback. After this patch,
the data will be copied from s-fd to s-iov when it arrives. If the
device can't receive, it will be queued to incoming_queue, and when the
device status changes, this queue will be flushed.

Also remove the qemu_can_send_packet() check in netmap_send. If it's
true, we are good; if it's false, the qemu_sendv_packet_async would
return 0 and read poll will be disabled until netmap_send_completed is
called.

Signed-off-by: Fam Zheng f...@redhat.com
---
 net/netmap.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/net/netmap.c b/net/netmap.c
index 0c1772b..b3efb5b 100644
--- a/net/netmap.c
+++ b/net/netmap.c
@@ -132,23 +132,13 @@ error:
 return -1;
 }
 
-/* Tell the event-loop if the netmap backend can send packets
-   to the frontend. */
-static int netmap_can_send(void *opaque)
-{
-NetmapState *s = opaque;
-
-return qemu_can_send_packet(s-nc);
-}
-
 static void netmap_send(void *opaque);
 static void netmap_writable(void *opaque);
 
 /* Set the event-loop handlers for the netmap backend. */
 static void netmap_update_fd_handler(NetmapState *s)
 {
-qemu_set_fd_handler2(s-me.fd,
- s-read_poll  ? netmap_can_send : NULL,
+qemu_set_fd_handler2(s-me.fd, NULL,
  s-read_poll  ? netmap_send : NULL,
  s-write_poll ? netmap_writable : NULL,
  s);
@@ -317,7 +307,7 @@ static void netmap_send(void *opaque)
 
 /* Keep sending while there are available packets into the netmap
RX ring and the forwarding path towards the peer is open. */
-while (!nm_ring_empty(ring)  qemu_can_send_packet(s-nc)) {
+while (!nm_ring_empty(ring)) {
 uint32_t i;
 uint32_t idx;
 bool morefrag;
-- 
2.4.1




Re: [Qemu-block] [PATCH v3 05/12] virtio-blk: Don't handle output when there is device IO op blocker

2015-05-18 Thread Fam Zheng
On Mon, 05/18 20:19, Max Reitz wrote:
 On 15.05.2015 08:04, Fam Zheng wrote:
 
 Indentation is off if you intended to indent based on the opening
 parenthesis.

Will fix.

 
 Different question: Since event-type == BLOCK_OP_TYPE_DEVICE_IO, when will
 event-blocking ever be not equal to blk_op_is_blocked()?

It's unnecessary.

Fam



[Qemu-block] [PATCH v4 04/13] virtio-blk: Move complete_request to 'ops' structure

2015-05-18 Thread Fam Zheng
Should more ops be added to differentiate code between dataplane and
non-dataplane, the new saved_ops approach will be cleaner than messing
with N pointers.

Signed-off-by: Fam Zheng f...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 hw/block/dataplane/virtio-blk.c | 13 -
 hw/block/virtio-blk.c   |  8 ++--
 include/hw/virtio/virtio-blk.h  |  9 +++--
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3ecc8bd..e287e09 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -51,8 +51,7 @@ struct VirtIOBlockDataPlane {
 
 /* Operation blocker on BDS */
 Error *blocker;
-void (*saved_complete_request)(struct VirtIOBlockReq *req,
-   unsigned char status);
+const VirtIOBlockOps *saved_ops;
 };
 
 /* Raise an interrupt to signal guest, if necessary */
@@ -88,6 +87,10 @@ static void complete_request_vring(VirtIOBlockReq *req, 
unsigned char status)
 qemu_bh_schedule(s-bh);
 }
 
+static const VirtIOBlockOps virtio_blk_data_plane_ops = {
+.complete_request = complete_request_vring,
+};
+
 static void handle_notify(EventNotifier *e)
 {
 VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
@@ -270,8 +273,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 }
 s-host_notifier = *virtio_queue_get_host_notifier(vq);
 
-s-saved_complete_request = vblk-complete_request;
-vblk-complete_request = complete_request_vring;
+s-saved_ops = vblk-ops;
+vblk-ops = virtio_blk_data_plane_ops;
 
 s-starting = false;
 s-started = true;
@@ -314,7 +317,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 return;
 }
 s-stopping = true;
-vblk-complete_request = s-saved_complete_request;
+vblk-ops = s-saved_ops;
 trace_virtio_blk_data_plane_stop(s);
 
 aio_context_acquire(s-ctx);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e6afe97..f4a9d19 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,9 +59,13 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
 virtio_notify(vdev, s-vq);
 }
 
+static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
+.complete_request = virtio_blk_complete_request,
+};
+
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
 {
-req-dev-complete_request(req, status);
+req-dev-ops-complete_request(req, status);
 }
 
 static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
@@ -905,7 +909,7 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 s-sector_mask = (s-conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
 s-vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
-s-complete_request = virtio_blk_complete_request;
+s-ops = virtio_blk_ops;
 virtio_blk_data_plane_create(vdev, conf, s-dataplane, err);
 if (err != NULL) {
 error_propagate(errp, err);
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 6bf5905..28b3436 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -44,6 +44,12 @@ struct VirtIOBlkConf
 struct VirtIOBlockDataPlane;
 
 struct VirtIOBlockReq;
+
+typedef struct {
+/* Function to push to vq and notify guest */
+void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+} VirtIOBlockOps;
+
 typedef struct VirtIOBlock {
 VirtIODevice parent_obj;
 BlockBackend *blk;
@@ -54,8 +60,7 @@ typedef struct VirtIOBlock {
 unsigned short sector_mask;
 bool original_wce;
 VMChangeStateEntry *change;
-/* Function to push to vq and notify guest */
-void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+const VirtIOBlockOps *ops;
 Notifier migration_state_notifier;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
-- 
2.4.1




[Qemu-block] [PATCH v4 05/13] virtio-blk: Don't handle output when there is device IO op blocker

2015-05-18 Thread Fam Zheng
virtio-blk now listens to op blocker change of the associated block
backend.

Up on setting op blocker on BLOCK_OP_TYPE_DEVICE_IO:

  non-dataplane:
   1) Set VirtIOBlock.paused
   2) In virtio_blk_handle_output, do nothing if VirtIOBlock.paused

  dataplane:
   1) Clear the host event notifier
   2) In handle_notify, do nothing if VirtIOBlock.paused

Up on removing the op blocker:

  non-dataplane:
   1) Clear VirtIOBlock.paused
   2) Schedule a BH on the AioContext of the backend, which calls
   virtio_blk_handle_output, so that the previous unhandled kicks can
   make progress

  dataplane:
   1) Set the host event notifier
   2) Notify the host event notifier so that unhandled events are
   processed

Signed-off-by: Fam Zheng f...@redhat.com
---
 hw/block/dataplane/virtio-blk.c | 25 -
 hw/block/virtio-blk.c   | 59 +++--
 include/hw/virtio/virtio-blk.h  |  8 +-
 3 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index e287e09..a5e8e35 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -87,8 +87,28 @@ static void complete_request_vring(VirtIOBlockReq *req, 
unsigned char status)
 qemu_bh_schedule(s-bh);
 }
 
+static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)
+{
+VirtIOBlockDataPlane *s = vblk-dataplane;
+
+event_notifier_test_and_clear(s-host_notifier);
+aio_set_event_notifier(s-ctx, s-host_notifier, NULL);
+}
+
+static void handle_notify(EventNotifier *e);
+static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
+{
+VirtIOBlockDataPlane *s = vblk-dataplane;
+
+aio_set_event_notifier(s-ctx, s-host_notifier, handle_notify);
+
+event_notifier_set(s-host_notifier);
+}
+
 static const VirtIOBlockOps virtio_blk_data_plane_ops = {
-.complete_request = complete_request_vring,
+.complete_request   = complete_request_vring,
+.pause  = virtio_blk_data_plane_pause,
+.resume = virtio_blk_data_plane_resume,
 };
 
 static void handle_notify(EventNotifier *e)
@@ -98,6 +118,9 @@ static void handle_notify(EventNotifier *e)
 VirtIOBlock *vblk = VIRTIO_BLK(s-vdev);
 
 event_notifier_test_and_clear(s-host_notifier);
+if (vblk-paused) {
+return;
+}
 blk_io_plug(s-conf-conf.blk);
 for (;;) {
 MultiReqBuffer mrb = {};
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f4a9d19..b4b19b5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,8 +59,38 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
 virtio_notify(vdev, s-vq);
 }
 
+typedef struct {
+QEMUBH *bh;
+VirtIOBlock *s;
+} VirtIOBlockResumeData;
+
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq);
+static void virtio_blk_resume_bh_cb(void *opaque)
+{
+VirtIOBlockResumeData *data = opaque;
+qemu_bh_delete(data-bh);
+virtio_blk_handle_output(VIRTIO_DEVICE(data-s), data-s-vq);
+}
+
+static void virtio_blk_pause(VirtIOBlock *vblk)
+{
+/* TODO: stop ioeventfd */
+}
+
+static void virtio_blk_resume(VirtIOBlock *vblk)
+{
+VirtIOBlockResumeData *data = g_new(VirtIOBlockResumeData, 1);
+data-bh = aio_bh_new(blk_get_aio_context(vblk-blk),
+virtio_blk_resume_bh_cb, data);
+data-s = vblk;
+data-s-paused = false;
+qemu_bh_schedule(data-bh);
+}
+
 static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
-.complete_request = virtio_blk_complete_request,
+.complete_request   = virtio_blk_complete_request,
+.pause  = virtio_blk_pause,
+.resume = virtio_blk_resume,
 };
 
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
@@ -597,6 +627,9 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
 VirtIOBlockReq *req;
 MultiReqBuffer mrb = {};
 
+if (s-paused) {
+return;
+}
 /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
  * dataplane here instead of waiting for .set_status().
  */
@@ -787,7 +820,7 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
 
 virtio_save(vdev, f);
 }
-
+
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 {
 VirtIOBlock *s = VIRTIO_BLK(vdev);
@@ -875,6 +908,25 @@ static void virtio_blk_migration_state_changed(Notifier 
*notifier, void *data)
 }
 }
 
+static void virtio_blk_op_blocker_changed(Notifier *notifier, void *opaque)
+{
+BlockOpEvent *event = opaque;
+VirtIOBlock *s = container_of(notifier, VirtIOBlock,
+  op_blocker_notifier);
+
+if (event-type != BLOCK_OP_TYPE_DEVICE_IO) {
+return;
+}
+if (event-blocking == s-paused) {
+return;
+}
+if (event-blocking) {
+s-ops-pause(s);
+} else {
+s-ops-resume(s);
+}

[Qemu-block] [PATCH v4 07/13] nbd-server: Clear can_read when device io blocker is set

2015-05-18 Thread Fam Zheng
So that NBD export cannot submit IO during bdrv_drain_all().

Signed-off-by: Fam Zheng f...@redhat.com
---
 nbd.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/nbd.c b/nbd.c
index 06b501b..b5026af 100644
--- a/nbd.c
+++ b/nbd.c
@@ -160,6 +160,8 @@ struct NBDExport {
 uint32_t nbdflags;
 QTAILQ_HEAD(, NBDClient) clients;
 QTAILQ_ENTRY(NBDExport) next;
+Notifier blocker_notifier;
+bool io_blocked;
 
 AioContext *ctx;
 };
@@ -1053,6 +1055,22 @@ static void blk_aio_detach(void *opaque)
 exp-ctx = NULL;
 }
 
+static void nbd_op_blocker_changed(Notifier *notifier, void *data)
+{
+BlockOpEvent *event = data;
+NBDClient *client;
+NBDExport *exp = container_of(notifier, NBDExport, blocker_notifier);
+
+if (event-type != BLOCK_OP_TYPE_DEVICE_IO) {
+return;
+}
+exp-io_blocked = event-blocking;
+
+QTAILQ_FOREACH(client, exp-clients, next) {
+nbd_update_can_read(client);
+}
+}
+
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
   uint32_t nbdflags, void (*close)(NBDExport *),
   Error **errp)
@@ -1081,6 +1099,8 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t 
dev_offset, off_t size,
  * access since the export could be available before migration handover.
  */
 blk_invalidate_cache(blk, NULL);
+exp-blocker_notifier.notify = nbd_op_blocker_changed;
+blk_op_blocker_add_notifier(blk, exp-blocker_notifier);
 return exp;
 
 fail:
@@ -1132,6 +1152,7 @@ void nbd_export_close(NBDExport *exp)
 nbd_export_set_name(exp, NULL);
 nbd_export_put(exp);
 if (exp-blk) {
+notifier_remove(exp-blocker_notifier);
 blk_remove_aio_context_notifier(exp-blk, blk_aio_attached,
 blk_aio_detach, exp);
 blk_unref(exp-blk);
@@ -1455,6 +1476,9 @@ static void nbd_update_can_read(NBDClient *client)
 bool can_read = client-recv_coroutine ||
 client-nb_requests  MAX_NBD_REQUESTS;
 
+if (client-exp  client-exp-io_blocked) {
+can_read = false;
+}
 if (can_read != client-can_read) {
 client-can_read = can_read;
 nbd_set_handlers(client);
-- 
2.4.1




Re: [Qemu-block] [PATCH v3 06/12] nbd-server: Clear can_read when device io blocker is set

2015-05-18 Thread Fam Zheng
On Mon, 05/18 20:35, Max Reitz wrote:
 On 15.05.2015 08:04, Fam Zheng wrote:
 So that NBD export cannot submit IO during bdrv_drain_all().
 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
   nbd.c | 18 ++
   1 file changed, 18 insertions(+)
 
 But can_read is not cleared until nbd_update_can_read() is called, which is
 only in nbd_request_get(), nbd_request_put(), and nbd_co_receive_request(),
 which are all called from nbd_trip() - so the one request which clears
 can_read will most probably be itself processed.
 
 Maybe we should call nbd_update_can_read() in nbd_op_blocker_changed(), for
 all clients of the export? Or just once somewhere at the beginning of
 processing a request, and if io_blocked is set, just yield...

I think you're right. I'll fix that.

Fam