Re: [Qemu-devel] [PATCH v2 4/4] replay: introduce block devices record/replay

2016-02-11 Thread Stefan Hajnoczi
On Thu, Feb 11, 2016 at 04:52:42PM +0300, Pavel Dovgalyuk wrote:
> > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > On Wed, Feb 10, 2016 at 12:13:23PM +0300, Pavel Dovgalyuk wrote:
> > > @@ -784,7 +798,11 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
> > >  return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
> > >  }
> > >
> > > -return bdrv_aio_flush(blk->bs, cb, opaque);
> > > +if (replay_mode == REPLAY_MODE_NONE) {
> > > +return bdrv_aio_flush(blk->bs, cb, opaque);
> > > +} else {
> > > +return replay_aio_flush(blk->bs, cb, opaque);
> > > +}
> > >  }
> > 
> > I am only looking at this patch in isolation and am not familiar with
> > the record/replay work, but these changes appear invasive.  In order for
> > record/replay to keep working in the future, everyone touching block
> > layer code must be familiar with the principles of how record/replay
> > works.  This patch does not include documentation.
> 
> We've already discussed it with Kevin.
> He proposes adding new block driver for record/replay.
> 
> > Can you point me to documentation that explains the how record replay
> > works?
> 
> There is some information about it in docs/replay.txt and in
> http://wiki.qemu.org/Features/record-replay

I was thinking about developer documentation.  A feature like this is
"cross-cutting" - it affects many components.  There should be
documentation that explains the semantics so everyone modifying code can
follow the rules to keep record/replay working.

Live migration is similar in that it touches many components.  The
docs/migration.txt file explains the APIs and general idea.  It
communicates the assumptions, limitations, and requirements that live
migration imposes.  Many QEMU developers understand migration basics and
keep them in mind during code review.  This way we can prevent most (but
not all) migration breakage.

Record/replay should aim for the same level of education/awareness,
otherwise the feature will break and bitrot.

The alternative is to implement it in a way that isn't invasive.  Then
other developers don't need to understand how it works.  Maybe
implementing it in a block driver can achieve this.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 4/4] replay: introduce block devices record/replay

2016-02-11 Thread Stefan Hajnoczi
On Wed, Feb 10, 2016 at 12:13:23PM +0300, Pavel Dovgalyuk wrote:
> @@ -784,7 +798,11 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
>  return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
>  }
>  
> -return bdrv_aio_flush(blk->bs, cb, opaque);
> +if (replay_mode == REPLAY_MODE_NONE) {
> +return bdrv_aio_flush(blk->bs, cb, opaque);
> +} else {
> +return replay_aio_flush(blk->bs, cb, opaque);
> +}
>  }

I am only looking at this patch in isolation and am not familiar with
the record/replay work, but these changes appear invasive.  In order for
record/replay to keep working in the future, everyone touching block
layer code must be familiar with the principles of how record/replay
works.  This patch does not include documentation.

Can you point me to documentation that explains the how record replay
works?

Thanks,
Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 4/4] replay: introduce block devices record/replay

2016-02-11 Thread Pavel Dovgalyuk
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> On Wed, Feb 10, 2016 at 12:13:23PM +0300, Pavel Dovgalyuk wrote:
> > @@ -784,7 +798,11 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
> >  return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
> >  }
> >
> > -return bdrv_aio_flush(blk->bs, cb, opaque);
> > +if (replay_mode == REPLAY_MODE_NONE) {
> > +return bdrv_aio_flush(blk->bs, cb, opaque);
> > +} else {
> > +return replay_aio_flush(blk->bs, cb, opaque);
> > +}
> >  }
> 
> I am only looking at this patch in isolation and am not familiar with
> the record/replay work, but these changes appear invasive.  In order for
> record/replay to keep working in the future, everyone touching block
> layer code must be familiar with the principles of how record/replay
> works.  This patch does not include documentation.

We've already discussed it with Kevin.
He proposes adding new block driver for record/replay.

> Can you point me to documentation that explains the how record replay
> works?

There is some information about it in docs/replay.txt and in
http://wiki.qemu.org/Features/record-replay

Pavel Dovgalyuk




[Qemu-devel] [PATCH v2 4/4] replay: introduce block devices record/replay

2016-02-10 Thread Pavel Dovgalyuk
This patch introduces a set of functions that implement recording
and replaying of block devices' operations. These functions form a thin
layer between blk_aio_ functions and replay subsystem.
All asynchronous block requests are added to the queue to be processed
at deterministically invoked record/replay checkpoints.
Queue is flushed at checkpoints and information about processed requests
is recorded to the log. In replay phase the queue is matched with
events read from the log. Therefore block devices requests are processed
deterministically.

Signed-off-by: Pavel Dovgalyuk 
---
 block/block-backend.c|   54 --
 include/sysemu/replay.h  |   30 
 replay/Makefile.objs |1 
 replay/replay-block.c|  172 ++
 replay/replay-events.c   |   51 ++
 replay/replay-internal.h |   24 ++
 stubs/replay.c   |   50 +
 7 files changed, 372 insertions(+), 10 deletions(-)
 create mode 100755 replay/replay-block.c

diff --git a/block/block-backend.c b/block/block-backend.c
index ebdf78a..1765369 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -17,6 +17,7 @@
 #include "block/throttle-groups.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/replay.h"
 #include "qapi-event.h"
 
 /* Number of coroutines to reserve per attached device model */
@@ -688,7 +689,7 @@ BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
 
 bh = aio_bh_new(blk_get_aio_context(blk), error_callback_bh, acb);
 acb->bh = bh;
-qemu_bh_schedule(bh);
+replay_bh_schedule_event(bh);
 
 return >common;
 }
@@ -702,8 +703,13 @@ BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, 
int64_t sector_num,
 return blk_abort_aio_request(blk, cb, opaque, ret);
 }
 
-return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
- cb, opaque);
+if (replay_mode == REPLAY_MODE_NONE) {
+return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
+ cb, opaque);
+} else {
+return replay_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
+   cb, opaque);
+}
 }
 
 int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
@@ -762,7 +768,11 @@ BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t 
sector_num,
 return blk_abort_aio_request(blk, cb, opaque, ret);
 }
 
-return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
+if (replay_mode == REPLAY_MODE_NONE) {
+return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, 
opaque);
+} else {
+return replay_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, 
opaque);
+}
 }
 
 BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
@@ -774,7 +784,11 @@ BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t 
sector_num,
 return blk_abort_aio_request(blk, cb, opaque, ret);
 }
 
-return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
+if (replay_mode == REPLAY_MODE_NONE) {
+return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, 
opaque);
+} else {
+return replay_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, 
opaque);
+}
 }
 
 BlockAIOCB *blk_aio_flush(BlockBackend *blk,
@@ -784,7 +798,11 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
 return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
 }
 
-return bdrv_aio_flush(blk->bs, cb, opaque);
+if (replay_mode == REPLAY_MODE_NONE) {
+return bdrv_aio_flush(blk->bs, cb, opaque);
+} else {
+return replay_aio_flush(blk->bs, cb, opaque);
+}
 }
 
 BlockAIOCB *blk_aio_discard(BlockBackend *blk,
@@ -796,17 +814,29 @@ BlockAIOCB *blk_aio_discard(BlockBackend *blk,
 return blk_abort_aio_request(blk, cb, opaque, ret);
 }
 
-return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
+if (replay_mode == REPLAY_MODE_NONE) {
+return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
+} else {
+return replay_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
+}
 }
 
 void blk_aio_cancel(BlockAIOCB *acb)
 {
-bdrv_aio_cancel(acb);
+if (replay_mode == REPLAY_MODE_NONE) {
+bdrv_aio_cancel(acb);
+} else {
+replay_aio_cancel(acb);
+}
 }
 
 void blk_aio_cancel_async(BlockAIOCB *acb)
 {
-bdrv_aio_cancel_async(acb);
+if (replay_mode == REPLAY_MODE_NONE) {
+bdrv_aio_cancel_async(acb);
+} else {
+replay_aio_cancel(acb);
+}
 }
 
 int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
@@ -839,7 +869,11 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long 
int req, void *buf,
 return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
 }
 
-return