Re: [Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-08 Thread Denis V. Lunev

On 10/07/2015 12:44 PM, Paolo Bonzini wrote:


> On 07/10/2015 08:20, Denis V. Lunev wrote:

>> +if (migrate_is_test()) {
>> +/* Since no data is transferred during estimation, all
>> +   measurements below will be incorrect.
>> +   There is also no need for delays. */
>> +continue;
>> +}

> By applying delays, you can also test migration using bandwidth
> limitations and try to estimate whether it will converge or not.
> Perhaps if you use writev_buffer to implement the test QEMUFile you do
> not need this anymore.

> Paolo

You see, your proposal would be better suited to an emulation approach.
In that case the 'test' would take much more time, and maybe
that would be better for testing purposes.

Here we are solving a slightly different problem. We are trying to
collect reasonable data to make a prediction of the migration
time and downtime for cluster or cloud management, where
VMs are hosted on different physical hosts and are migrated
from one host to another according to different policies.

In this case we need to make an estimate to check migration
feasibility, and it would be nice if we were able to achieve
this goal in the minimal amount of time and with minimal overhead :)))
OK, this sounds a bit trivial :)
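
To make the use of the estimate concrete, the dirty bytes rate reported
by the test migration could feed a convergence check along the lines of
the sketch below (illustrative only; the function and parameter names
are invented and are not part of this series):

#include <stdbool.h>
#include <stdint.h>

/* Sketch: decide whether a real migration over a link with the given
 * bandwidth would converge, based on the dirty bytes rate measured by
 * the test migration. Names and policy are purely illustrative. */
static bool migration_would_converge(uint64_t dirty_bytes_rate,
                                     uint64_t link_bandwidth)
{
    /* Pre-copy migration converges only if pages can be pushed
     * faster than the guest dirties them (both in bytes/sec). */
    return dirty_bytes_rate < link_bandwidth;
}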

On the other hand, this code could be used for testing; that is
why I have brought it up for this set.

Den



Re: [Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-08 Thread Denis V. Lunev

On 10/07/2015 12:44 PM, Paolo Bonzini wrote:


> On 07/10/2015 08:20, Denis V. Lunev wrote:

>> All calls to this hook will come from ram_save_pending().
>>
>> At the first call of this hook we need to save the initial
>> size of the VM memory and put the migration thread to sleep for
>> a decent period (the downtime, for example). During this period
>> the guest dirties memory.
>>
>> At the second (and last) call we make our estimate of the
>> dirty bytes rate, assuming that the time between the two
>> synchronizations of the dirty bitmap differs negligibly from
>> the downtime.
>>
>> An alternative to this approach is receiving information about
>> the size of the data "transmitted" through the transport.

> This would use before_ram_iterate/after_ram_iterate, right?


>> However, this
>> approach creates large time and memory overheads:
>> 1/ The transmitted guest's memory pages are copied to the QEMUFile's buffer
>>    (~8 sec per 4 GB VM)

> Note that they are not if you implement writev_buffer.


Yep, but we would have to set up an iovec entry for each page.
Please also see my comment below.
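
For illustration, a counting-only writev_buffer for the test QEMUFile
could look roughly like the sketch below (assuming the QEMUFileOps
writev_buffer signature of this period; the function name is invented
and it is not part of the series):

/* Sketch, would live in migration/test.c next to qemu_test_put_buffer():
 * account for the bytes that would have been sent, without copying the
 * page contents anywhere. */
static ssize_t qemu_test_writev_buffer(void *opaque, struct iovec *iov,
                                       int iovcnt, int64_t pos)
{
    ssize_t total = 0;
    int i;

    for (i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }
    transfered_bytes += total;   /* reuse the counter already in test.c */
    return total;
}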


>> 2/ Dirty memory pages are processed one by one (~60 msec per 4 GB VM)

> That however improves the accuracy, doesn't it?

> Paolo

From the point of view of the estimate we only need the amount of
dirtied pages per second as a count, so I do not think
that this will make a difference.
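
Just to make the numbers concrete (purely invented for illustration):
if the guest dirties 512 MB while the migration thread sleeps for a
300 ms "downtime" window, the hook reports a dirty bytes rate of
512 MB / 0.3 s ~= 1.7 GB/s, and it is this single rate, not the
identity of the individual pages, that the prediction needs.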

That said, the approach proposed by David in the letter below
is much better from the point of view of overhead, and the result
presented in the original description as (2), i.e. ~60 msec
per 4 GB VM, was obtained that way. Sorry that this was not
clearly stated in the description.

Den



[Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-07 Thread Denis V. Lunev
From: Igor Redko 

The key feature of the test transport is receiving information
about dirty memory. The qemu_test_sync_hook() allows the
migration infrastructure (code) to be used for this purpose.

All calls to this hook will come from ram_save_pending().

At the first call of this hook we need to save the initial
size of the VM memory and put the migration thread to sleep for
a decent period (the downtime, for example). During this period
the guest dirties memory.

At the second (and last) call we make our estimate of the
dirty bytes rate, assuming that the time between the two
synchronizations of the dirty bitmap differs negligibly from
the downtime.

An alternative to this approach is receiving information about
the size of the data "transmitted" through the transport. However, this
approach creates large time and memory overheads:
1/ The transmitted guest's memory pages are copied to the QEMUFile's buffer
  (~8 sec per 4 GB VM)
2/ Dirty memory pages are processed one by one (~60 msec per 4 GB VM)

Signed-off-by: Igor Redko 
Reviewed-by: Anna Melekhova 
Signed-off-by: Denis V. Lunev 
---
 migration/migration.c |  8 
 migration/test.c  | 36 
 2 files changed, 44 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index d6cb3e2..3182e15 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1058,6 +1058,14 @@ static void *migration_thread(void *opaque)
   MIGRATION_STATUS_FAILED);
 break;
 }
+
+if (migrate_is_test()) {
+/* Since no data is transferred during estimation, all
+   measurements below will be incorrect.
+   There is also no need for delays. */
+continue;
+}
+
 current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 if (current_time >= initial_time + BUFFER_DELAY) {
 uint64_t transferred_bytes = qemu_ftell(s->file) - initial_bytes;
diff --git a/migration/test.c b/migration/test.c
index 8d06988..b4d0761 100644
--- a/migration/test.c
+++ b/migration/test.c
@@ -18,6 +18,7 @@ typedef struct QEMUFileTest {
 
 static uint64_t transfered_bytes;
 static uint64_t initial_bytes;
+static int sync_cnt;
 
 static ssize_t qemu_test_put_buffer(void *opaque, const uint8_t *buf,
 int64_t pos, size_t size)
@@ -31,7 +32,41 @@ static int qemu_test_close(void *opaque)
 return 0;
 }
 
+static int qemu_test_sync_hook(QEMUFile *f, void *opaque,
+uint64_t flags, void *data)
+{
+static uint64_t dirtied_bytes;
+static uint64_t sleeptime_mcs;
+int64_t time_delta;
+uint64_t remaining_bytes = *((uint64_t *) data);
+MigrationState *s = (MigrationState *) opaque;
+switch (sync_cnt++) {
+case 0:
+/* The first call will be from ram_save_begin,
+ * so we need to save the initial size of the VM memory
+ * and sleep for a decent period (the downtime, for example). */
+sleeptime_mcs = migrate_max_downtime()/1000;
+initial_bytes = remaining_bytes;
+usleep(sleeptime_mcs);
+break;
+case 1:
+/* The second and last call.
+ * We assume that the time between two synchronizations of
+ * the dirty bitmap differs negligibly from the downtime and
+ * make our estimate of the dirty bytes rate. */
+dirtied_bytes = remaining_bytes;
+time_delta = sleeptime_mcs / 1000;
+s->dirty_bytes_rate = dirtied_bytes * 1000 / time_delta;
+return -42;
+default:
+/* All calls after second are errors */
+return -1;
+}
+return 0;
+}
+
 static const QEMUFileOps test_write_ops = {
+.hook_ram_sync  = qemu_test_sync_hook,
 .put_buffer = qemu_test_put_buffer,
 .close  = qemu_test_close,
 };
@@ -41,6 +76,7 @@ static void *qemu_fopen_test(MigrationState *s, const char *mode)
 QEMUFileTest *t;
 transfered_bytes = 0;
 initial_bytes = 0;
+sync_cnt = 0;
 if (qemu_file_mode_is_not_valid(mode)) {
 return NULL;
 }
-- 
2.1.4




Re: [Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-07 Thread Paolo Bonzini


On 07/10/2015 08:20, Denis V. Lunev wrote:
> +if (migrate_is_test()) {
> +/* Since no data is transferred during estimation, all
> +   measurements below will be incorrect.
> +   There is also no need for delays. */
> +continue;
> +}

By applying delays, you can also test migration using bandwidth
limitations and try to estimate whether it will converge or not.
Perhaps if you use writev_buffer to implement the test QEMUFile you do
not need this anymore.

Paolo



Re: [Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-07 Thread Paolo Bonzini


On 07/10/2015 08:20, Denis V. Lunev wrote:
> 
> All calls to this hook will come from ram_save_pending().
> 
> At the first call of this hook we need to save the initial
> size of the VM memory and put the migration thread to sleep for
> a decent period (the downtime, for example). During this period
> the guest dirties memory.
> 
> At the second (and last) call we make our estimate of the
> dirty bytes rate, assuming that the time between the two
> synchronizations of the dirty bitmap differs negligibly from
> the downtime.
> 
> An alternative to this approach is receiving information about
> the size of the data "transmitted" through the transport.

This would use before_ram_iterate/after_ram_iterate, right?

> However, this
> approach creates large time and memory overheads:
> 1/ The transmitted guest's memory pages are copied to the QEMUFile's buffer
>   (~8 sec per 4 GB VM)

Note that they are not if you implement writev_buffer.

> 2/ Dirty memory pages are processed one by one (~60 msec per 4 GB VM)

That however improves the accuracy, doesn't it?

Paolo



Re: [Qemu-devel] [PATCH 6/8] migration: implementation of hook_ram_sync

2015-10-07 Thread Dr. David Alan Gilbert
* Denis V. Lunev (d...@openvz.org) wrote:
> From: Igor Redko 
> 
> The key feature of the test transport is receiving information
> about dirty memory. The qemu_test_sync_hook() allows the
> migration infrastructure (code) to be used for this purpose.
> 
> All calls to this hook will come from ram_save_pending().
> 
> At the first call of this hook we need to save the initial
> size of the VM memory and put the migration thread to sleep for
> a decent period (the downtime, for example). During this period
> the guest dirties memory.
> 
> At the second (and last) call we make our estimate of the
> dirty bytes rate, assuming that the time between the two
> synchronizations of the dirty bitmap differs negligibly from
> the downtime.
> 
> An alternative to this approach is receiving information about
> the size of the data "transmitted" through the transport. However, this
> approach creates large time and memory overheads:
> 1/ The transmitted guest's memory pages are copied to the QEMUFile's buffer
>   (~8 sec per 4 GB VM)
> 2/ Dirty memory pages are processed one by one (~60 msec per 4 GB VM)

That's not true, for two reasons:
   1) As long as you register a writev_buffer method on the QEMUFile,
  RAM pages get added by using add_to_iovec rather than actually
  being copied; only the other stuff still goes through the buffer
  (as do the page headers).

   2) If you make it look like the rdma transport and register the 'save_page'
  hook, I think the overhead is even smaller (a rough sketch follows below).
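
As an illustration of the second point, a save_page hook for the test
QEMUFile might do little more than account for the page, roughly along
the lines of the sketch below (assuming the QEMURamSaveFunc signature
used by the rdma transport at the time; the function name is invented
and it is not part of this series):

/* Sketch: pretend the page was "sent" and report its size, so the
 * accounting works without reading or copying the page contents. */
static size_t qemu_test_save_page(QEMUFile *f, void *opaque,
                                  ram_addr_t block_offset,
                                  ram_addr_t offset, size_t size,
                                  uint64_t *bytes_sent)
{
    if (bytes_sent) {
        *bytes_sent = size;
    }
    transfered_bytes += size;   /* reuse the counter already in test.c */
    return size;
}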

Dave

> 
> Signed-off-by: Igor Redko 
> Reviewed-by: Anna Melekhova 
> Signed-off-by: Denis V. Lunev 
> ---
>  migration/migration.c |  8 
>  migration/test.c  | 36 
>  2 files changed, 44 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index d6cb3e2..3182e15 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1058,6 +1058,14 @@ static void *migration_thread(void *opaque)
>MIGRATION_STATUS_FAILED);
>  break;
>  }
> +
> +if (migrate_is_test()) {
> +/* Since no data is transferred during estimation, all
> +   measurements below will be incorrect.
> +   There is also no need for delays. */
> +continue;
> +}
> +
>  current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>  if (current_time >= initial_time + BUFFER_DELAY) {
>  uint64_t transferred_bytes = qemu_ftell(s->file) - initial_bytes;
> diff --git a/migration/test.c b/migration/test.c
> index 8d06988..b4d0761 100644
> --- a/migration/test.c
> +++ b/migration/test.c
> @@ -18,6 +18,7 @@ typedef struct QEMUFileTest {
>  
>  static uint64_t transfered_bytes;
>  static uint64_t initial_bytes;
> +static int sync_cnt;
>  
>  static ssize_t qemu_test_put_buffer(void *opaque, const uint8_t *buf,
>  int64_t pos, size_t size)
> @@ -31,7 +32,41 @@ static int qemu_test_close(void *opaque)
>  return 0;
>  }
>  
> +static int qemu_test_sync_hook(QEMUFile *f, void *opaque,
> +uint64_t flags, void *data)
> +{
> +static uint64_t dirtied_bytes;
> +static uint64_t sleeptime_mcs;
> +int64_t time_delta;
> +uint64_t remaining_bytes = *((uint64_t *) data);
> +MigrationState *s = (MigrationState *) opaque;
> +switch (sync_cnt++) {
> +case 0:
> +/* The first call will be from ram_save_begin,
> + * so we need to save the initial size of the VM memory
> + * and sleep for a decent period (the downtime, for example). */
> +sleeptime_mcs = migrate_max_downtime()/1000;
> +initial_bytes = remaining_bytes;
> +usleep(sleeptime_mcs);
> +break;
> +case 1:
> +/* The second and last call.
> + * We assume that the time between two synchronizations of
> + * the dirty bitmap differs negligibly from the downtime and
> + * make our estimate of the dirty bytes rate. */
> +dirtied_bytes = remaining_bytes;
> +time_delta = sleeptime_mcs / 1000;
> +s->dirty_bytes_rate = dirtied_bytes * 1000 / time_delta;
> +return -42;
> +default:
> +/* All calls after second are errors */
> +return -1;
> +}
> +return 0;
> +}
> +
>  static const QEMUFileOps test_write_ops = {
> +.hook_ram_sync  = qemu_test_sync_hook,
>  .put_buffer = qemu_test_put_buffer,
>  .close  = qemu_test_close,
>  };
> @@ -41,6 +76,7 @@ static void *qemu_fopen_test(MigrationState *s, const char *mode)
>  QEMUFileTest *t;
>  transfered_bytes = 0;
>  initial_bytes = 0;
> +sync_cnt = 0;
>  if (qemu_file_mode_is_not_valid(mode)) {
>  return NULL;
>  }
> -- 
> 2.1.4
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK