Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-21 Thread Derek Su
Hi, Chen and Lei

Using Lei's patch is OK with me.
Please add "Signed-off-by: Derek Su " when merging
it.
Thank you. :)

Regards
Derek

Zhang, Chen wrote on Tuesday, September 22, 2020 at 1:37 PM:

> So, Derek, will you send a new version of the patch?
>
>
>
> Thanks
>
> Zhang Chen
>
>
>
> *From:* Derek Su 
> *Sent:* Tuesday, September 22, 2020 1:18 PM
> *To:* Rao, Lei 
> *Cc:* Zhang, Chen ; qemu-devel <
> qemu-devel@nongnu.org>; zhang.zhanghaili...@huawei.com;
> quint...@redhat.com; dgilb...@redhat.com
> *Subject:* Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
>
>
> Hi, Lei
>
>
>
> Got it. Thanks.
>
>
>
> Regards,
>
> Derek
>
>
>
> Rao, Lei wrote on Tuesday, September 22, 2020 at 1:04 PM:
>
> Hi, Derek and Chen
>
> ram_bulk_stage is false by default before Hailiang's patch.
> For COLO, it does not seem to be used, so I think there is no need to
> reset it to true.
>
> Thanks,
> Lei.
>
> From: Derek Su 
> Sent: Tuesday, September 22, 2020 11:48 AM
> To: Zhang, Chen 
> Cc: qemu-devel ; Rao, Lei ;
> zhang.zhanghaili...@huawei.com; quint...@redhat.com; dgilb...@redhat.com
> Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
> Hi, Chen
>
> Sure.
>
> BTW, I just went through Lei's patch.
> ram_bulk_stage might need to be reset to true after stopping the COLO
> service, as in my patch.
> What is your opinion?
>
> Thanks.
>
>
> Best regards,
> Derek
>
>
> Zhang, Chen <chen.zh...@intel.com> wrote on Tuesday, September 22, 2020 at 11:41 AM:
> Hi Derek and Lei,
>
> It looks the same as Lei's patch:
> [PATCH 2/3] Reduce the time of checkpoint for COLO
> Could you two discuss merging them into one patch?
>
> Thanks
> Zhang Chen
>
> From: Derek Su <dere...@qnap.com>
> Sent: Tuesday, September 22, 2020 11:31 AM
> To: qemu-devel <qemu-devel@nongnu.org>
> Cc: zhang.zhanghaili...@huawei.com; quint...@redhat.com;
> dgilb...@redhat.com; Zhang, Chen <chen.zh...@intel.com>
> Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
> Hello, all
>
> Ping...
>
> Regards,
> Derek Su
>
> Derek Su <dere...@qnap.com> wrote on Thursday, September 10, 2020 at 6:47 PM:
> On the secondary side, colo_flush_ram_cache() calls
> migration_bitmap_find_dirty() to find the dirty pages and
> flushes them to the host. But ram_state's ram_bulk_stage flag is
> always enabled on the secondary side, which leads to copying all
> RAM pages instead of only the dirty ones.
>
> Here, ram_bulk_stage on the secondary side is disabled during the
> preparation of the COLO incoming process to avoid flushing all
> RAM pages.
>
> Signed-off-by: Derek Su <dere...@qnap.com>
> ---
>  migration/colo.c |  6 +-
>  migration/ram.c  | 10 ++
>  migration/ram.h  |  3 +++
>  3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index ea7d1e9d4e..6e644db306 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
>  return NULL;
>  }
>
> +colo_disable_ram_bulk_stage();
> +
>  failover_init_state();
>
>  mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
> @@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
>  goto out;
>  }
>  #else
> -abort();
> +abort();
>  #endif
>  vm_start();
>  trace_colo_vm_state_change("stop", "run");
> @@ -924,6 +926,8 @@ out:
>  qemu_fclose(fb);
>  }
>
> +colo_enable_ram_bulk_stage();
> +
>  /* Hope this not to be too long to loop here */
>  qemu_sem_wait(&mis->colo_incoming_sem);
>  qemu_sem_destroy(&mis->colo_incoming_sem);
> diff --git a/migration/ram.c b/migration/ram.c
> index 76d4fee5d5..65e9b12058 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
>  return ps >= POSTCOPY_INCOMING_LISTENING && ps <
> POSTCOPY_INCOMING_END;
>  }
>
> +void colo_enable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = true;
> +}
> +
> +void colo_disable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = false;
> +}
> +
>  /*
>   * Flush content of RAM cache into SVM's memory.
>   * Only flush the pages that be dirtied by PVM or SVM or both.
> diff --git a/migration/ram.h b/migration/ram.h
> index 2eeaacfa13..c1c0ebbe0f 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
>  void colo_release_ram_cache(void);
>  void colo_incoming_start_dirty_log(void);
>
> +void colo_enable_ram_bulk_stage(void);
> +void colo_disable_ram_bulk_stage(void);
> +
>  #endif
> --
> 2.25.1
>
>


Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-21 Thread Derek Su
Hi, Lei

Got it. Thanks.

Regards,
Derek

Rao, Lei wrote on Tuesday, September 22, 2020 at 1:04 PM:

> Hi, Derek and Chen
>
> ram_bulk_stage is false by default before Hailiang's patch.
> For COLO, it does not seem to be used, so I think there is no need to
> reset it to true.
>
> Thanks,
> Lei.
>
> From: Derek Su 
> Sent: Tuesday, September 22, 2020 11:48 AM
> To: Zhang, Chen 
> Cc: qemu-devel ; Rao, Lei ;
> zhang.zhanghaili...@huawei.com; quint...@redhat.com; dgilb...@redhat.com
> Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
> Hi, Chen
>
> Sure.
>
> BTW, I just went through Lei's patch.
> ram_bulk_stage might need to be reset to true after stopping the COLO
> service, as in my patch.
> What is your opinion?
>
> Thanks.
>
>
> Best regards,
> Derek
>
>
> Zhang, Chen <chen.zh...@intel.com> wrote on Tuesday, September 22, 2020 at 11:41 AM:
> Hi Derek and Lei,
>
> It looks the same as Lei's patch:
> [PATCH 2/3] Reduce the time of checkpoint for COLO
> Could you two discuss merging them into one patch?
>
> Thanks
> Zhang Chen
>
> From: Derek Su <dere...@qnap.com>
> Sent: Tuesday, September 22, 2020 11:31 AM
> To: qemu-devel <qemu-devel@nongnu.org>
> Cc: zhang.zhanghaili...@huawei.com; quint...@redhat.com;
> dgilb...@redhat.com; Zhang, Chen <chen.zh...@intel.com>
> Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
> Hello, all
>
> Ping...
>
> Regards,
> Derek Su
>
> Derek Su <dere...@qnap.com> wrote on Thursday, September 10, 2020 at 6:47 PM:
> On the secondary side, colo_flush_ram_cache() calls
> migration_bitmap_find_dirty() to find the dirty pages and
> flushes them to the host. But ram_state's ram_bulk_stage flag is
> always enabled on the secondary side, which leads to copying all
> RAM pages instead of only the dirty ones.
>
> Here, ram_bulk_stage on the secondary side is disabled during the
> preparation of the COLO incoming process to avoid flushing all
> RAM pages.
>
> Signed-off-by: Derek Su <dere...@qnap.com>
> ---
>  migration/colo.c |  6 +-
>  migration/ram.c  | 10 ++
>  migration/ram.h  |  3 +++
>  3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index ea7d1e9d4e..6e644db306 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
>  return NULL;
>  }
>
> +colo_disable_ram_bulk_stage();
> +
>  failover_init_state();
>
>  mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
> @@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
>  goto out;
>  }
>  #else
> -abort();
> +abort();
>  #endif
>  vm_start();
>  trace_colo_vm_state_change("stop", "run");
> @@ -924,6 +926,8 @@ out:
>  qemu_fclose(fb);
>  }
>
> +colo_enable_ram_bulk_stage();
> +
>  /* Hope this not to be too long to loop here */
>  qemu_sem_wait(&mis->colo_incoming_sem);
>  qemu_sem_destroy(&mis->colo_incoming_sem);
> diff --git a/migration/ram.c b/migration/ram.c
> index 76d4fee5d5..65e9b12058 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
>  return ps >= POSTCOPY_INCOMING_LISTENING && ps <
> POSTCOPY_INCOMING_END;
>  }
>
> +void colo_enable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = true;
> +}
> +
> +void colo_disable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = false;
> +}
> +
>  /*
>   * Flush content of RAM cache into SVM's memory.
>   * Only flush the pages that be dirtied by PVM or SVM or both.
> diff --git a/migration/ram.h b/migration/ram.h
> index 2eeaacfa13..c1c0ebbe0f 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
>  void colo_release_ram_cache(void);
>  void colo_incoming_start_dirty_log(void);
>
> +void colo_enable_ram_bulk_stage(void);
> +void colo_disable_ram_bulk_stage(void);
> +
>  #endif
> --
> 2.25.1
>


Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-21 Thread Derek Su
Hi, Chen

Sure.

BTW, I just went through Lei's patch.
ram_bulk_stage might need to be reset to true after stopping the COLO service,
as in my patch.
What is your opinion?

Thanks.

Best regards,
Derek


Zhang, Chen wrote on Tuesday, September 22, 2020 at 11:41 AM:

> Hi Derek and Lei,
>
>
>
> It looks the same as Lei's patch:
>
> [PATCH 2/3] Reduce the time of checkpoint for COLO
>
> Could you two discuss merging them into one patch?
>
>
>
> Thanks
>
> Zhang Chen
>
>
>
> *From:* Derek Su 
> *Sent:* Tuesday, September 22, 2020 11:31 AM
> *To:* qemu-devel 
> *Cc:* zhang.zhanghaili...@huawei.com; quint...@redhat.com;
> dgilb...@redhat.com; Zhang, Chen 
> *Subject:* Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo
> cache
>
>
>
> Hello, all
>
>
>
> Ping...
>
>
>
> Regards,
>
> Derek Su
>
>
>
Derek Su wrote on Thursday, September 10, 2020 at 6:47 PM:
>
> On the secondary side, colo_flush_ram_cache() calls
> migration_bitmap_find_dirty() to find the dirty pages and
> flushes them to the host. But ram_state's ram_bulk_stage flag is
> always enabled on the secondary side, which leads to copying all
> RAM pages instead of only the dirty ones.
>
> Here, ram_bulk_stage on the secondary side is disabled during the
> preparation of the COLO incoming process to avoid flushing all
> RAM pages.
>
> Signed-off-by: Derek Su 
> ---
>  migration/colo.c |  6 +-
>  migration/ram.c  | 10 ++
>  migration/ram.h  |  3 +++
>  3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index ea7d1e9d4e..6e644db306 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
>  return NULL;
>  }
>
> +colo_disable_ram_bulk_stage();
> +
>  failover_init_state();
>
>  mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
> @@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
>  goto out;
>  }
>  #else
> -abort();
> +abort();
>  #endif
>  vm_start();
>  trace_colo_vm_state_change("stop", "run");
> @@ -924,6 +926,8 @@ out:
>  qemu_fclose(fb);
>  }
>
> +colo_enable_ram_bulk_stage();
> +
>  /* Hope this not to be too long to loop here */
>  qemu_sem_wait(&mis->colo_incoming_sem);
>  qemu_sem_destroy(&mis->colo_incoming_sem);
> diff --git a/migration/ram.c b/migration/ram.c
> index 76d4fee5d5..65e9b12058 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
>  return ps >= POSTCOPY_INCOMING_LISTENING && ps <
> POSTCOPY_INCOMING_END;
>  }
>
> +void colo_enable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = true;
> +}
> +
> +void colo_disable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = false;
> +}
> +
>  /*
>   * Flush content of RAM cache into SVM's memory.
>   * Only flush the pages that be dirtied by PVM or SVM or both.
> diff --git a/migration/ram.h b/migration/ram.h
> index 2eeaacfa13..c1c0ebbe0f 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
>  void colo_release_ram_cache(void);
>  void colo_incoming_start_dirty_log(void);
>
> +void colo_enable_ram_bulk_stage(void);
> +void colo_disable_ram_bulk_stage(void);
> +
>  #endif
> --
> 2.25.1
>
>


Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-21 Thread Derek Su
Hello, all

Ping...

Regards,
Derek Su

Derek Su wrote on Thursday, September 10, 2020 at 6:47 PM:

> On the secondary side, colo_flush_ram_cache() calls
> migration_bitmap_find_dirty() to find the dirty pages and
> flushes them to the host. But ram_state's ram_bulk_stage flag is
> always enabled on the secondary side, which leads to copying all
> RAM pages instead of only the dirty ones.
>
> Here, ram_bulk_stage on the secondary side is disabled during the
> preparation of the COLO incoming process to avoid flushing all
> RAM pages.
>
> Signed-off-by: Derek Su 
> ---
>  migration/colo.c |  6 +-
>  migration/ram.c  | 10 ++
>  migration/ram.h  |  3 +++
>  3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index ea7d1e9d4e..6e644db306 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
>  return NULL;
>  }
>
> +colo_disable_ram_bulk_stage();
> +
>  failover_init_state();
>
>  mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
> @@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
>  goto out;
>  }
>  #else
> -abort();
> +abort();
>  #endif
>  vm_start();
>  trace_colo_vm_state_change("stop", "run");
> @@ -924,6 +926,8 @@ out:
>  qemu_fclose(fb);
>  }
>
> +colo_enable_ram_bulk_stage();
> +
>  /* Hope this not to be too long to loop here */
>  qemu_sem_wait(&mis->colo_incoming_sem);
>  qemu_sem_destroy(&mis->colo_incoming_sem);
> diff --git a/migration/ram.c b/migration/ram.c
> index 76d4fee5d5..65e9b12058 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
>  return ps >= POSTCOPY_INCOMING_LISTENING && ps <
> POSTCOPY_INCOMING_END;
>  }
>
> +void colo_enable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = true;
> +}
> +
> +void colo_disable_ram_bulk_stage(void)
> +{
> +ram_state->ram_bulk_stage = false;
> +}
> +
>  /*
>   * Flush content of RAM cache into SVM's memory.
>   * Only flush the pages that be dirtied by PVM or SVM or both.
> diff --git a/migration/ram.h b/migration/ram.h
> index 2eeaacfa13..c1c0ebbe0f 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
>  void colo_release_ram_cache(void);
>  void colo_incoming_start_dirty_log(void);
>
> +void colo_enable_ram_bulk_stage(void);
> +void colo_disable_ram_bulk_stage(void);
> +
>  #endif
> --
> 2.25.1
>
>


Re: [PATCH v1 2/2] colo-compare: Record packet creation time by QEMU_CLOCK_REALTIME

2020-09-14 Thread Derek Su
Hello, Chen

Zhang, Chen wrote on Tuesday, September 15, 2020 at 8:09 AM:

> Hi Derek,
>
>
>
> Looks like qemu vl.c, migration/migration.c, etc. use QEMU_CLOCK_HOST too.
> It will also be affected by host NTP. Do you mean we should change all the
> QEMU_CLOCK_HOST to QEMU_CLOCK_REALTIME?
>

No, I think it depends on the purpose. For example, the setup_time/end_time
in migration/migration.c are for statistical information.

The issue I mentioned probably occurs with low probability, so it could be
treated as a known issue in practical applications.
QEMU_CLOCK_HOST seems sufficient, as you said.


> And using QEMU_CLOCK_VIRTUAL is not correct, because we need to see
> time changes while the VM is stopped.
>

Thanks.

Regards,
Derek



> Thanks
>
> Zhang Chen
>
>
>
> *From:* Derek Su 
> *Sent:* Tuesday, September 15, 2020 12:24 AM
> *To:* Zhang, Chen 
> *Cc:* jasow...@redhat.com; lizhij...@cn.fujitsu.com; qemu-devel@nongnu.org;
> C.H. Yang ; CT Cheng 
> *Subject:* Re: [PATCH v1 2/2] colo-compare: Record packet creation time
> by QEMU_CLOCK_REALTIME
>
>
>
>
> Hello, Chen
>
>
>
> My humble opinion is that the `creation_ms` and `now` fix in my patch
> should be OK in colo-compare, and it performs well in my tests when
> QEMU_CLOCK_REALTIME is used. Although the VM state changes in the COLO
> scenario, the recording and comparison of `creation_ms` and `now` are not
> affected by those VM state changes.
>
>
>
> For the functions in net/colo.c and net/colo-compare.c that use timer_mod(),
> using QEMU_CLOCK_HOST is dangerous if users change the host clock: the
> timer might not fire on time as expected. The original timer_mod() using
> QEMU_CLOCK_VIRTUAL seems OK currently.
>
> Thanks.
>
>
>
> Regards,
>
> Derek
>
>
>
>
>
Zhang, Chen wrote on Monday, September 14, 2020 at 3:42 PM:
>
>
>
>
>
> *From:* Derek Su 
> *Sent:* Monday, September 14, 2020 9:00 AM
> *To:* Zhang, Chen 
> *Cc:* jasow...@redhat.com; lizhij...@cn.fujitsu.com; qemu-devel@nongnu.org
> *Subject:* Re: [PATCH v1 2/2] colo-compare: Record packet creation time
> by QEMU_CLOCK_REALTIME
>
>
>
>
>
>
>
Zhang, Chen wrote on Monday, September 14, 2020 at 4:06 AM:
>
>
>
>
>
> > -Original Message-
>
> > From: Zhang, Chen
>
> > Sent: Monday, September 14, 2020 4:02 AM
>
> > To: 'Derek Su' ; qemu-devel@nongnu.org
>
> > Cc: lizhij...@cn.fujitsu.com; jasow...@redhat.com
>
> > Subject: RE: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > QEMU_CLOCK_REALTIME
>
> >
>
> >
>
> >
>
> > > -Original Message-
>
> > > From: Derek Su 
>
> > > Sent: Saturday, September 12, 2020 3:05 AM
>
> > > To: qemu-devel@nongnu.org
>
> > > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com;
>
> > > jasow...@redhat.com; Derek Su 
>
> > > Subject: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > > QEMU_CLOCK_REALTIME
>
> > >
>
> > > Record packet creation time by QEMU_CLOCK_REALTIME instead of
>
> > > QEMU_CLOCK_HOST. The time difference between `now` and packet
>
> > > `creation_ms` has the possibility of an unexpected negative value and
>
> > > results in wrong comparison after changing the host clock.
>
> > >
>
> >
>
> > Hi Derek,
>
> >
>
> > I think this issue is caused by other code in this file that uses
>
> > qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>
> > I will change all the code to QEMU_CLOCK_HOST to fix it along with patch 1/2.
>
>
>
> If you feel OK with that, you can send the new version.  :-)
>
>
>
> Hello, Chen
>
>
>
> Is a monotonically increasing clock better than the host clock in the COLO
> compare framework?
>
> The host clock is sometimes changed manually by users or automatically by
> NTP synchronization, and those cases can cause the relative times to become
> disordered.
>
> For example, in our tests the `creation_time` of one packet ended up ahead
> of `now`.
>
> I am inclined to replace QEMU_CLOCK_HOST and QEMU_CLOCK_VIRTUAL with the
> monotonically increasing clock QEMU_CLOCK_REALTIME in the COLO compare
> framework.
>
> What are your thoughts?
>
>
>
> Hi Derek,
>
>
>
> /**
>
> * QEMUClockType:
>
> *
>
> * The following clock types are available:
>
> *
>
> * @QEMU_CLOCK_REALTIME: Real time clock
>
> *
>
> * The real time clock should be used only for stuff which does not
>
> * change the virtual machine state, as it runs even if the virtual
>
> * machine is stopped.
>
> *
>
> * @QEMU_CLOCK_VIRTUAL: virtual clock
>
> *
>
> * The

Re: [PATCH v1 2/2] colo-compare: Record packet creation time by QEMU_CLOCK_REALTIME

2020-09-14 Thread Derek Su
Hello, Chen

My humble opinion is that the `creation_ms` and `now` fix in my patch
should be OK in colo-compare, and it performs well in my tests when
QEMU_CLOCK_REALTIME is used. Although the VM state changes in the COLO
scenario, the recording and comparison of `creation_ms` and `now` are not
affected by those VM state changes.

For the functions in net/colo.c and net/colo-compare.c that use timer_mod(),
using QEMU_CLOCK_HOST is dangerous if users change the host clock: the
timer might not fire on time as expected. The original timer_mod() using
QEMU_CLOCK_VIRTUAL seems OK currently.
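To make the timer concern concrete, here is a simplified sketch of how the
periodic check timer is armed (paraphrased from net/colo-compare.c; the exact
code in the tree differs a bit):

```c
static void colo_compare_timer_init(CompareState *s)
{
    /*
     * The deadline is "current clock value + interval".  QEMU_CLOCK_VIRTUAL
     * (and QEMU_CLOCK_REALTIME) only move forward, so the deadline is always
     * reached.  If this were based on QEMU_CLOCK_HOST and the host clock
     * were stepped backwards (by an admin or NTP), the deadline would recede
     * and the callback would fire late.
     */
    s->packet_check_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                         check_old_packet_regular, s);
    timer_mod(s->packet_check_timer,
              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + REGULAR_PACKET_CHECK_MS);
}
```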
Thanks.

Regards,
Derek


Zhang, Chen wrote on Monday, September 14, 2020 at 3:42 PM:

>
>
>
>
> *From:* Derek Su 
> *Sent:* Monday, September 14, 2020 9:00 AM
> *To:* Zhang, Chen 
> *Cc:* jasow...@redhat.com; lizhij...@cn.fujitsu.com; qemu-devel@nongnu.org
> *Subject:* Re: [PATCH v1 2/2] colo-compare: Record packet creation time
> by QEMU_CLOCK_REALTIME
>
>
>
>
>
>
>
Zhang, Chen wrote on Monday, September 14, 2020 at 4:06 AM:
>
>
>
>
>
> > -Original Message-
>
> > From: Zhang, Chen
>
> > Sent: Monday, September 14, 2020 4:02 AM
>
> > To: 'Derek Su' ; qemu-devel@nongnu.org
>
> > Cc: lizhij...@cn.fujitsu.com; jasow...@redhat.com
>
> > Subject: RE: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > QEMU_CLOCK_REALTIME
>
> >
>
> >
>
> >
>
> > > -Original Message-
>
> > > From: Derek Su 
>
> > > Sent: Saturday, September 12, 2020 3:05 AM
>
> > > To: qemu-devel@nongnu.org
>
> > > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com;
>
> > > jasow...@redhat.com; Derek Su 
>
> > > Subject: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > > QEMU_CLOCK_REALTIME
>
> > >
>
> > > Record packet creation time by QEMU_CLOCK_REALTIME instead of
>
> > > QEMU_CLOCK_HOST. The time difference between `now` and packet
>
> > > `creation_ms` has the possibility of an unexpected negative value and
>
> > > results in wrong comparison after changing the host clock.
>
> > >
>
> >
>
> > Hi Derek,
>
> >
>
> > I think this issue is caused by other code in this file that uses
>
> > qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>
> > I will change all the code to QEMU_CLOCK_HOST to fix it along with patch 1/2.
>
>
>
> If you feel OK with that, you can send the new version.  :-)
>
>
>
> Hello, Chen
>
>
>
> Is a monotonically increasing clock better than the host clock in the COLO
> compare framework?
>
> The host clock is sometimes changed manually by users or automatically by
> NTP synchronization, and those cases can cause the relative times to become
> disordered.
>
> For example, in our tests the `creation_time` of one packet ended up ahead
> of `now`.
>
> I am inclined to replace QEMU_CLOCK_HOST and QEMU_CLOCK_VIRTUAL with the
> monotonically increasing clock QEMU_CLOCK_REALTIME in the COLO compare
> framework.
>
> What are your thoughts?
>
>
>
> Hi Derek,
>
>
>
> /**
>
> * QEMUClockType:
>
> *
>
> * The following clock types are available:
>
> *
>
> * @QEMU_CLOCK_REALTIME: Real time clock
>
> *
>
> * The real time clock should be used only for stuff which does not
>
> * change the virtual machine state, as it runs even if the virtual
>
> * machine is stopped.
>
> *
>
> * @QEMU_CLOCK_VIRTUAL: virtual clock
>
> *
>
> * The virtual clock only runs during the emulation. It stops
>
> * when the virtual machine is stopped.
>
> *
>
> * @QEMU_CLOCK_HOST: host clock
>
> *
>
> * The host clock should be used for device models that emulate accurate
>
> * real time sources. It will continue to run when the virtual machine
>
> * is suspended, and it will reflect system time changes the host may
>
> * undergo (e.g. due to NTP).
>
>
>
>
>
> When COLO is running, it changes the virtual machine state.
>
> So I think using QEMU_CLOCK_HOST is better.
>
>
>
> Thanks
>
> Zhang Chen
>
>
>
> If OK, I will send the new version again, thanks. :)
>
> Thank you.
>
>
>
> Regards,
>
> Derek
>
>
>
>
>
>
> >
>
> > Thanks
>
> > Zhang Chen
>
> >
>
> > > Signed-off-by: Derek Su 
>
> > > ---
>
> > >  net/colo-compare.c | 2 +-
>
> > >  net/colo.c | 2 +-
>
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
>
> > >
>
> > > diff --git a/net/colo-compare.c b/net/colo-compare.c index
>
> > > c4de86ef34..29d7f9

Re: [PATCH v1 2/2] colo-compare: Record packet creation time by QEMU_CLOCK_REALTIME

2020-09-13 Thread Derek Su
Zhang, Chen wrote on Monday, September 14, 2020 at 4:06 AM:

>
>
>
>
> > -Original Message-
>
> > From: Zhang, Chen
>
> > Sent: Monday, September 14, 2020 4:02 AM
>
> > To: 'Derek Su' ; qemu-devel@nongnu.org
>
> > Cc: lizhij...@cn.fujitsu.com; jasow...@redhat.com
>
> > Subject: RE: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > QEMU_CLOCK_REALTIME
>
> >
>
> >
>
> >
>
> > > -Original Message-
>
> > > From: Derek Su 
>
> > > Sent: Saturday, September 12, 2020 3:05 AM
>
> > > To: qemu-devel@nongnu.org
>
> > > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com;
>
> > > jasow...@redhat.com; Derek Su 
>
> > > Subject: [PATCH v1 2/2] colo-compare: Record packet creation time by
>
> > > QEMU_CLOCK_REALTIME
>
> > >
>
> > > Record packet creation time by QEMU_CLOCK_REALTIME instead of
>
> > > QEMU_CLOCK_HOST. The time difference between `now` and packet
>
> > > `creation_ms` has the possibility of an unexpected negative value and
>
> > > results in wrong comparison after changing the host clock.
>
> > >
>
> >
>
> > Hi Derek,
>
> >
>
> > I think this issue is caused by other code in this file that uses
>
> > qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
>
> > I will change all the code to QEMU_CLOCK_HOST to fix it along with patch 1/2.
>
>
>
> If you feel OK with that, you can send the new version.  :-)
>
>
Hello, Chen

Is a monotonically increasing clock better than the host clock in the COLO
compare framework?
The host clock is sometimes changed manually by users or automatically by
NTP synchronization, and those cases can cause the relative times to become
disordered.
For example, in our tests the `creation_time` of one packet ended up ahead
of `now`.
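As a small made-up example of the disorder, using the same shape of check as
colo_old_packet_check_one():

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int64_t creation_ms = 1600000000000LL;           /* stamped at host time t0      */
    int64_t now         = 1600000000000LL - 30000;   /* host clock stepped back 30 s */
    uint32_t check_time = 3000;                      /* stale-packet threshold in ms */

    /* A negative age never exceeds check_time, so the old packet is
     * never flagged and never released. */
    if ((now - creation_ms) > check_time) {
        printf("packet flagged as old\n");
    } else {
        printf("packet kept, age = %lld ms\n", (long long)(now - creation_ms));
    }
    return 0;
}
```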

I am inclined to replace QEMU_CLOCK_HOST and QEMU_CLOCK_VIRTUAL with the
monotonically increasing clock QEMU_CLOCK_REALTIME in the COLO compare
framework.
What are your thoughts?

If OK, I will send the new version again, thanks. :)
Thank you.

Regards,
Derek


>
> >
>
> > Thanks
>
> > Zhang Chen
>
> >
>
> > > Signed-off-by: Derek Su 
>
> > > ---
>
> > >  net/colo-compare.c | 2 +-
>
> > >  net/colo.c | 2 +-
>
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
>
> > >
>
> > > diff --git a/net/colo-compare.c b/net/colo-compare.c index
>
> > > c4de86ef34..29d7f986e3 100644
>
> > > --- a/net/colo-compare.c
>
> > > +++ b/net/colo-compare.c
>
> > > @@ -621,7 +621,7 @@ static int colo_packet_compare_other(Packet *spkt,
>
> > > Packet *ppkt)
>
> > >
>
> > >  static int colo_old_packet_check_one(Packet *pkt, void *user_data)  {
>
> > > -int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>
> > > +int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>
> > >  uint32_t check_time = *(uint32_t *)user_data;
>
> > >
>
> > >  if ((now - pkt->creation_ms) > check_time) { diff --git
>
> > > a/net/colo.c b/net/colo.c index a6c66d829a..0441910169 100644
>
> > > --- a/net/colo.c
>
> > > +++ b/net/colo.c
>
> > > @@ -164,7 +164,7 @@ Packet *packet_new(const void *data, int size, int
>
> > > vnet_hdr_len)
>
> > >
>
> > >  pkt->data = g_memdup(data, size);
>
> > >  pkt->size = size;
>
> > > -pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>
> > > +pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>
> > >  pkt->vnet_hdr_len = vnet_hdr_len;
>
> > >  pkt->tcp_seq = 0;
>
> > >  pkt->tcp_ack = 0;
>
> > > --
>
> > > 2.25.1
>
>
>
>


Re: [PATCH v1 1/2] colo-compare: Fix incorrect data type conversion

2020-09-13 Thread Derek Su
Hi, Chen

Got it, thank you :)

Regards,
Derek

Zhang, Chen wrote on Monday, September 14, 2020 at 4:02 AM:

>
>
>
>
> > -Original Message-
>
> > From: Derek Su 
>
> > Sent: Saturday, September 12, 2020 3:05 AM
>
> > To: qemu-devel@nongnu.org
>
> > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com;
>
> > jasow...@redhat.com; Derek Su 
>
> > Subject: [PATCH v1 1/2] colo-compare: Fix incorrect data type conversion
>
> >
>
> > Fix data type conversion of compare_timeout. The incorrect conversion
>
> > results in a random compare_timeout value and unexpected stalls in packet
>
> > comparison.
>
> >
>
>
>
> This bug was already found in our internal testing too; we were just waiting
> to send a fix.
>
> Good catch! But this patch does not fix the root cause.
>
> Changing compare_timeout from uint32_t to uint64_t is better.
>
> I will send a patch for this and tag it as reported by you.
>
>
>
> Thanks
>
> Zhang Chen
>
>
>
>
>
> > Signed-off-by: Derek Su 
>
> > ---
>
> >  net/colo-compare.c | 5 +++--
>
> >  1 file changed, 3 insertions(+), 2 deletions(-)
>
> >
>
> > diff --git a/net/colo-compare.c b/net/colo-compare.c index
>
> > 2c20de1537..c4de86ef34 100644
>
> > --- a/net/colo-compare.c
>
> > +++ b/net/colo-compare.c
>
> > @@ -619,11 +619,12 @@ static int colo_packet_compare_other(Packet *spkt,
>
> > Packet *ppkt)
>
> > ppkt->size - offset);  }
>
> >
>
> > -static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
>
> > +static int colo_old_packet_check_one(Packet *pkt, void *user_data)
>
> >  {
>
> >  int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>
> > +uint32_t check_time = *(uint32_t *)user_data;
>
> >
>
> > -if ((now - pkt->creation_ms) > (*check_time)) {
>
> > +if ((now - pkt->creation_ms) > check_time) {
>
> >  trace_colo_old_packet_check_found(pkt->creation_ms);
>
> >  return 0;
>
> >  } else {
>
> > --
>
> > 2.25.1
>
>
>
> --

Best regards,

Derek Su
QNAP Systems, Inc.
Email: dere...@qnap.com
Tel: (+886)-2-2393-5152 ext. 15017
Address: 13F., No.56, Sec. 1, Xinsheng S. Rd., Zhongzheng Dist., Taipei
City, Taiwan


[PATCH v1 1/2] colo-compare: Fix incorrect data type conversion

2020-09-11 Thread Derek Su
Fix data type conversion of compare_timeout. The incorrect
conversion results in a random compare_timeout value and
unexpected stalls in packet comparison.
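
A minimal standalone illustration of the mismatch (not QEMU code; the union
layout and values are made up): the callback's user_data points at a 32-bit
timeout, but the old signature dereferenced it as int64_t and pulled in four
adjacent, unrelated bytes.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Stand-in for the state object: the 32-bit timeout is followed by
     * unrelated data in memory (the union keeps the 8-byte read in bounds). */
    union {
        int64_t as64;
        struct {
            uint32_t compare_timeout;   /* what the caller actually passes */
            uint32_t neighbour;         /* unrelated neighbouring field    */
        } s;
    } u = { .s = { 3000, 0xdeadbeef } };

    void *user_data = &u.s.compare_timeout;

    int64_t wrong  = *(int64_t *)user_data;   /* old code: reads 8 bytes   */
    uint32_t right = *(uint32_t *)user_data;  /* fixed code: reads 4 bytes */

    /* On a little-endian host "wrong" carries 0xdeadbeef in its high half,
     * i.e. a huge, effectively random timeout; "right" is the intended 3000. */
    printf("wrong=%lld right=%u\n", (long long)wrong, right);
    return 0;
}
```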

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 2c20de1537..c4de86ef34 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -619,11 +619,12 @@ static int colo_packet_compare_other(Packet *spkt, Packet 
*ppkt)
ppkt->size - offset);
 }
 
-static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
+static int colo_old_packet_check_one(Packet *pkt, void *user_data)
 {
 int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+uint32_t check_time = *(uint32_t *)user_data;
 
-if ((now - pkt->creation_ms) > (*check_time)) {
+if ((now - pkt->creation_ms) > check_time) {
 trace_colo_old_packet_check_found(pkt->creation_ms);
 return 0;
 } else {
-- 
2.25.1




[PATCH v1 2/2] colo-compare: Record packet creation time by QEMU_CLOCK_REALTIME

2020-09-11 Thread Derek Su
Record packet creation time by QEMU_CLOCK_REALTIME instead of
QEMU_CLOCK_HOST. The time difference between `now` and packet
`creation_ms` has the possibility of an unexpected negative value
and results in wrong comparison after changing the host clock.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 2 +-
 net/colo.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index c4de86ef34..29d7f986e3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -621,7 +621,7 @@ static int colo_packet_compare_other(Packet *spkt, Packet 
*ppkt)
 
 static int colo_old_packet_check_one(Packet *pkt, void *user_data)
 {
-int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 uint32_t check_time = *(uint32_t *)user_data;
 
 if ((now - pkt->creation_ms) > check_time) {
diff --git a/net/colo.c b/net/colo.c
index a6c66d829a..0441910169 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -164,7 +164,7 @@ Packet *packet_new(const void *data, int size, int 
vnet_hdr_len)
 
 pkt->data = g_memdup(data, size);
 pkt->size = size;
-pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 pkt->vnet_hdr_len = vnet_hdr_len;
 pkt->tcp_seq = 0;
 pkt->tcp_ack = 0;
-- 
2.25.1




[PATCH v1 0/2] colo-compare bugfixes

2020-09-11 Thread Derek Su
Hello,

The fixes are for the bugs found in colo-compare during our testing
and applications.

Please help to review, thanks a lot.

Regards,
Derek Su

Derek Su (2):
  colo-compare: Fix incorrect data type conversion
  colo-compare: Record packet creation time by QEMU_CLOCK_REALTIME

 net/colo-compare.c | 7 ---
 net/colo.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

-- 
2.25.1




[PATCH v1 0/1] COLO: only flush dirty ram pages from colo cache

2020-09-10 Thread Derek Su
On the secondary side, colo_flush_ram_cache() calls
migration_bitmap_find_dirty() to find the dirty pages and
flushes them to the host. But ram_state's ram_bulk_stage flag is
always enabled on the secondary side, which leads to copying all
RAM pages instead of only the dirty ones.

In a test VM with 4 GB of RAM in the steady state,
colo_flush_ram_cache() takes about 650 ms.

Here, the ram_bulk_stage flag on the secondary side is disabled
during the preparation of the COLO incoming process to avoid
flushing all RAM pages.

After patching, the time spent in colo_flush_ram_cache() is
reduced to about 10 ms on average.
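
For reference, the bulk-stage shortcut that causes the full flush looks
roughly like this in migration_bitmap_find_dirty() (a simplified paraphrase
of migration/ram.c; the code in the tree has a few extra conditions):

```c
static unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
                                                 unsigned long start)
{
    unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
    unsigned long *bitmap = rb->bmap;

    if (rs->ram_bulk_stage && start > 0) {
        /* Bulk stage: assume the next page is dirty, skip the bitmap. */
        return start + 1;
    }
    /* Normal stage: return the next page actually marked dirty. */
    return find_next_bit(bitmap, size, start);
}
```

While ram_bulk_stage stays true on the secondary side, every page is treated
as dirty, so colo_flush_ram_cache() walks and copies all of RAM.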

Please help to review and give comments, thanks a lot!


Derek Su (1):
  COLO: only flush dirty ram pages from colo cache

 migration/colo.c |  6 +-
 migration/ram.c  | 10 ++
 migration/ram.h  |  3 +++
 3 files changed, 18 insertions(+), 1 deletion(-)

-- 
2.25.1




[PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-10 Thread Derek Su
On the secondary side, colo_flush_ram_cache() calls
migration_bitmap_find_dirty() to find the dirty pages and
flushes them to the host. But ram_state's ram_bulk_stage flag is
always enabled on the secondary side, which leads to copying all
RAM pages instead of only the dirty ones.

Here, ram_bulk_stage on the secondary side is disabled during the
preparation of the COLO incoming process to avoid flushing all
RAM pages.

Signed-off-by: Derek Su 
---
 migration/colo.c |  6 +-
 migration/ram.c  | 10 ++
 migration/ram.h  |  3 +++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..6e644db306 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
 return NULL;
 }
 
+colo_disable_ram_bulk_stage();
+
 failover_init_state();
 
 mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
@@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 #else
-abort();
+abort();
 #endif
 vm_start();
 trace_colo_vm_state_change("stop", "run");
@@ -924,6 +926,8 @@ out:
 qemu_fclose(fb);
 }
 
+colo_enable_ram_bulk_stage();
+
 /* Hope this not to be too long to loop here */
 qemu_sem_wait(&mis->colo_incoming_sem);
 qemu_sem_destroy(&mis->colo_incoming_sem);
diff --git a/migration/ram.c b/migration/ram.c
index 76d4fee5d5..65e9b12058 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
 return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
 }
 
+void colo_enable_ram_bulk_stage(void)
+{
+ram_state->ram_bulk_stage = true;
+}
+
+void colo_disable_ram_bulk_stage(void)
+{
+ram_state->ram_bulk_stage = false;
+}
+
 /*
  * Flush content of RAM cache into SVM's memory.
  * Only flush the pages that be dirtied by PVM or SVM or both.
diff --git a/migration/ram.h b/migration/ram.h
index 2eeaacfa13..c1c0ebbe0f 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_incoming_start_dirty_log(void);
 
+void colo_enable_ram_bulk_stage(void);
+void colo_disable_ram_bulk_stage(void);
+
 #endif
-- 
2.25.1




[Bug 1894818] Re: COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
Hi,

I also tested some emulated NIC devices and virtio network devices (see
the attachment).

The VNC client's screen cannot be recovered when using any of the virtio
network devices or the emulated e1000e NIC.

Thanks.

Regards,
Derek


** Attachment added: "截圖 2020-09-09 上午10.39.09.png"
   
https://bugs.launchpad.net/qemu/+bug/1894818/+attachment/5408894/+files/%E6%88%AA%E5%9C%96%202020-09-09%20%E4%B8%8A%E5%8D%8810.39.09.png

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  New

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. I need to restart VNC client by myself.

  BTW, it works well after killing SVM.

  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



[Bug 1894818] Re: COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
** Changed in: qemu
   Status: Invalid => New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  New

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. I need to restart VNC client by myself.

  BTW, it works well after killing SVM.

  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



[Bug 1894818] Re: COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
Hi, Lukas

After fixing the misuse of AWD, there is still a high probability that the
VNC/RDP client hangs after the PVM dies and the SVM takes over.

Here are the steps.

1. Start PVM script
```
imagefolder="/mnt/nfs2/vms"
  
 qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 4096 -smp 2 -qmp stdio 
\   
   -device piix3-usb-uhci -device usb-tablet -name primary \
   -device virtio-net-pci,id=e0,netdev=hn0 \
   -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
   -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server,nowait \
   -chardev socket,id=compare1,host=0.0.0.0,port=9004,server,wait \
   -chardev socket,id=compare0,host=127.0.0.1,port=9001,server,nowait \
   -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
   -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server,nowait \
   -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
   -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
   -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
   -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
   -object iothread,id=iothread1 \
   -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
outdev=compare_out0,iothread=iothread1 \
   -drive 
if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 
-vnc :0 -S
```

2. Start SVM script
```
#!/bin/bash
  
imagefolder="/mnt/nfs2/vms"
primary_ip=127.0.0.1

qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 100G
qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 100G

qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 4096 -smp 2 -qmp stdio \
-device piix3-usb-uhci -device usb-tablet -name secondary \
-device virtio-net-pci,id=e0,netdev=hn0 \   

-netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
-chardev socket,id=red0,host=$primary_ip,port=9003,reconnect=1 \
-chardev socket,id=red1,host=$primary_ip,port=9004,reconnect=1 \
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
-object filter-rewriter,id=rew0,netdev=hn0,queue=all \
-drive 
if=none,id=parent0,file.filename=$imagefolder/secondary.qcow2,driver=qcow2 \
-drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
file.backing.backing=parent0 \
-drive 
if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0=childs0
 \
-vnc :1 \
-incoming tcp:0.0.0.0:9998
```

3. On Secondary VM's QEMU monitor, issue command
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': 
{'host': '0.0.0.0', 'port': ''} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': 
true } }

4. On Primary VM's QEMU monitor, issue command:
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 
'replication0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }

5. kill PVM

6. On SVM, issues
```
{'execute': 'nbd-server-stop'}
{'execute': 'x-colo-lost-heartbeat'}

{'execute': 'object-del', 'arguments':{ 'id': 'f2' } }
{'execute': 'object-del', 'arguments':{ 'id': 'f1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red0' } }
```

I use "-device virtio-net-pci" here, but after replacing with "-device rtl8139",
the behavior seems normal.
Is "-device virtio-net-pci" available in COLO?

Thanks.

Regards,
Derek

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  Invalid

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen 

Re: [Bug 1894818] [NEW] COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
Hi, Lukas

It is caused by the advanced watchdog (AWD) feature instead of COLO itself.
I will check whether it is a misuse on my side or not, thanks.

Best regards,

Derek

Lukas Straub <1894...@bugs.launchpad.net> wrote on Tuesday, September 8, 2020 at 8:30 PM:

> On Tue, 08 Sep 2020 10:25:52 -
> Launchpad Bug Tracker <1894...@bugs.launchpad.net> wrote:
>
> > You have been subscribed to a public bug by Derek Su (dereksu):
> >
> > Hello,
> >
> > After setting up COLO's primary and secondary VMs,
> > I installed the vncserver and xrdp (apt install tightvncserver xrdp)
> inside the VM.
> >
> > I access the VM from another PC via VNC/RDP client, and everything is OK.
> > Then, kill the primary VM and issue the failover commands.
> >
> > The expected result is that the VNC/RDP client can reconnect and resume
> > automatically after failover. (I've confirmed the VNC/RDP client can
> > reconnect automatically.)
> >
> > But in my test, the VNC client's screen hangs and cannot be recovered no
> > longer. (I need to restart VNC client by myself.)
> >
> > BTW, it works well after killing SVM.
> >
> > Here is my QEMU networking device
> > ```
> > -device virtio-net-pci,id=e0,netdev=hn0 \
> > -netdev
> tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
> > ```
> >
> > Thanks.
> >
> > Regards,
> > Derek
> >
> > ** Affects: qemu
> >  Importance: Undecided
> >  Status: New
> >
>
> Hello,
> Can you show the full qemu command line?
>
> Regards,
> Lukas Straub
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1894818
>
> Title:
>   COLO's guest VNC client hang after failover
>
> Status in QEMU:
>   New
>
> Bug description:
>   Hello,
>
>   After setting up COLO's primary and secondary VMs,
>   I installed the vncserver and xrdp (apt install tightvncserver xrdp)
> inside the VM.
>
>   I access the VM from another PC via VNC/RDP client, and everything is OK.
>   Then, kill the primary VM and issue the failover commands.
>
>   The expected result is that the VNC/RDP client can reconnect and
>   resume automatically after failover. (I've confirmed the VNC/RDP
>   client can reconnect automatically.)
>
>   But in my test, the VNC client's screen hangs and cannot be recovered
>   no longer. I need to restart VNC client by myself.
>
>   BTW, it works well after killing SVM.
>
>   Here is my QEMU networking device
>   ```
>   -device virtio-net-pci,id=e0,netdev=hn0 \
>   -netdev
> tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
>   ```
>
>   Thanks.
>
>   Regards,
>   Derek
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions
>


** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  Invalid

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. I need to restart VNC client by myself.

  BTW, it works well after killing SVM.

  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



[Bug 1894818] Re: COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
** Description changed:

  Hello,
  
  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.
  
  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.
  
  The expected result is that the VNC/RDP client can reconnect and resume
  automatically after failover. (I've confirmed the VNC/RDP client can
  reconnect automatically.)
  
  But in my test, the VNC client's screen hangs and cannot be recovered no
  longer. (I need to restart VNC client by myself.)
  
  BTW, it works well after killing SVM.
  
+ Here is my QEMU networking device
+ ```
+ -device virtio-net-pci,id=e0,netdev=hn0 \
+ -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
+ ```
+ 
  Thanks.
  
  Regards,
  Derek

** Description changed:

  Hello,
  
  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.
  
  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.
  
  The expected result is that the VNC/RDP client can reconnect and resume
  automatically after failover. (I've confirmed the VNC/RDP client can
  reconnect automatically.)
  
  But in my test, the VNC client's screen hangs and cannot be recovered no
- longer. (I need to restart VNC client by myself.)
+ longer. I need to restart VNC client by myself.
  
  BTW, it works well after killing SVM.
  
  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```
  
  Thanks.
  
  Regards,
  Derek

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  New

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. I need to restart VNC client by myself.

  BTW, it works well after killing SVM.

  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



[Bug 1894818] Re: COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
** Description changed:

  Hello,
  
  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.
  
  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.
  
- The expected result is that the VNC/RDP client can resume after failover.
- But in my test, the VNC client's screen hangs and cannot be recovered no 
longer.
+ The expected result is that the VNC/RDP client can reconnect and resume
+ automatically after failover. (I've confirmed the VNC/RDP client can
+ reconnect automatically.)
+ 
+ But in my test, the VNC client's screen hangs and cannot be recovered no
+ longer. (I need to restart VNC client by myself.)
  
  BTW, it works well after killing SVM.
  
  Thanks.
  
  Regards,
  Derek

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  New

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. (I need to restart VNC client by myself.)

  BTW, it works well after killing SVM.

  Here is my QEMU networking device
  ```
  -device virtio-net-pci,id=e0,netdev=hn0 \
  -netdev 
tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
  ```

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



[Bug 1894818] [NEW] COLO's guest VNC client hang after failover

2020-09-08 Thread Derek Su
Public bug reported:

Hello,

After setting up COLO's primary and secondary VMs,
I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside the 
VM.

I access the VM from another PC via VNC/RDP client, and everything is OK.
Then, kill the primary VM and issue the failover commands.

The expected result is that the VNC/RDP client can reconnect and resume
automatically after failover. (I've confirmed the VNC/RDP client can
reconnect automatically.)

But in my test, the VNC client's screen hangs and cannot be recovered no
longer. (I need to restart VNC client by myself.)

BTW, it works well after killing SVM.

Thanks.

Regards,
Derek

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894818

Title:
  COLO's guest VNC client hang after failover

Status in QEMU:
  New

Bug description:
  Hello,

  After setting up COLO's primary and secondary VMs,
  I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside 
the VM.

  I access the VM from another PC via VNC/RDP client, and everything is OK.
  Then, kill the primary VM and issue the failover commands.

  The expected result is that the VNC/RDP client can reconnect and
  resume automatically after failover. (I've confirmed the VNC/RDP
  client can reconnect automatically.)

  But in my test, the VNC client's screen hangs and cannot be recovered
  no longer. (I need to restart VNC client by myself.)

  BTW, it works well after killing SVM.

  Thanks.

  Regards,
  Derek

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1894818/+subscriptions



Re: [PATCH v1 0/1] COLO: migrate dirty ram pages before colo checkpoint

2020-08-15 Thread Derek Su
On Sat, Aug 15, 2020 at 9:42 AM Zhanghailiang
 wrote:
>
> > -Original Message-
> > From: Derek Su [mailto:jwsu1...@gmail.com]
> > Sent: Thursday, August 13, 2020 6:28 PM
> > To: Lukas Straub 
> > Cc: Derek Su ; qemu-devel@nongnu.org; Zhanghailiang
> > ; chy...@qnap.com; quint...@redhat.com;
> > dgilb...@redhat.com; ctch...@qnap.com
> > Subject: Re: [PATCH v1 0/1] COLO: migrate dirty ram pages before colo
> > checkpoint
> >
> > On Fri, Jul 31, 2020 at 3:52 PM Lukas Straub  wrote:
> > >
> > > On Sun, 21 Jun 2020 10:10:03 +0800
> > > Derek Su  wrote:
> > >
> > > > This series is to reduce the guest's downtime during colo checkpoint
> > > by migrating as many dirty ram pages as possible before the colo checkpoint.
> > > >
> > > > If the iteration count reaches COLO_RAM_MIGRATE_ITERATION_MAX or ram
> > > > pending size is lower than 'x-colo-migrate-ram-threshold', stop the
> > > > ram migration and do colo checkpoint.
> > > >
> > > > Test environment:
> > > > The both primary VM and secondary VM has 1GiB ram and 10GbE NIC for
> > > > FT traffic.
> > > > One fio buffer write job runs on the guest.
> > > > The result shows the total primary VM downtime is decreased by ~40%.
> > > >
> > > > Please help to review it and suggestions are welcomed.
> > > > Thanks.
> > >
> > > Hello Derek,
> > > Sorry for the late reply.
> > > I think this is not a good idea, because it unnecessarily introduces a 
> > > delay
> > between checkpoint request and the checkpoint itself and thus impairs 
> > network
> > bound workloads due to increased network latency. Workloads that are
> > independent from network don't cause many checkpoints anyway, so it doesn't
> > help there either.
> > >
> >
>
> Hi Derek,
>
> Actually, there is a quite interesting question we should think about:
> What will happen if the VM continues to run after a mismatched state is
> detected between the PVM and SVM?
> According to the rules of COLO, we should stop the VMs immediately to sync
> the state between the PVM and SVM.
> But here, you let them continue to run for a while; then more client network
> packets may come in and dirty more memory pages. Another side effect is that
> the new network packets will most likely not be sent out, because their
> replies should differ once the state between the PVM and SVM is different.
>
> So, IMHO, it makes no sense to let the VMs continue to run after they are
> detected to be in different states.
> Besides, I don't think it is easy to construct this case in tests.
>
>
> Thanks,
> Hailiang
>

Hello, Hailiang

Thanks. I got your point.
In my tests, the mismatch between packets does not happen, so the
network latency does not increase.

By the way, I've tried your commit addressing this issue.
It is useful for low dirty memory and low dirty rate workload.

But in high "buffered IO read/write" workload, it results in PVM
resends massive and same dirty ram pages  every cycle triggered
by DEFAULT_RAM_PENDING_CHECK (default 1 second) timer, so hurt the IO
performance and without improvement of downtime?
Do you have any thoughts about this?

Is it possible to separate the checkpoints invoked by the periodic
timer from those invoked by a packet mismatch, and to use a different
strategy for each to cope with the long-downtime issue?
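
A minimal sketch of what such a split could look like (purely illustrative,
not QEMU code; every name below is hypothetical): tag each checkpoint request
with its cause and run the extra "migrate dirty RAM first" step only for
timer-driven checkpoints.

```
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical trigger causes; QEMU does not define these names. */
typedef enum {
    CHECKPOINT_CAUSE_PERIODIC,   /* periodic checkpoint timer expired */
    CHECKPOINT_CAUSE_MISMATCH,   /* colo-compare saw divergent packets */
} CheckpointCause;

static void take_checkpoint(bool premigrate_dirty_ram)
{
    if (premigrate_dirty_ram) {
        printf("migrate dirty RAM while the guest is still running\n");
    }
    printf("pause both VMs and synchronize their state\n");
}

static void request_checkpoint(CheckpointCause cause)
{
    /*
     * Delay the pause (to pre-migrate RAM) only for periodic checkpoints;
     * on a packet mismatch the output is already blocked, so the checkpoint
     * should happen as soon as possible.
     */
    take_checkpoint(cause == CHECKPOINT_CAUSE_PERIODIC);
}

int main(void)
{
    request_checkpoint(CHECKPOINT_CAUSE_PERIODIC);
    request_checkpoint(CHECKPOINT_CAUSE_MISMATCH);
    return 0;
}
```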

Thanks.

Regards,
Derek

> > Hello, Lukas & Zhanghailiang
> >
> > Thanks for your opinions.
> > I went through my patch, and I feel a little confused and would like to dig 
> > into it
> > more.
> >
> > In this patch, colo_migrate_ram_before_checkpoint() is before
> > COLO_MESSAGE_CHECKPOINT_REQUEST, so the SVM and PVM should not enter
> > the pause state.
> >
> > In the meanwhile, the packets to PVM/SVM can still be compared and notify
> > inconsistency if mismatched, right?
> > Is it possible to introduce extra network latency?
> >
> > In my test (randwrite to disk by fio with direct=0), the ping from another 
> > client to
> > the PVM  using generic colo and colo used this patch are below.
> > The network latency does not increase as my expectation.
> >
> > generic colo
> > ```
> > 64 bytes from 192.168.80.18: icmp_seq=87 ttl=64 time=28.109 ms
> > 64 bytes from 192.168.80.18: icmp_seq=88 ttl=64 time=16.747 ms
> > 64 bytes from 192.168.80.18: icmp_seq=89 ttl=64 time=2388.779 ms
> > <checkpoint start
> > 64 bytes from 192.168.80.18: icmp_seq=90 ttl=64 time=1385.792 ms
> > 64 bytes 

Re: [PATCH v1 0/1] COLO: migrate dirty ram pages before colo checkpoint

2020-08-13 Thread Derek Su
On Fri, Jul 31, 2020 at 3:52 PM Lukas Straub  wrote:
>
> On Sun, 21 Jun 2020 10:10:03 +0800
> Derek Su  wrote:
>
> > This series is to reduce the guest's downtime during colo checkpoint
> > by migrating dirty ram pages as many as possible before colo checkpoint.
> >
> > If the iteration count reaches COLO_RAM_MIGRATE_ITERATION_MAX or
> > ram pending size is lower than 'x-colo-migrate-ram-threshold',
> > stop the ram migration and do colo checkpoint.
> >
> > Test environment:
> > The both primary VM and secondary VM has 1GiB ram and 10GbE NIC
> > for FT traffic.
> > One fio buffer write job runs on the guest.
> > The result shows the total primary VM downtime is decreased by ~40%.
> >
> > Please help to review it and suggestions are welcomed.
> > Thanks.
>
> Hello Derek,
> Sorry for the late reply.
> I think this is not a good idea, because it unnecessarily introduces a delay 
> between checkpoint request and the checkpoint itself and thus impairs network 
> bound workloads due to increased network latency. Workloads that are 
> independent from network don't cause many checkpoints anyway, so it doesn't 
> help there either.
>

Hello, Lukas & Zhanghailiang

Thanks for your opinions.
I went through my patch, and I feel a little confused and would like
to dig into it more.

In this patch, colo_migrate_ram_before_checkpoint() is before
COLO_MESSAGE_CHECKPOINT_REQUEST,
so the SVM and PVM should not enter the pause state.

In the meanwhile, the packets to PVM/SVM can still be compared and
notify inconsistency if mismatched, right?
Is it possible to introduce extra network latency?

In my test (randwrite to disk by fio with direct=0),
the ping times from another client to the PVM using generic colo and colo
with this patch are shown below.
The network latency does not increase, as I expected.

generic colo
```
64 bytes from 192.168.80.18: icmp_seq=87 ttl=64 time=28.109 ms
64 bytes from 192.168.80.18: icmp_seq=88 ttl=64 time=16.747 ms
64 bytes from 192.168.80.18: icmp_seq=89 ttl=64 time=2388.779 ms
<checkpoint start
64 bytes from 192.168.80.18: icmp_seq=90 ttl=64 time=1385.792 ms
64 bytes from 192.168.80.18: icmp_seq=91 ttl=64 time=384.896 ms
<checkpoint end
64 bytes from 192.168.80.18: icmp_seq=92 ttl=64 time=3.895 ms
64 bytes from 192.168.80.18: icmp_seq=93 ttl=64 time=1.020 ms
64 bytes from 192.168.80.18: icmp_seq=94 ttl=64 time=0.865 ms
64 bytes from 192.168.80.18: icmp_seq=95 ttl=64 time=0.854 ms
64 bytes from 192.168.80.18: icmp_seq=96 ttl=64 time=28.359 ms
64 bytes from 192.168.80.18: icmp_seq=97 ttl=64 time=12.309 ms
64 bytes from 192.168.80.18: icmp_seq=98 ttl=64 time=0.870 ms
64 bytes from 192.168.80.18: icmp_seq=99 ttl=64 time=2371.733 ms
64 bytes from 192.168.80.18: icmp_seq=100 ttl=64 time=1371.440 ms
64 bytes from 192.168.80.18: icmp_seq=101 ttl=64 time=366.414 ms
64 bytes from 192.168.80.18: icmp_seq=102 ttl=64 time=0.818 ms
64 bytes from 192.168.80.18: icmp_seq=103 ttl=64 time=0.997 ms
```

colo used this patch
```
64 bytes from 192.168.80.18: icmp_seq=72 ttl=64 time=1.417 ms
64 bytes from 192.168.80.18: icmp_seq=73 ttl=64 time=0.931 ms
64 bytes from 192.168.80.18: icmp_seq=74 ttl=64 time=0.876 ms
64 bytes from 192.168.80.18: icmp_seq=75 ttl=64 time=1184.034 ms
<checkpoint start
64 bytes from 192.168.80.18: icmp_seq=76 ttl=64 time=181.297 ms
<checkpoint end
64 bytes from 192.168.80.18: icmp_seq=77 ttl=64 time=0.865 ms
64 bytes from 192.168.80.18: icmp_seq=78 ttl=64 time=0.858 ms
64 bytes from 192.168.80.18: icmp_seq=79 ttl=64 time=1.247 ms
64 bytes from 192.168.80.18: icmp_seq=80 ttl=64 time=0.946 ms
64 bytes from 192.168.80.18: icmp_seq=81 ttl=64 time=0.855 ms
64 bytes from 192.168.80.18: icmp_seq=82 ttl=64 time=0.868 ms
64 bytes from 192.168.80.18: icmp_seq=83 ttl=64 time=0.749 ms
64 bytes from 192.168.80.18: icmp_seq=84 ttl=64 time=2.154 ms
64 bytes from 192.168.80.18: icmp_seq=85 ttl=64 time=1499.186 ms
64 bytes from 192.168.80.18: icmp_seq=86 ttl=64 time=496.173 ms
64 bytes from 192.168.80.18: icmp_seq=87 ttl=64 time=0.854 ms
64 bytes from 192.168.80.18: icmp_seq=88 ttl=64 time=0.774 ms
```

Thank you.

Regards,
Derek

> Hailang did have a patch to migrate ram between checkpoints, which should 
> help all workloads, but it wasn't merged back then. I think you can pick it 
> up again, rebase and address David's and Eric's comments:
> https://lore.kernel.org/qemu-devel/20200217012049.22988-3-zhang.zhanghaili...@huawei.com/T/#u
>
> Hailang, are you ok with that?
>
> Regards,
> Lukas Straub



Re: [Virtio-fs] virtio-fs performance

2020-08-04 Thread Derek Su
Vivek Goyal wrote on Tuesday, July 28, 2020 at 11:27 PM:
>
> On Tue, Jul 28, 2020 at 02:49:36PM +0100, Stefan Hajnoczi wrote:
> > > I'm trying and testing the virtio-fs feature in QEMU v5.0.0.
> > > My host and guest OS are both ubuntu 18.04 with kernel 5.4, and the
> > > underlying storage is one single SSD.
> > >
> > > The configuations are:
> > > (1) virtiofsd
> > > ./virtiofsd -o
> > > source=/mnt/ssd/virtiofs,cache=auto,flock,posix_lock,writeback,xattr
> > > --thread-pool-size=1 --socket-path=/tmp/vhostqemu
> > >
> > > (2) qemu
> > > qemu-system-x86_64 \
> > > -enable-kvm \
> > > -name ubuntu \
> > > -cpu Westmere \
> > > -m 4096 \
> > > -global kvm-apic.vapic=false \
> > > -netdev 
> > > tap,id=hn0,vhost=off,br=br0,helper=/usr/local/libexec/qemu-bridge-helper
> > > \
> > > -device e1000,id=e0,netdev=hn0 \
> > > -blockdev '{"node-name": "disk0", "driver": "qcow2",
> > > "refcount-cache-size": 1638400, "l2-cache-size": 6553600, "file": {
> > > "driver": "file", "filename": "'${imagefolder}\/ubuntu.qcow2'"}}' \
> > > -device virtio-blk,drive=disk0,id=disk0 \
> > > -chardev socket,id=ch0,path=/tmp/vhostqemu \
> > > -device vhost-user-fs-pci,chardev=ch0,tag=myfs \
> > > -object memory-backend-memfd,id=mem,size=4G,share=on \
> > > -numa node,memdev=mem \
> > > -qmp stdio \
> > > -vnc :0
> > >
> > > (3) guest
> > > mount -t virtiofs myfs /mnt/virtiofs
> > >
> > > I tried to change virtiofsd's --thread-pool-size value and test the
> > > storage performance by fio.
> > > Before each read/write/randread/randwrite test, the pagecaches of
> > > guest and host are dropped.
> > >
> > > ```
> > > RW="read" # or write/randread/randwrite
> > > fio --name=test --rw=$RW --bs=4k --numjobs=1 --ioengine=libaio
> > > --runtime=60 --direct=0 --iodepth=64 --size=10g
> > > --filename=/mnt/virtiofs/testfile
> > > done
>
> Couple of things.
>
> - Can you try cache=none option in virtiofsd. That will bypass page
>   cache in guest. It also gets rid of latencies related to
>   file_remove_privs() as of now.
>
> - Also with direct=0, are we really driving iodepth of 64? With direct=0
>   it is cached I/O. Is it still asynchronous at this point, or have
>   we fallen back to synchronous I/O and a queue depth of
>   1?

Hi, Vivek

I did not see any difference in queue depth with direct={0|1} in my fio test.
Are there any other clues for digging into this issue?

>
> - With cache=auto/always, I am seeing performance issues with small writes
>   and trying to address it.
>
> https://lore.kernel.org/linux-fsdevel/20200716144032.gc422...@redhat.com/
> https://lore.kernel.org/linux-fsdevel/20200724183812.19573-1-vgo...@redhat.com/

No problem, I'll try it, thanks.

Regards,
Derek

>
> Thanks
> Vivek
>
> > > ```
> > >
> > > --thread-pool-size=64 (default)
> > > seq read: 305 MB/s
> > > seq write: 118 MB/s
> > > rand 4KB read:  IOPS
> > > rand 4KB write: 21100 IOPS
> > >
> > > --thread-pool-size=1
> > > seq read: 387 MB/s
> > > seq write: 160 MB/s
> > > rand 4KB read: 2622 IOPS
> > > rand 4KB write: 30400 IOPS
> > >
> > > The results show the performance using default-pool-size (64) is
> > > poorer than using single thread.
> > > Is it due to the lock contention of the multiple threads?
> > > When can virtio-fs get better performance using multiple threads?
> > >
> > >
> > > I also tested the performance that guest accesses host's files via
> > > NFSv4/CIFS network filesystem.
> > > The "seq read" and "randread" performance of virtio-fs are also worse
> > > than the NFSv4 and CIFS.
> > >
> > > NFSv4:
> > >   seq write: 244 MB/s
> > >   rand 4K read: 4086 IOPS
> > >
> > > I cannot figure out why the perf of NFSv4/CIFS with the network stack
> > > is better than virtio-fs.
> > > Is it expected? Or, do I have an incorrect configuration?
> >
> > No, I remember benchmarking the thread pool and did not see such a big
> > difference.
> >
> > Please use direct=1 so that each I/O results in a virtio-fs request.
> > Otherwise the I/O pattern is not directly controlled by the benchmark
> > but by the page cache (readahead, etc).
> >
> > Using numactl(8) or taskset(1) to launch virtiofsd allows you to control
> > NUMA and CPU scheduling properties. For example, you could force all 64
> > threads to run on the same host CPU using taskset to see if that helps
> > this I/O bound workload.
> >
> > fio can collect detailed statistics on queue depths and a latency
> > histogram. It would be interesting to compare the --thread-pool-size=64
> > and --thread-pool-size=1 numbers.
> >
> > Comparing the "perf record -e kvm:kvm_exit" counts between the two might
> > also be interesting.
> >
> > Stefan
>
>
>
>



Re: virtio-fs performance

2020-08-04 Thread Derek Su
Hello,

With cache=none set in virtiofsd and direct=1 in fio,
here are the results and the kvm-exit counts over 5 seconds.

--thread-pool-size=64 (default)
seq read: 307 MB/s (kvm-exit count=1076463)
seq write: 430 MB/s (kvm-exit count=1302493)
rand 4KB read: 65.2k IOPS (kvm-exit count=1322899)
rand 4KB write: 97.2k IOPS (kvm-exit count=1568618)

--thread-pool-size=1
seq read: 303 MB/s (kvm-exit count=1034614)
seq write: 358 MB/s (kvm-exit count=1537735)
rand 4KB read: 7995 IOPS (kvm-exit count=438348)
rand 4KB write: 97.7k IOPS (kvm-exit count=1907585)

The thread-pool-size=64 improves the rand 4KB read performance significantly,
but doesn't increase the kvm-exit count too much.

In addition, the fio average completion latency (clat) of rand 4K write is
960us for thread-pool-size=64 and 7700us for thread-pool-size=1.

Regards,
Derek

Stefan Hajnoczi wrote on Tuesday, July 28, 2020 at 9:49 PM:
>
> > I'm trying and testing the virtio-fs feature in QEMU v5.0.0.
> > My host and guest OS are both ubuntu 18.04 with kernel 5.4, and the
> > underlying storage is one single SSD.
> >
> > The configuations are:
> > (1) virtiofsd
> > ./virtiofsd -o
> > source=/mnt/ssd/virtiofs,cache=auto,flock,posix_lock,writeback,xattr
> > --thread-pool-size=1 --socket-path=/tmp/vhostqemu
> >
> > (2) qemu
> > qemu-system-x86_64 \
> > -enable-kvm \
> > -name ubuntu \
> > -cpu Westmere \
> > -m 4096 \
> > -global kvm-apic.vapic=false \
> > -netdev 
> > tap,id=hn0,vhost=off,br=br0,helper=/usr/local/libexec/qemu-bridge-helper
> > \
> > -device e1000,id=e0,netdev=hn0 \
> > -blockdev '{"node-name": "disk0", "driver": "qcow2",
> > "refcount-cache-size": 1638400, "l2-cache-size": 6553600, "file": {
> > "driver": "file", "filename": "'${imagefolder}\/ubuntu.qcow2'"}}' \
> > -device virtio-blk,drive=disk0,id=disk0 \
> > -chardev socket,id=ch0,path=/tmp/vhostqemu \
> > -device vhost-user-fs-pci,chardev=ch0,tag=myfs \
> > -object memory-backend-memfd,id=mem,size=4G,share=on \
> > -numa node,memdev=mem \
> > -qmp stdio \
> > -vnc :0
> >
> > (3) guest
> > mount -t virtiofs myfs /mnt/virtiofs
> >
> > I tried to change virtiofsd's --thread-pool-size value and test the
> > storage performance by fio.
> > Before each read/write/randread/randwrite test, the pagecaches of
> > guest and host are dropped.
> >
> > ```
> > RW="read" # or write/randread/randwrite
> > fio --name=test --rw=$RW --bs=4k --numjobs=1 --ioengine=libaio
> > --runtime=60 --direct=0 --iodepth=64 --size=10g
> > --filename=/mnt/virtiofs/testfile
> > done
> > ```
> >
> > --thread-pool-size=64 (default)
> > seq read: 305 MB/s
> > seq write: 118 MB/s
> > rand 4KB read:  IOPS
> > rand 4KB write: 21100 IOPS
> >
> > --thread-pool-size=1
> > seq read: 387 MB/s
> > seq write: 160 MB/s
> > rand 4KB read: 2622 IOPS
> > rand 4KB write: 30400 IOPS
> >
> > The results show the performance using default-pool-size (64) is
> > poorer than using single thread.
> > Is it due to the lock contention of the multiple threads?
> > When can virtio-fs get better performance using multiple threads?
> >
> >
> > I also tested the performance that guest accesses host's files via
> > NFSv4/CIFS network filesystem.
> > The "seq read" and "randread" performance of virtio-fs are also worse
> > than the NFSv4 and CIFS.
> >
> > NFSv4:
> >   seq write: 244 MB/s
> >   rand 4K read: 4086 IOPS
> >
> > I cannot figure out why the perf of NFSv4/CIFS with the network stack
> > is better than virtio-fs.
> > Is it expected? Or, do I have an incorrect configuration?
>
> No, I remember benchmarking the thread pool and did not see such a big
> difference.
>
> Please use direct=1 so that each I/O results in a virtio-fs request.
> Otherwise the I/O pattern is not directly controlled by the benchmark
> but by the page cache (readahead, etc).
>
> Using numactl(8) or taskset(1) to launch virtiofsd allows you to control
> NUMA and CPU scheduling properties. For example, you could force all 64
> threads to run on the same host CPU using taskset to see if that helps
> this I/O bound workload.
>
> fio can collect detailed statistics on queue depths and a latency
> histogram. It would be interesting to compare the --thread-pool-size=64
> and --thread-pool-size=1 numbers.
>
> Comparing the "perf record -e kvm:kvm_exit" counts between the two might
> also be interesting.
>
> Stefan



Re: [PATCH v1 1/1] migration/colo.c: migrate dirty ram pages before colo checkpoint

2020-07-13 Thread Derek Su

Hello,

Ping...

Does anyone have comments about this patch?
To reduce the downtime during checkpoints, the patch tries to migrate as
many memory pages as possible just before entering the COLO state.


Thanks.

Regards,
Derek

On 2020/6/21 at 10:10 AM, Derek Su wrote:

To reduce the guest's downtime during checkpoint, migrate as many dirty
ram pages as possible before the colo checkpoint.

If the iteration count reaches COLO_RAM_MIGRATE_ITERATION_MAX or the
ram pending size is lower than 'x-colo-migrate-ram-threshold',
stop the ram migration and do the colo checkpoint.

Signed-off-by: Derek Su 
---
  migration/colo.c   | 79 ++
  migration/migration.c  | 20 +++
  migration/trace-events |  2 ++
  monitor/hmp-cmds.c |  8 +
  qapi/migration.json| 18 --
  5 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..a0c71d7c56 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -46,6 +46,13 @@ static Notifier packets_compare_notifier;
  static COLOMode last_colo_mode;
  
  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)

+#define COLO_RAM_MIGRATE_ITERATION_MAX 10
+
+typedef enum {
+COLO_MIG_ITERATE_RESUME = 0,  /* Resume migration iteration */
+COLO_MIG_ITERATE_BREAK  = 1,  /* Break migration loop */
+} COLOMigIterateState;
+
  
  bool migration_in_colo_state(void)

  {
@@ -398,6 +405,68 @@ static uint64_t colo_receive_message_value(QEMUFile *f, 
uint32_t expect_msg,
  return value;
  }
  
+static int colo_migrate_ram_iteration(MigrationState *s, Error **errp)

+{
+int64_t threshold_size = s->parameters.x_colo_migrate_ram_threshold;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
+Error *local_err = NULL;
+int ret;
+
+if (threshold_size == 0) {
+return COLO_MIG_ITERATE_BREAK;
+}
+
+qemu_savevm_state_pending(s->to_dst_file, threshold_size,
+  &pend_pre, &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
+
+trace_colo_migrate_pending(pending_size, threshold_size,
+   pend_pre, pend_compat, pend_post);
+
+/* Still a significant amount to transfer */
+if (pending_size && pending_size >= threshold_size) {
+colo_send_message(s->to_dst_file, COLO_MESSAGE_MIGRATE_RAM, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return COLO_MIG_ITERATE_BREAK;
+}
+
+qemu_savevm_state_iterate(s->to_dst_file, false);
+qemu_put_byte(s->to_dst_file, QEMU_VM_EOF);
+
+ret = qemu_file_get_error(s->to_dst_file);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Failed to send dirty pages");
+return COLO_MIG_ITERATE_BREAK;
+}
+
+return COLO_MIG_ITERATE_RESUME;
+}
+
+trace_colo_migration_low_pending(pending_size);
+
+return COLO_MIG_ITERATE_BREAK;
+}
+
+static void colo_migrate_ram_before_checkpoint(MigrationState *s, Error **errp)
+{
+Error *local_err = NULL;
+int iterate_count = 0;
+
+while (iterate_count++ < COLO_RAM_MIGRATE_ITERATION_MAX) {
+COLOMigIterateState state;
+
+state = colo_migrate_ram_iteration(s, &local_err);
+if (state == COLO_MIG_ITERATE_BREAK) {
+if (local_err) {
+error_propagate(errp, local_err);
+}
+
+return;
+}
+};
+}
+
  static int colo_do_checkpoint_transaction(MigrationState *s,
QIOChannelBuffer *bioc,
QEMUFile *fb)
@@ -405,6 +474,11 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
  Error *local_err = NULL;
  int ret = -1;
  
+colo_migrate_ram_before_checkpoint(s, &local_err);

+if (local_err) {
+goto out;
+}
+
  colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
  &local_err);
  if (local_err) {
@@ -819,6 +893,11 @@ static void 
colo_wait_handle_message(MigrationIncomingState *mis,
  case COLO_MESSAGE_CHECKPOINT_REQUEST:
  colo_incoming_process_checkpoint(mis, fb, bioc, errp);
  break;
+case COLO_MESSAGE_MIGRATE_RAM:
+if (qemu_loadvm_state_main(mis->from_src_file, mis) < 0) {
+error_setg(errp, "Load ram failed");
+}
+break;
  default:
  error_setg(errp, "Got unknown COLO message: %d", msg);
  break;
diff --git a/migration/migration.c b/migration/migration.c
index 481a590f72..390937ae5d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -88,6 +88,7 @@
  
  /* The delay time (in ms) between two COLO checkpoints */

  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
+#define DEFAULT_COLO_MIGRATE_RAM_THRESHOLD (100 * 1024 * 1024UL)
  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
  #define DEFAULT_MIGRATE_MULT

Re: [PATCH v1] chardev/char-socket: fix double free of err after socket is disconnected

2020-06-24 Thread Derek Su
Oops! Sorry, I didn't notice this patch before.

Thanks.

Derek

Philippe Mathieu-Daudé wrote on Wednesday, June 24, 2020 at 6:12 PM:

> On 6/24/20 12:00 PM, Derek Su wrote:
> > The err is freed in check_report_connect_error() conditionally,
> > calling error_free() directly may lead to a double-free bug.
>
> This seems the same issue Lichun is working on, right?
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg714709.html
>
> >
> > Signed-off-by: Derek Su 
> > ---
> >  chardev/char-socket.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index afebeec5c3..a009bed5ee 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -1086,7 +1086,11 @@ static void qemu_chr_socket_connected(QIOTask
> *task, void *opaque)
> >  if (qio_task_propagate_error(task, &err)) {
> >  tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
> >  check_report_connect_error(chr, err);
> > -error_free(err);
> > +
> > +if (!s->connect_err_reported) {
> > +error_free(err);
> > +}
> > +
> >  goto cleanup;
> >  }
> >
> >
>
> --

Best regards,

Derek Su
QNAP Systems, Inc.
Email: dere...@qnap.com
Tel: (+886)-2-2393-5152 ext. 15017
Address: 13F., No.56, Sec. 1, Xinsheng S. Rd., Zhongzheng Dist., Taipei
City, Taiwan


[PATCH v1] chardev/char-socket: fix double free of err after socket is disconnected

2020-06-24 Thread Derek Su
The err is freed in check_report_connect_error() conditionally,
calling error_free() directly may lead to a double-free bug.

Signed-off-by: Derek Su 
---
 chardev/char-socket.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index afebeec5c3..a009bed5ee 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -1086,7 +1086,11 @@ static void qemu_chr_socket_connected(QIOTask *task, 
void *opaque)
 if (qio_task_propagate_error(task, &err)) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
 check_report_connect_error(chr, err);
-error_free(err);
+
+if (!s->connect_err_reported) {
+error_free(err);
+}
+
 goto cleanup;
 }
 
-- 
2.17.1




[PATCH v1 0/1] COLO: migrate dirty ram pages before colo checkpoint

2020-06-20 Thread Derek Su
This series is to reduce the guest's downtime during the colo checkpoint
by migrating as many dirty ram pages as possible before the colo checkpoint.

If the iteration count reaches COLO_RAM_MIGRATE_ITERATION_MAX or the
ram pending size is lower than 'x-colo-migrate-ram-threshold',
stop the ram migration and do the colo checkpoint.

Test environment:
Both the primary VM and the secondary VM have 1GiB of ram and a 10GbE NIC
for FT traffic.
One fio buffered write job runs on the guest.

The result shows the total primary VM downtime is decreased by ~40%.

Please help to review it and suggestions are welcomed.
Thanks.

Derek Su (1):
  migration/colo.c: migrate dirty ram pages before colo checkpoint

 migration/colo.c   | 79 ++
 migration/migration.c  | 20 +++
 migration/trace-events |  2 ++
 monitor/hmp-cmds.c |  8 +
 qapi/migration.json| 18 --
 5 files changed, 125 insertions(+), 2 deletions(-)

-- 
2.17.1




[PATCH v1 1/1] migration/colo.c: migrate dirty ram pages before colo checkpoint

2020-06-20 Thread Derek Su
To reduce the guest's downtime during checkpoint, migrate as many dirty
ram pages as possible before the colo checkpoint.

If the iteration count reaches COLO_RAM_MIGRATE_ITERATION_MAX or the
ram pending size is lower than 'x-colo-migrate-ram-threshold',
stop the ram migration and do the colo checkpoint.

Signed-off-by: Derek Su 
---
 migration/colo.c   | 79 ++
 migration/migration.c  | 20 +++
 migration/trace-events |  2 ++
 monitor/hmp-cmds.c |  8 +
 qapi/migration.json| 18 --
 5 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..a0c71d7c56 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -46,6 +46,13 @@ static Notifier packets_compare_notifier;
 static COLOMode last_colo_mode;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
+#define COLO_RAM_MIGRATE_ITERATION_MAX 10
+
+typedef enum {
+COLO_MIG_ITERATE_RESUME = 0,  /* Resume migration iteration */
+COLO_MIG_ITERATE_BREAK  = 1,  /* Break migration loop */
+} COLOMigIterateState;
+
 
 bool migration_in_colo_state(void)
 {
@@ -398,6 +405,68 @@ static uint64_t colo_receive_message_value(QEMUFile *f, 
uint32_t expect_msg,
 return value;
 }
 
+static int colo_migrate_ram_iteration(MigrationState *s, Error **errp)
+{
+int64_t threshold_size = s->parameters.x_colo_migrate_ram_threshold;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
+Error *local_err = NULL;
+int ret;
+
+if (threshold_size == 0) {
+return COLO_MIG_ITERATE_BREAK;
+}
+
+qemu_savevm_state_pending(s->to_dst_file, threshold_size,
+  &pend_pre, &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
+
+trace_colo_migrate_pending(pending_size, threshold_size,
+   pend_pre, pend_compat, pend_post);
+
+/* Still a significant amount to transfer */
+if (pending_size && pending_size >= threshold_size) {
+colo_send_message(s->to_dst_file, COLO_MESSAGE_MIGRATE_RAM, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return COLO_MIG_ITERATE_BREAK;
+}
+
+qemu_savevm_state_iterate(s->to_dst_file, false);
+qemu_put_byte(s->to_dst_file, QEMU_VM_EOF);
+
+ret = qemu_file_get_error(s->to_dst_file);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Failed to send dirty pages");
+return COLO_MIG_ITERATE_BREAK;
+}
+
+return COLO_MIG_ITERATE_RESUME;
+}
+
+trace_colo_migration_low_pending(pending_size);
+
+return COLO_MIG_ITERATE_BREAK;
+}
+
+static void colo_migrate_ram_before_checkpoint(MigrationState *s, Error **errp)
+{
+Error *local_err = NULL;
+int iterate_count = 0;
+
+while (iterate_count++ < COLO_RAM_MIGRATE_ITERATION_MAX) {
+COLOMigIterateState state;
+
+state = colo_migrate_ram_iteration(s, &local_err);
+if (state == COLO_MIG_ITERATE_BREAK) {
+if (local_err) {
+error_propagate(errp, local_err);
+}
+
+return;
+}
+};
+}
+
 static int colo_do_checkpoint_transaction(MigrationState *s,
   QIOChannelBuffer *bioc,
   QEMUFile *fb)
@@ -405,6 +474,11 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 Error *local_err = NULL;
 int ret = -1;
 
+colo_migrate_ram_before_checkpoint(s, &local_err);
+if (local_err) {
+goto out;
+}
+
 colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
    &local_err);
 if (local_err) {
@@ -819,6 +893,11 @@ static void 
colo_wait_handle_message(MigrationIncomingState *mis,
 case COLO_MESSAGE_CHECKPOINT_REQUEST:
 colo_incoming_process_checkpoint(mis, fb, bioc, errp);
 break;
+case COLO_MESSAGE_MIGRATE_RAM:
+if (qemu_loadvm_state_main(mis->from_src_file, mis) < 0) {
+error_setg(errp, "Load ram failed");
+}
+break;
 default:
 error_setg(errp, "Got unknown COLO message: %d", msg);
 break;
diff --git a/migration/migration.c b/migration/migration.c
index 481a590f72..390937ae5d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -88,6 +88,7 @@
 
 /* The delay time (in ms) between two COLO checkpoints */
 #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
+#define DEFAULT_COLO_MIGRATE_RAM_THRESHOLD (100 * 1024 * 1024UL)
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
 #define DEFAULT_MIGRATE_MULTIFD_COMPRESSION MULTIFD_COMPRESSION_NONE
 /* 0: means nocompress, 1: best speed, ... 9: best compress ratio */
@@ -800,6 +801,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
**errp)
 params->downtime_limit = s->parameters.downtime_limit;
 params->has_x_checkpoi

Re: cluster_size got from backup_calculate_cluster_size()

2020-05-21 Thread Derek Su

On 2020/5/22 at 4:53 AM, Vladimir Sementsov-Ogievskiy wrote:

21.05.2020 21:19, John Snow wrote:



On 5/21/20 5:56 AM, Derek Su wrote:

Hi,

The cluster_size got from backup_calculate_cluster_size(),
MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
of the target image's cluster size.


Not regardless -- but it is using 64K as a minimum.


Right, 64K is a minimum. Thanks for the correction.






For example:

If the cluster size of source and target qcow2 images are both 16K, the
64K from backup_calculate_cluster_size() results in unwanted copies of
clusters.

The behavior slows down the backup (block-copy) process when the
source image receives lots of rand writes.


Are we talking about incremental, full, or top?


Our use case is MIRROR_SYNC_MODE_NONE (only copy new writes from source 
image to the target).







Is the following calculation reasonable for the above issue?


```
static int64_t backup_calculate_cluster_size(BlockDriverState *target,
  Error **errp)
{
 ...

 ret = bdrv_get_info(target, &bdi);

 ...

 return (bdi.cluster_size == 0 ?
 BACKUP_CLUSTER_SIZE_DEFAULT : bdi.cluster_size);
}

```

Thanks.
Regards,

Derek




Might be reasonable. When I made this the "default", it actually used to
just be "hardcoded." We could continue in this direction and go all the
way and turn it into a tune-able.

I didn't think to make it lower than 64K because I was thinking about
the linear, full backup case where cluster sizes that were too small
might slow down the loop with too much metadata.


Yes, currently the backup loop copies cluster-by-cluster, so if we allow 
clusters smaller than 64k, it may become slower (at least for full backup). 
Still, my work on backup performance is close to its end, and I hope it 
will be merged in 5.1. One of its effects is that the backup copying loop 
may copy more than one cluster at once, so this problem will be gone.


Keeping this in mind, I think we can allow smaller clusters now; in any 
case, small clusters are a rare special case.




(And the default qcow2 is 64K, so it seemed like a good choice at the 
time.)


We could create a default-cluster-size tunable, perhaps. It's at 64K
now, but perhaps you can override it down to 16 or lower. We could
possibly treat a value of 0 as "no minimum; use the smallest common 
size."


This is a separate issue entirely, but we may also wish to begin
offering a cluster-size override directly. In the case that we know this
value is unsafe, we reject the command. In the case that it's ambiguous
due to lack of info, we can defer to the user's judgment. This would
allow us to force the backup to run in cases where we presently abort
out of fear.

CCing Vladimir who has been working on the backup loop extensively.

--js






Yes, I agree that a minimum of 64K or larger is a good choice for 
the linear, full-backup case.


In the COLO application (https://wiki.qemu.org/Features/COLO), the 
primary VM (PVM)'s write requests are replicated to the secondary 
VM (SVM)'s secondary disk. The writes also trigger block copies that copy 
the old data on the secondary disk (source) to the hidden disk (target). 
However, the checkpoint mechanism empties the hidden disk periodically, 
so it is an endless backup (block-copy) process.


If the PVM write requests are random 4K blocks, the block copy on SVM 
becomes inefficient in the above use case.
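
As a rough illustration (assuming one copy per dirtied backup cluster): with 
a 16K qcow2 cluster and a 16K backup granularity, a single random 4K guest 
write copies 16K of old data to the hidden disk, while the 64K minimum 
granularity copies 64K, i.e. four times the copy traffic for the same write.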


I conducted some performance tests.
If the qcow2 cluster size is 16K, then compared with a 64K backup 
granularity, a 16K backup granularity decreases the seq. write 
performance by 10% and increases the rand. write performance 
by 40%.


I think a tunable default-cluster-size would be good for such applications.
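
A minimal sketch of such a tunable (illustrative only, not the actual QEMU 
implementation; min_cluster_size is a hypothetical knob where 0 means 
"no minimum"):

```
#include <stdint.h>

#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)   /* the current 64 KiB floor */

/*
 * target_cluster_size: cluster size reported for the target image, 0 if unknown
 * min_cluster_size:    user-tunable floor, 0 means "no minimum"
 */
int64_t backup_cluster_size(int64_t target_cluster_size,
                            int64_t min_cluster_size)
{
    if (min_cluster_size == 0) {
        /* No minimum: follow the target, fall back to the old default. */
        return target_cluster_size ? target_cluster_size
                                   : BACKUP_CLUSTER_SIZE_DEFAULT;
    }
    if (target_cluster_size == 0) {
        return min_cluster_size;
    }
    return target_cluster_size > min_cluster_size ? target_cluster_size
                                                  : min_cluster_size;
}
```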

Thanks.
Regards,
Derek



cluster_size got from backup_calculate_cluster_size()

2020-05-21 Thread Derek Su

Hi,

The cluster_size got from backup_calculate_cluster_size(),
MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
of the target image's cluster size.


For example:

If the cluster size of source and target qcow2 images are both 16K, the 
64K from backup_calculate_cluster_size() results in unwanted copies of 
clusters.


The behavior slows down the backup (block-copy) process when the
source image receives lots of rand writes.


Is the following calculation reasonable for the above issue?


```
static int64_t backup_calculate_cluster_size(BlockDriverState *target,
 Error **errp)
{
...

ret = bdrv_get_info(target, &bdi);

...

return (bdi.cluster_size == 0 ?
BACKUP_CLUSTER_SIZE_DEFAULT : bdi.cluster_size);
}

```

Thanks.
Regards,

Derek



Re: [PATCH v5 00/31] Add subcluster allocation to qcow2

2020-05-20 Thread Derek Su

Hi, Berto

Excuse me, I'd like to test v5, but I failed to apply the series to the 
master branch. Which commit can I use as the base?


Thanks.

Regards,
Derek

On 2020/5/6 at 1:38 AM, Alberto Garcia wrote:

Hi,

here's the new version of the patches to add subcluster allocation
support to qcow2.

Please refer to the cover letter of the first version for a full
description of the patches:

https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

Important changes here:

- I fixed hopefully all of the issues mentioned in the previous
   review. Thanks to everyone who contributed.

- There's now support for partial zeroing of clusters (i.e. at the
   subcluster level).

- Many more tests.

- QCOW_OFLAG_ZERO is simply ignored now and not considered a sign of a
   corrupt image anymore. I hesitated about this, but we could still
   add that check later. I think there's a case for adding a new
   QCOW2_CLUSTER_INVALID type and include this and other scenarios that
   we already consider corrupt (for example: clusters with unaligned
   offsets). We would need to see if for 'qemu-img check' adding
   QCOW2_CLUSTER_INVALID complicates things or not. But I think that
   all is material for its own series.

And I think that's all. See below for the detailed list of changes,
and thanks again for the feedback.

Berto

v5:
- Patch 01: Fix indentation [Max], add trace event [Vladimir]
- Patch 02: Add host_cluster_offset variable [Vladirmir]
- Patch 05: Have separate l2_entry and cluster_offset variables [Vladimir]
- Patch 06: Only context changes due to patch 05
- Patch 11: New patch
- Patch 13: Change documentation of get_l2_entry()
- Patch 14: Add QCOW_OFLAG_SUB_{ALLOC,ZERO}_RANGE [Eric] and rewrite
 the other macros.
 Ignore QCOW_OFLAG_ZERO on images with subclusters
 (i.e. don't treat them as corrupted).
- Patch 15: New patch
- Patch 19: Optimize cow by skipping all leading and trailing zero and
 unallocated subclusters [Vladimir]
 Return 0 on success [Vladimir]
 Squash patch that updated handle_dependencies() [Vladirmir]
- Patch 20: Call count_contiguous_subclusters() after the main switch
 in qcow2_get_host_offset() [Vladimir]
 Add assertion and remove goto statement [Vladimir]
- Patch 21: Rewrite algorithm.
- Patch 22: Rewrite algorithm.
- Patch 24: Replace loop with the _RANGE macros from patch 14 [Eric]
- Patch 27: New patch
- Patch 28: Update version number and expected output from tests.
- Patch 31: Add many more new tests

v4: https://lists.gnu.org/archive/html/qemu-block/2020-03/msg00966.html
v3: https://lists.gnu.org/archive/html/qemu-block/2019-12/msg00587.html
v2: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg01642.html
v1: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

Output of git backport-diff against v4:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/31:[0005] [FC] 'qcow2: Make Qcow2AioTask store the full host offset'
002/31:[0018] [FC] 'qcow2: Convert qcow2_get_cluster_offset() into 
qcow2_get_host_offset()'
003/31:[] [--] 'qcow2: Add calculate_l2_meta()'
004/31:[] [--] 'qcow2: Split cluster_needs_cow() out of 
count_cow_clusters()'
005/31:[0038] [FC] 'qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in 
handle_copied()'
006/31:[0004] [FC] 'qcow2: Add get_l2_entry() and set_l2_entry()'
007/31:[] [--] 'qcow2: Document the Extended L2 Entries feature'
008/31:[] [--] 'qcow2: Add dummy has_subclusters() function'
009/31:[] [--] 'qcow2: Add subcluster-related fields to BDRVQcow2State'
010/31:[] [--] 'qcow2: Add offset_to_sc_index()'
011/31:[down] 'qcow2: Add offset_into_subcluster() and size_to_subclusters()'
012/31:[] [--] 'qcow2: Add l2_entry_size()'
013/31:[0003] [FC] 'qcow2: Update get/set_l2_entry() and add 
get/set_l2_bitmap()'
014/31:[0023] [FC] 'qcow2: Add QCow2SubclusterType and 
qcow2_get_subcluster_type()'
015/31:[down] 'qcow2: Add qcow2_cluster_is_allocated()'
016/31:[] [--] 'qcow2: Add cluster type parameter to 
qcow2_get_host_offset()'
017/31:[] [--] 'qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*'
018/31:[] [--] 'qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC'
019/31:[0066] [FC] 'qcow2: Add subcluster support to calculate_l2_meta()'
020/31:[0022] [FC] 'qcow2: Add subcluster support to qcow2_get_host_offset()'
021/31:[0040] [FC] 'qcow2: Add subcluster support to zero_in_l2_slice()'
022/31:[0061] [FC] 'qcow2: Add subcluster support to discard_in_l2_slice()'
023/31:[] [--] 'qcow2: Add subcluster support to check_refcounts_l2()'
024/31:[0019] [FC] 'qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()'
025/31:[] [--] 'qcow2: Clear the L2 bitmap when allocating a compressed 
cluster'
026/31:[] [--] 'qcow2: Add subcluster 

Re: [PATCH v4 5/6] net/colo-compare.c, softmmu/vl.c: Check that colo-compare is active

2020-05-07 Thread Derek Su

On 2020/5/8 at 10:26 AM, Zhang, Chen wrote:




-Original Message-
From: Lukas Straub 
Sent: Thursday, May 7, 2020 11:54 PM
To: Zhang, Chen 
Cc: qemu-devel ; Li Zhijian
; Jason Wang ; Marc-
André Lureau ; Paolo Bonzini

Subject: Re: [PATCH v4 5/6] net/colo-compare.c, softmmu/vl.c: Check that
colo-compare is active

On Thu, 7 May 2020 11:38:04 +
"Zhang, Chen"  wrote:


-Original Message-
From: Lukas Straub 
Sent: Monday, May 4, 2020 6:28 PM
To: qemu-devel 
Cc: Zhang, Chen ; Li Zhijian
; Jason Wang ; Marc-
André Lureau ; Paolo Bonzini

Subject: [PATCH v4 5/6] net/colo-compare.c, softmmu/vl.c: Check that
colo- compare is active

If the colo-compare object is removed before failover and a
checkpoint happens, qemu crashes because it tries to lock the
destroyed event_mtx in colo_notify_compares_event.

Fix this by checking if everything is initialized by introducing a
new variable colo_compare_active which is protected by a new mutex
colo_compare_mutex. The new mutex also protects against concurrent
access of the net_compares list and makes sure that
colo_notify_compares_event isn't active while we destroy event_mtx
and event_complete_cond.

With this it also is again possible to use colo without colo-compare
(periodic
mode) and to use multiple colo-compare for multiple network interfaces.



Hi Lukas,

For this case I think we don't need to touch the vl.c code; we can solve this
issue from another perspective: how do we remove colo-compare?
The user will use a qemu-monitor or QMP command to disable an object, so we
just need to return an operation failure when the user tries to remove the
colo-compare object while COLO is running.

Yeah, but that still leaves the other problem that colo can't be used without
colo-compare (qemu crashes then).


Yes, colo-compare is a necessary module in the original COLO design.
In most cases, the user needs it to do dynamic sync.
For rare cases, maybe we can add a new colo-compare parameter to bypass all 
the network workload.


Hi, Chen

In our application, we only need the "periodic mode" because of the 
performance issue, and we have an internal patch for it now. Is it OK to 
send the internal patch for review?


Thanks.

Regards,
Derek



Thanks
Zhang Chen



Regards,
Lukas Straub


Thanks
Zhang Chen


Signed-off-by: Lukas Straub 
---
  net/colo-compare.c | 35 +--
  net/colo-compare.h |  1 +
  softmmu/vl.c   |  2 ++
  3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c index
56db3d3bfc..c7572d75e9 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -54,6 +54,8 @@ static NotifierList colo_compare_notifiers =
 #define REGULAR_PACKET_CHECK_MS 3000
 #define DEFAULT_TIME_OUT_MS 3000

+static QemuMutex colo_compare_mutex;
+static bool colo_compare_active;
  static QemuMutex event_mtx;
  static QemuCond event_complete_cond;
  static int event_unhandled_count;
@@ -906,6 +908,12 @@ static void check_old_packet_regular(void *opaque)
 void colo_notify_compares_event(void *opaque, int event, Error **errp)
 {
  CompareState *s;
+qemu_mutex_lock(&colo_compare_mutex);
+
+if (!colo_compare_active) {
+qemu_mutex_unlock(&colo_compare_mutex);
+return;
+}

  qemu_mutex_lock(&event_mtx);
  QTAILQ_FOREACH(s, &net_compares, next) {
@@ -919,6 +927,7 @@ void colo_notify_compares_event(void *opaque, int event, Error **errp)
  }

  qemu_mutex_unlock(&event_mtx);
+qemu_mutex_unlock(&colo_compare_mutex);
  }

  static void colo_compare_timer_init(CompareState *s)
@@ -1274,7 +1283,14 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)

 s->vnet_hdr);
  }

+qemu_mutex_lock(&colo_compare_mutex);
+if (!colo_compare_active) {
+qemu_mutex_init(&event_mtx);
+qemu_cond_init(&event_complete_cond);
+colo_compare_active = true;
+}
  QTAILQ_INSERT_TAIL(&net_compares, s, next);
+qemu_mutex_unlock(&colo_compare_mutex);

  s->out_sendco.s = s;
  s->out_sendco.chr = &s->chr_out;
@@ -1290,9 +1306,6 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)

  g_queue_init(&s->conn_list);

-qemu_mutex_init(&event_mtx);
-qemu_cond_init(&event_complete_cond);
-
  s->connection_track_table =
g_hash_table_new_full(connection_key_hash,
connection_key_equal,
g_free,
@@ -1384,12 +1397,19 @@ static void colo_compare_finalize(Object *obj)

  qemu_bh_delete(s->event_bh);

+qemu_mutex_lock(&colo_compare_mutex);
  QTAILQ_FOREACH(tmp, &net_compares, next) {
  if (tmp == s) {
  QTAILQ_REMOVE(&net_compares, s, next);
  break;
  }
  }
+if (QTAILQ_EMPTY(&net_compares)) {
+colo_compare_active = false;
+qemu_mutex_destroy(&event_mtx);
+qemu_cond_destroy(&event_complete_cond);
+}
+qemu_mutex_unlock(&colo_compare_mutex);

  AioContext *ctx = iothread_get_aio_context(s->iothread);
  

Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with the right AioContext

2020-04-23 Thread Derek Su

On 2020/4/23 at 3:29 PM, Zhang, Chen wrote:




-Original Message-
From: Lukas Straub 
Sent: Wednesday, April 22, 2020 5:40 PM
To: Zhang, Chen 
Cc: qemu-devel ; Li Zhijian
; Jason Wang ; Marc-
André Lureau ; Paolo Bonzini

Subject: Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with the right
AioContext

On Wed, 22 Apr 2020 09:03:00 +
"Zhang, Chen"  wrote:


-Original Message-
From: Lukas Straub 
Sent: Wednesday, April 22, 2020 4:43 PM
To: Zhang, Chen 
Cc: qemu-devel ; Li Zhijian
; Jason Wang ; Marc-
André Lureau ; Paolo Bonzini

Subject: Re: [PATCH 1/3] net/colo-compare.c: Create event_bh with
the right AioContext

On Wed, 22 Apr 2020 08:29:39 +
"Zhang, Chen"  wrote:


-Original Message-
From: Lukas Straub 
Sent: Thursday, April 9, 2020 2:34 AM
To: qemu-devel 
Cc: Zhang, Chen ; Li Zhijian
; Jason Wang ;
Marc- André Lureau ; Paolo Bonzini

Subject: [PATCH 1/3] net/colo-compare.c: Create event_bh with
the right AioContext

qemu_bh_new will set the bh to be executed in the main loop.
This causes problems as colo_compare_handle_event assumes that
it has exclusive access the queues, which are also accessed in
the iothread. It also assumes that it runs in a different thread
than the caller and takes the appropriate locks.

Create the bh with the AioContext of the iothread to fulfill
these assumptions.



Looks good for me, I assume it will increase performance. Do you
have

related data?

No, this fixes several crashes because the queues were accessed
concurrently from multiple threads. Sorry for my bad wording.


Can you describe some details about the crash? Step by step?
Maybe I can re-produce and test it for this patch.


There is no clear test case. For me the crashes happened after 1-20h of
runtime with lots of checkpoints (800ms) and some network traffic. The
coredump always showed that two threads were doing operations on the
queues simultaneously.
Unfortunately, I don't have the coredumps anymore.


OK. Although I have not encountered the problem you described,
I have tested this patch, and it looks to be running fine.

Reviewed-by: Zhang Chen 

Thanks
Zhang Chen



Hi,

I encountered a PVM crash caused by the race condition issue in v4.2.0.
Here is the coredump for reference.

```
warning: core file may not match specified executable file.
 Core was generated by `qemu-system-x86_64 -name source -enable-kvm 
-cpu core2duo -m 1024 -global kvm-a'.

 Program terminated with signal SIGSEGV, Segmentation fault.
 #0 0x55cb478bcd25 in qemu_hexdump (buf=0x0, fp=0x7f6e9122b680 
, prefix=0x55cb47a388f5 "colo-compare spkt", size=1514) 
at util/hexdump.c:34

 34 fprintf(fp, " %02x", (unsigned char)buf[b + i]);
 [Current thread is 1 (Thread 0x7f6da1ade700 (LWP 6119))]
 (gdb) where
 #0 0x55cb478bcd25 in qemu_hexdump (buf=0x0, fp=0x7f6e9122b680 
, prefix=0x55cb47a388f5 "colo-compare spkt", size=1514) 
at util/hexdump.c:34
 #1 0x55cb476fa1b5 in colo_compare_tcp (s=0x55cb496429f0, 
conn=0x7f6e78003e30) at net/colo-compare.c:462
 #2 0x55cb476fa8c1 in colo_compare_connection 
(opaque=0x7f6e78003e30, user_data=0x55cb496429f0) at net/colo-compare.c:687
 #3 0x55cb476fb4ab in compare_pri_rs_finalize 
(pri_rs=0x55cb49642b18) at net/colo-compare.c:1001
 #4 0x55cb476eb46f in net_fill_rstate (rs=0x55cb49642b18, 
buf=0x7f6da1add2c8 "", size=1064) at net/net.c:1764
 #5 0x55cb476faafa in compare_pri_chr_in (opaque=0x55cb496429f0, 
buf=0x7f6da1adc6f0 "`E˧V\210RT", size=4096) at net/colo-compare.c:776
 #6 0x55cb47815363 in qemu_chr_be_write_impl (s=0x55cb48c87ec0, 
buf=0x7f6da1adc6f0 "`E˧V\210RT", len=4096) at chardev/char.c:177
 #7 0x55cb478153c7 in qemu_chr_be_write (s=0x55cb48c87ec0, 
buf=0x7f6da1adc6f0 "`E˧V\210RT", len=4096) at chardev/char.c:189
 #8 0x55cb4781e002 in tcp_chr_read (chan=0x55cb48ef7690, 
cond=G_IO_IN, opaque=0x55cb48c87ec0) at chardev/char-socket.c:525
 #9 0x55cb47839655 in qio_channel_fd_source_dispatch 
(source=0x7f6e78002050, callback=0x55cb4781de53 , 
user_data=0x55cb48c87ec0) at io/channel-watch.c:84
 #10 0x7f6e950e1285 in g_main_context_dispatch () at 
/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0

 #11 0x7f6e950e1650 in () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
 #12 0x7f6e950e1962 in g_main_loop_run () at 
/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
 #13 0x55cb474920ae in iothread_run (opaque=0x55cb48c67f10) at 
iothread.c:82
 #14 0x55cb478a699d in qemu_thread_start (args=0x55cb498035d0) at 
util/qemu-thread-posix.c:519
 #15 0x7f6e912376db in start_thread (arg=0x7f6da1ade700) at 
pthread_create.c:463
 #16 0x7f6e90f6088f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 (gdb)
```

COLO works well after applying this patch in my tests.

Reviewed-by: Derek Su 
Tested-by: Derek Su 

Regards,
Derek








Regards,
Lukas Straub


Thanks
Zhang Chen



Regards,
Lukas Straub


Thanks
Zhang Chen


Signed-off-by: Lu

Re: [PATCH v4 00/30] Add subcluster allocation to qcow2

2020-04-20 Thread Derek Su

Hello,

This work is promising and interesting.
I'd like to try this new feature.
Could you please export a branch, because the patches cannot be applied to 
the current master?
Thanks.

Regards,
Derek


On 2020/3/18 at 2:15 AM, Alberto Garcia wrote:

Hi,

here's the new version of the patches to add subcluster allocation
support to qcow2.

Please refer to the cover letter of the first version for a full
description of the patches:

https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

I think that this version fixes all the problems pointed out by Max
and Eric during the review a couple of weeks ago. I also dropped the
RFC tag.

Berto

v4:
- Patch 01: New patch
- Patch 02: New patch
- Patch 05: Documentation updates [Eric]
- Patch 06: Fix rebase conflicts
- Patch 07: Change bit order in the subcluster allocation bitmap.
 Change incompatible bit number. [Max, Eric]
- Patch 09: Rename QCOW_MAX_SUBCLUSTERS_PER_CLUSTER to
 QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER [Eric]
- Patch 13: Change bit order in the subcluster allocation bitmap [Max, Eric]
 Add more documentation.
 Ignore the subcluster bitmap in the L2 entries of
 compressed clusters.
- Patch 14: New patch
- Patch 15: Update to work with the changes from patches 02 and 14.
- Patch 16: Update to work with the changes from patches 02 and 14.
- Patch 18: Update to work with the changes from patches 02 and 14.
 Update documentation.
 Fix return value on early exit.
- Patch 20: Make sure to clear the subcluster allocation bitmap when a
 cluster is unallocated.
- Patch 26: Update to work with the changes from patch 14.
- Patch 27: New patch [Max]
- Patch 28: Update version number, incompatible bit number and test
 expectations.
- Patch 30: Add new tests.
 Make the test verify its own results. [Max]

v3: https://lists.gnu.org/archive/html/qemu-block/2019-12/msg00587.html
- Patch 01: Rename host_offset to host_cluster_offset and make 'bytes'
 an unsigned int [Max]
- Patch 03: Rename cluster_needs_cow to cluster_needs_new_alloc and
 count_cow_clusters to count_single_write_clusters. Update
 documentation and add more assertions and checks [Max]
- Patch 09: Update qcow2_co_truncate() to properly support extended L2
 entries [Max]
- Patch 10: Forbid calling set_l2_bitmap() if the image does not have
 extended L2 entries [Max]
- Patch 11 (new): Add QCow2SubclusterType [Max]
- Patch 12 (new): Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*
- Patch 13 (new): Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
- Patch 14: Use QCow2SubclusterType instead of QCow2ClusterType [Max]
- Patch 15: Use QCow2SubclusterType instead of QCow2ClusterType [Max]
- Patch 19: Don't call set_l2_bitmap() if the image does not have
 extended L2 entries [Max]
- Patch 21: Use smaller data types.
- Patch 22: Don't call set_l2_bitmap() if the image does not have
 extended L2 entries [Max]
- Patch 23: Use smaller data types.
- Patch 25: Update test results and documentation. Move the check for
 the minimum subcluster size to validate_cluster_size().
- Patch 26 (new): Add subcluster support to qcow2_measure()
- Patch 27: Add more tests

v2: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg01642.html
- Patch 12: Update after the changes in 88f468e546.
- Patch 21 (new): Clear the L2 bitmap when allocating a compressed
   cluster. Compressed clusters should have the bitmap all set to 0.
- Patch 24: Document the new fields in the QAPI documentation [Eric].
- Patch 25: Allow qcow2 preallocation with backing files.
- Patch 26: Add some tests for qcow2 images with extended L2 entries.

v1: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

Output of git backport-diff against v3:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/30:[down] 'qcow2: Make Qcow2AioTask store the full host offset'
002/30:[down] 'qcow2: Convert qcow2_get_cluster_offset() into 
qcow2_get_host_offset()'
003/30:[] [-C] 'qcow2: Add calculate_l2_meta()'
004/30:[] [--] 'qcow2: Split cluster_needs_cow() out of 
count_cow_clusters()'
005/30:[0020] [FC] 'qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in 
handle_copied()'
006/30:[0010] [FC] 'qcow2: Add get_l2_entry() and set_l2_entry()'
007/30:[0020] [FC] 'qcow2: Document the Extended L2 Entries feature'
008/30:[] [--] 'qcow2: Add dummy has_subclusters() function'
009/30:[0004] [FC] 'qcow2: Add subcluster-related fields to BDRVQcow2State'
010/30:[] [--] 'qcow2: Add offset_to_sc_index()'
011/30:[] [-C] 'qcow2: Add l2_entry_size()'
012/30:[] [--] 'qcow2: Update get/set_l2_entry() and add 
get/set_l2_bitmap()'
013/30:[0046] [FC] 'qcow2: Add 

[PATCH v5 1/1] colo-compare: Fix memory leak in packet_enqueue()

2020-04-09 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Replace the error_report of full queue with a trace event.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 net/trace-events   |  1 +
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 10c0239f9d..035e11d4d3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -122,6 +122,10 @@ enum {
 SECONDARY_IN,
 };
 
+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};
 
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -217,6 +221,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;
 
 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -245,16 +250,18 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 }
 
 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+trace_colo_compare_drop_packet(colo_mode[mode],
+"queue size too big, drop packet");
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;
 
 return 0;
diff --git a/net/trace-events b/net/trace-events
index 02c13fd0ba..fa49c71533 100644
--- a/net/trace-events
+++ b/net/trace-events
@@ -12,6 +12,7 @@ colo_proxy_main(const char *chr) ": %s"
 
 # colo-compare.c
 colo_compare_main(const char *chr) ": %s"
+colo_compare_drop_packet(const char *queue, const char *chr) ": %s: %s"
 colo_compare_udp_miscompare(const char *sta, int size) ": %s = %d"
 colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
 colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, 
const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, 
spkt size = %d, ip_src = %s, ip_dst = %s"
-- 
2.17.1




[PATCH v5 0/1] COLO: handling of the full primary or secondary queue

2020-04-09 Thread Derek Su
The series is to handle the full primary or secondary queue in colo-compare.

Fix the "pkt" memory leak in packet_enqueue().
Reproduce steps:
1. Set up PVM and SVM, both with an e1000 NIC, by the steps described
   in the wiki qemu/COLO
2. Run "iperf3 -s" in PVM
3. Run "iperf3 -c  -t 7200" in client

The memory usage of qemu-system-x86_64 increases as the PVM's QMP
shows "qemu-system-x86_64: colo compare secondary queue size too big,
drop packet".

Please help to review, thanks.

V5:
 - Replace the error_report of full queue with a trace event
 - Remove handling of the full primary or secondary queue which hurt
   network throughput too much

V4:
 - Remove redundant flush of packets

V3:
 - handling of the full primary or secondary queue according to the
   suggestion from Zhang Chen
V2:
 - Fix incorrect patch format

Derek Su (1):
  colo-compare: Fix memory leak in packet_enqueue()

 net/colo-compare.c | 23 +++
 net/trace-events   |  1 +
 2 files changed, 16 insertions(+), 8 deletions(-)

-- 
2.17.1




Re: [PATCH v4 2/2] net/colo-compare.c: handling of the full primary or secondary queue

2020-04-09 Thread Derek Su
Hello, Zhang and Lukas

Sure. After my re-test, the performance is indeed hurt. I will update it later.

By the way, could I also move the error_report("colo compare
primary/secondary queue size too big, drop packet") to a trace event?
The use of error_report here is a little strange and floods the log.

May I also make "MAX_QUEUE_SIZE" user-configurable in this series?

Thanks,
Derek Su




Zhang, Chen wrote on Thursday, April 9, 2020 at 2:59 PM:
>
>
>
> > -Original Message-
> > From: Lukas Straub 
> > Sent: Thursday, April 9, 2020 3:19 AM
> > To: Derek Su 
> > Cc: qemu-devel@nongnu.org; lizhij...@cn.fujitsu.com; chy...@qnap.com;
> > jasow...@redhat.com; ctch...@qnap.com; Zhang, Chen
> > ; jwsu1...@gmail.com
> > Subject: Re: [PATCH v4 2/2] net/colo-compare.c: handling of the full primary
> > or secondary queue
> >
> > On Sat, 28 Mar 2020 20:46:46 +0800
> > Derek Su  wrote:
> >
> > > The pervious handling of the full primary or queue is only dropping
> > > the packet. If there are lots of clients to the guest VM, the "drop"
> > > will lead to the lost of the networking connection until next
> > > checkpoint.
> > >
> > > To address the issue, this patch drops the packet firstly.
> > > Then, do checkpoint and flush packets.
> > >
> > > Signed-off-by: Derek Su 
> >
> > Hello,
> > I had a look at this again and did some benchmarking.
> > First just qemu 5.0-rc1 with my bugfixes
> > (https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg01432.html),
> > then qemu 5.0-rc1 with my bugfixes and this patch series.
> >
> > This commit hurts performance too much:
> > Client-to-server bandwidth falls from ~45.9 Mbit/s to 22.9 Mbit/s.
> > Server-to-client bandwidth falls from ~6.3 Mbit/s to just ~674 Kbit/s.
> > Average latency rises from ~197ms to ~397ms.
> >
> > Meanwhile the packet loss without this commit is negligible, only 1-2 ping
> > packets got lost during each test run.
> >
> > Instead I think we should just turn the error message into a trace so it
> > doesn't flood the logs.
>
> We re-tested this patch, and Lukas is right.
> Sorry for the original idea; it looks like it did not show better performance in 
> the test.
> Agree with Lukas's comments. Derek, can you please change it?
>
> Thanks
> Zhang Chen
>
>
> >
> > Regards,
> > Lukas Straub



[PATCH v4 2/2] net/colo-compare.c: handling of the full primary or secondary queue

2020-03-28 Thread Derek Su
The previous handling of a full primary or secondary queue is only to
drop the packet. If there are lots of clients to the guest VM,
the "drop" will lead to the loss of the networking connection
until the next checkpoint.

To address the issue, this patch first drops the packet,
then requests a checkpoint and flushes the queued packets.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index cdd87b2aa8..fe8779cf2d 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -125,6 +125,12 @@ static const char *colo_mode[] = {
 [SECONDARY_IN] = "secondary",
 };
 
+enum {
+QUEUE_INSERT_ERR = -1,
+QUEUE_INSERT_OK = 0,
+QUEUE_INSERT_FULL = 1,
+};
+
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
 uint32_t size,
@@ -211,8 +217,10 @@ static int colo_insert_packet(GQueue *queue, Packet *pkt, 
uint32_t *max_ack)
 }
 
 /*
- * Return 0 on success, if return -1 means the pkt
- * is unsupported(arp and ipv6) and will be sent later
+ * Return QUEUE_INSERT_OK on success.
+ * If return QUEUE_INSERT_FULL means list is full, and
+ * QUEUE_INSERT_ERR means the pkt is unsupported(arp and ipv6) and
+ * will be sent later
  */
 static int packet_enqueue(CompareState *s, int mode, Connection **con)
 {
@@ -234,7 +242,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 if (parse_packet_early(pkt)) {
 packet_destroy(pkt, NULL);
 pkt = NULL;
-return -1;
+return QUEUE_INSERT_ERR;
 }
 fill_connection_key(pkt, &key);
 
@@ -258,11 +266,12 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
  "drop packet", colo_mode[mode]);
 packet_destroy(pkt, NULL);
 pkt = NULL;
+return QUEUE_INSERT_FULL;
 }
 
 *con = conn;
 
-return 0;
+return QUEUE_INSERT_OK;
 }
 
 static inline bool after(uint32_t seq1, uint32_t seq2)
@@ -995,17 +1004,21 @@ static void compare_pri_rs_finalize(SocketReadState 
*pri_rs)
 {
 CompareState *s = container_of(pri_rs, CompareState, pri_rs);
 Connection *conn = NULL;
+int ret;
 
-if (packet_enqueue(s, PRIMARY_IN, &conn)) {
+ret = packet_enqueue(s, PRIMARY_IN, &conn);
+if (ret == QUEUE_INSERT_OK) {
+/* compare packet in the specified connection */
+colo_compare_connection(conn, s);
+} else if (ret == QUEUE_INSERT_FULL) {
+colo_compare_inconsistency_notify(s);
+} else {
 trace_colo_compare_main("primary: unsupported packet in");
 compare_chr_send(s,
  pri_rs->buf,
  pri_rs->packet_len,
  pri_rs->vnet_hdr_len,
  false);
-} else {
-/* compare packet in the specified connection */
-colo_compare_connection(conn, s);
 }
 }
 
@@ -1013,12 +1026,16 @@ static void compare_sec_rs_finalize(SocketReadState 
*sec_rs)
 {
 CompareState *s = container_of(sec_rs, CompareState, sec_rs);
 Connection *conn = NULL;
+int ret;
 
-if (packet_enqueue(s, SECONDARY_IN, &conn)) {
-trace_colo_compare_main("secondary: unsupported packet in");
-} else {
+ret = packet_enqueue(s, SECONDARY_IN, &conn);
+if (ret == QUEUE_INSERT_OK) {
 /* compare packet in the specified connection */
 colo_compare_connection(conn, s);
+} else if (ret == QUEUE_INSERT_FULL) {
+colo_compare_inconsistency_notify(s);
+} else {
+trace_colo_compare_main("secondary: unsupported packet in");
 }
 }
 
-- 
2.17.1




[PATCH v4 0/2] COLO: handling of the full primary or secondary queue

2020-03-28 Thread Derek Su
The series is to handle the full primary or secondary queue in colo-compare.

(1) fix the "pkt" memory leak in packet_enqueue().
Reproduce steps:
1. Setup PVM and SVM both with NIC e1000 by the steps described
   in the wiki qemu/COLO
2. Run "iperf3 -s" in PVM
3. Run "iperf3 -c  -t 7200" in client

The memory usage of qemu-system-x86_64 increases as the PVM's QMP 
shows "qemu-system-x86_64: colo compare secondary queue size too big,
drop packet".

(2) The previous handling of a full primary or secondary queue is only to
drop the packet. If there are lots of clients to the guest VM,
the "drop" will lead to the loss of the networking connection
until the next checkpoint, and therefore degrades service responsiveness.

This patch first drops the packet. Then it sends all queued primary
packets, removes all queued secondary packets (flushes the packets) and
requests a checkpoint.
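
The core of the handling, as it appears in patch 2/2 earlier in this archive,
is the QUEUE_INSERT_FULL branch (the explicit flush was dropped in v4 as
redundant, since the checkpoint itself flushes the queued packets):

```
} else if (ret == QUEUE_INSERT_FULL) {
    /* queue full: ask the COLO frame for a checkpoint */
    colo_compare_inconsistency_notify(s);
}
```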

Please help to review, thanks.

V4:
 - Remove redundant flush of packets

V3:
 - handling of the full primary or secondary queue according to the
   suggestion from Zhang Chen
V2:
 - Fix incorrect patch format

Derek Su (2):
  net/colo-compare.c: Fix memory leak in packet_enqueue()
  net/colo-compare.c: handling of the full primary or secondary queue

 net/colo-compare.c | 62 --
 1 file changed, 43 insertions(+), 19 deletions(-)

-- 
2.17.1




[PATCH v4 1/2] net/colo-compare.c: Fix memory leak in packet_enqueue()

2020-03-28 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ee17f2cf8..cdd87b2aa8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -120,6 +120,10 @@ enum {
 SECONDARY_IN,
 };
 
+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};
 
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -215,6 +219,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;
 
 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -243,16 +248,18 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 }
 
 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+error_report("colo compare %s queue size too big,"
+ "drop packet", colo_mode[mode]);
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;
 
 return 0;
-- 
2.17.1




Re: [PATCH v3 2/2] net/colo-compare.c: handling of the full primary or secondary queue

2020-03-27 Thread Derek Su
Lukas Straub  於 2020年3月28日 週六 上午2:28寫道:
>
> On Sat, 28 Mar 2020 02:20:21 +0800
> Derek Su  wrote:
>
> > Lukas Straub  於 2020年3月28日 週六 上午1:46寫道:
> > >
> > > On Wed, 25 Mar 2020 17:43:54 +0800
> > > Derek Su  wrote:
> > >
> > > > The pervious handling of the full primary or queue is only dropping
> > > > the packet. If there are lots of clients to the guest VM,
> > > > the "drop" will lead to the lost of the networking connection
> > > > until next checkpoint.
> > > >
> > > > To address the issue, this patch drops the packet firstly.
> > > > Then, send all queued primary packets, remove all queued secondary
> > > > packets and do checkpoint.
> > > >
> > > > Signed-off-by: Derek Su 
> > > > ---
> > > >  net/colo-compare.c | 41 ++---
> > > >  1 file changed, 30 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > > index cdd87b2aa8..1a52f50fbe 100644
> > > > --- a/net/colo-compare.c
> > > > +++ b/net/colo-compare.c
> > > > @@ -125,6 +125,12 @@ static const char *colo_mode[] = {
> > > >  [SECONDARY_IN] = "secondary",
> > > >  };
> > > >
> > > > +enum {
> > > > +QUEUE_INSERT_ERR = -1,
> > > > +QUEUE_INSERT_OK = 0,
> > > > +QUEUE_INSERT_FULL = 1,
> > > > +};
> > > > +
> > > >  static int compare_chr_send(CompareState *s,
> > > >  const uint8_t *buf,
> > > >  uint32_t size,
> > > > @@ -211,8 +217,10 @@ static int colo_insert_packet(GQueue *queue, 
> > > > Packet *pkt, uint32_t *max_ack)
> > > >  }
> > > >
> > > >  /*
> > > > - * Return 0 on success, if return -1 means the pkt
> > > > - * is unsupported(arp and ipv6) and will be sent later
> > > > + * Return QUEUE_INSERT_OK on success.
> > > > + * If return QUEUE_INSERT_FULL means list is full, and
> > > > + * QUEUE_INSERT_ERR means the pkt is unsupported(arp and ipv6) and
> > > > + * will be sent later
> > > >   */
> > > >  static int packet_enqueue(CompareState *s, int mode, Connection **con)
> > > >  {
> > > > @@ -234,7 +242,7 @@ static int packet_enqueue(CompareState *s, int 
> > > > mode, Connection **con)
> > > >  if (parse_packet_early(pkt)) {
> > > >  packet_destroy(pkt, NULL);
> > > >  pkt = NULL;
> > > > -return -1;
> > > > +return QUEUE_INSERT_ERR;
> > > >  }
> > > >  fill_connection_key(pkt, );
> > > >
> > > > @@ -258,11 +266,12 @@ static int packet_enqueue(CompareState *s, int 
> > > > mode, Connection **con)
> > > >   "drop packet", colo_mode[mode]);
> > > >  packet_destroy(pkt, NULL);
> > > >  pkt = NULL;
> > > > +return QUEUE_INSERT_FULL;
> > > >  }
> > > >
> > > >  *con = conn;
> > > >
> > > > -return 0;
> > > > +return QUEUE_INSERT_OK;
> > > >  }
> > > >
> > > >  static inline bool after(uint32_t seq1, uint32_t seq2)
> > > > @@ -995,17 +1004,22 @@ static void 
> > > > compare_pri_rs_finalize(SocketReadState *pri_rs)
> > > >  {
> > > >  CompareState *s = container_of(pri_rs, CompareState, pri_rs);
> > > >  Connection *conn = NULL;
> > > > +int ret;
> > > >
> > > > -if (packet_enqueue(s, PRIMARY_IN, )) {
> > > > +ret = packet_enqueue(s, PRIMARY_IN, );
> > > > +if (ret == QUEUE_INSERT_OK) {
> > > > +/* compare packet in the specified connection */
> > > > +colo_compare_connection(conn, s);
> > > > +} else if (ret == QUEUE_INSERT_FULL) {
> > > > +g_queue_foreach(>conn_list, colo_flush_packets, s);
> > > > +colo_compare_inconsistency_notify(s);
> > > > +} else {
> > > >  trace_colo_compare_main("primary: unsupported packet in");
> > > >  compare_chr_send(s,
> > > >   pri_rs->buf,
> > > >   pri_

Re: [PATCH v3 2/2] net/colo-compare.c: handling of the full primary or secondary queue

2020-03-27 Thread Derek Su
Lukas Straub  於 2020年3月28日 週六 上午1:46寫道:
>
> On Wed, 25 Mar 2020 17:43:54 +0800
> Derek Su  wrote:
>
> > The pervious handling of the full primary or queue is only dropping
> > the packet. If there are lots of clients to the guest VM,
> > the "drop" will lead to the lost of the networking connection
> > until next checkpoint.
> >
> > To address the issue, this patch drops the packet firstly.
> > Then, send all queued primary packets, remove all queued secondary
> > packets and do checkpoint.
> >
> > Signed-off-by: Derek Su 
> > ---
> >  net/colo-compare.c | 41 ++---
> >  1 file changed, 30 insertions(+), 11 deletions(-)
> >
> > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > index cdd87b2aa8..1a52f50fbe 100644
> > --- a/net/colo-compare.c
> > +++ b/net/colo-compare.c
> > @@ -125,6 +125,12 @@ static const char *colo_mode[] = {
> >  [SECONDARY_IN] = "secondary",
> >  };
> >
> > +enum {
> > +QUEUE_INSERT_ERR = -1,
> > +QUEUE_INSERT_OK = 0,
> > +QUEUE_INSERT_FULL = 1,
> > +};
> > +
> >  static int compare_chr_send(CompareState *s,
> >  const uint8_t *buf,
> >  uint32_t size,
> > @@ -211,8 +217,10 @@ static int colo_insert_packet(GQueue *queue, Packet 
> > *pkt, uint32_t *max_ack)
> >  }
> >
> >  /*
> > - * Return 0 on success, if return -1 means the pkt
> > - * is unsupported(arp and ipv6) and will be sent later
> > + * Return QUEUE_INSERT_OK on success.
> > + * If return QUEUE_INSERT_FULL means list is full, and
> > + * QUEUE_INSERT_ERR means the pkt is unsupported(arp and ipv6) and
> > + * will be sent later
> >   */
> >  static int packet_enqueue(CompareState *s, int mode, Connection **con)
> >  {
> > @@ -234,7 +242,7 @@ static int packet_enqueue(CompareState *s, int mode, 
> > Connection **con)
> >  if (parse_packet_early(pkt)) {
> >  packet_destroy(pkt, NULL);
> >  pkt = NULL;
> > -return -1;
> > +return QUEUE_INSERT_ERR;
> >  }
> >  fill_connection_key(pkt, );
> >
> > @@ -258,11 +266,12 @@ static int packet_enqueue(CompareState *s, int mode, 
> > Connection **con)
> >   "drop packet", colo_mode[mode]);
> >  packet_destroy(pkt, NULL);
> >  pkt = NULL;
> > +return QUEUE_INSERT_FULL;
> >  }
> >
> >  *con = conn;
> >
> > -return 0;
> > +return QUEUE_INSERT_OK;
> >  }
> >
> >  static inline bool after(uint32_t seq1, uint32_t seq2)
> > @@ -995,17 +1004,22 @@ static void compare_pri_rs_finalize(SocketReadState 
> > *pri_rs)
> >  {
> >  CompareState *s = container_of(pri_rs, CompareState, pri_rs);
> >  Connection *conn = NULL;
> > +int ret;
> >
> > -if (packet_enqueue(s, PRIMARY_IN, )) {
> > +ret = packet_enqueue(s, PRIMARY_IN, );
> > +if (ret == QUEUE_INSERT_OK) {
> > +/* compare packet in the specified connection */
> > +colo_compare_connection(conn, s);
> > +} else if (ret == QUEUE_INSERT_FULL) {
> > +g_queue_foreach(>conn_list, colo_flush_packets, s);
> > +colo_compare_inconsistency_notify(s);
> > +} else {
> >  trace_colo_compare_main("primary: unsupported packet in");
> >  compare_chr_send(s,
> >   pri_rs->buf,
> >   pri_rs->packet_len,
> >   pri_rs->vnet_hdr_len,
> >   false);
> > -} else {
> > -/* compare packet in the specified connection */
> > -colo_compare_connection(conn, s);
> >  }
> >  }
> >
> > @@ -1013,12 +1027,17 @@ static void compare_sec_rs_finalize(SocketReadState 
> > *sec_rs)
> >  {
> >  CompareState *s = container_of(sec_rs, CompareState, sec_rs);
> >  Connection *conn = NULL;
> > +int ret;
> >
> > -if (packet_enqueue(s, SECONDARY_IN, )) {
> > -trace_colo_compare_main("secondary: unsupported packet in");
> > -} else {
> > +ret = packet_enqueue(s, SECONDARY_IN, );
> > +if (ret == QUEUE_INSERT_OK) {
> >  /* compare packet in the specified connection */
> >  colo_compare_connection(conn, s);
> > +} else if (ret == QUEUE_INSERT_FULL) {
> > +g_queue_foreach(>conn_list, colo_flush_packets, s);
> > +colo_compare_inconsistency_notify(s);
> > +} else {
> > +trace_colo_compare_main("secondary: unsupported packet in");
> >  }
> >  }
> >
>
> Hi,
> I don't think we have to flush here because the (post-)checkpoint event will 
> flush the packets for us.
>

Hi,
Yes, the periodic checkpoint can flush the packets.

But if many clients send many packets to the VM,
there is a high probability that packets are dropped because the
primary/secondary queues are always full.
That causes lots of retransmissions between the clients and the VM and
deteriorates the service response to the clients.
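
To put rough numbers on it (assuming colo-compare's default MAX_QUEUE_SIZE of
1024 packets, which is my reading of colo_insert_packet(), and ~1500-byte
frames; both are assumptions on my side):

    1 Gbit/s / (1500 B * 8 bit) ~= 83,000 packets/s
    1024 packets / 83,000 packets/s ~= 12 ms

So under a sustained iperf3 load the queues are full again roughly 12 ms after
each flush, and almost every packet arriving after that is dropped until the
next checkpoint.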

Sincerely,
Derek Su

> Regards,
> Lukas Straub



[PATCH v3 2/2] net/colo-compare.c: handling of the full primary or secondary queue

2020-03-25 Thread Derek Su
The previous handling of a full primary or secondary queue is only to
drop the packet. If there are lots of clients to the guest VM,
the "drop" will lead to the loss of the networking connection
until the next checkpoint.

To address the issue, this patch first drops the packet.
Then it sends all queued primary packets, removes all queued secondary
packets and requests a checkpoint.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 41 ++---
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index cdd87b2aa8..1a52f50fbe 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -125,6 +125,12 @@ static const char *colo_mode[] = {
 [SECONDARY_IN] = "secondary",
 };
 
+enum {
+QUEUE_INSERT_ERR = -1,
+QUEUE_INSERT_OK = 0,
+QUEUE_INSERT_FULL = 1,
+};
+
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
 uint32_t size,
@@ -211,8 +217,10 @@ static int colo_insert_packet(GQueue *queue, Packet *pkt, 
uint32_t *max_ack)
 }
 
 /*
- * Return 0 on success, if return -1 means the pkt
- * is unsupported(arp and ipv6) and will be sent later
+ * Return QUEUE_INSERT_OK on success.
+ * If return QUEUE_INSERT_FULL means list is full, and
+ * QUEUE_INSERT_ERR means the pkt is unsupported(arp and ipv6) and
+ * will be sent later
  */
 static int packet_enqueue(CompareState *s, int mode, Connection **con)
 {
@@ -234,7 +242,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 if (parse_packet_early(pkt)) {
 packet_destroy(pkt, NULL);
 pkt = NULL;
-return -1;
+return QUEUE_INSERT_ERR;
 }
 fill_connection_key(pkt, &key);
 
@@ -258,11 +266,12 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
  "drop packet", colo_mode[mode]);
 packet_destroy(pkt, NULL);
 pkt = NULL;
+return QUEUE_INSERT_FULL;
 }
 
 *con = conn;
 
-return 0;
+return QUEUE_INSERT_OK;
 }
 
 static inline bool after(uint32_t seq1, uint32_t seq2)
@@ -995,17 +1004,22 @@ static void compare_pri_rs_finalize(SocketReadState 
*pri_rs)
 {
 CompareState *s = container_of(pri_rs, CompareState, pri_rs);
 Connection *conn = NULL;
+int ret;
 
-if (packet_enqueue(s, PRIMARY_IN, &conn)) {
+ret = packet_enqueue(s, PRIMARY_IN, &conn);
+if (ret == QUEUE_INSERT_OK) {
+/* compare packet in the specified connection */
+colo_compare_connection(conn, s);
+} else if (ret == QUEUE_INSERT_FULL) {
+g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+colo_compare_inconsistency_notify(s);
+} else {
 trace_colo_compare_main("primary: unsupported packet in");
 compare_chr_send(s,
  pri_rs->buf,
  pri_rs->packet_len,
  pri_rs->vnet_hdr_len,
  false);
-} else {
-/* compare packet in the specified connection */
-colo_compare_connection(conn, s);
 }
 }
 
@@ -1013,12 +1027,17 @@ static void compare_sec_rs_finalize(SocketReadState 
*sec_rs)
 {
 CompareState *s = container_of(sec_rs, CompareState, sec_rs);
 Connection *conn = NULL;
+int ret;
 
-if (packet_enqueue(s, SECONDARY_IN, &conn)) {
-trace_colo_compare_main("secondary: unsupported packet in");
-} else {
+ret = packet_enqueue(s, SECONDARY_IN, &conn);
+if (ret == QUEUE_INSERT_OK) {
 /* compare packet in the specified connection */
 colo_compare_connection(conn, s);
+} else if (ret == QUEUE_INSERT_FULL) {
+g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+colo_compare_inconsistency_notify(s);
+} else {
+trace_colo_compare_main("secondary: unsupported packet in");
 }
 }
 
-- 
2.17.1




[PATCH v3 1/2] net/colo-compare.c: Fix memory leak in packet_enqueue()

2020-03-25 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ee17f2cf8..cdd87b2aa8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -120,6 +120,10 @@ enum {
 SECONDARY_IN,
 };
 
+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};
 
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -215,6 +219,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;
 
 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -243,16 +248,18 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 }
 
 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+error_report("colo compare %s queue size too big,"
+ "drop packet", colo_mode[mode]);
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;
 
 return 0;
-- 
2.17.1




[PATCH v3 0/2] COLO: handling of the full primary or secondary queue

2020-03-25 Thread Derek Su
The series is to handle the full primary or secondary queue in colo-compare.

(1) fix the "pkt" memory leak in packet_enqueue().
Reproduce steps:
1. Setup PVM and SVM both with NIC e1000 by the steps described
   in the wiki qemu/COLO
2. Run "iperf3 -s" in PVM
3. Run "iperf3 -c  -t 7200" in client

The memory usage of qemu-system-x86_64 increases as the PVM's QMP 
shows "qemu-system-x86_64: colo compare secondary queue size too big,
drop packet".

(2) The previous handling of a full primary or secondary queue is only to
drop the packet. If there are lots of clients to the guest VM,
the "drop" will lead to the loss of the networking connection
until the next checkpoint, and therefore slows down the service.

This patch first drops the packet. Then it sends all queued primary
packets, removes all queued secondary packets and requests a checkpoint.

Please review, thanks.

V3:
 - handling of the full primary or secondary queue according to the
   suggestion from Zhang Chen
V2:
 - Fix incorrect patch format

Derek Su (2):
  net/colo-compare.c: Fix memory leak in packet_enqueue()
  net/colo-compare.c: handling of the full primary or secondary queue

 net/colo-compare.c | 64 --
 1 file changed, 45 insertions(+), 19 deletions(-)

-- 
2.17.1




Re: [PATCH v2 1/1] net/colo-compare.c: Fix memory leak in packet_enqueue()

2020-03-24 Thread Derek Su
Jing-Wei Su  於 2020年3月25日 週三 上午10:05寫道:
>
> Zhang, Chen  於 2020年3月25日 週三 上午9:37寫道:
> >
> >
> >
> > > -Original Message-
> > > From: Jing-Wei Su 
> > > Sent: Tuesday, March 24, 2020 10:47 AM
> > > To: Zhang, Chen 
> > > Cc: qemu-devel@nongnu.org; lizhij...@cn.fujitsu.com;
> > > jasow...@redhat.com; dere...@qnap.com
> > > Subject: Re: [PATCH v2 1/1] net/colo-compare.c: Fix memory leak in
> > > packet_enqueue()
> > >
> > > Zhang, Chen  於 2020年3月24日 週二 上午3:24
> > > 寫道:
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Derek Su 
> > > > > Sent: Monday, March 23, 2020 1:48 AM
> > > > > To: qemu-devel@nongnu.org
> > > > > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com;
> > > > > jasow...@redhat.com; dere...@qnap.com
> > > > > Subject: [PATCH v2 1/1] net/colo-compare.c: Fix memory leak in
> > > > > packet_enqueue()
> > > > >
> > > > > The patch is to fix the "pkt" memory leak in packet_enqueue().
> > > > > The allocated "pkt" needs to be freed if the colo compare primary or
> > > > > secondary queue is too big.
> > > >
> > > > Hi Derek,
> > > >
> > > > Thank you for the patch.
> > > > I re-thought this issue from a broader view; it looks like just freeing
> > > > the packet is not enough in this situation.
> > > > The root cause is that the network is too busy to compare, so the better
> > > > choice is to notify the COLO frame to do a checkpoint and clean up all
> > > > the network queues. This may decrease COLO network performance, but it
> > > > seems better than dropping lots of packets.
> > > >
> > > > Thanks
> > > > Zhang Chen
> > > >
> > >
> > > Hello, Zhang
> > >
> > > Got it.
> > > What is the concern of the massive "drop packets"?
> > > Does the behavior make the COLO do checkpoint periodically (~20 seconds)
> > > instead of doing immediate checkpoint when encountering different
> > > response packets?
> >
> > The concern with the "drop packets" is that the guest will lose its network
> > connection to most of the real clients until the next periodic forced
> > checkpoint. COLO is designed for dynamic checkpoint control, so I think doing
> > a checkpoint here will help the guest restore service faster.
> >
>
> I see.
> I'll update the patch with your suggestion later.
>

Hi, Zhang
Here is the idea and pseudo code to handle the "drop packet".

```
ret = packet_enqueue
(1) ret == 0
  compare connection
(2) ret == -1
  send packet
(3) ret == queue insertion fail
  do checkpoint
  send all queued primary packets
  remove all queued secondary packets
```
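
In C terms, the dispatch could look roughly like the sketch below for the
primary side. This is essentially the shape the v3 patch of this series ends
up with; the QUEUE_INSERT_* names come from that patch, so treat this as a
sketch of the idea rather than final code:

```
int ret = packet_enqueue(s, PRIMARY_IN, &conn);

if (ret == QUEUE_INSERT_OK) {
    /* (1) queued successfully: compare within the connection */
    colo_compare_connection(conn, s);
} else if (ret == QUEUE_INSERT_ERR) {
    /* (2) unsupported packet (e.g. ARP/IPv6): just forward it */
    compare_chr_send(s, pri_rs->buf, pri_rs->packet_len,
                     pri_rs->vnet_hdr_len, false);
} else { /* QUEUE_INSERT_FULL */
    /* (3) queue full: flush queued packets and request a checkpoint */
    g_queue_foreach(&s->conn_list, colo_flush_packets, s);
    colo_compare_inconsistency_notify(s);
}
```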

Do you have any comments on this handling?

Thanks
Derek

> > >
> > > It seems that frequent checkpoints caused by the full queue (busy
> > > network) instead of different
> > > response packets may harm the high speed network (10 Gbps or higher)
> > > performance dramatically.
> >
> > Yes, maybe I can send a patch to let users adjust the queue size depending
> > on their own environment.
> > But with a larger queue size, colo-compare will spend much more time comparing
> > packets when the network is really busy.
>
> Thank you. The user-configurable queue size will be very helpful.
>
> Thanks.
> Derek Su
>
> >
> > Thanks
> > Zhang Chen
> >
> > >
> > > Thanks
> > > Derek
> > >
> > > > >
> > > > > Signed-off-by: Derek Su 
> > > > > ---
> > > > >  net/colo-compare.c | 23 +++
> > > > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/net/colo-compare.c b/net/colo-compare.c index
> > > > > 7ee17f2cf8..cdd87b2aa8 100644
> > > > > --- a/net/colo-compare.c
> > > > > +++ b/net/colo-compare.c
> > > > > @@ -120,6 +120,10 @@ enum {
> > > > >  SECONDARY_IN,
> > > > >  };
> > > > >
> > > > > +static const char *colo_mode[] = {
> > > > > +[PRIMARY_IN] = "primary",
> > > > > +[SECONDARY_IN] = "secondary",
> > > > > +};
> > > > >
> > > > >  static 

[PATCH v2 0/1] COLO: Fix memory leak in packet_enqueue()

2020-03-22 Thread Derek Su
The patch is to fix the memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big to insert.

Reproduce steps:
(1) Setup PVM and SVM both with NIC e1000 by the steps described
in the wiki qemu/COLO
(2) Run "iperf3 -s" in PVM
(3) Run "iperfs -c  -t 7200"

The memory usage of qemu-system-x86_64 increases as the PVM's QMP 
shows "qemu-system-x86_64: colo compare secondary queue size too big,
drop packet".

Please review, thanks.

V2:
 - Fix incorrect patch format

Derek Su (1):
  net/colo-compare.c: Fix memory leak in packet_enqueue()

 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

-- 
2.17.1




[PATCH v2 1/1] net/colo-compare.c: Fix memory leak in packet_enqueue()

2020-03-22 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ee17f2cf8..cdd87b2aa8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -120,6 +120,10 @@ enum {
 SECONDARY_IN,
 };
 
+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};
 
 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -215,6 +219,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;
 
 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -243,16 +248,18 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 }
 
 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+error_report("colo compare %s queue size too big,"
+ "drop packet", colo_mode[mode]);
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;
 
 return 0;
-- 
2.17.1




[PATCH 1/1] net/colo-compare.c: Fix memory leak in packet_enqueue()

2020-03-22 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ee17f2cf8..cdd87b2aa8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -120,6 +120,10 @@ enum {
 SECONDARY_IN,
 };

+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};

 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -215,6 +219,7 @@ static int packet_enqueue(CompareState *s, int mode,
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;

 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -243,16 +248,18 @@ static int packet_enqueue(CompareState *s, int mode,
Connection **con)
 }

 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+error_report("colo compare %s queue size too big,"
+ "drop packet", colo_mode[mode]);
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;

 return 0;
-- 
2.17.1


[PATCH 0/1] COLO: Fix memory leak in packet_enqueue()

2020-03-22 Thread Derek Su
The patch is to fix the memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big to insert.

Reproduce steps:
(1) Setup PVM and SVM both with NIC e1000 by the steps described
in the wiki qemu/COLO
(2) Run "iperf3 -s" in PVM
(3) Run "iperfs -c  -t 7200"

The memory usage of qemu-system-x86_64 increases as
the PVM's QMP shows "qemu-system-x86_64: colo compare
secondary queue size too big,drop packet".


Derek Su (1):
  net/colo-compare.c: Fix memory leak in packet_enqueue()

 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

-- 
2.17.1


[PATCH 1/1] COLO: Fix memory leak in packet_enqueue()

2020-03-22 Thread Derek Su
The patch is to fix the "pkt" memory leak in packet_enqueue().
The allocated "pkt" needs to be freed if the colo compare
primary or secondary queue is too big.

Signed-off-by: Derek Su 
---
 net/colo-compare.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ee17f2cf8..cdd87b2aa8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -120,6 +120,10 @@ enum {
 SECONDARY_IN,
 };

+static const char *colo_mode[] = {
+[PRIMARY_IN] = "primary",
+[SECONDARY_IN] = "secondary",
+};

 static int compare_chr_send(CompareState *s,
 const uint8_t *buf,
@@ -215,6 +219,7 @@ static int packet_enqueue(CompareState *s, int mode,
Connection **con)
 ConnectionKey key;
 Packet *pkt = NULL;
 Connection *conn;
+int ret;

 if (mode == PRIMARY_IN) {
 pkt = packet_new(s->pri_rs.buf,
@@ -243,16 +248,18 @@ static int packet_enqueue(CompareState *s, int mode,
Connection **con)
 }

 if (mode == PRIMARY_IN) {
-if (!colo_insert_packet(&s->primary_list, pkt, &conn->pack)) {
-error_report("colo compare primary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->primary_list, pkt, &conn->pack);
 } else {
-if (!colo_insert_packet(&s->secondary_list, pkt, &conn->sack)) {
-error_report("colo compare secondary queue size too big,"
- "drop packet");
-}
+ret = colo_insert_packet(&s->secondary_list, pkt, &conn->sack);
 }
+
+if (!ret) {
+error_report("colo compare %s queue size too big,"
+ "drop packet", colo_mode[mode]);
+packet_destroy(pkt, NULL);
+pkt = NULL;
+}
+
 *con = conn;

 return 0;
-- 
2.17.1