Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-09-01 Thread Wanpeng Li
2016-09-01 13:46 GMT+08:00 Li, Liang Z :
>> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
>> (de)inflating & fast live migration
>>
>> 2016-08-08 14:35 GMT+08:00 Liang Li :
>> > This patch set contains two parts of changes to the virtio-balloon.
>> >
>> > One is the change for speeding up the inflating & deflating process,
>> > the main idea of this optimization is to use bitmap to send the page
>> > information to host instead of the PFNs, to reduce the overhead of
>> > virtio data transmission, address translation and madvise(). This can
>> > help to improve the performance by about 85%.
>> >
>> > Another change is for speeding up live migration. By skipping process
>> > guest's free pages in the first round of data copy, to reduce needless
>> > data processing, this can help to save quite a lot of CPU cycles and
>> > network bandwidth. We put guest's free page information in bitmap and
>> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
>> > guest, this can help to shorten the total live migration time from
>> > 2Sec to about 500ms in the 10Gbps network environment.
>>
>> I just read the slides of this feature for recent kvm forum, the cloud
>> providers more care about live migration downtime to avoid customers'
>> perception than total time, however, this feature will increase downtime
>> when acquire the benefit of reducing total time, maybe it will be more
>> acceptable if there is no downside for downtime.
>>
>> Regards,
>> Wanpeng Li
>
> In theory, there is no factor that will increase the downtime. There is no
> additional operation and no more data copy during the stop and copy stage.
> But in the test, the downtime increases and this can be reproduced. I think
> the busy network line maybe the reason for this. With this optimization, a
> huge amount of data is written to the socket in a shorter time, so some of
> the write operation may need to wait. Without this optimization, zero page
> checking takes more time, the network is not so busy.
>
> If the guest is not an idle one, I think the gap of the downtime will not so
> obvious.  Anyway, the

http://www.linux-kvm.org/images/c/c3/03x06B-Liang_Li-Real_Time_and_Fast_Live_Migration_Update_for_NFV.pdf
The slides show almost the same percentage for idle and non-idle
guests: downtime increases by ~50% in both cases.

Regards,
Wanpeng Li



Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-31 Thread Li, Liang Z
> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
> 
> 2016-08-08 14:35 GMT+08:00 Liang Li :
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use bitmap to send the page
> > information to host instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's free pages in the first round of data copy, to reduce needless
> > data processing, this can help to save quite a lot of CPU cycles and
> > network bandwidth. We put guest's free page information in bitmap and
> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
> > guest, this can help to shorten the total live migration time from
> > 2Sec to about 500ms in the 10Gbps network environment.
> 
> I just read the slides of this feature for recent kvm forum, the cloud
> providers more care about live migration downtime to avoid customers'
> perception than total time, however, this feature will increase downtime
> when acquire the benefit of reducing total time, maybe it will be more
> acceptable if there is no downside for downtime.
> 
> Regards,
> Wanpeng Li

In theory, nothing in this patch set should increase the downtime: there is
no additional operation and no extra data copy during the stop-and-copy stage.
In testing, however, the downtime does increase, and the result is
reproducible. I think a busy network link may be the reason. With this
optimization, a large amount of data is written to the socket in a shorter
time, so some of the write operations may have to wait. Without it, zero-page
checking takes more time, so the network is less busy.

If the guest is not idle, I think the downtime gap will be less obvious.
In any case, the downtime is still less than the max_down_time set by the user.
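
As a rough illustration of how the downtime relates to the link speed and to
max_down_time, here is a simplified model in C. It is only a sketch, not the
QEMU code: the function names are hypothetical, and it assumes the downtime is
dominated by sending the remaining data at a known effective bandwidth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Estimated stop-and-copy time in milliseconds for the data that is
 * still pending, given the effective bandwidth in bytes per millisecond. */
static inline double estimated_downtime_ms(uint64_t pending_bytes,
                                           double bytes_per_ms)
{
    assert(bytes_per_ms > 0.0);
    return (double)pending_bytes / bytes_per_ms;
}

/* True when the pending data is expected to fit into the allowed
 * downtime, i.e. when the final stop-and-copy stage could start. */
static inline bool can_stop_and_copy(uint64_t pending_bytes,
                                     double bytes_per_ms,
                                     double max_downtime_ms)
{
    return estimated_downtime_ms(pending_bytes, bytes_per_ms)
           <= max_downtime_ms;
}
```

In this model, 125,000,000 pending bytes on a 10Gbps link (1,250,000 bytes per
millisecond) give an estimate of 100 ms. If writes to the socket have to wait,
the effective bandwidth is lower than the nominal one and the real downtime
grows accordingly.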

Thanks!
Liang


Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-31 Thread Wanpeng Li
2016-08-08 14:35 GMT+08:00 Liang Li :
> This patch set contains two parts of changes to the virtio-balloon.
>
> One is the change for speeding up the inflating & deflating process,
> the main idea of this optimization is to use bitmap to send the page
> information to host instead of the PFNs, to reduce the overhead of
> virtio data transmission, address translation and madvise(). This can
> help to improve the performance by about 85%.
>
> Another change is for speeding up live migration. By skipping process
> guest's free pages in the first round of data copy, to reduce needless
> data processing, this can help to save quite a lot of CPU cycles and
> network bandwidth. We put guest's free page information in bitmap and
> send it to host with the virt queue of virtio-balloon. For an idle 8GB
> guest, this can help to shorten the total live migration time from 2Sec
> to about 500ms in the 10Gbps network environment.

I just read the slides on this feature from the recent KVM Forum. Cloud
providers care more about live migration downtime than about total
migration time, because downtime is what customers notice. This feature,
however, increases downtime in exchange for reducing total time. It
would be more acceptable if there were no downside for downtime.

Regards,
Wanpeng Li



Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-31 Thread Li, Liang Z
Hi Michael,

I know you are very busy. If you have time, could you take a look at this
patch set?

Thanks!
Liang

> -----Original Message-----
> From: Li, Liang Z
> Sent: Thursday, August 18, 2016 9:06 AM
> To: Michael S. Tsirkin
> Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> virtio-d...@lists.oasis-open.org; k...@vger.kernel.org;
> qemu-devel@nongnu.org; quint...@redhat.com; dgilb...@redhat.com;
> Hansen, Dave; linux-ker...@vger.kernel.org
> Subject: RE: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
> 
> Hi Michael,
> 
> Could you help to review this version when you have time?
> 
> Thanks!
> Liang
> 
> > -----Original Message-----
> > From: Li, Liang Z
> > Sent: Monday, August 08, 2016 2:35 PM
> > To: linux-ker...@vger.kernel.org
> > Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> > virtio-d...@lists.oasis-open.org; k...@vger.kernel.org;
> > qemu-devel@nongnu.org; quint...@redhat.com; dgilb...@redhat.com;
> > Hansen, Dave; Li, Liang Z
> > Subject: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> > (de)inflating & fast live migration
> >
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use bitmap to send the page
> > information to host instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's free pages in the first round of data copy, to reduce needless
> > data processing, this can help to save quite a lot of CPU cycles and
> > network bandwidth. We put guest's free page information in bitmap and
> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
> > guest, this can help to shorten the total live migration time from
> > 2Sec to about 500ms in the 10Gbps network environment.
> >
> > Dave Hansen suggested a new scheme to encode the data structure,
> > because of additional complexity, it's not implemented in v3.
> >
> > Changes from v2 to v3:
> > * Change the name of 'free page' to 'unused page'.
> > * Use the scatter & gather bitmap instead of a 1MB page bitmap.
> > * Fix overwriting the page bitmap after kicking.
> > * Some of MST's comments for v2.
> >
> > Changes from v1 to v2:
> > * Abandon the patch for dropping page cache.
> > * Put some structures to uapi head file.
> > * Use a new way to determine the page bitmap size.
> > * Use a unified way to send the free page information with the bitmap
> > * Address the issues referred in MST's comments
> >
> >
> > Liang Li (7):
> >   virtio-balloon: rework deflate to add page to a list
> >   virtio-balloon: define new feature bit and page bitmap head
> >   mm: add a function to get the max pfn
> >   virtio-balloon: speed up inflate/deflate process
> >   mm: add the related functions to get unused page
> >   virtio-balloon: define feature bit and head for misc virt queue
> >   virtio-balloon: tell host vm's unused page info
> >
> >  drivers/virtio/virtio_balloon.c     | 390
> >  include/linux/mm.h                  |   3 +
> >  include/uapi/linux/virtio_balloon.h |  41
> >  mm/page_alloc.c                     |  94 +
> >  4 files changed, 485 insertions(+), 43 deletions(-)
> >
> > --
> > 1.8.3.1




Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-17 Thread Li, Liang Z
Hi Michael,

Could you review this version when you have time?

Thanks!
Liang

> -----Original Message-----
> From: Li, Liang Z
> Sent: Monday, August 08, 2016 2:35 PM
> To: linux-ker...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> virtio-d...@lists.oasis-open.org; k...@vger.kernel.org; qemu-devel@nongnu.org;
> quint...@redhat.com; dgilb...@redhat.com; Hansen, Dave; Li, Liang Z
> Subject: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating &
> fast live migration
> 
> This patch set contains two parts of changes to the virtio-balloon.
> 
> One is the change for speeding up the inflating & deflating process, the main
> idea of this optimization is to use bitmap to send the page information to
> host instead of the PFNs, to reduce the overhead of virtio data transmission,
> address translation and madvise(). This can help to improve the performance
> by about 85%.
> 
> Another change is for speeding up live migration. By skipping process guest's
> free pages in the first round of data copy, to reduce needless data
> processing, this can help to save quite a lot of CPU cycles and network
> bandwidth. We put guest's free page information in bitmap and send it to host
> with the virt queue of virtio-balloon. For an idle 8GB guest, this can help to
> shorten the total live migration time from 2Sec to about 500ms in the 10Gbps
> network environment.
> 
> Dave Hansen suggested a new scheme to encode the data structure,
> because of additional complexity, it's not implemented in v3.
> 
> Changes from v2 to v3:
> * Change the name of 'free page' to 'unused page'.
> * Use the scatter & gather bitmap instead of a 1MB page bitmap.
> * Fix overwriting the page bitmap after kicking.
> * Some of MST's comments for v2.
> 
> Changes from v1 to v2:
> * Abandon the patch for dropping page cache.
> * Put some structures to uapi head file.
> * Use a new way to determine the page bitmap size.
> * Use a unified way to send the free page information with the bitmap
> * Address the issues referred in MST's comments
> 
> 
> Liang Li (7):
>   virtio-balloon: rework deflate to add page to a list
>   virtio-balloon: define new feature bit and page bitmap head
>   mm: add a function to get the max pfn
>   virtio-balloon: speed up inflate/deflate process
>   mm: add the related functions to get unused page
>   virtio-balloon: define feature bit and head for misc virt queue
>   virtio-balloon: tell host vm's unused page info
> 
>  drivers/virtio/virtio_balloon.c     | 390
>  include/linux/mm.h                  |   3 +
>  include/uapi/linux/virtio_balloon.h |  41
>  mm/page_alloc.c                     |  94 +
>  4 files changed, 485 insertions(+), 43 deletions(-)
> 
> --
> 1.8.3.1




Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-08 Thread Li, Liang Z
> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
> 
> On 08/07/2016 11:35 PM, Liang Li wrote:
> > Dave Hansen suggested a new scheme to encode the data structure,
> > because of additional complexity, it's not implemented in v3.
> 
> FWIW, I don't think it takes any additional complexity here, at least in the
> guest implementation side.  The thing I suggested would just mean explicitly
> calling out that there was a single bitmap instead of implying it in the ABI.
> 
> Do you think the scheme I suggested is the way to go?

Yes, I think so, and I will do that in a later version. In v3, I just wanted
to solve the issue caused by the large page bitmap in v2.

Liang



Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-08 Thread Dave Hansen
On 08/07/2016 11:35 PM, Liang Li wrote:
> Dave Hansen suggested a new scheme to encode the data structure,
> because of additional complexity, it's not implemented in v3.

FWIW, I don't think it takes any additional complexity here, at least in
the guest implementation side.  The thing I suggested would just mean
explicitly calling out that there was a single bitmap instead of
implying it in the ABI.

Do you think the scheme I suggested is the way to go?



[Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-08-08 Thread Liang Li
This patch set contains two sets of changes to virtio-balloon.

The first speeds up the inflating & deflating process. The main idea of
this optimization is to send the page information to the host as a bitmap
instead of a list of PFNs, which reduces the overhead of virtio data
transmission, address translation and madvise(). This improves the
performance by about 85%.
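
To illustrate why a bitmap can be cheaper than a PFN list, here is a minimal
user-space sketch in C. It is not the code from this patch set: all helper
names are hypothetical, and it assumes 32-bit PFN entries for the list form
and a dense, contiguous PFN range starting at base_pfn for the bitmap form.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to hold nr_pages bits. */
static inline size_t bitmap_words(uint64_t nr_pages)
{
    return (size_t)((nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD);
}

/* Mark one page in a bitmap that covers PFNs starting at base_pfn. */
static inline void bitmap_set_pfn(unsigned long *bitmap, uint64_t base_pfn,
                                  uint64_t pfn)
{
    assert(pfn >= base_pfn);
    uint64_t off = pfn - base_pfn;
    bitmap[off / BITS_PER_WORD] |= 1UL << (off % BITS_PER_WORD);
}

/* Return 1 if the page is marked in the bitmap, 0 otherwise. */
static inline int bitmap_test_pfn(const unsigned long *bitmap,
                                  uint64_t base_pfn, uint64_t pfn)
{
    assert(pfn >= base_pfn);
    uint64_t off = pfn - base_pfn;
    return (int)((bitmap[off / BITS_PER_WORD] >> (off % BITS_PER_WORD)) & 1UL);
}

/* Bytes needed to describe nr_pages pages as an array of 32-bit PFNs. */
static inline size_t pfn_array_bytes(uint64_t nr_pages)
{
    return (size_t)(nr_pages * sizeof(uint32_t));
}

/* Bytes needed to describe a contiguous range of nr_pages as a bitmap. */
static inline size_t bitmap_bytes(uint64_t nr_pages)
{
    return bitmap_words(nr_pages) * sizeof(unsigned long);
}
```

Under these assumptions, describing 262144 pages (1GB of 4KB pages) takes 1MB
as a PFN array but only 32KB as a bitmap. For a sparse set of pages the
advantage of the bitmap shrinks, because the bitmap still has to cover the
whole range.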

The second speeds up live migration. Skipping the guest's free pages in
the first round of data copy avoids needless data processing and saves
quite a lot of CPU cycles and network bandwidth. We put the guest's free
page information in a bitmap and send it to the host through a virtqueue
of virtio-balloon. For an idle 8GB guest, this shortens the total live
migration time from 2 s to about 500 ms in a 10Gbps network environment.
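
The free page skipping can be pictured as a simple bitmap operation. The
following C sketch is an illustration only, not the QEMU or kernel code:
filter_free_pages and both bitmap names are hypothetical, and it assumes the
host keeps a bitmap of pages still to be sent, with one bit per guest page and
the same layout as the bitmap reported by the guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear every page that the guest reported as free from the set of pages
 * to send in the first round. Returns the number of pages that remain. */
static inline uint64_t filter_free_pages(unsigned long *to_send,
                                         const unsigned long *guest_free,
                                         size_t nr_words)
{
    uint64_t remaining = 0;

    assert(to_send && guest_free);
    for (size_t i = 0; i < nr_words; i++) {
        to_send[i] &= ~guest_free[i];
        /* Count the bits still set; each iteration clears the lowest one. */
        for (unsigned long w = to_send[i]; w; w &= w - 1)
            remaining++;
    }
    return remaining;
}
```

A page that the guest starts using after the bitmap was taken still has to be
sent in a later round. The sketch assumes that the usual dirty page tracking
during migration covers that case.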

Dave Hansen suggested a new scheme for encoding the data structure.
Because of the additional complexity, it is not implemented in v3.

Changes from v2 to v3:
* Change the name of 'free page' to 'unused page'.
* Use the scatter & gather bitmap instead of a 1MB page bitmap.
* Fix overwriting the page bitmap after kicking.
* Address some of MST's comments on v2.

Changes from v1 to v2:
* Abandon the patch for dropping page cache.
* Move some structures to the uapi header file.
* Use a new way to determine the page bitmap size.
* Use a unified way to send the free page information with the bitmap.
* Address the issues raised in MST's comments.


Liang Li (7):
  virtio-balloon: rework deflate to add page to a list
  virtio-balloon: define new feature bit and page bitmap head
  mm: add a function to get the max pfn
  virtio-balloon: speed up inflate/deflate process
  mm: add the related functions to get unused page
  virtio-balloon: define feature bit and head for misc virt queue
  virtio-balloon: tell host vm's unused page info

 drivers/virtio/virtio_balloon.c     | 390
 include/linux/mm.h                  |   3 +
 include/uapi/linux/virtio_balloon.h |  41
 mm/page_alloc.c                     |  94 +
 4 files changed, 485 insertions(+), 43 deletions(-)

-- 
1.8.3.1