Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
2016-09-01 13:46 GMT+08:00 Li, Liang Z:
>> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
>> (de)inflating & fast live migration
>>
>> 2016-08-08 14:35 GMT+08:00 Liang Li:
>> > This patch set contains two parts of changes to the virtio-balloon.
>> >
>> > One is the change for speeding up the inflating & deflating process,
>> > the main idea of this optimization is to use bitmap to send the page
>> > information to host instead of the PFNs, to reduce the overhead of
>> > virtio data transmission, address translation and madvise(). This can
>> > help to improve the performance by about 85%.
>> >
>> > Another change is for speeding up live migration. By skipping process
>> > guest's free pages in the first round of data copy, to reduce needless
>> > data processing, this can help to save quite a lot of CPU cycles and
>> > network bandwidth. We put guest's free page information in bitmap and
>> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
>> > guest, this can help to shorten the total live migration time from
>> > 2Sec to about 500ms in the 10Gbps network environment.
>>
>> I just read the slides of this feature for recent kvm forum, the cloud
>> providers more care about live migration downtime to avoid customers'
>> perception than total time, however, this feature will increase downtime
>> when acquire the benefit of reducing total time, maybe it will be more
>> acceptable if there is no downside for downtime.
>>
>> Regards,
>> Wanpeng Li
>
> In theory, there is no factor that will increase the downtime. There is no
> additional operation and no more data copy during the stop and copy stage.
> But in the test, the downtime increases and this can be reproduced. I think
> the busy network line maybe the reason for this. With this optimization, a
> huge amount of data is written to the socket in a shorter time, so some of
> the write operation may need to wait. Without this optimization, zero page
> checking takes more time, the network is not so busy.
>
> If the guest is not an idle one, I think the gap of the downtime will not so
> obvious. Anyway, the downtime is still less than the max_down_time set by
> the user.

http://www.linux-kvm.org/images/c/c3/03x06B-Liang_Li-Real_Time_and_Fast_Live_Migration_Update_for_NFV.pdf

The slides show almost the same percentage for idle and non-idle guests:
downtime increases by ~50% in both cases.

Regards,
Wanpeng Li
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
>
> 2016-08-08 14:35 GMT+08:00 Liang Li:
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use bitmap to send the page
> > information to host instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's free pages in the first round of data copy, to reduce needless
> > data processing, this can help to save quite a lot of CPU cycles and
> > network bandwidth. We put guest's free page information in bitmap and
> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
> > guest, this can help to shorten the total live migration time from
> > 2Sec to about 500ms in the 10Gbps network environment.
>
> I just read the slides of this feature for recent kvm forum, the cloud
> providers more care about live migration downtime to avoid customers'
> perception than total time, however, this feature will increase downtime
> when acquire the benefit of reducing total time, maybe it will be more
> acceptable if there is no downside for downtime.
>
> Regards,
> Wanpeng Li

In theory, there is no factor that should increase the downtime. There is no
additional operation and no extra data copy during the stop-and-copy stage.
But in testing, the downtime does increase, and this is reproducible. I think
the busy network link may be the reason. With this optimization, a huge
amount of data is written to the socket in a shorter time, so some of the
write operations may need to wait. Without this optimization, zero page
checking takes more time, so the network is not as busy.

If the guest is not an idle one, I think the gap in downtime will not be so
obvious. Anyway, the downtime is still less than the max_down_time set by
the user.

Thanks!
Liang
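The first-round skip described in the quoted cover letter can be illustrated with a small sketch. This is not QEMU's migration code; `first_round_pages` and the use of a Python integer as a bitmap are assumptions made only for illustration. It shows that pages the guest reports as unused are dropped from the first pass, while later passes (including stop-and-copy) are untouched, which matches the point that no extra work is added to the final stage.

```python
def first_round_pages(total_pages, unused_bitmap):
    """Return the PFNs to send in the first pass.

    unused_bitmap is an int whose bit N is set when the guest reported
    page N as unused (hypothetical representation for this sketch).
    """
    all_pages = (1 << total_pages) - 1
    # Clear every bit that the guest reported as unused.
    to_send = all_pages & ~unused_bitmap
    return [pfn for pfn in range(total_pages) if (to_send >> pfn) & 1]


# Toy guest with 8 pages, of which pages 4..7 are unused.
unused = 0b11110000
pages = first_round_pages(8, unused)
print(pages)
```

In this toy case only half of the pages are read and sent in the first pass; pages dirtied afterwards would still be picked up by the normal dirty tracking in later rounds.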
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
2016-08-08 14:35 GMT+08:00 Liang Li:
> This patch set contains two parts of changes to the virtio-balloon.
>
> One is the change for speeding up the inflating & deflating process,
> the main idea of this optimization is to use bitmap to send the page
> information to host instead of the PFNs, to reduce the overhead of
> virtio data transmission, address translation and madvise(). This can
> help to improve the performance by about 85%.
>
> Another change is for speeding up live migration. By skipping process
> guest's free pages in the first round of data copy, to reduce needless
> data processing, this can help to save quite a lot of CPU cycles and
> network bandwidth. We put guest's free page information in bitmap and
> send it to host with the virt queue of virtio-balloon. For an idle 8GB
> guest, this can help to shorten the total live migration time from 2Sec
> to about 500ms in the 10Gbps network environment.

I just read the slides on this feature from the recent KVM Forum. Cloud
providers care more about live migration downtime than about total time,
because downtime is what customers perceive. However, this feature increases
downtime in exchange for reducing total time. It may be more acceptable if
there were no downside for downtime.

Regards,
Wanpeng Li
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
Hi Michael,

I know you are very busy. If you have time, could you help to take a look at
this patch set?

Thanks!
Liang

> -----Original Message-----
> From: Li, Liang Z
> Sent: Thursday, August 18, 2016 9:06 AM
> To: Michael S. Tsirkin
> Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> virtio-d...@lists.oasis-open.org; k...@vger.kernel.org;
> qemu-devel@nongnu.org; quint...@redhat.com; dgilb...@redhat.com;
> Hansen, Dave; linux-ker...@vger.kernel.org
> Subject: RE: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
>
> Hi Michael,
>
> Could you help to review this version when you have time?
>
> Thanks!
> Liang
>
> > -----Original Message-----
> > From: Li, Liang Z
> > Sent: Monday, August 08, 2016 2:35 PM
> > To: linux-ker...@vger.kernel.org
> > Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> > virtio-d...@lists.oasis-open.org; k...@vger.kernel.org;
> > qemu-devel@nongnu.org; quint...@redhat.com; dgilb...@redhat.com;
> > Hansen, Dave; Li, Liang Z
> > Subject: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> > (de)inflating & fast live migration
> >
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use bitmap to send the page
> > information to host instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's free pages in the first round of data copy, to reduce needless
> > data processing, this can help to save quite a lot of CPU cycles and
> > network bandwidth. We put guest's free page information in bitmap and
> > send it to host with the virt queue of virtio-balloon. For an idle 8GB
> > guest, this can help to shorten the total live migration time from
> > 2Sec to about 500ms in the 10Gbps network environment.
> >
> > Dave Hansen suggested a new scheme to encode the data structure,
> > because of additional complexity, it's not implemented in v3.
> >
> > Changes from v2 to v3:
> > * Change the name of 'free page' to 'unused page'.
> > * Use the scatter & gather bitmap instead of a 1MB page bitmap.
> > * Fix overwriting the page bitmap after kicking.
> > * Some of MST's comments for v2.
> >
> > Changes from v1 to v2:
> > * Abandon the patch for dropping page cache.
> > * Put some structures to uapi head file.
> > * Use a new way to determine the page bitmap size.
> > * Use a unified way to send the free page information with the bitmap
> > * Address the issues referred in MST's comments
> >
> >
> > Liang Li (7):
> >   virtio-balloon: rework deflate to add page to a list
> >   virtio-balloon: define new feature bit and page bitmap head
> >   mm: add a function to get the max pfn
> >   virtio-balloon: speed up inflate/deflate process
> >   mm: add the related functions to get unused page
> >   virtio-balloon: define feature bit and head for misc virt queue
> >   virtio-balloon: tell host vm's unused page info
> >
> >  drivers/virtio/virtio_balloon.c     | 390
> >  include/linux/mm.h                  |   3 +
> >  include/uapi/linux/virtio_balloon.h |  41
> >  mm/page_alloc.c                     |  94 +
> >  4 files changed, 485 insertions(+), 43 deletions(-)
> >
> > --
> > 1.8.3.1
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
Hi Michael,

Could you help to review this version when you have time?

Thanks!
Liang

> -----Original Message-----
> From: Li, Liang Z
> Sent: Monday, August 08, 2016 2:35 PM
> To: linux-ker...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org; linux...@kvack.org;
> virtio-d...@lists.oasis-open.org; k...@vger.kernel.org;
> qemu-devel@nongnu.org; quint...@redhat.com; dgilb...@redhat.com;
> Hansen, Dave; Li, Liang Z
> Subject: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
>
> This patch set contains two parts of changes to the virtio-balloon.
>
> One is the change for speeding up the inflating & deflating process, the
> main idea of this optimization is to use bitmap to send the page
> information to host instead of the PFNs, to reduce the overhead of virtio
> data transmission, address translation and madvise(). This can help to
> improve the performance by about 85%.
>
> Another change is for speeding up live migration. By skipping process
> guest's free pages in the first round of data copy, to reduce needless
> data processing, this can help to save quite a lot of CPU cycles and
> network bandwidth. We put guest's free page information in bitmap and send
> it to host with the virt queue of virtio-balloon. For an idle 8GB guest,
> this can help to shorten the total live migration time from 2Sec to about
> 500ms in the 10Gbps network environment.
>
> Dave Hansen suggested a new scheme to encode the data structure,
> because of additional complexity, it's not implemented in v3.
>
> Changes from v2 to v3:
> * Change the name of 'free page' to 'unused page'.
> * Use the scatter & gather bitmap instead of a 1MB page bitmap.
> * Fix overwriting the page bitmap after kicking.
> * Some of MST's comments for v2.
>
> Changes from v1 to v2:
> * Abandon the patch for dropping page cache.
> * Put some structures to uapi head file.
> * Use a new way to determine the page bitmap size.
> * Use a unified way to send the free page information with the bitmap
> * Address the issues referred in MST's comments
>
>
> Liang Li (7):
>   virtio-balloon: rework deflate to add page to a list
>   virtio-balloon: define new feature bit and page bitmap head
>   mm: add a function to get the max pfn
>   virtio-balloon: speed up inflate/deflate process
>   mm: add the related functions to get unused page
>   virtio-balloon: define feature bit and head for misc virt queue
>   virtio-balloon: tell host vm's unused page info
>
>  drivers/virtio/virtio_balloon.c     | 390
>  include/linux/mm.h                  |   3 +
>  include/uapi/linux/virtio_balloon.h |  41
>  mm/page_alloc.c                     |  94 +
>  4 files changed, 485 insertions(+), 43 deletions(-)
>
> --
> 1.8.3.1
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
> Subject: Re: [PATCH v3 kernel 0/7] Extend virtio-balloon for fast
> (de)inflating & fast live migration
>
> On 08/07/2016 11:35 PM, Liang Li wrote:
> > Dave Hansen suggested a new scheme to encode the data structure,
> > because of additional complexity, it's not implemented in v3.
>
> FWIW, I don't think it takes any additional complexity here, at least in the
> guest implementation side. The thing I suggested would just mean explicitly
> calling out that there was a single bitmap instead of implying it in the ABI.
>
> Do you think the scheme I suggested is the way to go?

Yes, I think so, and I will do that in a later version. In this v3, I just
wanted to solve the issue caused by the large page bitmap in v2.

Liang
Re: [Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
On 08/07/2016 11:35 PM, Liang Li wrote:
> Dave Hansen suggested a new scheme to encode the data structure,
> because of additional complexity, it's not implemented in v3.

FWIW, I don't think it takes any additional complexity here, at least in the
guest implementation side. The thing I suggested would just mean explicitly
calling out that there was a single bitmap instead of implying it in the ABI.

Do you think the scheme I suggested is the way to go?
[Qemu-devel] [PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
This patch set contains two sets of changes to virtio-balloon.

The first speeds up the inflating & deflating process. The main idea of
this optimization is to use a bitmap to send the page information to the
host instead of the PFNs, to reduce the overhead of virtio data
transmission, address translation and madvise(). This can help to improve
the performance by about 85%.

The second speeds up live migration. By skipping the processing of the
guest's free pages in the first round of data copy, we reduce needless data
processing, which can save quite a lot of CPU cycles and network bandwidth.
We put the guest's free page information in a bitmap and send it to the host
with the virt queue of virtio-balloon. For an idle 8GB guest, this can help
to shorten the total live migration time from 2 s to about 500 ms in a
10Gbps network environment.

Dave Hansen suggested a new scheme to encode the data structure; because of
the additional complexity, it's not implemented in v3.

Changes from v2 to v3:
* Change the name of 'free page' to 'unused page'.
* Use the scatter & gather bitmap instead of a 1MB page bitmap.
* Fix overwriting the page bitmap after kicking.
* Address some of MST's comments on v2.

Changes from v1 to v2:
* Abandon the patch for dropping page cache.
* Move some structures to the uapi header file.
* Use a new way to determine the page bitmap size.
* Use a unified way to send the free page information with the bitmap.
* Address the issues raised in MST's comments.

Liang Li (7):
  virtio-balloon: rework deflate to add page to a list
  virtio-balloon: define new feature bit and page bitmap head
  mm: add a function to get the max pfn
  virtio-balloon: speed up inflate/deflate process
  mm: add the related functions to get unused page
  virtio-balloon: define feature bit and head for misc virt queue
  virtio-balloon: tell host vm's unused page info

 drivers/virtio/virtio_balloon.c     | 390
 include/linux/mm.h                  |   3 +
 include/uapi/linux/virtio_balloon.h |  41
 mm/page_alloc.c                     |  94 +
 4 files changed, 485 insertions(+), 43 deletions(-)

--
1.8.3.1
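As a rough illustration of why a bitmap can be cheaper to transmit than a PFN list, here is a minimal sketch. It is not the encoding used by the patches; the helper names, the (base PFN, bitmap) layout and the 4-byte-per-PFN size are assumptions chosen only to make the comparison concrete.

```python
def pfns_to_bitmap(pfns):
    """Pack a non-empty collection of PFNs into (base_pfn, bitmap bytes).

    Bit N of the bitmap is set when page base_pfn + N is in the set.
    (Hypothetical layout for this sketch, not the patch set's ABI.)
    """
    base = min(pfns)
    span = max(pfns) - base + 1
    bitmap = bytearray((span + 7) // 8)
    for pfn in pfns:
        off = pfn - base
        bitmap[off // 8] |= 1 << (off % 8)
    return base, bytes(bitmap)


def bitmap_to_pfns(base, bitmap):
    """Host-side inverse: recover the sorted PFN list from the bitmap."""
    return [base + i * 8 + bit
            for i, byte in enumerate(bitmap)
            for bit in range(8)
            if byte & (1 << bit)]


# 256 contiguous pages starting at an arbitrary PFN.
pfns = list(range(0x1000, 0x1000 + 256))
base, bitmap = pfns_to_bitmap(pfns)

pfn_array_bytes = 4 * len(pfns)  # assumption: 4 bytes per PFN entry
bitmap_bytes = len(bitmap)
print(pfn_array_bytes, bitmap_bytes)
```

The saving depends on how dense the page set is: for sparse, widely scattered PFNs a single bitmap over the whole span can be larger than the list, which is presumably one reason the v3 changelog moves away from one large bitmap.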