Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
My sincere apologies for not replying sooner. This needs a rebase now. But let me have a look at it first.
Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
Ping. Hi David, what do you think of the live migration commit [PATCH RESEND v3 08/10] migration: Implement dirty-limit convergence algo?

On 2022/12/4 1:09, huang...@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇)
> [...]
Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
Ping?

On 2022/12/4 1:09, huang...@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇)
> [...]
[PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
From: Hyman Huang(黄勇)

v3(resend):
- fix the syntax error of the topic.

v3:
This version makes some modifications inspired by Peter and Markus, as follows:
1. Do the code cleanup in [PATCH v2 02/11] suggested by Markus.
2. Replace [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug:
   https://bugzilla.redhat.com/show_bug.cgi?id=2124756
3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period is an unstable parameter.
4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] as suggested by Peter:
   a. apply the blk_mig_bulk_active check before enabling dirty-limit
   b. drop the unhelpful check function before enabling dirty-limit
   c. change the migration_cancel logic to cancel dirty-limit only if the dirty-limit capability is turned on
   d. split out a code cleanup commit [PATCH v3 07/10] to adjust the check order before enabling auto-converge
5. Rename the observation indexes of dirty-limit live migration to make them easier to understand. Use the maximum throttle time of the vCPUs as "dirty-limit-throttle-time-per-full".
6. Fix some grammatical and spelling errors pointed out by Markus and enrich the documentation of the dirty-limit live migration observation indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full".
7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is the optimal value pointed out in the cover letter for that testing environment.
8. Drop the 2 guestperf test commits [PATCH v2 10/11] and [PATCH v2 11/11]; they will be posted as a standalone series in the future.

Sincere thanks to Peter and Markus for the passionate, efficient and careful comments and suggestions.

Please review.

Yong

v2:
This version makes a few modifications compared with version 1, as follows:
1. fix the overflow issue reported by Peter Maydell
2. add a parameter check for the hmp "set_vcpu_dirty_limit" command
3. fix the race between the dirty ring reaper thread and the QEMU main thread
4. add a migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit
5. add logic to forbid the hmp/qmp commands set_vcpu_dirty_limit and cancel_vcpu_dirty_limit during dirty-limit live migration when implementing the dirty-limit convergence algo
6. add a capability check to ensure auto-converge and dirty-limit are mutually exclusive
7. pre-check that the kvm dirty ring size is configured before setting the dirty-limit migrate parameter

A more comprehensive test was done compared with version 1.

The test environment is as follows:

a. Host hardware info:

   CPU: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
   CPU(s): 64
   On-line CPU(s) list: 0-63
   Thread(s) per core: 2
   Core(s) per socket: 16
   Socket(s): 2
   NUMA node(s): 2
   NUMA node0 CPU(s): 0-15,32-47
   NUMA node1 CPU(s): 16-31,48-63

   Memory: Hynix 503Gi

   Interface: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
   Speed: 1000Mb/s

b. Host software info:

   OS: ctyunos release 2
   Kernel: 4.19.90-2102.2.0.0066.ctl2.x86_64
   Libvirt baseline version: libvirt-6.9.0
   Qemu baseline version: qemu-5.0

c. VM scale:

   CPU: 4
   Memory: 4G

All the supplementary test data shown below are based on the above test environment.
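
For readers skimming the v2 changelog, the checks in items 4, 6 and 7 could be sketched roughly as below. This is only an illustration in Python, not the series' C code: the MigrationConfig class, its field names, the MB/s unit, and the "must be positive" bounds are all assumptions made for the example. Only the capability and parameter names and the 1000ms default come from this cover letter.

```python
from dataclasses import dataclass


@dataclass
class MigrationConfig:
    """Hypothetical container; fields mirror names in the cover letter, not QEMU structs."""
    auto_converge: bool = False
    dirty_limit: bool = False
    kvm_dirty_ring_size: int = 0              # assumption: 0 means "dirty ring not configured"
    x_vcpu_dirty_limit_period_ms: int = 1000  # default value stated for v3
    vcpu_dirty_limit_mbps: int = 1            # assumption: unit and default are illustrative


def check_config(cfg: MigrationConfig) -> None:
    """Raise ValueError if the configuration violates one of the sketched checks."""
    # Item 6: auto-converge and dirty-limit are mutually exclusive.
    if cfg.auto_converge and cfg.dirty_limit:
        raise ValueError("auto-converge and dirty-limit are mutually exclusive")
    # Item 7: dirty-limit needs the KVM dirty ring to be configured first.
    if cfg.dirty_limit and cfg.kvm_dirty_ring_size == 0:
        raise ValueError("dirty-limit requires the kvm dirty ring size to be configured")
    # Item 4: parameter checks (the exact valid ranges are assumed here).
    if cfg.x_vcpu_dirty_limit_period_ms <= 0:
        raise ValueError("x-vcpu-dirty-limit-period must be positive")
    if cfg.vcpu_dirty_limit_mbps <= 0:
        raise ValueError("vcpu-dirty-limit must be positive")


if __name__ == "__main__":
    check_config(MigrationConfig(dirty_limit=True, kvm_dirty_ring_size=4096))
    try:
        check_config(MigrationConfig(auto_converge=True, dirty_limit=True,
                                     kvm_dirty_ring_size=4096))
    except ValueError as err:
        print("rejected:", err)
```

Rejecting the bad combination up front, rather than during migration, matches the intent described in the changelog: the user learns about the conflict when setting the capability or parameter.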

In version 1, we posted test data from UnixBench as follows:

$ taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s
|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 32800  | 32786      | 25292         |
| whetstone-double    | 10326  | 10315      | 9847          |
| pipe                | 15442  | 15271      | 14506         |
| context1            | 7260   | 6235       | 4514          |
| spawn               | 3663   | 3317       | 3249          |
| syscall             | 4669   | 4667       | 3841          |
|---------------------+--------+------------+---------------|

In version 2, we post supplementary test data that does not use taskset and makes the scenario more general, as follows:

$ ./Run

per-vcpu data:
|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 2991   | 2902       | 1722          |
| whetstone-double    | 1018   | 1006       | 627           |
| Execl Throughput    | 955    | 320        | 660           |
| File Copy - 1       | 2362   | 805        | 1325