Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration

Jay Zhou Wed, 17 May 2017 00:37:54 -0700


On 2017/5/17 13:47, Wanpeng Li wrote:

Hi Zhoujian,
2017-05-17 10:20 GMT+08:00 Zhoujian (jay) <jianjay.z...@huawei.com>:

Hi Wanpeng,

On 11/05/2017 14:07, Zhoujian (jay) wrote:

-        * Scan sptes if dirty logging has been stopped, dropping those
-        * which can be collapsed into a single large-page spte.  Later
-        * page faults will create the large-page sptes.
+        * Reset each vcpu's mmu, then page faults will create the large-page
+        * sptes later.
          */
         if ((change != KVM_MR_DELETE) &&
                 (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-               kvm_mmu_zap_collapsible_sptes(kvm, new);


This is an unlikely branch (taken only when guest live migration fails and the
guest continues to run on the source machine) rather than a hot path. Do you
have any performance numbers for your real workloads?


Sorry to bother you again.

Recently, I tested the performance before migration and after migration failure
using SPEC CPU2006 (https://www.spec.org/cpu2006/), which is a standard
performance evaluation tool.

These are the results:
******
     Before migration the score is 153, and the TLB miss statistics of the qemu process are:
     linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses,dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10

     Performance counter stats for process id '26463':

            698,938      dTLB-load-misses          #    0.13% of all dTLB cache hits   (50.46%)
        543,303,875      dTLB-loads                                                    (50.43%)
            199,597      dTLB-store-misses                                             (16.51%)
         60,128,561      dTLB-stores                                                   (16.67%)
             69,986      iTLB-load-misses          #    6.17% of all iTLB cache hits   (16.67%)
          1,134,097      iTLB-loads                                                    (33.33%)

       10.000684064 seconds time elapsed

     After migration failure the score is 149, and the TLB miss statistics of the qemu process are:
     linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses,dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10

     Performance counter stats for process id '26463':

            765,400      dTLB-load-misses          #    0.14% of all dTLB cache hits   (50.50%)
        540,972,144      dTLB-loads                                                    (50.47%)
            207,670      dTLB-store-misses                                             (16.50%)
         58,363,787      dTLB-stores                                                   (16.67%)
            109,772      iTLB-load-misses          #    9.52% of all iTLB cache hits   (16.67%)
          1,152,784      iTLB-loads                                                    (33.32%)

       10.000703078 seconds time elapsed
******
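
For reference, the miss rates that perf prints can be re-derived from the raw
counters of the two runs above. This is only a minimal Python sketch: it assumes
the rate is simply misses divided by loads, the helper name miss_rate is
illustrative and not part of any tool, and since the counters were multiplexed
(the percentages in parentheses) the values are estimates.

```python
# Raw counters copied from the two perf stat runs above (pid 26463).
before = {
    "dTLB-load-misses": 698_938,
    "dTLB-loads": 543_303_875,
    "iTLB-load-misses": 69_986,
    "iTLB-loads": 1_134_097,
}
after = {
    "dTLB-load-misses": 765_400,
    "dTLB-loads": 540_972_144,
    "iTLB-load-misses": 109_772,
    "iTLB-loads": 1_152_784,
}


def miss_rate(misses, accesses):
    """Return misses as a percentage of accesses (illustrative helper)."""
    return 100.0 * misses / accesses


for label, run in (("before migration", before), ("after failure", after)):
    dtlb = miss_rate(run["dTLB-load-misses"], run["dTLB-loads"])
    itlb = miss_rate(run["iTLB-load-misses"], run["iTLB-loads"])
    print(f"{label}: dTLB load miss {dtlb:.2f}%, iTLB load miss {itlb:.2f}%")
```

The dTLB load miss rate barely moves between the two runs, while the iTLB load
miss rate rises by a few percentage points.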


Could you comment out the original "lazy collapse small sptes into
large sptes" code in the function kvm_arch_commit_memory_region() and
post the results here?


  With the patch below,

diff --git a/source/x86/x86.c b/source/x86/x86.c
index 054a7d3..e0288d5 100644
--- a/source/x86/x86.c
+++ b/source/x86/x86.c
@@ -8548,10 +8548,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
         * which can be collapsed into a single large-page spte.  Later
         * page faults will create the large-page sptes.
         */
-       if ((change != KVM_MR_DELETE) &&
-               (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-               kvm_mmu_zap_collapsible_sptes(kvm, new);

        /*
         * Set up write protection and/or dirty logging for the new slot.

After migration failure the score is 148, and the TLB miss statistics of the qemu process are:
linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses,dTLB-stores,iTLB-load-misses,iTLB-loads -p 12432 sleep 10

 Performance counter stats for process id '12432':

          1,052,697      dTLB-load-misses          #    0.19% of all dTLB cache hits   (50.45%)
        551,828,702      dTLB-loads                                                    (50.46%)
            147,228      dTLB-store-misses                                             (16.55%)
         60,427,834      dTLB-stores                                                   (16.50%)
             93,793      iTLB-load-misses          #    7.43% of all iTLB cache hits   (16.67%)
          1,262,137      iTLB-loads                                                    (33.33%)

      10.000709900 seconds time elapsed

  Regards,
  Jay Zhou

Regards,
Wanpeng Li


These are the steps:
======
  (1) The kmod version is 4.4.11 (slightly modified) and the qemu version is
     2.6.0 (slightly modified). The kmod has the following patch applied,
     according to Paolo's advice:

diff --git a/source/x86/x86.c b/source/x86/x86.c
index 054a7d3..75a4bb3 100644
--- a/source/x86/x86.c
+++ b/source/x86/x86.c
@@ -8550,8 +8550,10 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
          */
         if ((change != KVM_MR_DELETE) &&
                 (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-               kvm_mmu_zap_collapsible_sptes(kvm, new);
+               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+               printk(KERN_ERR "zj make KVM_REQ_MMU_RELOAD request\n");
+               kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
+       }

         /*
          * Set up write protection and/or dirty logging for the new slot.

(2) I started a 10G VM (suse11sp3) with its memory preoccupied, meaning its
     "RES" column in top shows 10G, so that the EPT table is set up in advance.
(3) Then I ran the 429.mcf test case of SPEC CPU2006 before migration and after
     migration failure. 429.mcf is a memory-intensive workload, and the
     migration failure is constructed deliberately with the following qemu patch:

diff --git a/migration/migration.c b/migration/migration.c
index 5d725d0..88dfc59 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -625,6 +625,9 @@ static void process_incoming_migration_co(void *opaque)
                        MIGRATION_STATUS_ACTIVE);
      ret = qemu_loadvm_state(f);

+    // deliberately construct the migration failure
+    exit(EXIT_FAILURE);
+
      ps = postcopy_state_get();
      trace_process_incoming_migration_co_end(ret, ps);
      if (ps != POSTCOPY_INCOMING_NONE) {
======


The scores and TLB miss rates are almost the same, which confuses me.
May I ask which tool you use to evaluate the performance?
If my test steps are wrong, please let me know. Thank you.
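
To put "almost the same" into numbers, the three scores reported in this mail
can be compared against the pre-migration baseline. This is only a rough Python
sketch: the dictionary labels and the helper relative_change are illustrative,
and the labels assume the 149 run used the KVM_REQ_MMU_RELOAD patch from step
(1) while the 148 run had the kvm_mmu_zap_collapsible_sptes() call removed.

```python
# SPEC CPU2006 429.mcf scores as reported in this mail.
BASELINE = 153  # before migration
scores = {
    "after failure, KVM_REQ_MMU_RELOAD patch": 149,
    "after failure, zap call removed": 148,
}


def relative_change(baseline, value):
    """Return the change of value relative to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


for label, score in scores.items():
    print(f"{label}: {relative_change(BASELINE, score):+.2f}%")
```

Both variants land within a few percent of the baseline and within about one
point of each other.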

Regards,
Jay Zhou
