Re: [Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-11-08 Thread Boris Ostrovsky
On 10/31/2017 09:46 PM, Dongli Zhang wrote:
> After guest live migration on xen, steal time in /proc/stat
> (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> xen_steal_lock() might be less than this_rq()->prev_steal_time which is
> derived from previous return value of xen_steal_clock().
>
> For instance, steal time of each vcpu is 335 before live migration.
>
> cpu  198 0 368 200064 1962 0 0 1340 0 0
> cpu0 38 0 81 50063 492 0 0 335 0 0
> cpu1 65 0 97 49763 634 0 0 335 0 0
> cpu2 38 0 81 50098 462 0 0 335 0 0
> cpu3 56 0 107 50138 374 0 0 335 0 0
>
> After live migration, steal time is reduced to 312.
>
> cpu  200 0 370 200330 1971 0 0 1248 0 0
> cpu0 38 0 82 50123 500 0 0 312 0 0
> cpu1 65 0 97 49832 634 0 0 312 0 0
> cpu2 39 0 82 50167 462 0 0 312 0 0
> cpu3 56 0 107 50207 374 0 0 312 0 0
>
> Since runstate times are cumulative and cleared during xen live migration
> by xen hypervisor, the idea of this patch is to accumulate runstate times
> to global percpu variables before live migration suspend. Once guest VM is
> resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
> runstate times and previously accumulated times stored in global percpu
> variables. Comments before the call of HYPERVISOR_suspend() has been
> removed as it is inaccurate. The call can return an error code (e.g.,
> possibly -EPERM in the future).
>
> Similar and more severe issue would impact prior linux 4.8-4.10 as
> discussed by Michael Las at
> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
> which would overflow steal time and lead to 100% st usage in top command
> for linux 4.8-4.10. A backport of this patch would fix that issue.
>
> References: 
> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> Signed-off-by: Dongli Zhang 

Applied to for-linus-4.15.

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-11-02 Thread Dongli Zhang
Hi Boris,

On 11/03/2017 04:28 AM, Boris Ostrovsky wrote:
> On 11/01/2017 09:19 PM, Dongli Zhang wrote:
>> Hi Boris,
>>
>> I have received from l...@intel.com that the prior version of patch hit issue
>> during compilation with aarch64-linux-gnu-gcc. I think this patch reviewed by
>> you would hit the same compiling issue on arm64 (there is no issue with 
>> x86_64).
>>
>> -
>>
>> 1st issue:
>>
>> Without including header  into driver/xen/time.c, compilation 
>> on
>> x86_64 works well (without any warning or error) but arm64 would hit the
>> following error:
>>
>> drivers/xen/time.c: In function ‘xen_manage_runstate_time’:
>> drivers/xen/time.c:94:20: error: implicit declaration of function
>> ‘kmalloc_array’ [-Werror=implicit-function-declaration]
>>runstate_delta = kmalloc_array(num_possible_cpus(),
>> ^
>>
>> drivers/xen/time.c:131:3: error: implicit declaration of function ‘kfree’
>> [-Werror=implicit-function-declaration]
>>kfree(runstate_delta);
>>^
>> cc1: some warnings being treated as errors
>>
>> About the 1st issue, should I submit a new patch including  or
>> just a incremental based on previous patch merged into your own branch
>> /tree?
>>
>> -
>>
>> 2nd issue:
>>
>> aarch64-linux-gnu-gcc expects a cast for kmalloc_array(). Is this really
>> necessary as I did find people casting the return type of
>> kmalloc/kcalloc/kmalloc_array in linux source code (e.g.,
>> drivers/block/virtio_blk.c). Can we just ignore this warning?
>>
>> drivers/xen/time.c:94:18: warning: assignment makes pointer from integer 
>> without
>> a cast [-Wint-conversion]
>>runstate_delta = kmalloc_array(num_possible_cpus(),
>>   ^
>> -
> 
> That's because you need to declare kmalloc_array(), otherwise the
> compiler by default assumes that it returns an int. So including
> linux/slab.h should take care of both warnings.
> 
> I can add it while committing.

Please help add it while committing. Thank you very much for your help!

> 
> 
> -boris
> 
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
> 

Dongli Zhang

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-11-02 Thread Boris Ostrovsky
On 11/01/2017 09:19 PM, Dongli Zhang wrote:
> Hi Boris,
>
> I have received from l...@intel.com that the prior version of patch hit issue
> during compilation with aarch64-linux-gnu-gcc. I think this patch reviewed by
> you would hit the same compiling issue on arm64 (there is no issue with 
> x86_64).
>
> -
>
> 1st issue:
>
> Without including header  into driver/xen/time.c, compilation on
> x86_64 works well (without any warning or error) but arm64 would hit the
> following error:
>
> drivers/xen/time.c: In function ‘xen_manage_runstate_time’:
> drivers/xen/time.c:94:20: error: implicit declaration of function
> ‘kmalloc_array’ [-Werror=implicit-function-declaration]
>runstate_delta = kmalloc_array(num_possible_cpus(),
> ^
>
> drivers/xen/time.c:131:3: error: implicit declaration of function ‘kfree’
> [-Werror=implicit-function-declaration]
>kfree(runstate_delta);
>^
> cc1: some warnings being treated as errors
>
> About the 1st issue, should I submit a new patch including  or
> just a incremental based on previous patch merged into your own branch
> /tree?
>
> -
>
> 2nd issue:
>
> aarch64-linux-gnu-gcc expects a cast for kmalloc_array(). Is this really
> necessary as I did find people casting the return type of
> kmalloc/kcalloc/kmalloc_array in linux source code (e.g.,
> drivers/block/virtio_blk.c). Can we just ignore this warning?
>
> drivers/xen/time.c:94:18: warning: assignment makes pointer from integer 
> without
> a cast [-Wint-conversion]
>runstate_delta = kmalloc_array(num_possible_cpus(),
>   ^
> -

That's because you need to declare kmalloc_array(), otherwise the
compiler by default assumes that it returns an int. So including
linux/slab.h should take care of both warnings.

I can add it while committing.


-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-11-01 Thread Dongli Zhang
Hi Boris,

I have received from l...@intel.com that the prior version of patch hit issue
during compilation with aarch64-linux-gnu-gcc. I think this patch reviewed by
you would hit the same compiling issue on arm64 (there is no issue with x86_64).

-

1st issue:

Without including header  into driver/xen/time.c, compilation on
x86_64 works well (without any warning or error) but arm64 would hit the
following error:

drivers/xen/time.c: In function ‘xen_manage_runstate_time’:
drivers/xen/time.c:94:20: error: implicit declaration of function
‘kmalloc_array’ [-Werror=implicit-function-declaration]
   runstate_delta = kmalloc_array(num_possible_cpus(),
^

drivers/xen/time.c:131:3: error: implicit declaration of function ‘kfree’
[-Werror=implicit-function-declaration]
   kfree(runstate_delta);
   ^
cc1: some warnings being treated as errors

About the 1st issue, should I submit a new patch including  or
just a incremental based on previous patch merged into your own branch
/tree?

-

2nd issue:

aarch64-linux-gnu-gcc expects a cast for kmalloc_array(). Is this really
necessary as I did find people casting the return type of
kmalloc/kcalloc/kmalloc_array in linux source code (e.g.,
drivers/block/virtio_blk.c). Can we just ignore this warning?

drivers/xen/time.c:94:18: warning: assignment makes pointer from integer without
a cast [-Wint-conversion]
   runstate_delta = kmalloc_array(num_possible_cpus(),
  ^
-


Thank you very much!

Dongli Zhang


On 11/02/2017 03:19 AM, Boris Ostrovsky wrote:
> On 10/31/2017 09:46 PM, Dongli Zhang wrote:
>> After guest live migration on xen, steal time in /proc/stat
>> (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
>> xen_steal_lock() might be less than this_rq()->prev_steal_time which is
>> derived from previous return value of xen_steal_clock().
>>
>> For instance, steal time of each vcpu is 335 before live migration.
>>
>> cpu  198 0 368 200064 1962 0 0 1340 0 0
>> cpu0 38 0 81 50063 492 0 0 335 0 0
>> cpu1 65 0 97 49763 634 0 0 335 0 0
>> cpu2 38 0 81 50098 462 0 0 335 0 0
>> cpu3 56 0 107 50138 374 0 0 335 0 0
>>
>> After live migration, steal time is reduced to 312.
>>
>> cpu  200 0 370 200330 1971 0 0 1248 0 0
>> cpu0 38 0 82 50123 500 0 0 312 0 0
>> cpu1 65 0 97 49832 634 0 0 312 0 0
>> cpu2 39 0 82 50167 462 0 0 312 0 0
>> cpu3 56 0 107 50207 374 0 0 312 0 0
>>
>> Since runstate times are cumulative and cleared during xen live migration
>> by xen hypervisor, the idea of this patch is to accumulate runstate times
>> to global percpu variables before live migration suspend. Once guest VM is
>> resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
>> runstate times and previously accumulated times stored in global percpu
>> variables. Comments before the call of HYPERVISOR_suspend() has been
>> removed as it is inaccurate. The call can return an error code (e.g.,
>> possibly -EPERM in the future).
> 
> I'd like split comment removal bit into a separate paragraph. I can do
> this when committing if you don't mind.
> 
>>
>> Similar and more severe issue would impact prior linux 4.8-4.10 as
>> discussed by Michael Las at
>> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
>> which would overflow steal time and lead to 100% st usage in top command
>> for linux 4.8-4.10. A backport of this patch would fix that issue.
>>
>> References: 
>> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
>> Signed-off-by: Dongli Zhang 
>>
>> ---
> 
> Reviewed-by: Boris Ostrovsky 
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-11-01 Thread Boris Ostrovsky
On 10/31/2017 09:46 PM, Dongli Zhang wrote:
> After guest live migration on xen, steal time in /proc/stat
> (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> xen_steal_lock() might be less than this_rq()->prev_steal_time which is
> derived from previous return value of xen_steal_clock().
>
> For instance, steal time of each vcpu is 335 before live migration.
>
> cpu  198 0 368 200064 1962 0 0 1340 0 0
> cpu0 38 0 81 50063 492 0 0 335 0 0
> cpu1 65 0 97 49763 634 0 0 335 0 0
> cpu2 38 0 81 50098 462 0 0 335 0 0
> cpu3 56 0 107 50138 374 0 0 335 0 0
>
> After live migration, steal time is reduced to 312.
>
> cpu  200 0 370 200330 1971 0 0 1248 0 0
> cpu0 38 0 82 50123 500 0 0 312 0 0
> cpu1 65 0 97 49832 634 0 0 312 0 0
> cpu2 39 0 82 50167 462 0 0 312 0 0
> cpu3 56 0 107 50207 374 0 0 312 0 0
>
> Since runstate times are cumulative and cleared during xen live migration
> by xen hypervisor, the idea of this patch is to accumulate runstate times
> to global percpu variables before live migration suspend. Once guest VM is
> resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
> runstate times and previously accumulated times stored in global percpu
> variables. Comments before the call of HYPERVISOR_suspend() has been
> removed as it is inaccurate. The call can return an error code (e.g.,
> possibly -EPERM in the future).

I'd like split comment removal bit into a separate paragraph. I can do
this when committing if you don't mind.

>
> Similar and more severe issue would impact prior linux 4.8-4.10 as
> discussed by Michael Las at
> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
> which would overflow steal time and lead to 100% st usage in top command
> for linux 4.8-4.10. A backport of this patch would fix that issue.
>
> References: 
> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> Signed-off-by: Dongli Zhang 
>
> ---

Reviewed-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v6 1/1] xen/time: do not decrease steal time after live migration on xen

2017-10-31 Thread Dongli Zhang
After guest live migration on xen, steal time in /proc/stat
(cpustat[CPUTIME_STEAL]) might decrease because steal returned by
xen_steal_lock() might be less than this_rq()->prev_steal_time which is
derived from previous return value of xen_steal_clock().

For instance, steal time of each vcpu is 335 before live migration.

cpu  198 0 368 200064 1962 0 0 1340 0 0
cpu0 38 0 81 50063 492 0 0 335 0 0
cpu1 65 0 97 49763 634 0 0 335 0 0
cpu2 38 0 81 50098 462 0 0 335 0 0
cpu3 56 0 107 50138 374 0 0 335 0 0

After live migration, steal time is reduced to 312.

cpu  200 0 370 200330 1971 0 0 1248 0 0
cpu0 38 0 82 50123 500 0 0 312 0 0
cpu1 65 0 97 49832 634 0 0 312 0 0
cpu2 39 0 82 50167 462 0 0 312 0 0
cpu3 56 0 107 50207 374 0 0 312 0 0

Since runstate times are cumulative and cleared during xen live migration
by xen hypervisor, the idea of this patch is to accumulate runstate times
to global percpu variables before live migration suspend. Once guest VM is
resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
runstate times and previously accumulated times stored in global percpu
variables. Comments before the call of HYPERVISOR_suspend() has been
removed as it is inaccurate. The call can return an error code (e.g.,
possibly -EPERM in the future).

Similar and more severe issue would impact prior linux 4.8-4.10 as
discussed by Michael Las at
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
which would overflow steal time and lead to 100% st usage in top command
for linux 4.8-4.10. A backport of this patch would fix that issue.

References: 
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
Signed-off-by: Dongli Zhang 

---
Changed since v1:
  * relocate modification to xen_get_runstate_snapshot_cpu

Changed since v2:
  * accumulate runstate times before live migration

Changed since v3:
  * do not accumulate times in the case of guest checkpointing

Changed since v4:
  * allocate array of vcpu_runstate_info to reduce number of memory allocation

Changed since v5:
  * remove old incorrect comments above hypercall and mention in commit message
  * rename xen_accumulate_runstate_time() to xen_manage_runstate_time()
  * move global static pointer into xen_manage_runstate_time
  * change warn and alert to pr_warn_once() or pr_warn()
  * change kcalloc to kmalloc_array
  * do not add RUNSTATE_max to change Xen ABI and use 4 in the code instead

---
 drivers/xen/manage.c  |  7 ++---
 drivers/xen/time.c| 71 +--
 include/xen/xen-ops.h |  1 +
 3 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index c425d03..8835065 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -72,18 +72,15 @@ static int xen_suspend(void *data)
}
 
gnttab_suspend();
+   xen_manage_runstate_time(-1);
xen_arch_pre_suspend();
 
-   /*
-* This hypercall returns 1 if suspend was cancelled
-* or the domain was merely checkpointed, and 0 if it
-* is resuming in a new domain.
-*/
si->cancelled = HYPERVISOR_suspend(xen_pv_domain()
? virt_to_gfn(xen_start_info)
: 0);
 
xen_arch_post_suspend(si->cancelled);
+   xen_manage_runstate_time(si->cancelled ? 1 : 0);
gnttab_resume();
 
if (!si->cancelled) {
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index ac5f23f..65a0b25 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -19,6 +19,8 @@
 /* runstate info updated by Xen */
 static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
 
+static DEFINE_PER_CPU(u64[4], old_runstate_time);
+
 /* return an consistent snapshot of 64-bit time/counter value */
 static u64 get64(const u64 *p)
 {
@@ -47,8 +49,8 @@ static u64 get64(const u64 *p)
return ret;
 }
 
-static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
- unsigned int cpu)
+static void xen_get_runstate_snapshot_cpu_delta(
+ struct vcpu_runstate_info *res, unsigned int cpu)
 {
u64 state_time;
struct vcpu_runstate_info *state;
@@ -66,6 +68,71 @@ static void xen_get_runstate_snapshot_cpu(struct 
vcpu_runstate_info *res,
 (state_time & XEN_RUNSTATE_UPDATE));
 }
 
+static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
+ unsigned int cpu)
+{
+   int i;
+
+   xen_get_runstate_snapshot_cpu_delta(res, cpu);
+
+   for (i = 0; i < 4; i++)
+   res->time[i] += per_cpu(old_runstate_time, cpu)[i];
+}
+
+void xen_manage_runstate_time(int action)
+{
+   static struct vcpu_runstate_info *runstate_delta;
+   struct vcpu_runstate_info state;
+   int cpu, i;
+