Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-16 Thread Boris Ostrovsky
On 1/16/19 10:29 AM, Juergen Gross wrote:
> On 16/01/2019 16:07, Boris Ostrovsky wrote:
>> On 1/16/19 9:33 AM, Juergen Gross wrote:
>>> On 16/01/2019 14:17, Boris Ostrovsky wrote:
 On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:

> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
> xen_have_vector_callback = 0;
> return;
> }
> -   pr_info("Xen HVM callback vector for event delivery is
> enabled\n");
> +   if (!silent)
> +   pr_info("Xen HVM callback vector for event
> delivery is enabled\n");
 How about replacing pr_info() with pr_info_once()?
>>> What a nice and simple idea!
>>>
>>> Extra patch or V4?
>>>
>>
>> I can add this while committing, I don't think it's worth a whole new patch.
>>
>> One outstanding question I have is whether anything needs to be added to
>> the commit message (Thomas had some questions)
> He didn't react to my explanation. I'm interpreting that as him being
> fine with my explanation, which I believe is not suitable to be added
> to the commit message.

OK. Applied to (now properly named) for-linus-5.0

-boris


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-16 Thread Juergen Gross
On 16/01/2019 16:07, Boris Ostrovsky wrote:
> On 1/16/19 9:33 AM, Juergen Gross wrote:
>> On 16/01/2019 14:17, Boris Ostrovsky wrote:
>>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>>>
 @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
 xen_have_vector_callback = 0;
 return;
 }
 -   pr_info("Xen HVM callback vector for event delivery is
 enabled\n");
 +   if (!silent)
 +   pr_info("Xen HVM callback vector for event
 delivery is enabled\n");
>>> How about replacing pr_info() with pr_info_once()?
>> What a nice and simple idea!
>>
>> Extra patch or V4?
>>
> 
> 
> I can add this while committing, I don't think it's worth a whole new patch.
> 
> One outstanding question I have is whether anything needs to be added to
> the commit message (Thomas had some questions)

He didn't react to my explanation. I'm interpreting that as him being
fine with my explanation, which I believe is not suitable to be added
to the commit message.


Juergen


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-16 Thread Boris Ostrovsky
On 1/16/19 9:33 AM, Juergen Gross wrote:
> On 16/01/2019 14:17, Boris Ostrovsky wrote:
>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>>
>>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>>> xen_have_vector_callback = 0;
>>> return;
>>> }
>>> -   pr_info("Xen HVM callback vector for event delivery is
>>> enabled\n");
>>> +   if (!silent)
>>> +   pr_info("Xen HVM callback vector for event
>>> delivery is enabled\n");
>> How about replacing pr_info() with pr_info_once()?
> What a nice and simple idea!
>
> Extra patch or V4?
>


I can add this while committing, I don't think it's worth a whole new patch.

One outstanding question I have is whether anything needs to be added to
the commit message (Thomas had some questions)


-boris


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-16 Thread Juergen Gross
On 16/01/2019 14:17, Boris Ostrovsky wrote:
> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
> 
>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>> xen_have_vector_callback = 0;
>> return;
>> }
>> -   pr_info("Xen HVM callback vector for event delivery is
>> enabled\n");
>> +   if (!silent)
>> +   pr_info("Xen HVM callback vector for event
>> delivery is enabled\n");
> 
> How about replacing pr_info() with pr_info_once()?

What a nice and simple idea!

Extra patch or V4?


Juergen


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-16 Thread Boris Ostrovsky
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:

> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
> xen_have_vector_callback = 0;
> return;
> }
> -   pr_info("Xen HVM callback vector for event delivery is
> enabled\n");
> +   if (!silent)
> +   pr_info("Xen HVM callback vector for event
> delivery is enabled\n");

How about replacing pr_info() with pr_info_once()?

-boris



Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-15 Thread Juergen Gross
On 16/01/2019 01:24, Hans van Kranenburg wrote:
> Hi,
> 
> On 1/14/19 1:44 PM, Juergen Gross wrote:
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) 
>> done.
>> [  187.251137] OOM killer disabled.
>> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 
>> seconds) done.
>> [  187.252299] suspending xenstore...
>> [  187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
>>
>> Fix that by setting xen_sched_clock_offset at resume time to ensure a
>> monotonic clock value.
>>
>> [...]
> 
> With v3 of the patch, I see the time jump in one log line happen, but
> only when using PVH.
> 
> [   49.486453] Freezing user space processes ... (elapsed 0.002 seconds)
> done.
> [   49.488743] OOM killer disabled.
> [   49.488764] Freezing remaining freezable tasks ... (elapsed 0.001
> seconds) done.
> [   49.491117] suspending xenstore...
> [2000731.388722] xen:events: Xen HVM callback vector for event delivery
> is enabled
> [   49.491750] xen:grant_table: Grant tables using version 1 layout
> [   49.810722] OOM killer enabled.
> [   49.810744] Restarting tasks ... done.
> [   49.856263] Setting capacity to 6291456
> [   50.006002] Setting capacity to 10485760
> 
> If I start as PV, it never seems to happen.
> 
> Up to you to decide how important this is. :)

We could do something like below. Boris?


Juergen
---
diff --git a/arch/x86/xen/suspend_hvm.c b/arch/x86/xen/suspend_hvm.c
index e666b614cf6d..088f3a6b4be9 100644
--- a/arch/x86/xen/suspend_hvm.c
+++ b/arch/x86/xen/suspend_hvm.c
@@ -13,6 +13,6 @@ void xen_hvm_post_suspend(int suspend_cancelled)
xen_hvm_init_shared_info();
xen_vcpu_restore();
}
-   xen_callback_vector();
+   xen_callback_vector(true);
xen_unplug_emulated_devices();
 }
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 0e60bd918695..ba293fda3265 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -55,7 +55,7 @@ void xen_enable_sysenter(void);
 void xen_enable_syscall(void);
 void xen_vcpu_restore(void);

-void xen_callback_vector(void);
+void xen_callback_vector(bool silent);
 void xen_hvm_init_shared_info(void);
 void xen_unplug_emulated_devices(void);

diff --git a/drivers/xen/events/events_base.c
b/drivers/xen/events/events_base.c
index 93194f3e7540..8d8d50bea215 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1637,7 +1637,7 @@ EXPORT_SYMBOL_GPL(xen_set_callback_via);
 /* Vector callbacks are better than PCI interrupts to receive event
  * channel notifications because we can receive vector callbacks on any
  * vcpu and we don't need PCI support or APIC interactions. */
-void xen_callback_vector(void)
+void xen_callback_vector(bool silent)
 {
int rc;
uint64_t callback_via;
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
xen_have_vector_callback = 0;
return;
}
-   pr_info("Xen HVM callback vector for event delivery is
enabled\n");
+   if (!silent)
+   pr_info("Xen HVM callback vector for event
delivery is enabled\n");
alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
xen_hvm_callback_vector);
}
 }
 #else
-void xen_callback_vector(void) {}
+void xen_callback_vector(bool silent) {}
 #endif

 #undef MODULE_PARAM_PREFIX
@@ -1692,7 +1693,7 @@ void __init xen_init_IRQ(void)
pci_xen_initial_domain();
}
if (xen_feature(XENFEAT_hvm_callback_vector))
-   xen_callback_vector();
+   xen_callback_vector(false);

if (xen_hvm_domain()) {
native_init_IRQ();


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-15 Thread Hans van Kranenburg
Hi Boris,

On 1/14/19 2:54 PM, Boris Ostrovsky wrote:
> On 1/14/19 7:44 AM, Juergen Gross wrote:
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) 
>> done.
>> [  187.251137] OOM killer disabled.
>> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 
>> seconds) done.
>> [  187.252299] suspending xenstore...
>> [  187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
>>
>> Fix that by setting xen_sched_clock_offset at resume time to ensure a
>> monotonic clock value.
>>
>> Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' 
>> sched_clock() interface")
>> Cc:  # 4.11
>> Reported-by: Hans van Kranenburg 
>> Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Boris Ostrovsky 

Can you please change the address to my work email?

Reported-by: Hans van Kranenburg 

Also (see other email):

Tested-by: Hans van Kranenburg 

Thanks,
Hans


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-15 Thread Hans van Kranenburg
Hi,

On 1/14/19 1:44 PM, Juergen Gross wrote:
> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
> 
> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [  187.251137] OOM killer disabled.
> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
> done.
> [  187.252299] suspending xenstore...
> [  187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216
> 
> Fix that by setting xen_sched_clock_offset at resume time to ensure a
> monotonic clock value.
> 
> [...]

With v3 of the patch, I see the time jump in one log line happen, but
only when using PVH.

[   49.486453] Freezing user space processes ... (elapsed 0.002 seconds)
done.
[   49.488743] OOM killer disabled.
[   49.488764] Freezing remaining freezable tasks ... (elapsed 0.001
seconds) done.
[   49.491117] suspending xenstore...
[2000731.388722] xen:events: Xen HVM callback vector for event delivery
is enabled
[   49.491750] xen:grant_table: Grant tables using version 1 layout
[   49.810722] OOM killer enabled.
[   49.810744] Restarting tasks ... done.
[   49.856263] Setting capacity to 6291456
[   50.006002] Setting capacity to 10485760

If I start as PV, it never seems to happen.

Up to you to decide how important this is. :)

FYI this is with v3 on top of the Debian stretch-backports 4.19 kernel,
which I'm starting to use now to reboot things with.

-# uname -a
Linux appnode-kylie 4.19.0-0.bpo.1-cloud-amd64 #1 SMP Debian
4.19.12-1~bpo9+1+mendix1 (2019-01-15) x86_64 GNU/Linux

https://salsa.debian.org/knorrie-guest/linux/commits/mendix/stretch-backports

Hans


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-15 Thread Juergen Gross
On 15/01/2019 11:43, Thomas Gleixner wrote:
> On Mon, 14 Jan 2019, Juergen Gross wrote:
> 
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) 
>> done.
>> [  187.251137] OOM killer disabled.
>> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 
>> seconds) done.
>> [  187.252299] suspending xenstore...
>> [  187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
> 
> I see that it's broken, but the changelog could do with an explanation WHY
> it broke.

This seems to be rather complex.

I believe the mentioned commit just ignored Xen guests resulting in a
"stable" clock where it shouldn't, but maybe I have missed another
aspect of this commit which is to blame. I tried to fix that by
replacing using_native_sched_clock() with a hypervisor specific pvops
function.

Unfortunately this didn't work, maybe due to other uses of
using_native_sched_clock() added by later patches. In the end it was
quite clear that updating the Xen clock offset was the right thing to
do, so I ended up with this patch.


Juergen


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-15 Thread Thomas Gleixner
On Mon, 14 Jan 2019, Juergen Gross wrote:

> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
> 
> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [  187.251137] OOM killer disabled.
> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
> done.
> [  187.252299] suspending xenstore...
> [  187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216

I see that it's broken, but the changelog could do with an explanation WHY
it broke.

Other than that this looks about right.

Thanks,

tglx


Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen

2019-01-14 Thread Boris Ostrovsky
On 1/14/19 7:44 AM, Juergen Gross wrote:
> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
>
> [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [  187.251137] OOM killer disabled.
> [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
> done.
> [  187.252299] suspending xenstore...
> [  187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216
>
> Fix that by setting xen_sched_clock_offset at resume time to ensure a
> monotonic clock value.
>
> Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' 
> sched_clock() interface")
> Cc:  # 4.11
> Reported-by: Hans van Kranenburg 
> Signed-off-by: Juergen Gross 

Reviewed-by: Boris Ostrovsky