Re: [PATCH v4 2/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

2022-03-11 Thread Boris Ostrovsky



On 3/2/22 11:40 AM, Dongli Zhang wrote:

The sched_clock() can be used very early since commit 857baa87b642
("sched/clock: Enable sched clock early"). In addition, with commit
38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), kdump
kernel in Xen HVM guest may panic at very early stage when accessing
&__this_cpu_read(xen_vcpu)->time as in below:

setup_arch()
  -> init_hypervisor_platform()
  -> x86_init.hyper.init_platform = xen_hvm_guest_init()
  -> xen_hvm_init_time_ops()
  -> xen_clocksource_read()
  -> src = &__this_cpu_read(xen_vcpu)->time;

This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.

However, when Xen HVM guest panic on vcpu >= 32, since
xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.

This patch calls xen_hvm_init_time_ops() again later in
xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
registered when the boot vcpu is >= 32.

This issue can be reproduced on purpose via below command at the guest
side when kdump/kexec is enabled:

"taskset -c 33 echo c > /proc/sysrq-trigger"

The bugfix for PVM is not implemented due to the lack of testing
environment.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 



Reviewed-by: Boris Ostrovsky 


Applied to for-linus-5.18 (with return path adjusted)




Re: [PATCH v4 2/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

2022-03-02 Thread Dongli Zhang
Hi Boris,

On 3/2/22 6:11 PM, Boris Ostrovsky wrote:
> 
> On 3/2/22 7:31 PM, Dongli Zhang wrote:
>> Hi Boris,
>>
>> On 3/2/22 4:20 PM, Boris Ostrovsky wrote:
>>> On 3/2/22 11:40 AM, Dongli Zhang wrote:
    void __init xen_hvm_init_time_ops(void)
    {
 +    static bool hvm_time_initialized;
 +
 +    if (hvm_time_initialized)
 +    return;
 +
    /*
     * vector callback is needed otherwise we cannot receive interrupts
     * on cpu > 0 and at this point we don't know how many cpus are
     * available.
     */
    if (!xen_have_vector_callback)
 -    return;
 +    goto exit;
>>>
>>> Why not just return? Do we expect the value of xen_have_vector_callback to
>>> change?
>> I just want to keep above sync with 
>>
>>>
>>> -boris
>>>
>>>
      if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
    pr_info("Xen doesn't support pvclock on HVM, disable pv timer");
 +    goto exit;
 +    }
>> ... here.
>>
>> That is, I want the main logic of xen_hvm_init_time_ops() to run for at most
>> once. Both of above two if statements will "go to exit".
> 
> 
> I didn't notice this actually.
> 
> 
> I think both of them should return early, there is no reason to set
> hvm_time_initialized to true when, in fact, we have not initialized anything.
> And to avoid printing the warning twice we can just replace it with 
> pr_info_once().
> 
> 
> I can fix it up when committing so no need to resend. So unless you disagree

Thank you very much for fixing it during committing.

Dongli Zhang



Re: [PATCH v4 2/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

2022-03-02 Thread Boris Ostrovsky



On 3/2/22 7:31 PM, Dongli Zhang wrote:

Hi Boris,

On 3/2/22 4:20 PM, Boris Ostrovsky wrote:

On 3/2/22 11:40 AM, Dongli Zhang wrote:

   void __init xen_hvm_init_time_ops(void)
   {
+    static bool hvm_time_initialized;
+
+    if (hvm_time_initialized)
+    return;
+
   /*
    * vector callback is needed otherwise we cannot receive interrupts
    * on cpu > 0 and at this point we don't know how many cpus are
    * available.
    */
   if (!xen_have_vector_callback)
-    return;
+    goto exit;


Why not just return? Do we expect the value of xen_have_vector_callback to 
change?

I just want to keep above sync with 



-boris



     if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
   pr_info("Xen doesn't support pvclock on HVM, disable pv timer");
+    goto exit;
+    }

... here.

That is, I want the main logic of xen_hvm_init_time_ops() to run for at most
once. Both of above two if statements will "go to exit".



I didn't notice this actually.


I think both of them should return early, there is no reason to set 
hvm_time_initialized to true when, in fact, we have not initialized anything. 
And to avoid printing the warning twice we can just replace it with 
pr_info_once().


I can fix it up when committing so no need to resend. So unless you disagree


Reviewed-by: Boris Ostrovsky 



Thank you very much!

Dongli Zhang


+
+    /*
+ * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'.
+ * The __this_cpu_read(xen_vcpu) is still NULL when Xen HVM guest
+ * boots on vcpu >= MAX_VIRT_CPUS (e.g., kexec), To access
+ * __this_cpu_read(xen_vcpu) via xen_clocksource_read() will panic.
+ *
+ * The xen_hvm_init_time_ops() should be called again later after
+ * __this_cpu_read(xen_vcpu) is available.
+ */
+    if (!__this_cpu_read(xen_vcpu)) {
+    pr_info("Delay xen_init_time_common() as kernel is running on
vcpu=%d\n",
+    xen_vcpu_nr(0));
   return;
   }
   @@ -577,6 +597,9 @@ void __init xen_hvm_init_time_ops(void)
   x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
     x86_platform.set_wallclock = xen_set_wallclock;
+
+exit:
+    hvm_time_initialized = true;
   }
   #endif
   




Re: [PATCH v4 2/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

2022-03-02 Thread Dongli Zhang
Hi Boris,

On 3/2/22 4:20 PM, Boris Ostrovsky wrote:
> 
> On 3/2/22 11:40 AM, Dongli Zhang wrote:
>>   void __init xen_hvm_init_time_ops(void)
>>   {
>> +    static bool hvm_time_initialized;
>> +
>> +    if (hvm_time_initialized)
>> +    return;
>> +
>>   /*
>>    * vector callback is needed otherwise we cannot receive interrupts
>>    * on cpu > 0 and at this point we don't know how many cpus are
>>    * available.
>>    */
>>   if (!xen_have_vector_callback)
>> -    return;
>> +    goto exit;
> 
> 
> Why not just return? Do we expect the value of xen_have_vector_callback to 
> change?

I just want to keep above sync with 

> 
> 
> -boris
> 
> 
>>     if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
>>   pr_info("Xen doesn't support pvclock on HVM, disable pv timer");
>> +    goto exit;
>> +    }

... here.

That is, I want the main logic of xen_hvm_init_time_ops() to run for at most
once. Both of above two if statements will "go to exit".

Thank you very much!

Dongli Zhang

>> +
>> +    /*
>> + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'.
>> + * The __this_cpu_read(xen_vcpu) is still NULL when Xen HVM guest
>> + * boots on vcpu >= MAX_VIRT_CPUS (e.g., kexec), To access
>> + * __this_cpu_read(xen_vcpu) via xen_clocksource_read() will panic.
>> + *
>> + * The xen_hvm_init_time_ops() should be called again later after
>> + * __this_cpu_read(xen_vcpu) is available.
>> + */
>> +    if (!__this_cpu_read(xen_vcpu)) {
>> +    pr_info("Delay xen_init_time_common() as kernel is running on
>> vcpu=%d\n",
>> +    xen_vcpu_nr(0));
>>   return;
>>   }
>>   @@ -577,6 +597,9 @@ void __init xen_hvm_init_time_ops(void)
>>   x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
>>     x86_platform.set_wallclock = xen_set_wallclock;
>> +
>> +exit:
>> +    hvm_time_initialized = true;
>>   }
>>   #endif
>>   



Re: [PATCH v4 2/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

2022-03-02 Thread Boris Ostrovsky



On 3/2/22 11:40 AM, Dongli Zhang wrote:

  void __init xen_hvm_init_time_ops(void)
  {
+   static bool hvm_time_initialized;
+
+   if (hvm_time_initialized)
+   return;
+
/*
 * vector callback is needed otherwise we cannot receive interrupts
 * on cpu > 0 and at this point we don't know how many cpus are
 * available.
 */
if (!xen_have_vector_callback)
-   return;
+   goto exit;



Why not just return? Do we expect the value of xen_have_vector_callback to 
change?


-boris


  
  	if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {

pr_info("Xen doesn't support pvclock on HVM, disable pv timer");
+   goto exit;
+   }
+
+   /*
+* Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'.
+* The __this_cpu_read(xen_vcpu) is still NULL when Xen HVM guest
+* boots on vcpu >= MAX_VIRT_CPUS (e.g., kexec), To access
+* __this_cpu_read(xen_vcpu) via xen_clocksource_read() will panic.
+*
+* The xen_hvm_init_time_ops() should be called again later after
+* __this_cpu_read(xen_vcpu) is available.
+*/
+   if (!__this_cpu_read(xen_vcpu)) {
+   pr_info("Delay xen_init_time_common() as kernel is running on 
vcpu=%d\n",
+   xen_vcpu_nr(0));
return;
}
  
@@ -577,6 +597,9 @@ void __init xen_hvm_init_time_ops(void)

x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
  
  	x86_platform.set_wallclock = xen_set_wallclock;

+
+exit:
+   hvm_time_initialized = true;
  }
  #endif