Re: [Xen-devel] 回复: Re: [PATCH] Choose retpoline only when it is safe to use

2018-02-06 Thread Zhenzhong Duan

On 2018/2/6 18:50, Andrew Cooper wrote:

On 06/02/18 10:29, zhenzhong.duan wrote:



2018年2月6日 17:20于 Andrew Cooper <andrew.coop...@citrix.com 
<mailto:andrew.coop...@citrix.com>>写道:

>
> On 06/02/2018 09:13, Zhenzhong Duan wrote:
> > 在 2018/2/6 16:59, Andrew Cooper 写道:
> >> On 06/02/2018 08:43, Zhenzhong Duan wrote:
> >>> When ( ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is 
true,

> >>> thunk is set to THUNK_JMP rather than THUNK_RETPOLINE.
> >>>
> >>> When (!ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is 
true,

> >>> we should do the same.
> >>>
> >>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com 
<mailto:zhenzhong.d...@oracle.com>>

> >> Why?  What improvement is this intended to give?
> > No improvement, I just feel if retpoline isn't safe, THUNK_JMP is
> > better and safer.
> > Above first check is working that way.
>
> If your only two choices are unsafe repoline or plain jumps, then 
unsafe

> repoline is far far far safer.
>
> Its unsafe properties only kick in on an RSB underflow, and an 
attacker
> would have to do call-depths analysis of the running binary to 
identify

> which rets to attempt to poison.
>
Thanks for explaining.
So, for a retpoline safe processor, it just stop using RSB when it's 
empty to avoid underflow?




The qualification of whether a processor is retpoline-safe or not is 
whether an RSB underflow causes a BTB lookup (unsafe) or not (safe).


RSB underflows will always occur; you cannot get rid of them, but in 
most cases (i.e. pre-Skylake) they don't fall back to using a 
prediction source which can be poisoned by an attacker.

Understood.



Another question:

if (opt_thunk == THUNK_DEFAULT && opt_ibrs == -1 &&
CONFIG_INDIRECT_THUNK && !cpu_has_lfence_dispatch && !retpoline_safe())
results in "thunk = THUNK_JMP" regardless of the value of 
"boot_cpu_has(X86_FEATURE_IBRSB)"




Your formatting is hard to follow,

Sorry, sent from mobile.
but cpu_has_lfence_dispatch can only be false when virtualised under 
an SP2-unaware hypervisor on AMD hardware, at which point 
retpoline_safe() will return true.  Also, AMD don't have IBRS in 
microcode and their future plans don't appear to involve using that 
particular CPUID bit.
That does make sense for AMD. But what about Intel hardware which has 
(!cpu_has_lfence_dispatch && !retpoline_safe() && 
!boot_cpu_has(X86_FEATURE_IBRSB)), should we  prefer THUNK_RETPOLINE?


--
thanks
zduan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] 回复: Re: [PATCH] Choose retpoline only when it is safe to use

2018-02-06 Thread Zhenzhong Duan

在 2018/2/6 19:56, Andrew Cooper 写道:

On 06/02/18 11:41, Zhenzhong Duan wrote:

On 2018/2/6 18:50, Andrew Cooper wrote:

On 06/02/18 10:29, zhenzhong.duan wrote:



2018年2月6日 17:20于 Andrew Cooper <andrew.coop...@citrix.com 
<mailto:andrew.coop...@citrix.com>>写道:

>
> On 06/02/2018 09:13, Zhenzhong Duan wrote:
> > 在 2018/2/6 16:59, Andrew Cooper 写道:
> >> On 06/02/2018 08:43, Zhenzhong Duan wrote:
> >>> When ( ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) 
is true,

> >>> thunk is set to THUNK_JMP rather than THUNK_RETPOLINE.
> >>>
> >>> When (!ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) 
is true,

> >>> we should do the same.
> >>>
> >>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com 
<mailto:zhenzhong.d...@oracle.com>>

> >> Why?  What improvement is this intended to give?
> > No improvement, I just feel if retpoline isn't safe, THUNK_JMP is
> > better and safer.
> > Above first check is working that way.
>
> If your only two choices are unsafe repoline or plain jumps, then 
unsafe

> repoline is far far far safer.
>
> Its unsafe properties only kick in on an RSB underflow, and an 
attacker
> would have to do call-depths analysis of the running binary to 
identify

> which rets to attempt to poison.
>
Thanks for explaining.
So, for a retpoline safe processor, it just stop using RSB when 
it's empty to avoid underflow?




The qualification of whether a processor is retpoline-safe or not is 
whether an RSB underflow causes a BTB lookup (unsafe) or not (safe).


RSB underflows will always occur; you cannot get rid of them, but in 
most cases (i.e. pre-Skylake) they don't fall back to using a 
prediction source which can be poisoned by an attacker.

Understood.



Another question:

if (opt_thunk == THUNK_DEFAULT && opt_ibrs == -1 &&
CONFIG_INDIRECT_THUNK && !cpu_has_lfence_dispatch && 
!retpoline_safe())
results in "thunk = THUNK_JMP" regardless of the value of 
"boot_cpu_has(X86_FEATURE_IBRSB)"




Your formatting is hard to follow,

Sorry, sent from mobile.
but cpu_has_lfence_dispatch can only be false when virtualised under 
an SP2-unaware hypervisor on AMD hardware, at which point 
retpoline_safe() will return true.  Also, AMD don't have IBRS in 
microcode and their future plans don't appear to involve using that 
particular CPUID bit.
That does make sense for AMD. But what about Intel hardware which has 
(!cpu_has_lfence_dispatch && !retpoline_safe() && 
!boot_cpu_has(X86_FEATURE_IBRSB)), should we  prefer THUNK_RETPOLINE? 


Yes, and we do end up choosing RETPOLINE in those circumstances, as we 
hit the default case you originally tried to patch.


If that's the case, I think we need another patch like below.

@@ -223,7 +223,7 @@ void __init init_speculation_mitigations(void)
  */
 else if ( retpoline_safe() )
 thunk = THUNK_RETPOLINE;
-    else
+    else if ( boot_cpu_has(X86_FEATURE_IBRSB) )
 ibrs = true;
 }
 /* Without compiler thunk support, use IBRS if available. */

As ibrs was set to true when !boot_cpu_has(X86_FEATURE_IBRSB) with 
current code, then thunk was set to THUNK_JMP rather than THUNK_RETPOLINE.


--
thanks
zduan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] Fix a panic in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-13 Thread Zhenzhong Duan
On on IBRS available env, bootup panic when bti=0 like below:

(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: SMEP IBRS/IBPB STIBP
(XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
(XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[]
entry.o#handle_ist_exception+0xd1/0x176
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor
(XEN) rax:    rbx:    rcx: 0048
(XEN) rdx: 0001   rsi:    rdi: 
(XEN) rbp:    rsp: 82d080529f58   r8:  
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13:    r14: 82d08052
(XEN) r15:    cr0: 8005003b   cr4: 001506f0
(XEN) cr3: 76fbe000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=82d080529f58:
(XEN)0018  0002 82d080528000
(XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
(XEN) 0001 0400 0830
(XEN) 000a 82d0803f0fc0 0002
(XEN)82d080298876 e008 0046 82d08052fdf8
(XEN)
(XEN) Xen call trace:
(XEN)[] entry.o#handle_ist_exception+0xd1/0x176
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

It's due to %edx isn't cleared to zero before wrmsr.

DO_OVERWRITE_RSB clobbers %eax and happend to cover the bug in certain case so
we didn't reproduce without bti=0. Change to use %edx.

Reviewed-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
---
 xen/include/asm-x86/spec_ctrl_asm.h |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/include/asm-x86/spec_ctrl_asm.h 
b/xen/include/asm-x86/spec_ctrl_asm.h
index 814f53d..55d87c0 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -269,16 +269,16 @@
  * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
  * maybexen=1, but with conditionals rather than alternatives.
  */
-movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
+movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
 
-testb $BTI_IST_RSB, %al
+testb $BTI_IST_RSB, %dl
 jz .L\@_skip_rsb
 
 DO_OVERWRITE_RSB
 
 .L\@_skip_rsb:
 
-testb $BTI_IST_WRMSR, %al
+testb $BTI_IST_WRMSR, %dl
 jz .L\@_skip_wrmsr
 
 xor %edx, %edx
@@ -291,6 +291,7 @@
  * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
  * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
  */
+xor %edx, %edx
 mov $MSR_SPEC_CTRL, %ecx
 and $BTI_IST_IBRS, %eax
 wrmsr
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86: fix a crash in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-14 Thread Zhenzhong Duan

在 2018/2/14 16:09, Jan Beulich 写道:

In an IBRS available env, bootup panic when bti=0 like below:

(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: SMEP IBRS/IBPB STIBP
(XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
(XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[]
entry.o#handle_ist_exception+0xd1/0x176
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor
(XEN) rax:    rbx:    rcx: 0048
(XEN) rdx: 0001   rsi:    rdi: 
(XEN) rbp:    rsp: 82d080529f58   r8:  
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13:    r14: 82d08052
(XEN) r15:    cr0: 8005003b   cr4: 001506f0
(XEN) cr3: 76fbe000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=82d080529f58:
(XEN)0018  0002 82d080528000
(XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
(XEN) 0001 0400 0830
(XEN) 000a 82d0803f0fc0 0002
(XEN)82d080298876 e008 0046 82d08052fdf8
(XEN)
(XEN) Xen call trace:
(XEN)[] entry.o#handle_ist_exception+0xd1/0x176
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

It's due to %edx isn't cleared to zero before wrmsr.

DO_OVERWRITE_RSB clobbers %eax and happend to cover the bug in certain case so
we didn't reproduce without bti=0.

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>

Re-do actual code change. Also drop an unused label.

Signed-off-by: Jan Beulich <jbeul...@suse.com>

--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -274,7 +274,9 @@
  testb $BTI_IST_RSB, %al
  jz .L\@_skip_rsb
  
+mov %eax, %edx

  DO_OVERWRITE_RSB
+mov %edx, %eax
  
  .L\@_skip_rsb:
  
@@ -286,13 +288,13 @@

  setz %dl
  and %dl, STACK_CPUINFO_FIELD(use_shadow_spec_ctrl)(%r14)
  
-.L\@_entry_from_xen:

  /*
   * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
   * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
   */
  mov $MSR_SPEC_CTRL, %ecx
  and $BTI_IST_IBRS, %eax
+xor %edx, %edx
  wrmsr
  
  /* Opencoded UNLIKELY_START() with no condition. */




I just found this patch could be optimized a bit actually by only adding 
two instructions. Let me prepare a v3 patch, a few minutes.


--
thanks
zduan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] x86: fix a crash in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-14 Thread Zhenzhong Duan

在 2018/2/14 17:58, Jan Beulich 写道:

On 14.02.18 at 10:25,  wrote:

--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -269,28 +269,29 @@
   * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
   * maybexen=1, but with conditionals rather than alternatives.
   */
-movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
+movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
  
-testb $BTI_IST_RSB, %al

+testb $BTI_IST_RSB, %dl
  jz .L\@_skip_rsb
  
  DO_OVERWRITE_RSB
  
  .L\@_skip_rsb:
  
-testb $BTI_IST_WRMSR, %al

+testb $BTI_IST_WRMSR, %dl
  jz .L\@_skip_wrmsr
  
+mov %edx, %eax

  xor %edx, %edx
  testb $3, UREGS_cs(%rsp)
  setz %dl
  and %dl, STACK_CPUINFO_FIELD(use_shadow_spec_ctrl)(%r14)
  
-.L\@_entry_from_xen:

  /*
   * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
   * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
   */
+xor %edx, %edx
  mov $MSR_SPEC_CTRL, %ecx
  and $BTI_IST_IBRS, %eax
  wrmsr

While indeed you add one less instruction, you don't shrink overall
code size compared to v2. I also prefer v2 because of being more
explicit about the register needing to be preserved across
DO_OVERWRITE_RSB.
Then Ok, in fact my inital thought is to avoid unnecessory mov 
instructions around DO_OVERWRITE_RSB in the 'jmp _skip_wrmsr' case, so 
tried to remove them.


--
thanks
zduan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3] x86: fix a crash in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-14 Thread Zhenzhong Duan
In an IBRS available env, bootup panic when bti=0 like below:

(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: SMEP IBRS/IBPB STIBP
(XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
(XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[] entry.o#handle_ist_exception+0xd1/0x176
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor
(XEN) rax:    rbx:    rcx: 0048
(XEN) rdx: 0001   rsi:    rdi: 
(XEN) rbp:    rsp: 82d080529f58   r8:  
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13:    r14: 82d08052
(XEN) r15:    cr0: 8005003b   cr4: 001506f0
(XEN) cr3: 76fbe000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=82d080529f58:
(XEN)0018  0002 82d080528000
(XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
(XEN) 0001 0400 0830
(XEN) 000a 82d0803f0fc0 0002
(XEN)82d080298876 e008 0046 82d08052fdf8
(XEN)
(XEN) Xen call trace:
(XEN)[] entry.o#handle_ist_exception+0xd1/0x176
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

It's due to %edx isn't cleared to zero before wrmsr.

DO_OVERWRITE_RSB clobbers %eax and happend to cover the bug in certain case so
we didn't reproduce without bti=0. Change to use %edx.

Also drop an unused label per Jan.

Reported-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
Signed-off-by: Jan Beulich <jbeul...@suse.com>
---
 xen/include/asm-x86/spec_ctrl_asm.h |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/include/asm-x86/spec_ctrl_asm.h 
b/xen/include/asm-x86/spec_ctrl_asm.h
index 814f53d..7b259c4 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -269,28 +269,29 @@
  * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
  * maybexen=1, but with conditionals rather than alternatives.
  */
-movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
+movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
 
-testb $BTI_IST_RSB, %al
+testb $BTI_IST_RSB, %dl
 jz .L\@_skip_rsb
 
 DO_OVERWRITE_RSB
 
 .L\@_skip_rsb:
 
-testb $BTI_IST_WRMSR, %al
+testb $BTI_IST_WRMSR, %dl
 jz .L\@_skip_wrmsr
 
+mov %edx, %eax
 xor %edx, %edx
 testb $3, UREGS_cs(%rsp)
 setz %dl
 and %dl, STACK_CPUINFO_FIELD(use_shadow_spec_ctrl)(%r14)
 
-.L\@_entry_from_xen:
 /*
  * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
  * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
  */
+xor %edx, %edx
 mov $MSR_SPEC_CTRL, %ecx
 and $BTI_IST_IBRS, %eax
 wrmsr
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] Fix a panic in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-14 Thread Zhenzhong Duan

在 2018/2/14 15:56, Jan Beulich 写道:

On 14.02.18 at 05:03, <zhenzhong.d...@oracle.com> wrote:

On on IBRS available env, bootup panic when bti=0 like below:

(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: SMEP IBRS/IBPB STIBP
(XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
(XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[]
entry.o#handle_ist_exception+0xd1/0x176
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor
(XEN) rax:    rbx:    rcx: 0048
(XEN) rdx: 0001   rsi:    rdi: 
(XEN) rbp:    rsp: 82d080529f58   r8:  
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13:    r14: 82d08052
(XEN) r15:    cr0: 8005003b   cr4: 001506f0
(XEN) cr3: 76fbe000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=82d080529f58:
(XEN)0018  0002 82d080528000
(XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
(XEN) 0001 0400 0830
(XEN) 000a 82d0803f0fc0 0002
(XEN)82d080298876 e008 0046 82d08052fdf8
(XEN)
(XEN) Xen call trace:
(XEN)[] entry.o#handle_ist_exception+0xd1/0x176
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

It's due to %edx isn't cleared to zero before wrmsr.

DO_OVERWRITE_RSB clobbers %eax and happend to cover the bug in certain case so
we didn't reproduce without bti=0. Change to use %edx.

Reviewed-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>

Two formal things: Please put tags in sequential (in time) order:
reporter, author(s), reviewers/testers. And please follow patch
submission rules - send them _to_ the list, with maintainers (and
other interested parties) on _cc_.

Got it.



--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -269,16 +269,16 @@
   * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
   * maybexen=1, but with conditionals rather than alternatives.
   */
-movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
+movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
  
-testb $BTI_IST_RSB, %al

+testb $BTI_IST_RSB, %dl
  jz .L\@_skip_rsb
  
  DO_OVERWRITE_RSB
  
  .L\@_skip_rsb:
  
-testb $BTI_IST_WRMSR, %al

+testb $BTI_IST_WRMSR, %dl
  jz .L\@_skip_wrmsr
  
  xor %edx, %edx

@@ -291,6 +291,7 @@
   * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
   * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
   */
+xor %edx, %edx
  mov $MSR_SPEC_CTRL, %ecx
  and $BTI_IST_IBRS, %eax
  wrmsr

This is wrong now, and the comment very clearly states why. I think
rather than switching %eax to %edx it would be better to preserve
%eax (e.g. by saving into %edx) around DO_OVERWRITE_RSB. I'll
send an updated patch in a few minutes.

Right, I missed this part, thanks for point.
Your v2 patch looks good for me.

--
thanks
zduan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] Choose retpoline only when it is safe to use

2018-02-06 Thread Zhenzhong Duan
When ( ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is true,
thunk is set to THUNK_JMP rather than THUNK_RETPOLINE.

When (!ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is true,
we should do the same.

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
---
 xen/arch/x86/spec_ctrl.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index f10ffbf..ab4b244 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -247,10 +247,10 @@ void __init init_speculation_mitigations(void)
 
 /*
  * If there are still no thunk preferences, the compiled default is
- * actually retpoline, and it is better than nothing.
+ * actually retpoline, and it is better than nothing if it's retpoline 
safe.
  */
 if ( thunk == THUNK_DEFAULT )
-thunk = THUNK_RETPOLINE;
+thunk = retpoline_safe() ? THUNK_RETPOLINE : THUNK_JMP;
 
 /* Apply the chosen settings. */
 if ( thunk == THUNK_LFENCE )
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] Choose retpoline only when it is safe to use

2018-02-06 Thread Zhenzhong Duan

在 2018/2/6 16:59, Andrew Cooper 写道:

On 06/02/2018 08:43, Zhenzhong Duan wrote:

When ( ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is true,
thunk is set to THUNK_JMP rather than THUNK_RETPOLINE.

When (!ibrs && thunk == THUNK_DEFAULT && !retpoline_safe() ) is true,
we should do the same.

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>

Why?  What improvement is this intended to give?
No improvement, I just feel if retpoline isn't safe, THUNK_JMP is better 
and safer.

Above first check is working that way.


(IOW, the logic here is deliberate, and has a specific purpose.)

Ok, Just ignore the noise if it's deliberate.

--
thanks
zduan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/mmcfg/drhd: Move acpi_mmcfg_init() before calling acpi_parse_dmar()

2018-08-15 Thread Zhenzhong Duan
On a multiple pci segment system such as HPE Superdome-Flex, pci config space
from nonzero segment is accessed with mmcfg during acpi parsing DMAR region.

We need to setup mmcfg mapping before that or else drhd isn't correctly setup
and devices under it fail to load in dom0.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
---
 xen/arch/x86/setup.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 8301de8..9af7426 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 generic_apic_probe();
 
+pt_pci_init();
+
+acpi_mmcfg_init();
+
 acpi_boot_init();
 
 if ( smp_found_config )
@@ -1596,12 +1600,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 local_irq_enable();
 
-pt_pci_init();
-
 vesa_mtrr_init();
 
-acpi_mmcfg_init();
-
 early_msi_init();
 
 iommu_setup();/* setup iommu if available */
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/mmcfg: Remove redundant code in pci_mmcfg_reject_broken()

2018-08-15 Thread Zhenzhong Duan
No functional change.

Signed-off-by: Zhenzhong Duan 
---
 xen/arch/x86/x86_64/mmconfig-shared.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
b/xen/arch/x86/x86_64/mmconfig-shared.c
index 7c3b7fd..4a6ca26 100644
--- a/xen/arch/x86/x86_64/mmconfig-shared.c
+++ b/xen/arch/x86/x86_64/mmconfig-shared.c
@@ -371,8 +371,6 @@ static bool_t __init pci_mmcfg_reject_broken(void)
 (pci_mmcfg_config[0].address == 0))
 return 0;
 
-cfg = _mmcfg_config[0];
-
 for (i = 0; i < pci_mmcfg_config_num; i++) {
 u64 addr, size;
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg/drhd: Move acpi_mmcfg_init() before calling acpi_parse_dmar()

2018-08-16 Thread Zhenzhong Duan


On 2018/8/16 17:13, Zhenzhong Duan wrote:



--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,10 @@ void __init noreturn __start_xen(unsigned long 
mbi_p)

  generic_apic_probe();
+    pt_pci_init();
+
+    acpi_mmcfg_init();
+
  acpi_boot_init();


With the dependency being _in_ acpi_boot_init(), the invocation of
acpi_mmcfg_init() should now be moved there.
Yes, I feel better to move pt_pci_init() and acpi_mmcfg_init() in 
acpi_boot_init() before acpi_mmcfg_init(). Any more comments?
I see acpi_boot_init() is empty func when CONFIG_ACPI_BOOT isn't set. Do 
we support disabling this config option? If yes, I think above change 
will break non-acpi case.


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg/drhd: Move acpi_mmcfg_init() before calling acpi_parse_dmar()

2018-08-16 Thread Zhenzhong Duan

On 2018/8/16 15:10, Jan Beulich wrote:

On 16.08.18 at 07:10,  wrote:

On a multiple pci segment system such as HPE Superdome-Flex, pci config space
from nonzero segment is accessed with mmcfg during acpi parsing DMAR region.


First of all - can you please write a little more helpful (to reviewers)
patch description. I had to hunt down the config space accesses
instead of you clearly saying where they are (and why they are
unavoidably there).

Sorry, I'll improve it in v2


Furthermore you also move the invocation of pt_pci_init(), with no
explanation at all. I did not want to invest the time to understand
why that's needed.
I'll add it in v2. I moved pt_pci_init() ahead because pci_add_segment() 
depending on pci_segments radix tree being initialized.

acpi_mmcfg_init
  ->acpi_parse_mcfg
->pci_add_segment



We need to setup mmcfg mapping before that or else drhd isn't correctly setup
and devices under it fail to load in dom0.


That's the improvement side of the change. Mind also saying a word
on why the reordering won't break any other dependencies? After all
you move up the functions across dozens of other ones, not the least
far ahead of the point where IRQs get enabled the first time.
pt_pci_init() initialize pci_segments radix tree, acpi_mmcfg_init() 
initialize pci_mmcfg_virt[] and setup mmcfg mapping in 
pci_mmcfg_virt[idx].virt. I thought it's ok to just have these 
structures initialzed earlier.


Have you investigated the alternative of deferring acpi_dmar_init()
to a later point, or at least the part of it that needs to do PCI
config space accesses? I'm not currently convinced the device scope
parsing needs to happen that early: Neither iommu_supports_eim()
nor iommu_enable_x2apic_IR() look to depend on that information
at the first glance, and I think these are the routines that require
(part of) the DMAR parsing to happen early.
I moved acpi_mmcfg_init() ahead of acpi_boot_init() because below code 
sequence depending on pci_mmcfg_virt being correctly setup.

acpi_dmar_init
  ->acpi_parse_dmar
->acpi_parse_one_drhd
  ->acpi_parse_dev_scope
->pci_conf_read8
  ->pci_mmcfg_read
->pci_dev_base
  ->get_virt



--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  
  generic_apic_probe();
  
+pt_pci_init();

+
+acpi_mmcfg_init();
+
  acpi_boot_init();


With the dependency being _in_ acpi_boot_init(), the invocation of
acpi_mmcfg_init() should now be moved there.
Yes, I feel better to move pt_pci_init() and acpi_mmcfg_init() in 
acpi_boot_init() before acpi_mmcfg_init(). Any more comments?


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg/drhd: Move acpi_mmcfg_init() before calling acpi_parse_dmar()

2018-08-16 Thread Zhenzhong Duan

在 2018/8/16 18:42, Jan Beulich 写道:

On 16.08.18 at 11:30,  wrote:

On 2018/8/16 17:13, Zhenzhong Duan wrote:

--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,10 @@ void __init noreturn __start_xen(unsigned long
mbi_p)
   generic_apic_probe();
+pt_pci_init();
+
+acpi_mmcfg_init();
+
   acpi_boot_init();


With the dependency being _in_ acpi_boot_init(), the invocation of
acpi_mmcfg_init() should now be moved there.

Yes, I feel better to move pt_pci_init() and acpi_mmcfg_init() in
acpi_boot_init() before acpi_mmcfg_init(). Any more comments?

I see acpi_boot_init() is empty func when CONFIG_ACPI_BOOT isn't set. Do
we support disabling this config option? If yes, I think above change
will break non-acpi case.


I'm sorry, but I'm lost: grep produces no single hit on my tree
when looking for ACPI_BOOT. Are you looking at some older tree?
Even when considering ACPI - that Kconfig option exists only for
ARM's purposes right now. If you were to make it user selectable,
I think you'd first have to fix a number of build issues in case it
got turned off.

Sorry, I wrongly looked into oracle internal branch.
In upstream, it's CONFIG_ACPI. It looks not an issue as you said 
CONFIG_ACPI is only for ARM.


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg/drhd: Move acpi_mmcfg_init() before calling acpi_parse_dmar()

2018-08-16 Thread Zhenzhong Duan

在 2018/8/16 18:37, Jan Beulich 写道:

On 16.08.18 at 11:13,  wrote:

On 2018/8/16 15:10, Jan Beulich wrote:

Have you investigated the alternative of deferring acpi_dmar_init()
to a later point, or at least the part of it that needs to do PCI
config space accesses? I'm not currently convinced the device scope
parsing needs to happen that early: Neither iommu_supports_eim()
nor iommu_enable_x2apic_IR() look to depend on that information
at the first glance, and I think these are the routines that require
(part of) the DMAR parsing to happen early.

I moved acpi_mmcfg_init() ahead of acpi_boot_init() because below code
sequence depending on pci_mmcfg_virt being correctly setup.
acpi_dmar_init
->acpi_parse_dmar
  ->acpi_parse_one_drhd
->acpi_parse_dev_scope
  ->pci_conf_read8
->pci_mmcfg_read
  ->pci_dev_base
->get_virt


Have you read my previous response in full? I'm specifically asking
whether the device scope parsing (i.e. acpi_parse_dev_scope())
really needs to happen as early as it does now. Without that, the
dependency on MMCFG accesses working would go away.
I'll move acpi_dmar_init() to later point to have a test as 
acpi_parse_one_drhd, acpi_parse_one_rmrr and acpi_parse_one_atsr all 
called acpi_parse_dev_scope.



--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
   
   generic_apic_probe();
   
+pt_pci_init();

+
+acpi_mmcfg_init();
+
   acpi_boot_init();


With the dependency being _in_ acpi_boot_init(), the invocation of
acpi_mmcfg_init() should now be moved there.

Yes, I feel better to move pt_pci_init() and acpi_mmcfg_init() in
acpi_boot_init() before acpi_mmcfg_init().


I didn't say move both, did I?

That said, now having looked at what it actually does, I think you want
to rename it if the suggested alternative route can't be used, as in
particular the pt_ prefix is quite misleading then. Once that has
happened, moving the invocation perhaps even _into_ acpi_mcfg_init()

Understood.
Personly I like this way more as pt_pci_init() and acpi_mmcfg_init() 
only initialized some global variable and harmless to move ahead. Also 
acpi_mcfg_init from its name is suitable to move in acpi_boot_init()


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/2] x86/mmcfg: Rename pt_pci_init() and call it in acpi_mmcfg_init()

2018-08-17 Thread Zhenzhong Duan
Given what pt_pci_init() actually does, rename it properly and move its
declaration to pci.h, move the only call in acpi_mmcfg_init().

No functional change.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
---
 xen/arch/x86/setup.c  |2 --
 xen/arch/x86/x86_64/mmconfig-shared.c |2 ++
 xen/drivers/passthrough/pci.c |2 +-
 xen/include/xen/iommu.h   |2 --
 xen/include/xen/pci.h |1 +
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 8301de8..d5cc584 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1596,8 +1596,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 local_irq_enable();
 
-pt_pci_init();
-
 vesa_mtrr_init();
 
 acpi_mmcfg_init();
diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
b/xen/arch/x86/x86_64/mmconfig-shared.c
index 7c3b7fd..c426354 100644
--- a/xen/arch/x86/x86_64/mmconfig-shared.c
+++ b/xen/arch/x86/x86_64/mmconfig-shared.c
@@ -402,6 +402,8 @@ void __init acpi_mmcfg_init(void)
 {
 bool_t valid = 1;
 
+pci_segments_init();
+
 /* MMCONFIG disabled */
 if ((pci_probe & PCI_PROBE_MMCONF) == 0)
 return;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4890a4..d1adffa 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -126,7 +126,7 @@ static int pci_segments_iterate(
 return rc;
 }
 
-void __init pt_pci_init(void)
+void __init pci_segments_init(void)
 {
 radix_tree_init(_segments);
 if ( !alloc_pseg(0) )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b..e35d941 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -92,8 +92,6 @@ struct domain_iommu {
 #define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features)
 
 #ifdef CONFIG_HAS_PCI
-void pt_pci_init(void);
-
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
 int pt_irq_create_bind(struct domain *, const struct xen_domctl_bind_pt_irq *);
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 4cfa774..580e820 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -143,6 +143,7 @@ struct pci_dev *pci_lock_domain_pdev(
 void setup_hwdom_pci_devices(struct domain *,
 int (*)(u8 devfn, struct pci_dev *));
 int pci_release_devices(struct domain *d);
+void pci_segments_init(void);
 int pci_add_segment(u16 seg);
 const unsigned long *pci_get_ro_map(u16 seg);
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-17 Thread Zhenzhong Duan
pci_conf_read8() needs pci mmcfg mapping to work on multiple pci segments
system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

acpi_mmcfg_init() only setup mmcfg mapping and has some global variables
initialized so there is no hazard to move it ahead. Meanwhile from its
name, acpi_boot_init() is a proper place to call it.

The alternative is moving the acpi_parse_dev_scope() call to a later point
after acpi_mmcfg_init(). But acpi_parse_one_drhd(), acpi_parse_one_rmrr()
and acpi_parse_one_atsr() all called acpi_parse_dev_scope() as their main
job. Splitting those functions to two pieces looks less optimal and meaningless.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
---
 xen/arch/x86/acpi/boot.c |2 ++
 xen/arch/x86/setup.c |2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 8e6c96d..e89c2e9 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -724,6 +724,8 @@ int __init acpi_boot_init(void)
 
acpi_table_parse(ACPI_SIG_HPET, acpi_parse_hpet);
 
+   acpi_mmcfg_init();
+
acpi_dmar_init();
 
erst_init();
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index d5cc584..9d0cd3b 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1598,8 +1598,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 vesa_mtrr_init();
 
-acpi_mmcfg_init();
-
 early_msi_init();
 
 iommu_setup();/* setup iommu if available */
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-21 Thread Zhenzhong Duan
When TBOOT enabled, acpi_parse_dmar() called acpi_dmar_zap() to zap
DMAR table, then tboot_parse_dmar_table() called acpi_dmar_zap() to
zap it again. This is unnecessory, remove the second call.

Some stale comments not right for current code are also removed.
No functional change.

Signed-off-by: Zhenzhong Duan 
---
v3: fixup patch and description.

 xen/arch/x86/tboot.c |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index d5a5292..1006f95 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -461,8 +461,6 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 if ( txt_heap_base == 0 )
 return 1;
 
-/* map TXT heap into Xen addr space */
-
 /* walk heap to SinitMleData */
 pa = txt_heap_base;
 /* skip BiosData */
@@ -490,10 +488,6 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 rc = dmar_handler(dmar_table);
 xfree(dmar_table);
 
-/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
-/* but dom0 will read real table, so must zap it there too */
-acpi_dmar_zap();
-
 return rc;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-21 Thread Zhenzhong Duan
Commit 6c298ecc1f ("vtd: Reinstate ACPI DMAR on system shutdown or
S3/S4/S5") did everything for acpi_dmar_zap() call to be unnecessary,
except for invoking the function from acpi_parse_dmar(), which
123c779379 ("VTd/dmar: Tweak how the DMAR table is clobbered")
added several years later.

Some stale comments are also removed, No functional change.

Signed-off-by: Zhenzhong Duan 
---
v4: fixup description per Jan

 xen/arch/x86/tboot.c |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index d5a5292..1006f95 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -461,8 +461,6 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 if ( txt_heap_base == 0 )
 return 1;
 
-/* map TXT heap into Xen addr space */
-
 /* walk heap to SinitMleData */
 pa = txt_heap_base;
 /* skip BiosData */
@@ -490,10 +488,6 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 rc = dmar_handler(dmar_table);
 xfree(dmar_table);
 
-/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
-/* but dom0 will read real table, so must zap it there too */
-acpi_dmar_zap();
-
 return rc;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-21 Thread Zhenzhong Duan
pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
segments system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

mmio_ro_ranges initialization is also moved ahead as it's the only
dependency of pci_mmcfg_arch_enable() need to be moved. Also
checked codes between the old and new call sites to ensure we
don't break anything.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
---
 xen/arch/x86/acpi/boot.c |2 ++
 xen/arch/x86/setup.c |8 +++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 8e6c96d..e89c2e9 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -724,6 +724,8 @@ int __init acpi_boot_init(void)
 
acpi_table_parse(ACPI_SIG_HPET, acpi_parse_hpet);
 
+   acpi_mmcfg_init();
+
acpi_dmar_init();
 
erst_init();
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index d5cc584..eabb011 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 generic_apic_probe();
 
+mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
+  RANGESETF_prettyprint_hex);
+
 acpi_boot_init();
 
 if ( smp_found_config )
@@ -1525,9 +1528,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 /* Low mappings were only needed for some BIOS table parsing. */
 zap_low_mappings();
 
-mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
-  RANGESETF_prettyprint_hex);
-
 init_apic_mappings();
 
 normalise_cpu_order();
@@ -1598,8 +1598,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 vesa_mtrr_init();
 
-acpi_mmcfg_init();
-
 early_msi_init();
 
 iommu_setup();/* setup iommu if available */
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-22 Thread Zhenzhong Duan

在 2018/8/22 16:42, Roger Pau Monné 写道:

On Wed, Aug 22, 2018 at 04:39:05PM +0800, Zhenzhong Duan wrote:

在 2018/8/22 15:36, Roger Pau Monné 写道:

On Tue, Aug 21, 2018 at 09:53:08PM -0700, Zhenzhong Duan wrote:

pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
segments system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

mmio_ro_ranges initialization is also moved ahead as it's the only
dependency of pci_mmcfg_arch_enable() need to be moved. Also
checked codes between the old and new call sites to ensure we
don't break anything.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 


LGTM:

Reviewed-by: Roger Pau Monné 

Thanks for reviewing.


I'm however failing to find patch 1/2 in this series. Could you please
make sure threading is setup correctly?

Hmm, how to make that happen, do you mean sending the two patches in a
bunch?
patch 1/2 is '[PATCH v2 1/2] x86/mmcfg: Rename pt_pci_init() and call it in
acpi_mmcfg_init()"

It's acked in v2, so there is no v3 for patch 1/2.


Oh, I see. So you have two choices here: either send patch 1/2 with
the Ack together with patch 2/2, or just drop the 2/2 prefix and send
the patch alone. But sending v3 2/2 without v3 1/2 makes people think
there's a v3 1/2 lost somewhere IMO.

Understand, I'll resend the two again.

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-22 Thread Zhenzhong Duan

在 2018/8/22 15:36, Roger Pau Monné 写道:

On Tue, Aug 21, 2018 at 09:53:08PM -0700, Zhenzhong Duan wrote:

pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
segments system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

mmio_ro_ranges initialization is also moved ahead as it's the only
dependency of pci_mmcfg_arch_enable() need to be moved. Also
checked codes between the old and new call sites to ensure we
don't break anything.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 


LGTM:

Reviewed-by: Roger Pau Monné 

Thanks for reviewing.


I'm however failing to find patch 1/2 in this series. Could you please
make sure threading is setup correctly?
Hmm, how to make that happen, do you mean sending the two patches in a 
bunch?
patch 1/2 is '[PATCH v2 1/2] x86/mmcfg: Rename pt_pci_init() and call it 
in acpi_mmcfg_init()"


It's acked in v2, so there is no v3 for patch 1/2.

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RESEND v3 1/2] x86/mmcfg: Rename pt_pci_init() and call it in acpi_mmcfg_init()

2018-08-22 Thread Zhenzhong Duan
Given what pt_pci_init() actually does, rename it properly and move its
declaration to pci.h, move the only call in acpi_mmcfg_init().

No functional change.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
Acked-by: Jan Beulich 
---
 xen/arch/x86/setup.c  |2 --
 xen/arch/x86/x86_64/mmconfig-shared.c |2 ++
 xen/drivers/passthrough/pci.c |2 +-
 xen/include/xen/iommu.h   |2 --
 xen/include/xen/pci.h |1 +
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 8301de8..d5cc584 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1596,8 +1596,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 local_irq_enable();
 
-pt_pci_init();
-
 vesa_mtrr_init();
 
 acpi_mmcfg_init();
diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
b/xen/arch/x86/x86_64/mmconfig-shared.c
index 7c3b7fd..c426354 100644
--- a/xen/arch/x86/x86_64/mmconfig-shared.c
+++ b/xen/arch/x86/x86_64/mmconfig-shared.c
@@ -402,6 +402,8 @@ void __init acpi_mmcfg_init(void)
 {
 bool_t valid = 1;
 
+pci_segments_init();
+
 /* MMCONFIG disabled */
 if ((pci_probe & PCI_PROBE_MMCONF) == 0)
 return;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4890a4..d1adffa 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -126,7 +126,7 @@ static int pci_segments_iterate(
 return rc;
 }
 
-void __init pt_pci_init(void)
+void __init pci_segments_init(void)
 {
 radix_tree_init(_segments);
 if ( !alloc_pseg(0) )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b..e35d941 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -92,8 +92,6 @@ struct domain_iommu {
 #define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features)
 
 #ifdef CONFIG_HAS_PCI
-void pt_pci_init(void);
-
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
 int pt_irq_create_bind(struct domain *, const struct xen_domctl_bind_pt_irq *);
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 4cfa774..580e820 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -143,6 +143,7 @@ struct pci_dev *pci_lock_domain_pdev(
 void setup_hwdom_pci_devices(struct domain *,
 int (*)(u8 devfn, struct pci_dev *));
 int pci_release_devices(struct domain *d);
+void pci_segments_init(void);
 int pci_add_segment(u16 seg);
 const unsigned long *pci_get_ro_map(u16 seg);
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RESEND v3 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-22 Thread Zhenzhong Duan
pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
segments system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

mmio_ro_ranges initialization is also moved ahead as it's the only
dependency of pci_mmcfg_arch_enable() need to be moved. Also
checked codes between the old and new call sites to ensure we
don't break anything.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
Reviewed-by: Roger Pau Monné 
---
 xen/arch/x86/acpi/boot.c |2 ++
 xen/arch/x86/setup.c |8 +++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 8e6c96d..e89c2e9 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -724,6 +724,8 @@ int __init acpi_boot_init(void)
 
acpi_table_parse(ACPI_SIG_HPET, acpi_parse_hpet);
 
+   acpi_mmcfg_init();
+
acpi_dmar_init();
 
erst_init();
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index d5cc584..eabb011 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1493,6 +1493,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 generic_apic_probe();
 
+mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
+  RANGESETF_prettyprint_hex);
+
 acpi_boot_init();
 
 if ( smp_found_config )
@@ -1525,9 +1528,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 /* Low mappings were only needed for some BIOS table parsing. */
 zap_low_mappings();
 
-mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
-  RANGESETF_prettyprint_hex);
-
 init_apic_mappings();
 
 normalise_cpu_order();
@@ -1598,8 +1598,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 vesa_mtrr_init();
 
-acpi_mmcfg_init();
-
 early_msi_init();
 
 iommu_setup();/* setup iommu if available */
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] 答复: [PATCH v4 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-28 Thread Zhenzhong Duan
Hi Maintainer,

Any comments?

Thanks
Zhenzhong
- zhenzhong.d...@oracle.com wrote:

> Commit 6c298ecc1f ("vtd: Reinstate ACPI DMAR on system shutdown or
> S3/S4/S5") did everything for acpi_dmar_zap() call to be unnecessary,
> except for invoking the function from acpi_parse_dmar(), which
> 123c779379 ("VTd/dmar: Tweak how the DMAR table is clobbered")
> added several years later.
> 
> Some stale comments are also removed, No functional change.
> 
> Signed-off-by: Zhenzhong Duan 
> ---
> v4: fixup description per Jan
> 
>  xen/arch/x86/tboot.c |6 --
>  1 files changed, 0 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
> index d5a5292..1006f95 100644
> --- a/xen/arch/x86/tboot.c
> +++ b/xen/arch/x86/tboot.c
> @@ -461,8 +461,6 @@ int __init
> tboot_parse_dmar_table(acpi_table_handler dmar_handler)
>  if ( txt_heap_base == 0 )
>  return 1;
>  
> -/* map TXT heap into Xen addr space */
> -
>  /* walk heap to SinitMleData */
>  pa = txt_heap_base;
>  /* skip BiosData */
> @@ -490,10 +488,6 @@ int __init
> tboot_parse_dmar_table(acpi_table_handler dmar_handler)
>  rc = dmar_handler(dmar_table);
>  xfree(dmar_table);
>  
> -/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table
> */
> -/* but dom0 will read real table, so must zap it there too */
> -acpi_dmar_zap();
> -
>  return rc;
>  }
>  
> -- 
> 1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/2] dmar: iommu mem leak fix

2018-08-19 Thread Zhenzhong Duan
Release memory allocated for drhd iommu in error path.

Signed-off-by: Zhenzhong Duan 
---
 xen/drivers/passthrough/vtd/dmar.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.c 
b/xen/drivers/passthrough/vtd/dmar.c
index 46decd4..8c5fa80 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -100,6 +100,7 @@ static void __init disable_all_dmar_units(void)
 {
 list_del(>list);
 scope_devices_free(>scope);
+iommu_free(drhd->iommu);
 xfree(drhd);
 }
 list_for_each_entry_safe ( rmrr, _rmrr, _rmrr_units, list )
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-19 Thread Zhenzhong Duan
When TBOOT enabled, acpi_parse_dmar() zap a copy of DMAR table rather
than the real table, so make it controled by config option based on the
fact that we already have done the real zapping in tboot_parse_dmar_table().

As said above, acpi_parse_dmar() doesn't zaps APCI DMAR signature in
real TXT heap table, fix the stale comments.

Signed-off-by: Zhenzhong Duan 
---
 xen/arch/x86/tboot.c   |3 +--
 xen/drivers/passthrough/vtd/dmar.c |2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index d5a5292..f22fa24 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -490,8 +490,7 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 rc = dmar_handler(dmar_table);
 xfree(dmar_table);
 
-/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
-/* but dom0 will read real table, so must zap it there too */
+/* Dom0 will read real DMAR table, so must zap it there */
 acpi_dmar_zap();
 
 return rc;
diff --git a/xen/drivers/passthrough/vtd/dmar.c 
b/xen/drivers/passthrough/vtd/dmar.c
index 8c5fa80..ed4c04e 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -866,8 +866,10 @@ static int __init acpi_parse_dmar(struct acpi_table_header 
*table)
 }
 
 out:
+#ifndef CONFIG_TBOOT
 /* Zap ACPI DMAR signature to prevent dom0 using vt-d HW. */
 acpi_dmar_zap();
+#endif
 return ret;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-19 Thread Zhenzhong Duan

On 2018/8/17 20:28, Jan Beulich wrote:

On 17.08.18 at 09:01,  wrote:

pci_conf_read8() needs pci mmcfg mapping to work on multiple pci segments
system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

acpi_mmcfg_init() only setup mmcfg mapping and has some global variables
initialized so there is no hazard to move it ahead.


I'm afraid this is a gross understatement. "Setup mappings"
alone is already putting such movement under question. Have
you checked explicitly that the initialization needed for this
setting up of mappings, if any, has actually occurred before
the new point the function gets called?

In particular, mmio_ro_ranges looks to get set up only after
the call to acpi_boot_init(). I guess you didn't see a crash
solely because you also move the invocation across the call
to zap_low_mappings().

Similary pci_mmcfg_check_hostbridge() doesn't look all that
trivial.

Please may I ask that you be quite a bit more careful here,
even more so when you've been warned already?


Meanwhile from its
name, acpi_boot_init() is a proper place to call it.

The alternative is moving the acpi_parse_dev_scope() call to a later point
after acpi_mmcfg_init(). But acpi_parse_one_drhd(), acpi_parse_one_rmrr()
and acpi_parse_one_atsr() all called acpi_parse_dev_scope() as their main
job. Splitting those functions to two pieces looks less optimal and
meaningless.


Certainly not meaningless; I'm also not sure why you consider
the device scope parsing their main job. It's perhaps their
most involved part, but the fact alone that for the purposes
here we could probably get away without that part already
suggests to me that this is a secondary (yet necessary) aspect.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.
It would be worthwhile saying half a sentence to this effect
in the description.

Jan,

I meet some difficulty moving dev scope parsing to later point. Please 
suggest.


First, acpi_parse_dev_scope() may fail and disable_all_dmar_units() is 
called to free all dmar related allocations which is already used by 
iommu_supports_eim().


Second, In acpi_parse_one_drhd(), acpi_register_drhd_unit() should not 
be moved later so that acpi_drhd_units is setup early, but it's called 
if pci_device_detect() return true, pci_device_detect() depends on mmcfg 
to work which isn't setup mapping yet.


I'm thinking about moving below piece of code earlier too, and I checked 
pci_mmcfg_check_hostbridge() carefully, it's secure, what do you think 
about that?


mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
  RANGESETF_prettyprint_hex);


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/2] dmar: iommu mem leak fix

2018-08-20 Thread Zhenzhong Duan
Release memory allocated for drhd iommu in error path.

-v2: fixup wrong parameter hiden due to my removing -Werror

Signed-off-by: Zhenzhong Duan 
---
 xen/drivers/passthrough/vtd/dmar.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.c 
b/xen/drivers/passthrough/vtd/dmar.c
index 46decd4..8c5fa80 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -100,6 +100,7 @@ static void __init disable_all_dmar_units(void)
 {
 list_del(>list);
 scope_devices_free(>scope);
+iommu_free(drhd);
 xfree(drhd);
 }
 list_for_each_entry_safe ( rmrr, _rmrr, _rmrr_units, list )
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-20 Thread Zhenzhong Duan
When TBOOT enabled, acpi_parse_dmar() zap a copy of DMAR table rather
than the real table, so make it controled by config option based on the
fact that we already have done the real zapping in tboot_parse_dmar_table().

As said above, acpi_parse_dmar() doesn't zaps APCI DMAR signature in
real TXT heap table, fix the stale comments.

This is osmetic change, it's unnecessory to zap a copy of DMAR table which
is freed later.

-v2: Add some description per Jan.

Signed-off-by: Zhenzhong Duan 
---
 xen/arch/x86/tboot.c   |3 +--
 xen/drivers/passthrough/vtd/dmar.c |2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index d5a5292..f22fa24 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -490,8 +490,7 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
 rc = dmar_handler(dmar_table);
 xfree(dmar_table);
 
-/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
-/* but dom0 will read real table, so must zap it there too */
+/* Dom0 will read real DMAR table, so must zap it there */
 acpi_dmar_zap();
 
 return rc;
diff --git a/xen/drivers/passthrough/vtd/dmar.c 
b/xen/drivers/passthrough/vtd/dmar.c
index 8c5fa80..ed4c04e 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -866,8 +866,10 @@ static int __init acpi_parse_dmar(struct acpi_table_header 
*table)
 }
 
 out:
+#ifndef CONFIG_TBOOT
 /* Zap ACPI DMAR signature to prevent dom0 using vt-d HW. */
 acpi_dmar_zap();
+#endif
 return ret;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] 答复: [PATCH v2 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-20 Thread Zhenzhong Duan
Please ignore this patch, looks description is wrong, I'll resend, sorry for 
noise.

Zhenzhong
- zhenzhong.d...@oracle.com wrote:

> When TBOOT enabled, acpi_parse_dmar() zap a copy of DMAR table rather
> than the real table, so make it controled by config option based on
> the
> fact that we already have done the real zapping in
> tboot_parse_dmar_table().
> 
> As said above, acpi_parse_dmar() doesn't zaps APCI DMAR signature in
> real TXT heap table, fix the stale comments.
> 
> This is osmetic change, it's unnecessory to zap a copy of DMAR table
> which
> is freed later.
> 
> -v2: Add some description per Jan.
> 
> Signed-off-by: Zhenzhong Duan 
> ---
>  xen/arch/x86/tboot.c   |3 +--
>  xen/drivers/passthrough/vtd/dmar.c |2 ++
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
> index d5a5292..f22fa24 100644
> --- a/xen/arch/x86/tboot.c
> +++ b/xen/arch/x86/tboot.c
> @@ -490,8 +490,7 @@ int __init
> tboot_parse_dmar_table(acpi_table_handler dmar_handler)
>  rc = dmar_handler(dmar_table);
>  xfree(dmar_table);
>  
> -/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table
> */
> -/* but dom0 will read real table, so must zap it there too */
> +/* Dom0 will read real DMAR table, so must zap it there */
>  acpi_dmar_zap();
>  
>  return rc;
> diff --git a/xen/drivers/passthrough/vtd/dmar.c
> b/xen/drivers/passthrough/vtd/dmar.c
> index 8c5fa80..ed4c04e 100644
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -866,8 +866,10 @@ static int __init acpi_parse_dmar(struct
> acpi_table_header *table)
>  }
>  
>  out:
> +#ifndef CONFIG_TBOOT
>  /* Zap ACPI DMAR signature to prevent dom0 using vt-d HW. */
>  acpi_dmar_zap();
> +#endif
>  return ret;
>  }
>  
> -- 
> 1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-20 Thread Zhenzhong Duan

On 2018/8/20 16:44, Andrew Cooper wrote:

On 20/08/2018 09:30, Jan Beulich wrote:

On 20.08.18 at 05:32,  wrote:

When TBOOT enabled, acpi_parse_dmar() zap a copy of DMAR table rather
than the real table, so make it controled by config option based on the
fact that we already have done the real zapping in tboot_parse_dmar_table().

Is this just a cosmetic change, or is there any harm done by the extra
zapping?


Before 123c7793797502b222300eb710cd3873dcca41ee, calling it multiple
times would require an equivalent number of reinstate calls for it to work.

That change went into Xen 4.6, whereas I guess this is a patch being
upstreamed from Oracles 4.4 patchqueue?

Not for that reason. Patch is for upstream.
tboot_parse_dmar_table() called acpi_dmar_zap() the first time, 
acpi_parse_dmar called it the second time, I thought it's a dup.


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86/dmar: zap DMAR signature for dom0 once in TBOOT case

2018-08-20 Thread Zhenzhong Duan

On 2018/8/20 16:30, Jan Beulich wrote:

On 20.08.18 at 05:32,  wrote:

When TBOOT enabled, acpi_parse_dmar() zap a copy of DMAR table rather
than the real table, so make it controled by config option based on the
fact that we already have done the real zapping in tboot_parse_dmar_table().


Is this just a cosmetic change, or is there any harm done by the extra
zapping?
Cosmetic change, I feel it isn't necessory to zap a copy of DMAR table 
which is freed later.


Thanks
Zhenzhong



As said above, acpi_parse_dmar() doesn't zaps APCI DMAR signature in
real TXT heap table, fix the stale comments.

Signed-off-by: Zhenzhong Duan 
---
  xen/arch/x86/tboot.c   |3 +--
  xen/drivers/passthrough/vtd/dmar.c |2 ++
  2 files changed, 3 insertions(+), 2 deletions(-)


You've again failed to Cc maintainers (included now).

Jan


--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -490,8 +490,7 @@ int __init tboot_parse_dmar_table(acpi_table_handler 
dmar_handler)
  rc = dmar_handler(dmar_table);
  xfree(dmar_table);
  
-/* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */

-/* but dom0 will read real table, so must zap it there too */
+/* Dom0 will read real DMAR table, so must zap it there */
  acpi_dmar_zap();
  
  return rc;

--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -866,8 +866,10 @@ static int __init acpi_parse_dmar(struct acpi_table_header 
*table)
  }
  
  out:

+#ifndef CONFIG_TBOOT
  /* Zap ACPI DMAR signature to prevent dom0 using vt-d HW. */
  acpi_dmar_zap();
+#endif
  return ret;
  }
  





___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-20 Thread Zhenzhong Duan

On 2018/8/20 15:45, Jan Beulich wrote:

On 20.08.18 at 05:38,  wrote:

On 2018/8/17 20:28, Jan Beulich wrote:

On 17.08.18 at 09:01,  wrote:

pci_conf_read8() needs pci mmcfg mapping to work on multiple pci segments
system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

acpi_mmcfg_init() only setup mmcfg mapping and has some global variables
initialized so there is no hazard to move it ahead.


I'm afraid this is a gross understatement. "Setup mappings"
alone is already putting such movement under question. Have
you checked explicitly that the initialization needed for this
setting up of mappings, if any, has actually occurred before
the new point the function gets called?

In particular, mmio_ro_ranges looks to get set up only after
the call to acpi_boot_init(). I guess you didn't see a crash
solely because you also move the invocation across the call
to zap_low_mappings().

Similary pci_mmcfg_check_hostbridge() doesn't look all that
trivial.

Please may I ask that you be quite a bit more careful here,
even more so when you've been warned already?


Meanwhile from its
name, acpi_boot_init() is a proper place to call it.

The alternative is moving the acpi_parse_dev_scope() call to a later point
after acpi_mmcfg_init(). But acpi_parse_one_drhd(), acpi_parse_one_rmrr()
and acpi_parse_one_atsr() all called acpi_parse_dev_scope() as their main
job. Splitting those functions to two pieces looks less optimal and
meaningless.


Certainly not meaningless; I'm also not sure why you consider
the device scope parsing their main job. It's perhaps their
most involved part, but the fact alone that for the purposes
here we could probably get away without that part already
suggests to me that this is a secondary (yet necessary) aspect.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.
It would be worthwhile saying half a sentence to this effect
in the description.


I meet some difficulty moving dev scope parsing to later point. Please
suggest.

First, acpi_parse_dev_scope() may fail and disable_all_dmar_units() is
called to free all dmar related allocations which is already used by
iommu_supports_eim().


Hmm, right - iommu_supports_eim() indeed requires device scope parsing
to have happened.


I'm thinking about moving below piece of code earlier too, and I checked
pci_mmcfg_check_hostbridge() carefully, it's secure, what do you think
about that?

  mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
RANGESETF_prettyprint_hex);



That's a reasonable thing to do, and is (as pointed out) a necessary
prereq. But to be very clear - you'll also have to prove it's sufficient,
and for that it doesn't suffice to consider pci_mmcfg_check_hostbridge()
alone.
Not sure how to prove, I checked over acpi_mmcfg_init() carefully, 
acpi_disabled and DMI info are used and they are initialized earlier 
than acpi_dmar_init() call, I only found mmio_ro_ranges need to be moved.


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RESEND v3 1/2] x86/mmcfg: Rename pt_pci_init() and call it in acpi_mmcfg_init()

2018-08-27 Thread Zhenzhong Duan

在 2018/8/27 16:16, Jan Beulich 写道:

On 22.08.18 at 11:16,  wrote:

Given what pt_pci_init() actually does, rename it properly and move its
declaration to pci.h, move the only call in acpi_mmcfg_init().

No functional change.

Signed-off-by: Zhenzhong Duan 
Tested-by: Gopalasetty, Manoj 
Acked-by: Jan Beulich 


Please avoid needless re-sending of patches that have already
gone in. Roger's remark regarding the questionable solitary v3
2/2 was correct, just that you should have dropped the 2/2
instead of re-sending an already applied patch.

Ok.
To avoid endless resending, I'll follow that way next time.

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/physdev: Remove redundant assignment in allocate_and_map_msi_pirq()

2018-07-20 Thread Zhenzhong Duan
No functional change.

Signed-off-by: Zhenzhong Duan 
---
 xen/arch/x86/irq.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 87ef2e8..5253fd1 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2701,8 +2701,6 @@ int allocate_and_map_msi_pirq(struct domain *d, int 
index, int *pirq_p,
 return -EINVAL;
 }
 
-msi->irq = irq;
-
 pcidevs_lock();
 /* Verify or get pirq. */
 spin_lock(>event_lock);
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/physdev: Remove redundant assignment in allocate_and_map_msi_pirq()

2018-07-22 Thread Zhenzhong Duan

在 2018/7/20 18:13, Andrew Cooper 写道:

On 20/07/18 11:13, Roger Pau Monné wrote:

On Fri, Jul 20, 2018 at 02:29:34AM -0700, Zhenzhong Duan wrote:

No functional change.

Signed-off-by: Zhenzhong Duan 
---
  xen/arch/x86/irq.c |2 --
  1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 87ef2e8..5253fd1 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2701,8 +2701,6 @@ int allocate_and_map_msi_pirq(struct domain *d, int 
index, int *pirq_p,
  return -EINVAL;
  }
  
-msi->irq = irq;

I would prefer to remove the assignment in the MAP_PIRQ_TYPE_MULTI_MSI
case rather than here. IMO this one makes it clearer that msi->irq is
always set.


Me too.  I can fix this up on commit.

Sure, go ahead, thanks Roger, Andrew for review.

Regards
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen/grant: Mute gcc warning in steal_linear_address()

2018-08-29 Thread Zhenzhong Duan
Move reference of ol1e ahead or else we see below warning.

cc1: warnings being treated as errors
grant_table.c: In function 'replace_grant_pv_mapping':
grant_table.c:142: warning: 'ol1e.l1' may be used uninitialized in this function

Signed-off-by: Zhenzhong Duan 
---
 xen/arch/x86/pv/grant_table.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/pv/grant_table.c b/xen/arch/x86/pv/grant_table.c
index 6b7d855..5180334 100644
--- a/xen/arch/x86/pv/grant_table.c
+++ b/xen/arch/x86/pv/grant_table.c
@@ -167,6 +167,9 @@ static bool steal_linear_address(unsigned long linear, 
l1_pgentry_t *out)
 ol1e = *pl1e;
 okay = UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(), mfn_x(gl1mfn), curr, 0);
 
+if ( okay )
+*out = ol1e;
+
  out_unlock:
 page_unlock(page);
  out_put:
@@ -174,9 +177,6 @@ static bool steal_linear_address(unsigned long linear, 
l1_pgentry_t *out)
  out_unmap:
 unmap_domain_page(pl1e);
 
-if ( okay )
-*out = ol1e;
-
  out:
 return okay;
 }
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] 答复: [PATCH RESEND v3 2/2] x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

2018-08-29 Thread Zhenzhong Duan
Hi Maintainers,

May I ask if this patch will be merged upstream or not? Our customer
is pushing us urgently with timeline for errata release and we are waiting
for official version from upstream.

Thanks
Zhenzhong
- zhenzhong.d...@oracle.com 写道:

> pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
> segments system such as HPE Superdome-Flex.
> 
> Move acpi_mmcfg_init() call in acpi_boot_init() before calling
> acpi_parse_dmar() so that when pci_conf_read8() is called in
> acpi_parse_dev_scope(), we already have the mapping set up.
> 
> mmio_ro_ranges initialization is also moved ahead as it's the only
> dependency of pci_mmcfg_arch_enable() need to be moved. Also
> checked codes between the old and new call sites to ensure we
> don't break anything.
> 
> Furthermore MMCFG will continue to not work this early (or
> more precisely not at all until Dom0 boot has progressed far
> enough) if the range(s) isn't/aren't marked reserved in E820.
> 
> Signed-off-by: Zhenzhong Duan 
> Tested-by: Gopalasetty, Manoj 
> Reviewed-by: Roger Pau Monné 
> ---
>  xen/arch/x86/acpi/boot.c |2 ++
>  xen/arch/x86/setup.c |8 +++-
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
> index 8e6c96d..e89c2e9 100644
> --- a/xen/arch/x86/acpi/boot.c
> +++ b/xen/arch/x86/acpi/boot.c
> @@ -724,6 +724,8 @@ int __init acpi_boot_init(void)
>  
>   acpi_table_parse(ACPI_SIG_HPET, acpi_parse_hpet);
>  
> + acpi_mmcfg_init();
> +
>   acpi_dmar_init();
>  
>   erst_init();
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index d5cc584..eabb011 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1493,6 +1493,9 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>  
>  generic_apic_probe();
>  
> +mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
> +  RANGESETF_prettyprint_hex);
> +
>  acpi_boot_init();
>  
>  if ( smp_found_config )
> @@ -1525,9 +1528,6 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>  /* Low mappings were only needed for some BIOS table parsing. */
>  zap_low_mappings();
>  
> -mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
> -  RANGESETF_prettyprint_hex);
> -
>  init_apic_mappings();
>  
>  normalise_cpu_order();
> @@ -1598,8 +1598,6 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>  
>  vesa_mtrr_init();
>  
> -acpi_mmcfg_init();
> -
>  early_msi_init();
>  
>  iommu_setup();/* setup iommu if available */
> -- 
> 1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/grant: Mute gcc warning in steal_linear_address()

2018-08-30 Thread Zhenzhong Duan

在 2018/8/30 16:04, Jan Beulich 写道:

On 30.08.18 at 06:06,  wrote:

Move reference of ol1e ahead or else we see below warning.

cc1: warnings being treated as errors
grant_table.c: In function 'replace_grant_pv_mapping':
grant_table.c:142: warning: 'ol1e.l1' may be used uninitialized in this
function

Signed-off-by: Zhenzhong Duan 


I'm fine with the change, but going forward please be more specific
with your descriptions: Whether or not warnings get emitted for a
certain piece of code often highly depends on compiler version. Let
us know which version you've run this into with. For the record, I'm
building on a wide range of gcc versions every once in a while, and
I've not come across this warning so far (or else I would have sent
a patch). The really old versions (4.1.x for example) can't really be
used anymore for certain other reasons anyway, and I think it's
about time we at least raise the bar a little (also for binutils, which
iirc was the actual problem with here, using a version matching in
original release time with gcc 4.1.x).


gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/boot: Disable IBRS in intr/nmi exit path at bootup stage

2018-03-26 Thread Zhenzhong Duan

在 2018/3/26 21:39, Jan Beulich 写道:

On 21.03.18 at 03:58,  wrote:

After reset, IBRS is disabled by processor, but a coming intr/nmi leave IBRS
enabled after their exit. It's not necessory for bootup code to run in low
performance with IBRS enabled.

On ORACLE X6-2(500GB/88 cpus, dom0 11GB/20 vcpus), we observed an 200s+ delay
in construct_dom0.

By initializing use_shadow_spec_ctrl with 1, IBRS is disabled in intr/nmi exit
path at bootup stage. Then delay in construct_dom0 is ~50s.


While I can certainly follow the argumentation, did you pay
attention to Andrew also modifying what you would call "bootup
code" in commit 7c508612f7 ("x86: Support indirect thunks from
assembly code")? That wasn't just a random change - we
specifically want it for the case of bringing up CPUs at runtime.
You'll need to be equally careful here, I think: Rather than
storing literal 1 (which should have been "true" anyway), you'll
want to store (system_state < SYS_STATE_active) or maybe
(system_state != SYS_STATE_active), at least when the CPU
being booted is a hyperthread of a CPU which is active already.

Make sense for me, thank you for explanation. I'll send a new patch.

Regards
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2] x86/boot: Disable IBRS in intr/nmi exit path at bootup stage

2018-03-26 Thread Zhenzhong Duan
After reset, IBRS is disabled by processor, but a coming intr/nmi leave IBRS
enabled after their exit. It's not necessory for bootup code to run in low
performance with IBRS enabled.

On ORACLE X6-2(500GB/88 cpus, dom0 11GB/20 vcpus), we observed an 200s+ delay
in construct_dom0.

By initializing use_shadow_spec_ctrl with the result of (system_state <
SYS_STATE_active), IBRS is disabled in intr/nmi exit path at bootup stage.
Then delay in construct_dom0 is ~50s.

When hot-onlining a CPU, we initialize IBRS early and set use_shadow_spec_ctrl
to false to avoid Branch Target Injection from sibling threads.

v2: Use (system_state < SYS_STATE_active) to initialize use_shadow_spec_ctrl
instead of literal 1 per Jan.

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
---
 xen/include/asm-x86/spec_ctrl.h |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
index 5ab4ff3..1672317 100644
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -32,8 +32,22 @@ extern uint8_t default_bti_ist_info;
 static inline void init_shadow_spec_ctrl_state(void)
 {
 struct cpu_info *info = get_cpu_info();
+uint32_t val = SPEC_CTRL_IBRS;
+
+/* Initialize IA32_SPEC_CTRL MSR for hotplugging cpu early */
+if ( system_state >= SYS_STATE_active )
+asm volatile (ALTERNATIVE(ASM_NOP3, "wrmsr", X86_FEATURE_XEN_IBRS_SET)
+  :: "a" (val), "c" (MSR_SPEC_CTRL), "d" (0) : "memory");
 
-info->shadow_spec_ctrl = info->use_shadow_spec_ctrl = 0;
+info->shadow_spec_ctrl = 0;
+/*
+ * We want to make sure we clear IBRS in interrupt exit path
+ * (DO_SPEC_CTRL_EXIT_TO_XEN) while dom0 is still booting to
+ * avoid unnecessary performance impact. As soon as dom0 has
+ * booted use_shadow_spec_ctrl will be cleared, for example,
+ * in idle routine.
+ */
+info->use_shadow_spec_ctrl = system_state < SYS_STATE_active;
 info->bti_ist_info = default_bti_ist_info;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/boot: Disable IBRS in intr/nmi exit path at bootup stage

2018-03-27 Thread Zhenzhong Duan


On 2018/3/27 16:52, Jan Beulich wrote:

On 27.03.18 at 06:52,  wrote:

After reset, IBRS is disabled by processor, but a coming intr/nmi leave IBRS
enabled after their exit. It's not necessory for bootup code to run in low
performance with IBRS enabled.

On ORACLE X6-2(500GB/88 cpus, dom0 11GB/20 vcpus), we observed an 200s+ delay
in construct_dom0.

By initializing use_shadow_spec_ctrl with the result of (system_state <
SYS_STATE_active), IBRS is disabled in intr/nmi exit path at bootup stage.
Then delay in construct_dom0 is ~50s.

When hot-onlining a CPU, we initialize IBRS early and set use_shadow_spec_ctrl
to false to avoid Branch Target Injection from sibling threads.

v2: Use (system_state < SYS_STATE_active) to initialize use_shadow_spec_ctrl
instead of literal 1 per Jan.


Please place revision information below the first --- marker.

Ok




--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -32,8 +32,22 @@ extern uint8_t default_bti_ist_info;
  static inline void init_shadow_spec_ctrl_state(void)
  {
  struct cpu_info *info = get_cpu_info();
+uint32_t val = SPEC_CTRL_IBRS;


Why do you need this variable?
This is a copy of the same code in spec_ctrl_enter_idle() and 
spec_ctrl_exit_idle().



+/* Initialize IA32_SPEC_CTRL MSR for hotplugging cpu early */
+if ( system_state >= SYS_STATE_active )
+asm volatile (ALTERNATIVE(ASM_NOP3, "wrmsr", X86_FEATURE_XEN_IBRS_SET)
+  :: "a" (val), "c" (MSR_SPEC_CTRL), "d" (0) : "memory");


I can see the point of doing this, but the title of the patch doesn't
cover it (I think this has been missing independent of your interrupt/
NMI paths consideration).

Could I make a seperate patch for above four lines?
Looks hard to describe all these in one title.


Further INIT# (unlike RESET#) doesn't clear the register, so you
may want/need to also clear the register in the
X86_FEATURE_XEN_IBRS_CLEAR case.
I did consider using ALTERNATIVE_2 here, so dumped the IA32_SPEC_CTRL 
msr's value just after the entry of init_shadow_spec_ctrl_state() for a 
hot-onlining CPU and it's zero, this is the X86_FEATURE_XEN_IBRS_SET case.

In X86_FEATURE_XEN_IBRS_CLEAR case, will IBRS ever be set?


Also you don't need ASM_NOP3 here after 4008c71d7a ("x86/alt:
Support for automatic padding calculations").

Yes


Additionally I think it would be better to keep low and high parts
of the value next to each other in the constraints, rather than
putting the MSR index in the middle.
Same copy from spec_ctrl_enter_idle() and spec_ctrl_exit_idle(), maybe 
it's better to have a seperate patch to fix all of them, including the 
variable val.



-info->shadow_spec_ctrl = info->use_shadow_spec_ctrl = 0;
+info->shadow_spec_ctrl = 0;
+/*
+ * We want to make sure we clear IBRS in interrupt exit path
+ * (DO_SPEC_CTRL_EXIT_TO_XEN) while dom0 is still booting to
+ * avoid unnecessary performance impact. As soon as dom0 has
+ * booted use_shadow_spec_ctrl will be cleared, for example,
+ * in idle routine.
+ */
+info->use_shadow_spec_ctrl = system_state < SYS_STATE_active;


I think the code overall would be more readable if you had just a
single condition (in if/else form).

Ok


And then there is the question of whether to use < / >= or
!= / == : In the resume case, not guest vCPU-s are active (yet),
so perhaps the latter would be better.

Ok


In any event please give Andrew a chance to reply before you
send another version, as he may have a different opinion and/or
other valuable input.

Ok, I'll wait Andrew's reply before go ahead.

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/alt: Fix wrong usage of as_max in OLDINSTR_2

2018-03-28 Thread Zhenzhong Duan

On 2018/3/28 13:52, Jan Beulich write:

Zhenzhong Duan <zhenzhong.d...@oracle.com> 03/28/18 4:03 AM >>>

When ALTERNATIVE_2 is used, we see below error during build.
"error: macro "as_max" requires 2 arguments, but only 1 given"

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>


See "[PATCH v2 2/6] x86: fix OLDINSTR_2()" sent on Mar 13th.

Oh, I see.

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/alt: Fix wrong usage of as_max in OLDINSTR_2

2018-03-27 Thread Zhenzhong Duan
When ALTERNATIVE_2 is used, we see below error during build.
"error: macro "as_max" requires 2 arguments, but only 1 given"

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
---
 xen/include/asm-x86/alternative.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/asm-x86/alternative.h 
b/xen/include/asm-x86/alternative.h
index 4803368..3b577c4 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -54,8 +54,8 @@ extern void alternative_instructions(void);
 
 #define OLDINSTR_2(oldinstr, n1, n2) \
 OLDINSTR(oldinstr,   \
- as_max((alt_repl_len(n1),   \
- alt_repl_len(n2)) "-" alt_orig_len))
+ as_max(alt_repl_len(n1),\
+ alt_repl_len(n2)) "-" alt_orig_len)
 
 #define ALTINSTR_ENTRY(feature, num)\
 " .long .LXEN%=_orig_s - .\n" /* label   */ \
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/boot: Disable IBRS in intr/nmi exit path at bootup stage

2018-04-02 Thread Zhenzhong Duan

On 2018/3/27 16:52, Jan Beulich wrote:

On 27.03.18 at 06:52,  wrote:

After reset, IBRS is disabled by processor, but a coming intr/nmi leave IBRS
enabled after their exit. It's not necessory for bootup code to run in low
performance with IBRS enabled.

On ORACLE X6-2(500GB/88 cpus, dom0 11GB/20 vcpus), we observed an 200s+ delay
in construct_dom0.

By initializing use_shadow_spec_ctrl with the result of (system_state <
SYS_STATE_active), IBRS is disabled in intr/nmi exit path at bootup stage.
Then delay in construct_dom0 is ~50s.

When hot-onlining a CPU, we initialize IBRS early and set use_shadow_spec_ctrl
to false to avoid Branch Target Injection from sibling threads.

v2: Use (system_state < SYS_STATE_active) to initialize use_shadow_spec_ctrl
instead of literal 1 per Jan.


Please place revision information below the first --- marker.


--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -32,8 +32,22 @@ extern uint8_t default_bti_ist_info;
  static inline void init_shadow_spec_ctrl_state(void)
  {
  struct cpu_info *info = get_cpu_info();
+uint32_t val = SPEC_CTRL_IBRS;


Why do you need this variable?


+/* Initialize IA32_SPEC_CTRL MSR for hotplugging cpu early */
+if ( system_state >= SYS_STATE_active )
+asm volatile (ALTERNATIVE(ASM_NOP3, "wrmsr", X86_FEATURE_XEN_IBRS_SET)
+  :: "a" (val), "c" (MSR_SPEC_CTRL), "d" (0) : "memory");


I can see the point of doing this, but the title of the patch doesn't
cover it (I think this has been missing independent of your interrupt/
NMI paths consideration).

Further INIT# (unlike RESET#) doesn't clear the register, so you
may want/need to also clear the register in the
X86_FEATURE_XEN_IBRS_CLEAR case.

Also you don't need ASM_NOP3 here after 4008c71d7a ("x86/alt:
Support for automatic padding calculations").

Additionally I think it would be better to keep low and high parts
of the value next to each other in the constraints, rather than
putting the MSR index in the middle.


-info->shadow_spec_ctrl = info->use_shadow_spec_ctrl = 0;
+info->shadow_spec_ctrl = 0;
+/*
+ * We want to make sure we clear IBRS in interrupt exit path
+ * (DO_SPEC_CTRL_EXIT_TO_XEN) while dom0 is still booting to
+ * avoid unnecessary performance impact. As soon as dom0 has
+ * booted use_shadow_spec_ctrl will be cleared, for example,
+ * in idle routine.
+ */
+info->use_shadow_spec_ctrl = system_state < SYS_STATE_active;


I think the code overall would be more readable if you had just a
single condition (in if/else form).

And then there is the question of whether to use < / >= or
!= / == : In the resume case, not guest vCPU-s are active (yet),
so perhaps the latter would be better.

In any event please give Andrew a chance to reply before you
send another version, as he may have a different opinion and/or
other valuable input.


Hi Andrew,

May I have your comments? If there is no further suggestions from you, 
I'll prepare to make the new version.


Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/boot: Disable IBRS in intr/nmi exit path at bootup stage

2018-03-20 Thread Zhenzhong Duan
After reset, IBRS is disabled by processor, but a coming intr/nmi leave IBRS
enabled after their exit. It's not necessory for bootup code to run in low
performance with IBRS enabled.

On ORACLE X6-2(500GB/88 cpus, dom0 11GB/20 vcpus), we observed an 200s+ delay
in construct_dom0.

By initializing use_shadow_spec_ctrl with 1, IBRS is disabled in intr/nmi exit
path at bootup stage. Then delay in construct_dom0 is ~50s.

Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
---
 xen/include/asm-x86/spec_ctrl.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
index 5ab4ff3..c619a80 100644
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -33,7 +33,8 @@ static inline void init_shadow_spec_ctrl_state(void)
 {
 struct cpu_info *info = get_cpu_info();
 
-info->shadow_spec_ctrl = info->use_shadow_spec_ctrl = 0;
+info->shadow_spec_ctrl = 0;
+info->use_shadow_spec_ctrl = 1;
 info->bti_ist_info = default_bti_ist_info;
 }
 
-- 
1.7.3

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 5/7] x86/xen: nopv parameter support for HVM guest

2019-06-25 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so nopv parameter is ignored
for PVH but not for HVM guest.

In order for nopv parameter to take effect for HVM guest, we need to
distinguish between PVH and HVM guest early in hypervisor detection
code. By moving the detection of PVH in xen_platform_hvm(),
xen_pvh_domain() could be used for that purpose.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: xen-devel@lists.xenproject.org
---
 arch/x86/xen/enlighten_hvm.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 7fcb4ea..26939e7 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "smp.h"
 
+extern bool nopv;
 static unsigned long shared_info_pfn;
 
 void xen_hvm_init_shared_info(void)
@@ -226,20 +227,24 @@ static uint32_t __init xen_platform_hvm(void)
if (xen_pv_domain())
return 0;
 
+#ifdef CONFIG_XEN_PVH
+   /* Test for PVH domain (PVH boot path taken overrides ACPI flags). */
+   if (!x86_platform.legacy.rtc && x86_platform.legacy.no_vga)
+   xen_pvh = true;
+#endif
+
+   if (!xen_pvh_domain() && nopv)
+   return 0;
+
return xen_cpuid_base();
 }
 
 static __init void xen_hvm_guest_late_init(void)
 {
 #ifdef CONFIG_XEN_PVH
-   /* Test for PVH domain (PVH boot path taken overrides ACPI flags). */
-   if (!xen_pvh &&
-   (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga))
+   if (!xen_pvh)
return;
 
-   /* PVH detected. */
-   xen_pvh = true;
-
/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
@@ -258,4 +263,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 3/7] x86: Add nopv parameter to disable PV extensions

2019-06-25 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. As such introduce the
'nopv' parameter that will do it.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

There is already 'xen_nopv' parameter for XEN platform but not for
others. 'xen_nopv' can then be removed with this change.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: xen-devel@lists.xenproject.org
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  3 +++
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..21e08af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..d75d2ea 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -52,6 +52,9 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
 extern enum x86_hypervisor_type x86_hyper_type;
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..337ff07 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -54,6 +54,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -61,6 +69,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index d96d563..880329f 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool __init jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 4/7] Revert "xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests."

2019-06-25 Thread Zhenzhong Duan
This reverts commit 8d693b911bb9c57009c24cb1772d205b84c7985c.

Instead we use an unified parameter 'nopv' for all the hypervisor
platforms.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: xen-devel@lists.xenproject.org
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 arch/x86/xen/enlighten_hvm.c| 12 +---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..d5c3dcc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5251,10 +5251,6 @@
Disables the ticketlock slowpath using Xen PV
optimizations.
 
-   xen_nopv[X86]
-   Disables the PV optimizations forcing the HVM guest to
-   run as generic HVM guest with no PV drivers.
-
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
to Xen, for use by other domains. Can be also changed 
at runtime
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..7fcb4ea 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,8 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
-static __init int xen_parse_nopv(char *arg)
-{
-   xen_nopv = true;
-   return 0;
-}
-early_param("xen_nopv", xen_parse_nopv);
-
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +223,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 3/6] x86: Add nopv parameter to disable PV extensions

2019-06-24 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. As such introduce the
'nopv' parameter that will do it.

There is already 'xen_nopv' parameter for XEN platform but not for
others. 'xen_nopv' can then be removed with this change.

Signed-off-by: Zhenzhong Duan 
Cc: xen-devel@lists.xenproject.org
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..b352f36 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,10 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..4f2c875 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -85,10 +85,21 @@ static void __init copy_array(const void *src, void 
*target, unsigned int size)
to[i] = from[i];
 }
 
+static bool nopv;
+static __init int xen_parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", xen_parse_nopv);
+
 void __init init_hypervisor_platform(void)
 {
const struct hypervisor_x86 *h;
 
+   if (nopv)
+   return;
+
h = detect_hypervisor_vendor();
 
if (!h)
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 4/6] Revert "xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests."

2019-06-24 Thread Zhenzhong Duan
This reverts commit 8d693b911bb9c57009c24cb1772d205b84c7985c.

Instead we use an unified parameter 'nopv' for all the hypervisor
platforms.

Signed-off-by: Zhenzhong Duan 
Cc: xen-devel@lists.xenproject.org
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 arch/x86/xen/enlighten_hvm.c| 12 +---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index b352f36..ebb75c1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5251,10 +5251,6 @@
Disables the ticketlock slowpath using Xen PV
optimizations.
 
-   xen_nopv[X86]
-   Disables the PV optimizations forcing the HVM guest to
-   run as generic HVM guest with no PV drivers.
-
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
to Xen, for use by other domains. Can be also changed 
at runtime
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..7fcb4ea 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,8 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
-static __init int xen_parse_nopv(char *arg)
-{
-   xen_nopv = true;
-   return 0;
-}
-early_param("xen_nopv", xen_parse_nopv);
-
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +223,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] x86: Add nopv parameter to disable PV extensions

2019-06-24 Thread Zhenzhong Duan


On 2019/6/24 21:18, Juergen Gross wrote:

On 23.06.19 15:01, Zhenzhong Duan wrote:

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. As such introduce the
'nopv' parameter that will do it.

There is already 'xen_nopv' parameter for XEN platform but not for
others. 'xen_nopv' can then be removed with this change.

Signed-off-by: Zhenzhong Duan 
Cc: xen-devel@lists.xenproject.org
---
  Documentation/admin-guide/kernel-parameters.txt |  4 
  arch/x86/kernel/cpu/hypervisor.c    | 11 +++
  2 files changed, 15 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt

index 138f666..b352f36 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,10 @@
  improve timer resolution at the expense of processing
  more timer interrupts.
  +    nopv=    [X86]
+    Disables the PV optimizations forcing the guest to run
+    as generic guest with no PV drivers.
+
  xirc2ps_cs=    [NET,PCMCIA]
  Format:
,[,[,[,]]]
diff --git a/arch/x86/kernel/cpu/hypervisor.c 
b/arch/x86/kernel/cpu/hypervisor.c

index 479ca47..4f2c875 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -85,10 +85,21 @@ static void __init copy_array(const void *src, 
void *target, unsigned int size)

  to[i] = from[i];
  }
  +static bool nopv;
+static __init int xen_parse_nopv(char *arg)
+{
+    nopv = true;
+    return 0;
+}
+early_param("nopv", xen_parse_nopv);
+
  void __init init_hypervisor_platform(void)
  {
  const struct hypervisor_x86 *h;
  +    if (nopv)
+    return;
+


Oh, this is no good idea.

There are guest types which just won't work without pv interfaces, like
Xen PV and Xen PVH. Letting them fail due to just a wrong command line
parameter is not nice, especially as the failure might be very hard to
track down to the issue for the user.

Yes, thanks for catching.


I guess you could add a "ignore_nopv" member to struct hypervisor_x86
set to true for the mentioned guest types and call the detect functions
only if nopv is false or ignore_nopv is true.


I think your suggestion is good, I'll rework it based on your suggestion.

Thanks

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 5/7] x86/xen: nopv parameter support for HVM guest

2019-06-26 Thread Zhenzhong Duan


On 2019/6/25 20:31, Juergen Gross wrote:

On 24.06.19 14:02, Zhenzhong Duan wrote:

PVH guest needs PV extentions to work, so nopv parameter is ignored
for PVH but not for HVM guest.

In order for nopv parameter to take effect for HVM guest, we need to
distinguish between PVH and HVM guest early in hypervisor detection
code. By moving the detection of PVH in xen_platform_hvm(),
xen_pvh_domain() could be used for that purpose.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: xen-devel@lists.xenproject.org
---
  arch/x86/xen/enlighten_hvm.c | 18 --
  1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 7fcb4ea..26939e7 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -25,6 +25,7 @@
  #include "mmu.h"
  #include "smp.h"
  +extern bool nopv;
  static unsigned long shared_info_pfn;
    void xen_hvm_init_shared_info(void)
@@ -226,20 +227,24 @@ static uint32_t __init xen_platform_hvm(void)
  if (xen_pv_domain())
  return 0;
  +#ifdef CONFIG_XEN_PVH
+    /* Test for PVH domain (PVH boot path taken overrides ACPI 
flags). */

+    if (!x86_platform.legacy.rtc && x86_platform.legacy.no_vga)
+    xen_pvh = true;


Sorry, this won't work, as ACPI tables are scanned only some time later.

Hmm, right. Thanks for point out.


You can test for xen_pvh being true here (for the case where the guest
has been booted via the Xen-PVH boot entry) and handle that case, but
the case of a PVH guest started via the normal boot entry (like via
grub2) and nopv specified is difficult. The only idea I have right now
would be to use another struct hypervisor_x86 for that case which will
only be used for Xen HVM/PVH _and_ nopv specified. It should be a copy
of the bare metal variant, but a special guest_late_init member issuing
a big fat warning in case PVH is being detected.


After that warning, I guess PVH will run into hang finally? If it's 
true, BUG() is better?


Adding another hypervisor_x86 is a bit redundant, I think of below change.

I'll test it tomorrow. But appreciate your suggestion whether it's 
feasible. Thanks


--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "smp.h"

+extern bool nopv;
 static unsigned long shared_info_pfn;

 void xen_hvm_init_shared_info(void)
@@ -221,11 +222,37 @@ bool __init xen_hvm_need_lapic(void)
    return true;
 }

+static __init void xen_hvm_nopv_guest_late_init(void)
+{
+#ifdef CONFIG_XEN_PVH
+   if (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga)
+   return;
+
+   /* PVH detected. */
+   xen_pvh = true;
+
+   printk(KERN_CRIT "nopv parameter isn't supported in PVH guest\n");
+   BUG();
+#endif
+}
+
+
 static uint32_t __init xen_platform_hvm(void)
 {
    if (xen_pv_domain())
    return 0;

+   if (xen_pvh_domain() && nopv)
+   {
+   /* guest booting via the Xen-PVH boot entry goes here */
+   printk(KERN_INFO "nopv parameter is ignored in PVH 
guest\n");

+   }
+   else if (nopv)
+   {
+   /* guest booting via normal boot entry (like via grub2) goes here */
+   x86_init.hyper.guest_late_init = 
xen_hvm_nopv_guest_late_init;

+   return 0;
+   }
    return xen_cpuid_base();
 }

@@ -258,4 +285,5 @@ static __init void xen_hvm_guest_late_init(void)
    .init.init_mem_mapping  = xen_hvm_init_mem_mapping,
    .init.guest_late_init   = xen_hvm_guest_late_init,
    .runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv    = true,
 };


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2] xen/pv: Fix a boot up hang revealed by int3 self test

2019-07-12 Thread Zhenzhong Duan
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[0.772876] general protection fault:  [#1] SMP NOPTI
[0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[0.772893] RIP: e030:int3_magic+0x0/0x7
[0.772905] RSP: 3507:82203e98 EFLAGS: 0246
[0.773334] Call Trace:
[0.773334]  alternative_instructions+0x3d/0x12e
[0.773334]  check_bugs+0x7c9/0x887
[0.773334]  ? __get_locked_pte+0x178/0x1f0
[0.773334]  start_kernel+0x4ff/0x535
[0.773334]  ? set_init_arg+0x55/0x55
[0.773334]  xen_start_kernel+0x571/0x57a

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 v2: fix up description.
---
 arch/x86/entry/entry_64.S| 1 -
 arch/x86/include/asm/traps.h | 2 +-
 arch/x86/xen/enlighten_pv.c  | 2 +-
 arch/x86/xen/xen-asm_64.S| 1 -
 4 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0ea4831..35a66fc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1176,7 +1176,6 @@ idtentry stack_segmentdo_stack_segment
has_error_code=1
 #ifdef CONFIG_XEN_PV
 idtentry xennmido_nmi  has_error_code=0
 idtentry xendebug  do_debughas_error_code=0
-idtentry xenint3   do_int3 has_error_code=0
 #endif
 
 idtentry general_protectiondo_general_protection   has_error_code=1
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 7d6f3f3..f2bd284 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -40,7 +40,7 @@
 asmlinkage void xen_divide_error(void);
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
-asmlinkage void xen_xenint3(void);
+asmlinkage void xen_int3(void);
 asmlinkage void xen_overflow(void);
 asmlinkage void xen_bounds(void);
 asmlinkage void xen_invalid_op(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..2138d69 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,7 +596,7 @@ struct trap_array_entry {
 
 static struct trap_array_entry trap_array[] = {
{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
+   { int3,xen_int3,true },
{ double_fault,xen_double_fault,true },
 #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 1e9ef0b..ebf610b 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,6 @@ xen_pv_trap divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap int3
-xen_pv_trap xenint3
 xen_pv_trap xennmi
 xen_pv_trap overflow
 xen_pv_trap bounds
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/pv: Fix a boot up hang triggered by int3 self test

2019-07-12 Thread Zhenzhong Duan

Sorry for the noise, it looks description is wrong.

This is not a double pop, but xen pv taking the path

with create_gap=0, I'll send a v2.

Zhenzhong

On 2019/7/11 12:47, Zhenzhong Duan wrote:

Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") reveals a bug in XEN PV int3 assemble code. There is
a double pop of register R11 and RCX currupting the exception
frame, one in xen_int3 and the other in xen_xenint3.

We see below hang at bootup:

general protection fault:  [#1] SMP NOPTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #6
RIP: e030:int3_magic+0x0/0x7
Call Trace:
  alternative_instructions+0x3d/0x12e
  check_bugs+0x7c9/0x887
  ?__get_locked_pte+0x178/0x1f0
  start_kernel+0x4ff/0x535
  ?set_init_arg+0x55/0x55
  xen_start_kernel+0x571/0x57a

Fix it by removing xen_xenint3.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
  arch/x86/include/asm/traps.h | 2 +-
  arch/x86/xen/enlighten_pv.c  | 2 +-
  arch/x86/xen/xen-asm_64.S| 1 -
  3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 7d6f3f3..f2bd284 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -40,7 +40,7 @@
  asmlinkage void xen_divide_error(void);
  asmlinkage void xen_xennmi(void);
  asmlinkage void xen_xendebug(void);
-asmlinkage void xen_xenint3(void);
+asmlinkage void xen_int3(void);
  asmlinkage void xen_overflow(void);
  asmlinkage void xen_bounds(void);
  asmlinkage void xen_invalid_op(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..2138d69 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,7 +596,7 @@ struct trap_array_entry {
  
  static struct trap_array_entry trap_array[] = {

{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
+   { int3,xen_int3,true },
{ double_fault,xen_double_fault,true },
  #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 1e9ef0b..ebf610b 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,6 @@ xen_pv_trap divide_error
  xen_pv_trap debug
  xen_pv_trap xendebug
  xen_pv_trap int3
-xen_pv_trap xenint3
  xen_pv_trap xennmi
  xen_pv_trap overflow
  xen_pv_trap bounds


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 0/5] misc fixes to PV extensions code

2019-07-12 Thread Zhenzhong Duan
Hi,

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking, etc)
we want to disable PV extensions. We have xen_nopv for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter set across all PV guest implementations is a better
choice.

To achieve this introduce a new 'nopv' parameter which is usable by
most of PV guest implementation. Due to the limitation of some PV
guests(XEN PV, XEN PVH and jailhouse), 'nopv' is ignored for XEN PV
, jailhouse and XEN PVH if booting via Xen-PVH boot entry. If booting
via normal boot entry(like grub2), PVH guest has to panic itself
currently.

While analyzing the PV guest code one bug were found and fixed.
(Patches 1). It can be applied independent of the functional
changes, but is kept in the series as the functional changes
depend on them.

For compatibility reason, "xen_nopv" is keeped and mapped to "nopv",
this way also avoids an issue with xen_nopv when booting PVH guest.

Build test passes with CONFIG_HYPERVISOR_GUEST enable and disabled.
I didn't get env to test with jailhouse and Hyperv, the others work
as expected.

v7:
PATCH4 a new added patch prerequite for PATCH5(previously PATCH4)
PATCH5 rewrite the code based on Boris's suggestion. I compare the one
to update interface function one-by-one and the one to modify all
x86_hyper_xen_hvm's ops to immediately return if nopv is set, both
have same effect and the first looks smarter, so choose the 1st one.

v6:
PATCH3 add Reviewed-by
PATCH4 remove unnecessory xen_hvm_nopv_guest_late_init() per Boris

v5:
PATCH2:
update patch description per Boris
add declaration of nopv variable in arch/x86/include/asm/hypervisor.h
which will be used in PATCH3 and PATCH4

PATCH3 update xen_parse_nopv() per Boris
PATCH4 add nopv=false per Boris
Combine PATCH5 into PATCH3


v4:
PATCH5 a new patch to add 'xen_nopv' back per Boris

v3:
Remove some unrelated patches from patchset as suggested by Tglx

PATCH1 unchanged
PATCH2 add Reviewed-by
PATCH3 add Reviewed-by
PATCH4 rewrite the patch as Jgross found an issue in old patch,
description is also updated.

v2:
PATCH3 use 'ignore_nopv' for PVH/PV guest as suggested by Jgross.
PATCH5 new added one, specifically for HVM guest

Thanks
Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 2/5] x86: Add "nopv" parameter to disable PV extensions

2019-07-12 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  4 
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 22 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index f1c433d..dbfe9c2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5271,6 +5271,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 50a30f6..f7b4c53 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -53,8 +53,12 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
+extern bool nopv;
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
 static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 87e39ad..7eaad41 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -58,6 +58,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -65,6 +73,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index 6857b45..3ad34f0 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool __init jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 4/5] x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable

2019-07-12 Thread Zhenzhong Duan
.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch report warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/hypervisor.h | 8 
 arch/x86/kernel/cpu/hypervisor.c  | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index f7b4c53..e41cbf2 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -58,6 +58,14 @@ struct hypervisor_x86 {
bool ignore_nopv;
 };
 
+extern const struct hypervisor_x86 x86_hyper_vmware;
+extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_kvm;
+extern const struct hypervisor_x86 x86_hyper_jailhouse;
+extern const struct hypervisor_x86 x86_hyper_acrn;
+extern struct hypervisor_x86 x86_hyper_xen_hvm;
+
 extern bool nopv;
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 7eaad41..553bfbf 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -26,14 +26,6 @@
 #include 
 #include 
 
-extern const struct hypervisor_x86 x86_hyper_vmware;
-extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen_pv;
-extern const struct hypervisor_x86 x86_hyper_xen_hvm;
-extern const struct hypervisor_x86 x86_hyper_kvm;
-extern const struct hypervisor_x86 x86_hyper_jailhouse;
-extern const struct hypervisor_x86 x86_hyper_acrn;
-
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
 #ifdef CONFIG_XEN_PV
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/pv: Fix a boot up hang revealed by int3 self test

2019-07-12 Thread Zhenzhong Duan


On 2019/7/12 20:06, Peter Zijlstra wrote:

On Thu, Jul 11, 2019 at 04:15:21PM +0800, Zhenzhong Duan wrote:

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..2138d69 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,7 +596,7 @@ struct trap_array_entry {
  
  static struct trap_array_entry trap_array[] = {

{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
+   { int3,xen_int3,true },
{ double_fault,xen_double_fault,true },
  #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },

I'm confused on the purpose of trap_array[], could you elucidate me?


Used to replace trap handler addresses by Xen specific ones and sanity check

if there's an unexpected IST-using fault handler.



The sole user seems to be get_trap_addr() and that talks about ISTs, but
#BP isn't an IST anymore, so why does it have ist_okay=true?


Oh, yes, I missed that boolean, thanks. I'll try ist_okey=false for int3 
and test it tomorrow.


Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/pv: Fix a boot up hang revealed by int3 self test

2019-07-12 Thread Zhenzhong Duan

On 2019/7/12 21:09, Peter Zijlstra wrote:

On Fri, Jul 12, 2019 at 09:04:22PM +0800, Zhenzhong Duan wrote:

On 2019/7/12 20:06, Peter Zijlstra wrote:

On Thu, Jul 11, 2019 at 04:15:21PM +0800, Zhenzhong Duan wrote:

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..2138d69 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,7 +596,7 @@ struct trap_array_entry {
   static struct trap_array_entry trap_array[] = {
{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
+   { int3,xen_int3,true },
{ double_fault,xen_double_fault,true },
   #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },

I'm confused on the purpose of trap_array[], could you elucidate me?

Used to replace trap handler addresses by Xen specific ones and sanity check

if there's an unexpected IST-using fault handler.

git grep xen_int3, failed me. Where does that symbol come from?


Generated by "xen_pv_trap int3" in arch/x86/xen/xen-asm_64.S

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 5/5] x86/xen: Add "nopv" support for HVM guest

2019-07-12 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Move xen_platform_hvm() after xen_hvm_guest_late_init() to avoid compile
error.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/x86_init.h |  2 ++
 arch/x86/kernel/x86_init.c  |  4 ++--
 arch/x86/xen/enlighten_hvm.c| 45 -
 3 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b85a7c5..ac09341 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -301,6 +301,8 @@ struct x86_apic_ops {
 extern void x86_early_init_platform_quirks(void);
 extern void x86_init_noop(void);
 extern void x86_init_uint_noop(unsigned int unused);
+extern bool bool_x86_init_noop(void);
+extern void x86_op_int_noop(int cpu);
 extern bool x86_pnpbios_disabled(void);
 
 #endif
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 50a2b49..1bef687 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -29,8 +29,8 @@ void x86_init_noop(void) { }
 void __init x86_init_uint_noop(unsigned int unused) { }
 static int __init iommu_init_noop(void) { return 0; }
 static void iommu_shutdown_noop(void) { }
-static bool __init bool_x86_init_noop(void) { return false; }
-static void x86_op_int_noop(int cpu) { }
+bool __init bool_x86_init_noop(void) { return false; }
+void x86_op_int_noop(int cpu) { }
 
 /*
  * The platform setup functions are preset with the default functions
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 1756cf7..e138f7d 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,14 +231,6 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
-static uint32_t __init xen_platform_hvm(void)
-{
-   if (xen_pv_domain())
-   return 0;
-
-   return xen_cpuid_base();
-}
-
 static __init void xen_hvm_guest_late_init(void)
 {
 #ifdef CONFIG_XEN_PVH
@@ -250,6 +242,9 @@ static __init void xen_hvm_guest_late_init(void)
/* PVH detected. */
xen_pvh = true;
 
+   if (nopv)
+   panic("\"nopv\" and \"xen_nopv\" parameters are unsupported in 
PVH guest.");
+
/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
@@ -259,7 +254,38 @@ static __init void xen_hvm_guest_late_init(void)
 #endif
 }
 
-const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
+static uint32_t __init xen_platform_hvm(void)
+{
+   uint32_t xen_domain = xen_cpuid_base();
+   struct x86_hyper_init *h = _hyper_xen_hvm.init;
+
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv && xen_domain) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*
+* Use interface functions for bare hardware if nopv,
+* xen_hvm_guest_late_init is an exception as we need to
+* detect PVH and panic there.
+*/
+   h->init_platform = x86_init_noop;
+   h->x2apic_available = bool_x86_init_noop;
+   h->init_mem_mapping = x86_init_noop;
+   h->init_after_bootmem = x86_init_noop;
+   h->guest_late_init = xen_hvm_guest_late_init;
+   x86_hyper_xen_hvm.runtime.pin_vcpu = x86_op_int_noop;
+   }
+   return xen_domain;
+}
+
+struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {
.name   = "Xen HVM",
.detect = xen_platform_hvm,
.type   = X86_HYPER_XEN_HVM,
@@ -268,4 +294,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 3/5] xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete

2019-07-12 Thread Zhenzhong Duan
Clean up unnecessory code after that operation.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt |  2 ++
 arch/x86/xen/enlighten_hvm.c| 12 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index dbfe9c2..c3f3e01 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5257,6 +5257,8 @@
xen_nopv[X86]
Disables the PV optimizations forcing the HVM guest to
run as generic HVM guest with no PV drivers.
+   This option is obsoleted by the "nopv" option, which
+   has equivalent effect for XEN platform.
 
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..1756cf7 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,18 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
 static __init int xen_parse_nopv(char *arg)
 {
-   xen_nopv = true;
-   return 0;
+   pr_notice("\"xen_nopv\" is deprecated, please use \"nopv\" instead\n");
+
+   if (xen_cpuid_base())
+   nopv = true;
+   return 0;
 }
 early_param("xen_nopv", xen_parse_nopv);
 
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +233,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 1/5] x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

2019-07-12 Thread Zhenzhong Duan
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/xen/hypervisor.h | 6 +++---
 arch/x86/xen/enlighten_hvm.c  | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 39171b3..42e1245 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -44,14 +44,14 @@ static inline uint32_t xen_cpuid_base(void)
 }
 
 #ifdef CONFIG_XEN
-extern bool xen_hvm_need_lapic(void);
+extern bool __init xen_hvm_need_lapic(void);
 
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return xen_hvm_need_lapic();
 }
 #else
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return (xen_cpuid_base() != 0);
 }
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 0e75642..ac4943c 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -218,7 +218,7 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
-bool xen_hvm_need_lapic(void)
+bool __init xen_hvm_need_lapic(void)
 {
if (xen_nopv)
return false;
@@ -230,7 +230,6 @@ bool xen_hvm_need_lapic(void)
return false;
return true;
 }
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
 
 static uint32_t __init xen_platform_hvm(void)
 {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen/pv: Fix a boot up hang triggered by int3 self test

2019-07-11 Thread Zhenzhong Duan
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") reveals a bug in XEN PV int3 assemble code. There is
a double pop of register R11 and RCX currupting the exception
frame, one in xen_int3 and the other in xen_xenint3.

We see below hang at bootup:

general protection fault:  [#1] SMP NOPTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #6
RIP: e030:int3_magic+0x0/0x7
Call Trace:
 alternative_instructions+0x3d/0x12e
 check_bugs+0x7c9/0x887
 ?__get_locked_pte+0x178/0x1f0
 start_kernel+0x4ff/0x535
 ?set_init_arg+0x55/0x55
 xen_start_kernel+0x571/0x57a

Fix it by removing xen_xenint3.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/traps.h | 2 +-
 arch/x86/xen/enlighten_pv.c  | 2 +-
 arch/x86/xen/xen-asm_64.S| 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 7d6f3f3..f2bd284 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -40,7 +40,7 @@
 asmlinkage void xen_divide_error(void);
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
-asmlinkage void xen_xenint3(void);
+asmlinkage void xen_int3(void);
 asmlinkage void xen_overflow(void);
 asmlinkage void xen_bounds(void);
 asmlinkage void xen_invalid_op(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..2138d69 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,7 +596,7 @@ struct trap_array_entry {
 
 static struct trap_array_entry trap_array[] = {
{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
+   { int3,xen_int3,true },
{ double_fault,xen_double_fault,true },
 #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 1e9ef0b..ebf610b 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,6 @@ xen_pv_trap divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap int3
-xen_pv_trap xenint3
 xen_pv_trap xennmi
 xen_pv_trap overflow
 xen_pv_trap bounds
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 4/4] x86/xen: Add "nopv" support for HVM guest

2019-07-08 Thread Zhenzhong Duan


On 2019/7/8 21:46, Boris Ostrovsky wrote:

On 7/7/19 5:15 AM, Zhenzhong Duan wrote:
  
+static uint32_t __init xen_platform_hvm(void)

+{
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*/
+   x86_init.hyper.guest_late_init = xen_hvm_guest_late_init;
+   return 0;

After staring at this some more I don't think we can do this.
Hypervisor-specific code should not muck with with x86_init.hyper, it's
layering violation. That's what init_hypervisor_platform() is for.

Ok, I see.


So we may have to create another hypervisor_x86 ops structure for Xen
nopv (which I don't find very appealing TBH). Or update
x86_hyper_xen_hvm and return xen_cpuid_base() instead of zero (but then
you'd need to update all these structs from __initconst to _init or some
such). Or something else.


I choose the second. I made below changes, not clear if it's also a 
layering violation


as use x86_init.hyper as source for memcpy. I choose memcpy instead of 
updating


functions pointers one-by-one because if x86_hyper_init interface 
functions extends


in the future we need no changes. Let me know if you prefer updating 
pointers directly.


This isn't a formal patch for review, just want to get answer of above 
question.


diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c

index 1756cf7..8416640d 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,14 +231,6 @@ bool __init xen_hvm_need_lapic(void)
    return true;
 }

-static uint32_t __init xen_platform_hvm(void)
-{
-   if (xen_pv_domain())
-   return 0;
-
-   return xen_cpuid_base();
-}
-
 static __init void xen_hvm_guest_late_init(void)
 {
 #ifdef CONFIG_XEN_PVH
@@ -250,6 +242,9 @@ static __init void xen_hvm_guest_late_init(void)
    /* PVH detected. */
    xen_pvh = true;

+   if (nopv)
+   panic("\"nopv\" and \"xen_nopv\" parameters are 
unsupported in PVH guest.");

+
    /* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
    if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
    acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
@@ -259,7 +254,36 @@ static __init void xen_hvm_guest_late_init(void)
 #endif
 }

-const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
+static uint32_t __init xen_platform_hvm(void)
+{
+   uint32_t xen_domain = xen_cpuid_base();
+   struct x86_hyper_init *h = _hyper_xen_hvm.init;
+
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv && xen_domain) {
+   /*
+    * Guest booting via normal boot entry (like via grub2) goes
+    * here.
+    *
+    * Use interface functions for bare hardware if nopv,
+    * xen_hvm_guest_late_init is an exception as we need to
+    * detect PVH and panic there.
+    */
+   memcpy(h, (void *)_init.hyper, sizeof(x86_init.hyper));
+   memcpy(_hyper_xen_hvm.runtime, (void 
*)_platform.hyper,

+  sizeof(x86_platform.hyper));
+   h->guest_late_init = xen_hvm_guest_late_init;
+   }
+   return xen_domain;
+}
+
+struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {
    .name   = "Xen HVM",
    .detect = xen_platform_hvm,
    .type   = X86_HYPER_XEN_HVM,
@@ -268,4 +292,5 @@ static __init void xen_hvm_guest_late_init(void)
    .init.init_mem_mapping  = xen_hvm_init_mem_mapping,
    .init.guest_late_init   = xen_hvm_guest_late_init,
    .runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv    = true,
 };


Thanks

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 3/4] xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete

2019-07-03 Thread Zhenzhong Duan
Clean up unnecessory code after that operation.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt |  2 ++
 arch/x86/xen/enlighten_hvm.c| 12 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..8ab34a1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5254,6 +5254,8 @@
xen_nopv[X86]
Disables the PV optimizations forcing the HVM guest to
run as generic HVM guest with no PV drivers.
+   This option is obsoleted by the "nopv" option, which
+   has equivalent effect for XEN platform.
 
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..1756cf7 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,18 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
 static __init int xen_parse_nopv(char *arg)
 {
-   xen_nopv = true;
-   return 0;
+   pr_notice("\"xen_nopv\" is deprecated, please use \"nopv\" instead\n");
+
+   if (xen_cpuid_base())
+   nopv = true;
+   return 0;
 }
 early_param("xen_nopv", xen_parse_nopv);
 
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +233,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 1/4] x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

2019-07-03 Thread Zhenzhong Duan
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/xen/hypervisor.h | 6 +++---
 arch/x86/xen/enlighten_hvm.c  | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 39171b3..42e1245 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -44,14 +44,14 @@ static inline uint32_t xen_cpuid_base(void)
 }
 
 #ifdef CONFIG_XEN
-extern bool xen_hvm_need_lapic(void);
+extern bool __init xen_hvm_need_lapic(void);
 
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return xen_hvm_need_lapic();
 }
 #else
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return (xen_cpuid_base() != 0);
 }
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 0e75642..ac4943c 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -218,7 +218,7 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
-bool xen_hvm_need_lapic(void)
+bool __init xen_hvm_need_lapic(void)
 {
if (xen_nopv)
return false;
@@ -230,7 +230,6 @@ bool xen_hvm_need_lapic(void)
return false;
return true;
 }
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
 
 static uint32_t __init xen_platform_hvm(void)
 {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 4/4] x86/xen: Add "nopv" support for HVM guest

2019-07-03 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time.

To handle that case, add a new function xen_hvm_nopv_guest_late_init()
to detect PVH at a late time and panic itself if nopv enabled for a
PVH guest.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/xen/enlighten_hvm.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 1756cf7..09a010a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,11 +231,37 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
+static __init void xen_hvm_nopv_guest_late_init(void)
+{
+#ifdef CONFIG_XEN_PVH
+   if (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga)
+   return;
+
+   /* PVH detected. */
+   xen_pvh = true;
+
+   panic("\"nopv\" and \"xen_nopv\" parameters are unsupported in PVH 
guest.");
+#endif
+}
+
+
 static uint32_t __init xen_platform_hvm(void)
 {
if (xen_pv_domain())
return 0;
 
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*/
+   x86_init.hyper.guest_late_init = xen_hvm_nopv_guest_late_init;
+   return 0;
+   }
return xen_cpuid_base();
 }
 
@@ -268,4 +294,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 0/4] misc fixes to PV extentions code

2019-07-03 Thread Zhenzhong Duan
Hi,

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking, etc)
we want to disable PV extensions. We have xen_nopv for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter set across all PV guest implementations is a better
choice.

To achieve this introduce a new 'nopv' parameter which is usable by
most of PV guest implementation. Due to the limitation of some PV
guests(XEN PV, XEN PVH and jailhouse), 'nopv' is ignored for XEN PV
, jailhouse and XEN PVH if booting via Xen-PVH boot entry. If booting
via normal boot entry(like grub2), PVH guest has to panic itself
currently.

While analyzing the PV guest code one bug were found and fixed.
(Patches 1). It can be applied independent of the functional
changes, but is kept in the series as the functional changes
depend on them.

For compatibility reason, "xen_nopv" is keeped and mapped to "nopv",
this way also avoids an issue with xen_nopv when booting PVH guest.

Build test passes with CONFIG_HYPERVISOR_GUEST enable and disabled.
I didn't get env to test with jailhouse and Hyperv, the others work
as expected.

v5:
PATCH2:
update patch description per Boris
add declaration of nopv variable in arch/x86/include/asm/hypervisor.h
which will be used in PATCH3 and PATCH4

PATCH3 update xen_parse_nopv() per Boris
PATCH4 add nopv=false per Boris
Combine PATCH5 into PATCH3


v4:
PATCH5 a new patch to add 'xen_nopv' back per Boris

v3:
Remove some unrelated patches from patchset as suggested by Tglx

PATCH1 unchanged
PATCH2 add Reviewed-by
PATCH3 add Reviewed-by
PATCH4 rewrite the patch as Jgross found an issue in old patch,
description is also updated.



v2:
PATCH3 use 'ignore_nopv' for PVH/PV guest as suggested by Jgross.
PATCH5 new added one, specifically for HVM guest

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 2/4] x86: Add "nopv" parameter to disable PV extensions

2019-07-03 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  4 
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 22 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..21e08af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..00240b0 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -52,8 +52,12 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
+extern bool nopv;
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
 static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..337ff07 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -54,6 +54,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -61,6 +69,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index 1b2ee55..c52c4105 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/5] xen: Add 'xen_nopv' parameter back for backward compatibility

2019-07-02 Thread Zhenzhong Duan


On 2019/7/3 2:13, Boris Ostrovsky wrote:

On 7/1/19 1:19 AM, Zhenzhong Duan wrote:

Map 'xen_nopv' to 'nopv' and mark 'xen_nopv' obsolete in
kernel-parameters.txt

I am not sure we want patch #3, why not do everything in a single patch?


Thanks for review. I'll fix all those based on your comments.

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 4/4] x86/xen: Add "nopv" support for HVM guest

2019-07-08 Thread Zhenzhong Duan


On 2019/7/5 21:06, Boris Ostrovsky wrote:

On 7/2/19 9:19 PM, Zhenzhong Duan wrote:

PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time.

To handle that case, add a new function xen_hvm_nopv_guest_late_init()
to detect PVH at a late time and panic itself if nopv enabled for a
PVH guest.

Signed-off-by: Zhenzhong Duan
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
---
  arch/x86/xen/enlighten_hvm.c | 27 +++
  1 file changed, 27 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 1756cf7..09a010a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,11 +231,37 @@ bool __init xen_hvm_need_lapic(void)
return true;
  }
  
+static __init void xen_hvm_nopv_guest_late_init(void)

+{
+#ifdef CONFIG_XEN_PVH
+   if (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga)
+   return;
+
+   /* PVH detected. */
+   xen_pvh = true;
+
+   panic("\"nopv\" and \"xen_nopv\" parameters are unsupported in PVH 
guest.");
+#endif
+}

Can't all of this be done in xen_hvm_guest_late_init()? It has the same
tests already and it seems to me you should be able to panic from there.


Indeed, I didn't realize this, thanks for pointing, I'll fix it.

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 4/4] x86/xen: Add "nopv" support for HVM guest

2019-07-08 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Move xen_platform_hvm() after xen_hvm_guest_late_init() to avoid compile
error.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/xen/enlighten_hvm.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 1756cf7..7e1c75f 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,14 +231,6 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
-static uint32_t __init xen_platform_hvm(void)
-{
-   if (xen_pv_domain())
-   return 0;
-
-   return xen_cpuid_base();
-}
-
 static __init void xen_hvm_guest_late_init(void)
 {
 #ifdef CONFIG_XEN_PVH
@@ -250,6 +242,9 @@ static __init void xen_hvm_guest_late_init(void)
/* PVH detected. */
xen_pvh = true;
 
+   if (nopv)
+   panic("\"nopv\" and \"xen_nopv\" parameters are unsupported in 
PVH guest.");
+
/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
@@ -259,6 +254,26 @@ static __init void xen_hvm_guest_late_init(void)
 #endif
 }
 
+static uint32_t __init xen_platform_hvm(void)
+{
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*/
+   x86_init.hyper.guest_late_init = xen_hvm_guest_late_init;
+   return 0;
+   }
+   return xen_cpuid_base();
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
.name   = "Xen HVM",
.detect = xen_platform_hvm,
@@ -268,4 +283,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 1/4] x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

2019-07-08 Thread Zhenzhong Duan
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/xen/hypervisor.h | 6 +++---
 arch/x86/xen/enlighten_hvm.c  | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 39171b3..42e1245 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -44,14 +44,14 @@ static inline uint32_t xen_cpuid_base(void)
 }
 
 #ifdef CONFIG_XEN
-extern bool xen_hvm_need_lapic(void);
+extern bool __init xen_hvm_need_lapic(void);
 
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return xen_hvm_need_lapic();
 }
 #else
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return (xen_cpuid_base() != 0);
 }
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 0e75642..ac4943c 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -218,7 +218,7 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
-bool xen_hvm_need_lapic(void)
+bool __init xen_hvm_need_lapic(void)
 {
if (xen_nopv)
return false;
@@ -230,7 +230,6 @@ bool xen_hvm_need_lapic(void)
return false;
return true;
 }
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
 
 static uint32_t __init xen_platform_hvm(void)
 {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 3/4] xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete

2019-07-08 Thread Zhenzhong Duan
Clean up unnecessory code after that operation.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt |  2 ++
 arch/x86/xen/enlighten_hvm.c| 12 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..8ab34a1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5254,6 +5254,8 @@
xen_nopv[X86]
Disables the PV optimizations forcing the HVM guest to
run as generic HVM guest with no PV drivers.
+   This option is obsoleted by the "nopv" option, which
+   has equivalent effect for XEN platform.
 
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..1756cf7 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,18 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
 static __init int xen_parse_nopv(char *arg)
 {
-   xen_nopv = true;
-   return 0;
+   pr_notice("\"xen_nopv\" is deprecated, please use \"nopv\" instead\n");
+
+   if (xen_cpuid_base())
+   nopv = true;
+   return 0;
 }
 early_param("xen_nopv", xen_parse_nopv);
 
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +233,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 0/4] misc fixes to PV extensions code

2019-07-08 Thread Zhenzhong Duan
Hi,

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking, etc)
we want to disable PV extensions. We have xen_nopv for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter set across all PV guest implementations is a better
choice.

To achieve this introduce a new 'nopv' parameter which is usable by
most of PV guest implementation. Due to the limitation of some PV
guests(XEN PV, XEN PVH and jailhouse), 'nopv' is ignored for XEN PV
, jailhouse and XEN PVH if booting via Xen-PVH boot entry. If booting
via normal boot entry(like grub2), PVH guest has to panic itself
currently.

While analyzing the PV guest code one bug were found and fixed.
(Patches 1). It can be applied independent of the functional
changes, but is kept in the series as the functional changes
depend on them.

For compatibility reason, "xen_nopv" is keeped and mapped to "nopv",
this way also avoids an issue with xen_nopv when booting PVH guest.

Build test passes with CONFIG_HYPERVISOR_GUEST enable and disabled.
I didn't get env to test with jailhouse and Hyperv, the others work
as expected.

v6:
PATCH3 add Reviewed-by
PATCH4 remove unnecessory xen_hvm_nopv_guest_late_init() per Boris

v5:
PATCH2:
update patch description per Boris
add declaration of nopv variable in arch/x86/include/asm/hypervisor.h
which will be used in PATCH3 and PATCH4

PATCH3 update xen_parse_nopv() per Boris
PATCH4 add nopv=false per Boris
Combine PATCH5 into PATCH3


v4:
PATCH5 a new patch to add 'xen_nopv' back per Boris

v3:
Remove some unrelated patches from patchset as suggested by Tglx

PATCH1 unchanged
PATCH2 add Reviewed-by
PATCH3 add Reviewed-by
PATCH4 rewrite the patch as Jgross found an issue in old patch,
description is also updated.



v2:
PATCH3 use 'ignore_nopv' for PVH/PV guest as suggested by Jgross.
PATCH5 new added one, specifically for HVM guest

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 2/4] x86: Add "nopv" parameter to disable PV extensions

2019-07-08 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  4 
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 22 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..21e08af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..00240b0 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -52,8 +52,12 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
+extern bool nopv;
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
 static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..337ff07 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -54,6 +54,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -61,6 +69,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index 1b2ee55..c52c4105 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3] xen/pv: Fix a boot up hang revealed by int3 self test

2019-07-15 Thread Zhenzhong Duan
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in int3 exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[0.772876] general protection fault:  [#1] SMP NOPTI
[0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[0.772893] RIP: e030:int3_magic+0x0/0x7
[0.772905] RSP: 3507:82203e98 EFLAGS: 0246
[0.773334] Call Trace:
[0.773334]  alternative_instructions+0x3d/0x12e
[0.773334]  check_bugs+0x7c9/0x887
[0.773334]  ? __get_locked_pte+0x178/0x1f0
[0.773334]  start_kernel+0x4ff/0x535
[0.773334]  ? set_init_arg+0x55/0x55
[0.773334]  xen_start_kernel+0x571/0x57a

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack. To convert back to "normal" looking exceptions,
the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.

E.g. Extracting 'xen_pv_trap xenint3' we have:
xen_xenint3:
 pop %rcx;
 pop %r11;
 jmp xenint3

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Andrew Cooper 
---
 bootup test pass with PV guest.

 v3: set ist_okay to false for int3 per PeterZ
 add Andrew's comments to patch description

 v2: fix up description.
---
 arch/x86/entry/entry_64.S| 1 -
 arch/x86/include/asm/traps.h | 2 +-
 arch/x86/xen/enlighten_pv.c  | 2 +-
 arch/x86/xen/xen-asm_64.S| 1 -
 4 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0ea4831..35a66fc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1176,7 +1176,6 @@ idtentry stack_segmentdo_stack_segment
has_error_code=1
 #ifdef CONFIG_XEN_PV
 idtentry xennmido_nmi  has_error_code=0
 idtentry xendebug  do_debughas_error_code=0
-idtentry xenint3   do_int3 has_error_code=0
 #endif
 
 idtentry general_protectiondo_general_protection   has_error_code=1
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 7d6f3f3..f2bd284 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -40,7 +40,7 @@
 asmlinkage void xen_divide_error(void);
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
-asmlinkage void xen_xenint3(void);
+asmlinkage void xen_int3(void);
 asmlinkage void xen_overflow(void);
 asmlinkage void xen_bounds(void);
 asmlinkage void xen_invalid_op(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..30c14cb 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -596,12 +596,12 @@ struct trap_array_entry {
 
 static struct trap_array_entry trap_array[] = {
{ debug,   xen_xendebug,true },
-   { int3,xen_xenint3, true },
{ double_fault,xen_double_fault,true },
 #ifdef CONFIG_X86_MCE
{ machine_check,   xen_machine_check,   true },
 #endif
{ nmi, xen_xennmi,  true },
+   { int3,xen_int3,false },
{ overflow,xen_overflow,false },
 #ifdef CONFIG_IA32_EMULATION
{ entry_INT80_compat,  xen_entry_INT80_compat,  false },
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 1e9ef0b..ebf610b 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,6 @@ xen_pv_trap divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap int3
-xen_pv_trap xenint3
 xen_pv_trap xennmi
 xen_pv_trap overflow
 xen_pv_trap bounds
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/pv: Fix a boot up hang revealed by int3 self test

2019-07-14 Thread Zhenzhong Duan


On 2019/7/12 22:06, Andrew Cooper wrote:

On 11/07/2019 03:15, Zhenzhong Duan wrote:

Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[0.772876] general protection fault:  [#1] SMP NOPTI
[0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[0.772893] RIP: e030:int3_magic+0x0/0x7
[0.772905] RSP: 3507:82203e98 EFLAGS: 0246
[0.773334] Call Trace:
[0.773334]  alternative_instructions+0x3d/0x12e
[0.773334]  check_bugs+0x7c9/0x887
[0.773334]  ? __get_locked_pte+0x178/0x1f0
[0.773334]  start_kernel+0x4ff/0x535
[0.773334]  ? set_init_arg+0x55/0x55
[0.773334]  xen_start_kernel+0x571/0x57a

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack.

To convert back to "normal" looking exceptions, the xen thunks do `pop
%rcx; pop %r11; jmp do_*`...

I will add this to commit message.



diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0ea4831..35a66fc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1176,7 +1176,6 @@ idtentry stack_segmentdo_stack_segment
has_error_code=1
  #ifdef CONFIG_XEN_PV
  idtentry xennmi   do_nmi  has_error_code=0
  idtentry xendebug do_debughas_error_code=0
-idtentry xenint3   do_int3 has_error_code=0
  #endif

What is confusing is why there are 3 extra magic versions here.  At a
guess, I'd say something to do with IST settings (given the vectors),
but I don't see why #DB/#BP would need to be different in the first
place.  (NMI sure, but that is more to do with the crazy hoops needing
to be jumped through to keep native functioning safely.)


xenint3 will be removed in this patch safely as it don't use IST now.

But debug and nmi need paranoid_entry which will read MSR_GS_BASE to 
determine


if swapgs is needed. I guess PV guesting running in ring3 will #GP with 
swapgs?



Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 4/4] x86/xen: Add 'nopv' support for HVM guest

2019-07-01 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so 'nopv' parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore 'nopv' parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time.

To handle that case, add a new function xen_hvm_nopv_guest_late_init()
to detect PVH at a late time and panic itself if 'nopv' enabled for a
PVH guest.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/xen/enlighten_hvm.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 7fcb4ea..340dff8 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "smp.h"
 
+extern bool nopv;
 static unsigned long shared_info_pfn;
 
 void xen_hvm_init_shared_info(void)
@@ -221,11 +222,36 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
+static __init void xen_hvm_nopv_guest_late_init(void)
+{
+#ifdef CONFIG_XEN_PVH
+   if (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga)
+   return;
+
+   /* PVH detected. */
+   xen_pvh = true;
+
+   panic("nopv parameter isn't supported in PVH guest.");
+#endif
+}
+
+
 static uint32_t __init xen_platform_hvm(void)
 {
if (xen_pv_domain())
return 0;
 
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("nopv parameter is ignored in PVH guest\n");
+   } else if (nopv) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*/
+   x86_init.hyper.guest_late_init = xen_hvm_nopv_guest_late_init;
+   return 0;
+   }
return xen_cpuid_base();
 }
 
@@ -258,4 +284,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/4] x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

2019-07-01 Thread Zhenzhong Duan
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/xen/hypervisor.h | 6 +++---
 arch/x86/xen/enlighten_hvm.c  | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 39171b3..42e1245 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -44,14 +44,14 @@ static inline uint32_t xen_cpuid_base(void)
 }
 
 #ifdef CONFIG_XEN
-extern bool xen_hvm_need_lapic(void);
+extern bool __init xen_hvm_need_lapic(void);
 
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return xen_hvm_need_lapic();
 }
 #else
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return (xen_cpuid_base() != 0);
 }
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 0e75642..ac4943c 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -218,7 +218,7 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
-bool xen_hvm_need_lapic(void)
+bool __init xen_hvm_need_lapic(void)
 {
if (xen_nopv)
return false;
@@ -230,7 +230,6 @@ bool xen_hvm_need_lapic(void)
return false;
return true;
 }
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
 
 static uint32_t __init xen_platform_hvm(void)
 {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 4/5] x86/xen: Add 'nopv' support for HVM guest

2019-07-01 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so 'nopv' parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore 'nopv' parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time.

To handle that case, add a new function xen_hvm_nopv_guest_late_init()
to detect PVH at a late time and panic itself if 'nopv' enabled for a
PVH guest.

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/xen/enlighten_hvm.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 7fcb4ea..340dff8 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "smp.h"
 
+extern bool nopv;
 static unsigned long shared_info_pfn;
 
 void xen_hvm_init_shared_info(void)
@@ -221,11 +222,36 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
+static __init void xen_hvm_nopv_guest_late_init(void)
+{
+#ifdef CONFIG_XEN_PVH
+   if (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga)
+   return;
+
+   /* PVH detected. */
+   xen_pvh = true;
+
+   panic("nopv parameter isn't supported in PVH guest.");
+#endif
+}
+
+
 static uint32_t __init xen_platform_hvm(void)
 {
if (xen_pv_domain())
return 0;
 
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("nopv parameter is ignored in PVH guest\n");
+   } else if (nopv) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*/
+   x86_init.hyper.guest_late_init = xen_hvm_nopv_guest_late_init;
+   return 0;
+   }
return xen_cpuid_base();
 }
 
@@ -258,4 +284,5 @@ static __init void xen_hvm_guest_late_init(void)
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 1/5] x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

2019-07-01 Thread Zhenzhong Duan
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/xen/hypervisor.h | 6 +++---
 arch/x86/xen/enlighten_hvm.c  | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 39171b3..42e1245 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -44,14 +44,14 @@ static inline uint32_t xen_cpuid_base(void)
 }
 
 #ifdef CONFIG_XEN
-extern bool xen_hvm_need_lapic(void);
+extern bool __init xen_hvm_need_lapic(void);
 
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return xen_hvm_need_lapic();
 }
 #else
-static inline bool xen_x2apic_para_available(void)
+static inline bool __init xen_x2apic_para_available(void)
 {
return (xen_cpuid_base() != 0);
 }
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 0e75642..ac4943c 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -218,7 +218,7 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
-bool xen_hvm_need_lapic(void)
+bool __init xen_hvm_need_lapic(void)
 {
if (xen_nopv)
return false;
@@ -230,7 +230,6 @@ bool xen_hvm_need_lapic(void)
return false;
return true;
 }
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
 
 static uint32_t __init xen_platform_hvm(void)
 {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 2/5] x86: Add nopv parameter to disable PV extensions

2019-07-01 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. As such introduce the
'nopv' parameter that will do it.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

There is already 'xen_nopv' parameter for XEN platform but not for
others. 'xen_nopv' can then be removed with this change.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  3 +++
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..21e08af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..d75d2ea 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -52,6 +52,9 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
 extern enum x86_hypervisor_type x86_hyper_type;
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..337ff07 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -54,6 +54,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -61,6 +69,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index d96d563..880329f 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool __init jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 3/5] Revert "xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests."

2019-07-01 Thread Zhenzhong Duan
This reverts commit 8d693b911bb9c57009c24cb1772d205b84c7985c.

Instead we use an unified parameter 'nopv' for all the hypervisor
platforms.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 arch/x86/xen/enlighten_hvm.c| 12 +---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..d5c3dcc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5251,10 +5251,6 @@
Disables the ticketlock slowpath using Xen PV
optimizations.
 
-   xen_nopv[X86]
-   Disables the PV optimizations forcing the HVM guest to
-   run as generic HVM guest with no PV drivers.
-
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
to Xen, for use by other domains. Can be also changed 
at runtime
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..7fcb4ea 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,8 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
-static __init int xen_parse_nopv(char *arg)
-{
-   xen_nopv = true;
-   return 0;
-}
-early_param("xen_nopv", xen_parse_nopv);
-
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +223,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 0/5] misc fixes to PV extentions code

2019-07-01 Thread Zhenzhong Duan
Hi,

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking, etc)
we want to disable PV extensions. We have xen_nopv for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter set across all PV guest implementations is a better
choice.

To achieve this introduce a new 'nopv' parameter which is usable by
most of PV guest implementation. Due to the limitation of some PV
guests(XEN PV, XEN PVH and jailhouse), 'nopv' is ignored for XEN PV
, jailhouse and XEN PVH if booting via Xen-PVH boot entry. If booting
via normal boot entry(like grub2), PVH guest has to panic itself
currently.

While analyzing the PV guest code one bug were found and fixed.
(Patches 1). It can be applied independent of the functional
changes, but is kept in the series as the functional changes
depend on them.

As I see there is an issue with xen_nopv when booting PVH guest, so
I revert 'xen_nopv' in one patch and map it back to 'nopv' in the
following patch. This is the easier way to fix the issue and easy
for review.

I didn't get env to test with jailhouse and Hyperv, the others work
as expected.

v4:
PATCH5 a new patch to add 'xen_nopv' back per Boris

v3:
Remove some unrelated patches from patchset as suggested by Tglx

PATCH1 unchanged
PATCH2 add Reviewed-by
PATCH3 add Reviewed-by
PATCH4 rewrite the patch as Jgross found an issue in old patch,
description is also updated.



v2:
PATCH3 use 'ignore_nopv' for PVH/PV guest as suggested by Jgross.
PATCH5 new added one, specifically for HVM guest

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 5/5] xen: Add 'xen_nopv' parameter back for backward compatibility

2019-07-01 Thread Zhenzhong Duan
Map 'xen_nopv' to 'nopv' and mark 'xen_nopv' obsolete in
kernel-parameters.txt

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt | 6 ++
 arch/x86/xen/enlighten_hvm.c| 7 +++
 2 files changed, 13 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index d5c3dcc..34eb323 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5264,6 +5264,12 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   xen_nopv[X86]
+   Disables the PV optimizations forcing the HVM guest to
+   run as generic HVM guest with no PV drivers.
+   This option is obsoleted by the "nopv" option, which
+   has equivalent effect for XEN platform.
+
nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
Disables the PV optimizations forcing the guest to run
as generic guest with no PV drivers. Currently support
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 340dff8..5cdd608 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -211,6 +211,13 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
+static __init int xen_parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("xen_nopv", xen_parse_nopv);
+
 bool __init xen_hvm_need_lapic(void)
 {
if (xen_pv_domain())
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/4] x86: Add nopv parameter to disable PV extensions

2019-07-01 Thread Zhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. As such introduce the
'nopv' parameter that will do it.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

There is already 'xen_nopv' parameter for XEN platform but not for
others. 'xen_nopv' can then be removed with this change.

Suggested-by: Juergen Gross 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Jan Kiszka 
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +
 arch/x86/include/asm/hypervisor.h   |  3 +++
 arch/x86/kernel/cpu/hypervisor.c| 11 +++
 arch/x86/kernel/jailhouse.c |  1 +
 arch/x86/xen/enlighten_pv.c |  1 +
 5 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 138f666..21e08af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5268,6 +5268,11 @@
improve timer resolution at the expense of processing
more timer interrupts.
 
+   nopv=   [X86,XEN,KVM,HYPER_V,VMWARE]
+   Disables the PV optimizations forcing the guest to run
+   as generic guest with no PV drivers. Currently support
+   XEN HVM, KVM, HYPER_V and VMWARE guest.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..d75d2ea 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -52,6 +52,9 @@ struct hypervisor_x86 {
 
/* runtime callbacks */
struct x86_hyper_runtime runtime;
+
+   /* ignore nopv parameter */
+   bool ignore_nopv;
 };
 
 extern enum x86_hypervisor_type x86_hyper_type;
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..337ff07 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -54,6 +54,14 @@
 enum x86_hypervisor_type x86_hyper_type;
 EXPORT_SYMBOL(x86_hyper_type);
 
+bool __initdata nopv;
+static __init int parse_nopv(char *arg)
+{
+   nopv = true;
+   return 0;
+}
+early_param("nopv", parse_nopv);
+
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
@@ -61,6 +69,9 @@
uint32_t pri, max_pri = 0;
 
for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
+   if (unlikely(nopv) && !(*p)->ignore_nopv)
+   continue;
+
pri = (*p)->detect();
if (pri > max_pri) {
max_pri = pri;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index d96d563..880329f 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -217,4 +217,5 @@ static bool __init jailhouse_x2apic_available(void)
.detect = jailhouse_detect,
.init.init_platform = jailhouse_init_platform,
.init.x2apic_available  = jailhouse_x2apic_available,
+   .ignore_nopv= true,
 };
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 4722ba2..5d16824 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1463,4 +1463,5 @@ static uint32_t __init xen_platform_pv(void)
.detect = xen_platform_pv,
.type   = X86_HYPER_XEN_PV,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 3/4] Revert "xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests."

2019-07-01 Thread Zhenzhong Duan
This reverts commit 8d693b911bb9c57009c24cb1772d205b84c7985c.

Instead we use an unified parameter 'nopv' for all the hypervisor
platforms.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Juergen Gross 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 arch/x86/xen/enlighten_hvm.c| 12 +---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..d5c3dcc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5251,10 +5251,6 @@
Disables the ticketlock slowpath using Xen PV
optimizations.
 
-   xen_nopv[X86]
-   Disables the PV optimizations forcing the HVM guest to
-   run as generic HVM guest with no PV drivers.
-
xen_scrub_pages=[XEN]
Boolean option to control scrubbing pages before giving 
them back
to Xen, for use by other domains. Can be also changed 
at runtime
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index ac4943c..7fcb4ea 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -210,18 +210,8 @@ static void __init xen_hvm_guest_init(void)
 #endif
 }
 
-static bool xen_nopv;
-static __init int xen_parse_nopv(char *arg)
-{
-   xen_nopv = true;
-   return 0;
-}
-early_param("xen_nopv", xen_parse_nopv);
-
 bool __init xen_hvm_need_lapic(void)
 {
-   if (xen_nopv)
-   return false;
if (xen_pv_domain())
return false;
if (!xen_hvm_domain())
@@ -233,7 +223,7 @@ bool __init xen_hvm_need_lapic(void)
 
 static uint32_t __init xen_platform_hvm(void)
 {
-   if (xen_pv_domain() || xen_nopv)
+   if (xen_pv_domain())
return 0;
 
return xen_cpuid_base();
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 0/4] misc fixes to PV extentions code

2019-07-01 Thread Zhenzhong Duan
Hi,

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking, etc)
we want to disable PV extensions. We have xen_nopv for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter set across all PV guest implementations is a better
choice.

To achieve this introduce a new 'nopv' parameter which is usable by
most of PV guest implementation. Due to the limitation of some PV
guests(XEN PV, XEN PVH and jailhouse), 'nopv' is ignored for XEN PV
, jailhouse and XEN PVH if booting via Xen-PVH boot entry. If booting
via normal boot entry(like grub2), PVH guest has to panic itself
currently.

While analyzing the PV guest code one bug were found and fixed.
(Patches 1). It can be applied independent of the functional
changes, but is kept in the series as the functional changes
depend on them.

As I didn't got further comments from Jgross about remove 'xen_nopv',
so I presume he has no objection to that. In fact, I think no product
environment will use 'xen_nopv' for performance. Jgross, if you changed
mind to preserve it finally, just let me know.

I didn't get env to test with jailhouse and Hyperv, the others work
as expected.

v3:
Remove some unrelated patches from patchset as suggested by Tglx

PATCH1 unchanged
PATCH2 add Reviewed-by
PATCH3 add Reviewed-by
PATCH4 rewrite the patch as Jgross found an issue in old patch,
description is also updated.



v2:
PATCH3 use 'ignore_nopv' for PVH/PV guest as suggested by Jgross.
PATCH5 new added one, specifically for HVM guest

Thanks
Zhenzhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/4] Revert "xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests."

2019-07-01 Thread Zhenzhong Duan


On 2019/7/2 11:48, Boris Ostrovsky wrote:

On Mon, Jul 01, 2019 at 10:20:27AM +0800, Zhenzhong Duan wrote:

This reverts commit 8d693b911bb9c57009c24cb1772d205b84c7985c.

Instead we use an unified parameter 'nopv' for all the hypervisor
platforms.

Signed-off-by: Zhenzhong Duan
Reviewed-by: Juergen Gross
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
---
  Documentation/admin-guide/kernel-parameters.txt |  4 
  arch/x86/xen/enlighten_hvm.c| 12 +---
  2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 21e08af..d5c3dcc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5251,10 +5251,6 @@
Disables the ticketlock slowpath using Xen PV
optimizations.
  
-	xen_nopv	[X86]

-   Disables the PV optimizations forcing the HVM guest to
-   run as generic HVM guest with no PV drivers.
-

So someone upgrades the kernel and suddenly things work differently?

At least there should be a warning that the option has been replaced
with 'nopv' (but I would actually keep this option working as well).


OK, I'll add new patch to map xen_nopv to nopv. So if 'xen_nopv' is 
used, we go


to the path for 'nopv'. I will be same effect.

Thanks

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 4/5] x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable

2019-07-16 Thread Zhenzhong Duan


On 2019/7/16 18:57, Juergen Gross wrote:

On 11.07.19 14:02, Zhenzhong Duan wrote:

.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch report warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
  arch/x86/include/asm/hypervisor.h | 8 
  arch/x86/kernel/cpu/hypervisor.c  | 8 
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h

index f7b4c53..e41cbf2 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -58,6 +58,14 @@ struct hypervisor_x86 {
  bool ignore_nopv;
  };
  +extern const struct hypervisor_x86 x86_hyper_vmware;
+extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_kvm;
+extern const struct hypervisor_x86 x86_hyper_jailhouse;
+extern const struct hypervisor_x86 x86_hyper_acrn;
+extern struct hypervisor_x86 x86_hyper_xen_hvm;


This should either stay const and be changed in patch 5, or you
should adapt its definition in arch/x86/xen/enlighten_hvm.c in
this patch.


Ok, thanks for your suggestion.

I'll choose 2nd opinion as I don't need to change descripton with that.

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v8 4/5] x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable

2019-07-16 Thread Zhenzhong Duan
.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch reports warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/hypervisor.h | 8 
 arch/x86/kernel/cpu/hypervisor.c  | 8 
 arch/x86/xen/enlighten_hvm.c  | 2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/hypervisor.h 
b/arch/x86/include/asm/hypervisor.h
index f7b4c53..e41cbf2 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -58,6 +58,14 @@ struct hypervisor_x86 {
bool ignore_nopv;
 };
 
+extern const struct hypervisor_x86 x86_hyper_vmware;
+extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_kvm;
+extern const struct hypervisor_x86 x86_hyper_jailhouse;
+extern const struct hypervisor_x86 x86_hyper_acrn;
+extern struct hypervisor_x86 x86_hyper_xen_hvm;
+
 extern bool nopv;
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 7eaad41..553bfbf 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -26,14 +26,6 @@
 #include 
 #include 
 
-extern const struct hypervisor_x86 x86_hyper_vmware;
-extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen_pv;
-extern const struct hypervisor_x86 x86_hyper_xen_hvm;
-extern const struct hypervisor_x86 x86_hyper_kvm;
-extern const struct hypervisor_x86 x86_hyper_jailhouse;
-extern const struct hypervisor_x86 x86_hyper_acrn;
-
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
 #ifdef CONFIG_XEN_PV
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 1756cf7..b671983 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -259,7 +259,7 @@ static __init void xen_hvm_guest_late_init(void)
 #endif
 }
 
-const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
+struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {
.name   = "Xen HVM",
.detect = xen_platform_hvm,
.type   = X86_HYPER_XEN_HVM,
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v8 5/5] x86/xen: Add "nopv" support for HVM guest

2019-07-16 Thread Zhenzhong Duan
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Remove static from bool_x86_init_noop/x86_op_int_noop so they could be
used globally. Move xen_platform_hvm() after xen_hvm_guest_late_init()
to avoid compile error.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
---
 arch/x86/include/asm/x86_init.h |  2 ++
 arch/x86/kernel/x86_init.c  |  4 ++--
 arch/x86/xen/enlighten_hvm.c| 43 +
 3 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b85a7c5..ac09341 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -301,6 +301,8 @@ struct x86_apic_ops {
 extern void x86_early_init_platform_quirks(void);
 extern void x86_init_noop(void);
 extern void x86_init_uint_noop(unsigned int unused);
+extern bool bool_x86_init_noop(void);
+extern void x86_op_int_noop(int cpu);
 extern bool x86_pnpbios_disabled(void);
 
 #endif
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 50a2b49..1bef687 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -29,8 +29,8 @@ void x86_init_noop(void) { }
 void __init x86_init_uint_noop(unsigned int unused) { }
 static int __init iommu_init_noop(void) { return 0; }
 static void iommu_shutdown_noop(void) { }
-static bool __init bool_x86_init_noop(void) { return false; }
-static void x86_op_int_noop(int cpu) { }
+bool __init bool_x86_init_noop(void) { return false; }
+void x86_op_int_noop(int cpu) { }
 
 /*
  * The platform setup functions are preset with the default functions
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index b671983..e138f7d 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -231,14 +231,6 @@ bool __init xen_hvm_need_lapic(void)
return true;
 }
 
-static uint32_t __init xen_platform_hvm(void)
-{
-   if (xen_pv_domain())
-   return 0;
-
-   return xen_cpuid_base();
-}
-
 static __init void xen_hvm_guest_late_init(void)
 {
 #ifdef CONFIG_XEN_PVH
@@ -250,6 +242,9 @@ static __init void xen_hvm_guest_late_init(void)
/* PVH detected. */
xen_pvh = true;
 
+   if (nopv)
+   panic("\"nopv\" and \"xen_nopv\" parameters are unsupported in 
PVH guest.");
+
/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
@@ -259,6 +254,37 @@ static __init void xen_hvm_guest_late_init(void)
 #endif
 }
 
+static uint32_t __init xen_platform_hvm(void)
+{
+   uint32_t xen_domain = xen_cpuid_base();
+   struct x86_hyper_init *h = _hyper_xen_hvm.init;
+
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv && xen_domain) {
+   /*
+* Guest booting via normal boot entry (like via grub2) goes
+* here.
+*
+* Use interface functions for bare hardware if nopv,
+* xen_hvm_guest_late_init is an exception as we need to
+* detect PVH and panic there.
+*/
+   h->init_platform = x86_init_noop;
+   h->x2apic_available = bool_x86_init_noop;
+   h->init_mem_mapping = x86_init_noop;
+   h->init_after_bootmem = x86_init_noop;
+   h->guest_late_init = xen_hvm_guest_late_init;
+   x86_hyper_xen_hvm.runtime.pin_vcpu = x86_op_int_noop;
+   }
+   return xen_domain;
+}
+
 struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {
.name   = "Xen HVM",
.detect = xen_platform_hvm,
@@ -268,4 +294,5 @@ struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {
.init.init_mem_mapping  = xen_hvm_init_mem_mapping,
.init.guest_late_init   = xen_hvm_guest_late_init,
.runtime.pin_vcpu   = xen_pin_vcpu,
+   .ignore_nopv= true,
 };
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 4/4] x86/xen: Add "nopv" support for HVM guest

2019-07-09 Thread Zhenzhong Duan

On 2019/7/9 22:54, Boris Ostrovsky wrote:

On 7/9/19 12:20 AM, Zhenzhong Duan wrote:

-const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
+static uint32_t __init xen_platform_hvm(void)
+{
+   uint32_t xen_domain = xen_cpuid_base();
+   struct x86_hyper_init *h = _hyper_xen_hvm.init;
+
+   if (xen_pv_domain())
+   return 0;
+
+   if (xen_pvh_domain() && nopv) {
+   /* Guest booting via the Xen-PVH boot entry goes here */
+   pr_info("\"nopv\" parameter is ignored in PVH guest\n");
+   nopv = false;
+   } else if (nopv && xen_domain) {
+   /*
+    * Guest booting via normal boot entry (like via
grub2) goes
+    * here.
+    *
+    * Use interface functions for bare hardware if nopv,
+    * xen_hvm_guest_late_init is an exception as we need to
+    * detect PVH and panic there.
+    */
+   memcpy(h, (void *)_init.hyper,
sizeof(x86_init.hyper));


And this worked? I'd think it would fail since h points to RO section.

Yes, I have below changes in the patch.

-const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
+struct hypervisor_x86 x86_hyper_xen_hvm __initdata = {





+   memcpy(_hyper_xen_hvm.runtime, (void
*)_platform.hyper,
+  sizeof(x86_platform.hyper));
+   h->guest_late_init = xen_hvm_guest_late_init;


To me this still doesn't look right --- you are making assumptions about
x86_platform/x86_init.hyper and I don't think you can assume they have
not been set to point to another hypervisor, for example.


You mean copy_array() calls in init_hypervisor_platform()? But that 
happens after


detect_hypervisor_vendor() shoose out the prefered hypervisor. In detect 
stage,


x86_platform/x86_init.hyper has default value for bare hardware, or I 
missed something?


Just realized I can use memset to zero instead of memcpy which looks 
more rational.




Would modifying all x86_hyper_xen_hvm's ops (except, I guess,
xen_hvm_guest_late_init()) to immediately return if nopv is set work?


I think so,  Let me try it.

Zhenzhong


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel