Re: [Xen-devel] PV DRM doesn't work without auto_translated_physmap feature in Dom0

2020-01-07 Thread Oleksandr Andrushchenko
On 1/6/20 10:38 AM, Jürgen Groß wrote:
> On 06.01.20 08:56, Santucco wrote:
>> Hello,
>>
>> I’m trying to use vdispl interface from PV OS, it doesn’t work.
>> Configuration details:
>>  Xen 4.12.1
>>  Dom0: Linux 4.20.17-gentoo #13 SMP Sat Dec 28 11:12:24 MSK 2019 
>> x86_64 Intel(R) Celeron(R) CPU N3050 @ 1.60GHz GenuineIntel GNU/Linux
>>  DomU: x86 Plan9, PV
>>  displ_be as a backend for vdispl and vkb
>>
>> when VM starts, displ_be reports about an error:
>> gnttab: error: ioctl DMABUF_EXP_FROM_REFS failed: Invalid argument 
>> (displ_be.log:221)
>>
>> related Dom0 output is:
>> [  191.579278] Cannot provide dma-buf: use_ptemode 1 
>> (dmesg.create.log:123)
>
> This seems to be a limitation of the xen dma-buf driver. It was written
> for being used on ARM initially where PV is not available.
This is true: we never tried or targeted PV domains with this
implementation, so if there is a need for that, someone has to look into
a proper implementation for PV...
>
> CC-ing Oleksandr Andrushchenko who is the author of that driver. He
> should be able to tell us what would be needed to enable PV dom0.
>
> Depending on your use case it might be possible to use PVH dom0, but
> support for this mode is "experimental" only and some features are not
> yet working.
>
Well, one possible workaround is to drop the zero-copy use case. Zero-copy
is why the display backend tries to create dma-bufs from the grants passed
by the guest domain, and fails with "Cannot provide dma-buf:
use_ptemode 1". Without it, the display backend copies the incoming
frames and never touches the DMABUF_EXP_FROM_REFS ioctl.
To do so, just disable zero-copy when building the backend [1].
>
> Juergen
>
[1] https://github.com/xen-troops/displ_be/blob/master/CMakeLists.txt#L12
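
Before rebuilding the backend, the symptom discussed in this thread can be checked mechanically. What follows is only a minimal sketch: the function name `zcopy_unsupported` is hypothetical, and it matches nothing more than the kernel message quoted above.

```sh
# Hypothetical helper: succeed (exit 0) if the kernel log on stdin contains
# the dma-buf failure quoted in this thread, i.e. zero-copy cannot be used.
zcopy_unsupported() {
    grep -q 'Cannot provide dma-buf: use_ptemode 1'
}

# Example usage on a live Dom0 (reading dmesg may need root):
#   if dmesg | zcopy_unsupported; then
#       echo "rebuild displ_be with zero-copy disabled, see [1]"
#   fi
```

The check is deliberately narrow: it only recognises this one message and says nothing about other reasons why DMABUF_EXP_FROM_REFS might fail.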
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] XEN crash and double fault when doing cpu online/offline

2020-01-07 Thread Jürgen Groß

On 08.01.20 06:50, Tao Xu wrote:

> Hi,
>
> When I use xen-hptool cpu-offline/cpu-online to take the CPUs in a
> socket offline and back online, using the following script:
>
> for((j=48;j<=95;j++));
> do
>    xen-hptool cpu-offline $j
> done
>
> for((j=48;j<=95;j++));
> do
>    xen-hptool cpu-online $j
> done
>
> Xen crashes when the CPUs come back online. I am using upstream Xen
> (0dd92688) and have tried for many days; it still crashes. But if I only
> offline/online CPUs 48~59, Xen does not crash. The bug can be reproduced
> when we offline/online most of the CPUs in a socket. Interestingly, when
> we use the following script:
>
> for((j=48;j<=95;j++));
> do
>    xen-hptool cpu-offline $j
>    xen-hptool cpu-online $j
> done
>
> Xen does not crash either. Is there a bug in sched_credit2?
>
> The crash message is as follows:
>
> (XEN) Adding cpu 77 to runqueue 1
> (XEN) Adding cpu 78 to runqueue 1
> (XEN) Adding cpu 79 to runqueue 1
> (XEN) Adding cpu 80 to runqueue 1
> (X(ENXE) N) *** DOUBLE FAULT ***
> (XEN) Assertion 'debug->cpu == smp_processor_id()' failed at spinlock.c:88
> (XEN) [ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]
> (XEN) Debugging connection not set up.
> (XEN) CPU:    48
> (XEN) [ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]
> (XEN) CPU:    0
> (XEN) RIP:    e008:[] _spin_unlock+0x40/0x42


So the original problem causes a double fault, but spinlock debugging
causes a subsequent panic.

Can you please retry the tests with the attached patch? It should
result in diagnostic data related to the real problem.


Juergen
>From 1e84395de9ae532a479706ff74f50040f25041fa Mon Sep 17 00:00:00 2001
From: Juergen Gross 
Date: Wed, 8 Jan 2020 08:43:53 +0100
Subject: [PATCH] xen/spinlock: disable spinlock debugging in
 console_force_unlock()

console_force_unlock() might result in subsequent ASSERT() triggering
when CONFIG_DEBUG_LOCKS was active. Avoid that by calling
spin_debug_disable() in console_force_unlock() and make the spinlock
debug assertions trigger only if spin_debug was active.

Signed-off-by: Juergen Gross 
---
 xen/common/spinlock.c  | 2 +-
 xen/drivers/char/console.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index ed69f0a4d2..43c3a437e8 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -85,7 +85,7 @@ static void got_lock(union lock_debug *debug)
 
 static void rel_lock(union lock_debug *debug)
 {
-    ASSERT(debug->cpu == smp_processor_id());
+    ASSERT(atomic_read(&spin_debug) > 0 || debug->cpu == smp_processor_id());
     debug->cpu = SPINLOCK_NO_CPU;
 }
 
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index b31d789a5d..4bcbbfa7d6 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -1077,6 +1077,7 @@ void console_unlock_recursive_irqrestore(unsigned long flags)
 void console_force_unlock(void)
 {
     watchdog_disable();
+    spin_debug_disable();
     spin_lock_init(&console_lock);
     serial_force_unlock(sercon_handle);
     console_locks_busted = 1;
-- 
2.16.4
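
The diagnosis above separates the original double fault from the follow-up spinlock assertion. When comparing console logs before and after applying the patch, it can help to list just those markers in order. A minimal sketch, where the helper name `fault_markers` is hypothetical and the patterns are taken only from the log lines quoted in this thread:

```sh
# Hypothetical helper: print, with line numbers, the fault and assertion
# markers found in a Xen console log read from stdin.
fault_markers() {
    grep -nE 'DOUBLE FAULT|Assertion .* failed|Panic on CPU'
}

# Example usage (xen-console.log is a placeholder file name):
#   fault_markers < xen-console.log
```

Because output from several CPUs can interleave on the console, a marker may be split across characters and missed, so an empty result is not proof that no fault occurred.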


[Xen-devel] [qemu-mainline test] 145777: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145777 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145777/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl            1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit1   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-pvshim 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 build-i386-libvirt            1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow    1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvshim    1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm        1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
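
For reports of this size, pulling out only the blocking regressions can be scripted. A minimal sketch, assuming whitespace-separated columns as osstest emits them (job, step number, step name, result, status) and the "REGR." marker seen in the rows above; the helper name `blocking_regressions` is hypothetical:

```sh
# Hypothetical helper: print the job names of blocking regressions from an
# osstest report on stdin. Relies on the "REGR." marker and on the job name
# being the first whitespace-separated field of the row.
blocking_regressions() {
    awk '/REGR\./ { print $1 }'
}

# Example usage (report.txt is a placeholder for a saved report):
#   blocking_regressions < report.txt
```

Rows marked "blocked n/a" are skipped because they never ran; they depend on one of the failed build jobs.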
 

[Xen-devel] [BUG] XEN crash and double fault when doing cpu online/offline

2020-01-07 Thread Tao Xu

Hi,

When I use xen-hptool cpu-offline/cpu-online to take the CPUs in a
socket offline and back online, using the following script:


for((j=48;j<=95;j++));
do
  xen-hptool cpu-offline $j
done

for((j=48;j<=95;j++));
do
  xen-hptool cpu-online $j
done

Xen crashes when the CPUs come back online. I am using upstream Xen
(0dd92688) and have tried for many days; it still crashes. But if I only
offline/online CPUs 48~59, Xen does not crash. The bug can be reproduced
when we offline/online most of the CPUs in a socket. Interestingly, when
we use the following script:


for((j=48;j<=95;j++));
do
  xen-hptool cpu-offline $j
  xen-hptool cpu-online $j
done

Xen does not crash either. Is there a bug in sched_credit2?
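
The two patterns described above can be wrapped in one parameterised script so that different CPU ranges are easy to compare. This is only a sketch: the `HPTOOL` variable and the function names are assumptions added here to allow a dry run, and the only commands used are the two `xen-hptool` subcommands from the report.

```sh
#!/bin/sh
# HPTOOL is an assumption added for dry runs; it defaults to xen-hptool.
HPTOOL="${HPTOOL:-xen-hptool}"

# Offline every CPU in [first, last], then online them all again
# (the pattern reported to crash for CPUs 48..95).
cycle_batch() {
    first=$1; last=$2
    j=$first
    while [ "$j" -le "$last" ]; do
        $HPTOOL cpu-offline "$j" || return 1
        j=$((j + 1))
    done
    j=$first
    while [ "$j" -le "$last" ]; do
        $HPTOOL cpu-online "$j" || return 1
        j=$((j + 1))
    done
}

# Offline and immediately re-online one CPU at a time
# (the pattern reported not to crash).
cycle_interleaved() {
    first=$1; last=$2
    j=$first
    while [ "$j" -le "$last" ]; do
        $HPTOOL cpu-offline "$j" || return 1
        $HPTOOL cpu-online "$j" || return 1
        j=$((j + 1))
    done
}
```

Setting `HPTOOL=echo` before calling `cycle_batch 48 95` prints the command sequence without touching any CPU, which is a safe way to check the range first.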

The crash message is as follows:

(XEN) Adding cpu 77 to runqueue 1
(XEN) Adding cpu 78 to runqueue 1
(XEN) Adding cpu 79 to runqueue 1
(XEN) Adding cpu 80 to runqueue 1
(X(ENXE) N) *** DOUBLE FAULT ***
(XEN) Assertion 'debug->cpu == smp_processor_id()' failed at spinlock.c:88
(XEN) [ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]
(XEN) Debugging connection not set up.
(XEN) CPU:    48
(XEN) [ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]
(XEN) CPU:    0
(XEN) RIP:    e008:[] _spin_unlock+0x40/0x42
(XEN) RFLAGS: 00010006   CONTEXT: hypervisor
(XEN) rax: 830059027fff   rbx: 0046   rcx: 
(XEN) rdx: 0030   rsi: 0046   rdi: 82d080819860
(XEN) rbp: 830059027a78   rsp: 830059027a78   r8:  
(XEN) r9:  0004   r10: 0001   r11: 0002
(XEN) r12: 82d08044d270   r13: 0010   r14: 82d08044d270
(XEN) r15: 82d0808197e0   cr0: 8005003b   cr4: 003526e0
(XEN) cr3: 59014000   cr2: 7f9d0fbc1cd9
(XEN) fsb: 7feb9960a740   gsb: 88fcdafc   gss: 
(XEN) ds: 002b   es: 002b   fs:    gs:    ss: e010   cs: e008
(XEN) Xen code around  (_spin_unlock+0x40/0x42):
(XEN)  ff 0f 66 83 07 01 5d c3 <0f> 0b 55 48 89 e5 e8 b5 ff ff ff fb 5d c3 55 48

(XEN) Xen stack trace from rsp=830059027a78:
(XEN)830059027a90 82d080240c17 0020 830059027ae8
(XEN)82d080252ea9 000d8081a6a0 0046 82d080819860
(XEN)0010 0006 82d08044d26a 82d08093e700
(XEN)0086 830059027b98 830059027af8 82d08024fe41
(XEN)830059027b18 82d08024fe7d  82d08092f3a0
(XEN)830059027b80 82d08024fee2 830059027b50 82d0802fa68e
(XEN)0001 830059027b60 82d080240b77 82d080819718
(XEN)82d08045b4d0 82d08092f3a0 830059027bd8 0086
(XEN)82d08093e71e 830059027bc8 82d0802503ea 82d08044d26a
(XEN)82d08093e703 0051 83203ffe20b0 8320104e00d8
(XEN)0001 8323996aad00 830059027c20 82d080250502
(XEN)82d00018 830059027c30 830059027bf0 830059027c38
(XEN)0051 0001 0001 83239969f580
(XEN)0003 830059027c80 82d0802303e8 0051
(XEN)005159027c78 82d080952b80 00e0 8323996aad00
(XEN)83203ffe20b0 83239969f580 82d080930008 82d08094c840
(XEN)0051 830059027cc0 82d0802307e1 8323996aad00
(XEN)0051 82d080930008 82d080803660 0051
(XEN)8323996aad00 830059027d58 82d08023f1fd 830059027d10
(XEN)0206 82d080819680 83239969f580 
(XEN) Xen call trace:
(XEN)    [] R _spin_unlock+0x40/0x42
(XEN)    [] F _spin_unlock_irqrestore+0xd/0x24
(XEN)    [] F serial_puts+0x131/0x141
(XEN)    [] F console_serial_puts+0x28/0x2a
(XEN)    [] F drivers/char/console.c#__putstr+0x3a/0x8b
(XEN)    [] F drivers/char/console.c#printk_start_of_line+0x14/0x17b
(XEN)    [] F drivers/char/console.c#vprintk_common+0x8d/0x158
(XEN)    [] F printk+0x4d/0x4f
(XEN)    [] F common/sched_credit2.c#init_pdata+0xdd/0x441
(XEN)    [] F common/sched_credit2.c#csched2_switch_sched+0x95/0xe2
(XEN)    [] F schedule_cpu_add+0x18a/0x3fd
(XEN)    [] F common/cpupool.c#cpupool_assign_cpu_locked+0x58/0x189
(XEN)    [] F common/cpupool.c#cpu_callback+0x186/0x3c1
(XEN)    [] F notifier_call_chain+0x6b/0x96
(XEN)    [] F common/cpu.c#cpu_notifier_call_chain+0x1b/0x33
(XEN)    [] F cpu_up+0xa8/0xe5
(XEN)    [] F cpu_up_helper+0xf/0xa5
(XEN)    [] F common/domain.c#continue_hypercall_tasklet_handler+0x4c/0xb9
(XEN)    [] F common/tasklet.c#do_tasklet_work+0x76/0xa9
(XEN)    [] F do_tasklet+0x58/0x8a
(XEN)    [] F arch/x86/domain.c#idle_loop+0x40/0x9b
(XEN)
(XEN) RIP:e008:[](XEN)
(XEN) 
 82d0bffcf800(XEN) Panic on CPU 0:

(XEN) RFLAGS: 00010006   (XEN) Assertion 'debug->cpu == 
smp_processor_id()' failed at spinlock.c:88

CONTEXT: 

[Xen-devel] [qemu-mainline bisection] complete build-i386

2020-01-07 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job build-i386
testid xen-build

Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  qemuu git://git.qemu.org/qemu.git
  Bug introduced:  b0b74e1f17508cb8cef8afd698558db1bd8999cc
  Bug not present: f17783e706ab9c7b3a2b69cf48e4f0ba40664f54
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/145780/


  commit b0b74e1f17508cb8cef8afd698558db1bd8999cc
  Merge: f17783e706 ddf9069963
  Author: Peter Maydell 
  Date:   Mon Jan 6 11:39:55 2020 +
  
  Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' into staging
  
  Require Python >= 3.5 to build QEMU
  
  Python 2 EOL is 11 days away, we will stop supporting
  it in QEMU 5.0.
  
  # gpg: Signature made Fri 20 Dec 2019 16:49:02 GMT
  # gpg:using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
  # gpg:issuer "ehabk...@redhat.com"
  # gpg: Good signature from "Eduardo Habkost " [full]
  # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6
  
  * remotes/ehabkost/tags/python-next-pull-request:
configure: Require Python >= 3.5
travis: Replace Python 3.4 build with 3.5
  
  Signed-off-by: Peter Maydell 
  
  commit ddf90699631db53c981b6a5a63d31c08e0eaeec7
  Author: Eduardo Habkost 
  Date:   Wed Oct 16 19:42:37 2019 -0300
  
  configure: Require Python >= 3.5
  
  Python 3.5 is the oldest Python version available on our
  supported build platforms, and Python 2 end of life will be 3
  weeks after the planned release date of QEMU 4.2.0.  Drop Python
  2 support from configure completely, and require Python 3.5 or
  newer.
  
  Signed-off-by: Eduardo Habkost 
  Message-Id: <20191016224237.26180-1-ehabk...@redhat.com>
  Reviewed-by: John Snow 
  Signed-off-by: Eduardo Habkost 
  
  commit 49233804f5c458d61d8eb903c19d62edb3434db2
  Author: Eduardo Habkost 
  Date:   Fri Dec 20 13:45:27 2019 -0300
  
  travis: Replace Python 3.4 build with 3.5
  
  We'll start requiring Python 3.5 to build QEMU.
  
  Signed-off-by: Eduardo Habkost 
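
The commits identified by the bisection raise QEMU's minimum Python version to 3.5. If a build host carries an older Python, that would be consistent with the xen-build failures, although the report itself does not state the cause. A minimal sketch of a pre-build check; the helper name `python_at_least_3_5` is hypothetical:

```sh
# Hypothetical helper: succeed if a "major.minor[.patch]" version string
# is at least 3.5, the minimum named in the commit messages above.
python_at_least_3_5() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }
}

# Example usage on a build host:
#   v=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
#   python_at_least_3_5 "$v" || echo "Python $v is too old for this QEMU"
```

The comparison is numeric per component, so a version such as 3.10 is correctly treated as newer than 3.5.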


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-mainline/build-i386.xen-build.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/qemu-mainline/build-i386.xen-build 
--summary-out=tmp/145780.bisection-summary --basis-template=144861 
--blessings=real,real-bisect qemu-mainline build-i386 xen-build
Searching for failure / basis pass:
 145770 fail [host=italia0] / 145664 [host=debina1] 145649 [host=elbling1] 
145624 [host=pinot1] 145592 ok.
Failure / basis pass flights: 145770 / 145592
(tree with no url: minios)
Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git
Latest cc617b6e1430242f8d042c71c2d923dbc6436a36 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
035eed4c0d257c905a556fa0f4865a0c077b4e7f 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
0dd92688080202adcc43dcb3486d4143110a66d5
Basis pass b948a496150f4ae4f656c0f0ab672608723c80e6 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
f0dcfddecee8b860e015bb07d67cfcbdfbfd51d9 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
7b3c5b70a32303b46d0d051e695f18d72cce5ed0
Generating revisions with ./adhoc-revtuple-generator
 git://xenbits.xen.org/osstest/ovmf.git#b948a496150f4ae4f656c0f0ab672608723c80e6-cc617b6e1430242f8d042c71c2d923dbc6436a36
 git://xenbits.xen.org/qemu-xen-traditional.git#d0d8ad39ecb51cd7497cd524484fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798
 git://git.qemu.org/qemu.git#f0dcfddecee8b860e015bb07d67cfcbdfbfd51d9-035eed4c0d257c905a556fa0f4865a0c077b4e7f
 git://xenbits.xen.org/osstest/seabios.git#f21b5a4aeb020f2a5e2c6503f906a9349dd2f069-f21b5a4aeb020f2a5e2c6503f906a9349dd2f069
 git://xenbits.xen.org/xen.git#7b3c5b70a32303b46d0d051e695f18d72cce5ed0-0dd92688080202adcc43dcb3486d4143110a66d5
Loaded 84734 nodes in revision graph
Searching for test results:
 145529 [host=elbling1]
 145530 [host=elbling1]
 145532 [host=albana0]
 145533 [host=huxelrebe0]
 145534 fail irrelevant
 145536 [host=albana0]
 145547 [host=elbling1]
 145537 [host=elbling1]
 145562 [host=huxelrebe1]
 145539 [host=elbling1]
 145605 [host=albana1]
 145564 [host=elbling1]
 145541 [host=elbling1]
 145581 [host=albana1]
 145543 [host=albana0]
 145566 [host=elbling1]
 145535 [host=elbling1]
 145544 

[Xen-devel] [qemu-mainline test] 145770: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145770 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145770/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl            1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvshim    1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 build-i386-libvirt            1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 

[Xen-devel] [ovmf test] 145767: all pass - PUSHED

2020-01-07 Thread osstest service owner
flight 145767 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145767/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 70911f1f4aee0366b6122f2b90d367ec0f066beb
baseline version:
 ovmf cc617b6e1430242f8d042c71c2d923dbc6436a36

Last test of basis   145699  2020-01-07 01:09:14 Z    1 days
Testing same since   145767  2020-01-08 00:39:09 Z    0 days    1 attempts


People who touched revisions under test:
  Eric Dong 

jobs:
 build-amd64-xsm                       pass
 build-i386-xsm                        pass
 build-amd64                           pass
 build-i386                            pass
 build-amd64-libvirt                   pass
 build-i386-libvirt                    pass
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   cc617b6e14..70911f1f4a  70911f1f4aee0366b6122f2b90d367ec0f066beb -> xen-tested-master


[Xen-devel] [xen-unstable test] 145749: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145749 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145749/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-raw 18 leak-check/check fail REGR. vs. 145725

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds     18 guest-localmigrate/x10    fail  like 145725
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop                fail  like 145725
 test-armhf-armhf-libvirt     14 saverestore-support-check fail  like 145725
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop                fail  like 145725
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop                fail  like 145725
 test-armhf-armhf-xl-rtds     16 guest-start/debian.repeat fail  like 145725
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 145725
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop                fail  like 145725
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop                fail  like 145725
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop                fail  like 145725
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop                fail  like 145725
 test-amd64-i386-xl-pvshim    12 guest-start               fail  never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt      13 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt     13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check     fail  never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail  never pass
 test-arm64-arm64-xl          13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl          14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl          13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl          14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check fail  never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check fail  never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop                fail  never pass

version targeted for testing:
 xen  f383de87a2fb077f1fdbd4594493af613b15c233
baseline version:
 xen  0dd92688080202adcc43dcb3486d4143110a66d5

Last test of basis   145725  2020-01-07 08:02:53 Z    0 days
Testing same since   145749  2020-01-07 17:36:48 Z    0 days    1 attempts


People who touched 

[Xen-devel] [qemu-mainline test] 145765: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145765 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145765/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-raw        1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 build-i386-libvirt            1 build-check(1)   blocked  n/a
 test-amd64-i386-xl            1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-shadow    1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-pvshim 1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit1   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm   1 build-check(1)   blocked  n/a
 

[Xen-devel] [qemu-mainline test] 145759: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145759 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145759/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   6 xen-build   fail REGR. vs. 144861
 build-arm64   6 xen-build   fail REGR. vs. 144861
 build-i386   6 xen-build   fail REGR. vs. 144861
 build-amd64   6 xen-build   fail REGR. vs. 144861
 build-i386-xsm   6 xen-build   fail REGR. vs. 144861
 build-amd64-xsm   6 xen-build   fail REGR. vs. 144861
 build-armhf   6 xen-build   fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-pvshim 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 build-i386-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow

[Xen-devel] [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation

2020-01-07 Thread Anchal Agarwal
From: Eduardo Valentin 

System instability is seen during resume from hibernation when the system
is under heavy CPU load. This is due to the lack of update of sched
clock data: the scheduler then thinks that heavy CPU hog tasks need
more time on the CPU, causing the system to freeze during the
unfreezing of tasks. For example, threaded irqs and kernel processes
servicing network interfaces may be delayed for several tens of
seconds, causing the system to be unreachable.

A situation like this can be detected by using lockup detectors
such as the workqueue lockup detector:

[root@ip-172-31-67-114 ec2-user]# echo disk > /sys/power/state

Message from syslogd@ip-172-31-67-114 at May  7 18:23:21 ...
 kernel:BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 57s!

Message from syslogd@ip-172-31-67-114 at May  7 18:23:21 ...
 kernel:BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 57s!

Message from syslogd@ip-172-31-67-114 at May  7 18:23:21 ...
 kernel:BUG: workqueue lockup - pool cpus=3 node=0 flags=0x1 nice=0 stuck for 57s!

Message from syslogd@ip-172-31-67-114 at May  7 18:29:06 ...
 kernel:BUG: workqueue lockup - pool cpus=3 node=0 flags=0x1 nice=0 stuck for 403s!

The fix for this situation is to mark the sched clock as unstable
as early as possible in the resume path, leaving it unstable
for the duration of the resume process. This will force the
scheduler to attempt to align the sched clock across CPUs using
the delta with time of day, updating sched clock data. In a post
hibernation event, we can then mark the sched clock as stable
again, avoiding unnecessary syncs with time of day on systems
in which TSC is reliable.
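
The behaviour described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions: the
struct, enum and function names below are hypothetical stand-ins, not
kernel symbols, and the two booleans merely model what
check_tsc_unstable() and the sched clock stability flag would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PM events used by the notifier. */
enum pm_event_model {
	MODEL_HIBERNATION_PREPARE,
	MODEL_POST_HIBERNATION,
	MODEL_OTHER_EVENT
};

struct clock_model {
	bool tsc_unstable;       /* models the result of check_tsc_unstable() */
	bool sched_clock_stable; /* models the sched clock stability flag */
};

/* Mirrors the notifier's switch: unstable while hibernating, restored after. */
static void model_pm_notify(struct clock_model *c, enum pm_event_model ev)
{
	switch (ev) {
	case MODEL_HIBERNATION_PREPARE:
		c->sched_clock_stable = false;
		break;
	case MODEL_POST_HIBERNATION:
		/* Only go back to stable if the TSC itself is still trusted. */
		if (!c->tsc_unstable)
			c->sched_clock_stable = true;
		break;
	default:
		break;
	}
}
```

In this model a system whose TSC was already marked unstable stays
unstable after hibernation, which matches the "set back to the default"
intent of the patch.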

Reviewed-by: Erik Quanstrom 
Reviewed-by: Frank van der Linden 
Reviewed-by: Balbir Singh 
Reviewed-by: Munehisa Kamata 
Tested-by: Anchal Agarwal 
Signed-off-by: Eduardo Valentin 
---
 arch/x86/kernel/tsc.c   | 29 +
 include/linux/sched/clock.h |  5 +
 kernel/sched/clock.c|  4 ++--
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 7e322e2daaf5..ae77b8bc4e46 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1534,3 +1535,31 @@ unsigned long calibrate_delay_is_known(void)
return 0;
 }
 #endif
+
+static int tsc_pm_notifier(struct notifier_block *notifier,
+   unsigned long pm_event, void *unused)
+{
+   switch (pm_event) {
+   case PM_HIBERNATION_PREPARE:
+   clear_sched_clock_stable();
+   break;
+   case PM_POST_HIBERNATION:
+   /* Set back to the default */
+   if (!check_tsc_unstable())
+   set_sched_clock_stable();
+   break;
+   }
+
+   return 0;
+};
+
+static struct notifier_block tsc_pm_notifier_block = {
+   .notifier_call = tsc_pm_notifier,
+};
+
+static int tsc_setup_pm_notifier(void)
+{
+   return register_pm_notifier(&tsc_pm_notifier_block);
+}
+
+subsys_initcall(tsc_setup_pm_notifier);
diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
index 867d588314e0..902654ac5f7e 100644
--- a/include/linux/sched/clock.h
+++ b/include/linux/sched/clock.h
@@ -32,6 +32,10 @@ static inline void clear_sched_clock_stable(void)
 {
 }
 
+static inline void set_sched_clock_stable(void)
+{
+}
+
 static inline void sched_clock_idle_sleep_event(void)
 {
 }
@@ -51,6 +55,7 @@ static inline u64 local_clock(void)
 }
 #else
 extern int sched_clock_stable(void);
+extern void set_sched_clock_stable(void);
 extern void clear_sched_clock_stable(void);
 
 /*
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index 1152259a4ca0..374d40e5b1a2 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -116,7 +116,7 @@ static void __scd_stamp(struct sched_clock_data *scd)
scd->tick_raw = sched_clock();
 }
 
-static void __set_sched_clock_stable(void)
+void set_sched_clock_stable(void)
 {
struct sched_clock_data *scd;
 
@@ -236,7 +236,7 @@ static int __init sched_clock_init_late(void)
smp_mb(); /* matches {set,clear}_sched_clock_stable() */
 
if (__sched_clock_stable_early)
-   __set_sched_clock_stable();
+   set_sched_clock_stable();
 
return 0;
 }
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 10/11] PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA

2020-01-07 Thread Anchal Agarwal
From: Aleksei Besogonov 

The SNAPSHOT_SET_SWAP_AREA ioctl is supposed to be used to set the
hibernation offset on a running kernel to enable hibernating to a swap
file. However, it doesn't actually update the swsusp_resume_block
variable. As a result, hibernation fails at the last step (after all
the data is written out), in the validation of the swap signature in
mark_swapfiles().

Before this patch, the command line processing was the only place where
swsusp_resume_block was set.
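
The changed branch can be pictured with a minimal user-space model.
The names below are hypothetical; the two static variables only stand
in for swsusp_resume_device and swsusp_resume_block, and swap_type
stands in for the result of the swap lookup.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel globals named in the patch. */
static unsigned int model_resume_device;
static unsigned long long model_resume_block;

/*
 * Models the SNAPSHOT_SET_SWAP_AREA branch. swap_type is negative when
 * no matching swap area was found for (swdev, offset).
 */
static int model_set_swap_area(unsigned int swdev,
			       unsigned long long offset, int swap_type)
{
	if (!swdev)
		return -EINVAL;
	if (swap_type < 0)
		return -ENODEV;

	/* The step added by the patch: remember where to resume from. */
	model_resume_device = swdev;
	model_resume_block = offset;
	return 0;
}
```

Note that the resume location is only recorded on the success path, so
a failed lookup leaves the previous values untouched.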

Signed-off-by: Aleksei Besogonov 
Signed-off-by: Munehisa Kamata 
Signed-off-by: Anchal Agarwal 
---
 kernel/power/user.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/power/user.c b/kernel/power/user.c
index 77438954cc2b..d396e313cb7b 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -374,8 +374,12 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
if (swdev) {
offset = swap_area.offset;
data->swap = swap_type_of(swdev, offset, NULL);
-   if (data->swap < 0)
+   if (data->swap < 0) {
error = -ENODEV;
+   } else {
+   swsusp_resume_device = swdev;
+   swsusp_resume_block = offset;
+   }
} else {
data->swap = -1;
error = -EINVAL;
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 09/11] xen: Clear IRQD_IRQ_STARTED flag during shutdown PIRQs

2020-01-07 Thread Anchal Agarwal
shutdown_pirq is invoked in the hibernation path, and hence PIRQs
should be restarted during resume.
Before commit 020db9d3c1dc0a ("xen/events: Fix interrupt lost during
irq_disable and irq_enable"), startup_pirq was automatically called
during irq_enable; after that commit, however, PIRQs are no longer
explicitly started once resumed from hibernation.

chip->irq_startup is called only if IRQD_IRQ_STARTED is unset during
irq_startup on resume. This flag gets cleared by free_irq->irq_shutdown
during suspend. free_irq() never gets explicitly called for ioapic-edge
and ioapic-level interrupts, as the respective drivers do nothing during
suspend/resume. So we shut them down explicitly in the first place in
the syscore_suspend path to clear the IRQ<>event channel mapping.
shutdown_pirq being called explicitly during suspend does not clear
this flag, hence .irq_enable is called in irq_startup during resume
instead and the PIRQs never start up.
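
A small model may help to see why clearing the flag matters. It is a
sketch only: the types and function names are hypothetical, and the
single boolean stands in for IRQD_IRQ_STARTED.

```c
#include <assert.h>
#include <stdbool.h>

/* Which chip callback the resume path ends up using in this model. */
enum startup_path { PATH_IRQ_STARTUP, PATH_IRQ_ENABLE };

struct irq_model {
	bool started; /* models IRQD_IRQ_STARTED */
};

/* The full startup callback runs only when the started flag is unset. */
static enum startup_path model_irq_startup(struct irq_model *irq)
{
	if (irq->started)
		return PATH_IRQ_ENABLE;
	irq->started = true;
	return PATH_IRQ_STARTUP;
}

/* Models the suspend-time shutdown, with or without the patch's fix. */
static void model_shutdown_pirq(struct irq_model *irq, bool clear_started)
{
	if (clear_started)
		irq->started = false;
}
```

Without clearing the flag, the model takes the enable path on resume,
which is the situation this patch describes as PIRQs never starting up.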

Signed-off-by: Anchal Agarwal 
---
 drivers/xen/events/events_base.c | 1 +
 include/linux/irq.h  | 1 +
 kernel/irq/chip.c| 3 ++-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b893536d8af4..aae7c4997b51 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1606,6 +1606,7 @@ void xen_shutdown_pirqs(void)
continue;
 
shutdown_pirq(irq_get_irq_data(info->irq));
+   irq_state_clr_started(irq_to_desc(info->irq));
}
 }
 
diff --git a/include/linux/irq.h b/include/linux/irq.h
index fb301cf29148..1e125cd22cf0 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -745,6 +745,7 @@ extern int irq_set_msi_desc(unsigned int irq, struct msi_desc *entry);
 extern int irq_set_msi_desc_off(unsigned int irq_base, unsigned int irq_offset,
struct msi_desc *entry);
 extern struct irq_data *irq_get_irq_data(unsigned int irq);
+extern void irq_state_clr_started(struct irq_desc *desc);
 
 static inline struct irq_chip *irq_get_chip(unsigned int irq)
 {
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index b76703b2c0af..3e8a36c673d6 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -173,10 +173,11 @@ static void irq_state_clr_masked(struct irq_desc *desc)
irqd_clear(&desc->irq_data, IRQD_IRQ_MASKED);
 }
 
-static void irq_state_clr_started(struct irq_desc *desc)
+void irq_state_clr_started(struct irq_desc *desc)
 {
irqd_clear(&desc->irq_data, IRQD_IRQ_STARTED);
 }
+EXPORT_SYMBOL_GPL(irq_state_clr_started);
 
 static void irq_state_set_started(struct irq_desc *desc)
 {
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 08/11] x86/xen: close event channels for PIRQs in system core suspend callback

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

There are no PM handlers for the legacy devices, so during teardown a
stale event channel <> IRQ mapping may remain in the image and resume
may fail. To avoid adding much code by implementing handlers for legacy
devices, add a simple helper function to "shutdown" active PIRQs, which
actually closes event channels but keeps the related IRQ structures
intact. PM suspend/hibernation code will rely on this.

Close event channels allocated for devices which are backed by PIRQs
and still active when suspending the system core. Normally, these
devices are emulated legacy devices, e.g. PS/2 keyboard, floppy
controller, etc. Without this, in PM hibernation, information about the
event channel remains in the hibernation image, but there is no
guarantee that the same event channel numbers are assigned to the
devices when restoring the system. This may cause a conflict like the
following and prevent some devices from being restored correctly.

[  102.330821] ------------[ cut here ]------------
[  102.333264] WARNING: CPU: 0 PID: 2324 at drivers/xen/events/events_base.c:878 bind_evtchn_to_irq+0x88/0xf0
...
[  102.348057] Call Trace:
[  102.348057]  [] dump_stack+0x63/0x84
[  102.348057]  [] __warn+0xd1/0xf0
[  102.348057]  [] warn_slowpath_null+0x1d/0x20
[  102.348057]  [] bind_evtchn_to_irq+0x88/0xf0
[  102.348057]  [] ? blkif_copy_from_grant+0xb0/0xb0 [xen_blkfront]
[  102.348057]  [] bind_evtchn_to_irqhandler+0x27/0x80
[  102.348057]  [] talk_to_blkback+0x425/0xcd0 [xen_blkfront]
[  102.348057]  [] ? __kmalloc+0x1ea/0x200
[  102.348057]  [] blkfront_restore+0x2d/0x60 [xen_blkfront]
[  102.348057]  [] xenbus_dev_restore+0x58/0x100
[  102.348057]  [] ?  xenbus_frontend_delayed_resume+0x20/0x20
[  102.348057]  [] xenbus_dev_cond_restore+0x1e/0x30
[  102.348057]  [] dpm_run_callback+0x4e/0x130
[  102.348057]  [] device_resume+0xe7/0x210
[  102.348057]  [] ? pm_dev_dbg+0x80/0x80
[  102.348057]  [] dpm_resume+0x114/0x2f0
[  102.348057]  [] hibernation_snapshot+0x15f/0x380
[  102.348057]  [] hibernate+0x183/0x290
[  102.348057]  [] state_store+0xcf/0xe0
[  102.348057]  [] kobj_attr_store+0xf/0x20
[  102.348057]  [] sysfs_kf_write+0x3a/0x50
[  102.348057]  [] kernfs_fop_write+0x10b/0x190
[  102.348057]  [] __vfs_write+0x28/0x120
[  102.348057]  [] ? rw_verify_area+0x49/0xb0
[  102.348057]  [] vfs_write+0xb2/0x1b0
[  102.348057]  [] SyS_write+0x46/0xa0
[  102.348057]  [] entry_SYSCALL_64_fastpath+0x1a/0xa9
[  102.423005] ---[ end trace b8d6718e22e2b107 ]---
[  102.425031] genirq: Flags mismatch irq 6.  (blkif) vs.  (floppy)

Note that we don't explicitly re-allocate event channels for such
devices in the resume callback. Re-allocation will occur when the PM
core re-enables IRQs for the devices at a later point.
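
The helper described in this message can be pictured with the following
user-space sketch. It makes two assumptions that are not taken from the
patch text: the table layout and names are hypothetical, and an event
channel number of 0 is treated as "no event channel bound".

```c
#include <assert.h>
#include <stddef.h>

enum irq_type_model { MODEL_IRQT_PIRQ, MODEL_IRQT_EVTCHN };

struct irq_info_model {
	enum irq_type_model type;
	unsigned int evtchn; /* 0 is assumed to mean "not bound" */
};

/*
 * Closes the event channel of every active PIRQ entry and returns how
 * many were closed. The entries themselves stay in the table, which
 * models "keeps related IRQ structures intact".
 */
static size_t model_shutdown_pirqs(struct irq_info_model *infos, size_t n)
{
	size_t closed = 0;

	for (size_t i = 0; i < n; i++) {
		if (infos[i].type != MODEL_IRQT_PIRQ || infos[i].evtchn == 0)
			continue;
		infos[i].evtchn = 0;
		closed++;
	}
	return closed;
}
```

Calling the function a second time closes nothing, so the model is
idempotent for entries that were already shut down.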

Signed-off-by: Munehisa Kamata 
Signed-off-by: Anchal Agarwal 
---
 arch/x86/xen/suspend.c   |  2 ++
 drivers/xen/events/events_base.c | 12 
 include/xen/events.h |  1 +
 3 files changed, 15 insertions(+)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index dae0f74f5390..affa63d4b6bd 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -105,6 +105,8 @@ static int xen_syscore_suspend(void)
xen_save_steal_clock(cpu);
}
 
+   xen_shutdown_pirqs();
+
xrfp.domid = DOMID_SELF;
xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;
 
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 569437c158ca..b893536d8af4 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1597,6 +1597,18 @@ void xen_irq_resume(void)
restore_pirqs();
 }
 
+void xen_shutdown_pirqs(void)
+{
+   struct irq_info *info;
+
+   list_for_each_entry(info, &xen_irq_list_head, list) {
+   if (info->type != IRQT_PIRQ || !VALID_EVTCHN(info->evtchn))
+   continue;
+
+   shutdown_pirq(irq_get_irq_data(info->irq));
+   }
+}
+
 static struct irq_chip xen_dynamic_chip __read_mostly = {
.name   = "xen-dyn",
 
diff --git a/include/xen/events.h b/include/xen/events.h
index c0e6a0598397..39b2c4e4d2ef 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -71,6 +71,7 @@ static inline void notify_remote_via_evtchn(int port)
 void notify_remote_via_irq(int irq);
 
 void xen_irq_resume(void);
+void xen_shutdown_pirqs(void);
 
 /* Clear an irq's pending state, in preparation for polling on it */
 void xen_clear_irq_pending(int irq);
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 07/11] x86/xen: save and restore steal clock during hibernation

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Currently, the steal time accounting code in the scheduler expects the
steal clock callback to provide a monotonically increasing value. If
the accounting code receives a smaller value than the previous one, it
uses a negative value to calculate steal time, which results in
incorrectly updated idle and steal time accounting. This breaks
userspace tools which read /proc/stat.

top - 08:05:35 up  2:12,  3 users,  load average: 0.00, 0.07, 0.23
Tasks:  80 total,   1 running,  79 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,30100.0%id,  0.0%wa,  0.0%hi, 0.0%si, -1253874204672.0%st

This can actually happen when a Xen PVHVM guest gets restored from
hibernation, because such a restored guest is just a fresh domain from
Xen's perspective and the time information in runstate info starts over
from scratch.

Introduce xen_save_steal_clock(), which saves the current steal clock
values of all present CPUs in runstate info into per-cpu variables
during the system core ops suspend callback. Its counterpart,
xen_restore_steal_clock(), restores the boot CPU's steal clock in the
system core resume callback. It sets an offset if it finds that the
current values in runstate info are smaller than the previous ones.
xen_steal_clock() is also modified to use the offset to ensure that the
scheduler only sees a monotonically increasing number.

For non-boot CPUs, restore after they're brought up, because runstate
info for non-boot CPUs is not active until then.
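
The offset arithmetic can be checked with a small user-space model. It
is a sketch under assumptions: the struct and function names are
hypothetical, one struct instance stands for one CPU, and raw stands
for the value read from runstate info, which starts over after a
restore.

```c
#include <assert.h>
#include <stdint.h>

struct steal_model {
	uint64_t raw;    /* value from runstate info; resets on restore */
	uint64_t prev;   /* models xen_prev_steal_clock */
	uint64_t offset; /* models xen_steal_clock_offset */
};

/* What the scheduler sees: raw value plus the accumulated offset. */
static uint64_t model_steal_clock(const struct steal_model *m)
{
	return m->raw + m->offset;
}

/* Suspend side: remember the last value the scheduler could have seen. */
static void model_save_steal_clock(struct steal_model *m)
{
	m->prev = model_steal_clock(m);
}

/* Resume side: compensate only if the raw clock went backwards. */
static void model_restore_steal_clock(struct steal_model *m)
{
	if (m->prev > m->raw)
		m->offset = m->prev - m->raw;
	else
		m->offset = 0;
}
```

For example, saving at a raw value of 1000 and restoring when the raw
value has restarted at 50 gives an offset of 950, so the reported clock
continues from 1000 instead of jumping backwards.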

[Anchal Changelog: Merged patch xen/time: introduce
xen_{save,restore}_steal_clock with this one for better code readability]
Signed-off-by: Anchal Agarwal 
Signed-off-by: Munehisa Kamata 
---
 arch/x86/xen/suspend.c | 13 -
 arch/x86/xen/time.c|  3 +++
 drivers/xen/time.c | 28 +++-
 include/xen/xen-ops.h  |  2 ++
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 784c4484100b..dae0f74f5390 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -91,12 +91,20 @@ void xen_arch_suspend(void)
 static int xen_syscore_suspend(void)
 {
struct xen_remove_from_physmap xrfp;
-   int ret;
+   int cpu, ret;
 
/* Xen suspend does similar stuffs in its own logic */
if (xen_suspend_mode_is_xen_suspend())
return 0;
 
+   for_each_present_cpu(cpu) {
+   /*
+* Nonboot CPUs are already offline, but the last copy of
+* runstate info is still accessible.
+*/
+   xen_save_steal_clock(cpu);
+   }
+
xrfp.domid = DOMID_SELF;
xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;
 
@@ -118,6 +126,9 @@ static void xen_syscore_resume(void)
 
pvclock_resume();
 
+   /* Nonboot CPUs will be resumed when they're brought up */
+   xen_restore_steal_clock(smp_processor_id());
+
gnttab_resume();
 }
 
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index befbdd8b17f0..8cf632dda605 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -537,6 +537,9 @@ static void xen_hvm_setup_cpu_clockevents(void)
 {
int cpu = smp_processor_id();
xen_setup_runstate_info(cpu);
+   if (cpu)
+   xen_restore_steal_clock(cpu);
+
/*
 * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence
 * doing it xen_hvm_cpu_notify (which gets called by smp_init during
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859c29d0..3713d716070c 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -20,6 +20,8 @@
 
 /* runstate info updated by Xen */
 static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
+static DEFINE_PER_CPU(u64, xen_prev_steal_clock);
+static DEFINE_PER_CPU(u64, xen_steal_clock_offset);
 
 static DEFINE_PER_CPU(u64[4], old_runstate_time);
 
@@ -149,7 +151,7 @@ bool xen_vcpu_stolen(int vcpu)
return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable;
 }
 
-u64 xen_steal_clock(int cpu)
+static u64 __xen_steal_clock(int cpu)
 {
struct vcpu_runstate_info state;
 
@@ -157,6 +159,30 @@ u64 xen_steal_clock(int cpu)
return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline];
 }
 
+u64 xen_steal_clock(int cpu)
+{
+   return __xen_steal_clock(cpu) + per_cpu(xen_steal_clock_offset, cpu);
+}
+
+void xen_save_steal_clock(int cpu)
+{
+   per_cpu(xen_prev_steal_clock, cpu) = xen_steal_clock(cpu);
+}
+
+void xen_restore_steal_clock(int cpu)
+{
+   u64 steal_clock = __xen_steal_clock(cpu);
+
+   if (per_cpu(xen_prev_steal_clock, cpu) > steal_clock) {
+   /* Need to update the offset */
+   per_cpu(xen_steal_clock_offset, cpu) =
+   per_cpu(xen_prev_steal_clock, cpu) - steal_clock;
+   } else {
+   /* Avoid unnecessary steal clock warp */
+   per_cpu(xen_steal_clock_offset, cpu) = 0;
+   }
+}
+
 void 

[Xen-devel] [RFC PATCH V2 06/11] xen-blkfront: add callbacks for PM suspend and hibernation

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Add freeze, thaw and restore callbacks for PM suspend and hibernation
support. All frontend drivers that need to use PM_HIBERNATION/PM_SUSPEND
events need to implement these xenbus_driver callbacks.
The freeze handler stops a block-layer queue and disconnects the
frontend from the backend while freeing ring_info and associated
resources. The restore handler re-allocates ring_info and re-connects
to the backend, so the rest of the kernel can continue to use the block
device transparently. Also, the handlers are used for both PM suspend
and hibernation so that we can keep the existing suspend/resume
callbacks for Xen suspend without modification. Before disconnecting
from the backend, we need to prevent any new IO from being queued and
wait for existing IO to complete. Freeze/unfreeze of the queues will
guarantee that there are no requests in use on the shared ring.
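
The state gating that the visible hunks of this patch add can be
sketched as two predicates. This is a simplified model with
hypothetical names; it only reflects the checks shown in
blkif_interrupt() and kick_pending_request_queues_locked() in this
diff, not the full driver.

```c
#include <assert.h>
#include <stdbool.h>

enum blkif_state_model {
	MODEL_DISCONNECTED,
	MODEL_CONNECTED,
	MODEL_SUSPENDED,
	MODEL_FREEZING,
	MODEL_FROZEN
};

/* Responses are still consumed while freezing so in-flight IO can finish. */
static bool model_handles_responses(enum blkif_state_model s)
{
	return s == MODEL_CONNECTED || s == MODEL_FREEZING;
}

/* Stopped queues are restarted only if not freezing and the ring has room. */
static bool model_may_kick_queues(enum blkif_state_model s, bool ring_full)
{
	return s != MODEL_FREEZING && !ring_full;
}
```

Together the two checks let existing requests drain during the freeze
while keeping stopped queues from being restarted.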

Note: for older backends, if a backend doesn't have commit 12ea729645ace
("xen/blkback: unmap all persistent grants when frontend gets
disconnected"), the frontend may see a massive amount of grant table
warnings when freeing resources.
[   36.852659] deferring g.e. 0xf9 (pfn 0x)
[   36.855089] xen:grant_table: WARNING: g.e. 0x112 still in use!

In this case, persistent grants would need to be disabled.

[Anchal Changelog: Removed timeout/request during blkfront freeze.
Fixed major part of the code to work with blk-mq]
Signed-off-by: Anchal Agarwal 
Signed-off-by: Munehisa Kamata 
---
 drivers/block/xen-blkfront.c | 119 ---
 1 file changed, 112 insertions(+), 7 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a74d03913822..b1d38ca4600f 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -47,6 +47,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -79,6 +81,8 @@ enum blkif_state {
BLKIF_STATE_DISCONNECTED,
BLKIF_STATE_CONNECTED,
BLKIF_STATE_SUSPENDED,
+   BLKIF_STATE_FREEZING,
+   BLKIF_STATE_FROZEN
 };
 
 struct grant {
@@ -220,6 +224,7 @@ struct blkfront_info
struct list_head requests;
struct bio_list bio_list;
struct list_head info_list;
+   struct completion wait_backend_disconnected;
 };
 
 static unsigned int nr_minors;
@@ -261,6 +266,7 @@ static DEFINE_SPINLOCK(minor_lock);
 static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo);
 static void blkfront_gather_backend_features(struct blkfront_info *info);
 static int negotiate_mq(struct blkfront_info *info);
+static void __blkif_free(struct blkfront_info *info);
 
 static int get_id_from_freelist(struct blkfront_ring_info *rinfo)
 {
@@ -995,6 +1001,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
info->sector_size = sector_size;
info->physical_sector_size = physical_sector_size;
blkif_set_queue_limits(info);
+   init_completion(&info->wait_backend_disconnected);
 
return 0;
 }
@@ -1218,6 +1225,8 @@ static void xlvbd_release_gendisk(struct blkfront_info *info)
 /* Already hold rinfo->ring_lock. */
 static inline void kick_pending_request_queues_locked(struct blkfront_ring_info *rinfo)
 {
+   if (unlikely(rinfo->dev_info->connected == BLKIF_STATE_FREEZING))
+   return;
if (!RING_FULL(&rinfo->ring))
blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true);
 }
@@ -1341,8 +1350,6 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
 
 static void blkif_free(struct blkfront_info *info, int suspend)
 {
-   unsigned int i;
-
/* Prevent new requests being issued until we fix things up. */
info->connected = suspend ?
BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED;
@@ -1350,6 +1357,13 @@ static void blkif_free(struct blkfront_info *info, int suspend)
if (info->rq)
blk_mq_stop_hw_queues(info->rq);
 
+   __blkif_free(info);
+}
+
+static void __blkif_free(struct blkfront_info *info)
+{
+   unsigned int i;
+
for (i = 0; i < info->nr_rings; i++)
blkif_free_ring(&info->rinfo[i]);
 
@@ -1553,8 +1567,10 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id;
struct blkfront_info *info = rinfo->dev_info;
 
-   if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
-   return IRQ_HANDLED;
+   if (unlikely(info->connected != BLKIF_STATE_CONNECTED)) {
+   if (info->connected != BLKIF_STATE_FREEZING)
+   return IRQ_HANDLED;
+   }
 
spin_lock_irqsave(&rinfo->ring_lock, flags);
  again:
@@ -2020,6 +2036,7 @@ static int blkif_recover(struct blkfront_info *info)
struct bio *bio;
unsigned int segs;
 
+   bool frozen = info->connected == BLKIF_STATE_FROZEN;
blkfront_gather_backend_features(info);
/* 

[Xen-devel] [RFC PATCH V2 05/11] xen-netfront: add callbacks for PM suspend and hibernation support

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Add freeze, thaw and restore callbacks for PM suspend and hibernation
support. The freeze handler simply disconnects the frontend from the
backend and frees resources associated with queues after disabling the
net_device from the system. The restore handler just changes the
frontend state and lets the xenbus handler re-allocate the resources
and re-connect to the backend. This can be performed transparently to
the rest of the system. The handlers are used for both PM suspend and
hibernation so that we can keep the existing suspend/resume callbacks
for Xen suspend without modification. Freezing netfront devices is
normally expected to finish within a few hundred milliseconds, but in
rare cases it can take more than 5 seconds and hit the hard-coded
timeout, depending on the backend state, which may be congested and/or
have a complex configuration. While this is a rare case, a longer
default timeout seems a bit more reasonable here to avoid hitting the
timeout. Also, make it configurable via a module parameter so that we
can cover broader setups than what we know currently.
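
The effect of the timeout on the freeze state can be sketched as
follows. This is a user-space model with hypothetical names;
backend_ms is an invented input describing how long the backend takes
to disconnect, and timeout_secs stands for the freeze_timeout_secs
module parameter.

```c
#include <assert.h>
#include <errno.h>

enum freeze_state_model { MODEL_UNFROZEN, MODEL_FREEZING, MODEL_FROZEN };

/*
 * Models the freeze handler's outcome: success moves the device to
 * FROZEN, a timeout returns -EBUSY and leaves it in FREEZING.
 */
static int model_netfront_freeze(enum freeze_state_model *state,
				 unsigned int backend_ms,
				 unsigned int timeout_secs)
{
	*state = MODEL_FREEZING;

	if (backend_ms > timeout_secs * 1000u)
		return -EBUSY;

	*state = MODEL_FROZEN;
	return 0;
}
```

In this model a backend that needs 6 seconds fails with a 5 second
timeout but succeeds with the 10 second default chosen by the patch.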

[Anchal changelog: Variable name fix and checkpatch.pl fixes]
Signed-off-by: Anchal Agarwal 
Signed-off-by: Munehisa Kamata 
---
 drivers/net/xen-netfront.c | 98 +-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 467fd0f0ffcd..aa7ef40378ca 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -56,6 +57,12 @@
 #include 
 #include 
 
+enum netif_freeze_state {
+   NETIF_FREEZE_STATE_UNFROZEN,
+   NETIF_FREEZE_STATE_FREEZING,
+   NETIF_FREEZE_STATE_FROZEN,
+};
+
 /* Module parameters */
 #define MAX_QUEUES_DEFAULT 8
 static unsigned int xennet_max_queues;
@@ -63,6 +70,12 @@ module_param_named(max_queues, xennet_max_queues, uint, 0644);
 MODULE_PARM_DESC(max_queues,
 "Maximum number of queues per virtual interface");
 
+static unsigned int netfront_freeze_timeout_secs = 10;
+module_param_named(freeze_timeout_secs,
+  netfront_freeze_timeout_secs, uint, 0644);
+MODULE_PARM_DESC(freeze_timeout_secs,
+"timeout when freezing netfront device in seconds");
+
 static const struct ethtool_ops xennet_ethtool_ops;
 
 struct netfront_cb {
@@ -160,6 +173,10 @@ struct netfront_info {
struct netfront_stats __percpu *tx_stats;
 
atomic_t rx_gso_checksum_fixup;
+
+   int freeze_state;
+
+   struct completion wait_backend_disconnected;
 };
 
 struct netfront_rx_info {
@@ -721,6 +738,21 @@ static int xennet_close(struct net_device *dev)
return 0;
 }
 
+static int xennet_disable_interrupts(struct net_device *dev)
+{
+   struct netfront_info *np = netdev_priv(dev);
+   unsigned int num_queues = dev->real_num_tx_queues;
+   unsigned int queue_index;
+   struct netfront_queue *queue;
+
+   for (queue_index = 0; queue_index < num_queues; ++queue_index) {
+   queue = &np->queues[queue_index];
+   disable_irq(queue->tx_irq);
+   disable_irq(queue->rx_irq);
+   }
+   return 0;
+}
+
 static void xennet_move_rx_slot(struct netfront_queue *queue, struct sk_buff *skb,
grant_ref_t ref)
 {
@@ -1301,6 +1333,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 
np->queues = NULL;
 
+   init_completion(&np->wait_backend_disconnected);
+
err = -ENOMEM;
np->rx_stats = netdev_alloc_pcpu_stats(struct netfront_stats);
if (np->rx_stats == NULL)
@@ -1794,6 +1828,50 @@ static int xennet_create_queues(struct netfront_info *info,
return 0;
 }
 
+static int netfront_freeze(struct xenbus_device *dev)
+{
+   struct netfront_info *info = dev_get_drvdata(&dev->dev);
+   unsigned long timeout = netfront_freeze_timeout_secs * HZ;
+   int err = 0;
+
+   xennet_disable_interrupts(info->netdev);
+
+   netif_device_detach(info->netdev);
+
+   info->freeze_state = NETIF_FREEZE_STATE_FREEZING;
+
+   /* Kick the backend to disconnect */
+   xenbus_switch_state(dev, XenbusStateClosing);
+
+   /* We don't want to move forward before the frontend is disconnected
+* from the backend cleanly.
+*/
+   timeout = wait_for_completion_timeout(&info->wait_backend_disconnected,
+ timeout);
+   if (!timeout) {
+   err = -EBUSY;
+   xenbus_dev_error(dev, err, "Freezing timed out;"
+"the device may become inconsistent state");
+   return err;
+   }
+
+   /* Tear down queues */
+   xennet_disconnect_backend(info);
+   xennet_destroy_queues(info);
+
+   info->freeze_state = NETIF_FREEZE_STATE_FROZEN;
+
+   return err;
+}
+
+static int 

[Xen-devel] [RFC PATCH V2 04/11] x86/xen: add system core suspend and resume callbacks

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Add Xen PVHVM specific system core callbacks for PM suspend and
hibernation support. The callbacks suspend and resume Xen
primitives, like shared_info, pvclock and grant table. Note that
Xen suspend handles these in its own way, but the system core
callbacks are invoked from that context as well. So if the callbacks
are called from Xen suspend context, return immediately.

Signed-off-by: Agarwal Anchal 
Signed-off-by: Munehisa Kamata 
---
 arch/x86/xen/enlighten_hvm.c |  1 +
 arch/x86/xen/suspend.c   | 53 
 include/xen/xen-ops.h|  3 +++
 3 files changed, 57 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 75b1ec7a0fcd..138e71786e03 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -204,6 +204,7 @@ static void __init xen_hvm_guest_init(void)
if (xen_feature(XENFEAT_hvm_callback_vector))
xen_have_vector_callback = 1;
 
+   xen_setup_syscore_ops();
xen_hvm_smp_init();
WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
xen_unplug_emulated_devices();
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 1d83152c761b..784c4484100b 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -2,17 +2,22 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "xen-ops.h"
 #include "mmu.h"
@@ -82,3 +87,51 @@ void xen_arch_suspend(void)
 
on_each_cpu(xen_vcpu_notify_suspend, NULL, 1);
 }
+
+static int xen_syscore_suspend(void)
+{
+   struct xen_remove_from_physmap xrfp;
+   int ret;
+
+   /* Xen suspend does similar things in its own logic */
+   if (xen_suspend_mode_is_xen_suspend())
+   return 0;
+
+   xrfp.domid = DOMID_SELF;
+   xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;
+
+   ret = HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrfp);
+   if (!ret)
+   HYPERVISOR_shared_info = &xen_dummy_shared_info;
+
+   return ret;
+}
+
+static void xen_syscore_resume(void)
+{
+   /* Xen suspend does similar things in its own logic */
+   if (xen_suspend_mode_is_xen_suspend())
+   return;
+
+   /* No need to setup vcpu_info as it's already moved off */
+   xen_hvm_map_shared_info();
+
+   pvclock_resume();
+
+   gnttab_resume();
+}
+
+/*
+ * These callbacks will be called with interrupts disabled and when having only
+ * one CPU online.
+ */
+static struct syscore_ops xen_hvm_syscore_ops = {
+   .suspend = xen_syscore_suspend,
+   .resume = xen_syscore_resume
+};
+
+void __init xen_setup_syscore_ops(void)
+{
+   if (xen_hvm_domain())
+   register_syscore_ops(&xen_hvm_syscore_ops);
+}
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 6c36e161dfd1..3b3992b5b0c2 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -43,6 +43,9 @@ int xen_setup_shutdown_event(void);
 bool xen_suspend_mode_is_xen_suspend(void);
 bool xen_suspend_mode_is_pm_suspend(void);
 bool xen_suspend_mode_is_pm_hibernation(void);
+
+void xen_setup_syscore_ops(void);
+
 extern unsigned long *xen_contiguous_bitmap;
 
 #if defined(CONFIG_XEN_PV) || defined(CONFIG_ARM) || defined(CONFIG_ARM64)
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 03/11] x86/xen: Introduce new function to map HYPERVISOR_shared_info on Resume

2020-01-07 Thread Anchal Agarwal
Introduce a small function which reuses the shared page's PA allocated
during guest initialization in reserve_shared_info(), so that no new
page is allocated in the resume flow.
It also maps the shared_info page by calling
xen_hvm_init_shared_info().

Signed-off-by: Anchal Agarwal 
---
 arch/x86/xen/enlighten_hvm.c | 7 +++
 arch/x86/xen/xen-ops.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index e138f7de52d2..75b1ec7a0fcd 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -27,6 +27,13 @@
 
 static unsigned long shared_info_pfn;
 
+void xen_hvm_map_shared_info(void)
+{
+   xen_hvm_init_shared_info();
+   if (shared_info_pfn)
+   HYPERVISOR_shared_info = __va(PFN_PHYS(shared_info_pfn));
+}
+
 void xen_hvm_init_shared_info(void)
 {
struct xen_add_to_physmap xatp;
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 45a441c33d6d..d84c357994bd 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -56,6 +56,7 @@ void xen_enable_syscall(void);
 void xen_vcpu_restore(void);
 
 void xen_callback_vector(void);
+void xen_hvm_map_shared_info(void);
 void xen_hvm_init_shared_info(void);
 void xen_unplug_emulated_devices(void);
 
-- 
2.15.3.AMZN



[Xen-devel] [RFC PATCH V2 02/11] xenbus: add freeze/thaw/restore callbacks support

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Since commit b3e96c0c7562 ("xen: use freeze/restore/thaw PM events for
suspend/resume/chkpt"), xenbus uses PMSG_FREEZE, PMSG_THAW and
PMSG_RESTORE events for Xen suspend. However, they're actually assigned
to xenbus_dev_suspend(), xenbus_dev_cancel() and xenbus_dev_resume()
respectively, and only suspend and resume callbacks are supported at
driver level. To support PM suspend and PM hibernation, modify the bus
level PM callbacks to invoke not only device driver's suspend/resume but
also freeze/thaw/restore.

Note that we'll use freeze/restore callbacks even for PM suspend, whereas
suspend/resume callbacks are normally used in that case, because the
existing xenbus device drivers already have suspend/resume callbacks
specifically designed for Xen suspend. This lets the device
drivers keep the existing callbacks without modification.

[Anchal Changelog: Refactored the callbacks code]
Signed-off-by: Agarwal Anchal 
Signed-off-by: Munehisa Kamata 
---
 drivers/xen/xenbus/xenbus_probe.c | 99 ---
 include/xen/xenbus.h  |  3 ++
 2 files changed, 84 insertions(+), 18 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index 5b471889d723..0fa868c2 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -597,27 +598,44 @@ int xenbus_dev_suspend(struct device *dev)
struct xenbus_driver *drv;
struct xenbus_device *xdev
= container_of(dev, struct xenbus_device, dev);
-
+   bool xen_suspend = xen_suspend_mode_is_xen_suspend();
DPRINTK("%s", xdev->nodename);
 
if (dev->driver == NULL)
return 0;
drv = to_xenbus_driver(dev->driver);
-   if (drv->suspend)
-   err = drv->suspend(xdev);
-   if (err)
-   pr_warn("suspend %s failed: %i\n", dev_name(dev), err);
+
+   if (xen_suspend) {
+   if (drv->suspend)
+   err = drv->suspend(xdev);
+   } else {
+   if (drv->freeze) {
+   err = drv->freeze(xdev);
+   if (!err) {
+   free_otherend_watch(xdev);
+   free_otherend_details(xdev);
+   return 0;
+   }
+   }
+   }
+
+   if (err) {
+   pr_warn("%s %s failed: %i\n", xen_suspend ?
+   "suspend" : "freeze", dev_name(dev), err);
+   return err;
+   }
+
return 0;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_suspend);
 
 int xenbus_dev_resume(struct device *dev)
 {
-   int err;
+   int err = 0;
struct xenbus_driver *drv;
struct xenbus_device *xdev
= container_of(dev, struct xenbus_device, dev);
-
+   bool xen_suspend = xen_suspend_mode_is_xen_suspend();
DPRINTK("%s", xdev->nodename);
 
if (dev->driver == NULL)
@@ -625,24 +643,32 @@ int xenbus_dev_resume(struct device *dev)
drv = to_xenbus_driver(dev->driver);
err = talk_to_otherend(xdev);
if (err) {
-   pr_warn("resume (talk_to_otherend) %s failed: %i\n",
+   pr_warn("%s (talk_to_otherend) %s failed: %i\n",
+   xen_suspend ? "resume" : "restore",
dev_name(dev), err);
return err;
}
 
-   xdev->state = XenbusStateInitialising;
+   if (xen_suspend) {
+   xdev->state = XenbusStateInitialising;
+   if (drv->resume)
+   err = drv->resume(xdev);
+   } else {
+   if (drv->restore)
+   err = drv->restore(xdev);
+   }
 
-   if (drv->resume) {
-   err = drv->resume(xdev);
-   if (err) {
-   pr_warn("resume %s failed: %i\n", dev_name(dev), err);
-   return err;
-   }
+   if (err) {
+   pr_warn("%s %s failed: %i\n",
+   xen_suspend ? "resume" : "restore",
+   dev_name(dev), err);
+   return err;
}
 
err = watch_otherend(xdev);
if (err) {
-   pr_warn("resume (watch_otherend) %s failed: %d.\n",
+   pr_warn("%s (watch_otherend) %s failed: %d.\n",
+   xen_suspend ? "resume" : "restore",
dev_name(dev), err);
return err;
}
@@ -653,8 +679,45 @@ EXPORT_SYMBOL_GPL(xenbus_dev_resume);
 
 int xenbus_dev_cancel(struct device *dev)
 {
-   /* Do nothing */
-   DPRINTK("cancel");
+   int err = 0;
+   struct xenbus_driver *drv;
+   struct xenbus_device *xdev
+   = container_of(dev, struct xenbus_device, dev);
+   bool xen_suspend = 

[Xen-devel] [RFC PATCH V2 01/11] xen/manage: keep track of the on-going suspend mode

2020-01-07 Thread Anchal Agarwal
From: Munehisa Kamata 

Guest hibernation is different from Xen suspend/resume/live migration.
Xen save/restore does not use pm_ops, which guest hibernation needs.
Hibernation in the guest follows the ACPI path and is guest initiated; the
hibernation image is saved within the guest, whereas the latter modes
are Xen toolstack assisted and image creation/storage is under the
control of the hypervisor/host machine.
To differentiate between Xen suspend and PM hibernation, keep track
of the on-going suspend mode, mainly by using a new PM notifier.
Introduce simple functions which report the on-going suspend mode
so that other Xen-related code can behave differently according to the
current suspend mode.
Since Xen suspend doesn't have a corresponding PM event, its main logic
is modified to acquire pm_mutex and set the current mode.

Though acquiring pm_mutex is still the right thing to do, we may
see a deadlock if PM hibernation is interrupted by Xen suspend.
PM hibernation depends on the xenwatch thread to process xenbus state
transactions, but the thread will sleep waiting for pm_mutex, which is
already held by the PM hibernation context in that scenario. Xen shutdown
code may need some changes to avoid the issue.

[Anchal Changelog: Merged patch xen/manage: introduce helper function
to know the on-going suspend mode into this one for better readability]
Signed-off-by: Anchal Agarwal 
Signed-off-by: Munehisa Kamata 
---
 drivers/xen/manage.c  | 73 +++
 include/xen/xen-ops.h |  3 +++
 2 files changed, 76 insertions(+)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index cd046684e0d1..0b30ab522b77 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -40,6 +41,31 @@ enum shutdown_state {
 /* Ignore multiple shutdown requests. */
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
+enum suspend_modes {
+   NO_SUSPEND = 0,
+   XEN_SUSPEND,
+   PM_SUSPEND,
+   PM_HIBERNATION,
+};
+
+/* Protected by pm_mutex */
+static enum suspend_modes suspend_mode = NO_SUSPEND;
+
+bool xen_suspend_mode_is_xen_suspend(void)
+{
+   return suspend_mode == XEN_SUSPEND;
+}
+
+bool xen_suspend_mode_is_pm_suspend(void)
+{
+   return suspend_mode == PM_SUSPEND;
+}
+
+bool xen_suspend_mode_is_pm_hibernation(void)
+{
+   return suspend_mode == PM_HIBERNATION;
+}
+
 struct suspend_info {
int cancelled;
 };
@@ -99,6 +125,10 @@ static void do_suspend(void)
int err;
struct suspend_info si;
 
+   lock_system_sleep();
+
+   suspend_mode = XEN_SUSPEND;
+
shutting_down = SHUTDOWN_SUSPEND;
 
err = freeze_processes();
@@ -162,6 +192,10 @@ static void do_suspend(void)
thaw_processes();
 out:
shutting_down = SHUTDOWN_INVALID;
+
+   suspend_mode = NO_SUSPEND;
+
+   unlock_system_sleep();
 }
 #endif /* CONFIG_HIBERNATE_CALLBACKS */
 
@@ -387,3 +421,42 @@ int xen_setup_shutdown_event(void)
 EXPORT_SYMBOL_GPL(xen_setup_shutdown_event);
 
 subsys_initcall(xen_setup_shutdown_event);
+
+static int xen_pm_notifier(struct notifier_block *notifier,
+  unsigned long pm_event, void *unused)
+{
+   switch (pm_event) {
+   case PM_SUSPEND_PREPARE:
+   suspend_mode = PM_SUSPEND;
+   break;
+   case PM_HIBERNATION_PREPARE:
+   case PM_RESTORE_PREPARE:
+   suspend_mode = PM_HIBERNATION;
+   break;
+   case PM_POST_SUSPEND:
+   case PM_POST_RESTORE:
+   case PM_POST_HIBERNATION:
+   /* Set back to the default */
+   suspend_mode = NO_SUSPEND;
+   break;
+   default:
+   pr_warn("Receive unknown PM event 0x%lx\n", pm_event);
+   return -EINVAL;
+   }
+
+   return 0;
+};
+
+static struct notifier_block xen_pm_notifier_block = {
+   .notifier_call = xen_pm_notifier
+};
+
+static int xen_setup_pm_notifier(void)
+{
+   if (!xen_hvm_domain())
+   return -ENODEV;
+
+   return register_pm_notifier(&xen_pm_notifier_block);
+}
+
+subsys_initcall(xen_setup_pm_notifier);
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index d89969aa9942..6c36e161dfd1 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -40,6 +40,9 @@ u64 xen_steal_clock(int cpu);
 
 int xen_setup_shutdown_event(void);
 
+bool xen_suspend_mode_is_xen_suspend(void);
+bool xen_suspend_mode_is_pm_suspend(void);
+bool xen_suspend_mode_is_pm_hibernation(void);
 extern unsigned long *xen_contiguous_bitmap;
 
 #if defined(CONFIG_XEN_PV) || defined(CONFIG_ARM) || defined(CONFIG_ARM64)
-- 
2.15.3.AMZN
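
The mode tracking described in the commit message above can be modelled in
user space. The following is only a sketch under stated assumptions: the
EV_* names and both function names are hypothetical stand-ins, not kernel
identifiers, and the callback choice mirrors what patch 02/11 of this series
does at bus level (suspend for Xen suspend, freeze otherwise).

```c
#include <assert.h>
#include <string.h>

enum suspend_mode { NO_SUSPEND = 0, XEN_SUSPEND, PM_SUSPEND, PM_HIBERNATION };

/* Hypothetical stand-ins for the PM notifier events the patch handles. */
enum pm_event {
    EV_SUSPEND_PREPARE,
    EV_HIBERNATION_PREPARE,
    EV_RESTORE_PREPARE,
    EV_POST_SUSPEND,
    EV_POST_HIBERNATION,
    EV_POST_RESTORE,
};

/* Mode that results from a PM notifier event, as in xen_pm_notifier(). */
enum suspend_mode mode_after_event(enum suspend_mode cur, enum pm_event ev)
{
    switch (ev) {
    case EV_SUSPEND_PREPARE:
        return PM_SUSPEND;
    case EV_HIBERNATION_PREPARE:
    case EV_RESTORE_PREPARE:
        return PM_HIBERNATION;
    case EV_POST_SUSPEND:
    case EV_POST_HIBERNATION:
    case EV_POST_RESTORE:
        return NO_SUSPEND;      /* back to the default */
    }
    return cur;                 /* unknown event: leave the mode alone */
}

/* Name of the driver callback a bus-level suspend hook would pick. */
const char *suspend_callback_for(enum suspend_mode mode)
{
    return mode == XEN_SUSPEND ? "suspend" : "freeze";
}
```

Xen suspend has no PM event, which is why do_suspend() sets XEN_SUSPEND
directly instead of going through the notifier path modelled here.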



[Xen-devel] [RFC PATCH V2 00/11] Enable PM hibernation on guest VMs

2020-01-07 Thread Anchal Agarwal
Hello,
I am sending out V2 of the patch series that implements guest
PM hibernation.
These guests run on the Xen hypervisor. The patches have been tested
against the mainline kernel. The EC2 instance hibernation feature is provided
to AWS EC2 customers. PM hibernation uses swap space carved out within
the guest [or a separate partition], where the hibernation image is
stored and restored from.

Why is guest hibernation needed:
Guest hibernation does not involve any support from the hypervisor, so the
guest has complete control over its state. Infrastructure restrictions, like
saving guest state, can be overcome by guest-initiated hibernation.

This series includes some improvements over RFC series sent last year:
https://lists.xenproject.org/archives/html/xen-devel/2018-06/msg00823.html

Any comments or suggestions are welcome.

Changelog v2:
1. Removed timeout/request present on the ring in xen-blkfront during blkfront
freeze
2. Fixed restoring of PIRQs, which apparently worked for 4.9 kernels but not for
newer kernels. [Legacy irqs were no longer restored after hibernation; this was
introduced with commit "020db9d3c1dc0"]
3. Merged a couple of related patches to make the code more coherent and readable
4. Code refactoring
5. Sched clock fix when the hibernating guest is under heavy CPU load
Note: Under very rare circumstances we see resume failures with KASLR enabled,
only on xen instances. We are seeing roughly 3% failures [>1000 runs] when
testing with various instance sizes and some workload running on each instance.
I am currently investigating the issue to confirm whether it is a xen issue or
a kernel issue.
However, it should not hold back anyone from reviewing/accepting these patches.

Testing done:
All testing was done using Amazon Linux images with a stock upstream kernel
installed, over multiple hibernation cycles.

i. Multiple loops [~100] of hibernation in disk mode with a 5.4 guest
kernel + 4.11 xen
ii. Hibernation tested with a memory stress tester running in the background on
smaller and larger instance sizes on EC2 [>500 runs]
iii. Testing was also done on a physical host machine [Ubuntu 18.04/4.15
kernel/stock xen-4.6] running Amazon Linux 2 as a guest VM with multiple queues
iv. Ran dd to write a large file with bs=1k and hibernated multiple times

Testing How to:
---
Example:
Set up a file-backed swap space. Swap file size>=Total memory on the system
sudo dd if=/dev/zero of=/swap bs=$(( 1024 * 1024 )) count=4096 # 4096MiB
sudo chmod 600 /swap
sudo mkswap /swap
sudo swapon /swap

Update resume device/resume offset in grub if using swap file:
resume=/dev/xvda1 resume_offset=200704

Execute:

sudo pm-hibernate
OR
echo reboot > /sys/power/disk && echo disk > /sys/power/state

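Compute resume offset code:
"
#!/usr/bin/env python3
import sys
import array
import fcntl

# Swap file path is the first argument (FIBMAP usually requires root)
with open(sys.argv[1], 'rb') as f:
    # In: logical block 0 of the file; out: its physical block number
    buf = array.array('L', [0])

    # FIBMAP ioctl (request number 1)
    fcntl.ioctl(f.fileno(), 0x01, buf)

print(buf[0])
"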
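
The same lookup can be sketched in C. FIBMAP reports the block number in
filesystem block units, while the resume_offset kernel parameter is documented
in PAGE_SIZE units, so this sketch also queries FIGETBSZ and converts. It is a
minimal illustration, not part of the series: both function names are
hypothetical, and it assumes a Linux host and sufficient privileges for FIBMAP.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>      /* FIBMAP, FIGETBSZ */
#include <sys/ioctl.h>
#include <unistd.h>

/* Convert a filesystem block number into PAGE_SIZE units. */
unsigned long long fs_block_to_page_offset(unsigned long long block,
                                           unsigned long long fs_block_size,
                                           unsigned long long page_size)
{
    assert(fs_block_size > 0 && page_size > 0);
    return block * fs_block_size / page_size;
}

/*
 * Look up the physical location of the first block of a swap file.
 * Returns 0 and stores the offset (in pages) on success, -1 on failure.
 */
int swap_file_resume_offset(const char *path, unsigned long long *offset)
{
    int block = 0;           /* in: logical block 0, out: physical block */
    int block_size = 0;      /* filesystem block size in bytes */
    long page_size = sysconf(_SC_PAGESIZE);
    int ret = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    if (page_size > 0 &&
        ioctl(fd, FIGETBSZ, &block_size) == 0 && block_size > 0 &&
        ioctl(fd, FIBMAP, &block) == 0) {
        *offset = fs_block_to_page_offset((unsigned int)block,
                                          (unsigned long long)block_size,
                                          (unsigned long long)page_size);
        ret = 0;
    }

    close(fd);
    return ret;
}
```

When the filesystem block size equals the page size (4096 bytes on many
setups), the result is the raw FIBMAP value, which is what the script above
prints.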

Aleksei Besogonov (1):
  PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA

Anchal Agarwal (2):
  x86/xen: Introduce new function to map HYPERVISOR_shared_info on
Resume
  xen: Clear IRQD_IRQ_STARTED flag during shutdown PIRQs

Eduardo Valentin (1):
  x86: tsc: avoid system instability in hibernation

Munehisa Kamata (7):
  xen/manage: keep track of the on-going suspend mode
  xenbus: add freeze/thaw/restore callbacks support
  x86/xen: add system core suspend and resume callbacks
  xen-netfront: add callbacks for PM suspend and hibernation support
  xen-blkfront: add callbacks for PM suspend and hibernation
  x86/xen: save and restore steal clock during hibernation
  x86/xen: close event channels for PIRQs in system core suspend
callback

 arch/x86/kernel/tsc.c |  29 ++
 arch/x86/xen/enlighten_hvm.c  |   8 +++
 arch/x86/xen/suspend.c|  66 +
 arch/x86/xen/time.c   |   3 +
 arch/x86/xen/xen-ops.h|   1 +
 drivers/block/xen-blkfront.c  | 119 +++---
 drivers/net/xen-netfront.c|  98 ++-
 drivers/xen/events/events_base.c  |  13 +
 drivers/xen/manage.c  |  73 +++
 drivers/xen/time.c|  28 -
 drivers/xen/xenbus/xenbus_probe.c |  99 +--
 include/linux/irq.h   |   1 +
 include/linux/sched/clock.h   |   5 ++
 include/xen/events.h  |   1 +
 include/xen/xen-ops.h |   8 +++
 include/xen/xenbus.h  |   3 +
 kernel/irq/chip.c |   3 +-
 kernel/power/user.c   |   6 +-
 kernel/sched/clock.c  |   4 +-
 19 files changed, 537 insertions(+), 31 deletions(-)

-- 
2.15.3.AMZN



[Xen-devel] [qemu-mainline test] 145756: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145756 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145756/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 build-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 

[Xen-devel] [xen-unstable-smoke test] 145752: tolerable all pass - PUSHED

2020-01-07 Thread osstest service owner
flight 145752 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145752/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  4dde27b6e0a0b0dcb8fdfc7580fbd9c976aa103f
baseline version:
 xen  f383de87a2fb077f1fdbd4594493af613b15c233

Last test of basis   145740  2020-01-07 14:00:34 Z    0 days
Testing same since   145752  2020-01-07 18:00:34 Z    0 days    1 attempts


People who touched revisions under test:
  Hongyan Xia 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64    pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   f383de87a2..4dde27b6e0  4dde27b6e0a0b0dcb8fdfc7580fbd9c976aa103f -> smoke


[Xen-devel] [qemu-mainline test] 145750: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145750 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145750/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 build-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-pvshim 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-raw   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-pvshim   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  

Re: [Xen-devel] [...], USB-passthru only works with qemu-traditional

2020-01-07 Thread Steffen Einsle

Hello,

you're probably right about the malformed commandline for USB-passthru: 
With upstream qemu I get


qemu-system-x86_64: -usbdevice tablet: '-usbdevice' is deprecated, 
please use '-device usb-...' instead
qemu-system-x86_64: -usbdevice host:0d46:3003: '-usbdevice' is 
deprecated, please use '-device usb-...' instead
qemu-system-x86_64: -usbdevice host:0d46:3003: could not add USB device 
'host:0d46:3003'


I'm not quite sure if this ever worked (without trad), but if it did, it 
was some years ago... perhaps at the times of xen 4.1 ?



On 06.01.2020 at 11:23, Durrant, Paul wrote:

>> -----Original Message-----
>> From: win-pv-devel  On Behalf
>> Of Steffen Einsle
>> Sent: 05 January 2020 00:44
>> To: win-pv-de...@lists.xenproject.org
>> Subject: [win-pv-devel] Driver 9.0.0 no keyboard in vncviewer, USB-
>> passthru only with qemu-traditional
>>
>> Hello,
>>
>> I just installed a Windows 2019 Server with the new 9.0.0 PV drivers
>> under xen 4.12.1. I use gentoo and since I need usb-passthru I have to
>> use the qemu-traditional useflag (or device_model_version =
>> 'qemu-xen-traditional').
>>
>> - USB-passthru works only with qemu-traditional
>
> That seems odd, but I guess not many people use USB passthru so it could
> have got broken with upstream somewhere along the way.
>
>> Is there a general trick to get USB-passthru working with qemu-xen?
>> (without qemu-traditional my usbdevice = ['tablet', 'host:0d46:3003']
>> prevents domu creation - device-model-exited-error)
>
> I think that is probably something to post on xen-users or xen-devel. Have
> you ever had USB passthrough working with upstream QEMU? There's nothing at
> https://wiki.xenproject.org/wiki/Xen_USB_Passthrough to suggest it is only
> supported using trad so if it is broken it needs fixing. What does your qemu
> log (under /var/log/xen) say was the reason for failure? (I'm guessing it was
> probably malformed command line, which would mean there's a bug in libxl).
>
> Paul







Re: [Xen-devel] [PATCH 2/6] x86/boot: Map the trampoline as read-only

2020-01-07 Thread Andrew Cooper
On 07/01/2020 16:19, Jan Beulich wrote:
> On 07.01.2020 16:51, Andrew Cooper wrote:
>> On 07/01/2020 15:21, Jan Beulich wrote:
>>> On 06.01.2020 16:54, Andrew Cooper wrote:
>>>> c/s ec92fcd1d08, which caused the trampoline GDT Access bits to be set,
>>>> removed the final writes which occurred between enabling paging and switching
>>>> to the high mappings.  There don't plausibly need to be any memory writes in
>>>> the few instructions it takes to perform this transition.
>>>>
>>>> As a consequence, we can remove the RWX mapping of the trampoline.  It is RX
>>>> via its identity mapping below 1M, and RW via the directmap.
>>>>
>>>> Signed-off-by: Andrew Cooper 
>>> Reviewed-by: Jan Beulich 
>>>
>>>> This probably wants backporting, alongside ec92fcd1d08 if it hasn't yet.
>>> This is just cleanup, largely cosmetic in nature. It could be argued
>>> that once the directmap has disappeared this can serve as additional
>>> proof that the trampoline range has no (intended) writable mappings
>>> anymore, but prior to that point I don't see much further benefit.
>>> Could you expand on the reasons why you see both as backporting
>>> candidates?
>> Defence in depth.
>>
>> An RWX mapping is very attractive for an attacker who's broken into Xen
>> and is looking to expand the damage they can do.
> Such an attacker is typically in the position though to make
> themselves RWX mappings.

This is one example of a possibility.  I wouldn't put it in the "likely"
category, and it definitely isn't a guarantee.

>  Having as little as possible is only
> complicating their job, not making it impossible, I would say.

Yes, and?

This is the entire point of defence in depth.  Make an attacker's job harder.

Enforcing W^X is universally considered a good thing from a security
perspective, because it removes a load of trivial cases where a
stack over-write can easily be turned into arbitrary code execution.

Sure - this isn't going to stop an attacker who has an arbitrary write
exploit, but it very well might stop an attacker who only has a restricted
write exploit.

~Andrew


Re: [Xen-devel] [PATCH 6/6] x86/boot: Drop INVALID_VCPU

2020-01-07 Thread Andrew Cooper
On 07/01/2020 16:52, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> Now that NULL will fault at boot, there is no need for a special constant to
>> signify "current not set up yet".
> Mind making this "... no strong need ..."? The benefit of an easily
> recognizable value goes away, but I guess we'll be fine without.
> IOW I'm not meaning to object.

Fine.

>
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -705,7 +705,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>  /* Critical region without IDT or TSS.  Any fault is deadly! */
>>  
>>  set_processor_id(0);
>> -set_current(INVALID_VCPU); /* debug sanity. */
>> +set_current(NULL); /* debug sanity. */
>>  idle_vcpu[0] = current;
> Is any of this actually changing any value in memory?

Yes. Observe:

    /* Set up stack. */
    lea STACK_SIZE + sym_esi(cpu0_stack), %esp

twice in head.S, meaning that the top-of-stack block is junk at this point.

Explicitly setting it to NULL here seems like a safer option than
trusting that no one has actually used the stack yet.

~Andrew


Re: [Xen-devel] [PATCH 5/6] x86/boot: Don't map 0 during boot

2020-01-07 Thread Andrew Cooper
On 07/01/2020 16:35, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> --- a/xen/arch/x86/boot/head.S
>> +++ b/xen/arch/x86/boot/head.S
>> @@ -689,12 +689,15 @@ trampoline_setup:
>>  sub $(L2_PAGETABLE_ENTRIES*8),%eax
>>  loop1b
>>  
>> -/*
>> - * During boot, hook 4kB mappings of first 2MB of memory into L2.
>> - * This avoids mixing cachability for the legacy VGA region.
>> - */
>> -lea __PAGE_HYPERVISOR+sym_esi(l1_identmap),%edi
>> -mov %edi,sym_fs(l2_bootmap)
>> +/* Map the permentant trampoline page into l{1,2}_bootmap[]. */
> "permanent"?

Fixed.

>
>> +mov sym_esi(trampoline_phys), %edx
>> +mov %edx, %ecx
>> +or  $__PAGE_HYPERVISOR_RX, %edx /* %edx = PTE to write  */
>> +shr $PAGE_SHIFT, %ecx   /* %ecx = Slot to write */
> Following the LEA model further down, how about
>
> mov sym_esi(trampoline_phys), %ecx
> lea __PAGE_HYPERVISOR_RX(%ecx), %edx /* %edx = PTE to write  */
> shr $PAGE_SHIFT, %ecx/* %ecx = Slot to write */
>
> ?

LGTM
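
For anyone reading along, the LEA form relies on trampoline_phys being page aligned: with the low 12 bits clear, adding the flag bits gives the same value as OR-ing them in. A minimal C sketch of that equivalence follows. PAGE_FLAGS_RX is a placeholder value chosen for illustration, not Xen's real __PAGE_HYPERVISOR_RX, and the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT     12
#define PAGE_SIZE      (1u << PAGE_SHIFT)
/* Placeholder flag bits for illustration only; not Xen's real value. */
#define PAGE_FLAGS_RX  0x021u

/* The equivalence only holds if the flags fit in the page-offset bits. */
static_assert(PAGE_FLAGS_RX < PAGE_SIZE, "flags must fit below PAGE_SHIFT");

/* mov/or form: combine address and flags with a bitwise OR. */
static uint32_t pte_via_or(uint32_t phys)
{
    return phys | PAGE_FLAGS_RX;
}

/* lea form: "disp(%reg)" is an addition, not an OR. */
static uint32_t pte_via_lea(uint32_t phys)
{
    return phys + PAGE_FLAGS_RX;
}

/* Index of the 4k slot covering a given physical address. */
static uint32_t slot_of(uint32_t phys)
{
    return phys >> PAGE_SHIFT;
}
```

For a page-aligned input such as 0x8f000 both helpers return the same PTE value; for an unaligned input they can differ, which is why the alignment assumption matters.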

> Anyway, with or without this adjustment
> Reviewed-by: Jan Beulich 

Thanks.

~Andrew


Re: [Xen-devel] [PATCH 4/6] x86/boot: Clean up l?_bootmap[] construction

2020-01-07 Thread Andrew Cooper
On 07/01/2020 16:30, Jan Beulich wrote:
>>>  for ( i = 0; i < 8; ++i )
>>>  {
>>>  unsigned int slot = (xen_phys_start >> L2_PAGETABLE_SHIFT) + i;
>>>  paddr_t addr = slot << L2_PAGETABLE_SHIFT;
>>>  
>>>  l2_identmap[slot] = l2e_from_paddr(addr, 
>>> PAGE_HYPERVISOR|_PAGE_PSE);
>>> -slot &= L2_PAGETABLE_ENTRIES - 1;
>>>  l2_bootmap[slot] = l2e_from_paddr(addr, 
>>> __PAGE_HYPERVISOR|_PAGE_PSE);
>>>  }
>>> -/* Initialise L3 boot-map page directory entries. */
>>> -l3_bootmap[l3_table_offset(xen_phys_start)] =
>>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>>> -l3_bootmap[l3_table_offset(xen_phys_start + (8 << L2_PAGETABLE_SHIFT) 
>>> - 1)] =
>>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>>> +
>>> +/* Initialize L3 boot-map page directory entries. */
>>> +for ( i = 0; i < 4; ++i )
>>> +l3_bootmap[i] = l3e_from_paddr((UINTN)l2_bootmap + i * PAGE_SIZE,
>>> +   __PAGE_HYPERVISOR);
>> The idea behind the original code was to be immune to the number
>> of pages l2_bootmap[] covers, as long as it's at least one (which
>> it'll always be, I would say). The minimum requirement to any
>> change to this I have is that the build must break if the size
>> assumption here is violated. I.e. there may not be a literal 4 as
>> the upper loop bound here, or there would need to be a
>> BUILD_BUG_ON() right next to it. But I'd really prefer if the
>> code was left as is (perhaps with a comment added), unless you
>> can point out actual issues with it (which I can't see in the
>> description), or you can otherwise justify the change with better
>> than "the EFI side is further complicated by spraying non-identity
>> aliases into the mix."
> And if this change is to be made, won't it mean the code in setup.c
> commented with "Make boot page tables match non-EFI boot" can then
> go away at the same time?

When I've figured out why altering that causes the EFI boot to fail, yes
- that was the plan...

~Andrew


Re: [Xen-devel] [PATCH 4/6] x86/boot: Clean up l?_bootmap[] construction

2020-01-07 Thread Andrew Cooper
On 07/01/2020 16:16, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> The need for Xen to be identity mapped into the bootmap is not obvious, and
>> differs between the MB and EFI boot paths.  Furthermore, the EFI side is
>> further complicated by spraying non-identity aliases into the mix.
> What (intentional) aliases are you talking about? The changes done here
> don't remove any. Or do you mean the ones occurring as a side effect of
> possibly using the same L2 in two L3 slots?

This piece of logic took ages to reverse engineer, but yes - there are
aliases.

The logic previously read:

l2_identmap[slot] = l2e_from_paddr(addr, PAGE_HYPERVISOR|_PAGE_PSE);
slot &= L2_PAGETABLE_ENTRIES - 1;
l2_bootmap[slot] = l2e_from_paddr(addr, __PAGE_HYPERVISOR|_PAGE_PSE);

which is suspicious and looks wrong, seeing as l2_bootmap[] and
l2_identmap[] are both 4 pages long and used elsewhere as identity mappings.

This ends up working because of the l3 logic which may, in some
circumstances (the 16M of Xen crossing a 1G boundary), edit two entries
of l3_bootmap[], rather than one.  In this case, there ends up being a
second (split) alias of Xen mapped at either end of the 2G range which
covers Xen, as the same L2 is used in two L3e's.
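
To make the aliasing concrete, here is a toy model of the old logic in plain C. It is only a sketch under stated assumptions: a single L2 page of 512 entries shared by whichever L3 slots get hooked up, 2M superpages, and a made-up present bit. struct bootmap_model, build_old_bootmap() and lookup() are illustrative names, not Xen code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define L2_SHIFT     21                 /* 2M superpages */
#define L3_SHIFT     30                 /* each L3 entry covers 1G */
#define NR_ENTRIES   512u
#define XEN_SLOTS    8u                 /* 2M slots written by the loop (16M) */
#define PRESENT      1ull               /* made-up present bit for the model */
#define UNMAPPED     UINT64_MAX

static_assert(XEN_SLOTS <= NR_ENTRIES, "Xen must fit in one L2 page");

struct bootmap_model {
    uint64_t l2[NR_ENTRIES];            /* the one shared L2 page */
    int      l3_hooked[NR_ENTRIES];     /* L3 slots pointing at that L2 */
};

static unsigned int l3_offset(uint64_t addr)
{
    return (addr >> L3_SHIFT) & (NR_ENTRIES - 1);
}

static void build_old_bootmap(struct bootmap_model *m, uint64_t xen_phys_start)
{
    unsigned int i;

    memset(m, 0, sizeof(*m));
    for ( i = 0; i < XEN_SLOTS; ++i )
    {
        uint64_t slot = (xen_phys_start >> L2_SHIFT) + i;
        uint64_t addr = slot << L2_SHIFT;

        /* The masking is what lets slots past a 1G boundary wrap to 0. */
        m->l2[slot & (NR_ENTRIES - 1)] = addr | PRESENT;
    }
    /* First and last byte of the 16M region may sit in different L3 slots. */
    m->l3_hooked[l3_offset(xen_phys_start)] = 1;
    m->l3_hooked[l3_offset(xen_phys_start +
                           ((uint64_t)XEN_SLOTS << L2_SHIFT) - 1)] = 1;
}

/* Physical address that virt translates to in the model, or UNMAPPED. */
static uint64_t lookup(uint64_t xen_phys_start, uint64_t virt)
{
    struct bootmap_model m;
    uint64_t l2e;

    build_old_bootmap(&m, xen_phys_start);
    if ( !m.l3_hooked[l3_offset(virt)] )
        return UNMAPPED;
    l2e = m.l2[(virt >> L2_SHIFT) & (NR_ENTRIES - 1)];
    if ( !(l2e & PRESENT) )
        return UNMAPPED;
    return (l2e & ~PRESENT) + (virt & ((1ull << L2_SHIFT) - 1));
}
```

With xen_phys_start = 0x3f800000 (8M below 1G), the identity translations at 0x3f800000 and 0x40000000 are joined by two extra ones in the model: virtual 0x0 resolves to 0x40000000 and virtual 0x7f800000 resolves to 0x3f800000, i.e. the split alias at either end of the 2G range.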

>> Simplify the EFI bootmap construction code to make exactly one identity-map 
>> of
>> Xen, which now matches the MB path.  Comment both pieces of logic, explaining
>> what the mappings are needed for.
> Is both boot map variants fully matching actually needed for anything?

They don't actually fully match after this change.  Xen.efi doesn't map
the trampoline, and has only ever (AFAICT) booted because
zap_low_mappings() creates the trampoline mapping even if it was absent
previously.

The MB path needs the trampoline mapping because it unconditionally
bounces through there, even when no BIOS calls are needed.  This is
expected to change in the future with David's kexec plans.

As for why they should be matching (or at least, used consistently when
used for the same purpose): my sanity, when trying to figure out how the EFI
side of things didn't explode on boot.

>
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -584,21 +584,24 @@ static void __init efi_arch_memory_setup(void)
>>  if ( !efi_enabled(EFI_LOADER) )
>>  return;
>>  
>> -/* Initialise L2 identity-map and boot-map page table entries (16MB). */
>> +/*
>> + * Map Xen into the directmap (NX, needed for early-boot pagetable
>> + * handling/walking), and identity map Xen into bootmap (X, needed for 
>> the
>> + * transition from the EFI pagetables to Xen), using 2M superpages.
>> + */
> How does NX vs X matter for the code below here? PAGE_HYPERVISOR and
> __PAGE_HYPERVISOR, as used below, differ by just _PAGE_GLOBAL. Did
> you mean to make further changes?

Hmm - good question.  I really did get the EFI build dying when using
code of the form:

l2_identmap[slot] = l2_bootmap[slot] =
    l2e_from_paddr(addr, __PAGE_HYPERVISOR | _PAGE_PSE);

I put that down to trying to use an NX mapping before EFER.NXE was set
up, but in light of your point, I suspect it was something else.

>
>>  for ( i = 0; i < 8; ++i )
>>  {
>>  unsigned int slot = (xen_phys_start >> L2_PAGETABLE_SHIFT) + i;
>>  paddr_t addr = slot << L2_PAGETABLE_SHIFT;
>>  
>>  l2_identmap[slot] = l2e_from_paddr(addr, PAGE_HYPERVISOR|_PAGE_PSE);
>> -slot &= L2_PAGETABLE_ENTRIES - 1;
>>  l2_bootmap[slot] = l2e_from_paddr(addr, 
>> __PAGE_HYPERVISOR|_PAGE_PSE);
>>  }
>> -/* Initialise L3 boot-map page directory entries. */
>> -l3_bootmap[l3_table_offset(xen_phys_start)] =
>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>> -l3_bootmap[l3_table_offset(xen_phys_start + (8 << L2_PAGETABLE_SHIFT) - 
>> 1)] =
>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>> +
>> +/* Initialize L3 boot-map page directory entries. */
>> +for ( i = 0; i < 4; ++i )
>> +l3_bootmap[i] = l3e_from_paddr((UINTN)l2_bootmap + i * PAGE_SIZE,
>> +   __PAGE_HYPERVISOR);
> The idea behind the original code was to be immune to the number
> of pages l2_bootmap[] covers, as long as it's at least one (which
> it'll always be, I would say). The minimum requirement to any
> change to this I have is that the build must break if the size
> assumption here is violated. I.e. there may not be a literal 4 as
> the upper loop bound here, or there would need to be a
> BUILD_BUG_ON() right next to it. But I'd really prefer if the
> code was left as is (perhaps with a comment added), unless you
> can point out actual issues with it (which I can't see in the
> description), or you can otherwise justify the change with better
> than "the EFI side is further complicated by spraying non-identity
> aliases into the mix."
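
As an illustration of the build-time check being asked for, a minimal sketch in standalone C11 follows. It uses static_assert as a stand-in for Xen's BUILD_BUG_ON(), and the array sizes, the entry type and the hook_l2_into_l3() helper are assumptions made up for the example, not the real efi-boot.h code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE             4096u
#define L2_PAGETABLE_ENTRIES  512u
#define NR_L2_BOOTMAP_PAGES   4u      /* assumed size, as discussed in the thread */

typedef uint64_t pt_entry_t;          /* stand-in for the real entry types */

static pt_entry_t l2_bootmap[NR_L2_BOOTMAP_PAGES * L2_PAGETABLE_ENTRIES];
static pt_entry_t l3_bootmap[512];

/*
 * Option 1: keep a literal 4 in the loop, but break the build if the
 * array ever stops being exactly 4 pages long.
 */
static_assert(sizeof(l2_bootmap) == 4 * PAGE_SIZE,
              "l2_bootmap[] is assumed to cover exactly 4 pages");

/*
 * Option 2: derive the loop bound from the array itself, so the code
 * follows any future size change.  Returns the number of L3 slots filled.
 */
static unsigned int hook_l2_into_l3(void)
{
    unsigned int i, n = sizeof(l2_bootmap) / PAGE_SIZE;

    for ( i = 0; i < n; ++i )
        l3_bootmap[i] = (uint64_t)(uintptr_t)l2_bootmap + i * PAGE_SIZE;

    return n;
}
```

Either option addresses the size assumption; which one reads better in the real code is a matter of taste.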

Given that what you describe here is totally undocumented, and AFAICT,
totally undescribed 

[Xen-devel] [PATCH 2/2] x86/hyperv: drop all __packed from hyperv-tlfs.h

2020-01-07 Thread Wei Liu
All structures are already naturally aligned. Linux added those
attributes out of paranoia.

In Xen we've had instances where we had to drop pointless __packed to placate
gcc 9 (see ca9310b24e), so it is better to drop those attributes.

Requested-by: Jan Beulich 
Signed-off-by: Wei Liu 
---
 xen/include/asm-x86/guest/hyperv-tlfs.h | 54 -
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/xen/include/asm-x86/guest/hyperv-tlfs.h 
b/xen/include/asm-x86/guest/hyperv-tlfs.h
index e4183c802c..0811785002 100644
--- a/xen/include/asm-x86/guest/hyperv-tlfs.h
+++ b/xen/include/asm-x86/guest/hyperv-tlfs.h
@@ -288,7 +288,7 @@ union hv_x64_msr_hypercall_contents {
u64 enable:1;
u64 reserved:11;
u64 guest_physical_address:52;
-   } __packed;
+   };
 };
 
 /*
@@ -300,7 +300,7 @@ struct ms_hyperv_tsc_page {
volatile u64 tsc_scale;
volatile s64 tsc_offset;
u64 reserved2[509];
-}  __packed;
+};
 
 /*
  * The guest OS needs to register the guest ID with the hypervisor.
@@ -347,17 +347,17 @@ struct hv_reenlightenment_control {
__u64 enabled:1;
__u64 reserved2:15;
__u64 target_vp:32;
-}  __packed;
+};
 
 struct hv_tsc_emulation_control {
__u64 enabled:1;
__u64 reserved:63;
-} __packed;
+};
 
 struct hv_tsc_emulation_status {
__u64 inprogress:1;
__u64 reserved:63;
-} __packed;
+};
 
 #define HV_X64_MSR_HYPERCALL_ENABLE                0x0001
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT    12
@@ -445,7 +445,7 @@ typedef struct _HV_REFERENCE_TSC_PAGE {
__u32 res1;
__u64 tsc_scale;
__s64 tsc_offset;
-}  __packed HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
+} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
 
 /* Define the number of synthetic interrupt sources. */
 #define HV_SYNIC_SINT_COUNT    (16)
@@ -502,7 +502,7 @@ union hv_message_flags {
struct {
__u8 msg_pending:1;
__u8 reserved:7;
-   } __packed;
+   };
 };
 
 /* Define port identifier type. */
@@ -511,7 +511,7 @@ union hv_port_id {
struct {
__u32 id:24;
__u32 reserved:8;
-   } __packed u;
+   } u;
 };
 
 /* Define synthetic interrupt controller message header. */
@@ -524,7 +524,7 @@ struct hv_message_header {
__u64 sender;
union hv_port_id port;
};
-} __packed;
+};
 
 /* Define synthetic interrupt controller message format. */
 struct hv_message {
@@ -532,12 +532,12 @@ struct hv_message {
union {
__u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
} u;
-} __packed;
+};
 
 /* Define the synthetic interrupt message page layout. */
 struct hv_message_page {
struct hv_message sint_message[HV_SYNIC_SINT_COUNT];
-} __packed;
+};
 
 /* Define timer message payload structure. */
 struct hv_timer_message_payload {
@@ -545,7 +545,7 @@ struct hv_timer_message_payload {
__u32 reserved;
__u64 expiration_time;  /* When the timer expired */
__u64 delivery_time;/* When the message was delivered */
-} __packed;
+};
 
 struct hv_nested_enlightenments_control {
struct {
@@ -555,7 +555,7 @@ struct hv_nested_enlightenments_control {
struct {
__u32 reserved;
} hypercallControls;
-} __packed;
+};
 
 /* Define virtual processor assist page structure. */
 struct hv_vp_assist_page {
@@ -566,7 +566,7 @@ struct hv_vp_assist_page {
__u8 enlighten_vmentry;
__u8 reserved2[7];
__u64 current_nested_vmcs;
-} __packed;
+};
 
 struct hv_enlightened_vmcs {
u32 revision_id;
@@ -742,7 +742,7 @@ struct hv_enlightened_vmcs {
u32 nested_flush_hypercall:1;
u32 msr_bitmap:1;
u32 reserved:30;
-   }  __packed hv_enlightenments_control;
+   }  hv_enlightenments_control;
u32 hv_vp_id;
 
u64 hv_vm_id;
@@ -752,7 +752,7 @@ struct hv_enlightened_vmcs {
u64 padding64_5[7];
u64 xss_exit_bitmap;
u64 padding64_6[7];
-} __packed;
+};
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE        0
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_IO_BITMAP   BIT(0, UL)
@@ -793,7 +793,7 @@ union hv_stimer_config {
u64 reserved_z0:3;
u64 sintx:4;
u64 reserved_z1:44;
-   } __packed;
+   };
 };
 
 
@@ -808,7 +808,7 @@ union hv_synic_scontrol {
struct {
u64 enable:1;
u64 reserved:63;
-   } __packed;
+   };
 };
 
 /* Define synthetic interrupt source. */
@@ -821,7 +821,7 @@ union hv_synic_sint {
u64 auto_eoi:1;
u64 polling:1;
u64 reserved2:45;
-   } __packed;
+   };
 };
 
 /* Define the format of the SIMP register */
@@ -831,7 +831,7 @@ union hv_synic_simp {
u64 simp_enabled:1;

[Xen-devel] [PATCH 0/2] Misc Hyper-V TLFS fixes

2020-01-07 Thread Wei Liu
Fix two issues discovered by Jan.

Wei Liu (2):
  x86/hyperv: drop usage of GENMASK_ULL from hyperv-tlfs.h
  x86/hyperv: drop all __packed from hyperv-tlfs.h

 xen/include/asm-x86/guest/hyperv-tlfs.h | 60 -
 1 file changed, 30 insertions(+), 30 deletions(-)

-- 
2.20.1



[Xen-devel] [PATCH 1/2] x86/hyperv: drop usage of GENMASK_ULL from hyperv-tlfs.h

2020-01-07 Thread Wei Liu
I'm told that GENMASK_ULL shouldn't be used outside of Arm code in its
current form.

Requested-by: Jan Beulich 
Signed-off-by: Wei Liu 
---
 xen/include/asm-x86/guest/hyperv-tlfs.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/include/asm-x86/guest/hyperv-tlfs.h 
b/xen/include/asm-x86/guest/hyperv-tlfs.h
index 5b43f99de8..e4183c802c 100644
--- a/xen/include/asm-x86/guest/hyperv-tlfs.h
+++ b/xen/include/asm-x86/guest/hyperv-tlfs.h
@@ -415,13 +415,13 @@ enum HV_GENERIC_SET_FORMAT {
HV_GENERIC_SET_ALL,
 };
 
-#define HV_HYPERCALL_RESULT_MASK       GENMASK_ULL(15, 0)
+#define HV_HYPERCALL_RESULT_MASK       0xffff /* GENMASK_ULL(15, 0) */
 #define HV_HYPERCALL_FAST_BIT          BIT(16, UL)
 #define HV_HYPERCALL_VARHEAD_OFFSET    17
 #define HV_HYPERCALL_REP_COMP_OFFSET   32
-#define HV_HYPERCALL_REP_COMP_MASK     GENMASK_ULL(43, 32)
+#define HV_HYPERCALL_REP_COMP_MASK     0xfff00000000 /* GENMASK_ULL(43, 32) */
 #define HV_HYPERCALL_REP_START_OFFSET  48
-#define HV_HYPERCALL_REP_START_MASK    GENMASK_ULL(59, 48)
+#define HV_HYPERCALL_REP_START_MASK    0xfff000000000000 /* GENMASK_ULL(59, 48) */
 
 /* hypercall status code */
 #define HV_STATUS_SUCCESS  0
-- 
2.20.1
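
As a sanity check on open-coded constants of this kind, the masks can be recomputed from the bit ranges named in the comments at build time. The MASK_ULL() helper below is a local stand-in written for this sketch (it is not Xen's or Linux's GENMASK_ULL), and the two extraction helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bits h..l inclusive set, for 0 <= l <= h <= 63.  Local helper only. */
#define MASK_ULL(h, l) \
    ((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/* Literal values implied by the bit ranges in the patch comments. */
static_assert(MASK_ULL(15, 0)  == 0xffffULL,            "result mask");
static_assert(MASK_ULL(43, 32) == 0xfff00000000ULL,     "rep comp mask");
static_assert(MASK_ULL(59, 48) == 0xfff000000000000ULL, "rep start mask");

/* Low 16 bits of a hypercall result value: the status code. */
static uint64_t hv_result_code(uint64_t result)
{
    return result & MASK_ULL(15, 0);
}

/* Bits 43..32 of a hypercall result value: the reps-completed count. */
static uint64_t hv_reps_completed(uint64_t result)
{
    return (result & MASK_ULL(43, 32)) >> 32;
}
```

If any literal were mistyped (one zero too few, say), the corresponding static_assert would fail at compile time rather than at run time.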



[Xen-devel] [qemu-mainline test] 145743: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145743 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145743/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm               6 xen-build                fail REGR. vs. 144861
 build-arm64                   6 xen-build                fail REGR. vs. 144861
 build-i386                    6 xen-build                fail REGR. vs. 144861
 build-amd64                   6 xen-build                fail REGR. vs. 144861
 build-i386-xsm                6 xen-build                fail REGR. vs. 144861
 build-amd64-xsm               6 xen-build                fail REGR. vs. 144861
 build-armhf                   6 xen-build                fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-multivcpu 1 build-check(1)   blocked  n/a
 build-arm64-libvirt           1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds      1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl           1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-raw        1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 build-i386-libvirt            1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2     1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub       1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 build-amd64-libvirt           1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm       1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvhv2-amd 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow    1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub 1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel 1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl            1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl           1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 1 build-check(1)   blocked  n/a
 build-armhf-libvirt           1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck 1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds      1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm        1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl           1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair         1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-seattle   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 

Re: [Xen-devel] [PATCH v3 3/5] x86/hyperv: provide percpu hypercall input page

2020-01-07 Thread Wei Liu
On Tue, Jan 07, 2020 at 06:08:19PM +0100, Jan Beulich wrote:
> On 07.01.2020 17:33, Wei Liu wrote:
> > On Mon, Jan 06, 2020 at 11:27:18AM +0100, Jan Beulich wrote:
> >> On 05.01.2020 17:47, Wei Liu wrote:
> >>> Hyper-V's input / output argument must be 8 bytes aligned an not cross
> >>> page boundary. The easiest way to satisfy those requirements is to use
> >>> percpu page.
> >>
> >> I'm not sure "easiest" is really true here. Others could consider adding
> >> __aligned() attributes as easy or even easier (by being even more
> >> transparent to use sites). Could we settle on "One way ..."?
> > 
> > Do you mean something like
> > 
> >struct foo __aligned(8);
> 
> If this is in a header and ...
> 
> >hv_do_hypercall(OP, virt_to_maddr(), ...);
> 
> ... this in actual code, then yes.
> 
> > ?
> > 
> > I don't think this is transparent to user sites. Plus, foo is on stack
> > which is 1) difficult to get its maddr,
> 
> It being on the stack may indeed complicate getting its machine address
> (if not now, then down the road) - valid point.
> 
> > 2) may cross page boundary.
> 
> The __aligned() of course needs to be large enough to avoid this
> happening.

For this alignment to be large enough, it will need to be of PAGE_SIZE,
right? Wouldn't that blow up Xen's stack easily?  Given we only have two
pages for that.
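
To illustrate the constraint under discussion: 8-byte alignment on its own does not stop a buffer from straddling a page boundary. A minimal sketch in C, assuming 4k pages; crosses_page() and is_aligned() are helper names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

static_assert((PAGE_SIZE & (PAGE_SIZE - 1)) == 0,
              "PAGE_SIZE must be a power of two");

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Returns 1 if [addr, addr + size) touches more than one page; size > 0. */
static int crosses_page(uint64_t addr, uint64_t size)
{
    return (addr / PAGE_SIZE) != ((addr + size - 1) / PAGE_SIZE);
}
```

A 16-byte argument block at 0x1ff8 is 8-byte aligned yet spans two pages, whereas anything that starts on a page boundary and is no larger than a page cannot cross one.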

In light of these restrictions, the approach I take in the original
patch should be okay.

I'm fine with changing the wording to "One way ..." -- if that's the
only objection you have after this mail.

> 
> >> Also, while looking at this I notice that - despite my earlier
> >> comment when giving the respective, sort-of-conditional ack -
> >> there are (still) many apparently pointless __packed attributes
> >> in hyperv-tlfs.h. Care to comment on this?
> > 
> > Again, that's a straight import from Linux. I tried not to deviate too
> > much. A commit in Linux (ec084491727b0) claims "compiler can add
> > alignment padding to structures or reorder struct members for
> > randomization and optimization".
> 
> Would a compiler doing so (without explicitly being told to) even
> be in line with the C spec? I'd buy such a claim only if I see an
> example proving it.
> 
> > I just checked all the packed structures. They seem to have all the
> > required manual paddings already. I can only assume they tried to erred
> > on the safe side.
> 
> And you surely recall we had to remove quite a few instances of
> __packed for gcc 9 compatibility?

Fair enough. I will write a patch to drop those __packed attributes.
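
A quick way to convince oneself that __packed is a no-op for a naturally aligned layout is to compare the two variants at build time. The sketch below assumes GCC or Clang (for __attribute__((packed))) and uses a struct shaped like a timer message payload, i.e. two 32-bit fields followed by two 64-bit fields; the struct and field names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naturally aligned layout: every field sits at a multiple of its size. */
struct payload_plain {
    uint32_t timer_index;
    uint32_t reserved;
    uint64_t expiration_time;
    uint64_t delivery_time;
};

/* Same fields with the packed attribute (GCC/Clang extension). */
struct payload_packed {
    uint32_t timer_index;
    uint32_t reserved;
    uint64_t expiration_time;
    uint64_t delivery_time;
} __attribute__((packed));

/* Size and field offsets are identical, so packing changes no layout. */
static_assert(sizeof(struct payload_plain) == sizeof(struct payload_packed),
              "packed must not change the size");
static_assert(offsetof(struct payload_plain, expiration_time) ==
              offsetof(struct payload_packed, expiration_time),
              "packed must not move expiration_time");
static_assert(offsetof(struct payload_plain, delivery_time) ==
              offsetof(struct payload_packed, delivery_time),
              "packed must not move delivery_time");

static size_t plain_size(void)       { return sizeof(struct payload_plain); }
static size_t packed_size(void)      { return sizeof(struct payload_packed); }
/* What packed does change: the required alignment of the type drops to 1. */
static size_t packed_alignment(void) { return _Alignof(struct payload_packed); }
```

The only observable difference is the reduced alignment of the packed type, which is the sort of thing that makes newer compilers complain when the address of a member is taken.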

Wei.

> 
> Jan


Re: [Xen-devel] [PATCH 3/6] x86/boot: Remove the preconstructed low 16M superpage mappings

2020-01-07 Thread Andrew Cooper
On 07/01/2020 15:43, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> First, it is undefined to have superpages and MTRRs disagree on cacheability
>> boundaries, and nothing this early in boot has checked that it is safe to use
>> superpages here.
> Stating this here gives, at least to me, the impression that you change
> things here to obey to these restrictions. I don't see you do so, though
> - map_pages_to_xen() doesn't query MTRRs at all afaics.

No, but it does now honour the E820 WRT holes and/or reserved regions,
rather than blindly using 2M WB superpages, which is an improvement.

>
>> Furthermore, nothing actually uses the mappings on boot.  Build these entries
>> in the directmap when walking the E820 table along with everything else.
> I'm pretty sure some of these mappings were used, perhaps long ago, and
> possibly only by the 32-bit hypervisor. It would feel quite a bit better
> if it was clear when the need for this disappeared. I wonder if I could
> talk you into finding out, so you could say so here.

TBH, it's hard enough figuring out how the mappings were used on staging
alone.

At a guess, these date from the pre-MB2 days, where Xen depended on
being loaded at 1M, and will have been the equivalent of:

+    /*
+ * Map Xen into the directmap (needed for early-boot pagetable
+ * handling/walking), and identity map Xen into bootmap (needed for
+ * the transition into long mode), using 2M superpages.
+ */

which is described now in patch 4.

In my experiments, discussed in the cover letter, I did get down to
having only the single 4k trampoline page mapped, and across a number
of machines, it was the bootscrub which then hit their absence in the
directmap.

>
>> --- a/xen/arch/x86/boot/x86_64.S
>> +++ b/xen/arch/x86/boot/x86_64.S
>> @@ -66,24 +66,19 @@ l1_identmap:
>>  .size l1_identmap, . - l1_identmap
>>  
>>  /*
>> - * __page_tables_start does not cover l1_identmap because it (l1_identmap)
>> - * contains 1-1 mappings. This means that frame addresses of these mappings
>> - * are static and should not be updated at runtime.
>> + * __page_tables_{start,end} cover the range of pagetables which need
>> + * relocating as Xen moves around physical memory.  i.e. each sym_offs()
>> + * reference to a different pagetable in the Xen image.
>>   */
>>  GLOBAL(__page_tables_start)
>>  
>>  /*
>> - * Space for mapping the first 4GB of memory, with the first 16 megabytes
>> - * actualy mapped (mostly using superpages).  Uses 4x 4k pages.
>> + * Space for 4G worth of 2M mappings, first 2M actually mapped via
>> + * l1_identmap[].  Uses 4x 4k pages.
> Would you mind making this say "page tables" instead of "pages" in the
> 2nd sentence?

Why?  Currently all the "Uses x pages" are consistent, and it is
describing the size of the objects, whose units are pages, not pagetables.

>
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -1020,8 +1020,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>   *
>>   * We require superpage alignment because the boot allocator is
>>   * not yet initialised. Hence we can only map superpages in the
>> - * address range BOOTSTRAP_MAP_BASE to 4GB, as this is guaranteed
>> - * not to require dynamic allocation of pagetables.
>> + * address range 2MB to 4GB, as this is guaranteed not to require
>> + * dynamic allocation of pagetables.
>>   *
>>   * As well as mapping superpages in that range, in preparation for
>>   * initialising the boot allocator, we also look for a region to which
>> @@ -1036,10 +1036,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>  if ( boot_e820.map[i].type != E820_RAM )
>>  continue;
>>  
>> -/* Superpage-aligned chunks from BOOTSTRAP_MAP_BASE. */
>> +/* Superpage-aligned chunks from 2MB. */
>>  s = (boot_e820.map[i].addr + mask) & ~mask;
>>  e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
>> -s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
>> +s = max_t(uint64_t, s, MB(2));
>>  if ( s >= e )
>>  continue;
>>  
>> @@ -1346,8 +1346,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>  
>>  set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
>>  
>> -/* Need to create mappings above BOOTSTRAP_MAP_BASE. */
>> -map_s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
>> +/* Need to create mappings above 2MB. */
>> +map_s = max_t(uint64_t, s, MB(2));
> Instead of hard coding 2Mb everywhere, how about simply reducing
> BOOTSTRAP_MAP_BASE?

Because the use of BOOTSTRAP_MAP_BASE here is conceptually wrong.
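
For reference, the rounding done in the quoted setup.c hunk can be sketched in isolation as below. This is a simplified standalone model, not the Xen code: struct range, superpage_chunk() and chunk_is_empty() are invented for the example, the superpage size is assumed to be 2M, and the lower bound is the literal 2MB from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MB(x)           ((uint64_t)(x) << 20)
#define SUPERPAGE_SIZE  MB(2)

static_assert((SUPERPAGE_SIZE & (SUPERPAGE_SIZE - 1)) == 0,
              "superpage size must be a power of two");

struct range {
    uint64_t s;   /* inclusive start */
    uint64_t e;   /* exclusive end */
};

/* Largest superpage-aligned chunk inside [addr, addr + size), above 2MB. */
static struct range superpage_chunk(uint64_t addr, uint64_t size)
{
    const uint64_t mask = SUPERPAGE_SIZE - 1;
    struct range r;

    r.s = (addr + mask) & ~mask;    /* round the start up   */
    r.e = (addr + size) & ~mask;    /* round the end down   */
    if ( r.s < MB(2) )              /* never start below 2MB */
        r.s = MB(2);

    return r;
}

/* Mirrors the "if ( s >= e ) continue;" test in the hunk. */
static int chunk_is_empty(struct range r)
{
    return r.s >= r.e;
}
```

A RAM region from 1M to 2G yields the chunk [2M, 2G), while a typical low-memory region such as [0, 0x9fc00) yields an empty chunk and would be skipped.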

Once I've figured out one other bug on the EFI side of things only, I've
got a follow-on change which manages to undef BOOTSTRAP_MAP_BASE beside
LIMIT because, ...

>  This would then also ease shrinking the build
> time mappings further, e.g. to the low 1Mb (instead of touching
> 

[Xen-devel] [xen-unstable test] 145725: tolerable FAIL

2020-01-07 Thread osstest service owner
flight 145725 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145725/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds     15 guest-stop                fail in 145691 pass in 145725
 test-armhf-armhf-xl-vhd      15 guest-start/debian.repeat fail pass in 145691

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds     16 guest-start/debian.repeat fail  blocked in 145691
 test-amd64-amd64-xl-rtds     18 guest-localmigrate/x10    fail  like 145691
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop                fail  like 145691
 test-armhf-armhf-libvirt     14 saverestore-support-check fail  like 145691
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop                fail  like 145691
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop                fail  like 145691
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 145691
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop                fail  like 145691
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop                fail  like 145691
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop                fail  like 145691
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop                fail  like 145691
 test-amd64-i386-xl-pvshim    12 guest-start               fail  never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt      13 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt     13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check     fail  never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail  never pass
 test-arm64-arm64-xl          13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl          14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl          13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl          14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check fail  never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check fail  never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop                fail  never pass

version targeted for testing:
 xen  0dd92688080202adcc43dcb3486d4143110a66d5
baseline version:
 xen  0dd92688080202adcc43dcb3486d4143110a66d5

Last test of basis   145725  2020-01-07 08:02:53 Z    0 days
Testing same since   (not found)                      0 attempts

jobs:
 

Re: [Xen-devel] [PATCH v3 3/5] x86/hyperv: provide percpu hypercall input page

2020-01-07 Thread Jan Beulich
On 07.01.2020 17:33, Wei Liu wrote:
> On Mon, Jan 06, 2020 at 11:27:18AM +0100, Jan Beulich wrote:
>> On 05.01.2020 17:47, Wei Liu wrote:
> >>> Hyper-V's input / output argument must be 8 bytes aligned and not cross
>>> page boundary. The easiest way to satisfy those requirements is to use
>>> percpu page.
>>
>> I'm not sure "easiest" is really true here. Others could consider adding
>> __aligned() attributes as easy or even easier (by being even more
>> transparent to use sites). Could we settle on "One way ..."?
> 
> Do you mean something like
> 
>struct foo __aligned(8);

If this is in a header and ...

>hv_do_hypercall(OP, virt_to_maddr(), ...);

... this in actual code, then yes.

> ?
> 
> I don't think this is transparent to user sites. Plus, foo is on stack
> which is 1) difficult to get its maddr,

It being on the stack may indeed complicate getting its machine address
(if not now, then down the road) - valid point.

> 2) may cross page boundary.

The __aligned() of course needs to be large enough to avoid this
happening.

>> Also, while looking at this I notice that - despite my earlier
>> comment when giving the respective, sort-of-conditional ack -
>> there are (still) many apparently pointless __packed attributes
>> in hyperv-tlfs.h. Care to comment on this?
> 
> Again, that's a straight import from Linux. I tried not to deviate too
> much. A commit in Linux (ec084491727b0) claims "compiler can add
> alignment padding to structures or reorder struct members for
> randomization and optimization".

Would a compiler doing so (without explicitly being told to) even
be in line with the C spec? I'd buy such a claim only if I see an
example proving it.

> I just checked all the packed structures. They seem to have all the
> required manual paddings already. I can only assume they tried to err
> on the safe side.

And you surely recall we had to remove quite a few instances of
__packed for gcc 9 compatibility?

Jan
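
A minimal sketch of the __aligned() alternative discussed above, for
illustration only. struct hv_arg, its fields and the 4k PAGE_SIZE are
assumptions made up for this example, not taken from the patch. The idea:
if the alignment is a power of two that divides the page size and is at
least the object's size, the object cannot straddle a page boundary,
wherever it is placed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical hypercall argument: 16 bytes, aligned to its own size. */
struct hv_arg {
    uint64_t a;
    uint64_t b;
} __attribute__((__aligned__(16)));

/* The alignment must divide the page size and cover the whole object. */
_Static_assert(PAGE_SIZE % _Alignof(struct hv_arg) == 0,
               "alignment must divide the page size");
_Static_assert(sizeof(struct hv_arg) <= _Alignof(struct hv_arg),
               "object must fit into one aligned slot");

/* True if [addr, addr + size) touches more than one page. */
static bool crosses_page(uintptr_t addr, size_t size)
{
    return (addr / PAGE_SIZE) != ((addr + size - 1) / PAGE_SIZE);
}
```

This only covers the page-crossing concern. Obtaining the machine address
of an on-stack object remains a separate problem, as noted above.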


[Xen-devel] [xen-unstable-smoke test] 145740: tolerable all pass - PUSHED

2020-01-07 Thread osstest service owner
flight 145740 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145740/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl          13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl          14 saverestore-support-check fail  never pass

version targeted for testing:
 xen  f383de87a2fb077f1fdbd4594493af613b15c233
baseline version:
 xen  0dd92688080202adcc43dcb3486d4143110a66d5

Last test of basis   145682  2020-01-06 20:00:31 Z    0 days
Testing same since   145740  2020-01-07 14:00:34 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Jackson 
  Jan Beulich 
  Julien Grall 
  Sergey Dyasli 
  Wei Liu 
  Wei Liu 

jobs:
 build-arm64-xsm                                  pass
 build-amd64                                      pass
 build-armhf                                      pass
 build-amd64-libvirt                              pass
 test-armhf-armhf-xl                              pass
 test-arm64-arm64-xl-xsm                          pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64        pass
 test-amd64-amd64-libvirt                         pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   0dd9268808..f383de87a2  f383de87a2fb077f1fdbd4594493af613b15c233 -> smoke


Re: [Xen-devel] [PATCH 6/6] x86/boot: Drop INVALID_VCPU

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> Now that NULL will fault at boot, there is no need for a special constant to
> signify "current not set up yet".

Mind making this "... no strong need ..."? The benefit of an easily
recognizable value goes away, but I guess we'll be fine without.
IOW I'm not meaning to object.

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -705,7 +705,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  /* Critical region without IDT or TSS.  Any fault is deadly! */
>  
>  set_processor_id(0);
> -set_current(INVALID_VCPU); /* debug sanity. */
> +set_current(NULL); /* debug sanity. */
>  idle_vcpu[0] = current;

Is any of this actually changing any value in memory? I.e. wouldn't
it be better to delete all of this, or leave it in a comment for
documentation purposes? (I'm willing to ack the patch as is, but I'd
like this alternative to at least be considered.)

Jan


Re: [Xen-devel] [PATCH v3 3/5] x86/hyperv: provide percpu hypercall input page

2020-01-07 Thread Michael Kelley
From: Wei Liu  Sent: Tuesday, January 7, 2020 8:34 AM
> 
> On Mon, Jan 06, 2020 at 11:27:18AM +0100, Jan Beulich wrote:
> > On 05.01.2020 17:47, Wei Liu wrote:
> > > Hyper-V's input / output argument must be 8 bytes aligned and not cross
> > > page boundary. The easiest way to satisfy those requirements is to use
> > > percpu page.
> >
> > I'm not sure "easiest" is really true here. Others could consider adding
> > __aligned() attributes as easy or even easier (by being even more
> > transparent to use sites). Could we settle on "One way ..."?
> 
> Do you mean something like
> 
>struct foo __aligned(8);
> 
>hv_do_hypercall(OP, virt_to_maddr(), ...);
> 
> ?
> 
> I don't think this is transparent to user sites. Plus, foo is on stack
> which is 1) difficult to get its maddr, 2) may cross page boundary.
> 
> If I misunderstood what you meant, please give me an example here.
> 
> >
> > > @@ -83,14 +84,33 @@ static void __init setup_hypercall_page(void)
> > >  wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > >  }
> > >
> > > +static void setup_hypercall_pcpu_arg(void)
> > > +{
> > > +void *mapping;
> > > +
> > > +mapping = alloc_xenheap_page();
> > > +if ( !mapping )
> > > +panic("Failed to allocate hypercall input page for %u\n",
> >
> > "... for CPU%u\n" please.
> >
> > > +  smp_processor_id());
> > > +
> > > +this_cpu(hv_pcpu_input_arg) = mapping;
> >
> > When offlining and then re-onlining a CPU, the prior page will be
> > leaked.
> 
> Right. Thanks for catching this one.
> 
> >
> > > --- a/xen/include/asm-x86/guest/hyperv.h
> > > +++ b/xen/include/asm-x86/guest/hyperv.h
> > > @@ -51,6 +51,8 @@ static inline uint64_t hv_scale_tsc(uint64_t tsc, uint64_t scale,
> > >
> > >  #ifdef CONFIG_HYPERV_GUEST
> > >
> > > +#include 
> > > +
> > >  #include 
> > >
> > >  struct ms_hyperv_info {
> > > @@ -63,6 +65,8 @@ struct ms_hyperv_info {
> > >  };
> > >  extern struct ms_hyperv_info ms_hyperv;
> > >
> > > +DECLARE_PER_CPU(void *, hv_pcpu_input_arg);
> >
> > Will this really be needed outside of the file that defines it?
> >
> 
> This can live in a private header for the time being.
> 
> > Also, while looking at this I notice that - despite my earlier
> > comment when giving the respective, sort-of-conditional ack -
> > there are (still) many apparently pointless __packed attributes
> > in hyperv-tlfs.h. Care to comment on this?
> 
> Again, that's a straight import from Linux. I tried not to deviate too
> much. A commit in Linux (ec084491727b0) claims "compiler can add
> alignment padding to structures or reorder struct members for
> randomization and optimization".
> 
> I just checked all the packed structures. They seem to have all the
> required manual paddings already. I can only assume they tried to err
> on the safe side.

Correct.  The __packed attribute was added only about a year ago
after somebody on LKML noticed that the structures were not packed.
Some discussion ensued, but the consensus was to add __packed due
to general paranoia about what the compiler might do even though
individual fields are aligned to their natural boundary.

Michael

> 
> Wei.
> 
> >
> > Jan
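
The __packed question can be checked mechanically. The sketch below uses
two made-up structures (msg_plain and msg_packed are hypothetical, not
from hyperv-tlfs.h) whose fields all sit on their natural boundaries with
explicit padding. Under that assumption the attribute changes neither the
size nor the field offsets; what it does change is the alignment of the
type, which drops to 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: every field naturally aligned, padding explicit. */
struct msg_plain {
    uint32_t id;
    uint16_t flags;
    uint16_t pad;      /* manual padding */
    uint64_t payload;
};

/* Same fields, with the packed attribute added. */
struct msg_packed {
    uint32_t id;
    uint16_t flags;
    uint16_t pad;
    uint64_t payload;
} __attribute__((__packed__));

/* Size and offsets are identical with and without the attribute. */
_Static_assert(sizeof(struct msg_plain) == sizeof(struct msg_packed),
               "packed does not change the size here");
_Static_assert(offsetof(struct msg_plain, payload) ==
               offsetof(struct msg_packed, payload),
               "packed does not move any field here");

/* The difference: the packed type no longer promises any alignment. */
_Static_assert(_Alignof(struct msg_packed) == 1,
               "packed reduces the type's alignment to 1");
```

The reduced alignment is, as far as I understand it, what makes taking
the address of a member of a packed structure suspicious to newer
compilers.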


Re: [Xen-devel] [PATCH] MAINTAINERS: Add explicit check-in policy section

2020-01-07 Thread Jan Beulich
On 07.01.2020 17:17, George Dunlap wrote:
> On 1/7/20 1:05 PM, Jan Beulich wrote:
>> On 07.01.2020 13:03, George Dunlap wrote:
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -104,7 +104,53 @@ Descriptions of section entries:
>>>xen-maintainers-
>>>  
>>>  
>>> -The meaning of nesting:
>>> +   Check-in policy
>>> +   ===
>>> +
>>> +In order for a patch to be checked in, in general, several conditions
>>> +must be met:
>>> +
>>> +1. In order to get a change to a given file committed, it must have
>>> +   the approval of at least one maintainer of that file.
>>> +
>>> +   A patch of course needs Acks from the maintainers of each file that
>>> +   it changes; so a patch which changes xen/arch/x86/traps.c,
>>> +   xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would
>>> +   require an Ack from each of the three sets of maintainers.
>>> +
>>> +   See below for rules on nested maintainership.
>>> +
>>> +2. It must have an Acked-by or a Reviewed-by from someone other than
>>> +   the submitter.
>>
>> I'd like to propose some further distinction here, albeit I'm not sure
>> this isn't implied anyway. It might be that making explicit the
>> distinction between A-b and R-b is sufficient - our current common
>> understanding looks to be that only maintainers can "ack", and others
>> would "review".
> 
> Well first of all, I don't think that's strictly true.  If a
> non-maintainer raises a concern, the patch can't be checked in unless
> that person is satisfied.  We sometimes assume silence is consent, but
> it's much better for the person who raised the concern to say, "I am now
> satisfied with this patch"; and the clearest and most concise way to do
> that is to say "Acked-by".

Hmm, that's a possible model, but one I would never have thought of
given the meaning we assign to "Acked-by". In a case like what you
describe I would always have expected indication of consent by other
than a formal tag, if the person wouldn't anyway be in the position
to ack a patch (or part of it).

> But that sort of "Acked-by" isn't really what is meant by this section.
>  I guess you'd like to say that such an Acked-by would not be sufficient
> to check in a patch; it would have to be the stronger Reviewed-by.
> 
> The point of this sentence is not to define what Ack and Reviewed-by
> mean, but that it must come from someone who is not the submitter.
> However, it is true that someone may read that and be confused;
> particularly as we don't seem to define it anywhere else in the tree, so
> perhaps it's worth trying to clarify.
> 
>> Since the latter is implying a more thorough look at a
>> patch, I think it wouldn't be right to allow (quoting text further
>> down) "anyone in the community" to ack a random patch (I could probably
>> talk my son into ack-ing my patches ;-) ). Perhaps, rather than
>> limiting acks to maintainers of the changed code, we could extend this
>> to maintainers of just some code for maintainer submitted patches (i.e.
>> anyone named as M: at least once in ./MAINTAINERS)? People outside of
>> whatever subset we might pick would be eligible to offer R-b only,
>> implying of course that they actually did do a review.
> 
> I do actually prefer that only people in a "direct line" of
> maintainership for that exact code (i.e., is a maintainer at whatever
> level of specificity) be able to get Acks; and that anyone else should
> be required to give a Reviewed-by.
> 
> This is of course again slightly more aggregate work for a maintainer
> than for someone else, but I think that makes sense in this case.
> 
> How about this:
> 
> 2. It must have either an Acked-by from a maintainer, or a
>Reviewed-by.  This must come from someone other than the submitter.

Better, but leaving ambiguous whether "maintainer" means "any one"
or "of the code being touched". I think you mean the former, in
which case I'd prefer to see it amended along the lines of "...
from a maintainer (of any component), or ...". Or possibly you
mean any maintainer up the "nesting" chain, in which case the
wording would need to be yet different?

>>> +3. Sufficient time and/or warning must have been given for anyone to
>>> +   respond.  This depends in large part upon the urgency and nature of
>>> +   the patch.  For a straightforward uncontroversial patch, a day or
>>> +   two is sufficient; for a controversial patch, perhaps waiting a
>>> +   week and then saying "I intend to check this in tomorrow unless I
>>> +   hear otherwise".
>>
>> To me as non-native speaker, this last sentence looks incomplete (as
>> in missing e.g. "would be appropriate" at the end), or alternatively
>> it would feel like wanting the two "ing" dropped from the verbs.
> 
> I see what you mean.  But on reflection, I think the intent of this
> paragraph has gotten skewed.  Patches should be given sufficient time for
> *anyone* to give input before being checked in.
> 
> What about changing this as follows:
> 
> ---
> 3. Sufficient time 

Re: [Xen-devel] [PATCH] libxl: don't needlessly report "highmem" in use

2020-01-07 Thread Wei Liu
On Tue, Jan 07, 2020 at 03:58:07PM +0100, Jan Beulich wrote:
> Due to the unconditional updating of dom->highmem_end in
> libxl__domain_device_construct_rdm() I've observed on a 2Gb HVM guest
> with a passed through device (without overly large BARs, and with no RDM
> ranges at all)
> 
> (d2) RAM in high memory; setting high_mem resource base to 1
> ...
> (d2) E820 table:
> (d2)  [00]: : - :000a: RAM
> (d2)  HOLE: :000a - :000d
> (d2)  [01]: :000d - :0010: RESERVED
> (d2)  [02]: :0010 - :7f80: RAM
> (d2)  HOLE: :7f80 - :fc00
> (d2)  [03]: :fc00 - 0001:: RESERVED
> (d2)  [04]: 0001: - 0001:: RAM
> 
> both of which aren't really appropriate in this case. Arrange for this
> to not happen.

Indeed. We shouldn't need to move RAM to high address in this
configuration.

Acked-by: Wei Liu 


Re: [Xen-devel] [PATCH 5/6] x86/boot: Don't map 0 during boot

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -689,12 +689,15 @@ trampoline_setup:
>  sub $(L2_PAGETABLE_ENTRIES*8),%eax
>  loop    1b
>  
> -/*
> - * During boot, hook 4kB mappings of first 2MB of memory into L2.
> - * This avoids mixing cachability for the legacy VGA region.
> - */
> -lea __PAGE_HYPERVISOR+sym_esi(l1_identmap),%edi
> -mov %edi,sym_fs(l2_bootmap)
> +/* Map the permentant trampoline page into l{1,2}_bootmap[]. */

"permanent"?

> +mov sym_esi(trampoline_phys), %edx
> +mov %edx, %ecx
> +or  $__PAGE_HYPERVISOR_RX, %edx /* %edx = PTE to write  */
> +shr $PAGE_SHIFT, %ecx   /* %ecx = Slot to write */

Following the LEA model further down, how about

mov sym_esi(trampoline_phys), %ecx
lea __PAGE_HYPERVISOR_RX(%ecx), %edx /* %edx = PTE to write  */
shr $PAGE_SHIFT, %ecx/* %ecx = Slot to write */

? Anyway, with or without this adjustment
Reviewed-by: Jan Beulich 

Jan
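
Why the LEA form is interchangeable with the MOV/OR pair can be shown in
C. A sketch under stated assumptions: PTE_FLAGS_RX is an illustrative
stand-in for __PAGE_HYPERVISOR_RX (any value below PAGE_SIZE works), and
the helper names are made up. LEA adds the displacement, whereas the
original code ORs it in; for a page-aligned address the low PAGE_SHIFT
bits are zero, so no carry can occur and both give the same PTE.

```c
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (1u << PAGE_SHIFT)
#define PTE_FLAGS_RX 0x061u /* illustrative flag bits, all below PAGE_SIZE */

/* What "or $flags, %edx" computes. */
static uint32_t pte_with_or(uint32_t phys)
{
    return phys | PTE_FLAGS_RX;
}

/* What "lea flags(%ecx), %edx" computes. */
static uint32_t pte_with_add(uint32_t phys)
{
    return phys + PTE_FLAGS_RX;
}

/* What "shr $PAGE_SHIFT, %ecx" computes: the slot to write. */
static uint32_t pte_slot(uint32_t phys)
{
    return phys >> PAGE_SHIFT;
}
```

The equivalence relies on the trampoline address being page aligned; for
an unaligned input the two forms would diverge.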


Re: [Xen-devel] [PATCH v3 4/5] x86/hyperv: retrieve vp_index from Hyper-V

2020-01-07 Thread Wei Liu
On Mon, Jan 06, 2020 at 11:31:50AM +0100, Jan Beulich wrote:
> On 05.01.2020 17:48, Wei Liu wrote:
> > --- a/xen/include/asm-x86/guest/hyperv.h
> > +++ b/xen/include/asm-x86/guest/hyperv.h
> > @@ -66,6 +66,7 @@ struct ms_hyperv_info {
> >  extern struct ms_hyperv_info ms_hyperv;
> >  
> >  DECLARE_PER_CPU(void *, hv_pcpu_input_arg);
> > +DECLARE_PER_CPU(unsigned int, hv_vp_index);
> 
> Same question here - will this need to be visible outside of the
> file defining the variable? In the other patch as well as here,
> if the answer is yes, the next question would be whether it needs
> to be visible outside of xen/arch/x86/guest/hyperv/ (i.e. whether
> it shouldn't live in a private header).

Private header should be fine for now.

Wei.

> 
> Jan


Re: [Xen-devel] [PATCH v3 3/5] x86/hyperv: provide percpu hypercall input page

2020-01-07 Thread Wei Liu
On Mon, Jan 06, 2020 at 11:27:18AM +0100, Jan Beulich wrote:
> On 05.01.2020 17:47, Wei Liu wrote:
> > Hyper-V's input / output argument must be 8 bytes aligned and not cross
> > page boundary. The easiest way to satisfy those requirements is to use
> > percpu page.
> 
> I'm not sure "easiest" is really true here. Others could consider adding
> __aligned() attributes as easy or even easier (by being even more
> transparent to use sites). Could we settle on "One way ..."?

Do you mean something like

   struct foo __aligned(8);

   hv_do_hypercall(OP, virt_to_maddr(), ...);

?

I don't think this is transparent to user sites. Plus, foo is on stack
which is 1) difficult to get its maddr, 2) may cross page boundary.

If I misunderstood what you meant, please give me an example here.

> 
> > @@ -83,14 +84,33 @@ static void __init setup_hypercall_page(void)
> >  wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> >  }
> >  
> > +static void setup_hypercall_pcpu_arg(void)
> > +{
> > +void *mapping;
> > +
> > +mapping = alloc_xenheap_page();
> > +if ( !mapping )
> > +panic("Failed to allocate hypercall input page for %u\n",
> 
> "... for CPU%u\n" please.
> 
> > +  smp_processor_id());
> > +
> > +this_cpu(hv_pcpu_input_arg) = mapping;
> 
> When offlining and then re-onlining a CPU, the prior page will be
> leaked.

Right. Thanks for catching this one.

> 
> > --- a/xen/include/asm-x86/guest/hyperv.h
> > +++ b/xen/include/asm-x86/guest/hyperv.h
> > @@ -51,6 +51,8 @@ static inline uint64_t hv_scale_tsc(uint64_t tsc, uint64_t scale,
> >  
> >  #ifdef CONFIG_HYPERV_GUEST
> >  
> > +#include 
> > +
> >  #include 
> >  
> >  struct ms_hyperv_info {
> > @@ -63,6 +65,8 @@ struct ms_hyperv_info {
> >  };
> >  extern struct ms_hyperv_info ms_hyperv;
> >  
> > +DECLARE_PER_CPU(void *, hv_pcpu_input_arg);
> 
> Will this really be needed outside of the file that defines it?
> 

This can live in a private header for the time being.

> Also, while looking at this I notice that - despite my earlier
> comment when giving the respective, sort-of-conditional ack -
> there are (still) many apparently pointless __packed attributes
> in hyperv-tlfs.h. Care to comment on this?

Again, that's a straight import from Linux. I tried not to deviate too
much. A commit in Linux (ec084491727b0) claims "compiler can add
alignment padding to structures or reorder struct members for
randomization and optimization".

I just checked all the packed structures. They seem to have all the
required manual paddings already. I can only assume they tried to err
on the safe side.

Wei.

> 
> Jan
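
One possible way to avoid the leak on CPU offline/online, sketched outside
of Xen. Everything here is a stand-in: the array replaces the per-CPU
variable, alloc_page_stub() replaces alloc_xenheap_page(), and the error
return replaces the panic(). The only point is the early return, which
keeps the page allocated the first time the CPU came up.

```c
#include <stdlib.h>

#define NR_CPUS 4 /* arbitrary for this sketch */

/* Stand-in for the per-CPU hv_pcpu_input_arg variable. */
static void *hv_pcpu_input_arg[NR_CPUS];

/* Counts successful allocations so that a leak would be visible. */
static unsigned int pages_allocated;

/* Hypothetical replacement for alloc_xenheap_page(). */
static void *alloc_page_stub(void)
{
    void *page = calloc(1, 4096);

    if ( page )
        ++pages_allocated;

    return page;
}

/* Run each time a CPU comes online; 0 on success, -1 on failure. */
static int setup_hypercall_pcpu_arg(unsigned int cpu)
{
    /* Re-onlining: reuse the page from the previous online cycle. */
    if ( hv_pcpu_input_arg[cpu] )
        return 0;

    hv_pcpu_input_arg[cpu] = alloc_page_stub();

    return hv_pcpu_input_arg[cpu] ? 0 : -1;
}
```

Freeing the page when the CPU goes offline would be the other obvious
option; which one fits better is not settled by this thread.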


Re: [Xen-devel] [PATCH 4/6] x86/boot: Clean up l?_bootmap[] construction

2020-01-07 Thread Jan Beulich
On 07.01.2020 17:16, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -584,21 +584,24 @@ static void __init efi_arch_memory_setup(void)
>>  if ( !efi_enabled(EFI_LOADER) )
>>  return;
>>  
>> -/* Initialise L2 identity-map and boot-map page table entries (16MB). */
>> +/*
>> + * Map Xen into the directmap (NX, needed for early-boot pagetable
>> + * handling/walking), and identity map Xen into bootmap (X, needed for the
>> + * transition from the EFI pagetables to Xen), using 2M superpages.
>> + */
> 
> How does NX vs X matter for the code below here? PAGE_HYPERVISOR and
> __PAGE_HYPERVISOR, as used below, differ by just _PAGE_GLOBAL. Did
> you mean to make further changes?
> 
>>  for ( i = 0; i < 8; ++i )
>>  {
>>  unsigned int slot = (xen_phys_start >> L2_PAGETABLE_SHIFT) + i;
>>  paddr_t addr = slot << L2_PAGETABLE_SHIFT;
>>  
>>  l2_identmap[slot] = l2e_from_paddr(addr, PAGE_HYPERVISOR|_PAGE_PSE);
>> -slot &= L2_PAGETABLE_ENTRIES - 1;
>>  l2_bootmap[slot] = l2e_from_paddr(addr, __PAGE_HYPERVISOR|_PAGE_PSE);
>>  }
>> -/* Initialise L3 boot-map page directory entries. */
>> -l3_bootmap[l3_table_offset(xen_phys_start)] =
>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>> -l3_bootmap[l3_table_offset(xen_phys_start + (8 << L2_PAGETABLE_SHIFT) - 1)] =
>> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
>> +
>> +/* Initialize L3 boot-map page directory entries. */
>> +for ( i = 0; i < 4; ++i )
>> +l3_bootmap[i] = l3e_from_paddr((UINTN)l2_bootmap + i * PAGE_SIZE,
>> +   __PAGE_HYPERVISOR);
> 
> The idea behind the original code was to be immune to the number
> of pages l2_bootmap[] covers, as long as it's at least one (which
> it'll always be, I would say). The minimum requirement to any
> change to this I have is that the build must break if the size
> assumption here is violated. I.e. there may not be a literal 4 as
> the upper loop bound here, or there would need to be a
> BUILD_BUG_ON() right next to it. But I'd really prefer if the
> code was left as is (perhaps with a comment added), unless you
> can point out actual issues with it (which I can't see in the
> description), or you can otherwise justify the change with better
> than "the EFI side is further complicated by spraying non-identity
> aliases into the mix."

And if this change is to be made, won't it mean the code in setup.c
commented with "Make boot page tables match non-EFI boot" can then
go away at the same time?

Jan
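
The request that the build must break when the size assumption is
violated could look roughly as follows. This is a standalone sketch, not
the Xen code: the array sizes, the simplified BUILD_BUG_ON() built on
_Static_assert, and the plain integer "entries" are assumptions for the
example. The loop bound is derived from the array instead of being a
literal 4, and the checks sit right next to the loop.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE            4096u
#define L2_PAGETABLE_ENTRIES 512u
#define ARRAY_SIZE(a)        (sizeof(a) / sizeof((a)[0]))

/* Simplified stand-in for Xen's BUILD_BUG_ON(). */
#define BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON: " #cond)

/* Stand-ins for the boot page tables; l2_bootmap[] spans four pages. */
static uint64_t l2_bootmap[4 * L2_PAGETABLE_ENTRIES]
    __attribute__((__aligned__(4096)));
static uint64_t l3_bootmap[L2_PAGETABLE_ENTRIES];

#define L2_BOOTMAP_PAGES (sizeof(l2_bootmap) / PAGE_SIZE)

/* Point one L3 entry at each page of l2_bootmap[]; returns entries written. */
static unsigned int init_l3_bootmap(uint64_t flags)
{
    unsigned int i;

    /* Break the build if the assumptions behind the loop stop holding. */
    BUILD_BUG_ON(sizeof(l2_bootmap) % PAGE_SIZE);
    BUILD_BUG_ON(L2_BOOTMAP_PAGES > ARRAY_SIZE(l3_bootmap));

    for ( i = 0; i < L2_BOOTMAP_PAGES; ++i )
        l3_bootmap[i] = ((uint64_t)(uintptr_t)l2_bootmap + i * PAGE_SIZE) |
                        flags;

    return i;
}
```

With the bound derived from the array, growing or shrinking l2_bootmap[]
changes the loop automatically, and an impossible size fails to compile.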


Re: [Xen-devel] [PATCH] MAINTAINERS: Add explicit check-in policy section

2020-01-07 Thread George Dunlap
On 1/7/20 1:05 PM, Jan Beulich wrote:
> On 07.01.2020 13:03, George Dunlap wrote:
>> DISCUSSION
>>
>> This seems to be a change from people's understanding of the current
>> policy.  Most people's understanding of the current policy seems to be:
>>
>> 1.  In order to get a change to a given file committed, it must have
>> an Ack or Review from at least one *maintainer* of that file other
>> than the submitter.
>>
>> 2. In the case where a file has only one maintainer, it must have an
>> Ack or Review from a "nested" maintainer.
>>
>> I.e., if I submitted something to x86/mm, it would require an Ack from
>> Jan or Andy, or (in exceptional circumstances) The Rest; but an Ack from
>> (say) Roger or Juergen wouldn't suffice.
>>
>> Let's call this the "maintainer-ack" approach (because it must have an
>> ack or r-b from a maintainer to be checked in), and the proposal in
>> this patch the "maintainer-approval" (since SoB from a maintainer
>> indicates approval).
>>
>> The core issue I have with "maintainer-ack" is that it makes the
>> maintainer less privileged with regard to writing code than
>> non-maintainers.  If component X has maintainers A and B, then a
>> non-maintainer can have code checked in if reviewed either by A or B.
>> If A or B wants code checked in, they have to wait for exactly one
>> person to review it.
>>
>> In fact, if B is quite busy, the easiest way for A really to get their
>> code checked in might be to hand it to a non-maintainer N, and ask N
>> to submit it as their own.  Then A can Ack the patches and check them
>> in.
>>
>> The current system, therefore, either sets up a perverse incentive (if
>> you think the behavior described above is unacceptable) or unnecessary
>> bureaucracy (if you think it's acceptable).  Either way I think we
>> should set up our system to avoid it.
> 
> I much appreciate this initiative of yours.
> 
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -104,7 +104,53 @@ Descriptions of section entries:
>> xen-maintainers-
>>  
>>  
>> -The meaning of nesting:
>> +Check-in policy
>> +===
>> +
>> +In order for a patch to be checked in, in general, several conditions
>> +must be met:
>> +
>> +1. In order to get a change to a given file committed, it must have
>> +   the approval of at least one maintainer of that file.
>> +
>> +   A patch of course needs Acks from the maintainers of each file that
>> +   it changes; so a patch which changes xen/arch/x86/traps.c,
>> +   xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would
>> +   require an Ack from each of the three sets of maintainers.
>> +
>> +   See below for rules on nested maintainership.
>> +
>> +2. It must have an Acked-by or a Reviewed-by from someone other than
>> +   the submitter.
> 
> I'd like to propose some further distinction here, albeit I'm not sure
> this isn't implied anyway. It might be that making explicit the
> distinction between A-b and R-b is sufficient - our current common
> understanding looks to be that only maintainers can "ack", and others
> would "review".

Well first of all, I don't think that's strictly true.  If a
non-maintainer raises a concern, the patch can't be checked in unless
that person is satisfied.  We sometimes assume silence is consent, but
it's much better for the person who raised the concern to say, "I am now
satisfied with this patch"; and the clearest and most concise way to do
that is to say "Acked-by".

But that sort of "Acked-by" isn't really what is meant by this section.
 I guess you'd like to say that such an Acked-by would not be sufficient
to check in a patch; it would have to be the stronger Reviewed-by.

The point of this sentence is not to define what Ack and Reviewed-by
mean, but that it must come from someone who is not the submitter.
However, it is true that someone may read that and be confused;
particularly as we don't seem to define it anywhere else in the tree, so
perhaps it's worth trying to clarify.

> Since the latter is implying a more thorough look at a
> patch, I think it wouldn't be right to allow (quoting text further
> down) "anyone in the community" to ack a random patch (I could probably
> talk my son into ack-ing my patches ;-) ). Perhaps, rather than
> limiting acks to maintainers of the changed code, we could extend this
> to maintainers of just some code for maintainer submitted patches (i.e.
> anyone named as M: at least once in ./MAINTAINERS)? People outside of
> whatever subset we might pick would be eligible to offer R-b only,
> implying of course that they actually did do a review.

I do actually prefer that only people in a "direct line" of
maintainership for that exact code (i.e., is a maintainer at whatever
level of specificity) be able to get Acks; and that anyone else should
be required to give a Reviewed-by.

This is of course again slightly more aggregate work for a maintainer
than for someone else, but I think that makes sense in this case.

How about this:

2. It must have 

Re: [Xen-devel] [PATCH v3 2/5] x86/hyperv: provide Hyper-V hypercall functions

2020-01-07 Thread Wei Liu
On Mon, Jan 06, 2020 at 10:38:23AM +0100, Jan Beulich wrote:
[...]
> > +
> > +static inline uint64_t hv_do_rep_hypercall(uint16_t code, uint16_t rep_count,
> > +   uint16_t varhead_size,
> > +   paddr_t input, paddr_t output)
> > +{
> > +uint64_t control = code;
> > +uint64_t status;
> > +uint16_t rep_comp;
> > +
> > +control |= (uint64_t)varhead_size << HV_HYPERCALL_VARHEAD_OFFSET;
> > +control |= (uint64_t)rep_count << HV_HYPERCALL_REP_COMP_OFFSET;
> > +
> > +do {
> > +status = hv_do_hypercall(control, input, output);
> > +if ( (status & HV_HYPERCALL_RESULT_MASK) != HV_STATUS_SUCCESS )
> > +break;
> > +
> > +rep_comp = (status & HV_HYPERCALL_REP_COMP_MASK) >>
> > +HV_HYPERCALL_REP_COMP_OFFSET;
> 
> MASK_EXTR()? (I then also wonder whether MASK_INSR() would better be
> used with some of the other constructs here.)

Sure, I can see if that can be used.

> 
> What's worse though - looking at the definition of
> HV_HYPERCALL_REP_COMP_MASK I notice that it and a few others use
> GENMASK_ULL(), when it was clearly said during review (perhaps of
> another but related patch) that this macro should not be used
> outside of Arm-specific code until it gets put into better shape:
> https://lists.xenproject.org/archives/html/xen-devel/2019-12/msg00705.html

That's a straight import from Linux. I only made the header build
without further inspection.

That can be fixed, of course.

Wei.
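
For reference, a sketch of how the MASK_EXTR() / MASK_INSR() suggestion
might look. The two macros are written the way I recall them from Xen's
headers; the mask values are assumptions mirroring the field positions
implied by the Linux definitions (result in bits 15:0, variable header
size from bit 17, rep count in bits 43:32) and are spelled out as plain
constants here. make_control() and reps_completed() are hypothetical
helper names.

```c
#include <stdint.h>

/* Extract / insert a field described by a contiguous mask. */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))

/* Assumed field layout, written as plain constants. */
#define HV_HYPERCALL_RESULT_MASK   0x000000000000ffffULL /* bits 15:0  */
#define HV_HYPERCALL_VARHEAD_MASK  0x0000000003fe0000ULL /* bits 26:17 */
#define HV_HYPERCALL_REP_COMP_MASK 0x00000fff00000000ULL /* bits 43:32 */

/* Build the control word without open-coded shifts. */
static uint64_t make_control(uint16_t code, uint16_t rep_count,
                             uint16_t varhead_size)
{
    return (uint64_t)code |
           MASK_INSR((uint64_t)varhead_size, HV_HYPERCALL_VARHEAD_MASK) |
           MASK_INSR((uint64_t)rep_count, HV_HYPERCALL_REP_COMP_MASK);
}

/* Number of completed repetitions reported in a status word. */
static uint16_t reps_completed(uint64_t status)
{
    return MASK_EXTR(status, HV_HYPERCALL_REP_COMP_MASK);
}
```

One behavioural difference to the shift-based code: MASK_INSR() silently
truncates a value that does not fit into the field, while a plain shift
would spill into the neighbouring bits.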


Re: [Xen-devel] [PATCH 2/6] x86/boot: Map the trampoline as read-only

2020-01-07 Thread Jan Beulich
On 07.01.2020 16:51, Andrew Cooper wrote:
> On 07/01/2020 15:21, Jan Beulich wrote:
>> On 06.01.2020 16:54, Andrew Cooper wrote:
>>> c/s ec92fcd1d08, which caused the trampoline GDT Access bits to be set,
>>> removed the final writes which occurred between enabling paging and switching
>>> to the high mappings.  There don't plausibly need to be any memory writes in
>>> the few instructions it takes to perform this transition.
>>>
>>> As a consequence, we can remove the RWX mapping of the trampoline.  It is RX
>>> via its identity mapping below 1M, and RW via the directmap.
>>>
>>> Signed-off-by: Andrew Cooper 
>> Reviewed-by: Jan Beulich 
>>
>>> This probably wants backporting, alongside ec92fcd1d08 if it hasn't yet.
>> This is just cleanup, largely cosmetic in nature. It could be argued
>> that once the directmap has disappeared this can serve as additional
>> proof that the trampoline range has no (intended) writable mappings
>> anymore, but prior to that point I don't see much further benefit.
>> Could you expand on the reasons why you see both as backporting
>> candidates?
> 
> Defence in depth.
> 
> An RWX mapping is very attractive for an attacker who's broken into Xen
> and is looking to expand the damage they can do.

Such an attacker is typically in the position though to make
themselves RWX mappings. Having as little as possible is only
complicating their job, not making it impossible, I would say.

Jan

Re: [Xen-devel] [PATCH v3 2/5] x86/hyperv: provide Hyper-V hypercall functions

2020-01-07 Thread Wei Liu
On Sun, Jan 05, 2020 at 10:06:08PM +, Andrew Cooper wrote:
> On 05/01/2020 21:22, Wei Liu wrote:
> > On Sun, Jan 05, 2020 at 07:08:28PM +, Andrew Cooper wrote:
> >>> +static inline uint64_t hv_do_hypercall(uint64_t control, paddr_t input, paddr_t output)
> >>> +{
> >>> +    uint64_t status;
> >>> +
> >>> +    asm volatile ("mov %[output], %%r8\n"
> >>> +                  "call hv_hypercall_page"
> >>> +                  : "=a" (status), "+c" (control),
> >>> +                    "+d" (input) ASM_CALL_CONSTRAINT
> >>> +                  : [output] "rm" (output)
> >>> +                  : "cc", "memory", "r8", "r9", "r10", "r11");
> >> I think you want:
> >>
> >> register unsigned long r8 asm("r8") = output;
> >>
> >> and "+r" (r8) as an output constraint.
> > Although it is named `output`, it is really just an input parameter from
> > Hyper-V's PoV.
> 
> Yes, but it is also clobbered.
> 
> This is an awkward corner case of gnu inline assembly.
> 
> It is not permitted to have a clobber list overlap with any input/output
> operations, and because r8 doesn't have a unique letter, you can't do
> the usual trick of "=r8" (discard) : "r8" (input).
> 
> The only available option is to mark it as read and written (which is
> "+r" in the output list), and not use the C variable r8 at any point later.

But r8 is only listed in clobber list, so it certainly doesn't overlap
with any input register. I fail to see what the bug (if there is any) is
here.

I think what you're asking for here is an optimisation. Is that correct?
I don't mind changing the code. What I need is clarification here.

> 
> 
> Having looked through the spec a bit more, is this a wise API to have in
> the first place?  input and output (perhaps better named input_addr and
> output_addr) are fixed per CPU, and control is semantically linked to
> the hypercall and its particular ABI.
> 
> I suppose the answer ultimately depends on what the callers look like.

The call sites will be like

struct hv_input_arg *input_arg;
input_arg = per_cpu_input_page;
input_arg->foo = xxx;
input_arg->bar = xxx;

hv_do_hypercall(control, virt_to_maddr(input_arg), NULL);

.

(Alternatively, we can put virt_to_maddr in hv_do_hypercall now that
we're sure the input page is from xenheap)

> 
> >
> >> In particular, that doesn't force the compiler to put output into a
> >> register other than r8 (or worse, spill it to the stack) to have the
> >> opaque blob of asm move it back into r8.  What it will do in practice is
> >> cause the compiler to construct output directly in r8.
> >>
> >> As for the other clobbers, I can't find anything at all in the spec
> >> which even mentions those registers.  There will be a decent improvement
> >> to code generation if we don't force them to be spilled around a hypercall.
> >>
> > Neither can I. But Linux's commit says that's needed, so I chose to err
> > on the safe side.
> 
> That's dull.  Is there any qualifying information?

See Linux commit fc53662f13b.

I will also ask my contact in Hyper-V team for clarification.

Wei.
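
To make the constraint discussion above concrete, here is a minimal
sketch of the register-variable form that was suggested: the value is
pinned to r8 with a local register variable and passed as a "+r"
operand, which tells the compiler that r8 is both read and overwritten.
The asm body is a stand-in, not the real hypercall page, and demo_call
is a hypothetical name. On non-x86-64 targets the sketch falls back to
plain C so that it still builds.

```c
#include <stdint.h>

#if defined(__x86_64__) && defined(__GNUC__)
static inline uint64_t demo_call(uint64_t control, uint64_t input,
                                 uint64_t output)
{
    uint64_t status;
    /* Pin the third argument to r8 instead of moving it inside the asm. */
    register uint64_t r8 __asm__("r8") = output;

    /*
     * Stand-in for "call hv_hypercall_page": sums the three inputs into
     * rax, then zeroes r8 to mimic a callee that clobbers it.
     */
    __asm__ volatile ("lea (%%rcx,%%rdx), %%rax\n\t"
                      "add %%r8, %%rax\n\t"
                      "xor %%r8d, %%r8d"
                      : "=a" (status), "+c" (control), "+d" (input),
                        "+r" (r8)
                      :
                      : "cc", "memory");

    /* r8 must not be read from C after this point. */
    return status;
}
#else
/* Plain C fallback with the same result, for other targets. */
static inline uint64_t demo_call(uint64_t control, uint64_t input,
                                 uint64_t output)
{
    return control + input + output;
}
#endif
```

With this form the compiler can materialise the argument directly in r8
rather than placing it elsewhere and having the asm move it.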

Re: [Xen-devel] [PATCH 4/6] x86/boot: Clean up l?_bootmap[] construction

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> The need for Xen to be identity mapped into the bootmap is not obvious, and
> differs between the MB and EFI boot paths.  Furthermore, the EFI side is
> further complicated by spraying non-identity aliases into the mix.

What (intentional) aliases are you talking about? The changes done here
don't remove any. Or do you mean the ones occurring as a side effect of
possibly using the same L2 in two L3 slots?

> Simplify the EFI bootmap construction code to make exactly one identity-map of
> Xen, which now matches the MB path.  Comment both pieces of logic, explaining
> what the mappings are needed for.

Is both boot map variants fully matching actually needed for anything?

> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -584,21 +584,24 @@ static void __init efi_arch_memory_setup(void)
>  if ( !efi_enabled(EFI_LOADER) )
>  return;
>  
> -/* Initialise L2 identity-map and boot-map page table entries (16MB). */
> +/*
> + * Map Xen into the directmap (NX, needed for early-boot pagetable
> + * handling/walking), and identity map Xen into bootmap (X, needed for the
> + * transition from the EFI pagetables to Xen), using 2M superpages.
> + */

How does NX vs X matter for the code below here? PAGE_HYPERVISOR and
__PAGE_HYPERVISOR, as used below, differ by just _PAGE_GLOBAL. Did
you mean to make further changes?

>  for ( i = 0; i < 8; ++i )
>  {
>  unsigned int slot = (xen_phys_start >> L2_PAGETABLE_SHIFT) + i;
>  paddr_t addr = slot << L2_PAGETABLE_SHIFT;
>  
>  l2_identmap[slot] = l2e_from_paddr(addr, PAGE_HYPERVISOR|_PAGE_PSE);
> -slot &= L2_PAGETABLE_ENTRIES - 1;
>  l2_bootmap[slot] = l2e_from_paddr(addr, __PAGE_HYPERVISOR|_PAGE_PSE);
>  }
> -/* Initialise L3 boot-map page directory entries. */
> -l3_bootmap[l3_table_offset(xen_phys_start)] =
> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
> -l3_bootmap[l3_table_offset(xen_phys_start + (8 << L2_PAGETABLE_SHIFT) - 1)] =
> -l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
> +
> +/* Initialize L3 boot-map page directory entries. */
> +for ( i = 0; i < 4; ++i )
> +l3_bootmap[i] = l3e_from_paddr((UINTN)l2_bootmap + i * PAGE_SIZE,
> +   __PAGE_HYPERVISOR);

The idea behind the original code was to be immune to the number
of pages l2_bootmap[] covers, as long as it's at least one (which
it'll always be, I would say). The minimum requirement to any
change to this I have is that the build must break if the size
assumption here is violated. I.e. there may not be a literal 4 as
the upper loop bound here, or there would need to be a
BUILD_BUG_ON() right next to it. But I'd really prefer if the
code was left as is (perhaps with a comment added), unless you
can point out actual issues with it (which I can't see in the
description), or you can otherwise justify the change with better
than "the EFI side is further complicated by spraying non-identity
aliases into the mix."

Jan
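
A minimal sketch of the kind of build-time check asked for above: the
literal loop bound is tied to the actual size of the L2 array, so the
build breaks if the two ever disagree. This uses C11 _Static_assert as a
stand-in for a BUILD_BUG_ON()-style macro. The array sizes and all
DEMO_/demo_ names are assumptions made for illustration only.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE            4096u
#define DEMO_L2_PAGETABLE_ENTRIES 512u
#define DEMO_L3_SLOTS             4u

/* Fails the build when cond is true. */
#define DEMO_BUILD_BUG_ON(cond) \
    _Static_assert(!(cond), "build-time assumption violated")

/* Four consecutive L2 tables, 8 bytes per entry. */
static uint64_t demo_l2_bootmap[DEMO_L3_SLOTS * DEMO_L2_PAGETABLE_ENTRIES];
static uint64_t demo_l3_bootmap[DEMO_L3_SLOTS];

/* Point each L3 slot at one page of the L2 array; returns slots filled. */
static inline unsigned int demo_fill_l3(void)
{
    unsigned int i;
    uint64_t l2_base = (uint64_t)(uintptr_t)demo_l2_bootmap;

    /* The loop bound below is only valid for exactly this array size. */
    DEMO_BUILD_BUG_ON(sizeof(demo_l2_bootmap) !=
                      DEMO_L3_SLOTS * DEMO_PAGE_SIZE);

    for ( i = 0; i < DEMO_L3_SLOTS; ++i )
        demo_l3_bootmap[i] = l2_base + (uint64_t)i * DEMO_PAGE_SIZE;

    return i;
}
```

Changing the array size without updating DEMO_L3_SLOTS (or the reverse)
then stops the build instead of silently mapping the wrong range.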

Re: [Xen-devel] [PATCH 2/6] x86/boot: Map the trampoline as read-only

2020-01-07 Thread Andrew Cooper
On 07/01/2020 15:21, Jan Beulich wrote:
> On 06.01.2020 16:54, Andrew Cooper wrote:
>> c/s ec92fcd1d08, which caused the trampoline GDT Access bits to be set,
>> removed the final writes which occurred between enabling paging and switching
>> to the high mappings.  There don't plausibly need to be any memory writes in
>> the few instructions it takes to perform this transition.
>>
>> As a consequence, we can remove the RWX mapping of the trampoline.  It is RX
>> via its identity mapping below 1M, and RW via the directmap.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 
>
>> This probably wants backporting, alongside ec92fcd1d08 if it hasn't yet.
> This is just cleanup, largely cosmetic in nature. It could be argued
> that once the directmap has disappeared this can serve as additional
> proof that the trampoline range has no (intended) writable mappings
> anymore, but prior to that point I don't see much further benefit.
> Could you expand on the reasons why you see both as backporting
> candidates?

Defence in depth.

An RWX mapping is very attractive for an attacker who's broken into Xen
and is looking to expand the damage they can do.

~Andrew

Re: [Xen-devel] [PATCH 3/6] x86/boot: Remove the preconstructed low 16M superpage mappings

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> First, it is undefined to have superpages and MTRRs disagree on cacheability
> boundaries, and nothing this early in boot has checked that it is safe to use
> superpages here.

Stating this here gives, at least to me, the impression that you change
things here to obey to these restrictions. I don't see you do so, though
- map_pages_to_xen() doesn't query MTRRs at all afaics.

> Furthermore, nothing actually uses the mappings on boot.  Build these entries
> in the directmap when walking the E820 table along with everything else.

I'm pretty sure some of these mappings were used, perhaps long ago, and
possibly only by the 32-bit hypervisor. It would feel quite a bit better
if it was clear when the need for this disappeared. I wonder if I could
talk you into finding out, so you could say so here.

> --- a/xen/arch/x86/boot/x86_64.S
> +++ b/xen/arch/x86/boot/x86_64.S
> @@ -66,24 +66,19 @@ l1_identmap:
>  .size l1_identmap, . - l1_identmap
>  
>  /*
> - * __page_tables_start does not cover l1_identmap because it (l1_identmap)
> - * contains 1-1 mappings. This means that frame addresses of these mappings
> - * are static and should not be updated at runtime.
> + * __page_tables_{start,end} cover the range of pagetables which need
> + * relocating as Xen moves around physical memory.  i.e. each sym_offs()
> + * reference to a different pagetable in the Xen image.
>   */
>  GLOBAL(__page_tables_start)
>  
>  /*
> - * Space for mapping the first 4GB of memory, with the first 16 megabytes
> - * actualy mapped (mostly using superpages).  Uses 4x 4k pages.
> + * Space for 4G worth of 2M mappings, first 2M actually mapped via
> + * l1_identmap[].  Uses 4x 4k pages.

Would you mind making this say "page tables" instead of "pages" in the
2nd sentence?

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1020,8 +1020,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>   *
>   * We require superpage alignment because the boot allocator is
>   * not yet initialised. Hence we can only map superpages in the
> - * address range BOOTSTRAP_MAP_BASE to 4GB, as this is guaranteed
> - * not to require dynamic allocation of pagetables.
> + * address range 2MB to 4GB, as this is guaranteed not to require
> + * dynamic allocation of pagetables.
>   *
>   * As well as mapping superpages in that range, in preparation for
>   * initialising the boot allocator, we also look for a region to which
> @@ -1036,10 +1036,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  if ( boot_e820.map[i].type != E820_RAM )
>  continue;
>  
> -/* Superpage-aligned chunks from BOOTSTRAP_MAP_BASE. */
> +/* Superpage-aligned chunks from 2MB. */
>  s = (boot_e820.map[i].addr + mask) & ~mask;
>  e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
> -s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
> +s = max_t(uint64_t, s, MB(2));
>  if ( s >= e )
>  continue;
>  
> @@ -1346,8 +1346,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>  set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
>  
> -/* Need to create mappings above BOOTSTRAP_MAP_BASE. */
> -map_s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
> +/* Need to create mappings above 2MB. */
> +map_s = max_t(uint64_t, s, MB(2));

Instead of hard coding 2Mb everywhere, how about simply reducing
BOOTSTRAP_MAP_BASE? This would then also ease shrinking the build
time mappings further, e.g. to the low 1Mb (instead of touching
several of the places you touch now, it would again mainly be an
adjustment to BOOTSTRAP_MAP_BASE, alongside the assembly file
changes needed).

Jan

Re: [Xen-devel] [PATCH v3 1/5] x86/hyperv: setup hypercall page

2020-01-07 Thread Wei Liu
On Sun, Jan 05, 2020 at 09:57:56PM +, Andrew Cooper wrote:
[...]
> >
> >> The locked bit is probably a good idea, but one aspect missing here is
> >> the check to see whether the hypercall page is already enabled, which I
> >> expect is for a kexec crash scenario.
> >>
> >> However, the most important point is the one which describes the #GP
> >> properties of the guest trying to modify the page.  This can only be
> >> achieved with an EPT/NPT mapping lacking the W permission, which will
> >> shatter host superpages.   Therefore, putting it in .text is going to be
> >> rather poor, perf wise.
> >>
> >> I also note that Xen's implementation of the Viridian hypercall page
> >> doesn't conform to these properties, and wants fixing.  It is going to
> >> need a new kind identification of the page (probably a new p2m type)
> >> which injects #GP if we ever see an EPT_VIOLATION/NPT_FAULT against it.
> >>
> >> As for suggestions here, I'm struggling to find any memory map details
> >> exposed in the Viridian interface, and therefore which gfn is best to
> >> choose.  I have a sinking feeling that the answer is ACPI...
> > TLFS only says "go find one suitable page yourself" without further
> > hints.
> >
> > Since we're still quite far away from a functioning system, finding a
> > most suitable page isn't my top priority at this point. If there is a
> > simple way to extrapolate suitable information from ACPI, that would be
> > > great. If it requires writing a set of functionalities, then that will
> > need to wait till later.
> 
> To cope with the "one is already established and it is already locked"
> case, the only option is to have a fixmap entry which can be set
> dynamically.  The problem is that the fixmap region is marked NX and 64G
> away from .text.
> 
> Possibly the least bad option is to have some build-time space (so 0 or
> 4k depending on CONFIG_HYPERV) between the per-cpu stubs and
> XEN_VIRT_END, which operates like the fixmap, but ends up as X/RO mappings.
> 

OK. This is probably not too difficult. 

> That way, the virtual address ends up in a useful position (wrt using
> direct call instructions) irrespective of where the gfn is/ends up.  As
> for guessing, a good start is probably MAXPHYSADDR.

To make sure I understand you correctly: you're talking about using the
page just below MAXPHYSADDR (derived from paddr_bits from xen source),
right?

Wei.

> 
> ~Andrew

Re: [Xen-devel] [PATCH 2/6] x86/boot: Map the trampoline as read-only

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> c/s ec92fcd1d08, which caused the trampoline GDT Access bits to be set,
> removed the final writes which occurred between enabling paging and switching
> to the high mappings.  There don't plausibly need to be any memory writes in
> the few instructions it takes to perform this transition.
> 
> As a consequence, we can remove the RWX mapping of the trampoline.  It is RX
> via its identity mapping below 1M, and RW via the directmap.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

> This probably wants backporting, alongside ec92fcd1d08 if it hasn't yet.

This is just cleanup, largely cosmetic in nature. It could be argued
that once the directmap has disappeared this can serve as additional
proof that the trampoline range has no (intended) writable mappings
anymore, but prior to that point I don't see much further benefit.
Could you expand on the reasons why you see both as backporting
candidates?


Jan

Re: [Xen-devel] [PATCH 1/6] x86/boot: Check for E820_RAM earlier when searching the E820

2020-01-07 Thread Jan Beulich
On 06.01.2020 16:54, Andrew Cooper wrote:
> There is no point performing the masking calculations if we are going to
> throw the result away.

A reasonably optimizing compiler ought to do so. It's slightly less
source code the original way. Nevertheless I don't really mind the
change, so ...

> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

Re: [Xen-devel] [PATCH] xen/arm: vgic-v3: Fix the typo of GICD IRQ active status range

2020-01-07 Thread Julien Grall



On 07/01/2020 12:55, Wei Xu wrote:

Hi Julien,
As only one entity should manage the UART (i.e Xen or Dom0), we today 
assume this will be managed by Xen. Xen should expose a partial 
virtual UART (only a few registers are emulated) to dom0 in replacement.


This is usually done by the UART driver. Looking at the log you pasted 
in a separate e-mail:


(XEN) Platform: Generic System
(XEN) Unable to initialize acpi uart: -9
(XEN) Bad console= option 'dtuart'

So Xen didn't manage to initialize the uart. The -9 suggests, Xen 
didn't find a driver for your UART. At the moment, Xen is only able to 
detect pl011, sbsa, sbsa32 UART for ACPI. What is the type of the UART 
used on your platform?




Thanks!
Got it.
Our UART is 8250.


You would need to teach the 8250 driver how to initialize the UART with 
ACPI. It is not very difficult to do it, have a look at the pl011 version.




Thanks!
It is not working even after I changed the condition to " if ( acpi_disabled ) ".


Doh, thank you for spotting the extra !.


My grub 2.04 configuration is as below:

     xen_hypervisor /xen dom0_mem=4G acpi=force loglvl=all guest_loglvl=all
     xen_module /Image rdinit=/init  acpi=force noinitrd root=/dev/sdb1 rw

The log with the condition " if ( acpi_disabled ) " is as following:

     (XEN) Adding cpu 126 to runqueue 0
     (XEN) Adding cpu 127 to runqueue 0
     (XEN) alternatives: Patching with alt table 002d4f48 -> 002d5764

     (XEN) *** LOADING DOMAIN 0 ***
     (XEN) Loading d0 kernel from boot module @ 16257000
     (XEN) Allocating 1:1 mappings totalling 4096MB for dom0:
     (XEN) BANK[0] 0x000800-0x001000 (128MB)
     (XEN) BANK[1] 0x002000-0x003800 (384MB)
     (XEN) BANK[2] 0x005000-0x008000 (768MB)
     (XEN) BANK[3] 0x002020-0x0020208000 (2048MB)
     (XEN) BANK[4] 0x002020b000-0x002020c000 (256MB)
     (XEN) BANK[5] 0x002026-0x0020262000 (512MB)
     (XEN) Grant table range: 0x00181c7000-0x0018207000
     (XEN) Allocating PPI 16 for event channel interrupt
     (XEN) Loading zImage from 16257000 to 0808-09981200

     (XEN) Loading d0 DTB to 0x0fe0-0x0fe0025b
     (XEN) Initial low memory virq threshold set at 0x4000 pages.
     (XEN) Scrubbing Free RAM in background
     (XEN) Std. Loglevel: All
     (XEN) Guest Loglevel: All
     (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)

     (XEN) Data Abort Trap. Syndrome=0x6
     (XEN) Walking Hypervisor VA 0x10 on CPU0 via TTBR 0x182ff000
     (XEN) 0TH[0x0] = 0x18302f7f
     (XEN) 1ST[0x0] = 0x18300f7f
     (XEN) 2ND[0x0] = 0x
     (XEN) CPU0: Unexpected Trap: Data Abort
     (XEN) [ Xen-4.13.0-rc  arm64  debug=y   Not tainted ]
     (XEN) CPU:    0
     (XEN) PC: 002b65c8 002b65c8


Can you look with addr2line what this PC refers to?

Cheers,

--
Julien Grall

Re: [Xen-devel] [PATCH V6 1/4] x86/mm: Add array_index_nospec to guest provided index values

2020-01-07 Thread Jan Beulich
On 07.01.2020 15:31, Alexandru Stefan ISAILA wrote:
> 
> 
> On 07.01.2020 15:55, Jan Beulich wrote:
>> On 07.01.2020 14:25, Alexandru Stefan ISAILA wrote:
>>> On 27.12.2019 10:01, Jan Beulich wrote:
>>>> On 23.12.2019 15:04, Alexandru Stefan ISAILA wrote:
>>>>> --- a/xen/arch/x86/mm/mem_access.c
>>>>> +++ b/xen/arch/x86/mm/mem_access.c
>>>>> @@ -366,11 +366,12 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
>>>>>  #ifdef CONFIG_HVM
>>>>>  if ( altp2m_idx )
>>>>>  {
>>>>> -if ( altp2m_idx >= MAX_ALTP2M ||
>>>>> - d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) )
>>>>> +if ( altp2m_idx >=  min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP) ||
>>>>
>>>> Stray blank after >= .
>>>>
>>>>> + d->arch.altp2m_eptp[array_index_nospec(altp2m_idx, MAX_EPTP)] ==
>>>>
>>>> I accept you can't (currently) use array_access_nospec() here,
>>>> but ...
>>>>
>>>>> + mfn_x(INVALID_MFN) )
>>>>>  return -EINVAL;
>>>>>
>>>>> -ap2m = d->arch.altp2m_p2m[altp2m_idx];
>>>>> +ap2m = d->arch.altp2m_p2m[array_index_nospec(altp2m_idx, MAX_ALTP2M)];
>>>>
>>>> ... I don't see why you still effectively open-code it here.
>>>
>>> I am not sure I follow you here, that is what we agreed in v5
>>> (https://lists.xenproject.org/archives/html/xen-devel/2019-12/msg01704.html).
>>> Did I miss something?
>>
>> In context there (from an earlier reply of mine) you will find me
>> having mentioned array_access_nospec(). This wasn't invalidated or
>> overridden by my "Yes, that's how I think it ought to be." I didn't
>> say so explicitly (again) because to me it goes without saying that
>> open-coding _anything_ is, in the common case, bad practice.
>>
> 
> So the way to go is to have:
> 
> altp2m_idx = array_index_nospec(altp2m_idx, MAX_ALTP2M);
> ap2m = d->arch.altp2m_p2m[altp2m_idx];

No. The way to go is to use array_access_nospec() wherever possible.
Besides (as said) avoiding its open-coding, this is the construct
correctly matching your uses of ARRAY_SIZE(), avoiding the explicit
specification of the upper array bound (MAX_ALTP2M). (I really don't
see how my previous reply was not crystal clear in this regard.)

Jan
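
To illustrate the distinction drawn above, here is a minimal sketch of
an array_access_nospec()-style wrapper that takes its bound from the
array itself instead of a separately named constant. The clamp is a
generic branch-free mask in plain C and is only a stand-in: real
implementations may use architecture-specific sequences, and the
expression relies on arithmetic right shift of negative values, which
GCC and Clang provide. All demo_/DEMO_ names are hypothetical.

```c
#include <limits.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* All-ones if index < size, zero otherwise, computed without a branch. */
static inline unsigned long demo_index_mask(unsigned long index,
                                            unsigned long size)
{
    return ~(long)(index | (size - 1 - index)) >>
           (sizeof(long) * CHAR_BIT - 1);
}

/* Clamp an index: out-of-range values collapse to 0. */
#define demo_index_nospec(index, size) \
    ((index) & demo_index_mask(index, size))

/* The bound comes from the array, so it cannot get out of sync. */
#define demo_array_access_nospec(array, index) \
    ((array)[demo_index_nospec(index, DEMO_ARRAY_SIZE(array))])

static const int demo_table[4] = { 10, 11, 12, 13 };

static inline int demo_lookup(unsigned long idx)
{
    return demo_array_access_nospec(demo_table, idx);
}
```

The ordinary architectural bounds check is still needed before the
access. The wrapper only constrains what a mispredicted path can read.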

Re: [Xen-devel] Community Call: Call for Agenda Items and call details for Jan 9, 16:00 - 17:00 UTC

2020-01-07 Thread Tamas K Lengyel
> > > Please send me agenda items for this Thursday's community call (we
> > agreed to move it by 1 week) preferably by Wednesday!
> > >
> > > A draft agenda is
> > at https://cryptpad.fr/pad/#/2/pad/edit/ERZtMYD5j6k0sv-NG6Htl-AJ/
> > > Please add agenda items to the document or reply to this e-mail
> >
> > I think it would be very helpful for the community in general to know
> > any specific plans each of us have for the 4.14 timeframe.
> >
> > I personally am aware of a fair quantity of work from various people,
> > but it is clear that the community as a whole doesn't really have an
> > idea of who is working on what.
> >
> > My contribution to the discussion starts with
> > https://lore.kernel.org/xen-devel/941cf23c-13ed-14a1-fd25-
> > 45b001d95...@citrix.com/T/#u
> > but I think it would be helpful if others gave at least a brief overview
> > of any plans and whether they are intending the work to hit the next
> > release, or whether it is more likely to be a future release.
>
> Agreed. I need a baseline list of items to track for 4.14.

My vm forking series has been posted and it's at v3. Most of the
patches are just mem_sharing cleanups with no functional change but
still need an ack from an x86 maintainer as I'm the only maintainer
listed for mem_sharing itself.

Tamas

[Xen-devel] [PATCH] libxl: don't needlessly report "highmem" in use

2020-01-07 Thread Jan Beulich
Due to the unconditional updating of dom->highmem_end in
libxl__domain_device_construct_rdm() I've observed on a 2Gb HVM guest
with a passed through device (without overly large BARs, and with no RDM
ranges at all)

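(d2) RAM in high memory; setting high_mem resource base to 100000000
...
(d2) E820 table:
(d2)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d2)  HOLE: 00000000:000a0000 - 00000000:000d0000
(d2)  [01]: 00000000:000d0000 - 00000000:00100000: RESERVED
(d2)  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(d2)  HOLE: 00000000:7f800000 - 00000000:fc000000
(d2)  [03]: 00000000:fc000000 - 00000001:00000000: RESERVED
(d2)  [04]: 00000001:00000000 - 00000001:00000000: RAM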

both of which aren't really appropriate in this case. Arrange for this
to not happen.

Signed-off-by: Jan Beulich 

--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -432,7 +432,7 @@ int libxl__domain_device_construct_rdm(l
 uint16_t seg;
 uint8_t bus, devfn;
 uint64_t rdm_start, rdm_size;
-uint64_t highmem_end = dom->highmem_end ? dom->highmem_end : (1ull<<32);
+uint64_t highmem_end = dom->highmem_end;
 
 /*
  * We just want to construct RDM once since RDM is specific to the
@@ -557,6 +557,8 @@ int libxl__domain_device_construct_rdm(l
  * We will move downwards lowmem_end so we have to expand
  * highmem_end.
  */
+if (!highmem_end)
+highmem_end = 1ull << 32;
 highmem_end += (dom->lowmem_end - rdm_start);
 /* Now move downwards lowmem_end. */
 dom->lowmem_end = rdm_start;
@@ -577,9 +579,10 @@ int libxl__domain_device_construct_rdm(l
 conflict = overlaps_rdm(0, dom->lowmem_end,
 rdm_start, rdm_size);
 /* Does this entry conflict with highmem? */
-conflict |= overlaps_rdm((1ULL<<32),
- dom->highmem_end - (1ULL<<32),
- rdm_start, rdm_size);
+if (highmem_end)
+conflict |= overlaps_rdm((1ULL << 32),
+ highmem_end - (1ULL << 32),
+ rdm_start, rdm_size);
 
 if (!conflict)
 continue;
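
The idea behind the change can be sketched in isolation as follows. This
is a simplified model, not the libxl code: highmem_end stays 0 (meaning
"no RAM above 4G") until lowmem_end actually has to be moved down, and
only then is it given the 4G base. The function names and the example
addresses are hypothetical.

```c
#include <stdint.h>

#define DEMO_4G (UINT64_C(1) << 32)

/* highmem_end == 0 is taken to mean "no RAM above 4G". */
static inline int demo_has_highmem(uint64_t highmem_end)
{
    return highmem_end != 0;
}

/*
 * Move the RAM between rdm_start and lowmem_end above 4G, giving
 * highmem_end its 4G base only at the point where it is first needed.
 */
static inline uint64_t demo_grow_highmem(uint64_t highmem_end,
                                         uint64_t lowmem_end,
                                         uint64_t rdm_start)
{
    if (!highmem_end)
        highmem_end = DEMO_4G;

    return highmem_end + (lowmem_end - rdm_start);
}
```

A guest whose low memory is never moved keeps highmem_end at 0, so
nothing reports or maps an empty high-memory range.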

Re: [Xen-devel] [PATCH] x86/mem_sharing: Fix RANDCONFIG build

2020-01-07 Thread Tamas K Lengyel
On Tue, Jan 7, 2020 at 6:49 AM Andrew Cooper  wrote:
>
> Travis reports: https://travis-ci.org/andyhhp/xen/jobs/633751811
>
>   mem_sharing.c:361:13: error: 'rmap_has_entries' defined but not used [-Werror=unused-function]
>static bool rmap_has_entries(const struct page_info *page)
>^
>   cc1: all warnings being treated as errors
>
> This happens in a release build (disables MEM_SHARING_AUDIT) when
> CONFIG_MEM_SHARING is enabled.

My bad, seemed to have missed this somehow.

>
> Mark the helper as maybe_unused.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Tamas K Lengyel 
>
> The alternative is to delete the helper and opencode it for its one caller.

IMHO that would be better, no reason to keep this trivial check as a
separate function for one caller. Same stands for the
rmap_has_one_entry function as well (feel free to bunch that in too
but I could also do that separately).

Thanks,
Tamas
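
For reference, a minimal sketch of how this kind of warning arises and
what the annotation does: a static helper whose only caller is compiled
out in some configurations becomes "defined but not used" unless it is
marked as possibly unused. DEMO_AUDIT, DEMO_MAYBE_UNUSED and the demo_
functions are hypothetical stand-ins, not the mem_sharing code.

```c
#include <stdbool.h>

/* Stand-in for a maybe-unused annotation (GCC/Clang attribute). */
#define DEMO_MAYBE_UNUSED __attribute__((__unused__))

struct demo_page {
    unsigned int rmap_count;
};

/* Without the attribute, non-audit builds warn about this helper. */
static bool DEMO_MAYBE_UNUSED
demo_rmap_has_entries(const struct demo_page *pg)
{
    return pg->rmap_count != 0;
}

static inline bool demo_audit_ok(const struct demo_page *pg)
{
#ifdef DEMO_AUDIT
    /* Only audit builds reference the helper. */
    return demo_rmap_has_entries(pg);
#else
    (void)pg;
    return true;
#endif
}
```

Open-coding the one-line check at its single call site, as preferred in
the reply above, removes the helper and the need for the annotation.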

[Xen-devel] [qemu-mainline test] 145736: regressions - FAIL

2020-01-07 Thread osstest service owner
flight 145736 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/145736/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm  6 xen-build  fail REGR. vs. 144861
 build-arm64      6 xen-build  fail REGR. vs. 144861
 build-i386       6 xen-build  fail REGR. vs. 144861
 build-amd64      6 xen-build  fail REGR. vs. 144861
 build-i386-xsm   6 xen-build  fail REGR. vs. 144861
 build-amd64-xsm  6 xen-build  fail REGR. vs. 144861
 build-armhf      6 xen-build  fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl  1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict  1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt-xsm  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl  1 build-check(1)  blocked  n/a
 test-amd64-amd64-pair  1 build-check(1)  blocked  n/a
 test-amd64-i386-pair  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-credit1  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl-credit1  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-vhd  1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)  blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-arndale  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-xsm  1 build-check(1)  blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-credit2  1 build-check(1)  blocked  n/a
 build-i386-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-credit1  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl-xsm  1 build-check(1)  blocked  n/a
 build-arm64-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)  blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-pvshim  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvshim  1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl-seattle  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-shadow  1 build-check(1)  blocked  n/a
 build-armhf-libvirt  1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)  blocked  n/a
 test-arm64-arm64-xl-credit2  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 

Re: [Xen-devel] Recent cores-scheduling failures

2020-01-07 Thread Sergey Dyasli
On 20/12/2019 06:26, Jürgen Groß wrote:
> On 19.12.19 13:45, Sergey Dyasli wrote:
>> Hi Juergen,
>>
>> We recently did another quick test of core scheduling mode, and the following
>> failures were found:
>>
>> 1. live-patch apply failures:
>>
>>  (XEN) [ 1058.751974] livepatch: lp_1_1: Timed out on semaphore in CPU quiesce phase 30/31
>>  (XEN) [ 1058.751982] livepatch: lp_1_1 finished REPLACE with rc=-16
>>
>> 2. ACPI S5 crash:
>>
>>  https://paste.debian.net/1121748/
>
> Are there any XenServer patches in your hypervisor?
>
> I'm asking because I don't see why a vcpu would be freed when shutting
> down the host (other than by any shutdown scripts, but those should be
> long finished when trying to enter S5).

While we have the patch-queue applied in our testing, there is nothing
there that would affect the scheduler directly.

The S5 crash reproduces reliably in automated testing, but I still don't
know how to trigger the issue manually.

--
Thanks,
Sergey

Re: [Xen-devel] [PATCH V6 1/4] x86/mm: Add array_index_nospec to guest provided index values

2020-01-07 Thread Alexandru Stefan ISAILA


On 07.01.2020 15:55, Jan Beulich wrote:
> On 07.01.2020 14:25, Alexandru Stefan ISAILA wrote:
>> On 27.12.2019 10:01, Jan Beulich wrote:
>>> On 23.12.2019 15:04, Alexandru Stefan ISAILA wrote:
>>>> --- a/xen/arch/x86/mm/mem_access.c
>>>> +++ b/xen/arch/x86/mm/mem_access.c
>>>> @@ -366,11 +366,12 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn,
>>>> uint32_t nr,
>>>>   #ifdef CONFIG_HVM
>>>>   if ( altp2m_idx )
>>>>   {
>>>> -if ( altp2m_idx >= MAX_ALTP2M ||
>>>> - d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) )
>>>> +if ( altp2m_idx >=  min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP)
>>>> ||
>>>
>>> Stray blank after >= .
>>>
>>>> + d->arch.altp2m_eptp[array_index_nospec(altp2m_idx,
>>>> MAX_EPTP)] ==
>>>
>>> I accept you can't (currently) use array_access_nospec() here,
>>> but ...
>>>
>>>> + mfn_x(INVALID_MFN) )
>>>>   return -EINVAL;
>>>>
>>>> -ap2m = d->arch.altp2m_p2m[altp2m_idx];
>>>> +ap2m = d->arch.altp2m_p2m[array_index_nospec(altp2m_idx,
>>>> MAX_ALTP2M)];
>>>
>>> ... I don't see why you still effectively open-code it here.
>>
>> I am not sure I follow you here, that is what we agreed in v5
>> (https://lists.xenproject.org/archives/html/xen-devel/2019-12/msg01704.html).
>> Did I miss something?
> 
> In context there (from an earlier reply of mine) you will find me
> having mentioned array_access_nospec(). This wasn't invalidated or
> overridden by my "Yes, that's how I think it ought to be." I didn't
> say so explicitly (again) because to me it goes without saying that
> open-coding _anything_ is, in the common case, bad practice.
> 

So the way to go is to have:

altp2m_idx = array_index_nospec(altp2m_idx, MAX_ALTP2M);
ap2m = d->arch.altp2m_p2m[altp2m_idx];
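
For readers less familiar with the pattern, here is a minimal, self-contained
sketch of "bounds-check, clamp once, then index". It is not Xen's
implementation: index_mask_nospec(), index_nospec(), lookup_slot() and
NR_SLOTS are hypothetical names, the int table merely stands in for
d->arch.altp2m_p2m[], and the mask arithmetic assumes that right-shifting a
negative long is an arithmetic shift (true for GCC and Clang).

```c
#include <limits.h>
#include <stddef.h>

#define NR_SLOTS 10  /* hypothetical stand-in for MAX_ALTP2M */

/*
 * All-ones when index < size, zero otherwise, computed without a branch
 * so the result does not hinge on a possibly mispredicted comparison.
 * Assumes size <= LONG_MAX and an arithmetic right shift of negative longs.
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1 - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an out-of-bounds index to 0, even on a speculative path. */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}

/* Returns the slot for idx, or NULL when idx is out of range. */
static const int *lookup_slot(const int *table, unsigned long idx)
{
    if ( idx >= NR_SLOTS )
        return NULL;

    /* Clamp once, then reuse the clamped value for every later access. */
    idx = index_nospec(idx, NR_SLOTS);

    return &table[idx];
}
```

Clamping the variable itself, rather than wrapping each array subscript,
keeps every later use of idx covered by the same guard.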


Alex

[Xen-devel] CPU Lockup bug with the credit2 scheduler

2020-01-07 Thread Alastair Browne
SYMPTOMS

A Xen host is found to lock up with messages on console along the
following lines:-

NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!

Later on in the system log, reference is often made to a specific
program that happens to be running at the time, however the program
referred to is not constant and will vary according to what happens to
be running at the time.

Once the host has locked up, the only solution is a reboot. It hasn't
been possible to further analyse the state of a locked up machine due
to unavailability of the command line.

This problem has been seen to occur on a Debian platform with the
following configuration, however it could equally occur on other
platforms.

The configuration of the host machine is as follows:-

# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian
HOME_URL="http://www.debian.org/"
SUPPORT_URL="http://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

# uname -srvpio
Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2a~test (2019-12-18) unknown unknown GNU/Linux

# xl info
host                   : my-host.example.com
release                : 4.9.0-11-amd64
version                : #1 SMP Debian 4.9.189-3+deb9u2a~test (2019-12-18)
machine                : x86_64
nr_cpus                : 24
max_cpu_id             : 191
nr_nodes               : 2
cores_per_socket       : 12
threads_per_core       : 1
cpu_mhz                : 1797.920
hw_caps                : bfebfbff:77fef3ff:2c100800:0021:0001:37ab::0100
virt_caps              : pv hvm hvm_directio pv_directio hap shadow iommu_hap_pt_share
total_memory           : 392994
free_memory            : 265294
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 13
xen_extra              : .0-mem1-ox
xen_version            : 4.13.0-mem1-ox
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit2
xen_pagesize           : 4096
platform_params        : virt_start=0x8000
xen_changeset          : Tue Dec 17 14:19:49 2019 +0000 git:a2e84d8e42
xen_commandline        : placeholder dom0_mem=4096M,max:16384M com1=115200,8n1 console=com1 ucode=scan smt=0 sched=credit2 crashkernel=512M@32M
cc_compiler            : gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
cc_compile_by          : support
cc_compile_domain      : example.com
cc_compile_date        : Wed Dec 18 11:13:45 GMT 2019
build_id               : 672783467e7a60c4f8a1aa715d549cb59f00c7cf
xend_config_format     : 4

Re-Creation


To recreate the symptoms, build a Xen host according to the above
parameters, then create at least ten Linux virtual machines
within it. The Xen host should use LVM to provision the VMs with their
storage. Each VM should have one single disk device, partitioned in
the conventional manner.

The Virtual machines and the Xen host must then be loaded up as
follows:-

VIRTUAL MACHINES

Construct a program to allocate, fill and free
memory. An example of such a program is given below:-

mem-grab.C
/*
  This program will allocate and fill memory. Its purpose is to
  simulate memory use on a machine. Once it has grabbed the memory, it
  sleeps for 10 seconds, then frees it.
  If run with no arguments, the program will find out the maximum
  memory available on the machine and then will attempt to grab 75% of
  it. If run with an integer argument, this program will attempt to
  allocate that amount of memory.
  If an error occurs with the allocation, then an exception will be
  thrown and caught. An error message will then be printed on stderr.
*/
  
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <iostream>
#include <thread>

#define MEM_PERCENT 0.75
  using namespace std;
int main(int argc, char** argv)
{
  int *ptr;
  unsigned long long i,n;
  unsigned long long MemAvailable = 0;
  unsigned long long MemAlloc = 0;
  if (argc == 1)
{
  // Find out the maximum memory available
  MemAvailable = get_system_memory ();
  cout << "Memory available = " << MemAvailable << endl;
  MemAlloc = MemAvailable * MEM_PERCENT;
  cout << "Memory to be allocated: " << MemAlloc << endl;
  // Divide the value by the size of an int because that's what we
  // will be filling the memory with.
  n = MemAlloc / sizeof (int);
}
  else
{
  n = strtoul (argv[1], NULL, 0);
  n = n / sizeof (int);
}
  cout << "Allocating " << n * sizeof (int) << " bytes..." << endl;
  try
{
  ptr = new int [n];
}
  catch (exception& e)
{
  cerr << "Failed to allocate memory: " << e.what() << endl;
  return 1;
}
  printf("Filling int into memory.\n");
  for (i = 0; i < n; i++)
{
  ptr[i] = 1;
}
  printf("Sleep 10 seconds..\n");
  this_thread::sleep_for 

Re: [Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Wei Liu
On Tue, Jan 07, 2020 at 01:48:41PM +0000, Xia, Hongyan wrote:
> On Tue, 2020-01-07 at 14:09 +0100, Jan Beulich wrote:
> > ...
> > 
> > Looks like I simply forgot every time I went through my list of
> > pending (for the various stages of processing) patches. I guess
> > patches 3 and 4 are also independent of patch 2 and hence could
> > go in as well.
> 
> If so, looks like patch 7/7 is also in a committable state?

Looks like so. I will commit that one as well.

Thanks for putting in the effort to upstream these patches, Hongyan.

Wei.


Re: [Xen-devel] [PATCH V6 1/4] x86/mm: Add array_index_nospec to guest provided index values

2020-01-07 Thread Jan Beulich
On 07.01.2020 14:25, Alexandru Stefan ISAILA wrote:
> On 27.12.2019 10:01, Jan Beulich wrote:
>> On 23.12.2019 15:04, Alexandru Stefan ISAILA wrote:
>>> --- a/xen/arch/x86/mm/mem_access.c
>>> +++ b/xen/arch/x86/mm/mem_access.c
>>> @@ -366,11 +366,12 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, 
>>> uint32_t nr,
>>>   #ifdef CONFIG_HVM
>>>   if ( altp2m_idx )
>>>   {
>>> -if ( altp2m_idx >= MAX_ALTP2M ||
>>> - d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) )
>>> +if ( altp2m_idx >=  min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP) 
>>> ||
>>
>> Stray blank after >= .
>>
>>> + d->arch.altp2m_eptp[array_index_nospec(altp2m_idx, MAX_EPTP)] 
>>> ==
>>
>> I accept you can't (currently) use array_access_nospec() here,
>> but ...
>>
>>> + mfn_x(INVALID_MFN) )
>>>   return -EINVAL;
>>>   
>>> -ap2m = d->arch.altp2m_p2m[altp2m_idx];
>>> +ap2m = d->arch.altp2m_p2m[array_index_nospec(altp2m_idx, 
>>> MAX_ALTP2M)];
>>
>> ... I don't see why you still effectively open-code it here.
> 
> I am not sure I follow you here, that is what we agreed in v5 
> (https://lists.xenproject.org/archives/html/xen-devel/2019-12/msg01704.html). 
> Did I miss something?

In context there (from an earlier reply of mine) you will find me
having mentioned array_access_nospec(). This wasn't invalidated or
overridden by my "Yes, that's how I think it ought to be." I didn't
say so explicitly (again) because to me it goes without saying that
open-coding _anything_ is, in the common case, bad practice.

Jan


[Xen-devel] [PATCH] x86/mem_sharing: Fix RANDCONFIG build

2020-01-07 Thread Andrew Cooper
Travis reports: https://travis-ci.org/andyhhp/xen/jobs/633751811

  mem_sharing.c:361:13: error: 'rmap_has_entries' defined but not used 
[-Werror=unused-function]
   static bool rmap_has_entries(const struct page_info *page)
   ^
  cc1: all warnings being treated as errors

This happens in a release build (disables MEM_SHARING_AUDIT) when
CONFIG_MEM_SHARING is enabled.

Mark the helper as maybe_unused.

Signed-off-by: Andrew Cooper 
---
CC: Tamas K Lengyel 

The alternative is to delete the helper and opencode it for its one caller.
---
 xen/arch/x86/mm/mem_sharing.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index ddf1f0f9f9..0a1550ffd2 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -358,7 +358,7 @@ static bool rmap_has_one_entry(const struct page_info *page)
 }
 
 /* Returns true if the rmap has any entries. O(1) complexity. */
-static bool rmap_has_entries(const struct page_info *page)
+static bool __maybe_unused rmap_has_entries(const struct page_info *page)
 {
 return rmap_count(page) != 0;
 }
-- 
2.11.0
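
As a standalone illustration of the warning and the fix, the sketch below
shows a static helper that is referenced only when a debug-style macro is
defined, annotated so that builds without the macro still compile cleanly
under -Werror=unused-function. MAYBE_UNUSED, AUDIT, struct page and both
function names are hypothetical stand-ins, not the Xen definitions; the
attribute spelling is the one accepted by GCC and Clang.

```c
#include <stdbool.h>

/* Hypothetical local spelling of the attribute (GCC/Clang syntax). */
#define MAYBE_UNUSED __attribute__((__unused__))

struct page {
    unsigned int rmap_entries;
};

/* Referenced only when AUDIT is defined, hence the annotation. */
static bool MAYBE_UNUSED page_has_entries(const struct page *pg)
{
    return pg->rmap_entries != 0;
}

/* Returns the entry count; audit builds also cross-check the helper. */
static unsigned int count_entries(const struct page *pg)
{
#ifdef AUDIT
    if ( page_has_entries(pg) != (pg->rmap_entries != 0) )
        return 0;
#endif
    return pg->rmap_entries;
}
```

Without the annotation, a build that leaves AUDIT undefined would see
page_has_entries() as defined but never used.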



Re: [Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Xia, Hongyan
On Tue, 2020-01-07 at 14:09 +0100, Jan Beulich wrote:
> ...
> 
> Looks like I simply forgot every time I went through my list of
> pending (for the various stages of processing) patches. I guess
> patches 3 and 4 are also independent of patch 2 and hence could
> go in as well.

If so, looks like patch 7/7 is also in a committable state?

Hongyan

Re: [Xen-devel] [PATCH] CODING_STYLE: Document how to handle unexpected conditions

2020-01-07 Thread Julien Grall

Hi George,

On 07/01/2020 12:02, George Dunlap wrote:

It's not always clear what the best way is to handle unexpected
conditions: whether with ASSERT(), domain_crash(), BUG_ON(), or some
other method.  All methods have a risk of introducing security
vulnerabilities and unnecessary instabilities to production systems.

Provide guidelines for different options and when to use them.
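
To make the trade-off concrete, here is a generic C sketch, not text from the
proposed CODING_STYLE section. It separates an expected failure (plain error
return) from an unexpected internal state (debug-only assertion plus a safe
error return in production). DEBUG_ASSERT, DEBUG_BUILD, struct queue and
queue_push() are hypothetical names; real hypervisor code would use ASSERT(),
domain_crash() or BUG_ON() as appropriate.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QUEUE_DEPTH 8  /* hypothetical limit */

/* Active only in debug builds; compiles to nothing otherwise. */
#ifdef DEBUG_BUILD
#define DEBUG_ASSERT(cond) assert(cond)
#else
#define DEBUG_ASSERT(cond) ((void)0)
#endif

struct queue {
    unsigned int used;
    int slots[QUEUE_DEPTH];
};

static int queue_push(struct queue *q, int value)
{
    /* Unexpected: the bookkeeping is corrupt. Flag it loudly in debug
     * builds, but still fail safely when the assertion is compiled out. */
    if ( q == NULL || q->used > QUEUE_DEPTH )
    {
        DEBUG_ASSERT(!"queue_push: inconsistent queue state");
        return -EINVAL;
    }

    /* Expected: the queue is simply full. Ordinary error handling. */
    if ( q->used == QUEUE_DEPTH )
        return -ENOSPC;

    q->slots[q->used++] = value;
    return 0;
}
```

The assertion documents the invariant; the return statement after it is what
actually protects a production build.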

Signed-off-by: George Dunlap 
---
v4:
- s/guest should/guests shouldn't/;
- Add a note about the effect of domain_crash() further up the stack.
v3:
- A number of minor edits
- Expand on domain_crash a bit.
v2:
- Clarify meaning of "or" clause
- Add domain_crash as an option
- Make it clear that ASSERT() is not an error handling mechanism.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Konrad Wilk 
CC: Stefano Stabellini 
CC: Julien Grall 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Wei Liu
On Tue, Jan 07, 2020 at 02:09:05PM +0100, Jan Beulich wrote:
> On 07.01.2020 13:13, Wei Liu wrote:
> > On Tue, Jan 07, 2020 at 12:06:43PM +0000, Hongyan Xia wrote:
> >> From: Wei Liu 
> >>
> >> They were put into page.h but mm.h is more appropriate.
> >>
> >> The real reason is that I will be adding some new functions which
> >> takes mfn_t. It turns out it is a bit difficult to do in page.h.
> >>
> >> No functional change.
> >>
> >> Signed-off-by: Wei Liu 
> >> Acked-by: Jan Beulich 
> > 
> > I will commit this trivial patch soon-ish to reduce Hongyan's patch queue
> > length.
> 
> Looks like I simply forgot every time I went through my list of
> pending (for the various stages of processing) patches. I guess
> patches 3 and 4 are also independent of patch 2 and hence could
> go in as well.

Sure. I pushed all three patches (1, 3 and 4).

Wei.


Re: [Xen-devel] [PATCH V6 1/4] x86/mm: Add array_index_nospec to guest provided index values

2020-01-07 Thread Alexandru Stefan ISAILA


On 27.12.2019 10:01, Jan Beulich wrote:
> (re-sending, as I still don't see the mail having appeared on the list)
> 
> On 23.12.2019 15:04, Alexandru Stefan ISAILA wrote:
>> Changes since V5:
>>  - Add black lines
> 
> Luckily no color comes through in plain text mails ;-)
> 
>> --- a/xen/arch/x86/mm/mem_access.c
>> +++ b/xen/arch/x86/mm/mem_access.c
>> @@ -366,11 +366,12 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, 
>> uint32_t nr,
>>   #ifdef CONFIG_HVM
>>   if ( altp2m_idx )
>>   {
>> -if ( altp2m_idx >= MAX_ALTP2M ||
>> - d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) )
>> +if ( altp2m_idx >=  min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP) ||
> 
> Stray blank after >= .
> 
>> + d->arch.altp2m_eptp[array_index_nospec(altp2m_idx, MAX_EPTP)] 
>> ==
> 
> I accept you can't (currently) use array_access_nospec() here,
> but ...
> 
>> + mfn_x(INVALID_MFN) )
>>   return -EINVAL;
>>   
>> -ap2m = d->arch.altp2m_p2m[altp2m_idx];
>> +ap2m = d->arch.altp2m_p2m[array_index_nospec(altp2m_idx, 
>> MAX_ALTP2M)];
> 
> ... I don't see why you still effectively open-code it here.

I am not sure I follow you here, that is what we agreed in v5 
(https://lists.xenproject.org/archives/html/xen-devel/2019-12/msg01704.html). 
Did I miss something?


> 
>> @@ -425,11 +426,12 @@ long p2m_set_mem_access_multi(struct domain *d,
>>   #ifdef CONFIG_HVM
>>   if ( altp2m_idx )
>>   {
>> -if ( altp2m_idx >= MAX_ALTP2M ||
>> - d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) )
>> +if ( altp2m_idx >=  min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP) ||
>> + d->arch.altp2m_eptp[array_index_nospec(altp2m_idx, MAX_EPTP)] 
>> ==
>> + mfn_x(INVALID_MFN) )
>>   return -EINVAL;
>>   
>> -ap2m = d->arch.altp2m_p2m[altp2m_idx];
>> +ap2m = d->arch.altp2m_p2m[array_index_nospec(altp2m_idx, 
>> MAX_ALTP2M)];
> 
> Same two remarks here then, and again further down.
> 
>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -2577,6 +2577,8 @@ int p2m_init_altp2m_by_id(struct domain *d, unsigned 
>> int idx)
>>   if ( idx >= MAX_ALTP2M )
>>   return rc;
>>   
>> +idx = array_index_nospec(idx, MAX_ALTP2M);
>> +
>>   altp2m_list_lock(d);
>>   
>>   if ( d->arch.altp2m_eptp[idx] == mfn_x(INVALID_MFN) )
> 
> What about this array access?
> 
>> @@ -2618,6 +2620,8 @@ int p2m_destroy_altp2m_by_id(struct domain *d, 
>> unsigned int idx)
>>   if ( !idx || idx >= MAX_ALTP2M )
>>   return rc;
>>   
>> +idx = array_index_nospec(idx, MAX_ALTP2M);
> 
> There's a d->arch.altp2m_eptp[] access down from here too. I'm not
> going to look further. Please get things into consistent shape while
> you do this transformation.
> 

I will change the idx part in p2m_init_altp2m_by_id() and
p2m_destroy_altp2m_by_id() so they match the rest of the checks:
"if ( idx >= min(ARRAY_SIZE(d->arch.altp2m_p2m), MAX_EPTP) )...", drop
the idx = array_index_nospec(idx, MAX_ALTP2M); and put
array_index_nospec() in place at the array accesses.


Alex

Re: [Xen-devel] [PATCH] x86/vmx: Shrink TASK_SWITCH's hvm_task_switch_reason reasons[]

2020-01-07 Thread Jan Beulich
On 07.01.2020 13:25, Andrew Cooper wrote:
> No need to use 4-byte integers to store two bits of information.
> 
> Signed-off-by: Andrew Cooper 

In principle
Reviewed-by: Jan Beulich 
But ...

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3978,7 +3978,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>  vmx_update_cpu_exec_control(v);
>  break;
>  case EXIT_REASON_TASK_SWITCH: {
> -static const enum hvm_task_switch_reason reasons[] = {
> +static const int8_t reasons[] = {
>  TSW_call_or_int, TSW_iret, TSW_jmp, TSW_call_or_int
>  };

... given our general preference of unsigned types when values
can't become negative, why not uint8_t?

As an aside, elsewhere I saw people starting to convert code
because apparently gcc 10 will warn about enum type mismatches.
I didn't investigate yet whether that's just for enum -> enum
conversions, or also for enum <- / -> integer ones. Of course
it wouldn't be the end of the world if we had to revert the
change above; did you consider the alternative of making the
enum a __packed one (which would avoid potential issues like
the one named)?
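
A small sketch of the three options under discussion may help compare them:
a table of plain enums, a table of uint8_t, and a table of a packed enum.
The type and table names are hypothetical stand-ins for
hvm_task_switch_reason and reasons[], the attribute is the GCC/Clang
spelling, and the one-byte size of the packed enum is compiler behaviour
rather than something the C standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain enum: typically the size of an int. */
enum reason { R_call_or_int, R_iret, R_jmp };

/* Packed enum: GCC/Clang pick the smallest integer type that fits. */
enum __attribute__((__packed__)) packed_reason { P_call_or_int, P_iret, P_jmp };

static const enum reason wide_table[] = {
    R_call_or_int, R_iret, R_jmp, R_call_or_int
};

/* Integer table: compact, but the enum type is lost at the storage site. */
static const uint8_t narrow_table[] = {
    R_call_or_int, R_iret, R_jmp, R_call_or_int
};

/* Packed-enum table: compact and still enum-typed. */
static const enum packed_reason packed_table[] = {
    P_call_or_int, P_iret, P_jmp, P_call_or_int
};

static enum reason reason_from_wide(unsigned int source)
{
    return wide_table[source & 3];
}

static enum reason reason_from_narrow(unsigned int source)
{
    /* An integer -> enum conversion is needed on every read. */
    return (enum reason)narrow_table[source & 3];
}

static enum packed_reason reason_from_packed(unsigned int source)
{
    return packed_table[source & 3];
}
```

The packed variant keeps the table at four bytes without the integer/enum
conversion that a uint8_t table requires on each lookup.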

Jan


Re: [Xen-devel] [PATCH] x86/trampoline: boot_vid_mode doesn't need to be global

2020-01-07 Thread Jan Beulich
On 07.01.2020 13:15, Andrew Cooper wrote:
> AFAICT, it has never had an external user since its introduction

I guess it was only ever anticipated to gain one.

> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 


Re: [Xen-devel] [PATCH] xsm: hide detailed Xen version from unprivileged guests

2020-01-07 Thread Jan Beulich
On 07.01.2020 12:02, Sergey Dyasli wrote:
> On 06/01/2020 14:40, Jan Beulich wrote:
>> On 06.01.2020 15:35, Sergey Dyasli wrote:
>>> On 06/01/2020 11:28, George Dunlap wrote:
 On 12/19/19 11:15 PM, Andrew Cooper wrote:
> On 19/12/2019 11:35, Jan Beulich wrote:
> XENVER_changeset
> XENVER_commandline
> XENVER_build_id
>
> Return a more customer friendly empty string instead of "<denied>"
> which would be shown in tools like dmidecode.
>
 I think "<denied>" is quite fine for many of the original purposes.
 Maybe it would be better to filter for this when populating guest
 DMI tables?
>>> I don't know how DMI tables are populated, but nothing stops a guest
>>> from using these hypercalls directly.
>> And this is precisely the case where I think "<denied>" is better
>> than an empty string.
>
> "<denied>" was a terrible choice back when it was introduced, and its
> still a terrible choice today.
>
> These are ASCII string fields, and the empty string is a perfectly good
> string.  Nothing is going to break, because it would have broken the
> first time around.
>
> The end result without denied sprayed all over this interface is much
> cleaner overall.

 Unfortunately this mail doesn't contain any facts or arguments, just
 unsubstantiated value judgements.  What's so terrible about "<denied>"
 -- what bad effect does it have?  Why is "" better / cleaner?
>>>
>>> It can be explained with a picture (attached) ;)
>>
>> But that's something better addressed at or close to the presentation
>> layer, not deep down in Xen.
> 
> I agree with that. And looks like the following diff does the trick:
> 
> diff --git a/tools/firmware/hvmloader/smbios.c 
> b/tools/firmware/hvmloader/smbios.c
> index 97a054e9e3..b4d72c375f 100644
> --- a/tools/firmware/hvmloader/smbios.c
> +++ b/tools/firmware/hvmloader/smbios.c
> @@ -275,6 +275,8 @@ hvm_write_smbios_tables(
>  xen_minor_version = (uint16_t) xen_version;
> 
>  hypercall_xen_version(XENVER_extraversion, xen_extra_version);
> +if ( strcmp(xen_extra_version, "<denied>") == 0 )
> +memset(xen_extra_version, 0, sizeof(xen_extra_version));
> 
>  /* build up human-readable Xen version string */
>  p = xen_version_str;

When you submit this as a proper patch, feel free to add my ack
right away (as long you give it a non-empty and half way useful
description).
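
The filtering step in the quoted diff can be sketched in isolation as
follows. This is an illustration only: mask_denied() and DENIED_SENTINEL are
hypothetical names, the sentinel text "<denied>" is an assumption about what
the hypervisor returns to an unprivileged caller, and the buffer is assumed
to be NUL-terminated as a version string would be.

```c
#include <string.h>

#define DENIED_SENTINEL "<denied>"  /* assumed sentinel text */

/*
 * If buf holds exactly the sentinel, clear the whole buffer so that
 * presentation-layer consumers see an empty string. Any other value is
 * left untouched. Returns 1 when the buffer was cleared, 0 otherwise.
 * buf must be NUL-terminated within len bytes.
 */
static int mask_denied(char *buf, size_t len)
{
    if ( len == 0 || strcmp(buf, DENIED_SENTINEL) != 0 )
        return 0;

    memset(buf, 0, len);
    return 1;
}
```

Doing this close to the presentation layer leaves the hypercall's own return
value unchanged for callers that want to tell "denied" apart from "empty".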

Jan


Re: [Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Jan Beulich
On 07.01.2020 13:13, Wei Liu wrote:
> On Tue, Jan 07, 2020 at 12:06:43PM +0000, Hongyan Xia wrote:
>> From: Wei Liu 
>>
>> They were put into page.h but mm.h is more appropriate.
>>
>> The real reason is that I will be adding some new functions which
>> takes mfn_t. It turns out it is a bit difficult to do in page.h.
>>
>> No functional change.
>>
>> Signed-off-by: Wei Liu 
>> Acked-by: Jan Beulich 
> 
> I will commit this trivial patch soon-ish to reduce Hongyan's patch queue
> length.

Looks like I simply forgot every time I went through my list of
pending (for the various stages of processing) patches. I guess
patches 3 and 4 are also independent of patch 2 and hence could
go in as well.

Jan


Re: [Xen-devel] [PATCH] MAINTAINERS: Add explicit check-in policy section

2020-01-07 Thread Jan Beulich
On 07.01.2020 13:03, George Dunlap wrote:
> DISCUSSION
> 
> This seems to be a change from people's understanding of the current
> policy.  Most people's understanding of the current policy seems to be:
> 
> 1.  In order to get a change to a given file committed, it must have
> an Ack or Review from at least one *maintainer* of that file other
> than the submitter.
> 
> 2. In the case where a file has only one maintainer, it must have an
> Ack or Review from a "nested" maintainer.
> 
> I.e., if I submitted something to x86/mm, it would require an Ack from
> Jan or Andy, or (in exceptional circumstances) The Rest; but an Ack from
> (say) Roger or Juergen wouldn't suffice.
> 
> Let's call this the "maintainer-ack" approach (because it must have an
> ack or r-b from a maintainer to be checked in), and the proposal in
> this patch the "maintainer-approval" (since SoB from a maintainer
> indicates approval).
> 
> The core issue I have with "maintainer-ack" is that it makes the
> maintainer less privileged with regard to writing code than
> non-maintainers.  If component X has maintainers A and B, then a
> non-maintainer can have code checked in if reviewed either by A or B.
> If A or B wants code checked in, they have to wait for exactly one
> person to review it.
> 
> In fact, if B is quite busy, the easiest way for A really to get their
> code checked in might be to hand it to a non-maintainer N, and ask N
> to submit it as their own.  Then A can Ack the patches and check them
> in.
> 
> The current system, therefore, either sets up a perverse incentive (if
> you think the behavior described above is unacceptable) or unnecessary
> bureaucracy (if you think it's acceptable).  Either way I think we
> should set up our system to avoid it.

I much appreciate this initiative of yours.

> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -104,7 +104,53 @@ Descriptions of section entries:
>  xen-maintainers-
>  
>  
> -The meaning of nesting:
> + Check-in policy
> + ===
> +
> +In order for a patch to be checked in, in general, several conditions
> +must be met:
> +
> +1. In order to get a change to a given file committed, it must have
> +   the approval of at least one maintainer of that file.
> +
> +   A patch of course needs Acks from the maintainers of each file that
> +   it changes; so a patch which changes xen/arch/x86/traps.c,
> +   xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would
> +   require an Ack from each of the three sets of maintainers.
> +
> +   See below for rules on nested maintainership.
> +
> +2. It must have an Acked-by or a Reviewed-by from someone other than
> +   the submitter.

I'd like to propose some further distinction here, albeit I'm not sure
this isn't implied anyway. It might be that making explicit the
distinction between A-b and R-b is sufficient - our current common
understanding looks to be that only maintainers can "ack", and others
would "review". Since the latter is implying a more thorough look at a
patch, I think it wouldn't be right to allow (quoting text further
down) "anyone in the community" to ack a random patch (I could probably
talk my son into ack-ing my patches ;-) ). Perhaps, rather than
limiting acks to maintainers of the changed code, we could extend this
to maintainers of just some code for maintainer submitted patches (i.e.
anyone named as M: at least once in ./MAINTAINERS)? People outside of
whatever subset we might pick would be eligible to offer R-b only,
implying of course that they actually did do a review.

> +3. Sufficient time and/or warning must have been given for anyone to
> +   respond.  This depends in large part upon the urgency and nature of
> +   the patch.  For a straightforward uncontroversial patch, a day or
> +   two is sufficient; for a controversial patch, perhaps waiting a
> +   week and then saying "I intend to check this in tomorrow unless I
> +   hear otherwise".

To me as non-native speaker, this last sentence looks incomplete (as
in missing e.g. "would be appropriate" at the end), or alternatively
it would feel like wanting the two "ing" dropped from the verbs.

Jan


Re: [Xen-devel] [PATCH] xen/arm: vgic-v3: Fix the typo of GICD IRQ active status range

2020-01-07 Thread Wei Xu

Hi Julien,

On 2020/1/7 19:42, Julien Grall wrote:
> Hi,
>
> On 07/01/2020 09:48, Wei Xu wrote:
>> On 2020/1/7 17:10, Julien Grall wrote:
>>> On 07/01/2020 08:39, Wei Xu wrote:
>>>> Hi Stefano,
>>>>
>>>> On 2020/1/7 6:01, Stefano Stabellini wrote:
>>>>> On Sat, 28 Dec 2019, Wei Xu wrote:
>>>>>> Hi Julien,
>>>>>>
>>>>>> On 2019/12/28 16:09, Julien Grall wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 28/12/2019 03:08, Wei Xu wrote:
>>>>>>>> This patch fixes the typo about the active status range of an
>>>>>>>> IRQ via GICD. Otherwise it will be failed to handle the mmio
>>>>>>>> access and inject a data abort.
>>>>>>>
>>>>>>> I have seen a patch similar from NXP a month ago and I disagreed
>>>>>>> on the approach.
>>>>>>>
>>>>>>> If you look at the context you modified, it says that reading
>>>>>>> ACTIVER is not supported. While I agree the behavior is not
>>>>>>> consistent across ACTIVER, injecting a data abort is a perfectly
>>>>>>> fine behavior to me (though not spec compliant) as we don't
>>>>>>> implement the registers correctly.
>>>>>>>
>>>>>>> I guess you are sending this patch, because you tried Linux 5.4
>>>>>>> (or later) on Xen, right? Linux has recently begun to read
>>>>>>> ACTIVER to check whether an IRQ is active at the HW level during
>>>>>>> the synchronizing of the IRQS. From my understanding, this is
>>>>>>> used because there is a window where the interrupt is active at
>>>>>>> the HW level but the Linux IRQ subsystem is not aware of it.
>>>>>>>
>>>>>>> While the patch below will allow Linux 5.4 to not crash, it is
>>>>>>> not going to make it fly very far because of the above. So I am
>>>>>>> rather not happy with pursuing with returning 0.
>>>>>>
>>>>>> Yes, I am using Linux 5.5-rc2 :)
>>>>>> Got it and thanks for the explanation.
>>>>>> I am not insistent on this and OK to wait for the update.
>>>>>> Thanks and have a very happy new year!
>>>>>
>>>>> Hi Wei,
>>>>>
>>>>> what do you do to reproduce the issue? Are you just booting Linux
>>>>> 5.5-rc2 as dom0 and seeing the issue during boot, or are you doing
>>>>> something specific?
>>>>>
>>>>> .
>>>>
>>>> I directly tested the mainline kernel with defconfig.
>>>> And the 5.5-rc5 kernel booting log is as below:
>>>>
>>>>  root@ubuntu:~# dmesg | more
>>>>  [0.00] Booting Linux on physical CPU 0x00 [0x481fd010]
>>>>  [0.00] Linux version 5.5.0-rc5 (joyx@Turing-Arch-b) (gcc version 4.9.1 20140505 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.05 - Linaro GCC 4.9-2014.05)) #132 SMP PREEMPT Tue Jan 7 15:43:06 CST 2020
>>>>  [0.00] Xen XEN_VERSION.XEN_SUBVERSION support found
>>>>  [0.00] efi: Getting EFI parameters from FDT:
>>>>  [0.00] efi: EFI v2.50 by Xen
>>>>  [0.00] efi:  ACPI 2.0=0x181d0e70
>>>>  [0.00] cma: Reserved 32 MiB at 0x7e00
>>>>  [0.00] ACPI: Early table checksum verification disabled
>>>>  [0.00] ACPI: RSDP 0x181D0E70 24 (v02 HISI  )
>>>>  [0.00] ACPI: XSDT 0x181D0DB0 BC (v01 HISI HIP08000 0  0113)
>>>
>>> Is that the full log from Linux? If not, can you post it in full?
>>>
>>>> But to boot with ACPI on our hardware, except above change I have
>>>> also done some hacking based on XEN 4.13 as below:
>>>
>>> I haven't booted Xen on any ACPI systems recently so there might be
>>> bugs in the code. Your changes below is definitely a call to look
>>> more into details what's wrong.
>>
>> Yes, my target is to make Xen booting with ACPI firstly.
>>
>>>>  diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>>>>  index d028ec9..215a291 100644
>>>>  --- a/xen/arch/arm/traps.c
>>>>  +++ b/xen/arch/arm/traps.c
>>>>  @@ -1856,8 +1856,8 @@ static bool try_map_mmio(gfn_t gfn)
>>>>   return false;
>>>>
>>>>   /* The hardware domain can only map permitted MMIO regions */
>>>>  -if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) )
>>>>  -return false;
>>>>  +/* if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) ) */
>>>>  +/* return false; */
>>>
>>> Dom0 should be able to map nearly all the address space through this
>>> function. The only thing not allowed is the GIC and UART (see
>>> acpi_iomem_deny_access).
>>>
>>> So why do you want this change? What sort of address Dom0 is trying
>>> to map and fail?
>>
>> Yes, it is the UART address 0x3f2f8.
>> Without this, during DOM0 UART initialization, the mem_serial_in in
>> the kernel side will be failed and reported a unhandled fault at
>> 0x80001006d2f9(gva) because of mem abort.
>> The Xen printed "HSR=0x93015 pc=0x800010645d94
>> gva=0x80001006d2f9 gpa=0x03f2f9" in traps.c.
>
> I assume this is your primary address as specified in the SPCR, right?

Yes.

> As only one entity should manage the UART (i.e Xen or Dom0), we today
> assume this will be managed by Xen. Xen should expose a partial
> virtual UART (only a few registers are emulating) to dom0 in
> replacement.
>
> This is usually done by the UART driver. Looking at the log you pasted
> in a separate e-mail:
>
> (XEN) Platform: Generic System
> (XEN) Unable to initialize acpi uart: -9
> (XEN) Bad console= option 'dtuart'
>
> So Xen didn't manage to initialize the uart. The -9 suggests, Xen
> didn't find a driver for your UART. At the moment, Xen is only able to
> detect pl011, sbsa, sbsa32 UART for ACPI. What is

Re: [Xen-devel] [PATCH] CODING_STYLE: Document how to handle unexpected conditions

2020-01-07 Thread Jan Beulich
On 07.01.2020 13:02, George Dunlap wrote:
> It's not always clear what the best way is to handle unexpected
> conditions: whether with ASSERT(), domain_crash(), BUG_ON(), or some
> other method.  All methods have a risk of introducing security
> vulnerabilities and unnecessary instabilities to production systems.
> 
> Provide guidelines for different options and when to use them.
> 
> Signed-off-by: George Dunlap 

Acked-by: Jan Beulich 


Re: [Xen-devel] Community Call: Call for Agenda Items and call details for Jan 9, 16:00 - 17:00 UTC

2020-01-07 Thread Lars Kurth


On 07/01/2020, 08:23, "Durrant, Paul"  wrote:

> -Original Message-
> From: Andrew Cooper 
> Sent: 07 January 2020 00:26
> To: Lars Kurth ; xen-devel  de...@lists.xenproject.org>
> Cc: Rian Quinn ; Daniel P. Smith
> ; Doug Goldstein ; Brian
> Woods ; Rich Persaud ;
> anastassios.na...@onapp.com; mirela.simono...@aggios.com;
> edgar.igles...@xilinx.com; Ji, John ;
> robin.randh...@arm.com; daniel.ki...@oracle.com; Amit Shah
> ; Matt Spencer ; Robert Townley
> ; Artem Mygaiev ; Varad
> Gautam ; Tamas K Lengyel
> ; Christopher Clark
> ; George Dunlap ;
> Stefano Stabellini ; lambert.oliv...@gmail.com;
> Ian Jackson ; vfac...@de.adit-jv.com; Kevin
> Pearson ; intel-...@intel.com; Jarvis
> Roach ; Juergen Gross ;
> Sergey Dyasli ; Durrant, Paul
> ; Julien Grall ; Jeff
> Kubascik ; Natarajan, Janakarajan
> ; Stewart Hildebrand
> ; Volodymyr Babchuk
> ; Woodhouse, David ; Roger
> Pau Monne 
> Subject: Re: [Xen-devel] Community Call: Call for Agenda Items and call
> details for Jan 9, 16:00 - 17:00 UTC
> 
> On 06/01/2020 19:56, Lars Kurth wrote:
> > Dear community members,
> >
> > I hope you all had a restful holiday period and a Happy New Year!
> >
> > Please send me agenda items for this Thursday's community call (we
> agreed to move it by 1 week) preferably by Wednesday!
> >
> > A draft agenda is
> at https://cryptpad.fr/pad/#/2/pad/edit/ERZtMYD5j6k0sv-NG6Htl-AJ/
> > Please add agenda items to the document or reply to this e-mail
> 
> I think it would be very helpful for the community in general to know
> any specific plans each of us have for the 4.14 timeframe.
> 
> I personally am aware of a fair quantity of work from various people,
> but it is clear that the community as a whole doesn't really have an
> idea of who is working on what.
> 
> My contribution to the discussion starts with
> https://lore.kernel.org/xen-devel/941cf23c-13ed-14a1-fd25-
> 45b001d95...@citrix.com/T/#u
> but I think it would be helpful if others gave at least a brief overview
> of any plans and whether they are intending the work to hit the next
> release, or whether it is more likely to be a future release.

Agreed. I need a baseline list of items to track for 4.14. 

I added 

   C.2) 4.13 Release retrospective and 4.14 planning baseline (Lars, Paul)
   4.13: It seems that some things went wrong this time, in particular
   around the release comms. This is a placeholder for discussion.

   4.14: Need a baseline for 4.14 planning
   It would be helpful if EVERYONE gave a brief overview of any plans for 4.14 
and whether they are intending the work to hit the next 
   release, or whether it is more likely to be a future release.

   Andrew's contribution and larger 4.14 backlog at: 
https://lore.kernel.org/xen-devel/941cf23c-13ed-14a1-fd25-45b001d95...@citrix.com/T/#u

To the agenda
Lars



[Xen-devel] [PATCH] x86/vmx: Shrink TASK_SWITCH's hvm_task_switch_reason reasons[]

2020-01-07 Thread Andrew Cooper
No need to use 4-byte integers to store two bits of information.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Jun Nakajima 
CC: Kevin Tian 
---
 xen/arch/x86/hvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index f83f102638..b79bca71ad 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3978,7 +3978,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 vmx_update_cpu_exec_control(v);
 break;
 case EXIT_REASON_TASK_SWITCH: {
-static const enum hvm_task_switch_reason reasons[] = {
+static const int8_t reasons[] = {
 TSW_call_or_int, TSW_iret, TSW_jmp, TSW_call_or_int
 };
 unsigned int inst_len, source;
-- 
2.11.0
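
For readers following along, here is a minimal standalone sketch of the
size difference the patch is after. It is not Xen code: the enum below is
a stand-in with assumed values, and the helper names (enum_table_bytes,
small_table_bytes, lookup_reason) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Xen's enum; the enumerator values are an assumption. */
enum hvm_task_switch_reason { TSW_jmp, TSW_iret, TSW_call_or_int };

/* Table typed as the enum: each element usually takes sizeof(int) bytes. */
static const enum hvm_task_switch_reason reasons_enum[] = {
    TSW_call_or_int, TSW_iret, TSW_jmp, TSW_call_or_int
};

/* Same table with a one-byte element type, as the patch proposes. */
static const int8_t reasons_small[] = {
    TSW_call_or_int, TSW_iret, TSW_jmp, TSW_call_or_int
};

/* Hypothetical helpers reporting the storage used by each table. */
size_t enum_table_bytes(void)  { return sizeof(reasons_enum); }
size_t small_table_bytes(void) { return sizeof(reasons_small); }

/* The lookup still produces the enum type; only the storage shrinks. */
enum hvm_task_switch_reason lookup_reason(unsigned int source)
{
    return (enum hvm_task_switch_reason)reasons_small[source & 3];
}
```

The one-byte table always occupies 4 bytes, while the enum-typed table
occupies four times whatever the compiler picks for the enum.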



Re: [Xen-devel] [PATCH] tools/save: Drop unused parameters from xc_domain_save()

2020-01-07 Thread Wei Liu
On Mon, Jan 06, 2020 at 05:03:52PM +, Andrew Cooper wrote:
> XCFLAGS_CHECKPOINT_COMPRESS has been unused since c/s b15bc4345 (2015),
> XCFLAGS_HVM since c/s 9e8672f1c (2013), and XCFLAGS_STDVGA since c/s
> 087d43326 (2007).  Drop the constants, and code which sets them.
> 
> The separate hvm parameter (appeared in c/s d11bec8a1, 2007 and ultimately
> redundant with XCFLAGS_HVM), is used for sanity checking and debug printing,
> then discarded and replaced with Xen's idea of whether the domain is PV or
> HVM.
> 
> Rearrange the logic in xc_domain_save() to ask Xen slightly earlier, and use a
> consistent idea of 'hvm' throughout.  Removing this parameter removes the
> final user of libxl's dss->hvm, so drop that field as well.
> 
> Update the doxygen comment to be accurate.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Ian Jackson 


[Xen-devel] [PATCH] x86/trampoline: boot_vid_mode doesn't need to be global

2020-01-07 Thread Andrew Cooper
AFAICT, it has never had an external user since its introduction.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
---
 xen/arch/x86/boot/trampoline.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S
index 824f45ec0f..6b403a6d1a 100644
--- a/xen/arch/x86/boot/trampoline.S
+++ b/xen/arch/x86/boot/trampoline.S
@@ -261,7 +261,7 @@ opt_edid:
 .byte   0
 
 #ifdef CONFIG_VIDEO
-GLOBAL(boot_vid_mode)
+boot_vid_mode:
 .word   VIDEO_80x25 /* If we don't run at all, 
assume basic video mode 3 at 80x25. */
 vesa_size:
 .word   0,0,0   /* width x depth x height */
-- 
2.11.0



Re: [Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Wei Liu
On Tue, Jan 07, 2020 at 12:06:43PM +, Hongyan Xia wrote:
> From: Wei Liu 
> 
> They were put into page.h but mm.h is more appropriate.
> 
> The real reason is that I will be adding some new functions which
> take mfn_t. It turns out it is a bit difficult to do in page.h.
> 
> No functional change.
> 
> Signed-off-by: Wei Liu 
> Acked-by: Jan Beulich 

I will commit this trivial patch soon-ish to reduce Hongyan's patch queue
length.

Shout if you disagree.

Wei.


Re: [Xen-devel] [PATCH] tools/restore: Drop unused parameters from xc_domain_restore()

2020-01-07 Thread Wei Liu
On Fri, Jan 03, 2020 at 05:22:48PM +, Andrew Cooper wrote:
> The hvm and pae parameters are a remnant of legacy migration.  They have 0
> passed in from libxl_stream_read.c's process_record(), and are discarded in
> xc_domain_restore().
> 
> While dropping these, update the doxygen comment to be accurate, and simplify
> the other hvm vs pv handling in xc_domain_restore().
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Wei Liu 


[Xen-devel] [PATCH v5 6/7] x86/mm: make sure there is one exit path for modify_xen_mappings

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

We will soon need to handle dynamically mapping / unmapping page
tables in the said function.

No functional change.

Signed-off-by: Wei Liu 
Signed-off-by: Hongyan Xia 

---
Changed since v4:
- drop the end_of_loop goto label since this function may be refactored
  in the future and there are options to do things without the goto.

Changed since v3:
- remove asserts on rc since it never gets changed to anything else.
---
 xen/arch/x86/mm.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 71e9c4b19e..6b589762b1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5557,6 +5557,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 l1_pgentry_t *pl1e;
 unsigned int  i;
 unsigned long v = s;
+int rc = -ENOMEM;
 
 /* Set of valid PTE bits which may be altered. */
 #define FLAGS_MASK (_PAGE_NX|_PAGE_RW|_PAGE_PRESENT)
@@ -5600,7 +5601,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 /* PAGE1GB: shatter the superpage and fall through. */
 l2t = alloc_xen_pagetable();
 if ( !l2t )
-return -ENOMEM;
+goto out;
+
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
 l2e_write(l2t + i,
   l2e_from_pfn(l3e_get_pfn(*pl3e) +
@@ -5657,7 +5659,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 /* PSE: shatter the superpage and try again. */
 l1t = alloc_xen_pagetable();
 if ( !l1t )
-return -ENOMEM;
+goto out;
+
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
 l1e_write(&l1t[i],
   l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
@@ -5790,7 +5793,10 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 flush_area(NULL, FLUSH_TLB_GLOBAL);
 
 #undef FLAGS_MASK
-return 0;
+rc = 0;
+
+ out:
+return rc;
 }
 
 #undef flush_area
-- 
2.15.3.AMZN
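
As a side note, the single-exit shape can be sketched outside of Xen as
below. This is only an illustration under stated assumptions: demo_alloc()
and modify_demo() are hypothetical names, and the cleanup at the "out"
label is something the patch itself does not add yet; it is shown to
suggest why one exit path helps once unmapping has to happen there.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator that can be forced to fail, for illustration. */
static void *demo_alloc(int fail)
{
    return fail ? NULL : calloc(1, 64);
}

/*
 * Sketch of the single-exit shape: rc starts as -ENOMEM, every allocation
 * failure jumps to "out", and rc becomes 0 only once all work is done.
 */
int modify_demo(int fail_first, int fail_second)
{
    void *l2t = NULL, *l1t = NULL;
    int rc = -ENOMEM;

    l2t = demo_alloc(fail_first);
    if ( !l2t )
        goto out;

    l1t = demo_alloc(fail_second);
    if ( !l1t )
        goto out;

    rc = 0;

 out:
    /* One place for cleanup (assumed here); free(NULL) is a no-op. */
    free(l1t);
    free(l2t);
    return rc;
}
```

With early returns, each failure site would need its own copy of the
cleanup; with the label there is exactly one.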



[Xen-devel] [PATCH v5 3/7] x86/mm: introduce l{1, 2}t local variables to map_pages_to_xen

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

The pl2e and pl1e variables are heavily (ab)used in that function. It
is fine at the moment because all page tables are always mapped so
there is no need to track the life time of each variable.

We will soon have the requirement to map and unmap page tables. We
need to track the life time of each variable to avoid leakage.

Introduce some l{1,2}t variables with limited scope so that we can
track life time of pointers to xen page tables more easily.

No functional change.

Signed-off-by: Wei Liu 
Reviewed-by: Jan Beulich 

---
Changed since v4:
- style fixes.
- const qualify introduced variables.
---
 xen/arch/x86/mm.c | 72 ++-
 1 file changed, 39 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 22b55390f1..699aa6bbdf 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5211,10 +5211,11 @@ int map_pages_to_xen(
 }
 else
 {
-pl2e = l3e_to_l2e(ol3e);
+l2_pgentry_t *l2t = l3e_to_l2e(ol3e);
+
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
 {
-ol2e = pl2e[i];
+ol2e = l2t[i];
 if ( !(l2e_get_flags(ol2e) & _PAGE_PRESENT) )
 continue;
 if ( l2e_get_flags(ol2e) & _PAGE_PSE )
@@ -5222,21 +5223,21 @@ int map_pages_to_xen(
 else
 {
 unsigned int j;
+const l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
 
-pl1e = l2e_to_l1e(ol2e);
 for ( j = 0; j < L1_PAGETABLE_ENTRIES; j++ )
-flush_flags(l1e_get_flags(pl1e[j]));
+flush_flags(l1e_get_flags(l1t[j]));
 }
 }
 flush_area(virt, flush_flags);
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
 {
-ol2e = pl2e[i];
+ol2e = l2t[i];
 if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) &&
  !(l2e_get_flags(ol2e) & _PAGE_PSE) )
 free_xen_pagetable(l2e_to_l1e(ol2e));
 }
-free_xen_pagetable(pl2e);
+free_xen_pagetable(l2t);
 }
 }
 
@@ -5252,6 +5253,7 @@ int map_pages_to_xen(
 {
 unsigned int flush_flags =
 FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
+l2_pgentry_t *l2t;
 
 /* Skip this PTE if there is no change. */
 if ( ((l3e_get_pfn(ol3e) & ~(L2_PAGETABLE_ENTRIES *
@@ -5273,12 +5275,12 @@ int map_pages_to_xen(
 continue;
 }
 
-pl2e = alloc_xen_pagetable();
-if ( pl2e == NULL )
+l2t = alloc_xen_pagetable();
+if ( l2t == NULL )
 return -ENOMEM;
 
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
-l2e_write(pl2e + i,
+l2e_write(l2t + i,
   l2e_from_pfn(l3e_get_pfn(ol3e) +
(i << PAGETABLE_ORDER),
l3e_get_flags(ol3e)));
@@ -5291,15 +5293,15 @@ int map_pages_to_xen(
 if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
 {
-l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e),
+l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
 __PAGE_HYPERVISOR));
-pl2e = NULL;
+l2t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
 flush_area(virt, flush_flags);
-if ( pl2e )
-free_xen_pagetable(pl2e);
+if ( l2t )
+free_xen_pagetable(l2t);
 }
 
 pl2e = virt_to_xen_l2e(virt);
@@ -5327,11 +5329,12 @@ int map_pages_to_xen(
 }
 else
 {
-pl1e = l2e_to_l1e(ol2e);
+l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
+
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-flush_flags(l1e_get_flags(pl1e[i]));
+flush_flags(l1e_get_flags(l1t[i]));
 flush_area(virt, flush_flags);
-free_xen_pagetable(pl1e);
+free_xen_pagetable(l1t);
 }
 }
 
@@ -5353,6 +5356,7 @@ int map_pages_to_xen(
 {
 unsigned int flush_flags =
 FLUSH_TLB | FLUSH_ORDER(PAGETABLE_ORDER);
+

[Xen-devel] [PATCH v5 0/7] Add alternative API for XEN PTEs

2020-01-07 Thread Hongyan Xia
This batch adds an alternative alloc-map-unmap-free Xen PTE API to the
normal alloc-free on the xenheap, in preparation of switching to domheap
for Xen page tables. Since map and unmap are basically no-ops now, and
other changes are cosmetic to ease future patches, this batch does not
introduce any functional changes.

tree:
https://xenbits.xen.org/git-http/people/hx242/xen.git directnonmap-v3

---
Changed since v4:
- handle INVALID_MFN in new APIs
- drop some goto labels since there could be better options
- const qualify introduced variables
- defer some changes to future patches due to ongoing discussions on
  map_pages_to_xen

Changed since v3:
- change my email address in all patches
- address many style issues in v3
- rebase

Changed since v2:
- split into a smaller series
- drop the clear_page optimisation as Wei suggests
- rebase

Changed since v1:
- squash some commits
- merge bug fixes into this first batch
- rebase against latest master

Wei Liu (7):
  x86: move some xen mm function declarations
  x86: introduce a new set of APIs to manage Xen page tables
  x86/mm: introduce l{1,2}t local variables to map_pages_to_xen
  x86/mm: introduce l{1,2}t local variables to modify_xen_mappings
  x86/mm: map_pages_to_xen would better have one exit path
  x86/mm: make sure there is one exit path for modify_xen_mappings
  x86/mm: change pl*e to l*t in virt_to_xen_l*e

 xen/arch/x86/mm.c  | 258 -
 xen/include/asm-x86/mm.h   |  16 +++
 xen/include/asm-x86/page.h |   5 -
 3 files changed, 175 insertions(+), 104 deletions(-)

-- 
2.15.3.AMZN



[Xen-devel] [PATCH v5 5/7] x86/mm: map_pages_to_xen would better have one exit path

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

We will soon rewrite the function to handle dynamically mapping and
unmapping of page tables.

No functional change.

Signed-off-by: Wei Liu 
Signed-off-by: Hongyan Xia 

---
Changed since v4:
- drop the end_of_loop goto label since this function may be refactored
  in the future and there are options to do things without the goto.

Changed since v3:
- remove asserts on rc since rc never gets changed to anything else.
- reword commit message.
---
 xen/arch/x86/mm.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7160ddcb67..71e9c4b19e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5164,9 +5164,11 @@ int map_pages_to_xen(
 unsigned int flags)
 {
 bool locking = system_state > SYS_STATE_boot;
+l3_pgentry_t *pl3e, ol3e;
 l2_pgentry_t *pl2e, ol2e;
 l1_pgentry_t *pl1e, ol1e;
 unsigned int  i;
+int rc = -ENOMEM;
 
 #define flush_flags(oldf) do { \
 unsigned int o_ = (oldf);  \
@@ -5184,10 +5186,11 @@ int map_pages_to_xen(
 
 while ( nr_mfns != 0 )
 {
-l3_pgentry_t ol3e, *pl3e = virt_to_xen_l3e(virt);
+pl3e = virt_to_xen_l3e(virt);
 
 if ( !pl3e )
-return -ENOMEM;
+goto out;
+
 ol3e = *pl3e;
 
 if ( cpu_has_page1gb &&
@@ -5277,7 +5280,7 @@ int map_pages_to_xen(
 
 l2t = alloc_xen_pagetable();
 if ( l2t == NULL )
-return -ENOMEM;
+goto out;
 
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
 l2e_write(l2t + i,
@@ -5306,7 +5309,7 @@ int map_pages_to_xen(
 
 pl2e = virt_to_xen_l2e(virt);
 if ( !pl2e )
-return -ENOMEM;
+goto out;
 
 if ( ((((virt >> PAGE_SHIFT) | mfn_x(mfn)) &
((1u << PAGETABLE_ORDER) - 1)) == 0) &&
@@ -5350,7 +5353,7 @@ int map_pages_to_xen(
 {
 pl1e = virt_to_xen_l1e(virt);
 if ( pl1e == NULL )
-return -ENOMEM;
+goto out;
 }
 else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
 {
@@ -5378,7 +5381,7 @@ int map_pages_to_xen(
 
 l1t = alloc_xen_pagetable();
 if ( l1t == NULL )
-return -ENOMEM;
+goto out;
 
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
 l1e_write(&l1t[i],
@@ -5524,7 +5527,10 @@ int map_pages_to_xen(
 
 #undef flush_flags
 
-return 0;
+rc = 0;
+
+ out:
+return rc;
 }
 
 int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
-- 
2.15.3.AMZN



[Xen-devel] [PATCH v5 7/7] x86/mm: change pl*e to l*t in virt_to_xen_l*e

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

We will need to have a variable named pl*e when we rewrite
virt_to_xen_l*e. Change pl*e to l*t to better reflect its purpose.
This will make reviewing later patches easier.

No functional change.

Signed-off-by: Wei Liu 
Signed-off-by: Hongyan Xia 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/mm.c | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 6b589762b1..d594d6abfb 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5054,25 +5054,25 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
 if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
 {
 bool locking = system_state > SYS_STATE_boot;
-l3_pgentry_t *pl3e = alloc_xen_pagetable();
+l3_pgentry_t *l3t = alloc_xen_pagetable();
 
-if ( !pl3e )
+if ( !l3t )
 return NULL;
-clear_page(pl3e);
+clear_page(l3t);
 if ( locking )
 spin_lock(&map_pgdir_lock);
 if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
 {
-l4_pgentry_t l4e = l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR);
+l4_pgentry_t l4e = l4e_from_paddr(__pa(l3t), __PAGE_HYPERVISOR);
 
 l4e_write(pl4e, l4e);
 efi_update_l4_pgtable(l4_table_offset(v), l4e);
-pl3e = NULL;
+l3t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
-if ( pl3e )
-free_xen_pagetable(pl3e);
+if ( l3t )
+free_xen_pagetable(l3t);
 }
 
 return l4e_to_l3e(*pl4e) + l3_table_offset(v);
@@ -5089,22 +5089,22 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
 if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
 {
 bool locking = system_state > SYS_STATE_boot;
-l2_pgentry_t *pl2e = alloc_xen_pagetable();
+l2_pgentry_t *l2t = alloc_xen_pagetable();
 
-if ( !pl2e )
+if ( !l2t )
 return NULL;
-clear_page(pl2e);
+clear_page(l2t);
 if ( locking )
 spin_lock(&map_pgdir_lock);
 if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
 {
-l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
-pl2e = NULL;
+l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
+l2t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
-if ( pl2e )
-free_xen_pagetable(pl2e);
+if ( l2t )
+free_xen_pagetable(l2t);
 }
 
 BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
@@ -5122,22 +5122,22 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
 {
 bool locking = system_state > SYS_STATE_boot;
-l1_pgentry_t *pl1e = alloc_xen_pagetable();
+l1_pgentry_t *l1t = alloc_xen_pagetable();
 
-if ( !pl1e )
+if ( !l1t )
 return NULL;
-clear_page(pl1e);
+clear_page(l1t);
 if ( locking )
 spin_lock(&map_pgdir_lock);
 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
 {
-l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
-pl1e = NULL;
+l2e_write(pl2e, l2e_from_paddr(__pa(l1t), __PAGE_HYPERVISOR));
+l1t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
-if ( pl1e )
-free_xen_pagetable(pl1e);
+if ( l1t )
+free_xen_pagetable(l1t);
 }
 
 BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
-- 
2.15.3.AMZN
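
The three virt_to_xen_l*e() helpers share one pattern: allocate the new
table outside the lock, re-check the parent entry under the lock, and free
the table if someone else installed one first. A minimal standalone sketch
of that pattern follows. It is not Xen code: parent_slot, the atomic_flag
spinlock and get_or_install_table() are all hypothetical stand-ins, and
the unlocked first check simply mirrors the shape of the original.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins: a one-slot "parent entry" and a tiny spinlock. */
static void *parent_slot;                      /* NULL means "not present" */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_lock_acquire(void)
{
    while ( atomic_flag_test_and_set(&demo_lock) )
        ; /* spin until the flag was previously clear */
}

static void demo_lock_release(void)
{
    atomic_flag_clear(&demo_lock);
}

/*
 * Allocate outside the lock, re-check under the lock, and free the new
 * table if another CPU installed one first. Returns the installed table,
 * or NULL on allocation failure.
 */
void *get_or_install_table(void)
{
    if ( !parent_slot )
    {
        void *l1t = calloc(1, 4096);

        if ( !l1t )
            return NULL;

        demo_lock_acquire();
        if ( !parent_slot )
        {
            parent_slot = l1t;
            l1t = NULL;        /* ownership moved to the parent entry */
        }
        demo_lock_release();

        if ( l1t )
            free(l1t);         /* lost the race: drop our copy */
    }

    return parent_slot;
}
```

Setting the local pointer to NULL after installing it is what makes the
final "if ( l1t ) free" safe in both the winning and the losing case.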



[Xen-devel] [PATCH v5 4/7] x86/mm: introduce l{1, 2}t local variables to modify_xen_mappings

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

The pl2e and pl1e variables are heavily (ab)used in that function.  It
is fine at the moment because all page tables are always mapped so
there is no need to track the life time of each variable.

We will soon have the requirement to map and unmap page tables. We
need to track the life time of each variable to avoid leakage.

Introduce some l{1,2}t variables with limited scope so that we can
track life time of pointers to xen page tables more easily.

No functional change.

Signed-off-by: Wei Liu 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/mm.c | 68 +++
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 699aa6bbdf..7160ddcb67 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5575,6 +5575,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 
 if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
 {
+l2_pgentry_t *l2t;
+
 if ( l2_table_offset(v) == 0 &&
  l1_table_offset(v) == 0 &&
  ((e - v) >= (1UL << L3_PAGETABLE_SHIFT)) )
@@ -5590,11 +5592,11 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 }
 
 /* PAGE1GB: shatter the superpage and fall through. */
-pl2e = alloc_xen_pagetable();
-if ( !pl2e )
+l2t = alloc_xen_pagetable();
+if ( !l2t )
 return -ENOMEM;
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
-l2e_write(pl2e + i,
+l2e_write(l2t + i,
   l2e_from_pfn(l3e_get_pfn(*pl3e) +
(i << PAGETABLE_ORDER),
l3e_get_flags(*pl3e)));
@@ -5603,14 +5605,14 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
 {
-l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e),
+l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
 __PAGE_HYPERVISOR));
-pl2e = NULL;
+l2t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
-if ( pl2e )
-free_xen_pagetable(pl2e);
+if ( l2t )
+free_xen_pagetable(l2t);
 }
 
 /*
@@ -5644,12 +5646,14 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 }
 else
 {
+l1_pgentry_t *l1t;
+
 /* PSE: shatter the superpage and try again. */
-pl1e = alloc_xen_pagetable();
-if ( !pl1e )
+l1t = alloc_xen_pagetable();
+if ( !l1t )
 return -ENOMEM;
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-l1e_write(&pl1e[i],
+l1e_write(&l1t[i],
   l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
l2e_get_flags(*pl2e) & ~_PAGE_PSE));
 if ( locking )
@@ -5657,19 +5661,19 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
  (l2e_get_flags(*pl2e) & _PAGE_PSE) )
 {
-l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(pl1e),
+l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(l1t),
 __PAGE_HYPERVISOR));
-pl1e = NULL;
+l1t = NULL;
 }
 if ( locking )
 spin_unlock(&map_pgdir_lock);
-if ( pl1e )
-free_xen_pagetable(pl1e);
+if ( l1t )
+free_xen_pagetable(l1t);
 }
 }
 else
 {
-l1_pgentry_t nl1e;
+l1_pgentry_t nl1e, *l1t;
 
 /*
  * Ordinary 4kB mapping: The L2 entry has been verified to be
@@ -5716,9 +5720,9 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 continue;
 }
 
-pl1e = l2e_to_l1e(*pl2e);
+l1t = l2e_to_l1e(*pl2e);
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-if ( l1e_get_intpte(pl1e[i]) != 0 )
+if ( l1e_get_intpte(l1t[i]) != 0 )
 break;
 if ( i == L1_PAGETABLE_ENTRIES )
 {
@@ -5727,7 +5731,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, 
unsigned int nf)
 if ( locking )
 spin_unlock(&map_pgdir_lock);
  

[Xen-devel] [PATCH v5 2/7] x86: introduce a new set of APIs to manage Xen page tables

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

We are going to switch to using domheap page for page tables.
A new set of APIs is introduced to allocate, map, unmap and free pages
for page tables.

The allocation and deallocation work on mfn_t but not page_info,
because they are required to work even before frame table is set up.

Implement the old functions with the new ones. We will rewrite, site
by site, other mm functions that manipulate page tables to use the new
APIs.

Note these new APIs still use xenheap page underneath and no actual
map and unmap is done so that we don't break xen half way. They will
be switched to use domheap and dynamic mappings when usage of old APIs
is eliminated.

No functional change intended in this patch.

Signed-off-by: Wei Liu 
Signed-off-by: Hongyan Xia 
Reviewed-by: Julien Grall 

---
Changed since v4:
- properly handle INVALID_MFN.
- remove the _new suffix for map/unmap_xen_pagetable because they do not
  have old alternatives.

Changed since v3:
- const qualify unmap_xen_pagetable_new().
- remove redundant parentheses.
---
 xen/arch/x86/mm.c| 44 +++-
 xen/include/asm-x86/mm.h | 11 +++
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index cc0d71996c..22b55390f1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -119,6 +119,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -4992,22 +4993,55 @@ int mmcfg_intercept_write(
 }
 
 void *alloc_xen_pagetable(void)
+{
+mfn_t mfn = alloc_xen_pagetable_new();
+
+return mfn_eq(mfn, INVALID_MFN) ? NULL : mfn_to_virt(mfn_x(mfn));
+}
+
+void free_xen_pagetable(void *v)
+{
+mfn_t mfn = v ? virt_to_mfn(v) : INVALID_MFN;
+
+if ( system_state != SYS_STATE_early_boot )
+free_xen_pagetable_new(mfn);
+}
+
+/*
+ * For these PTE APIs, the caller must follow the alloc-map-unmap-free
+ * lifecycle, which means explicitly mapping the PTE pages before accessing
+ * them. The caller must check whether the allocation has succeeded, and only
+ * pass valid MFNs to map_xen_pagetable().
+ */
+mfn_t alloc_xen_pagetable_new(void)
 {
 if ( system_state != SYS_STATE_early_boot )
 {
 void *ptr = alloc_xenheap_page();
 
 BUG_ON(!hardware_domain && !ptr);
-return ptr;
+return ptr ? virt_to_mfn(ptr) : INVALID_MFN;
 }
 
-return mfn_to_virt(mfn_x(alloc_boot_pages(1, 1)));
+return alloc_boot_pages(1, 1);
 }
 
-void free_xen_pagetable(void *v)
+void *map_xen_pagetable(mfn_t mfn)
 {
-if ( system_state != SYS_STATE_early_boot )
-free_xenheap_page(v);
+return mfn_to_virt(mfn_x(mfn));
+}
+
+/* v can point to an entry within a table or be NULL */
+void unmap_xen_pagetable(const void *v)
+{
+/* XXX still using xenheap page, no need to do anything.  */
+}
+
+/* mfn can be INVALID_MFN */
+void free_xen_pagetable_new(mfn_t mfn)
+{
+if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
+free_xenheap_page(mfn_to_virt(mfn_x(mfn)));
 }
 
 static DEFINE_SPINLOCK(map_pgdir_lock);
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 2ca8882ad0..861edba34e 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -582,6 +582,17 @@ void *do_page_walk(struct vcpu *v, unsigned long addr);
 /* Allocator functions for Xen pagetables. */
 void *alloc_xen_pagetable(void);
 void free_xen_pagetable(void *v);
+mfn_t alloc_xen_pagetable_new(void);
+void *map_xen_pagetable(mfn_t mfn);
+void unmap_xen_pagetable(const void *v);
+void free_xen_pagetable_new(mfn_t mfn);
+
+#define UNMAP_XEN_PAGETABLE(ptr)\
+do {\
+unmap_xen_pagetable(ptr);   \
+(ptr) = NULL;   \
+} while (0)
+
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
 
 int __sync_local_execstate(void);
-- 
2.15.3.AMZN
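
To make the alloc-map-unmap-free lifecycle concrete, here is a small
standalone model of it. Everything in it is an assumption for the sake of
illustration: demo_mfn_t, DEMO_INVALID_MFN, the four-page pool and the
demo_* functions are hypothetical and only mimic the shape of the new
APIs; "map" and "unmap" are trivial here, much like in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: a "frame number" is an index into a small pool. */
typedef struct { unsigned long m; } demo_mfn_t;
#define DEMO_INVALID_MFN ((demo_mfn_t){ ~0UL })
#define DEMO_POOL_PAGES 4

static void *demo_pool[DEMO_POOL_PAGES];

int demo_mfn_valid(demo_mfn_t mfn) { return mfn.m != ~0UL; }

/* Allocate a page and hand back its frame number, not a pointer. */
demo_mfn_t demo_alloc_pagetable(void)
{
    for ( unsigned long i = 0; i < DEMO_POOL_PAGES; i++ )
        if ( !demo_pool[i] )
        {
            demo_pool[i] = calloc(1, 4096);
            return demo_pool[i] ? (demo_mfn_t){ i } : DEMO_INVALID_MFN;
        }

    return DEMO_INVALID_MFN;
}

/* The caller must only pass frames that passed the validity check. */
void *demo_map_pagetable(demo_mfn_t mfn) { return demo_pool[mfn.m]; }

/* Nothing to undo in this model; v may be NULL. */
void demo_unmap_pagetable(const void *v) { (void)v; }

/* Unmap and clear the pointer so it cannot be used afterwards. */
#define DEMO_UNMAP_PAGETABLE(ptr) do { \
    demo_unmap_pagetable(ptr);         \
    (ptr) = NULL;                      \
} while (0)

/* Accepts DEMO_INVALID_MFN and does nothing in that case. */
void demo_free_pagetable(demo_mfn_t mfn)
{
    if ( !demo_mfn_valid(mfn) )
        return;
    free(demo_pool[mfn.m]);
    demo_pool[mfn.m] = NULL;
}

/* Full lifecycle: alloc, check, map, use, unmap, free. */
int demo_lifecycle(void)
{
    demo_mfn_t mfn = demo_alloc_pagetable();
    unsigned char *pt;

    if ( !demo_mfn_valid(mfn) )
        return -1;

    pt = demo_map_pagetable(mfn);
    pt[0] = 1;
    DEMO_UNMAP_PAGETABLE(pt);
    demo_free_pagetable(mfn);

    return pt == NULL ? 0 : -1;
}
```

The macro form of unmap exists so that a stale pointer becomes NULL
rather than dangling once the mapping is gone.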



[Xen-devel] [PATCH v5 1/7] x86: move some xen mm function declarations

2020-01-07 Thread Hongyan Xia
From: Wei Liu 

They were put into page.h but mm.h is more appropriate.

The real reason is that I will be adding some new functions which
take mfn_t. It turns out it is a bit difficult to do in page.h.

No functional change.

Signed-off-by: Wei Liu 
Acked-by: Jan Beulich 

---
Changed since v3:
- move Xen PTE API declarations next to do_page_walk().
---
 xen/include/asm-x86/mm.h   | 5 +
 xen/include/asm-x86/page.h | 5 -
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 1479ba6703..2ca8882ad0 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -579,6 +579,11 @@ void update_cr3(struct vcpu *v);
 int vcpu_destroy_pagetables(struct vcpu *);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
+/* Allocator functions for Xen pagetables. */
+void *alloc_xen_pagetable(void);
+void free_xen_pagetable(void *v);
+l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
+
 int __sync_local_execstate(void);
 
 /* Arch-specific portion of memory_op hypercall. */
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index c1e92937c0..05a8b1efa6 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -345,11 +345,6 @@ void efi_update_l4_pgtable(unsigned int l4idx, 
l4_pgentry_t);
 
 #ifndef __ASSEMBLY__
 
-/* Allocator functions for Xen pagetables. */
-void *alloc_xen_pagetable(void);
-void free_xen_pagetable(void *v);
-l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
-
 /* Convert between PAT/PCD/PWT embedded in PTE flags and 3-bit cacheattr. */
 static inline unsigned int pte_flags_to_cacheattr(unsigned int flags)
 {
-- 
2.15.3.AMZN



Re: [Xen-devel] [PATCH] MAINTAINERS: Add explicit check-in policy section

2020-01-07 Thread George Dunlap
On 1/7/20 12:03 PM, George Dunlap wrote:
> v2:
> - Modify "sufficient time" to "sufficient time and/or warning".
> - Add a comment explicitly stating that there are exceptions.
> - Move some of the alternate proposals into the changelog itself

Sorry, this should obviously have 'v2' in the subject.

 -George


Re: [Xen-devel] [PATCH] CODING_STYLE: Document how to handle unexpected conditions

2020-01-07 Thread George Dunlap
On 1/7/20 12:02 PM, George Dunlap wrote:
> It's not always clear what the best way is to handle unexpected
> conditions: whether with ASSERT(), domain_crash(), BUG_ON(), or some
> other method.  All methods have a risk of introducing security
> vulnerabilities and unnecessary instabilities to production systems.
> 
> Provide guidelines for different options and when to use them.
> 
> Signed-off-by: George Dunlap 
> ---
> v4:
> - s/guest should/guests shouldn't/;
> - Add a note about the effect of domain_crash() further up the stack.

Sorry, obviously this patch should have 'v4' in the subject.

 -George


[Xen-devel] [PATCH] CODING_STYLE: Document how to handle unexpected conditions

2020-01-07 Thread George Dunlap
It's not always clear what the best way is to handle unexpected
conditions: whether with ASSERT(), domain_crash(), BUG_ON(), or some
other method.  All methods have a risk of introducing security
vulnerabilities and unnecessary instabilities to production systems.

Provide guidelines for different options and when to use them.

Signed-off-by: George Dunlap 
---
v4:
- s/guest should/guests shouldn't/;
- Add a note about the effect of domain_crash() further up the stack.
v3:
- A number of minor edits
- Expand on domain_crash a bit.
v2:
- Clarify meaning of "or" clause
- Add domain_crash as an option
- Make it clear that ASSERT() is not an error handling mechanism.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Konrad Wilk 
CC: Stefano Stabellini 
CC: Julien Grall 
---
 CODING_STYLE | 102 +++
 1 file changed, 102 insertions(+)

diff --git a/CODING_STYLE b/CODING_STYLE
index 810b71c16d..9f50d9cec4 100644
--- a/CODING_STYLE
+++ b/CODING_STYLE
@@ -133,3 +133,105 @@ the end of files.  It should be:
  * indent-tabs-mode: nil
  * End:
  */
+
+Handling unexpected conditions
+------------------------------
+
+GUIDELINES:
+
+Passing errors up the stack should be used when the caller is already
+expecting to handle errors, and the state when the error was
+discovered isn't broken, or isn't too hard to fix.
+
+domain_crash() should be used when passing errors up the stack is too
+difficult, and/or when fixing up state of a guest is impractical, but
+where fixing up the state of Xen will allow Xen to continue running.
+This is particularly appropriate when the guest is exhibiting behavior
+well-behaved guests shouldn't.
+
+BUG_ON() should be used when you can't pass errors up the stack, and
+either continuing or crashing the guest would likely cause an
+information leak or privilege escalation vulnerability.
+
+ASSERT() IS NOT AN ERROR HANDLING MECHANISM.  ASSERT is a way to move
+detection of a bug earlier in the programming cycle; it is a
+more-noticeable printk.  It should only be added after one of the
+other three error-handling mechanisms has been evaluated for
+reliability and security.
+
+RATIONALE:
+
+It's frequently the case that code is written with the assumption that
+certain conditions can never happen.  There are several possible
+actions programmers can take in these situations:
+
+* Programmers can simply not handle those cases in any way, other than
+perhaps to write a comment documenting what the assumption is.
+
+* Programmers can try to handle the case gracefully -- fixing up
+in-progress state and returning an error to the user.
+
+* Programmers can crash the guest.
+
+* Programmers can use ASSERT(), which will cause the check to be
+executed in DEBUG builds, and cause the hypervisor to crash if it's
+violated.
+
+* Programmers can use BUG_ON(), which will cause the check to be
+executed in both DEBUG and non-DEBUG builds, and cause the hypervisor
+to crash if it's violated.
+
+In selecting which response to use, we want to achieve several goals:
+
+- To minimize risk of introducing security vulnerabilities,
+  particularly as the code evolves over time
+
+- To efficiently spend programmer time
+
+- To detect violations of assumptions as early as possible
+
+- To minimize the impact of bugs on production use cases
+
+The guidelines above attempt to balance these:
+
+- When the caller is expecting to handle errors, and there is no
+broken state at the time the unexpected condition is discovered, or
+when fixing the state is straightforward, then fixing up the state and
+returning an error is the most robust thing to do.  However, if the
+caller isn't expecting to handle errors, or if the state is difficult
+to fix, then returning an error may require extensive refactoring,
+which is not a good use of programmer time when they're certain that
+this condition cannot occur.
+
+- BUG_ON() will stop all hypervisor action immediately.  In situations
+where continuing might allow an attacker to escalate privilege, a
+BUG_ON() can change a privilege escalation or information leak into a
+denial-of-service (an improvement).  But in situations where
+continuing (say, returning an error) might be safe, then BUG_ON() can
+change a benign failure into denial-of-service (a degradation).
+
+- domain_crash() is similar to BUG_ON(), but with a more limited
+effect: it stops that domain immediately.  In situations where
+continuing might cause guest or hypervisor corruption, but destroying
+the guest allows the hypervisor to continue, this can change a more
+serious bug into a guest denial-of-service.  But in situations where
+returning an error might be safe, then domain_crash() can change a
+benign failure into a guest denial-of-service.
+
+- ASSERT() will stop the hypervisor during development, but allow
+hypervisor action to continue during production.  In situations where
+continuing will at worst result in a denial-of-service, and at 

[Xen-devel] [PATCH] MAINTAINERS: Add explicit check-in policy section

2020-01-07 Thread George Dunlap
The "nesting" section in the MAINTAINERS file was not initially
intended to describe the check-in policy for patches, but only how
nesting worked; but since there was no check-in policy, it has been
acting as a de-facto policy.

One problem with this is that the policy is not complete: It doesn't
cover open objections, time to check-in, and so on.  The other problem
with the policy is that, as written, it doesn't account for
maintainers submitting patches to files which they themselves
maintain.  This is fine for situations where there are multiple
maintainers, but not for situations where there is only one
maintainer.

Add an explicit "Check-in policy" section to the MAINTAINERS document
to serve as the canonical reference for the check-in policy.  Move
paragraphs not explicitly related to nesting into it.

While here, "promote" the "The meaning of nesting" section title.

DISCUSSION

This seems to be a change from people's understanding of the current
policy.  Most people's understanding of the current policy seems to be:

1.  In order to get a change to a given file committed, it must have
an Ack or Review from at least one *maintainer* of that file other
than the submitter.

2. In the case where a file has only one maintainer, it must have an
Ack or Review from a "nested" maintainer.

I.e., if I submitted something to x86/mm, it would require an Ack from
Jan or Andy, or (in exceptional circumstances) The Rest; but an Ack from
(say) Roger or Juergen wouldn't suffice.

Let's call this the "maintainer-ack" approach (because it must have an
ack or r-b from a maintainer to be checked in), and the proposal in
this patch the "maintainer-approval" (since SoB from a maintainer
indicates approval).

The core issue I have with "maintainer-ack" is that it makes the
maintainer less privileged with regard to writing code than
non-maintainers.  If component X has maintainers A and B, then a
non-maintainer can have code checked in if reviewed either by A or B.
If A or B wants code checked in, they have to wait for exactly one
person to review it.

In fact, if B is quite busy, the easiest way for A really to get their
code checked in might be to hand it to a non-maintainer N, and ask N
to submit it as their own.  Then A can Ack the patches and check them
in.

The current system, therefore, either sets up a perverse incentive (if
you think the behavior described above is unacceptable) or unnecessary
bureaucracy (if you think it's acceptable).  Either way I think we
should set up our system to avoid it.

Other variations on "maintainer-ack" have been proposed:

- Allow maintainer's patches to go in with an R-b from "designated
  reviewers"

- Allow maintainer's patches to go in with an Ack from a more general
  maintainer

Both fundamentally make it harder for maintainers to get their code in
and/or reviewed effectively than non-maintainers, setting up the
perverse incentive / unnecessary bureaucracy.

Signed-off-by: George Dunlap 
---
v2:
- Modify "sufficient time" to "sufficient time and/or warning".
- Add a comment explicitly stating that there are exceptions.
- Move some of the alternate proposals into the changelog itself

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Konrad Wilk 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Lars Kurth 

This is a follow-up to the discussion in `[PATCH for-4.12]
passthrough/vtd: Drop the "workaround_bios_bug" logic entirely`, specifically
Message-ID: <5c9cf25a027800222...@prv1-mh.provo.novell.com>

Another approach would be to say that in the case of multiple
maintainers, the maintainers themselves can decide to mandate each
other's Ack.  For instance, Dario and I could agree that we don't need
each other's Ack for changes to the scheduler, but Andy and Jan could
agree that they do need each other's Ack for changes to the x86 code.
Checks that maintainers themselves have agreed on will produce neither
perverse incentives, nor be considered "unnecessary".
---
 MAINTAINERS | 53 +++--
 1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index eaea4620e2..9d15afa595 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -104,7 +104,53 @@ Descriptions of section entries:
   xen-maintainers-
 
 
-The meaning of nesting:
+   Check-in policy
+   ===============
+
+In order for a patch to be checked in, in general, several conditions
+must be met:
+
+1. In order to get a change to a given file committed, it must have
+   the approval of at least one maintainer of that file.
+
+   A patch of course needs Acks from the maintainers of each file that
+   it changes; so a patch which changes xen/arch/x86/traps.c,
+   xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would
+   require an Ack from each of the three sets of maintainers.
+
+   See below for rules on nested maintainership.
+
+2. It must have an Acked-by or a Reviewed-by from someone other 

[Xen-devel] [PATCH] VT-d: dma_pte_clear_one() can't fail anymore

2020-01-07 Thread Jan Beulich
Hence it's pointless for it to return an error indicator, and it's even
less useful for it to be __must_check. This is a result of commit
e8afe1124cc1 ("iommu: elide flushing for higher order map/unmap
operations") moving the TLB flushing out of the function.

Signed-off-by: Jan Beulich 
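
To make the effect of the change concrete, here is a small stand-alone
sketch of the same refactoring pattern.  It is not the VT-d code: struct pte
and the three function names are hypothetical, and the __must_check
definition is an assumption modelled on the common GCC/Clang
warn_unused_result attribute rather than copied from Xen's headers.

```c
#include <assert.h>

/* Assumed definition; consult Xen's own headers for the real one. */
#if defined(__GNUC__)
#define __must_check __attribute__((__warn_unused_result__))
#else
#define __must_check
#endif

/* Hypothetical stand-in for struct dma_pte. */
struct pte { unsigned long val; };

/* Before: can only ever return 0, yet callers must consume the result. */
static int __must_check clear_one_old(struct pte *pte)
{
    pte->val = 0;
    return 0;
}

/* After: nothing can fail, so nothing is returned. */
static void clear_one_new(struct pte *pte)
{
    pte->val = 0;
}

/* A caller whose own interface still reports errors returns 0 itself. */
static int __must_check unmap_page(struct pte *pte)
{
    clear_one_new(pte);

    return 0;
}

/* Small driver: the entry is cleared and the caller still reports success. */
int demo_unmap(void)
{
    struct pte p = { .val = 0x1234 };
    int rc = unmap_page(&p);

    assert(p.val == 0);
    return rc;
}
```

With the old signature, a caller that drops the return value draws a
compiler warning even though there is no failure to act on; the void
variant removes that dead error path while the outer function keeps its
error-returning interface.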

--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -608,13 +608,12 @@ static int __must_check iommu_flush_iotl
 }
 
 /* clear one page's page table */
-static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr,
-  unsigned int *flush_flags)
+static void dma_pte_clear_one(struct domain *domain, uint64_t addr,
+  unsigned int *flush_flags)
 {
 struct domain_iommu *hd = dom_iommu(domain);
 struct dma_pte *page = NULL, *pte = NULL;
 u64 pg_maddr;
-int rc = 0;
 
 spin_lock(&hd->arch.mapping_lock);
 /* get last level pte */
@@ -622,7 +621,7 @@ static int __must_check dma_pte_clear_on
 if ( pg_maddr == 0 )
 {
 spin_unlock(&hd->arch.mapping_lock);
-return 0;
+return;
 }
 
 page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
@@ -632,7 +631,7 @@ static int __must_check dma_pte_clear_on
 {
 spin_unlock(&hd->arch.mapping_lock);
 unmap_vtd_domain_page(page);
-return 0;
+return;
 }
 
 dma_clear_pte(*pte);
@@ -642,8 +641,6 @@ static int __must_check dma_pte_clear_on
 iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
 unmap_vtd_domain_page(page);
-
-return rc;
 }
 
 static void iommu_free_pagetable(u64 pt_maddr, int level)
@@ -1802,7 +1799,9 @@ static int __must_check intel_iommu_unma
 if ( iommu_hwdom_passthrough && is_hardware_domain(d) )
 return 0;
 
-return dma_pte_clear_one(d, dfn_to_daddr(dfn), flush_flags);
+dma_pte_clear_one(d, dfn_to_daddr(dfn), flush_flags);
+
+return 0;
 }
 
 static int intel_iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,

