date:20191211

Re: [Xen-devel] [BUG] Xen-ballooned memory never returned to domain after partial-free

2019-12-11 Thread Jürgen Groß


On 11.12.19 23:08, Nicholas Tsirakis wrote:

Hello,

The issue I'm seeing is that pages of previously-xenballooned memory are getting
trapped in the balloon on free, specifically when they are free'd in batches
(i.e. not all at once). The first batch is restored to the domain properly, but
subsequent frees are not.

Truthfully I'm not sure if this is a bug or not, but the behavior I'm seeing
doesn't seem to make sense. Note that this "bug" is in the balloon driver, but
the behavior is seen when using the gnttab API, which utilizes the balloon in
the background.

--

This issue is better illustrated as an example, seen below. Note that the file
in question is drivers/xen/balloon.c:

Kernel version: 4.19.*, code seems consistent on master as well
Relevant configs:
 - CONFIG_MEMORY_HOTPLUG not set
 - CONFIG_XEN_BALLOON_MEMORY_HOTPLUG not set

* current_pages = # of pages assigned to domain
* target_pages = # of pages we want assigned to domain
* credit = target - current

Start with current_pages/target_pages = 20 pages

1. alloc 5 pages with gnttab_alloc_pages(). current_pages = 15, credit = 5.
2. alloc 3 pages with gnttab_alloc_pages(). current_pages = 12, credit = 8.
3. some time later, free the last 3 pages with gnttab_free_pages().
4. 3 pages go back to balloon and balloon worker is scheduled since credit > 0.
 * Relevant part of balloon worker shown below:

 do {
 ...

 credit = current_credit();

 if (credit > 0) {
 if (balloon_is_inflated())
 state = increase_reservation(credit);
 else
 state = reserve_additional_memory();
 }

 ...

 } while (credit && state == BP_DONE);

5. credit > 0 and the balloon contains 3 pages, so run increase_reservation. 3
pages are restored to domain, correctly. current_pages = 15, credit = 5.
6. at this point credit is still > 0, so we loop again.
7. this time, the balloon has 0 pages, so we call reserve_additional_memory,
seen below. note that CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is disabled, so this
funciton is very sparse.

 static enum bp_state reserve_additional_memory(void)
 {
 balloon_stats.target_pages = balloon_stats.current_pages;
 return BP_ECANCELED;
 }

8. now target = current = 15, which drops our credit down to 0.


And I think this is the problem. We want here:

balloon_stats.target_pages = balloon_stats.current_pages +
 balloon_stats.target_unpopulated;

This should fix it. Thanks for the detailed analysis!

Does the attached patch work for you?

And are you fine with the "Reported-by:" added?


Juergen
>From 7cf6cf2b94ee11002dab439fb4ed5c7dcc1a971b Mon Sep 17 00:00:00 2001
From: Juergen Gross 
Date: Thu, 12 Dec 2019 08:12:26 +0100
Subject: [PATCH] xen/balloon: fix ballooned page accounting without hotplug
 enabled

When CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not defined
reserve_additional_memory() will set balloon_stats.target_pages to a
wrong value in case there are still some ballooned pages allocated via
alloc_xenballooned_pages().

This will result in balloon_process() no longer be triggered when
ballooned pages are freed in batches.

Reported-by: Nicholas Tsirakis 
Signed-off-by: Juergen Gross 
---
 drivers/xen/balloon.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 4f2e78a5e4db..0c142bcab79d 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -394,7 +394,8 @@ static struct notifier_block xen_memory_nb = {
 #else
 static enum bp_state reserve_additional_memory(void)
 {
-	balloon_stats.target_pages = balloon_stats.current_pages;
+	balloon_stats.target_pages = balloon_stats.current_pages +
+ balloon_stats.target_unpopulated;
 	return BP_ECANCELED;
 }
 #endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
-- 
2.16.4

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf bisection] complete build-i386

2019-12-11 Thread osstest service owner

branch xen-unstable
xenbranch xen-unstable
job build-i386
testid xen-build

Tree: ovmf https://github.com/tianocore/edk2.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  ovmf https://github.com/tianocore/edk2.git
  Bug introduced:  13c5e34a1b8bfedbd10ea038cfcbae5caeab6652
  Bug not present: 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/144757/


  commit 13c5e34a1b8bfedbd10ea038cfcbae5caeab6652
  Author: Bob Feng 
  Date:   Mon Dec 2 16:24:19 2019 +0800
  
  BaseTools: Add build option for dependency file generation
  
  BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2311
  
  Add /showIncludes for msvc and -MMD -MF $@.deps
  for GCC and CLANG
  
  Remove /MP for msvc since /MP does not work with
  /showIncludes
  
  Signed-off-by: Bob Feng 
  
  Cc: Liming Gao 
  Cc: Steven Shi 
  Cc: Michael D Kinney 
  Reviewed-by: Liming Gao 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/ovmf/build-i386.xen-build.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/ovmf/build-i386.xen-build 
--summary-out=tmp/144757.bisection-summary --basis-template=144637 
--blessings=real,real-bisect ovmf build-i386 xen-build
Searching for failure / basis pass:
 144718 fail [host=huxelrebe0] / 144637 [host=elbling1] 144590 
[host=chardonnay1] 144583 [host=albana0] 144578 [host=albana0] 144564 
[host=albana0] 144527 ok.
Failure / basis pass flights: 144718 / 144527
(tree with no url: minios)
Tree: ovmf https://github.com/tianocore/edk2.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
4935a5433db28077fe6313f920bbedcd54516cec
Basis pass 94d4efb54ec4ca894287276ce22d29b6261dbc0b 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c9ba5276e3217ac6a1ec772dbebf568ba3a8a55d 
05de315b00bf2951617b8ef28811b1f1f2dd5742
Generating revisions with ./adhoc-revtuple-generator  
https://github.com/tianocore/edk2.git#94d4efb54ec4ca894287276ce22d29b6261dbc0b-2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
 
git://xenbits.xen.org/qemu-xen-traditional.git#d0d8ad39ecb51cd7497cd524484fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798
 
git://xenbits.xen.org/qemu-xen.git#933ebad2470a169504799a1d95b8e410bd9847ef-933ebad2470a169504799a1d95b8e410bd9847ef
 
git://xenbits.xen.org/osstest/seabios.git#c9ba5276e3217ac6a1ec772dbebf568ba3a8a5\
 5d-f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
git://xenbits.xen.org/xen.git#05de315b00bf2951617b8ef28811b1f1f2dd5742-4935a5433db28077fe6313f920bbedcd54516cec
Use of uninitialized value $parents in array dereference at 
./adhoc-revtuple-generator line 465.
Use of uninitialized value in concatenation (.) or string at 
./adhoc-revtuple-generator line 465.
Loaded 12532 nodes in revision graph
Searching for test results:
 144527 pass 94d4efb54ec4ca894287276ce22d29b6261dbc0b 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c9ba5276e3217ac6a1ec772dbebf568ba3a8a55d 
05de315b00bf2951617b8ef28811b1f1f2dd5742
 144578 [host=albana0]
 144583 [host=albana0]
 144564 [host=albana0]
 144590 [host=chardonnay1]
 144637 [host=elbling1]
 144646 [host=huxelrebe1]
 144676 [host=huxelrebe1]
 144651 [host=huxelrebe1]
 144703 [host=huxelrebe1]
 144661 fail irrelevant
 144683 fail irrelevant
 144706 fail d3add11e87dace180387562d6f1951f2bffbd3d9 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
4935a5433db28077fe6313f920bbedcd54516cec
 144748 fail cb277815d5ea92718eed2d334641451ce65b0ff5 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
b73aad4c8b6a767ce15cc8cb65f9eeab7bfccdae
 144752 fail 13c5e34a1b8bfedbd10ea038cfcbae5caeab6652 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
b73aad4c8b6a767ce15cc8cb65f9eeab7bfccdae
 144693 fail 97eedf5dfbaffde33210fd88066247cf0b7d3325 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
4935a5433db28077fe6313f920bbedcd54516cec
 144718 fail 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e

[Xen-devel] [xen-4.12-testing test] 144717: tolerable FAIL - PUSHED

2019-12-11 Thread osstest service owner

flight 144717 xen-4.12-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144717/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qcow217 guest-localmigrate/x10   fail  like 144587
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  8f333d13917cbd66b655df18a4476ed58df9d16f
baseline version:
 xen  212b8500cb394b3a664655f79ca0bdcb31246ff7

Last test of basis   144587  2019-12-06 12:05:57 Z5 days
Testing same since   144717  2019-12-11 14:36:36 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  George Dunlap 
  Jan Beulich 
  Jeremi Piotrowski 
  Julien Grall 
  Kevin Tian 
  Krzysztof Kolasa 
  Mark Pryor 

jobs:
 build-amd64-xsm  pass

Re: [Xen-devel] [PATCH v3 4/4] xen-blkback: support dynamic unbind/bind

2019-12-11 Thread Jürgen Groß


On 11.12.19 16:29, Paul Durrant wrote:

By simply re-attaching to shared rings during connect_ring() rather than
assuming they are freshly allocated (i.e assuming the counters are zero)
it is possible for vbd instances to be unbound and re-bound from and to
(respectively) a running guest.

This has been tested by running:

while true;
   do fio --name=randwrite --ioengine=libaio --iodepth=16 \
   --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
   done

in a PV guest whilst running:

while true;
   do echo vbd-$DOMID-$VBD >unbind;
   echo unbound;
   sleep 5;
   echo vbd-$DOMID-$VBD >bind;
   echo bound;
   sleep 3;
   done

in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
re-bind its system disk image.

This is a highly useful feature for a backend module as it allows it to be
unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
This was also tested by running:

while true;
   do echo vbd-$DOMID-$VBD >unbind;
   echo unbound;
   sleep 5;
   rmmod xen-blkback;
   echo unloaded;
   sleep 1;
   modprobe xen-blkback;
   echo bound;
   cd $(pwd);
   sleep 3;
   done

in dom0 whilst running the same loop as above in the (single) PV guest.

Some (less stressful) testing has also been done using a Windows HVM guest
with the latest 9.0 PV drivers installed.

Signed-off-by: Paul Durrant 


Reviewed-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/4] xen/interface: re-define FRONT/BACK_RING_ATTACH()

2019-12-11 Thread Jürgen Groß


On 11.12.19 16:29, Paul Durrant wrote:

Currently these macros are defined to re-initialize a front/back ring
(respectively) to values read from the shared ring in such a way that any
requests/responses that are added to the shared ring whilst the front/back
is detached will be skipped over. This, in general, is not a desirable
semantic since most frontend implementations will eventually block waiting
for a response which would either never appear or never be processed.

Since the macros are currently unused, take this opportunity to re-define
them to re-initialize a front/back ring using specified values. This also
allows FRONT/BACK_RING_INIT() to be re-defined in terms of
FRONT/BACK_RING_ATTACH() using a specified value of 0.

NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch.

Signed-off-by: Paul Durrant 


Reviewed-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Jürgen Groß


On 11.12.19 16:29, Paul Durrant wrote:

If a driver probe() fails then leave the xenstore state alone. There is no
reason to modify it as the failure may be due to transient resource
allocation issues and hence a subsequent probe() may succeed.

If the driver supports re-binding then only force state to closed during
remove() only in the case when the toolstack may need to clean up. This can
be detected by checking whether the state in xenstore has been set to
closing prior to device removal.

NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
   which defaults to false. Subsequent patches will add support to
   some backend drivers.

Signed-off-by: Paul Durrant 


Reviewed-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.11-testing test] 144716: tolerable FAIL - PUSHED

2019-12-11 Thread osstest service owner

flight 144716 xen-4.11-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144716/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass

version targeted for testing:
 xen  f562c6bb93a284033bf6f5af06287a71bc40a110
baseline version:
 xen  239d37e514c93e29d50d71f734b1dc453b2236a6

Last test of basis   144505  2019-12-03 10:09:29 Z8 days
Testing same since   144716  2019-12-11 14:36:36 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  George Dunlap 
  Jan Beulich 
  Joe Jin 
  Julien Grall 
  Julien

[Xen-devel] [xen-unstable test] 144712: tolerable FAIL - PUSHED

2019-12-11 Thread osstest service owner

flight 144712 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144712/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 144668

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 144668
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 144668
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 144668
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 144668
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 144668
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 144668
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 144668
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 144668
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 144668
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 144668
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  272c18435e93cbf749c096a9552ab5ef0d79a4ca
baseline version:
 xen  4935a5433db28077fe6313f920bbedcd54516cec

Last test of basis   144668  2019-12-10 15:37:57 Z1 days
Testing same since   144686  2019-12-11 01:51:21 Z0 days2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel De Graaf 
  George Dunlap 
  Julien Grall

[Xen-devel] [ovmf bisection] complete build-amd64-xsm

2019-12-11 Thread osstest service owner

branch xen-unstable
xenbranch xen-unstable
job build-amd64-xsm
testid xen-build

Tree: ovmf https://github.com/tianocore/edk2.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  ovmf https://github.com/tianocore/edk2.git
  Bug introduced:  13c5e34a1b8bfedbd10ea038cfcbae5caeab6652
  Bug not present: 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/144742/


  commit 13c5e34a1b8bfedbd10ea038cfcbae5caeab6652
  Author: Bob Feng 
  Date:   Mon Dec 2 16:24:19 2019 +0800
  
  BaseTools: Add build option for dependency file generation
  
  BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2311
  
  Add /showIncludes for msvc and -MMD -MF $@.deps
  for GCC and CLANG
  
  Remove /MP for msvc since /MP does not work with
  /showIncludes
  
  Signed-off-by: Bob Feng 
  
  Cc: Liming Gao 
  Cc: Steven Shi 
  Cc: Michael D Kinney 
  Reviewed-by: Liming Gao 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/ovmf/build-amd64-xsm.xen-build.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/ovmf/build-amd64-xsm.xen-build 
--summary-out=tmp/144742.bisection-summary --basis-template=144637 
--blessings=real,real-bisect ovmf build-amd64-xsm xen-build
Searching for failure / basis pass:
 144718 fail [host=godello1] / 144637 [host=italia0] 144590 ok.
Failure / basis pass flights: 144718 / 144590
(tree with no url: minios)
Tree: ovmf https://github.com/tianocore/edk2.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
4935a5433db28077fe6313f920bbedcd54516cec
Basis pass 49054b6bb66d35484e92c65f27584c4283a60986 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c9ba5276e3217ac6a1ec772dbebf568ba3a8a55d 
131c89ce1e1dfd0b57a249615a92de4f120d9100
Generating revisions with ./adhoc-revtuple-generator  
https://github.com/tianocore/edk2.git#49054b6bb66d35484e92c65f27584c4283a60986-2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
 
git://xenbits.xen.org/qemu-xen-traditional.git#d0d8ad39ecb51cd7497cd524484fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798
 
git://xenbits.xen.org/qemu-xen.git#933ebad2470a169504799a1d95b8e410bd9847ef-933ebad2470a169504799a1d95b8e410bd9847ef
 
git://xenbits.xen.org/osstest/seabios.git#c9ba5276e3217ac6a1ec772dbebf568ba3a8a5\
 5d-f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
git://xenbits.xen.org/xen.git#131c89ce1e1dfd0b57a249615a92de4f120d9100-4935a5433db28077fe6313f920bbedcd54516cec
Use of uninitialized value $parents in array dereference at 
./adhoc-revtuple-generator line 465.
Use of uninitialized value in concatenation (.) or string at 
./adhoc-revtuple-generator line 465.
Loaded 12532 nodes in revision graph
Searching for test results:
 144590 pass 49054b6bb66d35484e92c65f27584c4283a60986 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c9ba5276e3217ac6a1ec772dbebf568ba3a8a55d 
131c89ce1e1dfd0b57a249615a92de4f120d9100
 144637 [host=italia0]
 144646 fail irrelevant
 144701 [host=godello0]
 144676 [host=godello0]
 144702 [host=godello0]
 144651 fail irrelevant
 144741 pass 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
b73aad4c8b6a767ce15cc8cb65f9eeab7bfccdae
 144703 [host=godello0]
 144704 [host=godello0]
 144688 [host=godello0]
 144705 [host=godello0]
 144690 [host=godello0]
 144661 [host=godello0]
 144691 [host=godello0]
 144683 [host=godello0]
 144706 [host=godello0]
 144707 [host=godello0]
 144692 [host=godello0]
 144694 [host=godello0]
 144732 pass 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c9ba5276e3217ac6a1ec772dbebf568ba3a8a55d 
b813ce366a4313a28b2ac26e92da0cb8d6a01a75
 144695 [host=godello0]
 144696 [host=godello0]
 144697 [host=godello0]
 144698 [host=godello0]
 144700 [host=godello0]
 144693 [host=godello0]
 144715 [host=godello0]
 144709 [host=godello0]
 144718 fail 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
f21b5a4aeb020f2a5e2c6503f906a9349dd2f069 
4935a5433db28077fe6313f920bbedcd54516cec
 144714

[Xen-devel] [BUG] Xen-ballooned memory never returned to domain after partial-free

2019-12-11 Thread Nicholas Tsirakis

Hello,

The issue I'm seeing is that pages of previously-xenballooned memory are getting
trapped in the balloon on free, specifically when they are free'd in batches
(i.e. not all at once). The first batch is restored to the domain properly, but
subsequent frees are not.

Truthfully I'm not sure if this is a bug or not, but the behavior I'm seeing
doesn't seem to make sense. Note that this "bug" is in the balloon driver, but
the behavior is seen when using the gnttab API, which utilizes the balloon in
the background.

--

This issue is better illustrated as an example, seen below. Note that the file
in question is drivers/xen/balloon.c:

Kernel version: 4.19.*, code seems consistent on master as well
Relevant configs:
- CONFIG_MEMORY_HOTPLUG not set
- CONFIG_XEN_BALLOON_MEMORY_HOTPLUG not set

* current_pages = # of pages assigned to domain
* target_pages = # of pages we want assigned to domain
* credit = target - current

Start with current_pages/target_pages = 20 pages

1. alloc 5 pages with gnttab_alloc_pages(). current_pages = 15, credit = 5.
2. alloc 3 pages with gnttab_alloc_pages(). current_pages = 12, credit = 8.
3. some time later, free the last 3 pages with gnttab_free_pages().
4. 3 pages go back to balloon and balloon worker is scheduled since credit > 0.
* Relevant part of balloon worker shown below:

do {
...

credit = current_credit();

if (credit > 0) {
if (balloon_is_inflated())
state = increase_reservation(credit);
else
state = reserve_additional_memory();
}

...

} while (credit && state == BP_DONE);

5. credit > 0 and the balloon contains 3 pages, so run increase_reservation. 3
   pages are restored to domain, correctly. current_pages = 15, credit = 5.
6. at this point credit is still > 0, so we loop again.
7. this time, the balloon has 0 pages, so we call reserve_additional_memory,
   seen below. note that CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is disabled, so this
   funciton is very sparse.

static enum bp_state reserve_additional_memory(void)
{
balloon_stats.target_pages = balloon_stats.current_pages;
return BP_ECANCELED;
}

8. now target = current = 15, which drops our credit down to 0.
9. at some point later we attempt to free the remaining 5 pages with
   gnttab_free_pages().
10. 5 pages go back into the balloon, but this time credit = 0, so we never
trigger our balloon worker (it wouldn't do anything anyway).
11. since we've essentially irreversibly decreased target_pages, we'll never
attempt to re-add those pages to our domain, and those pages are reserved
in the balloon forever.
12. this can be verified by running "free", "cat /proc/meminfo", etc. to show
that the total memory has indeed decreased permanently until host reboot.

Is this desired behavior? Why would we decrease our target pages if there's no
way to restore them? I understand there is a helper function to manually reset
the target, but the caller would need to manually keep track of the starting
pages; that seems like unnecessary maintenance that the balloon should handle.

Additionally, why should any of the above code be possible if we have memory
hotplugging disabled? I'm surprised we are able to balloon any memory out from
the domain in the first place. I would have expected gnttab_alloc_pages to fail.

Please CC niko.tsira...@gmail.com on any replies. Thank you,

--Niko

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/2] xen/arm: sign extend writes to TimerValue

2019-12-11 Thread Jeff Kubascik

Per the ARMv8 Reference Manual (ARM DDI 0487E.a), section D11.2.4
specifies that the values in the TimerValue view of the timers are
signed in standard two's complement form. When writing to the TimerValue
register, it should be signed extended as described by the equation

   CompareValue = (Counter[63:0] + SignExtend(TimerValue))[63:0]

Signed-off-by: Jeff Kubascik 
---
 xen/arch/arm/vtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 21b98ec20a..872181d9b6 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -211,7 +211,7 @@ static bool vtimer_cntp_tval(struct cpu_user_regs *regs, 
uint32_t *r,
 }
 else
 {
-v->arch.phys_timer.cval = cntpct + *r;
+v->arch.phys_timer.cval = cntpct + (uint64_t)(int32_t)*r;
 if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE )
 {
 v->arch.phys_timer.ctl &= ~CNTx_CTL_PENDING;
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/2] xen/arm: remove physical timer offset

2019-12-11 Thread Jeff Kubascik

The physical timer traps apply an offset so that time starts at 0 for
the guest. However, this offset is not currently applied to the physical
counter. Per the ARMv8 Reference Manual (ARM DDI 0487E.a), section
D11.2.4 Timers, the "Offset" between the counter and timer should be
zero for a physical timer. This removes the offset to make the timer and
counter consistent.

This also cleans up the physical timer implementation to better match
the virtual timer - both cval's now hold the hardware value.

Signed-off-by: Jeff Kubascik 
---
 xen/arch/arm/vtimer.c| 34 ++
 xen/include/asm-arm/domain.h |  3 ---
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index e6aebdac9e..21b98ec20a 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -62,7 +62,6 @@ static void virt_timer_expired(void *data)
 
 int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config)
 {
-d->arch.phys_timer_base.offset = NOW();
 d->arch.virt_timer_base.offset = READ_SYSREG64(CNTPCT_EL0);
 d->time_offset_seconds = ticks_to_ns(d->arch.virt_timer_base.offset - 
boot_count);
 do_div(d->time_offset_seconds, 10);
@@ -108,7 +107,6 @@ int vcpu_vtimer_init(struct vcpu *v)
 
 init_timer(>timer, phys_timer_expired, t, v->processor);
 t->ctl = 0;
-t->cval = NOW();
 t->irq = d0
 ? timer_get_irq(TIMER_PHYS_NONSECURE_PPI)
 : GUEST_TIMER_PHYS_NS_PPI;
@@ -167,6 +165,7 @@ void virt_timer_restore(struct vcpu *v)
 static bool vtimer_cntp_ctl(struct cpu_user_regs *regs, uint32_t *r, bool read)
 {
 struct vcpu *v = current;
+s_time_t expires;
 
 if ( !ACCESS_ALLOWED(regs, EL0PTEN) )
 return false;
@@ -184,8 +183,9 @@ static bool vtimer_cntp_ctl(struct cpu_user_regs *regs, 
uint32_t *r, bool read)
 
 if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE )
 {
-set_timer(>arch.phys_timer.timer,
-  v->arch.phys_timer.cval + 
v->domain->arch.phys_timer_base.offset);
+expires = v->arch.phys_timer.cval > boot_count
+  ? ticks_to_ns(v->arch.phys_timer.cval - boot_count) : 0;
+set_timer(>arch.phys_timer.timer, expires);
 }
 else
 stop_timer(>arch.phys_timer.timer);
@@ -197,26 +197,27 @@ static bool vtimer_cntp_tval(struct cpu_user_regs *regs, 
uint32_t *r,
  bool read)
 {
 struct vcpu *v = current;
-s_time_t now;
+uint64_t cntpct;
+s_time_t expires;
 
 if ( !ACCESS_ALLOWED(regs, EL0PTEN) )
 return false;
 
-now = NOW() - v->domain->arch.phys_timer_base.offset;
+cntpct = get_cycles();
 
 if ( read )
 {
-*r = (uint32_t)(ns_to_ticks(v->arch.phys_timer.cval - now) & 
0xull);
+*r = (uint32_t)((v->arch.phys_timer.cval - cntpct) & 0xull);
 }
 else
 {
-v->arch.phys_timer.cval = now + ticks_to_ns(*r);
+v->arch.phys_timer.cval = cntpct + *r;
 if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE )
 {
 v->arch.phys_timer.ctl &= ~CNTx_CTL_PENDING;
-set_timer(>arch.phys_timer.timer,
-  v->arch.phys_timer.cval +
-  v->domain->arch.phys_timer_base.offset);
+expires = v->arch.phys_timer.cval > boot_count
+  ? ticks_to_ns(v->arch.phys_timer.cval - boot_count) : 0;
+set_timer(>arch.phys_timer.timer, expires);
 }
 }
 return true;
@@ -226,23 +227,24 @@ static bool vtimer_cntp_cval(struct cpu_user_regs *regs, 
uint64_t *r,
  bool read)
 {
 struct vcpu *v = current;
+s_time_t expires;
 
 if ( !ACCESS_ALLOWED(regs, EL0PTEN) )
 return false;
 
 if ( read )
 {
-*r = ns_to_ticks(v->arch.phys_timer.cval);
+*r = v->arch.phys_timer.cval;
 }
 else
 {
-v->arch.phys_timer.cval = ticks_to_ns(*r);
+v->arch.phys_timer.cval = *r;
 if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE )
 {
 v->arch.phys_timer.ctl &= ~CNTx_CTL_PENDING;
-set_timer(>arch.phys_timer.timer,
-  v->arch.phys_timer.cval +
-  v->domain->arch.phys_timer_base.offset);
+expires = v->arch.phys_timer.cval > boot_count
+  ? ticks_to_ns(v->arch.phys_timer.cval - boot_count) : 0;
+set_timer(>arch.phys_timer.timer, expires);
 }
 }
 return true;
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index f3f3fb7d7f..adc7fe7210 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -65,9 +65,6 @@ struct arch_domain
 RELMEM_done,
 } relmem;
 
-struct {
-uint64_t offset;
-} phys_timer_base;
 struct {
 uint64_t offset;
 } virt_timer_base;
-- 
2.17.1

[Xen-devel] [PATCH v3 0/2] xen/arm: physical timer improvements

2019-12-11 Thread Jeff Kubascik

This patch set improves the emulation of the physical timer by removing the
physical timer offset and sign extend the TimerValue to better match the
behavior described in the ARMv8 Reference Manual (ARM DDI 0487E.a), section
D11.2.4.

Changes in v2:
- Update commit message to specify reference manual version and section
- Change physical timer cval to hold hardware value
- Make sure to sign extend TimerValue on writes. This was done by first
  casting the r pointer to (int32_t *), dereferencing it, then casting
  to uint64_t. Please let me know if there is a more correct way to do
  this

Changes in v3:
- Split TimerValue sign extension fix into separate patch
- Update commit message to mention physical timer cleanup
- Removed physical timer cval initialization line
- Changed TimerValue sign extension to (uint64_t)(int32_t)*r
- Account for condition where cval < boot_count

Jeff Kubascik (2):
  xen/arm: remove physical timer offset
  xen/arm: sign extend writes to TimerValue

 xen/arch/arm/vtimer.c| 34 ++
 xen/include/asm-arm/domain.h |  3 ---
 2 files changed, 18 insertions(+), 19 deletions(-)

-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86emul: correct LFS et al handling for 64-bit mode

2019-12-11 Thread Andrew Cooper

On 11/12/2019 09:28, Jan Beulich wrote:
> AMD and friends explicitly specify that 64-bit operands aren't possible
> for these insns. Nevertheless REX.W isn't fully ignored: It still
> cancels a possible operand size override (0x66). Intel otoh explicitly
> provides for 64-bit operands on the respective insn page of the SDM.
>
> Signed-off-by: Jan Beulich 

It is definitely more than just these.  Near jumps have per-vendor
behaviour on how long the instruction is, whereas far jump/calls are in
the same category as these by the looks of things.

~Andrew

>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2640,6 +2640,15 @@ x86_decode_twobyte(
>  }
>  break;
>  
> +case 0xb2: /* lss */
> +case 0xb4: /* lfs */
> +case 0xb5: /* lgs */
> +/* REX.W ignored on a vendor-dependent basis. */
> +if ( op_bytes == 8 &&
> + (ctxt->cpuid->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) 
> )
> +op_bytes = 4;
> +break;
> +
>  case 0xb8: /* jmpe / popcnt */
>  if ( rep_prefix() )
>  ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86emul: correct segment override decode for 64-bit mode

2019-12-11 Thread Andrew Cooper

On 11/12/2019 09:27, Jan Beulich wrote:
> The legacy / compatibility mode ES, CS, SS, and DS overrides are null
> prefixes in 64-bit mode, i.e. they in particular don't cancel an
> earlier FS or GS one.
>
> Signed-off-by: Jan Beulich 

null is a very overloaded term.  What you mean here is simply "ignored".

In attempting to confirm/test this, I've found yet another curiosity
with instruction length calculations when reordering a rex prefix and
legacy prefix.  Objdump gets it wrong, but the instruction boundaries
according to singlestep are weird.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 11/24] compat_ioctl: block: handle cdrom compat ioctl in non-cdrom drivers

2019-12-11 Thread Arnd Bergmann

Various block drivers implement the CDROMMULTISESSION,
CDROM_GET_CAPABILITY, and CDROMEJECT ioctl commands, relying on the
block layer to handle compat_ioctl mode for them.

Move this into the drivers directly as a preparation for simplifying
the block layer later.

Since some of these commands need a compat_ptr() conversion,
introduce a blkdev_compat_ptr_ioctl() helper function that
can be used as the .compat_ioctl callback for those drivers
that only support compatible commands.

The actual CD-ROM drivers that call cdrom_ioctl() are
converted in a separate patch.

Signed-off-by: Arnd Bergmann 
---
 block/ioctl.c| 21 +
 drivers/block/floppy.c   |  3 +++
 drivers/block/paride/pd.c|  1 +
 drivers/block/paride/pf.c|  1 +
 drivers/block/sunvdc.c   |  1 +
 drivers/block/xen-blkfront.c |  1 +
 include/linux/blkdev.h   |  7 +++
 7 files changed, 35 insertions(+)

diff --git a/block/ioctl.c b/block/ioctl.c
index 5de98b97af2a..e728331d1a5b 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -285,6 +286,26 @@ int __blkdev_driver_ioctl(struct block_device *bdev, 
fmode_t mode,
  */
 EXPORT_SYMBOL_GPL(__blkdev_driver_ioctl);
 
+#ifdef CONFIG_COMPAT
+/*
+ * This is the equivalent of compat_ptr_ioctl(), to be used by block
+ * drivers that implement only commands that are completely compatible
+ * between 32-bit and 64-bit user space
+ */
+int blkdev_compat_ptr_ioctl(struct block_device *bdev, fmode_t mode,
+   unsigned cmd, unsigned long arg)
+{
+   struct gendisk *disk = bdev->bd_disk;
+
+   if (disk->fops->ioctl)
+   return disk->fops->ioctl(bdev, mode, cmd,
+(unsigned long)compat_ptr(arg));
+
+   return -ENOIOCTLCMD;
+}
+EXPORT_SYMBOL(blkdev_compat_ptr_ioctl);
+#endif
+
 static int blkdev_pr_register(struct block_device *bdev,
struct pr_registration __user *arg)
 {
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 485865fd0412..cd3612e4e2e1 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3879,6 +3879,9 @@ static int fd_compat_ioctl(struct block_device *bdev, 
fmode_t mode, unsigned int
 {
int drive = (long)bdev->bd_disk->private_data;
switch (cmd) {
+   case CDROMEJECT: /* CD-ROM eject */
+   case 0x6470: /* SunOS floppy eject */
+
case FDMSGON:
case FDMSGOFF:
case FDSETEMSGTRESH:
diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c
index 6f9ad3fc716f..c0967507d085 100644
--- a/drivers/block/paride/pd.c
+++ b/drivers/block/paride/pd.c
@@ -874,6 +874,7 @@ static const struct block_device_operations pd_fops = {
.open   = pd_open,
.release= pd_release,
.ioctl  = pd_ioctl,
+   .compat_ioctl   = pd_ioctl,
.getgeo = pd_getgeo,
.check_events   = pd_check_events,
.revalidate_disk= pd_revalidate
diff --git a/drivers/block/paride/pf.c b/drivers/block/paride/pf.c
index 6b7d4cab3687..bb09f21ce21a 100644
--- a/drivers/block/paride/pf.c
+++ b/drivers/block/paride/pf.c
@@ -276,6 +276,7 @@ static const struct block_device_operations pf_fops = {
.open   = pf_open,
.release= pf_release,
.ioctl  = pf_ioctl,
+   .compat_ioctl   = pf_ioctl,
.getgeo = pf_getgeo,
.check_events   = pf_check_events,
 };
diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index 571612e233fe..39aeebc6837d 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -171,6 +171,7 @@ static const struct block_device_operations vdc_fops = {
.owner  = THIS_MODULE,
.getgeo = vdc_getgeo,
.ioctl  = vdc_ioctl,
+   .compat_ioctl   = blkdev_compat_ptr_ioctl,
 };
 
 static void vdc_blk_queue_start(struct vdc_port *port)
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a74d03913822..23c86350a5ab 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2632,6 +2632,7 @@ static const struct block_device_operations 
xlvbd_block_fops =
.release = blkif_release,
.getgeo = blkif_getgeo,
.ioctl = blkif_ioctl,
+   .compat_ioctl = blkdev_compat_ptr_ioctl,
 };
 
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 47eb22a3b7f9..3e0408618da7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1711,6 +1711,13 @@ struct block_device_operations {
const struct pr_ops *pr_ops;
 };
 
+#ifdef CONFIG_COMPAT
+extern int blkdev_compat_ptr_ioctl(struct block_device *, fmode_t,
+ unsigned int, unsigned long);
+#else
+#define blkdev_compat_ptr_ioctl NULL
+#endif
+
 extern int __blkdev_driver_ioctl(struct block_device *, fmode_t, unsigned int,

[Xen-devel] [PATCH 00/24] block, scsi: final compat_ioctl cleanup

2019-12-11 Thread Arnd Bergmann

Hi Jens, James and Martin,

This series concludes the work I did for linux-5.5 on the compat_ioctl()
cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
everything into drivers.

Overall this would be a reduction both in complexity and line count, but
as I'm also adding documentation the overall number of lines increases
in the end.

My plan was originally to keep the SCSI and block parts separate.
This did not work easily because of interdependencies: I cannot
do the final SCSI cleanup in a good way without first addressing the
CDROM ioctls, so this is one series that I hope could be merged through
either the block or the scsi git trees, or possibly both if you can
pull in the same branch.

The series comes in these steps:

1. clean up the sg v3 interface as suggested by Linus. I have
   talked about this with Doug Gilbert as well, and he would
   rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
   compat read/write interface"

2. Four patches for missing block compat_ioctl handlers, to be
   backported into stable kernels. Separate patches because they
   are needed in different stable versions.

3. Actually moving handlers out of block/compat_ioctl.c and
   block/scsi_ioctl.c into drivers, mixed in with cleanup
   patches

4. Document how to do this right. I keep getting asked about this,
   and it helps to point to some documentation file.

The series is avaialable for testing at [1].

   Arnd

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=compat-ioctl-endgame

Arnd Bergmann (24):
  compat: ARM64: always include asm-generic/compat.h
  compat: scsi: sg: fix v3 compat read/write interface
  compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE
  compat_ioctl: block: handle BLKGETZONESZ/BLKGETNRZONES
  compat_ioctl: block: handle add zone open, close and finish ioctl
  compat_ioctl: block: handle Persistent Reservations
  compaT_ioctl: ubd, aoe: use blkdev_compat_ptr_ioctl
  compat_ioctl: move CDROM_SEND_PACKET handling into scsi
  compat_ioctl: move CDROMREADADIO to cdrom.c
  compat_ioctl: cdrom: handle CDROM_LAST_WRITTEN
  compat_ioctl: block: handle cdrom compat ioctl in non-cdrom drivers
  compat_ioctl: add scsi_compat_ioctl
  compat_ioctl: bsg: add handler
  compat_ioctl: ide: floppy: add handler
  compat_ioctl: scsi: move ioctl handling into drivers
  compat_ioctl: move sys_compat_ioctl() to ioctl.c
  compat_ioctl: simplify the implementation
  compat_ioctl: move cdrom commands into cdrom.c
  compat_ioctl: scsi: handle HDIO commands from drivers
  compat_ioctl: move HDIO ioctl handling into drivers/ide
  compat_ioctl: block: move blkdev_compat_ioctl() into ioctl.c
  compat_ioctl: block: simplify compat_blkpg_ioctl()
  compat_ioctl: simplify up block/ioctl.c
  Documentation: document ioctl interfaces better

 Documentation/core-api/index.rst   |   1 +
 Documentation/core-api/ioctl.rst   | 250 +++
 arch/arm64/include/asm/compat.h|   5 +-
 arch/um/drivers/ubd_kern.c |   1 +
 block/Makefile |   1 -
 block/bsg.c|   1 +
 block/compat_ioctl.c   | 411 -
 block/ioctl.c  | 319 +++
 block/scsi_ioctl.c | 214 -
 drivers/ata/libata-scsi.c  |   9 +
 drivers/block/aoe/aoeblk.c |   1 +
 drivers/block/floppy.c |   3 +
 drivers/block/paride/pcd.c |   3 +
 drivers/block/paride/pd.c  |   1 +
 drivers/block/paride/pf.c  |   1 +
 drivers/block/pktcdvd.c|  26 +-
 drivers/block/sunvdc.c |   1 +
 drivers/block/virtio_blk.c |   3 +
 drivers/block/xen-blkfront.c   |   1 +
 drivers/cdrom/cdrom.c  |  35 ++-
 drivers/cdrom/gdrom.c  |   3 +
 drivers/ide/ide-cd.c   |  40 +++
 drivers/ide/ide-disk.c |   3 +
 drivers/ide/ide-floppy.c   |   4 +
 drivers/ide/ide-floppy.h   |   2 +
 drivers/ide/ide-floppy_ioctl.c |  35 +++
 drivers/ide/ide-gd.c   |  14 +
 drivers/ide/ide-ioctls.c   |  47 ++-
 drivers/ide/ide-tape.c |  14 +
 drivers/scsi/aic94xx/aic94xx_init.c|   3 +
 drivers/scsi/ch.c  |   9 +-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |   3 +
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |   3 +
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |   3 +
 drivers/scsi/ipr.c |   3 +
 drivers/scsi/isci/init.c   |   3 +
 drivers/scsi/mvsas/mv_init.c   |   3 +
 drivers/scsi/pm8001/pm8001_init.c  |   3 +
 drivers/scsi/scsi_ioctl.c  |  54 +++-
 drivers/scsi/sd.c  |  50 ++-
 drivers/scsi/sg.c  | 169 +-
 drivers/scsi/sr.c  |  53 +++-
 drivers/scsi/st.c  |  51 +--

Re: [Xen-devel] [PATCH] x86: store cr4 during suspend/resume

2019-12-11 Thread Andrew Cooper

On 11/12/2019 19:36, Andrew Cooper wrote:
> Absolutely nothing remaining in suspend.c should be spilled.  It can all
> be (re)caluclated from the same information source as the AP boot path,
> and the result will be strictly smaller in RAM, and more robust.

And at a cursory glance, the logic in suspend.c doesn't correctly handle
the ler=1 case.  This can trivially be fixed by:

diff --git a/xen/arch/x86/acpi/suspend.c b/xen/arch/x86/acpi/suspend.c
index c9dea67bf3..50dbb14528 100644
--- a/xen/arch/x86/acpi/suspend.c
+++ b/xen/arch/x86/acpi/suspend.c
@@ -15,8 +15,6 @@
 #include 
 #include 
 
-static unsigned long saved_lstar, saved_cstar;
-static unsigned long saved_sysenter_esp, saved_sysenter_eip;
 static unsigned long saved_fs_base, saved_gs_base, saved_kernel_gs_base;
 static uint64_t saved_xcr0;
 
@@ -24,15 +22,6 @@ void save_rest_processor_state(void)
 {
 saved_fs_base = rdfsbase();
 saved_gs_base = rdgsbase();
-    rdmsrl(MSR_SHADOW_GS_BASE, saved_kernel_gs_base);
-    rdmsrl(MSR_CSTAR, saved_cstar);
-    rdmsrl(MSR_LSTAR, saved_lstar);
-
-    if ( cpu_has_sep )
-    {
-    rdmsrl(MSR_IA32_SYSENTER_ESP, saved_sysenter_esp);
-    rdmsrl(MSR_IA32_SYSENTER_EIP, saved_sysenter_eip);
-    }
 
 if ( cpu_has_xsave )
 saved_xcr0 = get_xcr0();
@@ -42,25 +31,12 @@ void save_rest_processor_state(void)
 void restore_rest_processor_state(void)
 {
 load_system_tables();
-
-    /* Recover syscall MSRs */
-    wrmsrl(MSR_LSTAR, saved_lstar);
-    wrmsrl(MSR_CSTAR, saved_cstar);
-    wrmsrl(MSR_STAR, XEN_MSR_STAR);
-    wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
+    percpu_traps_init();
 
 wrfsbase(saved_fs_base);
 wrgsbase(saved_gs_base);
 wrmsrl(MSR_SHADOW_GS_BASE, saved_kernel_gs_base);
 
-    if ( cpu_has_sep )
-    {
-    /* Recover sysenter MSRs */
-    wrmsrl(MSR_IA32_SYSENTER_ESP, saved_sysenter_esp);
-    wrmsrl(MSR_IA32_SYSENTER_EIP, saved_sysenter_eip);
-    wrmsr(MSR_IA32_SYSENTER_CS, __HYPERVISOR_CS, 0);
-    }
-
 if ( cpu_has_xsave && !set_xcr0(saved_xcr0) )
 BUG();
 

for starters.

The fs/gs/shadow values are guest-only state so can just be discarded.

xcr0 isn't needed but this code is latently(soon-to-be?) buggy by not
handling XSS at the same time.  Both should follow the BSP logic for
evaluating the system default.

That lets us delete save_rest_processor_state() entirely.

In the assembly side of things, ss and cr3 obviously don't need to be
spilled.  ss is always 0, and we're already on the correct pagetable. 
We're also probably on the correct cr0.

The caller-clobbered GPRs can just be discarded, and that gets our
stashed state down to almost nothing.

I trust I have made my point blindingly obvious.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/arm: remove physical timer offset

2019-12-11 Thread Jeff Kubascik

On 12/5/2019 3:28 PM, Julien Grall wrote:
> Hi,
> 
> On 05/12/2019 19:17, Jeff Kubascik wrote:
>> On 12/3/2019 1:04 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 26/11/2019 21:13, Jeff Kubascik wrote:
 The physical timer traps apply an offset so that time starts at 0 for
 the guest. However, this offset is not currently applied to the physical
 counter. Per the ARMv8 Reference Manual (ARM DDI 0487E.a), section
 D11.2.4 Timers, the "Offset" between the counter and timer should be
 zero for a physical timer. This removes the offset to make the timer and
 counter consistent.

 Furthermore, section D11.2.4 specifies that the values in the TimerValue
 view of the timers are signed in standard two's complement form. When
 writing to the TimerValue register, it should be signed extended as
 described by the equation

 CompareValue = (Counter[63:0] + SignExtend(TimerValue))[63:0]
>>>
>>> I am a bit confused, is it a new bug introduced by the change or
>>> previously existing? If the latter, then I think this should be modified
>>> in a separate patch.
>>
>> This would be a previously existing bug - a quirk in the timer design that
>> wasn't emulated correctly before. I can break this out into a separate patch.
> 
> It would be great if you can split it. Thank you!
> 
> [...]
> 
> 
 @@ -185,7 +184,7 @@ static bool vtimer_cntp_ctl(struct cpu_user_regs 
 *regs, uint32_t *r, bool read)
if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE )
{
set_timer(>arch.phys_timer.timer,
 -  v->arch.phys_timer.cval + 
 v->domain->arch.phys_timer_base.offset);
 +  ticks_to_ns(v->arch.phys_timer.cval - boot_count));
>>>
>>> cval may be smaller than boot_count. In that case, we will set the timer
>>> to expire a very long time. This is not the expected behavior from the
>>> guest.
>>>
>>> Instead, we should either use 0 to create the timer or call
>>> phys_timer_expired directly.
>>
>> I disagree - if you set cval to a value smaller than boot_count, you are 
>> setting
>> cval to a value less than the physical counter value. This would result in 
>> the
>> timer having a long expiration time.
> 
> boot_count refers to when Xen began to boot, not the start of the
> physical counter. If you look at the condition to fire the timer (see
> below), then it means the timer will fire right now because the physical
> counter is past CompareValue (cval).
> 
> TimerConditionMet = (((Counter[63:0] – Offset[63:0])[63:0] -
> CompareValue[63:0]) >= 0)

This makes sense now - I wasn't considering the greater than equals condition.

> We only subtract boot_count here as the timer subsystem expects a
> relative number of nanoseconds from when Xen booted.

Makes sense.

>>
>> However, set_timer expects a signed 64 bit value in ns. The conversion of 
>> cval
>> (unsigned 64 bit) from ticks to ns is going to overflow this. I'm not sure 
>> what
>> would be the best way to work around this limitation. At the very least, I 
>> think
>> we should print a warning message.
> 
> A warning message in emulation is definitely not the right solution. If
> a user asks something that is valid from the spec PoV then we should
> implement it correctly. The more that I don't think boot_count store
> what you expect (see above).
> 
> But we definitely can't allow the caller of ticks_to_ns() to pass a
> negative value as argument because (cval - boot_count) may be over 2^63
> for instance if the user requests a timer to be set in a million of year
> (I didn't do the math!).

Assuming 100MHz timer frequency, the math works out to be about 5,850 years,
give or take. I'm assuming we don't need to worry about rollover conditions?

> Cheers,
> 
> --
> Julien Grall
> 

Sincerely,
Jeff Kubascik

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: store cr4 during suspend/resume

2019-12-11 Thread Andrew Cooper

On 10/12/2019 10:07, Jan Beulich wrote:
> On 10.12.2019 10:59, Andrew Cooper wrote:
>> On 09/12/2019 18:06, Roger Pau Monne wrote:
>>> Currently cr4 is not cached before suspension, and mmu_cr4_features is
>>> used in order to restore the expected cr4 value. This is correct so
>>> far because the tasklet that executes the suspend/resume code is
>>> running in the idle vCPU context.
>>>
>>> In order to make the code less fragile, explicitly save the current
>>> cr4 value before suspension, so that it can be restored afterwards.
>>> This ensures that the cr4 value cached in the cpu_info doesn't get out
>>> of sync after resume from suspension.
>>>
>>> Suggested-by: Jan Beulich 
>>> Signed-off-by: Roger Pau Monné 
>> Why?  There is nothing fragile here.  Suspend/resume is always in idle
>> context and loads of other logic already depends on this.
>>
>> I've been slowly stripping out redundant saved state like this.
> Where it's clearly redundant, this is fine. But I don't think it's
> sufficiently clear here

There is a reason I made it explicitly crystal clear with c/s 87e7b4d5b

> , and going back to what was there before
> is imo generally less error prone than going to some fixed state.

It is demonstrably more error prone, which is why I'm slowly killing it.

Stashing state wastes unnecessary space, and adds an error case where
state is either stashed incorrectly, or gets modified before restore,
and we'll blindly use.

Two examples of real bugs caused by this are c/s 0c30171cb and 4ee0ad72d

Absolutely nothing remaining in suspend.c should be spilled.  It can all
be (re)caluclated from the same information source as the AP boot path,
and the result will be strictly smaller in RAM, and more robust.

> Furthermore I was hoping we could eventually do away with
> mmu_cr4_features.

How do you plan on doing this?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 144718: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144718 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144718/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 144637
 build-i386-xsm6 xen-buildfail REGR. vs. 144637
 build-amd64-xsm   6 xen-buildfail REGR. vs. 144637
 build-i3866 xen-buildfail REGR. vs. 144637

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
baseline version:
 ovmf 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0

Last test of basis   144637  2019-12-09 09:09:49 Z2 days
Failing since144646  2019-12-10 01:39:53 Z1 days   10 attempts
Testing same since   144713  2019-12-11 12:09:19 Z0 days2 attempts


People who touched revisions under test:
  Antoine Coeur 
  Ard Biesheuvel 
  Bob Feng 
  Jiewen Yao 
  Michael Kubacki 
  Philippe Mathieu-Daude 
  Steven Shi 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
Author: Ard Biesheuvel 
Date:   Tue Mar 5 14:32:48 2019 +0100

ArmPkg/MmCommunicationDxe: relay architected PI events to MM context

PI defines a few architected events that have significance in the MM
context as well as in the non-secure DXE context. So register notify
handlers for these events, and relay them into the standalone MM world.

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Jiewen Yao 
Reviewed-by: Achin Gupta 

commit d3add11e87dace180387562d6f1951f2bffbd3d9
Author: Michael Kubacki 
Date:   Wed Nov 20 17:31:24 2019 -0800

MdeModulePkg PeiCore: Improve comment semantics

This patch clarifies wording in several PeiCore comments to improve
reading comprehension.

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Jian J Wang 

commit d39d1260c615b716675f67f5c4e1f4f52df01dad
Author: Michael Kubacki 
Date:   Wed Nov 20 17:10:48 2019 -0800

MdeModulePkg PeiCore: Fix typos

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Philippe Mathieu-Daude 
Reviewed-by: Jian J Wang 

commit 97eedf5dfbaffde33210fd88066247cf0b7d3325
Author: Antoine Coeur 
Date:   Wed Dec 4 12:14:53 2019 +0800

IntelFsp2WrapperPkg: Fix various typos

Fix various typos in comments and documentation.

Cc: Chasel Chiu 
Cc: Nate DeSimone 
Cc: Star Zeng 
Reviewed-by: Philippe Mathieu-Daude 
Signed-off-by: Philippe Mathieu-Daude 
Reviewed-by: Nate DeSimone 
Reviewed-by: Chasel Chiu 
Reviewed-by: Star Zeng 

commit 7e55cf6b48dcd43de46d008b2f12caaad2554503
Author: Jiewen Yao 
Date:   Sat Dec 7 21:41:10 2019 +0800

SecurityPkg/Tcg2Smm: Measure the table before patch.

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1940

According to TCG PFP specification: the ACPI table must be
measured prior to any modification, and the measurement
must

[Xen-devel] [xen-4.13-testing test] 144708: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144708 xen-4.13-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144708/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-shadow   18 guest-localmigrate/x10   fail REGR. vs. 144673

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 144673
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 144673
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass

version targeted for testing:
 xen  fd31193058be072131316c2e0c19ab8643e676c6
baseline version:
 xen  b0f0bbca95bd532212fb1956f3e23d1ab13a53cf

Last test of basis   144673  2019-12-10 19:07:50 Z0 days
Testing same since   144708  2019-12-11 11:38:22 Z0 days1 attempts


People who touched

[Xen-devel] [PATCH v7 1/3] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread SeongJae Park

Granting pages consumes backend system memory.  In systems configured
with insufficient spare memory for those pages, it can cause a memory
pressure situation.  However, finding the optimal amount of the spare
memory is challenging for large systems having dynamic resource
utilization patterns.  Also, such a static configuration might lack
flexibility.

To mitigate such problems, this commit adds a memory reclaim callback to
'xenbus_driver'.  If a memory pressure is detected, 'xenbus' requests
every backend driver to volunarily release its memory.

Note that it would be able to improve the callback facility for more
sophisticated handlings of general pressures.  For example, it would be
possible to monitor the memory consumption of each device and issue the
release requests to only devices which causing the pressure.  Also, the
callback could be extended to handle not only memory, but general
resources.  Nevertheless, this version of the implementation defers such
sophisticated goals as a future work.

Reviewed-by: Juergen Gross 
Signed-off-by: SeongJae Park 
---
 drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
 include/xen/xenbus.h  |  1 +
 2 files changed, 33 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c 
b/drivers/xen/xenbus/xenbus_probe_backend.c
index b0bed4faf44c..7e78ebef7c54 100644
--- a/drivers/xen/xenbus/xenbus_probe_backend.c
+++ b/drivers/xen/xenbus/xenbus_probe_backend.c
@@ -248,6 +248,35 @@ static int backend_probe_and_watch(struct notifier_block 
*notifier,
return NOTIFY_DONE;
 }
 
+static int backend_reclaim_memory(struct device *dev, void *data)
+{
+   const struct xenbus_driver *drv;
+
+   if (!dev->driver)
+   return 0;
+   drv = to_xenbus_driver(dev->driver);
+   if (drv && drv->reclaim_memory)
+   drv->reclaim_memory(to_xenbus_device(dev));
+   return 0;
+}
+
+/*
+ * Returns 0 always because we are using shrinker to only detect memory
+ * pressure.
+ */
+static unsigned long backend_shrink_memory_count(struct shrinker *shrinker,
+   struct shrink_control *sc)
+{
+   bus_for_each_dev(_backend.bus, NULL, NULL,
+   backend_reclaim_memory);
+   return 0;
+}
+
+static struct shrinker backend_memory_shrinker = {
+   .count_objects = backend_shrink_memory_count,
+   .seeks = DEFAULT_SEEKS,
+};
+
 static int __init xenbus_probe_backend_init(void)
 {
static struct notifier_block xenstore_notifier = {
@@ -264,6 +293,9 @@ static int __init xenbus_probe_backend_init(void)
 
register_xenstore_notifier(_notifier);
 
+   if (register_shrinker(_memory_shrinker))
+   pr_warn("shrinker registration failed\n");
+
return 0;
 }
 subsys_initcall(xenbus_probe_backend_init);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816d5f8c..c861cfb6f720 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -104,6 +104,7 @@ struct xenbus_driver {
struct device_driver driver;
int (*read_otherend_details)(struct xenbus_device *dev);
int (*is_ready)(struct xenbus_device *dev);
+   void (*reclaim_memory)(struct xenbus_device *dev);
 };
 
 static inline struct xenbus_driver *to_xenbus_driver(struct device_driver *drv)
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 0/2] xenbus/backend: Add a memory pressure handler callback

2019-12-11 Thread SeongJae Park

Granting pages consumes backend system memory.  In systems configured
with insufficient spare memory for those pages, it can cause a memory
pressure situation.  However, finding the optimal amount of the spare
memory is challenging for large systems having dynamic resource
utilization patterns.  Also, such a static configuration might lack
flexibility.

To mitigate such problems, this patchset adds a memory reclaim callback
to 'xenbus_driver' (patch 1) and use it to mitigate the problem in
'xen-blkback' (patch 2).  The third patch is a trivial cleanup of
variable names.

Base Version


This patch is based on v5.4.  A complete tree is also available at my
public git repo:
https://github.com/sjp38/linux/tree/blkback_squeezing_v7


Patch History
-

Changes from v6
(https://lore.kernel.org/linux-block/20191211042428.5961-1-sjp...@amazon.de/)
 - Remove more unnecessary prefixes (suggested by Roger Pau Monné)
 - Constify a variable (suggested by Roger Pau Monné)
 - Rename 'reclaim' into 'reclaim_memory' (suggested by Roger Pau Monné)
 - More wordsmith of the commit message (suggested by Roger Pau Monné)

Changes from v5
(https://lore.kernel.org/linux-block/20191210080628.5264-1-sjp...@amazon.de/)
 - Wordsmith the commit messages (suggested by Roger Pau Monné)
 - Change the reclaim callback return type (suggested by Roger Pau Monné)
 - Change the type of the blkback squeeze duration variable
   (suggested by Roger Pau Monné)
 - Add a patch for removal of unnecessary static variable name prefixes
   (suggested by Roger Pau Monné)
 - Fix checkpatch.pl warnings

Changes from v4
(https://lore.kernel.org/xen-devel/20191209194305.20828-1-sjp...@amazon.com/)
 - Remove domain id parameter from the callback (suggested by Juergen Gross)
 - Rename xen-blkback module parameter (suggested by Stefan Nuernburger)

Changes from v3
(https://lore.kernel.org/xen-devel/20191209085839.21215-1-sjp...@amazon.com/)
 - Add general callback in xen_driver and use it (suggested by Juergen Gross)

Changes from v2
(https://lore.kernel.org/linux-block/af195033-23d5-38ed-b73b-f6e2e3b34...@amazon.com)
 - Rename the module parameter and variables for brevity
   (aggressive shrinking -> squeezing)

Changes from v1
(https://lore.kernel.org/xen-devel/20191204113419.2298-1-sjp...@amazon.com/)
 - Adjust the description to not use the term, `arbitrarily`
   (suggested by Paul Durrant)
 - Specify time unit of the duration in the parameter description,
   (suggested by Maximilian Heyne)
 - Change default aggressive shrinking duration from 1ms to 10ms
 - Merge two patches into one single patch

SeongJae Park (2):
  xenbus/backend: Add memory pressure handler callback
  xen/blkback: Squeeze page pools if a memory pressure is detected

 drivers/block/xen-blkback/blkback.c   | 23 +++--
 drivers/block/xen-blkback/common.h|  1 +
 drivers/block/xen-blkback/xenbus.c|  3 ++-
 drivers/xen/xenbus/xenbus_probe_backend.c | 31 +++
 include/xen/xenbus.h  |  1 +
 5 files changed, 56 insertions(+), 3 deletions(-)

-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 3/3] xen/blkback: Remove unnecessary static variable name prefixes

2019-12-11 Thread SeongJae Park

A few of static variables in blkback have 'xen_blkif_' prefix, though it
is unnecessary for static variables.  This commit removes such prefixes.

Reviewed-by: Roger Pau Monné 
Signed-off-by: SeongJae Park 
---
 drivers/block/xen-blkback/blkback.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c 
b/drivers/block/xen-blkback/blkback.c
index 98823d150905..f41c698dd854 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -62,8 +62,8 @@
  * IO workloads.
  */
 
-static int xen_blkif_max_buffer_pages = 1024;
-module_param_named(max_buffer_pages, xen_blkif_max_buffer_pages, int, 0644);
+static int max_buffer_pages = 1024;
+module_param_named(max_buffer_pages, max_buffer_pages, int, 0644);
 MODULE_PARM_DESC(max_buffer_pages,
 "Maximum number of free pages to keep in each block backend buffer");
 
@@ -78,8 +78,8 @@ MODULE_PARM_DESC(max_buffer_pages,
  * algorithm.
  */
 
-static int xen_blkif_max_pgrants = 1056;
-module_param_named(max_persistent_grants, xen_blkif_max_pgrants, int, 0644);
+static int max_pgrants = 1056;
+module_param_named(max_persistent_grants, max_pgrants, int, 0644);
 MODULE_PARM_DESC(max_persistent_grants,
  "Maximum number of grants to map persistently");
 
@@ -88,8 +88,8 @@ MODULE_PARM_DESC(max_persistent_grants,
  * use. The time is in seconds, 0 means indefinitely long.
  */
 
-static unsigned int xen_blkif_pgrant_timeout = 60;
-module_param_named(persistent_grant_unused_seconds, xen_blkif_pgrant_timeout,
+static unsigned int pgrant_timeout = 60;
+module_param_named(persistent_grant_unused_seconds, pgrant_timeout,
   uint, 0644);
 MODULE_PARM_DESC(persistent_grant_unused_seconds,
 "Time in seconds an unused persistent grant is allowed to "
@@ -137,9 +137,8 @@ module_param(log_stats, int, 0644);
 
 static inline bool persistent_gnt_timeout(struct persistent_gnt 
*persistent_gnt)
 {
-   return xen_blkif_pgrant_timeout &&
-  (jiffies - persistent_gnt->last_used >=
-   HZ * xen_blkif_pgrant_timeout);
+   return pgrant_timeout && (jiffies - persistent_gnt->last_used >=
+   HZ * pgrant_timeout);
 }
 
 /* Once a memory pressure is detected, squeeze free page pools for a while. */
@@ -249,7 +248,7 @@ static int add_persistent_gnt(struct xen_blkif_ring *ring,
struct persistent_gnt *this;
struct xen_blkif *blkif = ring->blkif;
 
-   if (ring->persistent_gnt_c >= xen_blkif_max_pgrants) {
+   if (ring->persistent_gnt_c >= max_pgrants) {
if (!blkif->vbd.overflow_max_grants)
blkif->vbd.overflow_max_grants = 1;
return -EBUSY;
@@ -412,14 +411,13 @@ static void purge_persistent_gnt(struct xen_blkif_ring 
*ring)
goto out;
}
 
-   if (ring->persistent_gnt_c < xen_blkif_max_pgrants ||
-   (ring->persistent_gnt_c == xen_blkif_max_pgrants &&
+   if (ring->persistent_gnt_c < max_pgrants ||
+   (ring->persistent_gnt_c == max_pgrants &&
!ring->blkif->vbd.overflow_max_grants)) {
num_clean = 0;
} else {
-   num_clean = (xen_blkif_max_pgrants / 100) * LRU_PERCENT_CLEAN;
-   num_clean = ring->persistent_gnt_c - xen_blkif_max_pgrants +
-   num_clean;
+   num_clean = (max_pgrants / 100) * LRU_PERCENT_CLEAN;
+   num_clean = ring->persistent_gnt_c - max_pgrants + num_clean;
num_clean = min(ring->persistent_gnt_c, num_clean);
pr_debug("Going to purge at least %u persistent grants\n",
 num_clean);
@@ -614,8 +612,7 @@ static void print_stats(struct xen_blkif_ring *ring)
 current->comm, ring->st_oo_req,
 ring->st_rd_req, ring->st_wr_req,
 ring->st_f_req, ring->st_ds_req,
-ring->persistent_gnt_c,
-xen_blkif_max_pgrants);
+ring->persistent_gnt_c, max_pgrants);
ring->st_print = jiffies + msecs_to_jiffies(10 * 1000);
ring->st_rd_req = 0;
ring->st_wr_req = 0;
@@ -675,7 +672,7 @@ int xen_blkif_schedule(void *arg)
if (time_before(jiffies, buffer_squeeze_end))
shrink_free_pagepool(ring, 0);
else
-   shrink_free_pagepool(ring, xen_blkif_max_buffer_pages);
+   shrink_free_pagepool(ring, max_buffer_pages);
 
if (log_stats && time_after(jiffies, ring->st_print))
print_stats(ring);
@@ -902,7 +899,7 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
continue;
}
if (use_persistent_gnts &&
-   ring->persistent_gnt_c < xen_blkif_max_pgrants) {
+   ring->persistent_gnt_c < max_pgrants) {

[Xen-devel] [PATCH v7 2/3] xen/blkback: Squeeze page pools if a memory pressure is detected

2019-12-11 Thread SeongJae Park

Each `blkif` has a free pages pool for the grant mapping.  The size of
the pool starts from zero and be increased on demand while processing
the I/O requests.  If current I/O requests handling is finished or 100
milliseconds has passed since last I/O requests handling, it checks and
shrinks the pool to not exceed the size limit, `max_buffer_pages`.

Therefore, host administrators can cause memory pressure in blkback by
attaching a large number of block devices and inducing I/O.  Such
problematic situations can be avoided by limiting the maximum number of
devices that can be attached, but finding the optimal limit is not so
easy.  Improper set of the limit can results in the memory pressure or a
resource underutilization.  This commit avoids such problematic
situations by squeezing the pools (returns every free page in the pool
to the system) for a while (users can set this duration via a module
parameter) if a memory pressure is detected.

Discussions
===

The `blkback`'s original shrinking mechanism returns only pages in the
pool, which are not currently be used by `blkback`, to the system.  In
other words, the pages that are not mapped with granted pages.  Because
this commit is changing only the shrink limit but still uses the same
freeing mechanism it does not touch pages which are currently mapping
grants.

Once a memory pressure is detected, this commit keeps the squeezing
limit for a user-specified time duration.  The duration should be
neither too long nor too short.  If it is too long, the squeezing
incurring overhead can reduce the I/O performance.  If it is too short,
`blkback` will not free enough pages to reduce the memory pressure.
This commit sets the value as `10 milliseconds` by default because it is
a short time in terms of I/O while it is a long time in terms of memory
operations.  Also, as the original shrinking mechanism works for at
least every 100 milliseconds, this could be a somewhat reasonable
choice.  I also tested other durations (refer to the below section for
more details) and confirmed that 10 milliseconds is the one that works
best with the test.  That said, the proper duration depends on actual
configurations and workloads.  That's why this commit allows users to
set the duration as a module parameter.

Memory Pressure Test


To show how this commit fixes the memory pressure situation well, I
configured a test environment on a xen-running virtualization system.
On the `blkfront` running guest instances, I attach a large number of
network-backed volume devices and induce I/O to those.  Meanwhile, I
measure the number of pages that swapped in (pswpin) and out (pswpout)
on the `blkback` running guest.  The test ran twice, once for the
`blkback` before this commit and once for that after this commit.  As
shown below, this commit has dramatically reduced the memory pressure:

pswpin  pswpout
before  76,672  185,799
after  2123,325

Optimal Aggressive Shrinking Duration
-

To find a best squeezing duration, I repeated the test with three
different durations (1ms, 10ms, and 100ms).  The results are as below:

durationpswpin  pswpout
1   852 6,424
10  212 3,325
100 203 3,340

As expected, the memory pressure has decreased as the duration is
increased, but the reduction stopped from the `10ms`.  Based on this
results, I chose the default duration as 10ms.

Performance Overhead Test
=

This commit could incur I/O performance degradation under severe memory
pressure because the squeezing will require more page allocations per
I/O.  To show the overhead, I artificially made a worst-case squeezing
situation and measured the I/O performance of a `blkfront` running
guest.

For the artificial squeezing, I set the `blkback.max_buffer_pages` using
the `/sys/module/xen_blkback/parameters/max_buffer_pages` file.  In this
test, I set the value to `1024` and `0`.  The `1024` is the default
value.  Setting the value as `0` is same to a situation doing the
squeezing always (worst-case).

For the I/O performance measurement, I run a simple `dd` command 5 times
as below and collect the 'MB/s' results.

$ for i in {1..5}; do dd if=/dev/zero of=file \
 bs=4k count=$((256*512)); sync; done

If the underlying block device is slow enough, the squeezing overhead
could be hidden.  For the reason, I do this test for both a slow block
device and a fast block device.  I use a popular cloud block storage
service, ebs[1] as a slow device and the ramdisk block device[2] for the
fast device.

The results are as below.  'max_pgs' represents the value of the
`blkback.max_buffer_pages` parameter.

On the slow block device


max_pgs   Min   Max   Median AvgStddev
0 38.7  45.8  38.7   40.12  3.1752165
1024  38.7  45.8

[Xen-devel] [xen-unstable-smoke test] 144719: tolerable all pass - PUSHED

2019-12-11 Thread osstest service owner

flight 144719 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144719/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  b4f042236ae0bb6725b3e8dd40af5a2466a6f971
baseline version:
 xen  80268f63640f5c3a9a4e1c688a62e35448fde9e2

Last test of basis   144711  2019-12-11 12:02:23 Z0 days
Testing same since   144719  2019-12-11 15:00:35 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  George Dunlap 
  Jan Beulich 
  Julien Grall 
  Kevin Tian 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   80268f6364..b4f042236a  b4f042236ae0bb6725b3e8dd40af5a2466a6f971 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Jan Beulich

On 11.12.2019 17:34, Xia, Hongyan wrote:
> On Wed, 2019-12-11 at 16:29 +0100, Jan Beulich wrote:
>>> +}
>>> +
>>> +if ( locking )
>>> +spin_lock(_pgdir_lock);
>>> +if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
>>> + (l3e_get_flags(*pl3e) & _PAGE_PSE) )
>>> +{
>>> +l3e_write_atomic(pl3e,
>>> +l3e_from_paddr((paddr_t)virt_to_maddr(l2t),
>>> __PAGE_HYPERVISOR));
>>
>> Why the cast? (I'm sorry if this was there on v3 already and I
>> didn't spot it. And if this remains the only thing to adjust,
>> then I guess this could be taken care of while committing.)
> 
> Sadly there is no l3e_from_maddr or virt_to_paddr to call directly. Of
> course, paddr_t and maddr have the same underlying type (unsigned
> long), so it works without a cast. I just added the cast to make it
> explicit that these two are not exactly the same.

Yes, there continues to be a naming disconnect. But no, this is
not a reason to add a cast. Casts should be used as sparingly
as possible, since they tend to hide problems.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Jan Beulich

On 11.12.2019 17:27, Xia, Hongyan wrote:
> On Wed, 2019-12-11 at 16:32 +0100, Jan Beulich wrote:
>> On 11.12.2019 11:58, Hongyan Xia wrote:
>>> @@ -5578,27 +5597,8 @@ int modify_xen_mappings(unsigned long s,
>>> unsigned long e, unsigned int nf)
>>>  }
>>>  
>>>  /* PAGE1GB: shatter the superpage and fall through. */
>>> -pl2e = alloc_xen_pagetable();
>>> -if ( !pl2e )
>>> +if ( shatter_l3e(pl3e, 0, locking) )
>>>  return -ENOMEM;
>>
>> Hmm, I didn't expect I'd need to comment on this again: As per
>> my v2 reply, you should hand on the return value from the
>> function, not make up your own. This is so that in case the
>> function gains another error path with a different error code,
>> it wouldn't become indistinguishable to callers further up.
> 
> I was basically thinking about the conversation we had that ENOMEM is
> probably the only error value map_pages_to_xen would return ever, and
> it is unlikely to gain another return value in the future, so initially
> I just let shatter return -1 and the caller return -ENOMEM. There is no
> problem for me if we want to change it to handle different error
> values.

The alternative to your prior 0 / -1 returning would have been to
have the function return bool. In this case "inventing" an error
code here would be fine. The 0 / -1 approach would introduce
another instance of what we're trying to get rid of elsewhere.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH-for-4.13] build: fix tools/configure in case only python3 exists

2019-12-11 Thread Juergen Gross

Calling ./configure with python3 being there but no python,
tools/configure will fail. Fix that by defaulting to python and
falling back to python3 or python2.

While at it fix the use of non portable "type -p" by replacing it by
AC_PATH_PROG().

Signed-off-by: Juergen Gross 
---
 tools/configure.ac | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/configure.ac b/tools/configure.ac
index a8d8ce5ffe..8d86c42de8 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -337,8 +337,9 @@ case "$host_os" in
   freebsd*) ;;
   *) AX_PATH_PROG_OR_FAIL([BASH], [bash]);;
 esac
-AS_IF([test -z "$PYTHON"], [PYTHON="python"])
-AS_IF([echo "$PYTHON" | grep -q "^/"], [], [PYTHON=`type -p "$PYTHON"`])
+AS_IF([test -z "$PYTHON"], [AC_CHECK_PROGS([PYTHON], [python python3 python2], 
err)])
+AS_IF([test "$PYTHON" = "err"], [AC_MSG_ERROR([No python interpreter found])])
+AS_IF([echo "$PYTHON" | grep -q "^/"], [], [AC_PATH_PROG([PYTHON], [$PYTHON])])
 PYTHONPATH=$PYTHON
 PYTHON=`basename $PYTHONPATH`
 
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] XSA-255 and Arm

2019-12-11 Thread Julien Grall


Hi,

On 10/12/2019 10:42, George Dunlap wrote:

On 12/9/19 1:13 PM, Julien Grall wrote:

Hi all,

I was looking at the Grant Table code over the week-end and noticed
thart XSA-255 [1] introduced some unintended consequences on Arm.

Since the XSA, gnttab_map_frame() will remove the previous mapping (if
any) because mapping to the new GFN.

As on Arm we don't have an M2P, the GFN is stored per frame in the
grant-table code. This will never get cleared during unmapping (e.g.
XENMEM_remove_from_physmap) and therefore we may end up to remove a
mapping from someone different (Arm does not check the MFN is the
correct one before removing mapping).


Sorry Julien,  I don't know enough about the ARM grant mapping code to
know what this is about.  Can you point me to the relevant files /
functions / structures, and maybe sketch out a problematic behavior?
If you look at the implementation of gnttab_map_frame() 
(common/grant_table.c), it will use gnttab_get_frame_gfn(...) to know 
whether the grant-table frame was already mapped in the Guest P2M.


If it is already mapped, then we will try to remove the current mapping 
before doing the new one.


On Arm, gnttab_get_frame_gfn() are using an internal array because to we 
don't have an internal array because we don't have an M2P. There is a 
sister function to set the frame (see gnttab_set_frame_gfn()). However, 
this is only called when mapping the frame and not when the unmap is 
done XENMEM_remove_from_physmap.


From my understanding, the current code (i.e remove the old mapping) is 
trying to workaround the fact we don't take reference when doing a 
mapping in the page-tables.


At the default of not handling reference for all the mappings, we could 
handle grant frame in a similar way to foreign mapping. This would 
remove the restriction on the number of mappings and we could update the 
internal array at the same time.


The only concern with this appraoch is we would have to walk through the 
array to find the GFN.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Xia, Hongyan

On Wed, 2019-12-11 at 16:29 +0100, Jan Beulich wrote:
> > +}
> > +
> > +if ( locking )
> > +spin_lock(_pgdir_lock);
> > +if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
> > + (l3e_get_flags(*pl3e) & _PAGE_PSE) )
> > +{
> > +l3e_write_atomic(pl3e,
> > +l3e_from_paddr((paddr_t)virt_to_maddr(l2t),
> > __PAGE_HYPERVISOR));
> 
> Why the cast? (I'm sorry if this was there on v3 already and I
> didn't spot it. And if this remains the only thing to adjust,
> then I guess this could be taken care of while committing.)
> 
> Jan

Sadly there is no l3e_from_maddr or virt_to_paddr to call directly. Of
course, paddr_t and maddr have the same underlying type (unsigned
long), so it works without a cast. I just added the cast to make it
explicit that these two are not exactly the same.

Hongyan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/9] x86: introduce a new set of APIs to manage Xen page tables

2019-12-11 Thread Julien Grall


Hi,

On 04/12/2019 17:10, Hongyan Xia wrote:

From: Wei Liu 

We are going to switch to using domheap page for page tables.
A new set of APIs is introduced to allocate, map, unmap and free pages
for page tables.

The allocation and deallocation work on mfn_t but not page_info,
because they are required to work even before frame table is set up.

Implement the old functions with the new ones. We will rewrite, site
by site, other mm functions that manipulate page tables to use the new
APIs.

Note these new APIs still use xenheap page underneath and no actual
map and unmap is done so that we don't break xen half way. They will
be switched to use domheap and dynamic mappings when usage of old APIs
is eliminated.

No functional change intended in this patch.

Signed-off-by: Wei Liu 
Signed-off-by: Hongyan Xia 

---
Changed since v3:
- const qualify unmap_xen_pagetable_new().
- remove redundant parentheses.
---
  xen/arch/x86/mm.c| 39 ++-
  xen/include/asm-x86/mm.h | 11 +++
  2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7d4dd80a85..ca362ad638 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -119,6 +119,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -5020,22 +5021,50 @@ int mmcfg_intercept_write(
  }
  
  void *alloc_xen_pagetable(void)

+{
+mfn_t mfn;
+
+mfn = alloc_xen_pagetable_new();
+ASSERT(!mfn_eq(mfn, INVALID_MFN));


The current version of alloc_xen_pagetable() may return NULL. So I can't 
see why this would not happen with the new implementation (see more below).


Furthermore, AFAIK, mfn_to_virt() is not able to deal with INVALID_MFN. 
While the ASSERT will catch such error in debug build, non-debug build 
will end up to undefined behavior (allocation failure is a real 
possibility).


Regardless the discussion on whether the ASSERT() is warrant, I think 
you want to have:


if ( mfn_eq(mfn, INVALID_MFN) )
   return NULL;

I am half tempted to suggest to put that in map_xen_pagetable_new() for 
hardening.



+
+return map_xen_pagetable_new(mfn);
+}
+
+void free_xen_pagetable(void *v)
+{
+if ( system_state != SYS_STATE_early_boot )
+free_xen_pagetable_new(virt_to_mfn(v));
+}
+
+mfn_t alloc_xen_pagetable_new(void)
  {
  if ( system_state != SYS_STATE_early_boot )
  {
  void *ptr = alloc_xenheap_page();
  
  BUG_ON(!hardware_domain && !ptr);


This BUG_ON() only catch memory exhaustion before the Hardware Domain 
(aka Dom0) is created. Afterwards, the pointer may be NULL. As ...



-return ptr;
+return virt_to_mfn(ptr);


virt_to_mfn() requires a valid MFN, you will end up to undefined behavior.


  }
  
-return mfn_to_virt(mfn_x(alloc_boot_pages(1, 1)));

+return alloc_boot_pages(1, 1);
  }
  
-void free_xen_pagetable(void *v)

+void *map_xen_pagetable_new(mfn_t mfn)
  {
-if ( system_state != SYS_STATE_early_boot )
-free_xenheap_page(v);
+return mfn_to_virt(mfn_x(mfn));
+}
+
+/* v can point to an entry within a table or be NULL */
+void unmap_xen_pagetable_new(const void *v)
+{
+/* XXX still using xenheap page, no need to do anything.  */
+}
+
+/* mfn can be INVALID_MFN */
+void free_xen_pagetable_new(mfn_t mfn)
+{
+if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
+free_xenheap_page(mfn_to_virt(mfn_x(mfn)));
  }
  
  static DEFINE_SPINLOCK(map_pgdir_lock);

diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 9d2b833579..76593fe9e7 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -582,6 +582,17 @@ void *do_page_walk(struct vcpu *v, unsigned long addr);
  /* Allocator functions for Xen pagetables. */
  void *alloc_xen_pagetable(void);
  void free_xen_pagetable(void *v);
+mfn_t alloc_xen_pagetable_new(void);
+void *map_xen_pagetable_new(mfn_t mfn);
+void unmap_xen_pagetable_new(const void *v);
+void free_xen_pagetable_new(mfn_t mfn);
+
+#define UNMAP_XEN_PAGETABLE_NEW(ptr)\
+do {\
+unmap_xen_pagetable_new(ptr);   \
+(ptr) = NULL;   \
+} while (0)
+
  l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
  
  int __sync_local_execstate(void);




--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Xia, Hongyan

On Wed, 2019-12-11 at 16:32 +0100, Jan Beulich wrote:
> On 11.12.2019 11:58, Hongyan Xia wrote:
> > @@ -5578,27 +5597,8 @@ int modify_xen_mappings(unsigned long s,
> > unsigned long e, unsigned int nf)
> >  }
> >  
> >  /* PAGE1GB: shatter the superpage and fall through. */
> > -pl2e = alloc_xen_pagetable();
> > -if ( !pl2e )
> > +if ( shatter_l3e(pl3e, 0, locking) )
> >  return -ENOMEM;
> 
> Hmm, I didn't expect I'd need to comment on this again: As per
> my v2 reply, you should hand on the return value from the
> function, not make up your own. This is so that in case the
> function gains another error path with a different error code,
> it wouldn't become indistinguishable to callers further up.
> 

I was basically thinking about the conversation we had that ENOMEM is
probably the only error value map_pages_to_xen would return ever, and
it is unlikely to gain another return value in the future, so initially
I just let shatter return -1 and the caller return -ENOMEM. There is no
problem for me if we want to change it to handle different error
values.

Hongyan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] IOMMU: make DMA containment of quarantined devices optional

2019-12-11 Thread Roger Pau Monné

On Wed, Dec 11, 2019 at 01:53:00PM +0100, Jan Beulich wrote:
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting an IOMMU fault. Passing
> through (such) devices (on such systems) is inherently insecure (as
> guests could easily arrange for IOMMU faults to occur). Defaulting to
> a mode where admins may not even become aware of issues with devices can
> be considered undesirable. Therefore convert this mode of operation to
> an optional one, not one enabled by default.
> 
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended fashion. Here, instead of reintroducing a pretty pointless use
> of "goto" in domain_context_unmap(), and instead of making the function
> (at least temporarily) inconsistent, take the opportunity and replace
> the other similarly pointless "goto" as well.
> 
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device() (or
> else reassign_device() would allocate a root page table anyway); this is
> benign to the second caller of the latter function.
> 
> Signed-off-by: Jan Beulich 
> ---
> I'm happy to take better suggestions to replace "full".
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1232,7 +1232,7 @@ detection of systems known to misbehave
>  > Default: `new` unless directed-EOI is supported
>  
>  ### iommu
> -= List of [ , verbose, debug, force, required, quarantine,
> += List of [ , verbose, debug, force, required, quarantine[=full],
>  sharept, intremap, intpost, crash-disable,
>  snoop, qinval, igfx, amd-iommu-perdev-intremap,
>  dom0-{passthrough,strict} ]
> @@ -1270,11 +1270,13 @@ boolean (e.g. `iommu=no`) can override t
>  will prevent Xen from booting if IOMMUs aren't discovered and enabled
>  successfully.
>  
> -*   The `quarantine` boolean can be used to control Xen's behavior when
> -de-assigning devices from guests.  If enabled (the default), Xen always
> -quarantines such devices; they must be explicitly assigned back to Dom0
> -before they can be used there again.  If disabled, Xen will only
> -quarantine devices the toolstack hass arranged for getting quarantined.
> +*   The `quarantine` option can be used to control Xen's behavior when
> +de-assigning devices from guests.  If set to true (the default), Xen
> +always quarantines such devices; they must be explicitly assigned back
> +to Dom0 before they can be used there again.  If set to "full", still
> +active DMA will additionally be directed to a "sink" page.  If set to
> +false, Xen will only quarantine devices the toolstack has arranged for
> +getting quarantined.
>  
>  *   The `sharept` boolean controls whether the IOMMU pagetables are shared
>  with the CPU-side HAP pagetables, or allocated separately.  Sharing
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -85,18 +85,36 @@ int get_dma_requestor_id(uint16_t seg, u
>  return req_id;
>  }
>  
> -static void amd_iommu_setup_domain_device(
> +static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> +{
> +int rc;
> +
> +spin_lock(>arch.mapping_lock);
> +rc = amd_iommu_alloc_root(hd);
> +spin_unlock(>arch.mapping_lock);
> +
> +return rc;
> +}
> +
> +static int __must_check amd_iommu_setup_domain_device(
>  struct domain *domain, struct amd_iommu *iommu,
>  uint8_t devfn, struct pci_dev *pdev)
>  {
>  struct amd_iommu_dte *table, *dte;
>  unsigned long flags;
> -int req_id, valid = 1;
> +int req_id, valid = 1, rc;
>  u8 bus = pdev->bus;
> -const struct domain_iommu *hd = dom_iommu(domain);
> +struct domain_iommu *hd = dom_iommu(domain);
> +
> +/* dom_io is used as a sentinel for quarantined devices */
> +if ( domain == dom_io && !hd->arch.root_table )

This condition (and it's Intel counterpart) would be better in a macro
IMO, so that vendor code regardless of the implementation can use the
same macro (and to avoid having to add the same comment in all
instances), ie: IS_DEVICE_QUARANTINED or some such would be fine IMO.

> +return 0;
> +
> +BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
>  
> -BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
> -!iommu->dev_table.buffer );
> +rc = allocate_domain_resources(hd);
> +if ( rc )
> +return rc;
>  
>  if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
>  valid = 0;
> @@ -151,6 +169,8 @@ static void amd_iommu_setup_domain_devic
>  
>  amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
>

Re: [Xen-devel] [PATCH v1] xen-pciback: optionally allow interrupt enable flag writes

2019-12-11 Thread Jan Beulich

On 03.12.2019 06:41, Marek Marczykowski-Górecki  wrote:
> QEMU running in a stubdom needs to be able to set INTX_DISABLE, and the
> MSI(-X) enable flags in the PCI config space. This adds an attribute
> 'allow_interrupt_control' which when set for a PCI device allows writes
> to this flag(s). The toolstack will need to set this for stubdoms.
> When enabled, guest (stubdomain) will be allowed to set relevant enable
> flags, but only one at a time - i.e. it refuses to enable more than one
> of INTx, MSI, MSI-X at a time.
> 
> This functionality is needed only for config space access done by device
> model (stubdomain) serving a HVM with the actual PCI device. It is not
> necessary and unsafe to enable direct access to those bits for PV domain
> with the device attached. For PV domains, there are separate protocol
> messages (XEN_PCI_OP_{enable,disable}_{msi,msix}) for this purpose.
> Those ops in addition to setting enable bits, also configure MSI(-X) in
> dom0 kernel - which is undesirable for PCI passthrough to HVM guests.
> 
> This should not introduce any new security issues since a malicious
> guest (or stubdom) can already generate MSIs through other ways, see
> [1] page 8.

True, albeit this doesn't cover INTX_DISABLE.

> Additionally, when qemu runs in dom0, it already have direct
> access to those bits.

True again, but any bug here (as in: too wide exposure) is a qemu
bug (possibly security issue). The statement also isn't really
correct for de-privileged qemu, I think (but I also think PCI
pass-through doesn't work in that mode at all yet).

On the whole this looks to be an acceptable approach to me. But
I'm not the maintainer, so my opinion may not count much. Some
issues with the implementation itself were already pointed out.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] x86/mm: factor out the code for shattering an l2 PTE

2019-12-11 Thread Jan Beulich

On 11.12.2019 11:58, Hongyan Xia wrote:
> map_pages_to_xen and modify_xen_mappings are performing almost exactly
> the same operations when shattering an l2 PTE, the only difference
> being whether we want to flush.
> 
> Signed-off-by: Hongyan Xia 

As before comments on patch 1 apply here as well.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Jan Beulich

On 11.12.2019 11:58, Hongyan Xia wrote:
> @@ -5578,27 +5597,8 @@ int modify_xen_mappings(unsigned long s, unsigned long 
> e, unsigned int nf)
>  }
>  
>  /* PAGE1GB: shatter the superpage and fall through. */
> -pl2e = alloc_xen_pagetable();
> -if ( !pl2e )
> +if ( shatter_l3e(pl3e, 0, locking) )
>  return -ENOMEM;

Hmm, I didn't expect I'd need to comment on this again: As per
my v2 reply, you should hand on the return value from the
function, not make up your own. This is so that in case the
function gains another error path with a different error code,
it wouldn't become indistinguishable to callers further up.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 4/4] xen-blkback: support dynamic unbind/bind

2019-12-11 Thread Paul Durrant

By simply re-attaching to shared rings during connect_ring() rather than
assuming they are freshly allocated (i.e assuming the counters are zero)
it is possible for vbd instances to be unbound and re-bound from and to
(respectively) a running guest.

This has been tested by running:

while true;
  do fio --name=randwrite --ioengine=libaio --iodepth=16 \
  --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
  done

in a PV guest whilst running:

while true;
  do echo vbd-$DOMID-$VBD >unbind;
  echo unbound;
  sleep 5;
  echo vbd-$DOMID-$VBD >bind;
  echo bound;
  sleep 3;
  done

in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
re-bind its system disk image.

This is a highly useful feature for a backend module as it allows it to be
unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
This was also tested by running:

while true;
  do echo vbd-$DOMID-$VBD >unbind;
  echo unbound;
  sleep 5;
  rmmod xen-blkback;
  echo unloaded;
  sleep 1;
  modprobe xen-blkback;
  echo bound;
  cd $(pwd);
  sleep 3;
  done

in dom0 whilst running the same loop as above in the (single) PV guest.

Some (less stressful) testing has also been done using a Windows HVM guest
with the latest 9.0 PV drivers installed.

Signed-off-by: Paul Durrant 
---
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Jens Axboe 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 

v3:
 - Constify sring_common and re-work error handling in xen_blkif_map()

v2:
 - Apply a sanity check to the value of rsp_prod and fail the re-attach
   if it is implausible
 - Set allow_rebind to prevent ring from being closed on unbind
 - Update test workload from dd to fio (with verification)
---
 drivers/block/xen-blkback/xenbus.c | 56 --
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c 
b/drivers/block/xen-blkback/xenbus.c
index 59d576d27ca7..0d4097bdff3f 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -190,6 +190,9 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, 
grant_ref_t *gref,
 {
int err;
struct xen_blkif *blkif = ring->blkif;
+   const struct blkif_common_sring *sring_common;
+   RING_IDX rsp_prod, req_prod;
+   unsigned int size;
 
/* Already connected through? */
if (ring->irq)
@@ -200,46 +203,62 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, 
grant_ref_t *gref,
if (err < 0)
return err;
 
+   sring_common = (struct blkif_common_sring *)ring->blk_ring;
+   rsp_prod = READ_ONCE(sring_common->rsp_prod);
+   req_prod = READ_ONCE(sring_common->req_prod);
+
switch (blkif->blk_protocol) {
case BLKIF_PROTOCOL_NATIVE:
{
-   struct blkif_sring *sring;
-   sring = (struct blkif_sring *)ring->blk_ring;
-   BACK_RING_INIT(>blk_rings.native, sring,
-  XEN_PAGE_SIZE * nr_grefs);
+   struct blkif_sring *sring_native =
+   (struct blkif_sring *)ring->blk_ring;
+
+   BACK_RING_ATTACH(>blk_rings.native, sring_native,
+rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+   size = __RING_SIZE(sring_native, XEN_PAGE_SIZE * nr_grefs);
break;
}
case BLKIF_PROTOCOL_X86_32:
{
-   struct blkif_x86_32_sring *sring_x86_32;
-   sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
-   BACK_RING_INIT(>blk_rings.x86_32, sring_x86_32,
-  XEN_PAGE_SIZE * nr_grefs);
+   struct blkif_x86_32_sring *sring_x86_32 =
+   (struct blkif_x86_32_sring *)ring->blk_ring;
+
+   BACK_RING_ATTACH(>blk_rings.x86_32, sring_x86_32,
+rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+   size = __RING_SIZE(sring_x86_32, XEN_PAGE_SIZE * nr_grefs);
break;
}
case BLKIF_PROTOCOL_X86_64:
{
-   struct blkif_x86_64_sring *sring_x86_64;
-   sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring;
-   BACK_RING_INIT(>blk_rings.x86_64, sring_x86_64,
-  XEN_PAGE_SIZE * nr_grefs);
+   struct blkif_x86_64_sring *sring_x86_64 =
+   (struct blkif_x86_64_sring *)ring->blk_ring;
+
+   BACK_RING_ATTACH(>blk_rings.x86_64, sring_x86_64,
+rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+   size = __RING_SIZE(sring_x86_64, XEN_PAGE_SIZE * nr_grefs);
break;
}
default:
BUG();
}
 
+   err = -EIO;
+   if (req_prod - rsp_prod > size)
+   goto fail;
+
err = bind_interdomain_evtchn_to_irqhandler(blkif->domid, evtchn,

[Xen-devel] [PATCH v3 3/4] xen/interface: re-define FRONT/BACK_RING_ATTACH()

2019-12-11 Thread Paul Durrant

Currently these macros are defined to re-initialize a front/back ring
(respectively) to values read from the shared ring in such a way that any
requests/responses that are added to the shared ring whilst the front/back
is detached will be skipped over. This, in general, is not a desirable
semantic since most frontend implementations will eventually block waiting
for a response which would either never appear or never be processed.

Since the macros are currently unused, take this opportunity to re-define
them to re-initialize a front/back ring using specified values. This also
allows FRONT/BACK_RING_INIT() to be re-defined in terms of
FRONT/BACK_RING_ATTACH() using a specified value of 0.

NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch.

Signed-off-by: Paul Durrant 
---
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 

A patch to add the FRONT/BACK_RING_ATTACH() macros to the canonical
ring.h in xen.git will be sent once the definitions have been agreed.

v2:
 - change definitions to take explicit initial index values
---
 include/xen/interface/io/ring.h | 29 +
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
index 3f40501fc60b..2af7a1cd6658 100644
--- a/include/xen/interface/io/ring.h
+++ b/include/xen/interface/io/ring.h
@@ -125,35 +125,24 @@ struct __name##_back_ring {   
\
 memset((_s)->pad, 0, sizeof((_s)->pad));   \
 } while(0)
 
-#define FRONT_RING_INIT(_r, _s, __size) do {   \
-(_r)->req_prod_pvt = 0;\
-(_r)->rsp_cons = 0;
\
+#define FRONT_RING_ATTACH(_r, _s, _i, __size) do { \
+(_r)->req_prod_pvt = (_i); \
+(_r)->rsp_cons = (_i); \
 (_r)->nr_ents = __RING_SIZE(_s, __size);   \
 (_r)->sring = (_s);
\
 } while (0)
 
-#define BACK_RING_INIT(_r, _s, __size) do {\
-(_r)->rsp_prod_pvt = 0;\
-(_r)->req_cons = 0;
\
-(_r)->nr_ents = __RING_SIZE(_s, __size);   \
-(_r)->sring = (_s);
\
-} while (0)
+#define FRONT_RING_INIT(_r, _s, __size) FRONT_RING_ATTACH(_r, _s, 0, __size)
 
-/* Initialize to existing shared indexes -- for recovery */
-#define FRONT_RING_ATTACH(_r, _s, __size) do { \
-(_r)->sring = (_s);
\
-(_r)->req_prod_pvt = (_s)->req_prod;   \
-(_r)->rsp_cons = (_s)->rsp_prod;   \
+#define BACK_RING_ATTACH(_r, _s, _i, __size) do {  \
+(_r)->rsp_prod_pvt = (_i); \
+(_r)->req_cons = (_i); \
 (_r)->nr_ents = __RING_SIZE(_s, __size);   \
-} while (0)
-
-#define BACK_RING_ATTACH(_r, _s, __size) do {  \
 (_r)->sring = (_s);
\
-(_r)->rsp_prod_pvt = (_s)->rsp_prod;   \
-(_r)->req_cons = (_s)->req_prod;   \
-(_r)->nr_ents = __RING_SIZE(_s, __size);   \
 } while (0)
 
+#define BACK_RING_INIT(_r, _s, __size) BACK_RING_ATTACH(_r, _s, 0, __size)
+
 /* How big is this ring? */
 #define RING_SIZE(_r)  \
 ((_r)->nr_ents)
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/4] xenbus: move xenbus_dev_shutdown() into frontend code...

2019-12-11 Thread Paul Durrant

...and make it static

xenbus_dev_shutdown() is seemingly intended to cause clean shutdown of PV
frontends when a guest is rebooted. Indeed the function waits for a
conpletion which is only set by a call to xenbus_frontend_closed().

This patch removes the shutdown() method from backends and moves
xenbus_dev_shutdown() from xenbus_probe.c into xenbus_probe_frontend.c,
renaming it appropriately and making it static.

NOTE: In the case where the backend is running in a driver domain, the
  toolstack should have already terminated any frontends that may be
  using it (since Xen does not support re-startable PV driver domains)
  so xenbus_dev_shutdown() should never be called.

Signed-off-by: Paul Durrant 
Reviewed-by: Juergen Gross 
---
Cc: Boris Ostrovsky 
Cc: Stefano Stabellini 
---
 drivers/xen/xenbus/xenbus.h|  2 --
 drivers/xen/xenbus/xenbus_probe.c  | 23 -
 drivers/xen/xenbus/xenbus_probe_backend.c  |  1 -
 drivers/xen/xenbus/xenbus_probe_frontend.c | 24 +-
 4 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
index d75a2385b37c..5f5b8a7d5b80 100644
--- a/drivers/xen/xenbus/xenbus.h
+++ b/drivers/xen/xenbus/xenbus.h
@@ -116,8 +116,6 @@ int xenbus_probe_devices(struct xen_bus_type *bus);
 
 void xenbus_dev_changed(const char *node, struct xen_bus_type *bus);
 
-void xenbus_dev_shutdown(struct device *_dev);
-
 int xenbus_dev_suspend(struct device *dev);
 int xenbus_dev_resume(struct device *dev);
 int xenbus_dev_cancel(struct device *dev);
diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index c21be6e9d38a..5aa29396c9e3 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -281,29 +281,6 @@ int xenbus_dev_remove(struct device *_dev)
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_remove);
 
-void xenbus_dev_shutdown(struct device *_dev)
-{
-   struct xenbus_device *dev = to_xenbus_device(_dev);
-   unsigned long timeout = 5*HZ;
-
-   DPRINTK("%s", dev->nodename);
-
-   get_device(>dev);
-   if (dev->state != XenbusStateConnected) {
-   pr_info("%s: %s: %s != Connected, skipping\n",
-   __func__, dev->nodename, xenbus_strstate(dev->state));
-   goto out;
-   }
-   xenbus_switch_state(dev, XenbusStateClosing);
-   timeout = wait_for_completion_timeout(>down, timeout);
-   if (!timeout)
-   pr_info("%s: %s timeout closing device\n",
-   __func__, dev->nodename);
- out:
-   put_device(>dev);
-}
-EXPORT_SYMBOL_GPL(xenbus_dev_shutdown);
-
 int xenbus_register_driver_common(struct xenbus_driver *drv,
  struct xen_bus_type *bus,
  struct module *owner, const char *mod_name)
diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c 
b/drivers/xen/xenbus/xenbus_probe_backend.c
index b0bed4faf44c..14876faff3b0 100644
--- a/drivers/xen/xenbus/xenbus_probe_backend.c
+++ b/drivers/xen/xenbus/xenbus_probe_backend.c
@@ -198,7 +198,6 @@ static struct xen_bus_type xenbus_backend = {
.uevent = xenbus_uevent_backend,
.probe  = xenbus_dev_probe,
.remove = xenbus_dev_remove,
-   .shutdown   = xenbus_dev_shutdown,
.dev_groups = xenbus_dev_groups,
},
 };
diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c 
b/drivers/xen/xenbus/xenbus_probe_frontend.c
index a7d90a719cea..8a1650bbe18f 100644
--- a/drivers/xen/xenbus/xenbus_probe_frontend.c
+++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
@@ -126,6 +126,28 @@ static int xenbus_frontend_dev_probe(struct device *dev)
return xenbus_dev_probe(dev);
 }
 
+static void xenbus_frontend_dev_shutdown(struct device *_dev)
+{
+   struct xenbus_device *dev = to_xenbus_device(_dev);
+   unsigned long timeout = 5*HZ;
+
+   DPRINTK("%s", dev->nodename);
+
+   get_device(>dev);
+   if (dev->state != XenbusStateConnected) {
+   pr_info("%s: %s: %s != Connected, skipping\n",
+   __func__, dev->nodename, xenbus_strstate(dev->state));
+   goto out;
+   }
+   xenbus_switch_state(dev, XenbusStateClosing);
+   timeout = wait_for_completion_timeout(>down, timeout);
+   if (!timeout)
+   pr_info("%s: %s timeout closing device\n",
+   __func__, dev->nodename);
+ out:
+   put_device(>dev);
+}
+
 static const struct dev_pm_ops xenbus_pm_ops = {
.suspend= xenbus_dev_suspend,
.resume = xenbus_frontend_dev_resume,
@@ -146,7 +168,7 @@ static struct xen_bus_type xenbus_frontend = {
.uevent = xenbus_uevent_frontend,
.probe  = xenbus_frontend_dev_probe,
.remove = xenbus_dev_remove,
-

[Xen-devel] [PATCH v3 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Paul Durrant

If a driver probe() fails then leave the xenstore state alone. There is no
reason to modify it as the failure may be due to transient resource
allocation issues and hence a subsequent probe() may succeed.

If the driver supports re-binding then only force state to closed during
remove() only in the case when the toolstack may need to clean up. This can
be detected by checking whether the state in xenstore has been set to
closing prior to device removal.

NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
  which defaults to false. Subsequent patches will add support to
  some backend drivers.

Signed-off-by: Paul Durrant 
---
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 

v2:
 - Introduce the 'allow_rebind' flag
 - Expand the commit comment
---
 drivers/xen/xenbus/xenbus_probe.c | 12 ++--
 include/xen/xenbus.h  |  1 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index 5aa29396c9e3..378486b79f96 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -255,7 +255,6 @@ int xenbus_dev_probe(struct device *_dev)
module_put(drv->driver.owner);
 fail:
xenbus_dev_error(dev, err, "xenbus_dev_probe on %s", dev->nodename);
-   xenbus_switch_state(dev, XenbusStateClosed);
return err;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_probe);
@@ -276,7 +275,16 @@ int xenbus_dev_remove(struct device *_dev)
 
free_otherend_details(dev);
 
-   xenbus_switch_state(dev, XenbusStateClosed);
+   /*
+* If the toolstack has forced the device state to closing then set
+* the state to closed now to allow it to be cleaned up.
+* Similarly, if the driver does not support re-bind, set the
+* closed.
+*/
+   if (!drv->allow_rebind ||
+   xenbus_read_driver_state(dev->nodename) == XenbusStateClosing)
+   xenbus_switch_state(dev, XenbusStateClosed);
+
return 0;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_remove);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816d5f8c..24228a102141 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -93,6 +93,7 @@ struct xenbus_device_id
 struct xenbus_driver {
const char *name;   /* defaults to ids[0].devicetype */
const struct xenbus_device_id *ids;
+   bool allow_rebind; /* avoid setting xenstore closed during remove */
int (*probe)(struct xenbus_device *dev,
 const struct xenbus_device_id *id);
void (*otherend_changed)(struct xenbus_device *dev,
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 0/4] xen-blkback: support live update

2019-12-11 Thread Paul Durrant

Patch #1 is clean-up for an apparent mis-feature.

Paul Durrant (4):
  xenbus: move xenbus_dev_shutdown() into frontend code...
  xenbus: limit when state is forced to closed
  xen/interface: re-define FRONT/BACK_RING_ATTACH()
  xen-blkback: support dynamic unbind/bind

 drivers/block/xen-blkback/xenbus.c | 56 +++---
 drivers/xen/xenbus/xenbus.h|  2 -
 drivers/xen/xenbus/xenbus_probe.c  | 35 --
 drivers/xen/xenbus/xenbus_probe_backend.c  |  1 -
 drivers/xen/xenbus/xenbus_probe_frontend.c | 24 +-
 include/xen/interface/io/ring.h| 29 ---
 include/xen/xenbus.h   |  1 +
 7 files changed, 81 insertions(+), 67 deletions(-)
---
v3:
 - Only patch #4 modified

Cc: Boris Ostrovsky 
Cc: Jens Axboe 
Cc: Juergen Gross 
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Stefano Stabellini 
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Jan Beulich

On 11.12.2019 11:58, Hongyan Xia wrote:
> map_pages_to_xen and modify_xen_mappings are performing almost exactly
> the same operations when shattering an l3 PTE, the only difference
> being whether we want to flush.
> 
> Signed-off-by: Hongyan Xia 
> 
> ---
> Changes in v3:
> - style and indentation changes.
> - return -ENOMEM instead of -1.
> 
> Changes in v2:
> - improve asm.
> - re-read pl3e from memory when taking the lock.
> - move the allocation of l2t inside the shatter function.
> ---
>  xen/arch/x86/mm.c | 98 +++
>  1 file changed, 49 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 7d4dd80a85..97f11b6016 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5151,6 +5151,52 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
>   flush_area_local((const void *)v, f) : \
>   flush_area_all((const void *)v, f))
>  
> +/* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. 
> */
> +static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking)
> +{
> +unsigned int i;
> +l3_pgentry_t ol3e = *pl3e;
> +l2_pgentry_t l2e = l2e_from_intpte(l3e_get_intpte(ol3e));
> +l2_pgentry_t *l2t = alloc_xen_pagetable();
> +
> +if ( !l2t )
> +return -ENOMEM;
> +
> +for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
> +{
> +l2e_write(l2t + i, l2e);
> +l2e = l2e_from_intpte(
> +  l2e_get_intpte(l2e) + (PAGE_SIZE << PAGETABLE_ORDER));

Andrew - iirc you had suggested this (in some different form, but
to the same effect) to improve code generation. If you're convinced
that the downside of the loss of type safety is worth the win in
generated code, I'm not going to stand in the way here, but it'll
then need to be you to ack these two patches in their eventually
final shape.

> +}
> +
> +if ( locking )
> +spin_lock(_pgdir_lock);
> +if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
> + (l3e_get_flags(*pl3e) & _PAGE_PSE) )
> +{
> +l3e_write_atomic(pl3e,
> +l3e_from_paddr((paddr_t)virt_to_maddr(l2t), __PAGE_HYPERVISOR));

Why the cast? (I'm sorry if this was there on v3 already and I
didn't spot it. And if this remains the only thing to adjust,
then I guess this could be taken care of while committing.)

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

2019-12-11 Thread Durrant, Paul

> -Original Message-
> > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> > blkback/xenbus.c
> > > index e8c5c54e1d26..13d09630b237 100644
> > > --- a/drivers/block/xen-blkback/xenbus.c
> > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring
> > *ring, grant_ref_t *gref,
> > >  {
> > >   int err;
> > >   struct xen_blkif *blkif = ring->blkif;
> > > + struct blkif_common_sring *sring_common;
> > > + RING_IDX rsp_prod, req_prod;
> > >
> > >   /* Already connected through? */
> > >   if (ring->irq)
> > > @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring
> > *ring, grant_ref_t *gref,
> > >   if (err < 0)
> > >   return err;
> > >
> > > + sring_common = (struct blkif_common_sring *)ring->blk_ring;
> > > + rsp_prod = READ_ONCE(sring_common->rsp_prod);
> > > + req_prod = READ_ONCE(sring_common->req_prod);
> > > +
> > >   switch (blkif->blk_protocol) {
> > >   case BLKIF_PROTOCOL_NATIVE:
> > >   {
> > > - struct blkif_sring *sring;
> > > - sring = (struct blkif_sring *)ring->blk_ring;
> > > - BACK_RING_INIT(>blk_rings.native, sring,
> > > -XEN_PAGE_SIZE * nr_grefs);
> > > + struct blkif_sring *sring_native =
> > > + (struct blkif_sring *)ring->blk_ring;
> >
> > I think you can constify both sring_native and sring_common (and the
> > other instances below).
> 
> Yes, I can do that. I don't think the macros would mind.
> 

Spoke to soon. They do mind, of course, because the sring pointer in the 
front/back ring is not (and should not) be const. I can const sring_common but 
no others.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 144713: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144713 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144713/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 144637
 build-i386-xsm6 xen-buildfail REGR. vs. 144637
 build-amd64-xsm   6 xen-buildfail REGR. vs. 144637
 build-i3866 xen-buildfail REGR. vs. 144637

Tests which did not succeed, but are not blocking:
 build-i386-libvirt1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
baseline version:
 ovmf 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0

Last test of basis   144637  2019-12-09 09:09:49 Z2 days
Failing since144646  2019-12-10 01:39:53 Z1 days9 attempts
Testing same since   144713  2019-12-11 12:09:19 Z0 days1 attempts


People who touched revisions under test:
  Antoine Coeur 
  Ard Biesheuvel 
  Bob Feng 
  Jiewen Yao 
  Michael Kubacki 
  Philippe Mathieu-Daude 
  Steven Shi 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 2fe25a74d6fee3c2ac0b930f7f3596cb432e766e
Author: Ard Biesheuvel 
Date:   Tue Mar 5 14:32:48 2019 +0100

ArmPkg/MmCommunicationDxe: relay architected PI events to MM context

PI defines a few architected events that have significance in the MM
context as well as in the non-secure DXE context. So register notify
handlers for these events, and relay them into the standalone MM world.

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Jiewen Yao 
Reviewed-by: Achin Gupta 

commit d3add11e87dace180387562d6f1951f2bffbd3d9
Author: Michael Kubacki 
Date:   Wed Nov 20 17:31:24 2019 -0800

MdeModulePkg PeiCore: Improve comment semantics

This patch clarifies wording in several PeiCore comments to improve
reading comprehension.

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Jian J Wang 

commit d39d1260c615b716675f67f5c4e1f4f52df01dad
Author: Michael Kubacki 
Date:   Wed Nov 20 17:10:48 2019 -0800

MdeModulePkg PeiCore: Fix typos

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Philippe Mathieu-Daude 
Reviewed-by: Jian J Wang 

commit 97eedf5dfbaffde33210fd88066247cf0b7d3325
Author: Antoine Coeur 
Date:   Wed Dec 4 12:14:53 2019 +0800

IntelFsp2WrapperPkg: Fix various typos

Fix various typos in comments and documentation.

Cc: Chasel Chiu 
Cc: Nate DeSimone 
Cc: Star Zeng 
Reviewed-by: Philippe Mathieu-Daude 
Signed-off-by: Philippe Mathieu-Daude 
Reviewed-by: Nate DeSimone 
Reviewed-by: Chasel Chiu 
Reviewed-by: Star Zeng 

commit 7e55cf6b48dcd43de46d008b2f12caaad2554503
Author: Jiewen Yao 
Date:   Sat Dec 7 21:41:10 2019 +0800

SecurityPkg/Tcg2Smm: Measure the table before patch.

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1940

According to TCG PFP specification: the ACPI table must be
measured prior to any modification, and the measurement
must

[Xen-devel] [xen-unstable-smoke test] 144711: tolerable all pass - PUSHED

2019-12-11 Thread osstest service owner

flight 144711 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144711/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  80268f63640f5c3a9a4e1c688a62e35448fde9e2
baseline version:
 xen  272c18435e93cbf749c096a9552ab5ef0d79a4ca

Last test of basis   144672  2019-12-10 18:00:40 Z0 days
Testing same since   144711  2019-12-11 12:02:23 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Juergen Gross 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   272c18435e..80268f6364  80268f63640f5c3a9a4e1c688a62e35448fde9e2 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-blkback: prevent premature module unload

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Roger Pau Monné 
> Sent: 11 December 2019 13:55
> To: Durrant, Paul ; Juergen Gross 
> Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Konrad Rzeszutek Wilk ;
> Jens Axboe 
> Subject: Re: [PATCH] xen-blkback: prevent premature module unload
> 
> On Wed, Dec 11, 2019 at 01:27:42PM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Roger Pau Monné 
> > > Sent: 11 December 2019 11:29
> > > To: Durrant, Paul 
> > > Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org;
> linux-
> > > ker...@vger.kernel.org; Konrad Rzeszutek Wilk
> ;
> > > Jens Axboe 
> > > Subject: Re: [PATCH] xen-blkback: prevent premature module unload
> > >
> > > On Tue, Dec 10, 2019 at 02:53:05PM +, Paul Durrant wrote:
> > > > Objects allocated by xen_blkif_alloc come from the 'blkif_cache'
> kmem
> > > > cache. This cache is destoyed when xen-blkif is unloaded so it is
> > > > necessary to wait for the deferred free routine used for such
> objects to
> > > > complete. This necessity was missed in commit 14855954f636 "xen-
> blkback:
> > > > allow module to be cleanly unloaded". This patch fixes the problem
> by
> > > > taking/releasing extra module references in xen_blkif_alloc/free()
> > > > respectively.
> > > >
> > > > Signed-off-by: Paul Durrant 
> > >
> > > Reviewed-by: Roger Pau Monné 
> > >
> > > One nit below.
> > >
> > > > ---
> > > > Cc: Konrad Rzeszutek Wilk 
> > > > Cc: "Roger Pau Monné" 
> > > > Cc: Jens Axboe 
> > > > ---
> > > >  drivers/block/xen-blkback/xenbus.c | 10 ++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> > > blkback/xenbus.c
> > > > index e8c5c54e1d26..59d576d27ca7 100644
> > > > --- a/drivers/block/xen-blkback/xenbus.c
> > > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > > @@ -171,6 +171,15 @@ static struct xen_blkif
> *xen_blkif_alloc(domid_t
> > > domid)
> > > > blkif->domid = domid;
> > > > atomic_set(>refcnt, 1);
> > > > init_completion(>drain_complete);
> > > > +
> > > > +   /*
> > > > +* Because freeing back to the cache may be deferred, it is
> not
> > > > +* safe to unload the module (and hence destroy the cache)
> until
> > > > +* this has completed. To prevent premature unloading, take an
> > > > +* extra module reference here and release only when the
> object
> > > > +* has been free back to the cache.
> > > ^ freed
> >
> > Oh yes. Can this be done on commit, or would you like me to send a v2?
> 
> Adjusting on commit would be fine for me, but it's up to Juergen since
> he is the one that will pick this up. IIRC the module unload patches
> didn't go through the block subsystem.

True. I forgot manually add Juergen cc list.

  Paul

> 
> Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-blkback: prevent premature module unload

2019-12-11 Thread Jürgen Groß

On 11.12.19 14:55, Roger Pau Monné wrote:

On Wed, Dec 11, 2019 at 01:27:42PM +, Durrant, Paul wrote:

-Original Message-
From: Roger Pau Monné 
Sent: 11 December 2019 11:29
To: Durrant, Paul 
Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org; linux-
ker...@vger.kernel.org; Konrad Rzeszutek Wilk ;
Jens Axboe 
Subject: Re: [PATCH] xen-blkback: prevent premature module unload

On Tue, Dec 10, 2019 at 02:53:05PM +, Paul Durrant wrote:

Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem
cache. This cache is destoyed when xen-blkif is unloaded so it is
necessary to wait for the deferred free routine used for such objects to
complete. This necessity was missed in commit 14855954f636 "xen-blkback:
allow module to be cleanly unloaded". This patch fixes the problem by
taking/releasing extra module references in xen_blkif_alloc/free()
respectively.

Signed-off-by: Paul Durrant 

Reviewed-by: Roger Pau Monné 

One nit below.

---
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Jens Axboe 
---
  drivers/block/xen-blkback/xenbus.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-

blkback/xenbus.c

index e8c5c54e1d26..59d576d27ca7 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -171,6 +171,15 @@ static struct xen_blkif *xen_blkif_alloc(domid_t

domid)

blkif->domid = domid;
atomic_set(>refcnt, 1);
init_completion(>drain_complete);
+
+   /*
+* Because freeing back to the cache may be deferred, it is not
+* safe to unload the module (and hence destroy the cache) until
+* this has completed. To prevent premature unloading, take an
+* extra module reference here and release only when the object
+* has been free back to the cache.

 ^ freed

Oh yes. Can this be done on commit, or would you like me to send a v2?

Adjusting on commit would be fine for me, but it's up to Juergen since
he is the one that will pick this up. IIRC the module unload patches
didn't go through the block subsystem.

Oh, right. Yes, will fix this when committing.

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-blkback: prevent premature module unload

2019-12-11 Thread Roger Pau Monné

On Wed, Dec 11, 2019 at 01:27:42PM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Roger Pau Monné 
> > Sent: 11 December 2019 11:29
> > To: Durrant, Paul 
> > Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Konrad Rzeszutek Wilk ;
> > Jens Axboe 
> > Subject: Re: [PATCH] xen-blkback: prevent premature module unload
> > 
> > On Tue, Dec 10, 2019 at 02:53:05PM +, Paul Durrant wrote:
> > > Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem
> > > cache. This cache is destoyed when xen-blkif is unloaded so it is
> > > necessary to wait for the deferred free routine used for such objects to
> > > complete. This necessity was missed in commit 14855954f636 "xen-blkback:
> > > allow module to be cleanly unloaded". This patch fixes the problem by
> > > taking/releasing extra module references in xen_blkif_alloc/free()
> > > respectively.
> > >
> > > Signed-off-by: Paul Durrant 
> > 
> > Reviewed-by: Roger Pau Monné 
> > 
> > One nit below.
> > 
> > > ---
> > > Cc: Konrad Rzeszutek Wilk 
> > > Cc: "Roger Pau Monné" 
> > > Cc: Jens Axboe 
> > > ---
> > >  drivers/block/xen-blkback/xenbus.c | 10 ++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> > blkback/xenbus.c
> > > index e8c5c54e1d26..59d576d27ca7 100644
> > > --- a/drivers/block/xen-blkback/xenbus.c
> > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > @@ -171,6 +171,15 @@ static struct xen_blkif *xen_blkif_alloc(domid_t
> > domid)
> > >   blkif->domid = domid;
> > >   atomic_set(>refcnt, 1);
> > >   init_completion(>drain_complete);
> > > +
> > > + /*
> > > +  * Because freeing back to the cache may be deferred, it is not
> > > +  * safe to unload the module (and hence destroy the cache) until
> > > +  * this has completed. To prevent premature unloading, take an
> > > +  * extra module reference here and release only when the object
> > > +  * has been free back to the cache.
> > ^ freed
> 
> Oh yes. Can this be done on commit, or would you like me to send a v2?

Adjusting on commit would be fine for me, but it's up to Juergen since
he is the one that will pick this up. IIRC the module unload patches
didn't go through the block subsystem.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Jürgen Groß 
> Sent: 11 December 2019 10:21
> To: Durrant, Paul ; Roger Pau Monné
> 
> Cc: xen-devel@lists.xenproject.org; linux-ker...@vger.kernel.org; Stefano
> Stabellini ; Boris Ostrovsky
> 
> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
> to closed
> 
> On 11.12.19 11:14, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Roger Pau Monné 
> >> Sent: 11 December 2019 10:06
> >> To: Durrant, Paul 
> >> Cc: xen-devel@lists.xenproject.org; linux-ker...@vger.kernel.org;
> Juergen
> >> Gross ; Stefano Stabellini ;
> >> Boris Ostrovsky 
> >> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is
> forced
> >> to closed
> >>
> >> On Tue, Dec 10, 2019 at 11:33:45AM +, Paul Durrant wrote:
> >>> If a driver probe() fails then leave the xenstore state alone. There
> is
> >> no
> >>> reason to modify it as the failure may be due to transient resource
> >>> allocation issues and hence a subsequent probe() may succeed.
> >>>
> >>> If the driver supports re-binding then only force state to closed
> during
> >>> remove() only in the case when the toolstack may need to clean up.
> This
> >> can
> >>> be detected by checking whether the state in xenstore has been set to
> >>> closing prior to device removal.
> >>>
> >>> NOTE: Re-bind support is indicated by new boolean in struct
> >> xenbus_driver,
> >>>which defaults to false. Subsequent patches will add support to
> >>>some backend drivers.
> >>
> >> My intention was to specify whether you want to close the
> >> backends on unbind in sysfs, so that an user can decide at runtime,
> >> rather than having a hardcoded value in the driver.
> >>
> >> Anyway, I'm less sure whether such runtime tunable is useful at all,
> >> so let's leave it out and can always be added afterwards. At the end
> >> of day a user wrongly doing a rmmod blkback can always recover
> >> gracefully by loading blkback again with your proposed approach to
> >> leave connections open on module removal.
> >>
> >> Sorry for the extra work.
> >>
> >
> > Does this mean you don't think the extra driver flag is necessary any
> more? NB: now that xenbus actually takes module references you can't
> accidentally rmmod any more :-)
> 
> I'd like it to be kept, please.
> 

Ok. I'll leave this patch alone then.

  Paul

> Juergen
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-blkback: prevent premature module unload

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Roger Pau Monné 
> Sent: 11 December 2019 11:29
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Konrad Rzeszutek Wilk ;
> Jens Axboe 
> Subject: Re: [PATCH] xen-blkback: prevent premature module unload
> 
> On Tue, Dec 10, 2019 at 02:53:05PM +, Paul Durrant wrote:
> > Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem
> > cache. This cache is destoyed when xen-blkif is unloaded so it is
> > necessary to wait for the deferred free routine used for such objects to
> > complete. This necessity was missed in commit 14855954f636 "xen-blkback:
> > allow module to be cleanly unloaded". This patch fixes the problem by
> > taking/releasing extra module references in xen_blkif_alloc/free()
> > respectively.
> >
> > Signed-off-by: Paul Durrant 
> 
> Reviewed-by: Roger Pau Monné 
> 
> One nit below.
> 
> > ---
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: "Roger Pau Monné" 
> > Cc: Jens Axboe 
> > ---
> >  drivers/block/xen-blkback/xenbus.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> blkback/xenbus.c
> > index e8c5c54e1d26..59d576d27ca7 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -171,6 +171,15 @@ static struct xen_blkif *xen_blkif_alloc(domid_t
> domid)
> > blkif->domid = domid;
> > atomic_set(>refcnt, 1);
> > init_completion(>drain_complete);
> > +
> > +   /*
> > +* Because freeing back to the cache may be deferred, it is not
> > +* safe to unload the module (and hence destroy the cache) until
> > +* this has completed. To prevent premature unloading, take an
> > +* extra module reference here and release only when the object
> > +* has been free back to the cache.
> ^ freed

Oh yes. Can this be done on commit, or would you like me to send a v2?

  Paul

> > +*/
> > +   __module_get(THIS_MODULE);
> > INIT_WORK(>free_work, xen_blkif_deferred_free);
> >
> > return blkif;
> > @@ -320,6 +329,7 @@ static void xen_blkif_free(struct xen_blkif *blkif)
> >
> > /* Make sure everything is drained before shutting down */
> > kmem_cache_free(xen_blkif_cachep, blkif);
> > +   module_put(THIS_MODULE);
> >  }
> >
> >  int __init xen_blkif_interface_init(void)
> > --
> > 2.20.1
> >

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Security Advisory 311 v4 (CVE-2019-19577) - Bugs in dynamic height handling for AMD IOMMU pagetables

2019-12-11 Thread Sander Eikelenboom

On 11/12/2019 13:09, Xen.org security team wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
> 
> Xen Security Advisory CVE-2019-19577 / XSA-311
>version 4
> 
>  Bugs in dynamic height handling for AMD IOMMU pagetables
> 

> 
> CREDITS
> ===
> 
> This issue was discovered by Sander Eikelenboom, along with Andrew Cooper of
> Citrix.

Ahh this was why Jan's two patches were skipped, I was about to inquire
if it would be picked up in the future in some form.

--
Sander


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] IOMMU: make DMA containment of quarantined devices optional

2019-12-11 Thread Jan Beulich

Containing still in flight DMA was introduced to work around certain
devices / systems hanging hard upon hitting an IOMMU fault. Passing
through (such) devices (on such systems) is inherently insecure (as
guests could easily arrange for IOMMU faults to occur). Defaulting to
a mode where admins may not even become aware of issues with devices can
be considered undesirable. Therefore convert this mode of operation to
an optional one, not one enabled by default.

This involves resurrecting code commit ea38867831da ("x86 / iommu: set
up a scratch page in the quarantine domain") did remove, in a slightly
extended fashion. Here, instead of reintroducing a pretty pointless use
of "goto" in domain_context_unmap(), and instead of making the function
(at least temporarily) inconsistent, take the opportunity and replace
the other similarly pointless "goto" as well.

In order to key the re-instated bypasses off of there (not) being a root
page table this further requires moving the allocate_domain_resources()
invocation from reassign_device() to amd_iommu_setup_domain_device() (or
else reassign_device() would allocate a root page table anyway); this is
benign to the second caller of the latter function.

Signed-off-by: Jan Beulich 
---
I'm happy to take better suggestions to replace "full".

--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1232,7 +1232,7 @@ detection of systems known to misbehave
 > Default: `new` unless directed-EOI is supported
 
 ### iommu
-= List of [ , verbose, debug, force, required, quarantine,
+= List of [ , verbose, debug, force, required, quarantine[=full],
 sharept, intremap, intpost, crash-disable,
 snoop, qinval, igfx, amd-iommu-perdev-intremap,
 dom0-{passthrough,strict} ]
@@ -1270,11 +1270,13 @@ boolean (e.g. `iommu=no`) can override t
 will prevent Xen from booting if IOMMUs aren't discovered and enabled
 successfully.
 
-*   The `quarantine` boolean can be used to control Xen's behavior when
-de-assigning devices from guests.  If enabled (the default), Xen always
-quarantines such devices; they must be explicitly assigned back to Dom0
-before they can be used there again.  If disabled, Xen will only
-quarantine devices the toolstack hass arranged for getting quarantined.
+*   The `quarantine` option can be used to control Xen's behavior when
+de-assigning devices from guests.  If set to true (the default), Xen
+always quarantines such devices; they must be explicitly assigned back
+to Dom0 before they can be used there again.  If set to "full", still
+active DMA will additionally be directed to a "sink" page.  If set to
+false, Xen will only quarantine devices the toolstack has arranged for
+getting quarantined.
 
 *   The `sharept` boolean controls whether the IOMMU pagetables are shared
 with the CPU-side HAP pagetables, or allocated separately.  Sharing
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -85,18 +85,36 @@ int get_dma_requestor_id(uint16_t seg, u
 return req_id;
 }
 
-static void amd_iommu_setup_domain_device(
+static int __must_check allocate_domain_resources(struct domain_iommu *hd)
+{
+int rc;
+
+spin_lock(>arch.mapping_lock);
+rc = amd_iommu_alloc_root(hd);
+spin_unlock(>arch.mapping_lock);
+
+return rc;
+}
+
+static int __must_check amd_iommu_setup_domain_device(
 struct domain *domain, struct amd_iommu *iommu,
 uint8_t devfn, struct pci_dev *pdev)
 {
 struct amd_iommu_dte *table, *dte;
 unsigned long flags;
-int req_id, valid = 1;
+int req_id, valid = 1, rc;
 u8 bus = pdev->bus;
-const struct domain_iommu *hd = dom_iommu(domain);
+struct domain_iommu *hd = dom_iommu(domain);
+
+/* dom_io is used as a sentinel for quarantined devices */
+if ( domain == dom_io && !hd->arch.root_table )
+return 0;
+
+BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
 
-BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
-!iommu->dev_table.buffer );
+rc = allocate_domain_resources(hd);
+if ( rc )
+return rc;
 
 if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
 valid = 0;
@@ -151,6 +169,8 @@ static void amd_iommu_setup_domain_devic
 
 amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
 }
+
+return 0;
 }
 
 int __init acpi_ivrs_init(void)
@@ -220,17 +240,6 @@ int amd_iommu_alloc_root(struct domain_i
 return 0;
 }
 
-static int __must_check allocate_domain_resources(struct domain_iommu *hd)
-{
-int rc;
-
-spin_lock(>arch.mapping_lock);
-rc = amd_iommu_alloc_root(hd);
-spin_unlock(>arch.mapping_lock);
-
-return rc;
-}
-
 int amd_iommu_get_paging_mode(unsigned long entries)
 {
 int level = 1;
@@ -287,6 +296,10 @@ static void amd_iommu_disable_domain_dev
 int req_id;
 u8 bus =

Re: [Xen-devel] [PATCH v6 1/3] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread Jürgen Groß


On 11.12.19 12:46, Roger Pau Monné wrote:

On Wed, Dec 11, 2019 at 04:26:57AM +, SeongJae Park wrote:

+
return 0;
  }
  subsys_initcall(xenbus_probe_backend_init);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816d5f8c..196260017666 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -104,6 +104,7 @@ struct xenbus_driver {
struct device_driver driver;
int (*read_otherend_details)(struct xenbus_device *dev);
int (*is_ready)(struct xenbus_device *dev);
+   void (*reclaim)(struct xenbus_device *dev);


reclaim_memory (if Juergen agrees).


I do agree.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 1/3] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread SeongJae Park

On Wed, 11 Dec 2019 12:46:51 +0100 "Roger Pau Monné"  
wrote:

> > Granting pages consumes backend system memory.  In systems configured
> > with insufficient spare memory for those pages, it can cause a memory
> > pressure situation.  However, finding the optimal amount of the spare
>   ^ s/the//
> > memory is challenging for large systems having dynamic resource
> > utilization patterns.  Also, such a static configuration might lack
> > flexibility.
> > 
> > To mitigate such problems, this commit adds a memory reclaim callback to
> > 'xenbus_driver'.  If a memory pressure is detected, 'xenbus' requests
>^ s/a//
> > every backend driver to volunarily release its memory.
> > 
> > Note that it would be able to improve the callback facility for more
> ^ possible
> > sophisticated handlings of general pressures.  For example, it would be
> ^ handling of resource starvation.
> > possible to monitor the memory consumption of each device and issue the
> > release requests to only devices which causing the pressure.  Also, the
> > callback could be extended to handle not only memory, but general
> > resources.  Nevertheless, this version of the implementation defers such
> > sophisticated goals as a future work.
> > 
> > Reviewed-by: Juergen Gross 
> > Signed-off-by: SeongJae Park 
> > ---
> >  drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
> >  include/xen/xenbus.h  |  1 +
> >  2 files changed, 33 insertions(+)
> > 
> > diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c 
> > b/drivers/xen/xenbus/xenbus_probe_backend.c
> > index b0bed4faf44c..aedbe2198de5 100644
> > --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> > +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> > @@ -248,6 +248,35 @@ static int backend_probe_and_watch(struct 
> > notifier_block *notifier,
> > return NOTIFY_DONE;
> >  }
> >  
> > +static int xenbus_backend_reclaim(struct device *dev, void *data)
> 
> No need for the xenbus_ prefix since it's a static function, ie:
> backend_reclaim_memory should be fine IMO.

Agreed, will change the name in the next version.

> 
> > +{
> > +   struct xenbus_driver *drv;
> 
> I've asked for this variable to be constified in v5, is it not
> possible to make it const?

Sorry, my mistake...  I was difinitely too hurry.

> 
> > +
> > +   if (!dev->driver)
> > +   return 0;
> > +   drv = to_xenbus_driver(dev->driver);
> > +   if (drv && drv->reclaim)
> > +   drv->reclaim(to_xenbus_device(dev));
> > +   return 0;
> > +}
> > +
> > +/*
> > + * Returns 0 always because we are using shrinker to only detect memory
> > + * pressure.
> > + */
> > +static unsigned long xenbus_backend_shrink_count(struct shrinker *shrinker,
> > +   struct shrink_control *sc)
> > +{
> > +   bus_for_each_dev(_backend.bus, NULL, NULL,
> > +   xenbus_backend_reclaim);
> > +   return 0;
> > +}
> > +
> > +static struct shrinker xenbus_backend_shrinker = {
> 
> I would drop the xenbus prefix, and I think it's not possible to
> constify this due to register_shrinker expecting a non-const
> parameter?

Yes, constifying it results in another compile warning.  Will drop the prefix.

> 
> > +   .count_objects = xenbus_backend_shrink_count,
> > +   .seeks = DEFAULT_SEEKS,
> > +};
> > +
> >  static int __init xenbus_probe_backend_init(void)
> >  {
> > static struct notifier_block xenstore_notifier = {
> > @@ -264,6 +293,9 @@ static int __init xenbus_probe_backend_init(void)
> >  
> > register_xenstore_notifier(_notifier);
> >  
> > +   if (register_shrinker(_backend_shrinker))
> > +   pr_warn("shrinker registration failed\n");
> 
> Can you add a xenbus prefix to the error message? Or else it's hard to
> know which subsystem is complaining when you see such message on the
> log. ie: "xenbus: shrinker ..."

Because we have #define `pr_fmt(fmt) KBUILD_MODNAME ": " fmt` in the beginning
of the file, the message will have a proper prefix.

> 
> > +
> > return 0;
> >  }
> >  subsys_initcall(xenbus_probe_backend_init);
> > diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> > index 869c816d5f8c..196260017666 100644
> > --- a/include/xen/xenbus.h
> > +++ b/include/xen/xenbus.h
> > @@ -104,6 +104,7 @@ struct xenbus_driver {
> > struct device_driver driver;
> > int (*read_otherend_details)(struct xenbus_device *dev);
> > int (*is_ready)(struct xenbus_device *dev);
> > +   void (*reclaim)(struct xenbus_device *dev);
> 
> reclaim_memory (if Juergen agrees).

Okay.


Thanks,
SeongJae Park

> 
> Thanks, Roger.
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen Security Advisory 308 v3 (CVE-2019-19583) - VMX: VMentry failure with debug exceptions and blocked states

2019-12-11 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2019-19583 / XSA-308
   version 3

 VMX: VMentry failure with debug exceptions and blocked states

UPDATES IN VERSION 3


Public release.

Updated metadata to add 4.13, update StableRef's

ISSUE DESCRIPTION
=

Please see XSA-260 for background on the MovSS shadow:
  http://xenbits.xen.org/xsa/advisory-260.html

Please see XSA-156 for background on the need for #DB interception:
  http://xenbits.xen.org/xsa/advisory-156.html

The VMX VMEntry checks does not like the exact combination of state
which occurs when #DB in intercepted, Single Stepping is active, and
blocked by STI/MovSS is active, despite this being a legitimate state to
be in.  The resulting VMEntry failure is fatal to the guest.

IMPACT
==

HVM/PVH guest userspace code may be able to crash the guest, resulting
in a guest Denial of Service.

VULNERABLE SYSTEMS
==

All versions of Xen are affected.

Only systems supporting VMX hardware virtual extensions (Intel, Cyrix or
Zhaoxin CPUs) are affected. Arm and AMD systems are unaffected.

Only HVM/PVH guests are affected.  PV guests cannot leverage the
vulnerability.

MITIGATION
==

Running only PV guests will avoid this vulnerability.

Running HVM guests on only AMD hardware will also avoid this
vulnerability.

CREDITS
===

This issue was discovered by Håkon Alstadheim and diagnosed as a
security issue by Andrew Cooper of Citrix.

RESOLUTION
==

Applying the attached patch resolves this issue.

xsa308.patch   xen-unstable, Xen 4.13.x .. Xen 4.8.x

$ sha256sum xsa308*
4aa06d21478d9debb12388ff14d8abc31982e18895db40d0cec78fcc9fe68ef2  xsa308.meta
7e782b09b16f7534c8db52042f7bb3bd730d108571c8b10af184ae0b02fdae9d  xsa308.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl3w3FsMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZWHwIAIfuiZE/IyxMwTAkZL3EugBnlxxHodoBuj6imn+n
c9DvMk3TCi3vSgvZQtVpP0eNuuLN5285hVyI95lRE0LTmtRLc7jATktStRTgGkua
znW8U1sqkVRWJcVuN4uAM2zIY60pMZnFjZxdJW12+wpcA13LInE1cDWnlRv+cdD9
7DtVkGUWXjfbcm3KXGZw8YpKvTgVp983VpywR/1lzXZ+MexWzKuEco8fZFayw0ne
3nT/23Y1ofjCflNFjc7HoeJZl+zy493J/rqHS8yYI3d4vTdIfjue3rZ/X6305el9
zjCG5zXygrWVAoKGWVnPZweX1jw8rd6BlsPTqQb53UH94zc=
=yTxW
-END PGP SIGNATURE-


xsa308.meta
Description: Binary data


xsa308.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen Security Advisory 310 v3 (CVE-2019-19580) - Further issues with restartable PV type change operations

2019-12-11 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2019-19580 / XSA-310
   version 3

  Further issues with restartable PV type change operations

UPDATES IN VERSION 3


Public release.

Updated metadata to add 4.13, update StableRef's

ISSUE DESCRIPTION
=

XSA-299 addressed several critical issues in restartable PV type
change operations.  Despite extensive testing and auditing, some
corner cases were missed.

IMPACT
==

A malicious PV guest administrator may be able to escalate their
privilege to that of the host.

VULNERABLE SYSTEMS
==

All security-supported versions of Xen are vulnerable.

Only x86 systems are affected.  Arm systems are not affected.

Only x86 PV guests can leverage the vulnerability.  x86 HVM and PVH
guests cannot leverage the vulnerability.

Note that these attacks require very precise timing, which may
be difficult to exploit in practice.

MITIGATION
==

Running only HVM or PVH guests will avoid this vulnerability.

Running PV guests in "shim" mode will also avoid this vulnerability.

CREDITS
===

This issue was discovered by Sarah Newman at prgmr.com.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa310/*.patch   xen-unstable, Xen 4.13 - 4.10
xsa310-4.9/*.patch   Xen 4.9 - 4.8

$ sha256sum xsa310* xsa310*/*
2208e40c71aa521ae487782bd751963ce696be451d10a179fcecdff7a0065369  xsa310.meta
8e75f0fb5fe890a661c8d46ec622131bc650f1a95b170b99569b50dd2224616c  
xsa310-4.9/0001-x86-mm-Set-old_guest_table-when-destroying-vcpu-page.patch
3da404a0c088936ed92377ccef1fa6fdeb23900358ca9284e3488e8e1dcb5dd2  
xsa310-4.9/0002-x86-mm-alloc-free_lN_table-Retain-partial_flags-on-E.patch
cd1a77c2f767474dcfbd1e6282ad3219ce2abcac2021b040120d40b52fc76bc8  
xsa310-4.9/0003-x86-mm-relinquish_memory-Grab-an-extra-type-ref-when.patch
44c670a1b1b8164202766d52fb741e62c104118525eb7a3e56f4b232bcb8be3f  
xsa310/0001-x86-mm-Set-old_guest_table-when-destroying-vcpu-page.patch
173dc0ffb4c572c8493bd9d5f3309b113e51888bdc9e462c78933f5c85f69b7a  
xsa310/0002-x86-mm-alloc-free_lN_table-Retain-partial_flags-on-E.patch
1833fbfc2cdea9b37f161b09df947dffdd8db5e60a2f3512913de0e0c0d4b3ef  
xsa310/0003-x86-mm-relinquish_memory-Grab-an-extra-type-ref-when.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.


(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl3w3F0MHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZ1noH/i6Sb3F6ZiaSl460OvdCRKd9lZm3ONunOH4IHuc6
+Q/G0G4b48UYfK/8FSAAjldv8tPOA5+j3GAFr2JgVtTWjP7tZyzSs0tDvn37sZrZ
D3l0AeOHxLCuSRxnoRDtpKiuJv71DrnYEfCDdc6R4DTZuciOWYpYq6PQTac5bLZX
8G5nR+33SvzdIpncvONa0Xqm1+Cgy8yOOQQJHeQvN7GJfVvs6AHepU5zuP2Ez42W
ReNA6o13xwiI8LGKvf8cV7s74JklIxR9gzkv4bBtMKInUY2loSIbKpI8E9GsVa3n
VOJ2kwKgGgszewBoVyJdGYY1ZlXeIdPjOj7+575bsRnDlGo=
=f2/B
-END PGP SIGNATURE-


xsa310.meta
Description: Binary data


xsa310-4.9/0001-x86-mm-Set-old_guest_table-when-destroying-vcpu-page.patch
Description: Binary data


xsa310-4.9/0002-x86-mm-alloc-free_lN_table-Retain-partial_flags-on-E.patch
Description: Binary data


xsa310-4.9/0003-x86-mm-relinquish_memory-Grab-an-extra-type-ref-when.patch
Description: Binary data


xsa310/0001-x86-mm-Set-old_guest_table-when-destroying-vcpu-page.patch
Description: Binary data


xsa310/0002-x86-mm-alloc-free_lN_table-Retain-partial_flags-on-E.patch
Description: Binary data


xsa310/0003-x86-mm-relinquish_memory-Grab-an-extra-type-ref-when.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen Security Advisory 309 v3 (CVE-2019-19578) - Linear pagetable use / entry miscounts

2019-12-11 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2019-19578 / XSA-309
   version 3

Linear pagetable use / entry miscounts

UPDATES IN VERSION 3


Public release.

Updated metadata to add 4.13, update StableRef's

ISSUE DESCRIPTION
=

"Linear pagetables" is a technique which involves either pointing a
pagetable at itself, or to another pagetable of the same or higher
level.  Xen has limited support for linear pagetables: A page may
either point to itself, or point to another pagetable of the same
level (i.e., L2 to L2, L3 to L3, and so on).

XSA-240 introduced an additional restriction that limited the "depth"
of such chains by allowing pages to either *point to* other pages of
the same level, or *be pointed to* by other pages of the same level,
but not both.  To implement this, we keep track of the number of
outstanding times a page points to or is pointed to another page
table, to prevent both from happening at the same time.

Unfortunately, the original commit introducing this reset this count
when resuming validation of a partially-validated pagetable,
incorrectly dropping some "linear_pt_entry" counts.

If an attacker could engineer such a situation to occur, they might be
able to make loops or other arbitrary chains of linear pagetables, as
described in XSA-240.

IMPACT
==

A malicious or buggy PV guest may cause the hypervisor to crash,
resulting in Denial of Service (DoS) affecting the entire host.
Privilege escalation and information leaks cannot be excluded.

VULNERABLE SYSTEMS
==

All versions of Xen are vulnerable.

Only x86 systems are affected.  Arm systems are not affected.

Only x86 PV guests can leverage the vulnerability.  x86 HVM and PVH
guests cannot leverage the vulnerability.

Only systems which have enabled linear pagetables are vulnerable.
Systems which have disabled linear pagetables, either by selecting
CONFIG_PV_LINEAR_PT=n when building the hypervisor, or adding
pv-linear-pt=false on the command-line, are not vulnerable.

MITIGATION
==

If you don't have any guests which need linear pagetables, you can
disable the feature by adding pv-linear-pt=false to your Xen
command-line.  NetBSD is known to use linear pagetables; Linux and
MiniOS are known not to use linear pagetables.

CREDITS
===

This issue was discovered by Manuel Bouyer and diagnosed as a security
issue by Jan Beulich of SUSE.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa309.patch   xen-unstable, Xen 4.13 - Xen 4.8

$ sha256sum xsa309*
ddd00dfbc85bada4e4cee8a51b989e3138cc47c58992657054246bc95c8ae34d  xsa309.meta
0e4b75f4416624de698f3ed619c28418917ab0a5c9663c1641804e1d0a0dec1b  xsa309.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches described above (or others which are
substantially similar) is permitted during the embargo, even on
public-facing systems with untrusted guest users and administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Deployment of the `pv-linear-pt=false` mitigation is NOT permitted
(except where all the affected systems and VMs are administered and
used only by organisations which are members of the Xen Project
Security Issues Predisclosure List).  Specifically, deployment on
public cloud systems is NOT permitted.

This is because someone may notice the feature going away, and armed
with the knowledge of where the issue is, re-discover it.

Deployment of the mitigation is permitted only AFTER the embargo ends.
Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.


(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl3w3FwMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZt+QIAL4wU2XUXRQZFk4uS9m4EYV3tlzOidJVcAOvr4pC
x9O0rCRrUTnXvaqDj/X7fqPC4e/uHy4yPgg2gnRqb4y/jXJexPBkY/fsZJ64JdWJ
Fo+0a9CK8IrlzhXFcxVff49kUC3Vv/X2FMa5mY07wfg3ww2qyh9rUiKSFEX4B8vV
6lfMbFZNyOiO2vm1RnQzUCRnUeHnLXmR22BIvwLX6496qoI/ubHDBOK8NX0RU81e
N1wdKlOlfmX1SuXfYzKPcdulmKLHnxiVgxG5FAsaQ5At3luA0+WEn5scoBXG99uB
e6EkbmDpLabceQufMPR7Bvad3uVSzg3qLe/NvW4bd4Fvzb0=
=Td+m
-END PGP SIGNATURE-


xsa309.meta
Description: Binary data


xsa309.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org

[Xen-devel] Xen Security Advisory 311 v4 (CVE-2019-19577) - Bugs in dynamic height handling for AMD IOMMU pagetables

2019-12-11 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2019-19577 / XSA-311
   version 4

 Bugs in dynamic height handling for AMD IOMMU pagetables

UPDATES IN VERSION 4


Public release.

Re-base 4.12 patch onto latest stable tree commits.

Updated metadata to add 4.13, update StableRef's

ISSUE DESCRIPTION
=

When running on AMD systems with an IOMMU, Xen attempted to
dynamically adapt the number of levels of pagetables (the pagetable
height) in the IOMMU according to the guest's address space size.  The
code to select and update the height had several bugs.

Notably, the update was done without taking a lock which is necessary
for safe operation.

IMPACT
==

A malicious guest administrator can cause Xen to access data
structures while they are being modified, causing Xen to crash.
Privilege escalation is thought to be very difficult but cannot be
ruled out.

Additionally, there is a potential memory leak of 4kb per guest boot,
under memory pressure.

VULNERABLE SYSTEMS
==

Only Xen on AMD CPUs is vulnerable.  Xen running on Intel CPUs is not
vulnerable.  ARM systems are not vulnerable.

Only systems where guests are given direct access to physical devices
are vulnerable.  Systems which do not use PCI pass-through are not
vulnerable.

Only HVM guests can exploit the vulnerability.  PV and PVH guests
cannot.

All versions of Xen with IOMMU support are vulnerable.

MITIGATION
==

In some configurations, use of passthrough can be replaced with a
higher-level protocol such as Xen PV block or network devices.
There is no other mitigation.

CREDITS
===

This issue was discovered by Sander Eikelenboom, along with Andrew Cooper of
Citrix.

RESOLUTION
==

Applying the appropriate (set of) attached patch(es) resolves this issue.

xsa311.patch   xen-unstable, Xen 4.13.x
xsa311-4.12.patch  Xen 4.12.x
xsa311-4.11.patch  Xen 4.11.x
xsa311-4.10-*.patchXen 4.10.x
xsa311-4.9-*.patch Xen 4.9.x
xsa311-4.8-*.patch Xen 4.8.x

$ sha256sum xsa311*
ea929752043b5d4659cb605314887441daa33ee6450e755d6f077e57fc7abf9e  xsa311.meta
732975f33b6d893b984540c4c748eb5cdf1cf81bd565e41b57795458cae3ccad  xsa311.patch
27e30da9360eec850f6e7d8f2ea465d2f00a5a5a45c43042e4c18786c6c9338f  
xsa311-4.8-1.patch
6e2372eb18f3ca25093445a93bcdf674ed2d7d3012e8611911ea2b9ca8d58bd4  
xsa311-4.8-2.patch
c73bee7aa8fac02d0982b4fb21de053918f80cc0158bd5bfca68e3dc994759be  
xsa311-4.9-1.patch
e89f5c381bd6a8fa8c5f63a829b586fdbefefe311c0f1084d2baeea3e933da66  
xsa311-4.9-2.patch
c73bee7aa8fac02d0982b4fb21de053918f80cc0158bd5bfca68e3dc994759be  
xsa311-4.10-1.patch
189a51048ad88efd855e6e78a307fff68e0c139225ce528c253558d266fffe02  
xsa311-4.10-2.patch
1aaf26d1c231c8b5dd00900c00c18bf884d23b9568c9746866d92f39daf1c02f  
xsa311-4.11.patch
5f43fa4628f6d1a8f6f903e662226a09524b8c354e06e1a6039837db656c0218  
xsa311-4.12.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.


(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl3w3F8MHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZgF0IAIOtY9LMbRkBWgc16lOs+MTDOC7h4fYqofjQetFN
wAJ2Q3w2QXN+Zt54L8dmc6+Zzvn9Do4AJeMvfCzFxuw2OaMBwcwI9DcEbZ+CvYsa
hiXf9xKBBEfCu8PjisRnBqKuyqrLQdBSad9vXcGOVloXiFzJ1wbKnSMBNig9ZTi2
us3c9MeUTnf95W/KTQNe2Gu8KQiogzzBUUifdB6YU0MNNhL60OzfSwgautD9XHfA
+NcRogDnf6KgAs6VKgHSDxyVWbvnaWvKWGF2M2QXwXHjqCH/ox87OIIgZ/HSodXB
e07vCaweCG4GgWDGQN5K3+9Cu1B6+t0RYzPYmuhPDy/kWF0=
=RJ0B
-END PGP SIGNATURE-


xsa311.meta
Description: Binary data


xsa311.patch
Description: Binary data


xsa311-4.8-1.patch
Description: Binary data


xsa311-4.8-2.patch
Description: Binary data


xsa311-4.9-1.patch
Description: Binary data


xsa311-4.9-2.patch
Description: Binary data


xsa311-4.10-1.patch
Description: Binary data


xsa311-4.10-2.patch
Description: Binary data


xsa311-4.11.patch
Description: Binary data


xsa311-4.12.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org

[Xen-devel] Xen Security Advisory 307 v3 (CVE-2019-19581, CVE-2019-19582) - find_next_bit() issues

2019-12-11 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2019-19581,CVE-2019-19582 / XSA-307
  version 3

find_next_bit() issues

UPDATES IN VERSION 3


Public release.

Updated metadata to add 4.13, update StableRef's

ISSUE DESCRIPTION
=

In a number of places bitmaps are being used by the hypervisor to track
certain state.  Iteration over all bits involves functions which may
misbehave in certain corner cases:
- - On 32-bit Arm accesses to bitmaps with bit a count which is a multiple
  of 32, an out of bounds access may occur.  (CVE-2019-19581)
- - On x86 accesses to bitmaps with a compile time known size of 64 may
  incur undefined behavior, which may in particular result in infinite
  loops. (CVE-2019-19582)

IMPACT
==

A malicious guest may cause a hypervisor crash or hang, resulting in a
Denial of Service (DoS).

VULNERABLE SYSTEMS
==

All versions of Xen are vulnerable.

32-bit Arm systems are vulnerable.

x86 systems with 64 or more nodes are vulnerable.  We are unaware of any
such systems that Xen would run on.

64-bit Arm systems as well as x86 systems with less than 64 nodes are
not vulnerable.

MITIGATION
==

There is no known mitigation for 32-bit Arm systems.

For x86 systems the issue can be avoided by suppressing the use of NUMA
information provided by firmware, via the "numa=off" command line
option.

RESOLUTION
==

Applying the attached patch resolves this issue.

xsa307.patch   xen-unstable, Xen 4.13.x ... 4.8.x

$ sha256sum xsa307*
e589e96a0b3ec66f1d2d6393b82fab13ed18fd9fb112044a12263336b8499c68  xsa307.meta
7df052768cc05329bc44bf724897227885da8bb2cde9ff01d0ba2a34611bde97  xsa307.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl3w24gMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZxokH/2bGTmGUZP0tyc+oDHjlrr3+FarhoJnRTl4EoqJS
hzsa5OkcqzcEgrQ+7VL7dLW3AboT2zcx2RQ9HyxCz61BfDY1XF8EDDr6chJiNofN
J7OGirNzSBHFFQJOc2KFG8al+1F8WzzKP3UMbqNBrqB07/tQc5lttdbA/t5Tnp9c
xreCAkkBscDk1LFR8HiUA3YeykiHQtF09O+VnxXO2AD/Dpo8e+K6AmJkCZ4+ysNP
JKMc13vQ3UKjMmYzgbuNCIswNu1Wy3EnNZMf2zvGIhuw6iN6vSJJgoz0OSPUb4yY
kXEe1dlgseSbMxXEqj4IyZ69pEw6Ijj+H6PybQo/IOie7q0=
=7XWU
-END PGP SIGNATURE-


xsa307.meta
Description: Binary data


xsa307.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/6] xen/hypercall: Cope with -ERESTART on more hypercall paths

2019-12-11 Thread Julien Grall


Hi,

On 09/12/2019 17:37, Andrew Cooper wrote:

On 08/12/2019 12:57, Julien Grall wrote:

Hi Andrew,

On 05/12/2019 22:30, Andrew Cooper wrote:

These hypercalls each use continue_hypercall_on_cpu(), whose API is
about to
switch to use -ERESTART.  Update the soon-to-be affected paths to cope,
folding existing contination logic where applicable.

In addition:
   * For platform op and sysctl, insert a cpu_relax() into what is
otherwise a
     tight spinlock loop, and make the continuation logic common at the
     epilogue.
   * Contrary to the comment in the code, kexec_exec() does return in the
     KEXEC_REBOOT case, needs to pass ret back to the caller.


It is not entirely trivial to me that KEXEC_REBOOT refers to
KEXEC_DEFAULT_TYPE. The more that if you look at the kexec_reboot()
helper, it will not return (see BUG()). What may return is
continue_hypercall_on_cpu().

So would it make sense to use KEXEC_DEFAULT_TYPE?


I'm not sure why I capitalised it, but no - using KEXEC_DEFAULT_TYPE is
worse.  A casual reader is far more likely to understand kexec_reboot()
in this context.


But kexec_reboot() cannot return (see BUG()) so a reader may not 
understand why you suggest that it will return. So I think this want to 
be reworded.


[...]


@@ -816,6 +819,13 @@ ret_t
do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
    out:
   spin_unlock(_lock);
   +    if ( ret == -ERESTART )
+    {
+    create_continuation:


Shall we indent create_continuation the same way as out?


They have different scopes, and while it may look weird, this is in
accordance with our style.



[...]


@@ -1263,13 +1263,25 @@ static int do_kexec_op_internal(unsigned long
op,
     long do_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void)
uarg)
   {
-    return do_kexec_op_internal(op, uarg, 0);
+    int ret = do_kexec_op_internal(op, uarg, 0);

Shouldn't it be long (or unsigned long) here? Otherwise, the return of
hypercall_create_continuation() may be truncated.


If you're concerned about truncation via this pattern, then there are
other areas of the code to be worred about.


I knew you would mention the other places :).



However, there is nothing to truncate.  The return value of
hypercall_create_continuation() is the primary hypercall number, i.e.
__HYPERVISOR_kexec_op in this case.


Make sense. And I guess the signed bit will be propagated so if 
do_kexec_op() return -EINVAL (32-bit), then it would still be -EINVAL 
(64-bit).


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 3/3] xen/blkback: Remove unnecessary static variable name prefixes

2019-12-11 Thread Roger Pau Monné

On Wed, Dec 11, 2019 at 04:27:33AM +, SeongJae Park wrote:
> A few of static variables in blkback have 'xen_blkif_' prefix, though it
> is unnecessary for static variables.  This commit removes such prefixes.
> 
> Signed-off-by: SeongJae Park 

Thanks.

Reviewed-by: Roger Pau Monné 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 1/2] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread SeongJae Park

On Wed, 11 Dec 2019 11:51:12 +0100 "Roger Pau Monné"  
wrote:

> > On Tue, 10 Dec 2019 11:16:35 +0100 "Roger Pau Monné"  
> > wrote:
> > > > diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> > > > index 869c816d5f8c..cdb075e4182f 100644
> > > > --- a/include/xen/xenbus.h
> > > > +++ b/include/xen/xenbus.h
> > > > @@ -104,6 +104,7 @@ struct xenbus_driver {
> > > > struct device_driver driver;
> > > > int (*read_otherend_details)(struct xenbus_device *dev);
> > > > int (*is_ready)(struct xenbus_device *dev);
> > > > +   unsigned (*reclaim)(struct xenbus_device *dev);
> > > 
> > > ... hence I wonder why it's returning an unsigned when it's just
> > > ignored.
> > > 
> > > IMO it should return an int to signal errors, and the return should be
> > > ignored.
> > 
> > I first thought similarly and set the callback to return something.  
> > However,
> > as this callback is called to simply notify the memory pressure and ask the
> > driver to free its memory as many as possible, I couldn't easily imagine 
> > what
> > kind of errors that need to be handled by its caller can occur in the 
> > callback,
> > especially because current blkback's callback implementation has no such 
> > error.
> > So, if you and others agree, I would like to simply set the return type to
> > 'void' for now and defer the error handling to a future change.
> 
> Yes, I also wondered the same, but seeing you returned an integer I
> assumed there was interest in returning some kind of value. If there's
> nothing to return let's just make it void.
> 
> > > 
> > > Also, I think it would preferable for this function to take an extra
> > > parameter to describe the resource the driver should attempt to free
> > > (ie: memory or interrupts for example). I'm however not able to find
> > > any existing Linux type to describe such resources.
> > 
> > Yes, such extention would be the right direction.  However, because there 
> > is no
> > existing Linux type to describe the type of resources to reclaim as you also
> > mentioned, there could be many different opinions about its implementation
> > detail.  In my opinion, it could be also possible to simply add another
> > callback for another resource type.  That said, because currently we have an
> > use case and an implementation for the memory pressure only, I would like to
> > let it as is for now and defer the extension as a future work, if you and
> > others have no objection.
> 
> Ack, can I please ask the callback to be named reclaim_memory or some
> such then?

Yes, I will change the name.


Thanks,
SeongJae Park

> 
> Thanks, Roger.
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 2/2] xen/blkback: Squeeze page pools if a memory pressure is detected

2019-12-11 Thread SeongJae Park

On Wed, 11 Dec 2019 12:14:44 +0100 "Roger Pau Monné"  
wrote:

> 
> I see that you have already sent v6, for future iterations can you
> please wait until the conversation on the previous version has been
> settled?
> 
> I'm still replying to your replies to v5, and hence you should hold off
> sending v6 until we get some kind of conclusion/agreement.

Sorry, I was inpatient.

> 
> On Wed, Dec 11, 2019 at 05:08:12AM +0100, SeongJae Park wrote:
> > On Tue, 10 Dec 2019 12:04:32 +0100 "Roger Pau Monné"  
> > wrote:
> > 
> > > > Each `blkif` has a free pages pool for the grant mapping.  The size of
> > > > the pool starts from zero and be increased on demand while processing
> > > > the I/O requests.  If current I/O requests handling is finished or 100
> > > > milliseconds has passed since last I/O requests handling, it checks and
> > > > shrinks the pool to not exceed the size limit, `max_buffer_pages`.
> > > > 
> > > > Therefore, `blkfront` running guests can cause a memory pressure in the
> > > > `blkback` running guest by attaching a large number of block devices and
> > > > inducing I/O.
> > > 
> > > Hm, I don't think this is actually true. blkfront cannot attach an
> > > arbitrary number of devices, blkfront is just a frontend for a device
> > > that's instantiated by the Xen toolstack, so it's the toolstack the one
> > > that controls the amount of PV block devices.
> > 
> > Right, the problem can occur only if it is mis-configured so that the 
> > frontend
> > running guests can attach a large number of devices which is enough to cause
> > the memory pressure.  I tried to explain it in below paragraph, but seems 
> > above
> > paragraph is a little bit confusing.  I will wordsmith the sentence in the 
> > next
> > version.
> 
> I would word it along these lines:
> 
> "Host administrators can cause memory pressure in blkback by attaching
> a large number of block devices and inducing I/O."

Hmm, much better :)

> 
> > > 
> > > > System administrators can avoid such problematic
> > > > situations by limiting the maximum number of devices each guest can
> > > > attach.  However, finding the optimal limit is not so easy.  Improper
> > > > set of the limit can results in the memory pressure or a resource
> > > > underutilization.  This commit avoids such problematic situations by
> > > > squeezing the pools (returns every free page in the pool to the system)
> > > > for a while (users can set this duration via a module parameter) if a
> > > > memory pressure is detected.
> > > > 
> > > > Discussions
> > > > ===
> > > > 
> > > > The `blkback`'s original shrinking mechanism returns only pages in the
> > > > pool, which are not currently be used by `blkback`, to the system.  In
> > > > other words, the pages are not mapped with foreign pages.  Because this
> > > ^ that   ^ granted
> > > > commit is changing only the shrink limit but uses the mechanism as is,
> > > > this commit does not introduce improper mappings related security
> > > > issues.
> > > 
> > > That last sentence is hard to parse. I think something like:
> > > 
> > > "Because this commit is changing only the shrink limit but still uses the
> > > same freeing mechanism it does not touch pages which are currently
> > > mapping grants."
> > > 
> > > > 
> > > > Once a memory pressure is detected, this commit keeps the squeezing
> > > > limit for a user-specified time duration.  The duration should be
> > > > neither too long nor too short.  If it is too long, the squeezing
> > > > incurring overhead can reduce the I/O performance.  If it is too short,
> > > > `blkback` will not free enough pages to reduce the memory pressure.
> > > > This commit sets the value as `10 milliseconds` by default because it is
> > > > a short time in terms of I/O while it is a long time in terms of memory
> > > > operations.  Also, as the original shrinking mechanism works for at
> > > > least every 100 milliseconds, this could be a somewhat reasonable
> > > > choice.  I also tested other durations (refer to the below section for
> > > > more details) and confirmed that 10 milliseconds is the one that works
> > > > best with the test.  That said, the proper duration depends on actual
> > > > configurations and workloads.  That's why this commit is allowing users
> > > ^ allows
> > > > to set it as their optimal value via the module parameter.
> > > 
> > > ... to set the duration as a module parameter.
> > 
> > Thank you for great suggestions, I will apply those.
> > 
> > > 
> > > > 
> > > > Memory Pressure Test
> > > > 
> > > > 
> > > > To show how this commit fixes the memory pressure situation well, I
> > > > configured a test environment on a xen-running virtualization system.
> > > > On the `blkfront` running guest instances, I attach a large number of
> > > > network-backed volume devices and induce I/O to those.  Meanwhile, I
> > > > measure the

[Xen-devel] [ovmf test] 144706: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144706 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144706/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 144637
 build-i386-xsm6 xen-buildfail REGR. vs. 144637
 build-amd64-xsm   6 xen-buildfail REGR. vs. 144637
 build-i3866 xen-buildfail REGR. vs. 144637

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf d3add11e87dace180387562d6f1951f2bffbd3d9
baseline version:
 ovmf 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0

Last test of basis   144637  2019-12-09 09:09:49 Z2 days
Failing since144646  2019-12-10 01:39:53 Z1 days8 attempts
Testing same since   144703  2019-12-11 10:09:05 Z0 days2 attempts


People who touched revisions under test:
  Antoine Coeur 
  Bob Feng 
  Jiewen Yao 
  Michael Kubacki 
  Philippe Mathieu-Daude 
  Steven Shi 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit d3add11e87dace180387562d6f1951f2bffbd3d9
Author: Michael Kubacki 
Date:   Wed Nov 20 17:31:24 2019 -0800

MdeModulePkg PeiCore: Improve comment semantics

This patch clarifies wording in several PeiCore comments to improve
reading comprehension.

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Jian J Wang 

commit d39d1260c615b716675f67f5c4e1f4f52df01dad
Author: Michael Kubacki 
Date:   Wed Nov 20 17:10:48 2019 -0800

MdeModulePkg PeiCore: Fix typos

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Philippe Mathieu-Daude 
Reviewed-by: Jian J Wang 

commit 97eedf5dfbaffde33210fd88066247cf0b7d3325
Author: Antoine Coeur 
Date:   Wed Dec 4 12:14:53 2019 +0800

IntelFsp2WrapperPkg: Fix various typos

Fix various typos in comments and documentation.

Cc: Chasel Chiu 
Cc: Nate DeSimone 
Cc: Star Zeng 
Reviewed-by: Philippe Mathieu-Daude 
Signed-off-by: Philippe Mathieu-Daude 
Reviewed-by: Nate DeSimone 
Reviewed-by: Chasel Chiu 
Reviewed-by: Star Zeng 

commit 7e55cf6b48dcd43de46d008b2f12caaad2554503
Author: Jiewen Yao 
Date:   Sat Dec 7 21:41:10 2019 +0800

SecurityPkg/Tcg2Smm: Measure the table before patch.

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1940

According to TCG PFP specification: the ACPI table must be
measured prior to any modification, and the measurement
must be same cross every boot cycle.

There is a fix 3a63c17ebc853cbb27d190729d01e27f68e65b94
for the HID data. However that is not enough.
The LAML/LASA and PCD configuration change may also cause
similar problem.

We need measure the table before any update.

Cc: Jian J Wang 
Cc: Chao Zhang 
Signed-off-by: Jiewen Yao 
Reviewed-by: Chao Zhang 

commit a80032dc44a1071a34f4415a7c5cef5170ee6159
Author: Steven Shi 
Date:   Tue Nov 19 16:22:08 2019 +0800

BaseTools: Remove

Re: [Xen-devel] [PATCH v6 1/3] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread Roger Pau Monné

On Wed, Dec 11, 2019 at 04:26:57AM +, SeongJae Park wrote:
> Granting pages consumes backend system memory.  In systems configured
> with insufficient spare memory for those pages, it can cause a memory
> pressure situation.  However, finding the optimal amount of the spare
  ^ s/the//
> memory is challenging for large systems having dynamic resource
> utilization patterns.  Also, such a static configuration might lack
> flexibility.
> 
> To mitigate such problems, this commit adds a memory reclaim callback to
> 'xenbus_driver'.  If a memory pressure is detected, 'xenbus' requests
   ^ s/a//
> every backend driver to volunarily release its memory.
> 
> Note that it would be able to improve the callback facility for more
^ possible
> sophisticated handlings of general pressures.  For example, it would be
^ handling of resource starvation.
> possible to monitor the memory consumption of each device and issue the
> release requests to only devices which causing the pressure.  Also, the
> callback could be extended to handle not only memory, but general
> resources.  Nevertheless, this version of the implementation defers such
> sophisticated goals as a future work.
> 
> Reviewed-by: Juergen Gross 
> Signed-off-by: SeongJae Park 
> ---
>  drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
>  include/xen/xenbus.h  |  1 +
>  2 files changed, 33 insertions(+)
> 
> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c 
> b/drivers/xen/xenbus/xenbus_probe_backend.c
> index b0bed4faf44c..aedbe2198de5 100644
> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> @@ -248,6 +248,35 @@ static int backend_probe_and_watch(struct notifier_block 
> *notifier,
>   return NOTIFY_DONE;
>  }
>  
> +static int xenbus_backend_reclaim(struct device *dev, void *data)

No need for the xenbus_ prefix since it's a static function, ie:
backend_reclaim_memory should be fine IMO.

> +{
> + struct xenbus_driver *drv;

I've asked for this variable to be constified in v5, is it not
possible to make it const?

> +
> + if (!dev->driver)
> + return 0;
> + drv = to_xenbus_driver(dev->driver);
> + if (drv && drv->reclaim)
> + drv->reclaim(to_xenbus_device(dev));
> + return 0;
> +}
> +
> +/*
> + * Returns 0 always because we are using shrinker to only detect memory
> + * pressure.
> + */
> +static unsigned long xenbus_backend_shrink_count(struct shrinker *shrinker,
> + struct shrink_control *sc)
> +{
> + bus_for_each_dev(_backend.bus, NULL, NULL,
> + xenbus_backend_reclaim);
> + return 0;
> +}
> +
> +static struct shrinker xenbus_backend_shrinker = {

I would drop the xenbus prefix, and I think it's not possible to
constify this due to register_shrinker expecting a non-const
parameter?

> + .count_objects = xenbus_backend_shrink_count,
> + .seeks = DEFAULT_SEEKS,
> +};
> +
>  static int __init xenbus_probe_backend_init(void)
>  {
>   static struct notifier_block xenstore_notifier = {
> @@ -264,6 +293,9 @@ static int __init xenbus_probe_backend_init(void)
>  
>   register_xenstore_notifier(_notifier);
>  
> + if (register_shrinker(_backend_shrinker))
> + pr_warn("shrinker registration failed\n");

Can you add a xenbus prefix to the error message? Or else it's hard to
know which subsystem is complaining when you see such message on the
log. ie: "xenbus: shrinker ..."

> +
>   return 0;
>  }
>  subsys_initcall(xenbus_probe_backend_init);
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 869c816d5f8c..196260017666 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -104,6 +104,7 @@ struct xenbus_driver {
>   struct device_driver driver;
>   int (*read_otherend_details)(struct xenbus_device *dev);
>   int (*is_ready)(struct xenbus_device *dev);
> + void (*reclaim)(struct xenbus_device *dev);

reclaim_memory (if Juergen agrees).

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Julien Grall


Hi Hongyan,

On 11/12/2019 11:28, Xia, Hongyan wrote:

On Wed, 2019-12-11 at 11:10 +, Julien Grall wrote:

Hi Hongyan,
...

While this involves more instructions, how often do we expect the
code
to be called?

Cheers,



I don't expect this to be called very often in the current Xen.
Although with direct map removal, a lot of the memory allocations
(mostly xenheap allocations) will be mapped and unmapped on-demand and
there is a much higher change of merging/shattering.


Thank you for the explanation. In order to merge/shatter, you need the 
buddy allocator to ensure all the xenheap are allocated contiguously. I 
am pretty unconvinced there are a lot of page allocated via xenheap. So 
the chance to have contiguous xenheap allocation is very limited.


But the merging/shattering can be counterproductive. An example short 
memory allocation (we have a few places like that):

   xmalloc(...);
   do something
   xfree(...);

We would end up to merge and then a few ms later shatter again. So it 
feels to me, that merging is probably not worth it (I am planning to 
discuss with Andrew today about it).




However, the series moved all PTEs from xenheap to domheap, and we
might see other things moved to domheap in the future, so we might not
have many things left on xenheap anyway.


Typesafe is an important part of making our code base more secure than 
basic C (such as not mixing type).


In this case, I think if we start to merge/shatter a lot, then we have a 
bigger problem and we may want to consider to remove it (see above) So 
it feels the optimization is not worth it.


Note that I am not maintaining this code, so the final call is on Andrew 
and Jan.


Cheers,

--

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-next 2/7] x86: fix up hyperv-tlfs.h

2019-12-11 Thread Wei Liu

On Tue, Dec 10, 2019 at 04:35:41PM +0100, Jan Beulich wrote:
> On 25.10.2019 11:16, Wei Liu wrote:
> > Do the following:
> > 1. include xen/types.h and xen/bitops.h
> > 2. fix up invocations of BIT macro
> 
> Is it truly BIT(..., UL) in _all_ cases, and not BIT(..., U) in some?

In Linux BIT is 

   #define BIT(nr) (UL(1) << nr)

so yes Linux developers did mean UL everywhere.

I haven't checked them one by one per TLFS's definitions though.

Wei.

> 
> > Signed-off-by: Wei Liu 
> > ---
> > This can be squashed into previous patch if preferred.
> 
> Afaic - yes please.
> 
> Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable test] 144686: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144686 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144686/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 12 
guest-start/debianhvm.repeat fail REGR. vs. 144668

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 16 guest-localmigrate   fail REGR. vs. 144668
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 144668

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 144668
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 144668
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 144668
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 144668
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 144668
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 144668
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 144668
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 144668
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 144668
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  272c18435e93cbf749c096a9552ab5ef0d79a4ca
baseline version:
 xen  4935a5433db28077fe6313f920bbedcd54516cec

Last test of basis   144668  2019-12-10 15:37:57 Z0 days
Testing same since   144686  2019-12-11 01:51:21 Z0 days1

Re: [Xen-devel] [PATCH for-next 5/7] x86: use running_on_hypervisor to gate hypervisor_setup

2019-12-11 Thread Wei Liu

On Tue, Dec 10, 2019 at 05:17:28PM +0100, Jan Beulich wrote:
> On 25.10.2019 11:16, Wei Liu wrote:
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -1577,7 +1577,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >  max_cpus = nr_cpu_ids;
> >  }
> >  
> > -if ( xen_guest )
> > +if ( running_on_hypervisor )
> >  hypervisor_setup();
> 
> This code is using hypervisor_name already, so I guess the patch
> has become unnecessary?

Yes. I basically squashed this patch into my previous series while
reworking that.

This patch here is not needed anymore.

Wei.

> 
> Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-blkback: prevent premature module unload

2019-12-11 Thread Roger Pau Monné

On Tue, Dec 10, 2019 at 02:53:05PM +, Paul Durrant wrote:
> Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem
> cache. This cache is destoyed when xen-blkif is unloaded so it is
> necessary to wait for the deferred free routine used for such objects to
> complete. This necessity was missed in commit 14855954f636 "xen-blkback:
> allow module to be cleanly unloaded". This patch fixes the problem by
> taking/releasing extra module references in xen_blkif_alloc/free()
> respectively.
> 
> Signed-off-by: Paul Durrant 

Reviewed-by: Roger Pau Monné 

One nit below.

> ---
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Jens Axboe 
> ---
>  drivers/block/xen-blkback/xenbus.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index e8c5c54e1d26..59d576d27ca7 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -171,6 +171,15 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
>   blkif->domid = domid;
>   atomic_set(>refcnt, 1);
>   init_completion(>drain_complete);
> +
> + /*
> +  * Because freeing back to the cache may be deferred, it is not
> +  * safe to unload the module (and hence destroy the cache) until
> +  * this has completed. To prevent premature unloading, take an
> +  * extra module reference here and release only when the object
> +  * has been free back to the cache.
^ freed
> +  */
> + __module_get(THIS_MODULE);
>   INIT_WORK(>free_work, xen_blkif_deferred_free);
>  
>   return blkif;
> @@ -320,6 +329,7 @@ static void xen_blkif_free(struct xen_blkif *blkif)
>  
>   /* Make sure everything is drained before shutting down */
>   kmem_cache_free(xen_blkif_cachep, blkif);
> + module_put(THIS_MODULE);
>  }
>  
>  int __init xen_blkif_interface_init(void)
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-tlfs.h from Linux

2019-12-11 Thread Wei Liu

On Wed, Dec 11, 2019 at 11:22:49AM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Wei Liu 
> > Sent: 11 December 2019 11:15
> > To: Jan Beulich 
> > Cc: Durrant, Paul ; Wei Liu ; Wei Liu
> > ; Paul Durrant ; Andrew Cooper
> > ; Michael Kelley ; Xen
> > Development List ; Roger Pau Monné
> > 
> > Subject: Re: [PATCH for-next 1/7] x86: import hyperv-tlfs.h from Linux
> > 
> > On Tue, Dec 10, 2019 at 04:43:30PM +0100, Jan Beulich wrote:
> > > On 10.12.2019 16:37, Durrant, Paul wrote:
> > > >> -Original Message-
> > > >> From: Xen-devel  On Behalf Of
> > Jan
> > > >> Beulich
> > > >> Sent: 10 December 2019 15:34
> > > >> To: Wei Liu 
> > > >> Cc: Wei Liu ; Paul Durrant ;
> > Andrew
> > > >> Cooper ; Michael Kelley
> > > >> ; Xen Development List  > > >> de...@lists.xenproject.org>; Roger Pau Monné 
> > > >> Subject: Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-
> > tlfs.h
> > > >> from Linux
> > > >>
> > > >> On 25.10.2019 11:16, Wei Liu wrote:
> > > >>> Taken from Linux commit b2d8b167e15bb5ec2691d1119c025630a247f649.
> > > >>>
> > > >>> This is a pristine copy from Linux. It is not used yet and probably
> > > >>> doesn't compile. Changes to make it work will come later.
> > > >>>
> > > >>> Signed-off-by: Wei Liu 
> > > >>
> > > >> This coming from Linux and assuming at least a fair part of it is
> > > >> going to be used, in principle
> > > >> Acked-by: Jan Beulich 
> > > >>
> > > >> However, there are many seemingly unnecessary uses of __packed
> > > >> here, which I'd rather not see go in at all (i.e. not be dropped
> > > >> later on, and then potentially missing some). I find ...
> > > >>
> > > >>> +typedef struct _HV_REFERENCE_TSC_PAGE {
> > > >>> + __u32 tsc_sequence;
> > > >>> + __u32 res1;
> > > >>> + __u64 tsc_scale;
> > > >>> + __s64 tsc_offset;
> > > >>> +}  __packed HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
> > > >>
> > > >
> > > > You realise there's a definition of this in the viridian code already,
> > right?
> > >
> > > It looked familiar, but it didn't occur to me to point this out.
> > > Yes, there looks to be room for deduplication...
> > >
> > 
> > I had a plan to make viridian code use this copy directly.
> > 
> 
> I have no objection to that, but I think it ought to be done as part of this 
> series so that we don't end up with long-term duplication.

Sure. That shouldn't be too hard.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Xia, Hongyan

On Wed, 2019-12-11 at 11:10 +, Julien Grall wrote:
> Hi Hongyan,
> ...
> 
> While this involves more instructions, how often do we expect the
> code 
> to be called?
> 
> Cheers,
> 

I don't expect this to be called very often in the current Xen.
Although with direct map removal, a lot of the memory allocations
(mostly xenheap allocations) will be mapped and unmapped on-demand and
there is a much higher change of merging/shattering.

However, the series moved all PTEs from xenheap to domheap, and we
might see other things moved to domheap in the future, so we might not
have many things left on xenheap anyway.

Hongyan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-tlfs.h from Linux

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Wei Liu 
> Sent: 11 December 2019 11:15
> To: Jan Beulich 
> Cc: Durrant, Paul ; Wei Liu ; Wei Liu
> ; Paul Durrant ; Andrew Cooper
> ; Michael Kelley ; Xen
> Development List ; Roger Pau Monné
> 
> Subject: Re: [PATCH for-next 1/7] x86: import hyperv-tlfs.h from Linux
> 
> On Tue, Dec 10, 2019 at 04:43:30PM +0100, Jan Beulich wrote:
> > On 10.12.2019 16:37, Durrant, Paul wrote:
> > >> -Original Message-
> > >> From: Xen-devel  On Behalf Of
> Jan
> > >> Beulich
> > >> Sent: 10 December 2019 15:34
> > >> To: Wei Liu 
> > >> Cc: Wei Liu ; Paul Durrant ;
> Andrew
> > >> Cooper ; Michael Kelley
> > >> ; Xen Development List  > >> de...@lists.xenproject.org>; Roger Pau Monné 
> > >> Subject: Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-
> tlfs.h
> > >> from Linux
> > >>
> > >> On 25.10.2019 11:16, Wei Liu wrote:
> > >>> Taken from Linux commit b2d8b167e15bb5ec2691d1119c025630a247f649.
> > >>>
> > >>> This is a pristine copy from Linux. It is not used yet and probably
> > >>> doesn't compile. Changes to make it work will come later.
> > >>>
> > >>> Signed-off-by: Wei Liu 
> > >>
> > >> This coming from Linux and assuming at least a fair part of it is
> > >> going to be used, in principle
> > >> Acked-by: Jan Beulich 
> > >>
> > >> However, there are many seemingly unnecessary uses of __packed
> > >> here, which I'd rather not see go in at all (i.e. not be dropped
> > >> later on, and then potentially missing some). I find ...
> > >>
> > >>> +typedef struct _HV_REFERENCE_TSC_PAGE {
> > >>> +   __u32 tsc_sequence;
> > >>> +   __u32 res1;
> > >>> +   __u64 tsc_scale;
> > >>> +   __s64 tsc_offset;
> > >>> +}  __packed HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
> > >>
> > >
> > > You realise there's a definition of this in the viridian code already,
> right?
> >
> > It looked familiar, but it didn't occur to me to point this out.
> > Yes, there looks to be room for deduplication...
> >
> 
> I had a plan to make viridian code use this copy directly.
> 

I have no objection to that, but I think it ought to be done as part of this 
series so that we don't end up with long-term duplication.

  Paul

> 
> > Actually, Wei, one more thing I was curious about - what is "tlfs"
> > an acronym of?
> 
> It means "Top-Level Function Specification".
> 
> (I wish Xen had something similar)
> 
> Wei.
> 
> >
> > Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 2/2] xen/blkback: Squeeze page pools if a memory pressure is detected

2019-12-11 Thread Roger Pau Monné

Hello,

I see that you have already sent v6, for future iterations can you
please wait until the conversation on the previous version has been
settled?

I'm still replying to your replies to v5, and hence you should hold off
sending v6 until we get some kind of conclusion/agreement.

On Wed, Dec 11, 2019 at 05:08:12AM +0100, SeongJae Park wrote:
> On Tue, 10 Dec 2019 12:04:32 +0100 "Roger Pau Monné"  
> wrote:
> 
> > > Each `blkif` has a free pages pool for the grant mapping.  The size of
> > > the pool starts from zero and be increased on demand while processing
> > > the I/O requests.  If current I/O requests handling is finished or 100
> > > milliseconds has passed since last I/O requests handling, it checks and
> > > shrinks the pool to not exceed the size limit, `max_buffer_pages`.
> > > 
> > > Therefore, `blkfront` running guests can cause a memory pressure in the
> > > `blkback` running guest by attaching a large number of block devices and
> > > inducing I/O.
> > 
> > Hm, I don't think this is actually true. blkfront cannot attach an
> > arbitrary number of devices, blkfront is just a frontend for a device
> > that's instantiated by the Xen toolstack, so it's the toolstack the one
> > that controls the amount of PV block devices.
> 
> Right, the problem can occur only if it is mis-configured so that the frontend
> running guests can attach a large number of devices which is enough to cause
> the memory pressure.  I tried to explain it in below paragraph, but seems 
> above
> paragraph is a little bit confusing.  I will wordsmith the sentence in the 
> next
> version.

I would word it along these lines:

"Host administrators can cause memory pressure in blkback by attaching
a large number of block devices and inducing I/O."

> > 
> > > System administrators can avoid such problematic
> > > situations by limiting the maximum number of devices each guest can
> > > attach.  However, finding the optimal limit is not so easy.  Improper
> > > set of the limit can results in the memory pressure or a resource
> > > underutilization.  This commit avoids such problematic situations by
> > > squeezing the pools (returns every free page in the pool to the system)
> > > for a while (users can set this duration via a module parameter) if a
> > > memory pressure is detected.
> > > 
> > > Discussions
> > > ===
> > > 
> > > The `blkback`'s original shrinking mechanism returns only pages in the
> > > pool, which are not currently be used by `blkback`, to the system.  In
> > > other words, the pages are not mapped with foreign pages.  Because this
> > ^ that   ^ granted
> > > commit is changing only the shrink limit but uses the mechanism as is,
> > > this commit does not introduce improper mappings related security
> > > issues.
> > 
> > That last sentence is hard to parse. I think something like:
> > 
> > "Because this commit is changing only the shrink limit but still uses the
> > same freeing mechanism it does not touch pages which are currently
> > mapping grants."
> > 
> > > 
> > > Once a memory pressure is detected, this commit keeps the squeezing
> > > limit for a user-specified time duration.  The duration should be
> > > neither too long nor too short.  If it is too long, the squeezing
> > > incurring overhead can reduce the I/O performance.  If it is too short,
> > > `blkback` will not free enough pages to reduce the memory pressure.
> > > This commit sets the value as `10 milliseconds` by default because it is
> > > a short time in terms of I/O while it is a long time in terms of memory
> > > operations.  Also, as the original shrinking mechanism works for at
> > > least every 100 milliseconds, this could be a somewhat reasonable
> > > choice.  I also tested other durations (refer to the below section for
> > > more details) and confirmed that 10 milliseconds is the one that works
> > > best with the test.  That said, the proper duration depends on actual
> > > configurations and workloads.  That's why this commit is allowing users
> > ^ allows
> > > to set it as their optimal value via the module parameter.
> > 
> > ... to set the duration as a module parameter.
> 
> Thank you for great suggestions, I will apply those.
> 
> > 
> > > 
> > > Memory Pressure Test
> > > 
> > > 
> > > To show how this commit fixes the memory pressure situation well, I
> > > configured a test environment on a xen-running virtualization system.
> > > On the `blkfront` running guest instances, I attach a large number of
> > > network-backed volume devices and induce I/O to those.  Meanwhile, I
> > > measure the number of pages that swapped in and out on the `blkback`
> > > running guest.  The test ran twice, once for the `blkback` before this
> > > commit and once for that after this commit.  As shown below, this commit
> > > has dramatically reduced the memory pressure:
> > > 
> > > pswpin

Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-tlfs.h from Linux

2019-12-11 Thread Wei Liu

On Tue, Dec 10, 2019 at 04:43:30PM +0100, Jan Beulich wrote:
> On 10.12.2019 16:37, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Xen-devel  On Behalf Of Jan
> >> Beulich
> >> Sent: 10 December 2019 15:34
> >> To: Wei Liu 
> >> Cc: Wei Liu ; Paul Durrant ; Andrew
> >> Cooper ; Michael Kelley
> >> ; Xen Development List  >> de...@lists.xenproject.org>; Roger Pau Monné 
> >> Subject: Re: [Xen-devel] [PATCH for-next 1/7] x86: import hyperv-tlfs.h
> >> from Linux
> >>
> >> On 25.10.2019 11:16, Wei Liu wrote:
> >>> Taken from Linux commit b2d8b167e15bb5ec2691d1119c025630a247f649.
> >>>
> >>> This is a pristine copy from Linux. It is not used yet and probably
> >>> doesn't compile. Changes to make it work will come later.
> >>>
> >>> Signed-off-by: Wei Liu 
> >>
> >> This coming from Linux and assuming at least a fair part of it is
> >> going to be used, in principle
> >> Acked-by: Jan Beulich 
> >>
> >> However, there are many seemingly unnecessary uses of __packed
> >> here, which I'd rather not see go in at all (i.e. not be dropped
> >> later on, and then potentially missing some). I find ...
> >>
> >>> +typedef struct _HV_REFERENCE_TSC_PAGE {
> >>> + __u32 tsc_sequence;
> >>> + __u32 res1;
> >>> + __u64 tsc_scale;
> >>> + __s64 tsc_offset;
> >>> +}  __packed HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
> >>
> > 
> > You realise there's a definition of this in the viridian code already, 
> > right?
> 
> It looked familiar, but it didn't occur to me to point this out.
> Yes, there looks to be room for deduplication...
> 

I had a plan to make viridian code use this copy directly.


> Actually, Wei, one more thing I was curious about - what is "tlfs"
> an acronym of?

It means "Top-Level Function Specification".

(I wish Xen had something similar)

Wei.

> 
> Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Julien Grall


Hi Hongyan,

On 11/12/2019 10:28, Xia, Hongyan wrote:

On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote:


 ol2e = l2e_from_intpte(
l2e_get_intpte(ol2e) + (PAGE_SIZE <<
PAGETABLE_ORDER));

Of course, as mentioned before, I'm not overly happy to see type
safety lost in case like this one, where it's not needed like e.g.
further up to convert from L3 to L2 entry.



Okay, so I did a comparison between the efficiency of the assembly
under a release build.

The old "type-safe" way requires 16 instructions to prepare the first
l2e, and each iteration of the inner loop of populating l2t requires 7
instructions.

The new type-unsafe way requires 6 to prepare the first l2e, and each
iteration of populating l2t takes 5 instructions.

So the difference of populating l2t is 3600 vs. 2566 instructions,
which is not very small.
While this involves more instructions, how often do we expect the code 
to be called?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Xia, Hongyan

On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote:
> On 09.12.2019 12:48, Hongyan Xia wrote:
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -5151,6 +5151,51 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long
> > v)
> >   flush_area_local((const void *)v, f) : \
> >   flush_area_all((const void *)v, f))
> >  
> > +/* Shatter an l3 entry and populate l2. If virt is passed in, also
> > do flush. */
> > +static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt,
> > bool locking)
> > +{
> > +unsigned int i;
> > +l3_pgentry_t ol3e;
> > +l2_pgentry_t ol2e, *l2t = alloc_xen_pagetable();
> > +
> > +if ( l2t == NULL )
> 
> Nowadays we seem to prefer !l2t in cases like this one.
> 
> > +return -1;
> 
> -ENOMEM please (and then handed on by the caller).
> 
> > +ol3e = *pl3e;
> 
> This could be the variable's initializer.
> 
> > +ol2e = l2e_from_intpte(l3e_get_intpte(ol3e));
> 
> There's nothing "old" about this L2 entry, so its name would better
> be just "l2e" I think.
> 
> > +for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
> > +{
> > +l2e_write(l2t + i, ol2e);
> > +ol2e = l2e_from_intpte(
> > +l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER +
> > PAGE_SHIFT)));
> 
> Indentation looks odd here (also further down). If the first argument
> of a function call doesn't fit on the line and would also be ugly to
> split across lines, what we do is indent it the usual 4 characters
> from the function invocation, i.e. in this case
> 
> ol2e = l2e_from_intpte(
>l2e_get_intpte(ol2e) + (1 << (PAGETABLE_ORDER +
> PAGE_SHIFT)));
> 
> and then slightly shorter
> 
> ol2e = l2e_from_intpte(
>l2e_get_intpte(ol2e) + (PAGE_SIZE <<
> PAGETABLE_ORDER));
> 
> Of course, as mentioned before, I'm not overly happy to see type
> safety lost in case like this one, where it's not needed like e.g.
> further up to convert from L3 to L2 entry.
> 
> > +}
> > +if ( locking )
> > +spin_lock(_pgdir_lock);
> > +if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
> > + (l3e_get_flags(*pl3e) & _PAGE_PSE) )
> > +{
> > +l3e_write_atomic(pl3e,
> > +l3e_from_paddr((paddr_t)virt_to_maddr(l2t),
> > __PAGE_HYPERVISOR));
> > +l2t = NULL;
> > +}
> > +if ( locking )
> > +spin_unlock(_pgdir_lock);
> > +if ( virt )
> > +{
> > +unsigned int flush_flags =
> > +FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
> > +
> > +if ( (l3e_get_flags(ol3e) & _PAGE_GLOBAL) )
> 
> Unnecessary pair of parentheses (which also wasn't there in the
> original code).
> 
> > +flush_flags |= FLUSH_TLB_GLOBAL;
> 
> Too deep indentation.
> 
> > +flush_area(virt, flush_flags);
> > +}
> > +if ( l2t )
> > +free_xen_pagetable(l2t);
> > +
> > +return 0;
> > +}
> 
> Also please add blank lines between
> - L2 population and lock acquire,
> - lock release and TLB flush,
> - TLB flush and free.
> 
> Jan

Issues fixed in v3. I have not touched the type safety part. If we
think this is really important we can revert to what it was before,
although from the quick study I did in my previous email, there is a
performance difference.

Hongyan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Setting up a monthly Xen Project dinner/pub-meeting

2019-12-11 Thread Lars Kurth

And this time with the correct list. Really despairing with Outlook which keeps 
adding random old contacts to my address book and puts them in front of 
frequently used entries (sigh)

From: Lars Kurth 
Date: Wednesday, 11 December 2019 at 10:31
To: "xen-de...@lists.xensource.com" , 
"mirageos-de...@lists.xenproject.org" , 
"win-pv-de...@lists.xenproject.org" , 
"minios-de...@lists.xenproject.org" 
Subject: Setting up a monthly Xen Project dinner/pub-meeting

Hi all,

with quite a few people working on Xen Project across different companies and 
organisations these days, I was wondering whether we should set up a regular 
monthly get-together. I would like to get a sense as to

  1.  Who would be willing to turn up – need to get a sense of numbers, because 
we need to see whether it is necessary to book a table
  2.  What day would be best and what week of the month
  3.  Whether we would always choose the same venue – which I guess partly 
depends on the answer to a). If the core group attending is larger than 8 
people, we probably need to book a table, which is easier if we choose the same 
venue

If the answer to c) is NO, we should probably have a local coordinator and/or 
establish what venues we do like to go through well in advance

Feel free to voice your opinion

Regards
Lars
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 0/2] Refactor super page shattering

2019-12-11 Thread Hongyan Xia

map_pages_to_xen and modify_xen_mappings use almost exactly the same
page shattering logic, and the code is mingled with other PTE
manipulations so it is less obvious that the intention is page
shattering. Factor out the functions to make them reusable and to make
the intention more obvious.

Of course, there is not much difference between the shattering logic of
each level, so we could further turn the per-level functions into a
single macro, although this is not that simple since we have per-level
functions and macros all over the place and there are slight differences
between levels. Keep it per-level for now.

tree:
https://xenbits.xen.org/git-http/people/hx242/xen.git shatter-refactor

---
Changes in v3:
- style and indentation fixes.

Changes in v2:
- rebase.
- improve asm code.
- avoid stale values when taking the lock.
- move allocation of PTE tables inside the shatter function.

Hongyan Xia (2):
  x86/mm: factor out the code for shattering an l3 PTE
  x86/mm: factor out the code for shattering an l2 PTE

 xen/arch/x86/mm.c | 194 +++---
 1 file changed, 98 insertions(+), 96 deletions(-)

-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Hongyan Xia

map_pages_to_xen and modify_xen_mappings are performing almost exactly
the same operations when shattering an l3 PTE, the only difference
being whether we want to flush.

Signed-off-by: Hongyan Xia 

---
Changes in v3:
- style and indentation changes.
- return -ENOMEM instead of -1.

Changes in v2:
- improve asm.
- re-read pl3e from memory when taking the lock.
- move the allocation of l2t inside the shatter function.
---
 xen/arch/x86/mm.c | 98 +++
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7d4dd80a85..97f11b6016 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5151,6 +5151,52 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
  flush_area_local((const void *)v, f) : \
  flush_area_all((const void *)v, f))
 
+/* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. */
+static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking)
+{
+unsigned int i;
+l3_pgentry_t ol3e = *pl3e;
+l2_pgentry_t l2e = l2e_from_intpte(l3e_get_intpte(ol3e));
+l2_pgentry_t *l2t = alloc_xen_pagetable();
+
+if ( !l2t )
+return -ENOMEM;
+
+for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
+{
+l2e_write(l2t + i, l2e);
+l2e = l2e_from_intpte(
+  l2e_get_intpte(l2e) + (PAGE_SIZE << PAGETABLE_ORDER));
+}
+
+if ( locking )
+spin_lock(_pgdir_lock);
+if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
+ (l3e_get_flags(*pl3e) & _PAGE_PSE) )
+{
+l3e_write_atomic(pl3e,
+l3e_from_paddr((paddr_t)virt_to_maddr(l2t), __PAGE_HYPERVISOR));
+l2t = NULL;
+}
+if ( locking )
+spin_unlock(_pgdir_lock);
+
+if ( virt )
+{
+unsigned int flush_flags =
+FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
+
+if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL )
+flush_flags |= FLUSH_TLB_GLOBAL;
+flush_area(virt, flush_flags);
+}
+
+if ( l2t )
+free_xen_pagetable(l2t);
+
+return 0;
+}
+
 int map_pages_to_xen(
 unsigned long virt,
 mfn_t mfn,
@@ -5244,9 +5290,6 @@ int map_pages_to_xen(
 if ( (l3e_get_flags(ol3e) & _PAGE_PRESENT) &&
  (l3e_get_flags(ol3e) & _PAGE_PSE) )
 {
-unsigned int flush_flags =
-FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
-
 /* Skip this PTE if there is no change. */
 if ( ((l3e_get_pfn(ol3e) & ~(L2_PAGETABLE_ENTRIES *
  L1_PAGETABLE_ENTRIES - 1)) +
@@ -5267,33 +5310,9 @@ int map_pages_to_xen(
 continue;
 }
 
-pl2e = alloc_xen_pagetable();
-if ( pl2e == NULL )
+/* Pass virt to indicate we need to flush. */
+if ( shatter_l3e(pl3e, virt, locking) )
 return -ENOMEM;
-
-for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
-l2e_write(pl2e + i,
-  l2e_from_pfn(l3e_get_pfn(ol3e) +
-   (i << PAGETABLE_ORDER),
-   l3e_get_flags(ol3e)));
-
-if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL )
-flush_flags |= FLUSH_TLB_GLOBAL;
-
-if ( locking )
-spin_lock(_pgdir_lock);
-if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
- (l3e_get_flags(*pl3e) & _PAGE_PSE) )
-{
-l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e),
-__PAGE_HYPERVISOR));
-pl2e = NULL;
-}
-if ( locking )
-spin_unlock(_pgdir_lock);
-flush_area(virt, flush_flags);
-if ( pl2e )
-free_xen_pagetable(pl2e);
 }
 
 pl2e = virt_to_xen_l2e(virt);
@@ -5578,27 +5597,8 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 }
 
 /* PAGE1GB: shatter the superpage and fall through. */
-pl2e = alloc_xen_pagetable();
-if ( !pl2e )
+if ( shatter_l3e(pl3e, 0, locking) )
 return -ENOMEM;
-for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
-l2e_write(pl2e + i,
-  l2e_from_pfn(l3e_get_pfn(*pl3e) +
-   (i << PAGETABLE_ORDER),
-   l3e_get_flags(*pl3e)));
-if ( locking )
-spin_lock(_pgdir_lock);
-if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
- (l3e_get_flags(*pl3e) & _PAGE_PSE) )
-{
-l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(pl2e),
-__PAGE_HYPERVISOR));
-pl2e =

[Xen-devel] [PATCH v3 2/2] x86/mm: factor out the code for shattering an l2 PTE

2019-12-11 Thread Hongyan Xia

map_pages_to_xen and modify_xen_mappings are performing almost exactly
the same operations when shattering an l2 PTE, the only difference
being whether we want to flush.

Signed-off-by: Hongyan Xia 

---
Changes in v3:
- style and indentation changes.
- return -ENOMEM instead of -1.

Changes in v2:
- improve asm.
- re-read pl2e from memory when taking the lock.
- move the allocation of l1t inside the shatter function.
---
 xen/arch/x86/mm.c | 96 ---
 1 file changed, 49 insertions(+), 47 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 97f11b6016..e5ba6b52fb 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5151,6 +5151,52 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
  flush_area_local((const void *)v, f) : \
  flush_area_all((const void *)v, f))
 
+/* Shatter an l2 entry and populate l1. If virt is passed in, also do flush. */
+static int shatter_l2e(l2_pgentry_t *pl2e, unsigned long virt, bool locking)
+{
+unsigned int i;
+l2_pgentry_t ol2e = *pl2e;
+l1_pgentry_t l1e = l1e_from_paddr(l2e_get_paddr(ol2e),
+  lNf_to_l1f(l2e_get_flags(ol2e)));
+l1_pgentry_t *l1t = alloc_xen_pagetable();
+
+if ( !l1t )
+return -ENOMEM;
+
+for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
+{
+l1e_write(l1t + i, l1e);
+l1e = l1e_from_intpte(l1e_get_intpte(l1e) + PAGE_SIZE);
+}
+
+if ( locking )
+spin_lock(_pgdir_lock);
+if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
+ (l2e_get_flags(*pl2e) & _PAGE_PSE) )
+{
+l2e_write_atomic(pl2e,
+l2e_from_paddr((paddr_t)virt_to_maddr(l1t), __PAGE_HYPERVISOR));
+l1t = NULL;
+}
+if ( locking )
+spin_unlock(_pgdir_lock);
+
+if ( virt )
+{
+unsigned int flush_flags =
+FLUSH_TLB | FLUSH_ORDER(PAGETABLE_ORDER);
+
+if ( l2e_get_flags(ol2e) & _PAGE_GLOBAL )
+flush_flags |= FLUSH_TLB_GLOBAL;
+flush_area(virt, flush_flags);
+}
+
+if ( l1t )
+free_xen_pagetable(l1t);
+
+return 0;
+}
+
 /* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. */
 static int shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking)
 {
@@ -5364,9 +5410,6 @@ int map_pages_to_xen(
 }
 else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
 {
-unsigned int flush_flags =
-FLUSH_TLB | FLUSH_ORDER(PAGETABLE_ORDER);
-
 /* Skip this PTE if there is no change. */
 if ( (((l2e_get_pfn(*pl2e) & ~(L1_PAGETABLE_ENTRIES - 1)) +
l1_table_offset(virt)) == mfn_x(mfn)) &&
@@ -5385,32 +5428,9 @@ int map_pages_to_xen(
 goto check_l3;
 }
 
-pl1e = alloc_xen_pagetable();
-if ( pl1e == NULL )
+/* Pass virt to indicate we need to flush. */
+if ( shatter_l2e(pl2e, virt, locking) )
 return -ENOMEM;
-
-for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-l1e_write([i],
-  l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
-   lNf_to_l1f(l2e_get_flags(*pl2e;
-
-if ( l2e_get_flags(*pl2e) & _PAGE_GLOBAL )
-flush_flags |= FLUSH_TLB_GLOBAL;
-
-if ( locking )
-spin_lock(_pgdir_lock);
-if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
- (l2e_get_flags(*pl2e) & _PAGE_PSE) )
-{
-l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(pl1e),
-__PAGE_HYPERVISOR));
-pl1e = NULL;
-}
-if ( locking )
-spin_unlock(_pgdir_lock);
-flush_area(virt, flush_flags);
-if ( pl1e )
-free_xen_pagetable(pl1e);
 }
 
 pl1e  = l2e_to_l1e(*pl2e) + l1_table_offset(virt);
@@ -5633,26 +5653,8 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 else
 {
 /* PSE: shatter the superpage and try again. */
-pl1e = alloc_xen_pagetable();
-if ( !pl1e )
+if ( shatter_l2e(pl2e, 0, locking) )
 return -ENOMEM;
-for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-l1e_write([i],
-  l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
-   l2e_get_flags(*pl2e) & ~_PAGE_PSE));
-if ( locking )
-spin_lock(_pgdir_lock);
-if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
-

[Xen-devel] [ovmf test] 144703: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144703 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144703/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 144637
 build-i386-xsm6 xen-buildfail REGR. vs. 144637
 build-amd64-xsm   6 xen-buildfail REGR. vs. 144637
 build-i3866 xen-buildfail REGR. vs. 144637

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a

version targeted for testing:
 ovmf d3add11e87dace180387562d6f1951f2bffbd3d9
baseline version:
 ovmf 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0

Last test of basis   144637  2019-12-09 09:09:49 Z2 days
Failing since144646  2019-12-10 01:39:53 Z1 days7 attempts
Testing same since   144703  2019-12-11 10:09:05 Z0 days1 attempts


People who touched revisions under test:
  Antoine Coeur 
  Bob Feng 
  Jiewen Yao 
  Michael Kubacki 
  Philippe Mathieu-Daude 
  Steven Shi 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit d3add11e87dace180387562d6f1951f2bffbd3d9
Author: Michael Kubacki 
Date:   Wed Nov 20 17:31:24 2019 -0800

MdeModulePkg PeiCore: Improve comment semantics

This patch clarifies wording in several PeiCore comments to improve
reading comprehension.

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Jian J Wang 

commit d39d1260c615b716675f67f5c4e1f4f52df01dad
Author: Michael Kubacki 
Date:   Wed Nov 20 17:10:48 2019 -0800

MdeModulePkg PeiCore: Fix typos

Cc: Dandan Bi 
Cc: Liming Gao 
Cc: Jian J Wang 
Cc: Hao A Wu 
Signed-off-by: Michael Kubacki 
Reviewed-by: Liming Gao 
Reviewed-by: Philippe Mathieu-Daude 
Reviewed-by: Jian J Wang 

commit 97eedf5dfbaffde33210fd88066247cf0b7d3325
Author: Antoine Coeur 
Date:   Wed Dec 4 12:14:53 2019 +0800

IntelFsp2WrapperPkg: Fix various typos

Fix various typos in comments and documentation.

Cc: Chasel Chiu 
Cc: Nate DeSimone 
Cc: Star Zeng 
Reviewed-by: Philippe Mathieu-Daude 
Signed-off-by: Philippe Mathieu-Daude 
Reviewed-by: Nate DeSimone 
Reviewed-by: Chasel Chiu 
Reviewed-by: Star Zeng 

commit 7e55cf6b48dcd43de46d008b2f12caaad2554503
Author: Jiewen Yao 
Date:   Sat Dec 7 21:41:10 2019 +0800

SecurityPkg/Tcg2Smm: Measure the table before patch.

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1940

According to TCG PFP specification: the ACPI table must be
measured prior to any modification, and the measurement
must be same cross every boot cycle.

There is a fix 3a63c17ebc853cbb27d190729d01e27f68e65b94
for the HID data. However that is not enough.
The LAML/LASA and PCD configuration change may also cause
similar problem.

We need measure the table before any update.

Cc: Jian J Wang 
Cc: Chao Zhang 
Signed-off-by: Jiewen Yao 
Reviewed-by: Chao Zhang 

commit a80032dc44a1071a34f4415a7c5cef5170ee6159
Author: Steven Shi 
Date:   Tue Nov 19 16:22:08 2019 +0800

BaseTools: Remove

Re: [Xen-devel] [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Roger Pau Monné 
> Sent: 11 December 2019 10:46
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; linux-ker...@vger.kernel.org; Konrad
> Rzeszutek Wilk ; Jens Axboe ;
> Boris Ostrovsky ; Juergen Gross
> ; Stefano Stabellini 
> Subject: Re: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind
> 
> On Tue, Dec 10, 2019 at 11:33:47AM +, Paul Durrant wrote:
> > By simply re-attaching to shared rings during connect_ring() rather than
> > assuming they are freshly allocated (i.e assuming the counters are zero)
> > it is possible for vbd instances to be unbound and re-bound from and to
> > (respectively) a running guest.
> >
> > This has been tested by running:
> >
> > while true;
> >   do fio --name=randwrite --ioengine=libaio --iodepth=16 \
> >   --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
> >   done
> >
> > in a PV guest whilst running:
> >
> > while true;
> >   do echo vbd-$DOMID-$VBD >unbind;
> >   echo unbound;
> >   sleep 5;
> 
> Is there anyway to know when the unbind has finished? AFAICT
> xen_blkif_disconnect will return EBUSY if there are in flight
> requests, and the disconnect won't be completed until those requests
> are finished.

Yes, the device sysfs node will disappear when remove() completes.

> 
> >   echo vbd-$DOMID-$VBD >bind;
> >   echo bound;
> >   sleep 3;
> >   done
> >
> > in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
> > re-bind its system disk image.
> >
> > This is a highly useful feature for a backend module as it allows it to
> be
> > unloaded and re-loaded (i.e. updated) without requiring domUs to be
> halted.
> > This was also tested by running:
> >
> > while true;
> >   do echo vbd-$DOMID-$VBD >unbind;
> >   echo unbound;
> >   sleep 5;
> >   rmmod xen-blkback;
> >   echo unloaded;
> >   sleep 1;
> >   modprobe xen-blkback;
> >   echo bound;
> >   cd $(pwd);
> >   sleep 3;
> >   done
> >
> > in dom0 whilst running the same loop as above in the (single) PV guest.
> >
> > Some (less stressful) testing has also been done using a Windows HVM
> guest
> > with the latest 9.0 PV drivers installed.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: "Roger Pau Monné" 
> > Cc: Jens Axboe 
> > Cc: Boris Ostrovsky 
> > Cc: Juergen Gross 
> > Cc: Stefano Stabellini 
> >
> > v2:
> >  - Apply a sanity check to the value of rsp_prod and fail the re-attach
> >if it is implausible
> >  - Set allow_rebind to prevent ring from being closed on unbind
> >  - Update test workload from dd to fio (with verification)
> > ---
> >  drivers/block/xen-blkback/xenbus.c | 59 +-
> >  1 file changed, 41 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> blkback/xenbus.c
> > index e8c5c54e1d26..13d09630b237 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring
> *ring, grant_ref_t *gref,
> >  {
> > int err;
> > struct xen_blkif *blkif = ring->blkif;
> > +   struct blkif_common_sring *sring_common;
> > +   RING_IDX rsp_prod, req_prod;
> >
> > /* Already connected through? */
> > if (ring->irq)
> > @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring
> *ring, grant_ref_t *gref,
> > if (err < 0)
> > return err;
> >
> > +   sring_common = (struct blkif_common_sring *)ring->blk_ring;
> > +   rsp_prod = READ_ONCE(sring_common->rsp_prod);
> > +   req_prod = READ_ONCE(sring_common->req_prod);
> > +
> > switch (blkif->blk_protocol) {
> > case BLKIF_PROTOCOL_NATIVE:
> > {
> > -   struct blkif_sring *sring;
> > -   sring = (struct blkif_sring *)ring->blk_ring;
> > -   BACK_RING_INIT(>blk_rings.native, sring,
> > -  XEN_PAGE_SIZE * nr_grefs);
> > +   struct blkif_sring *sring_native =
> > +   (struct blkif_sring *)ring->blk_ring;
> 
> I think you can constify both sring_native and sring_common (and the
> other instances below).

Yes, I can do that. I don't think the macros would mind.

> 
> > +   unsigned int size = __RING_SIZE(sring_native,
> > +   XEN_PAGE_SIZE * nr_grefs);
> > +
> > +   BACK_RING_ATTACH(>blk_rings.native, sring_native,
> > +rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> > +   err = (req_prod - rsp_prod > size) ? -EIO : 0;
> > break;
> > }
> > case BLKIF_PROTOCOL_X86_32:
> > {
> > -   struct blkif_x86_32_sring *sring_x86_32;
> > -   sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
> > -   BACK_RING_INIT(>blk_rings.x86_32, sring_x86_32,
> > -  XEN_PAGE_SIZE * nr_grefs);
> > +   struct blkif_x86_32_sring *sring_x86_32 =
> > +   (struct blkif_x86_32_sring

Re: [Xen-devel] [PATCH v5 1/2] xenbus/backend: Add memory pressure handler callback

2019-12-11 Thread Roger Pau Monné

On Wed, Dec 11, 2019 at 04:50:58AM +0100, SeongJae Park wrote:
> On Tue, 10 Dec 2019 11:16:35 +0100 "Roger Pau Monné"  
> wrote:
> > > diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> > > index 869c816d5f8c..cdb075e4182f 100644
> > > --- a/include/xen/xenbus.h
> > > +++ b/include/xen/xenbus.h
> > > @@ -104,6 +104,7 @@ struct xenbus_driver {
> > >   struct device_driver driver;
> > >   int (*read_otherend_details)(struct xenbus_device *dev);
> > >   int (*is_ready)(struct xenbus_device *dev);
> > > + unsigned (*reclaim)(struct xenbus_device *dev);
> > 
> > ... hence I wonder why it's returning an unsigned when it's just
> > ignored.
> > 
> > IMO it should return an int to signal errors, and the return should be
> > ignored.
> 
> I first thought similarly and set the callback to return something.  However,
> as this callback is called to simply notify the memory pressure and ask the
> driver to free its memory as many as possible, I couldn't easily imagine what
> kind of errors that need to be handled by its caller can occur in the 
> callback,
> especially because current blkback's callback implementation has no such 
> error.
> So, if you and others agree, I would like to simply set the return type to
> 'void' for now and defer the error handling to a future change.

Yes, I also wondered the same, but seeing you returned an integer I
assumed there was interest in returning some kind of value. If there's
nothing to return let's just make it void.

> > 
> > Also, I think it would preferable for this function to take an extra
> > parameter to describe the resource the driver should attempt to free
> > (ie: memory or interrupts for example). I'm however not able to find
> > any existing Linux type to describe such resources.
> 
> Yes, such extention would be the right direction.  However, because there is 
> no
> existing Linux type to describe the type of resources to reclaim as you also
> mentioned, there could be many different opinions about its implementation
> detail.  In my opinion, it could be also possible to simply add another
> callback for another resource type.  That said, because currently we have an
> use case and an implementation for the memory pressure only, I would like to
> let it as is for now and defer the extension as a future work, if you and
> others have no objection.

Ack, can I please ask the callback to be named reclaim_memory or some
such then?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

2019-12-11 Thread Roger Pau Monné

On Tue, Dec 10, 2019 at 11:33:47AM +, Paul Durrant wrote:
> By simply re-attaching to shared rings during connect_ring() rather than
> assuming they are freshly allocated (i.e assuming the counters are zero)
> it is possible for vbd instances to be unbound and re-bound from and to
> (respectively) a running guest.
> 
> This has been tested by running:
> 
> while true;
>   do fio --name=randwrite --ioengine=libaio --iodepth=16 \
>   --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
>   done
> 
> in a PV guest whilst running:
> 
> while true;
>   do echo vbd-$DOMID-$VBD >unbind;
>   echo unbound;
>   sleep 5;

Is there anyway to know when the unbind has finished? AFAICT
xen_blkif_disconnect will return EBUSY if there are in flight
requests, and the disconnect won't be completed until those requests
are finished.

>   echo vbd-$DOMID-$VBD >bind;
>   echo bound;
>   sleep 3;
>   done
> 
> in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
> re-bind its system disk image.
> 
> This is a highly useful feature for a backend module as it allows it to be
> unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
> This was also tested by running:
> 
> while true;
>   do echo vbd-$DOMID-$VBD >unbind;
>   echo unbound;
>   sleep 5;
>   rmmod xen-blkback;
>   echo unloaded;
>   sleep 1;
>   modprobe xen-blkback;
>   echo bound;
>   cd $(pwd);
>   sleep 3;
>   done
> 
> in dom0 whilst running the same loop as above in the (single) PV guest.
> 
> Some (less stressful) testing has also been done using a Windows HVM guest
> with the latest 9.0 PV drivers installed.
> 
> Signed-off-by: Paul Durrant 
> ---
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Jens Axboe 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Cc: Stefano Stabellini 
> 
> v2:
>  - Apply a sanity check to the value of rsp_prod and fail the re-attach
>if it is implausible
>  - Set allow_rebind to prevent ring from being closed on unbind
>  - Update test workload from dd to fio (with verification)
> ---
>  drivers/block/xen-blkback/xenbus.c | 59 +-
>  1 file changed, 41 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index e8c5c54e1d26..13d09630b237 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, 
> grant_ref_t *gref,
>  {
>   int err;
>   struct xen_blkif *blkif = ring->blkif;
> + struct blkif_common_sring *sring_common;
> + RING_IDX rsp_prod, req_prod;
>  
>   /* Already connected through? */
>   if (ring->irq)
> @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, 
> grant_ref_t *gref,
>   if (err < 0)
>   return err;
>  
> + sring_common = (struct blkif_common_sring *)ring->blk_ring;
> + rsp_prod = READ_ONCE(sring_common->rsp_prod);
> + req_prod = READ_ONCE(sring_common->req_prod);
> +
>   switch (blkif->blk_protocol) {
>   case BLKIF_PROTOCOL_NATIVE:
>   {
> - struct blkif_sring *sring;
> - sring = (struct blkif_sring *)ring->blk_ring;
> - BACK_RING_INIT(>blk_rings.native, sring,
> -XEN_PAGE_SIZE * nr_grefs);
> + struct blkif_sring *sring_native =
> + (struct blkif_sring *)ring->blk_ring;

I think you can constify both sring_native and sring_common (and the
other instances below).

> + unsigned int size = __RING_SIZE(sring_native,
> + XEN_PAGE_SIZE * nr_grefs);
> +
> + BACK_RING_ATTACH(>blk_rings.native, sring_native,
> +  rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> + err = (req_prod - rsp_prod > size) ? -EIO : 0;
>   break;
>   }
>   case BLKIF_PROTOCOL_X86_32:
>   {
> - struct blkif_x86_32_sring *sring_x86_32;
> - sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
> - BACK_RING_INIT(>blk_rings.x86_32, sring_x86_32,
> -XEN_PAGE_SIZE * nr_grefs);
> + struct blkif_x86_32_sring *sring_x86_32 =
> + (struct blkif_x86_32_sring *)ring->blk_ring;
> + unsigned int size = __RING_SIZE(sring_x86_32,
> + XEN_PAGE_SIZE * nr_grefs);
> +
> + BACK_RING_ATTACH(>blk_rings.x86_32, sring_x86_32,
> +  rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> + err = (req_prod - rsp_prod > size) ? -EIO : 0;
>   break;
>   }
>   case BLKIF_PROTOCOL_X86_64:
>   {
> - struct blkif_x86_64_sring *sring_x86_64;
> - sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring;
> - BACK_RING_INIT(>blk_rings.x86_64, sring_x86_64,
> -

Re: [Xen-devel] [PATCH v2 1/2] x86/mm: factor out the code for shattering an l3 PTE

2019-12-11 Thread Xia, Hongyan

On Tue, 2019-12-10 at 16:20 +0100, Jan Beulich wrote:
> 
> ol2e = l2e_from_intpte(
>l2e_get_intpte(ol2e) + (PAGE_SIZE <<
> PAGETABLE_ORDER));
> 
> Of course, as mentioned before, I'm not overly happy to see type
> safety lost in case like this one, where it's not needed like e.g.
> further up to convert from L3 to L2 entry.
> 

Okay, so I did a comparison between the efficiency of the assembly
under a release build.

The old "type-safe" way requires 16 instructions to prepare the first
l2e, and each iteration of the inner loop of populating l2t requires 7
instructions.

The new type-unsafe way requires 6 to prepare the first l2e, and each
iteration of populating l2t takes 5 instructions.

So the difference of populating l2t is 3600 vs. 2566 instructions,
which is not very small.

I have not tested the packed bit field way you suggested, but I think
it could even be higher than 3600 due to masking, shifting and also
overflow handling.

Hongyan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Jürgen Groß

On 11.12.19 11:14, Durrant, Paul wrote:

-Original Message-
From: Roger Pau Monné 
Sent: 11 December 2019 10:06
To: Durrant, Paul 
Cc: xen-devel@lists.xenproject.org; linux-ker...@vger.kernel.org; Juergen
Gross ; Stefano Stabellini ;
Boris Ostrovsky 
Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
to closed

On Tue, Dec 10, 2019 at 11:33:45AM +, Paul Durrant wrote:

If a driver probe() fails then leave the xenstore state alone. There is

no

reason to modify it as the failure may be due to transient resource
allocation issues and hence a subsequent probe() may succeed.

If the driver supports re-binding then only force state to closed during
remove() only in the case when the toolstack may need to clean up. This

can

be detected by checking whether the state in xenstore has been set to
closing prior to device removal.

NOTE: Re-bind support is indicated by new boolean in struct

xenbus_driver,

   which defaults to false. Subsequent patches will add support to
   some backend drivers.

My intention was to specify whether you want to close the
backends on unbind in sysfs, so that an user can decide at runtime,
rather than having a hardcoded value in the driver.

Anyway, I'm less sure whether such runtime tunable is useful at all,
so let's leave it out and can always be added afterwards. At the end
of day a user wrongly doing a rmmod blkback can always recover
gracefully by loading blkback again with your proposed approach to
leave connections open on module removal.

Sorry for the extra work.

Does this mean you don't think the extra driver flag is necessary any more? NB: 
now that xenbus actually takes module references you can't accidentally rmmod 
any more :-)

I'd like it to be kept, please.

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-coverity test] 144699: all pass - PUSHED

2019-12-11 Thread osstest service owner

flight 144699 xen-unstable-coverity real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144699/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xen  272c18435e93cbf749c096a9552ab5ef0d79a4ca
baseline version:
 xen  ae25407faaaddf4abe44137ebf0e177a8c8f9858

Last test of basis   144634  2019-12-08 09:18:58 Z3 days
Testing same since   144699  2019-12-11 09:18:24 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel De Graaf 
  George Dunlap 
  Igor Druzhinin 
  Jan Beulich 
  Jeremi Piotrowski 
  Julien Grall 
  Krzysztof Kolasa 
  Mark Pryor 
  Rasmus Villemoes 
  Razvan Cojocaru 
  Roger Pau Monné 

jobs:
 coverity-amd64   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   ae25407faa..272c18435e  272c18435e93cbf749c096a9552ab5ef0d79a4ca -> 
coverity-tested/smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Durrant, Paul

> -Original Message-
> From: Roger Pau Monné 
> Sent: 11 December 2019 10:06
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; linux-ker...@vger.kernel.org; Juergen
> Gross ; Stefano Stabellini ;
> Boris Ostrovsky 
> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
> to closed
> 
> On Tue, Dec 10, 2019 at 11:33:45AM +, Paul Durrant wrote:
> > If a driver probe() fails then leave the xenstore state alone. There is
> no
> > reason to modify it as the failure may be due to transient resource
> > allocation issues and hence a subsequent probe() may succeed.
> >
> > If the driver supports re-binding then only force state to closed during
> > remove() only in the case when the toolstack may need to clean up. This
> can
> > be detected by checking whether the state in xenstore has been set to
> > closing prior to device removal.
> >
> > NOTE: Re-bind support is indicated by new boolean in struct
> xenbus_driver,
> >   which defaults to false. Subsequent patches will add support to
> >   some backend drivers.
> 
> My intention was to specify whether you want to close the
> backends on unbind in sysfs, so that an user can decide at runtime,
> rather than having a hardcoded value in the driver.
> 
> Anyway, I'm less sure whether such runtime tunable is useful at all,
> so let's leave it out and can always be added afterwards. At the end
> of day a user wrongly doing a rmmod blkback can always recover
> gracefully by loading blkback again with your proposed approach to
> leave connections open on module removal.
> 
> Sorry for the extra work.
> 

Does this mean you don't think the extra driver flag is necessary any more? NB: 
now that xenbus actually takes module references you can't accidentally rmmod 
any more :-)

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

2019-12-11 Thread Roger Pau Monné

On Tue, Dec 10, 2019 at 11:33:45AM +, Paul Durrant wrote:
> If a driver probe() fails then leave the xenstore state alone. There is no
> reason to modify it as the failure may be due to transient resource
> allocation issues and hence a subsequent probe() may succeed.
> 
> If the driver supports re-binding then only force state to closed during
> remove() only in the case when the toolstack may need to clean up. This can
> be detected by checking whether the state in xenstore has been set to
> closing prior to device removal.
> 
> NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
>   which defaults to false. Subsequent patches will add support to
>   some backend drivers.

My intention was to specify whether you want to close the
backends on unbind in sysfs, so that an user can decide at runtime,
rather than having a hardcoded value in the driver.

Anyway, I'm less sure whether such runtime tunable is useful at all,
so let's leave it out and can always be added afterwards. At the end
of day a user wrongly doing a rmmod blkback can always recover
gracefully by loading blkback again with your proposed approach to
leave connections open on module removal.

Sorry for the extra work.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/microcode: Support builtin CPU microcode

2019-12-11 Thread Jan Beulich

On 11.12.2019 00:18, Eslam Elnikety wrote:
> On 10.12.19 10:37, Jan Beulich wrote:
>> On 09.12.2019 09:41, Eslam Elnikety wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -2113,7 +2113,7 @@ logic applies:
>>>  active by default.
>>>   
>>>   ### ucode (x86)
>>> -> `= List of [  | scan=, nmi= ]`
>>> +> `= List of [  | scan= | builtin=, nmi= ]`
>>
>> Despite my other question regarding the usefulness of this as a
>> whole a few comments.
>>
>> Do "scan" and "builtin" really need to exclude each other? I.e.
>> don't you mean , instead of | ?
> The useful case here would be specifying ucode=scan,builtin which would 
> translate to fallback onto the builtin microcode if a scan fails. In 
> fact, this is already the case given the implementation in v1. It will 
> be better to clarify this semantic by allowing scan,builtin.
> 
> On that note, I really see the "" and "scan=" to be 
> equal. Following the logic earlier we should probably also allow 
> ucode=,builtin. This translates to fallback to builtin if there 
> are no modules at index .

Almost - if the builtin one is newer than the separate blob, then
either of the cmdline options you name should still cause the
builtin one to be loaded. IOW you want to honor both options, not
prefer the earlier over a later one.

>>> +   ---help---
>>> + Include the CPU microcode update in the Xen image itself. With this
>>> + support, Xen can update the CPU microcode upon boot using the builtin
>>> + microcode, with no need for an additional microcode boot modules.
>>> +
>>> + If unsure, say N.
>>> +
>>> +config BUILTIN_UCODE_DIR
>>> +   string
>>> +   default "/lib/firmware"
>>> +   depends on BUILTIN_UCODE
>>
>> ... are two separate options needed at all? Can't this latter one
>> being the empty string just imply the feature to be disabled?
> 
> I can go either way here. To me, two options is clearer.

It's the other way around here, but I'd accept being outvoted.

>>> +ucode_blob.data = xmalloc_bytes(ucode_blob.size);
>>> +if ( !ucode_blob.data )
>>> +return;
>>> +
>>> +if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
>>> +memcpy(ucode_blob.data, __builtin_amd_ucode_start, 
>>> ucode_blob.size);
>>> +else
>>> +memcpy(ucode_blob.data, __builtin_intel_ucode_start, 
>>> ucode_blob.size);
>>
>> ... why the copying? Can't you simply point ucode_blob.data at
>> __builtin_{amd,intel}_ucode_start?
> 
> I am all onboard. See my earlier response to Andrew. I used the same 
> logic that already exists for scan (which assumes that ucode_blob.data 
> is allocated and should be freed when all CPUs are updated).

The scan case may be different in that it may not lend itself
to re-using the blob in its original location. If that's not
the reason for the present behavior, then I think we would
want to do away with the unnecessary copying there as well.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 144693: regressions - FAIL

2019-12-11 Thread osstest service owner

flight 144693 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144693/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 144637
 build-i386-xsm6 xen-buildfail REGR. vs. 144637
 build-amd64-xsm   6 xen-buildfail REGR. vs. 144637
 build-i3866 xen-buildfail REGR. vs. 144637

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a

version targeted for testing:
 ovmf 97eedf5dfbaffde33210fd88066247cf0b7d3325
baseline version:
 ovmf 804666c86e7b6f04fe5c5cfdb13199c19e0e99b0

Last test of basis   144637  2019-12-09 09:09:49 Z2 days
Failing since144646  2019-12-10 01:39:53 Z1 days6 attempts
Testing same since   144693  2019-12-11 06:15:38 Z0 days1 attempts


People who touched revisions under test:
  Antoine Coeur 
  Bob Feng 
  Jiewen Yao 
  Philippe Mathieu-Daude 
  Steven Shi 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 97eedf5dfbaffde33210fd88066247cf0b7d3325
Author: Antoine Coeur 
Date:   Wed Dec 4 12:14:53 2019 +0800

IntelFsp2WrapperPkg: Fix various typos

Fix various typos in comments and documentation.

Cc: Chasel Chiu 
Cc: Nate DeSimone 
Cc: Star Zeng 
Reviewed-by: Philippe Mathieu-Daude 
Signed-off-by: Philippe Mathieu-Daude 
Reviewed-by: Nate DeSimone 
Reviewed-by: Chasel Chiu 
Reviewed-by: Star Zeng 

commit 7e55cf6b48dcd43de46d008b2f12caaad2554503
Author: Jiewen Yao 
Date:   Sat Dec 7 21:41:10 2019 +0800

SecurityPkg/Tcg2Smm: Measure the table before patch.

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1940

According to TCG PFP specification: the ACPI table must be
measured prior to any modification, and the measurement
must be same cross every boot cycle.

There is a fix 3a63c17ebc853cbb27d190729d01e27f68e65b94
for the HID data. However that is not enough.
The LAML/LASA and PCD configuration change may also cause
similar problem.

We need measure the table before any update.

Cc: Jian J Wang 
Cc: Chao Zhang 
Signed-off-by: Jiewen Yao 
Reviewed-by: Chao Zhang 

commit a80032dc44a1071a34f4415a7c5cef5170ee6159
Author: Steven Shi 
Date:   Tue Nov 19 16:22:08 2019 +0800

BaseTools: Remove redundant binary cache file

Redesign the binary cache and not need to save the
cache intermediate result and state in memory as a
ModuleBuildCacheIR class instance. So remove the
CacheIR.py which define the ModuleBuildCacheIR class.

Signed-off-by: Steven Shi 

Cc: Liming Gao 
Cc: Bob Feng 
Reviewed-by: Bob Feng 

commit fc8b8deac2d77524ff8cfe44acf95b5e1f59804e
Author: Steven Shi 
Date:   Tue Nov 19 16:17:00 2019 +0800

BaseTools: Leverage compiler output to optimize binary cache

Redesign the binary cache and bases on the compiler to
output the dependency header files info for every module.
The binary cache will directly consume the dependency header
files info and doesn't parse the C source code by iteself.
Also redesign the dependency files list format for

Re: [Xen-devel] [PATCH] x86/microcode: Support builtin CPU microcode

2019-12-11 Thread Jan Beulich

On 10.12.2019 23:40, Eslam Elnikety wrote:
> On 10.12.19 10:21, Jan Beulich wrote:
>> On 09.12.2019 22:49, Eslam Elnikety wrote:
>>> On 09.12.19 16:19, Andrew Cooper wrote:
 On 09/12/2019 08:41, Eslam Elnikety wrote:
> --- /dev/null
> +++ b/xen/arch/x86/microcode/Makefile
> @@ -0,0 +1,40 @@
> +# Copyright (C) 2019 Amazon.com, Inc. or its affiliates.
> +# Author: Eslam Elnikety 
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +obj-y += builtin_ucode.o
> +
> +# Directory holding the microcode updates.
> +UCODE_DIR=$(patsubst "%",%,$(CONFIG_BUILTIN_UCODE_DIR))
> +amd-blobs := $(wildcard $(UCODE_DIR)/amd-ucode/*)
> +intel-blobs := $(wildcard $(UCODE_DIR)/intel-ucode/*)

 This is a little dangerous.  I can see why you want to do it like this,
 and I can't provide any obvious suggestions, but if this glob picks up
 anything which isn't a microcode file, it will break the logic to search
 for the right blob.

>>>
>>> We can limit the amd-blobs and intel-blob to binaries following the
>>> naming convention AuthenticAMD.bin and GenuineIntel.bin for amd and
>>> intel, respectively. Yet, this would be imposing an unnecessary
>>> restriction on administrators who may want to be innovative with naming
>>> (or want to use the naming microcode_amd_*.bin or FF-MM-SS instead).
>>>
>>> Alternatively, we can introduce CONFIG_BUILTIN_UCODE_INTEL and
>>> CONFIG_BUILTIN_UCODE_AMD. Both default to empty strings. Then, an
>>> administrator can specify exactly the microcodes to include relative to
>>> the CONFIG_BUILTIN_UCODE_DIR. For example:
>>> CONFIG_BUILTIN_UCODE_INTEL="intel-ucode/06-3a-09"
>>> CONFIG_BUILTIN_UCODE_AMD="amd-ucode/microcode_amd_fam15h.bin"
>>
>> This would make the feature even less generic - I already meant to
> 
> I do not follow the point about being less generic. (I hope my example 
> did not give the false impression that CONFIG_BUILTIN_UCODE_{AMD,INTEL} 
> allow for only a single microcode blob for a single signature).

Well, the example indeed has given this impression to me. I'm
having a hard time seeing how, beyond very narrow special cases,
either of the examples could be useful to anyone. Yet I think
examples should be generally useful.

>> ask whether building ucode into binaries is really a useful thing
>> when we already have more flexible ways. I could see this being
>> useful if there was no other way to make ucode available at boot
>> time.
> 
> It is useful in addition to the existing ways to do early microcode 
> updates. First, when operating many hosts, using boot modules (either a 
> distinct microcode module or within an initrd) becomes involved. For 
> instance, tools to update boot entries (e.g., 
> https://linux.die.net/man/8/grubby) do not support adding arbitrary 
> (microcode) modules.

I.e. you suggest to work around tools shortcomings by extending
Xen? Wouldn't the more appropriate way to deal with this be to
make the tools more capable?

> Second, there is often need to couple a Xen build with a minimum 
> microcode patch level. Having the microcode built within the Xen image 
> itself is a streamlined, natural way of achieving that.

Okay, I can accept this as a reason, to some degree at least. Yet
as said elsewhere, I don't think you want then to override a
possible "external" ucode module with the builtin blobs. Instead
the newest of everything that's available should then be loaded.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH-for-4.13] SUPPORT.md: add core scheduling

2019-12-11 Thread Jan Beulich

On 11.12.2019 09:45, Juergen Gross wrote:
> Add core scheduling feature to SUPPORT.md.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86emul: correct LFS et al handling for 64-bit mode

2019-12-11 Thread Jan Beulich

AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2640,6 +2640,15 @@ x86_decode_twobyte(
 }
 break;
 
+case 0xb2: /* lss */
+case 0xb4: /* lfs */
+case 0xb5: /* lgs */
+/* REX.W ignored on a vendor-dependent basis. */
+if ( op_bytes == 8 &&
+ (ctxt->cpuid->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
+op_bytes = 4;
+break;
+
 case 0xb8: /* jmpe / popcnt */
 if ( rep_prefix() )
 ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86emul: correct segment override decode for 64-bit mode

2019-12-11 Thread Jan Beulich

The legacy / compatibility mode ES, CS, SS, and DS overrides are null
prefixes in 64-bit mode, i.e. they in particular don't cancel an
earlier FS or GS one.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2830,14 +2830,17 @@ x86_decode(
 case 0x67: /* address-size override */
 ad_bytes = def_ad_bytes ^ (mode_64bit() ? 12 : 6);
 break;
-case 0x2e: /* CS override */
-override_seg = x86_seg_cs;
+case 0x2e: /* CS override / null prefix in 64-bit mode */
+if ( !mode_64bit() )
+override_seg = x86_seg_cs;
 break;
-case 0x3e: /* DS override */
-override_seg = x86_seg_ds;
+case 0x3e: /* DS override / null prefix in 64-bit mode */
+if ( !mode_64bit() )
+override_seg = x86_seg_ds;
 break;
-case 0x26: /* ES override */
-override_seg = x86_seg_es;
+case 0x26: /* ES override / null prefix in 64-bit mode */
+if ( !mode_64bit() )
+override_seg = x86_seg_es;
 break;
 case 0x64: /* FS override */
 override_seg = x86_seg_fs;
@@ -2845,8 +2848,9 @@ x86_decode(
 case 0x65: /* GS override */
 override_seg = x86_seg_gs;
 break;
-case 0x36: /* SS override */
-override_seg = x86_seg_ss;
+case 0x36: /* SS override / null prefix in 64-bit mode */
+if ( !mode_64bit() )
+override_seg = x86_seg_ss;
 break;
 case 0xf0: /* LOCK */
 lock_prefix = 1;
@@ -2871,10 +2875,6 @@ x86_decode(
 }
  done_prefixes:
 
-/* %{e,c,s,d}s overrides are ignored in 64bit mode. */
-if ( mode_64bit() && override_seg < x86_seg_fs )
-override_seg = x86_seg_none;
-
 if ( rex_prefix & REX_W )
 op_bytes = 8;
 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 5/6] xen/tasklet: Return -ERESTART from continue_hypercall_on_cpu()

2019-12-11 Thread Andrew Cooper

On 11/12/2019 07:41, Jan Beulich wrote:
> On 10.12.2019 18:55, Andrew Cooper wrote:
>> On 10/12/2019 08:55, Jan Beulich wrote:
>>> On 09.12.2019 18:49, Andrew Cooper wrote:
 On 09/12/2019 16:52, Jan Beulich wrote:
> On 05.12.2019 23:30, Andrew Cooper wrote:
>> Some hypercalls tasklets want to create a continuation, rather than fail 
>> the
>> hypercall with a hard error.  By the time the tasklet is executing, it 
>> is too
>> late to create the continuation, and even continue_hypercall_on_cpu() 
>> doesn't
>> have enough state to do it correctly.
> I think it would be quite nice if you made clear what piece of state
> it is actually missing. To be honest, I don't recall anymore.
 How to correctly mutate the registers and/or memory (which is specific
 to the hypercall subop in some cases).
>>> Well, in-memory arguments can be accessed as long as the mapping is
>>> the right one (which it typically wouldn't be inside a tasklet). Do
>>> existing continue_hypercall_on_cpu() users need this? Looking over
>>> patch 4 again, I didn't think so. (Which isn't to say that removing
>>> the latent issue is not a good thing.)
>>>
>>> In-register values can be changed as long as the respective exit
>>> path will suitably pick up the value, which I thought was always
>>> the case.
>>>
>>> Hence I'm afraid your single reply sentence didn't really clarify
>>> matters. I'm sorry if this is just because of me being dense.
>> How, physically, would you arrange for continue_hypercall_on_cpu() to
>> make the requisite state adjustments?
> You can't (at least not without having sufficient further context),
> I agree. Yet ...
>
>> Yes - registers and memory can be accessed, but only the hypercall
>> (sub?)op handler knows how to mutate them appropriately.
>>
>> You'd have to copy the mutation logic into continue_hypercall_on_cpu(),
>> and pass in op/subops and a union of all pointers, *and* whatever
>> intermediate state the subop handler needs.
>>
>> Or you can return -ERESTART and let the caller DTRT with the state it
>> has in context, as it would in other cases requiring a continuation.
> ... it continues to be unclear to me whether you're fixing an actual
> bug here, or just a latent one.

I'm not fixing any bug.

I am making a fundamental change in behaviour, so tasklet context can
use -ERESTART.

Tasklet context doesn't even know what the primary hypercall index needs
to be to correctly create a continuation.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH-for-4.13] SUPPORT.md: add core scheduling

2019-12-11 Thread Juergen Gross

Add core scheduling feature to SUPPORT.md.

Signed-off-by: Juergen Gross 
---
 SUPPORT.md | 8 
 1 file changed, 8 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 1cad7d6164..169b6f8fcf 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -255,6 +255,14 @@ of using different schedulers and scheduling properties.
 
 Status: Supported
 
+### Core Scheduling
+
+Allows to group virtual cpus into virtual cores which are scheduled on the
+physical cores. This results in never running different guests at the same
+time on the same physical core.
+
+Status, x86: Experimental
+
 ### Credit Scheduler
 
 A weighted proportional fair share virtual CPU scheduler.
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

1 2 >

1 - 100 of 103 matches

Mail list logo