[Xen-devel] [xen-4.7-testing test] 120025: regressions - FAIL

2018-02-26 Thread osstest service owner
flight 120025 xen-4.7-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120025/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops broken in 119952
 build-armhf-pvops  4 host-install(4) broken in 119952 REGR. vs. 119780
 test-armhf-armhf-xl-vhd   6 xen-install  fail REGR. vs. 119780
 build-armhf   6 xen-build  fail in 119995 REGR. vs. 119780

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-amd64 16 guest-localmigrate/x10 fail in 119952 pass in 120025
 test-xtf-amd64-amd64-5   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 119952
 test-xtf-amd64-amd64-3   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 119995
 test-xtf-amd64-amd64-2   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 119995

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked in 119952 n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)  blocked in 119995 n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1) blocked in 119995 n/a
 build-armhf-libvirt   1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked in 119995 n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked in 119995 n/a
 test-xtf-amd64-amd64-4 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119952 like 119780
 test-xtf-amd64-amd64-1 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119995 like 119780
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 119780
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 119780
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119780
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 119780
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119780
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 119780
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 119780
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 119780
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 119780
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 119780
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 119780
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-memop-seg fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check fail   never pass
 

Re: [Xen-devel] [v2 1/1] xen, mm: Allow deferred page initialization for xen pv domains

2018-02-26 Thread Juergen Gross
On 26/02/18 17:01, Pavel Tatashin wrote:
> Juergen Gross noticed that commit
> f7f99100d8d ("mm: stop zeroing memory during allocation in vmemmap")
> broke XEN PV domains when deferred struct page initialization is enabled.
> 
> This is because the xen's PagePinned() flag is getting erased from struct
> pages when they are initialized later in boot.
> 
> Juergen fixed this problem by disabling deferred pages on xen pv domains.
> It is desirable, however, to have this feature available as it reduces boot
> time. This fix re-enables the feature for pv-domains, and fixes the problem
> the following way:
> 
> The fix is to delay setting PagePinned flag until struct pages for all
> allocated memory are initialized, i.e. until after free_all_bootmem().
> 
> A new x86_init.hyper op init_after_bootmem() is called to let xen know
> that boot allocator is done, and hence struct pages for all the allocated
> memory are now initialized. If deferred page initialization is enabled, the
> rest of struct pages are going to be initialized later in boot once
> page_alloc_init_late() is called.
> 
> xen_after_bootmem() walks page table's pages and marks them pinned.
> 
> Signed-off-by: Pavel Tatashin 

Verified to work on a system where the original issue caused a crash.

Reviewed-by: Juergen Gross 
Tested-by: Juergen Gross 


Juergen
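
For readers following the thread, the ordering problem described in the quoted commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel code: `toy_page`, `toy_init_page`, `toy_boot_broken`, `toy_boot_fixed` and `toy_count_pinned` are hypothetical names, and the model assumes that initializing a struct page resets every flag it carried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; only two flags are modelled. */
struct toy_page {
    bool initialized;
    bool pinned;
};

/* Assumption: (re)initializing a page clears every flag it carried. */
void toy_init_page(struct toy_page *p)
{
    memset(p, 0, sizeof(*p));
    p->initialized = true;
}

/* Broken ordering: pin the page-table pages first, initialize later. */
void toy_boot_broken(struct toy_page *pages, size_t n,
                     const size_t *pt_idx, size_t n_pt)
{
    size_t i;

    for (i = 0; i < n_pt; i++)
        pages[pt_idx[i]].pinned = true;
    for (i = 0; i < n; i++)
        toy_init_page(&pages[i]);   /* wipes the pinned flag */
}

/* Fixed ordering: initialize every page, then walk and pin. */
void toy_boot_fixed(struct toy_page *pages, size_t n,
                    const size_t *pt_idx, size_t n_pt)
{
    size_t i;

    for (i = 0; i < n; i++)
        toy_init_page(&pages[i]);
    for (i = 0; i < n_pt; i++)
        pages[pt_idx[i]].pinned = true;
}

/* Count how many pages still carry the pinned flag. */
size_t toy_count_pinned(const struct toy_page *pages, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (pages[i].pinned)
            count++;
    return count;
}
```

In this model the broken ordering ends with no pinned pages at all, while the fixed ordering keeps one pinned page per page-table index, which mirrors why the patch delays the marking until after the boot allocator is done.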

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Error while booting android kernel 4.4 on jacinto j6 evm as DOM0

2018-02-26 Thread moin anjnawala
Hi,

I am trying to boot kernel 4.4 on a Jacinto-6 EVM board, and I am facing an
error from Xen. Please see the attached log for the errors.

The kernel is stuck at
[6.881378] Waiting for root device /dev/mmcblk0p2...

Is it due to the i2c-related errors in the boot log, shown below?
[0.831911] palmas 0-0058: IRQ missing: skipping irq request
[1.860292] omap_i2c 4807.i2c: controller timed out
[1.880663] palmas: probe of 0-0058 failed with error -110
[2.902874] omap_i2c 4807.i2c: controller timed out
[2.922933] pcf857x: probe of 0-0020 failed with error -110
[3.945557] omap_i2c 4807.i2c: controller timed out
[3.965619] pcf857x: probe of 0-0021 failed with error -110
[3.971774] omap_i2c 4807.i2c: bus 0 rev0.12 at 400 kHz
[5.000328] omap_i2c 48072000.i2c: controller timed out
[5.020390] pcf857x: probe of 1-0026 failed with error -110
[5.026750] omap_i2c 48072000.i2c: bus 1 rev0.12 at 400 kHz
[5.033041] omap_i2c 4806.i2c: bus 2 rev0.12 at 400 kHz
[5.039104] genirq: Flags mismatch irq 328. 6000 (4807a000.i2c) vs.
6000 (48072000.i2c)
[5.048233] omap_i2c 4807a000.i2c: failure requesting irq 328
[5.054274] omap_i2c: probe of 4807a000.i2c failed with error -16

Can anyone provide the solution to these problems?

Thanks,
Moinuddin
U-Boot 2014.07-g5a7d6bd-dirty (Feb 27 2018 - 10:47:00)

CPU  : DRA752-GP ES1.1
Board: DRA74x EVM REV G.0
I2C:   ready
DRAM:  1.5 GiB
WARNING: Caches not enabled
MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
SF: Detected S25FL256S_64K with page size 256 Bytes, erase size 64 KiB, total 
32 MiB, mapped at 5c00
SATA link 0 timeout.
AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl SATA mode
flags: 64bit ncq stag pm led clo only pmp pio slum part ccc apst 
scanning bus for devices...
Found 0 device(s).
MMC: block number 0x22 exceeds max(0x0)
efi partition table not found
SCSI:  Net:   cpsw
Hit any key to stop autoboot:  0 
U-Boot# 
U-Boot# run a
reading xen-uimage
820232 bytes read in 66 ms (11.9 MiB/s)
reading xenpolicy-uimage
9625 bytes read in 4 ms (2.3 MiB/s)
reading zImage
7944664 bytes read in 613 ms (12.4 MiB/s)
reading boot.img
9764864 bytes read in 752 ms (12.4 MiB/s)
reading /dra7-evm-lcd-lg.dtb
111689 bytes read in 13 ms (8.2 MiB/s)
## Booting kernel from Legacy Image at c100 ...
   Image Name:   XEN
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:820168 Bytes = 800.9 KiB
   Load Address: 9000
   Entry Point:  9000
   Verifying Checksum ... OK
## Flattened Device Tree blob at c2f0
   Booting using the fdt blob at 0xc2f0
   Loading Kernel Image ... OK
   Using Device Tree in place at c2f0, end c2f1e448

Starting kernel ...

- UART enabled -
- CPU  booting -
- Xen starting in Hyp mode -
- Zero BSS -
- Setting up control registers -
- Turning on paging -
- Ready -
(XEN) Checking for initrd in /chosen
(XEN) RAM: 8000 - dfff
(XEN) 
(XEN) MODULE[0]: c2f0 - c2f1c000 Device Tree  
(XEN) MODULE[1]: c000 - c200 Kernel   
(XEN) MODULE[2]: c300 - c301 XSM  
(XEN)  RESVD[0]: c2f0 - c2f1c000
(XEN) 
(XEN) Command line: dom0_mem=512M dom0_rambase_pfn=0x8 console=dtuart 
dtuart=serial0 dom0_max_vcpus=2 bootscrub=0 flask_enforcing=1
(XEN) Placing Xen at 0xdfe0-0xe000
(XEN) Update BOOTMOD_XEN from 9000-90111701 => 
dfe0-dff11701
(XEN) Xen heap: da00-de00 (16384 pages)
(XEN) Dom heap: 376832 pages
(XEN) Domain heap initialised
(XEN) Platform: TI DRA7
(XEN) Looking for dtuart at "serial0", options ""
(XEN) omap-uart: Unable to retrieve the IRQ
(XEN) Unable to initialize dtuart: -22
(XEN) Bad console= option 'dtuart'
 Xen 4.6.6
(XEN) Xen version 4.6.6 (moinuddin.a@) (arm-linux-gnueabihf-gcc (crosstool-NG 
linaro-1.13.1-4.7-2013.03-20130313 - Linaro GCC 2013.03) 4.7.3 20130226 (
prerelease)) debug=y Mon Feb 26 14:18:12 IST 2018
(XEN) Latest ChangeSet: Thu Apr 14 15:41:19 2016 +0300 git:039cab3-dirty
(XEN) Processor: 412fc0f2: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x2
(XEN) 32-bit Execution:
(XEN)   Processor Features: 1131:00011011
(XEN) Instruction Sets: AArch32 A32 Thumb Thumb-2 ThumbEE Jazelle
(XEN) Extensions: GenericTimer Security
(XEN)   Debug Features: 02010555
(XEN)   Auxiliary Features: 
(XEN)   Memory Model Features: 10201105 2000 0124 02102211
(XEN)  ISA Features: 02101110 13112111 21232041 2131 10011142 
(XEN) /psci method must be smc, but is: "hvc"
(XEN) Set AuxCoreBoot1 to dfe0004c (0020004c)
(XEN) Set AuxCoreBoot0 to 0x20
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 6144 KHz
(XEN) GICv2 initialization:
(XEN) gic_dist_addr=48211000
(XEN) gic_cpu_addr=48212000
(XEN) gic_hyp_addr=48214000
(XEN) gic_vcpu_addr=48216000
(XEN) 

Re: [Xen-devel] [v2 1/1] xen, mm: Allow deferred page initialization for xen pv domains

2018-02-26 Thread Ingo Molnar

* Pavel Tatashin  wrote:

> Juergen Gross noticed that commit
> f7f99100d8d ("mm: stop zeroing memory during allocation in vmemmap")
> broke XEN PV domains when deferred struct page initialization is enabled.
> 
> This is because the xen's PagePinned() flag is getting erased from struct
> pages when they are initialized later in boot.
> 
> Juergen fixed this problem by disabling deferred pages on xen pv domains.
> It is desirable, however, to have this feature available as it reduces boot
> time. This fix re-enables the feature for pv-domains, and fixes the problem
> the following way:
> 
> The fix is to delay setting PagePinned flag until struct pages for all
> allocated memory are initialized, i.e. until after free_all_bootmem().
> 
> A new x86_init.hyper op init_after_bootmem() is called to let xen know
> that boot allocator is done, and hence struct pages for all the allocated
> memory are now initialized. If deferred page initialization is enabled, the
> rest of struct pages are going to be initialized later in boot once
> page_alloc_init_late() is called.
> 
> xen_after_bootmem() walks page table's pages and marks them pinned.
> 
> Signed-off-by: Pavel Tatashin 
> ---
>  arch/x86/include/asm/x86_init.h |  2 ++
>  arch/x86/kernel/x86_init.c  |  1 +
>  arch/x86/mm/init_32.c   |  1 +
>  arch/x86/mm/init_64.c   |  1 +
>  arch/x86/xen/mmu_pv.c   | 38 ++
>  mm/page_alloc.c |  4 
>  6 files changed, 31 insertions(+), 16 deletions(-)

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [Xen-devel] [PATCH 8/9] drm/xen-front: Implement GEM operations

2018-02-26 Thread Oleksandr Andrushchenko

On 02/27/2018 01:47 AM, Boris Ostrovsky wrote:

On 02/23/2018 10:35 AM, Oleksandr Andrushchenko wrote:

On 02/23/2018 05:26 PM, Boris Ostrovsky wrote:

On 02/21/2018 03:03 AM, Oleksandr Andrushchenko wrote:

+static struct xen_gem_object *gem_create(struct drm_device *dev,
size_t size)
+{
+struct xen_drm_front_drm_info *drm_info = dev->dev_private;
+struct xen_gem_object *xen_obj;
+int ret;
+
+size = round_up(size, PAGE_SIZE);
+xen_obj = gem_create_obj(dev, size);
+if (IS_ERR_OR_NULL(xen_obj))
+return xen_obj;
+
+if (drm_info->cfg->be_alloc) {
+/*
+ * backend will allocate space for this buffer, so
+ * only allocate array of pointers to pages
+ */
+xen_obj->be_alloc = true;

If be_alloc is a flag (which I am not sure about) --- should it be set
to true *after* you've successfully allocated your things?

this is a configuration option telling about the way
the buffer gets allocated: either by the frontend or
backend (be_alloc -> buffer allocated by the backend)


I can see how drm_info->cfg->be_alloc might be a configuration option
but xen_obj->be_alloc is set here and that's not how configuration
options typically behave.

you are right, I will put be_alloc down the code and will slightly
rework error handling for this function



+ret = gem_alloc_pages_array(xen_obj, size);
+if (ret < 0) {
+gem_free_pages_array(xen_obj);
+goto fail;
+}
+
+ret = alloc_xenballooned_pages(xen_obj->num_pages,
+xen_obj->pages);

Why are you allocating balloon pages?

in this use-case we map pages provided by the backend
(yes, I know this can be a problem from both security
POV and that DomU can die holding pages of Dom0 forever:
but still it is a configuration option, so user decides
if her use-case needs this and takes responsibility for
such a decision).


Perhaps I am missing something here but when you say "I know this can be
a problem from both security POV ..." then there is something wrong with
your solution.

well, in this scenario there are actually 2 concerns:
1. If DomU dies the pages/grants from Dom0/DomD cannot be
reclaimed back
2. Misbehaving guest may send too many requests to the
backend exhausting grant references and memory of Dom0/DomD
(this is the only concern from security POV). Please see [1]

But we are focusing on embedded use-cases,
so the systems we use are not that "dynamic" with respect to 2).
Namely: we have a fixed number of domains and their functionality
is well known, so we can make rather precise assumptions about resource
usage. This is why I try to warn about such a use-case and rely on
the end user who understands the caveats

I'll probably add more precise description of this use-case
clarifying what is that security POV, so there is no confusion

Hope this explanation answers your questions

-boris


Please see description of the buffering modes in xen_drm_front.h
specifically for backend allocated buffers:
  
  ***
  * 2. Buffers allocated by the backend
  ***
  *
  * This mode of operation is run-time configured via guest domain configuration
  * through XenStore entries.
  *
  * For systems which do not provide IOMMU support, but having specific
  * requirements for display buffers it is possible to allocate such buffers
  * at backend side and share those with the frontend.
  * For example, if host domain is 1:1 mapped and has DRM/GPU hardware expecting
  * physically contiguous memory, this allows implementing zero-copying
  * use-cases.


-boris


+if (ret < 0) {
+DRM_ERROR("Cannot allocate %zu ballooned pages: %d\n",
+xen_obj->num_pages, ret);
+goto fail;
+}
+
+return xen_obj;
+}
+/*
+ * need to allocate backing pages now, so we can share those
+ * with the backend
+ */
+xen_obj->num_pages = DIV_ROUND_UP(size, PAGE_SIZE);
+xen_obj->pages = drm_gem_get_pages(&xen_obj->base);
+if (IS_ERR_OR_NULL(xen_obj->pages)) {
+ret = PTR_ERR(xen_obj->pages);
+xen_obj->pages = NULL;
+goto fail;
+}
+
+return xen_obj;
+
+fail:
+DRM_ERROR("Failed to allocate buffer with size %zu\n", size);
+return ERR_PTR(ret);
+}
+


Thank you,
Oleksandr

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg03100.html
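
The ordering point raised in the review (set the per-object flag only after the allocation has succeeded) can be sketched as follows. This is a minimal illustration, not the driver code: `toy_obj`, `toy_obj_setup` and `toy_obj_teardown` are hypothetical names, and the `fail_alloc` parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the GEM object discussed in the thread. */
struct toy_obj {
    bool be_alloc;      /* true only once backend-style setup succeeded */
    size_t num_pages;
    void **pages;
};

/*
 * Allocate the page-pointer array first and publish the flag last, so a
 * failed setup never leaves be_alloc set on a half-built object.
 */
int toy_obj_setup(struct toy_obj *obj, size_t num_pages, bool fail_alloc)
{
    void **pages;

    pages = fail_alloc ? NULL : calloc(num_pages, sizeof(*pages));
    if (!pages)
        return -ENOMEM;

    obj->pages = pages;
    obj->num_pages = num_pages;
    obj->be_alloc = true;   /* set only after everything succeeded */
    return 0;
}

/* Release the array and return the object to its pristine state. */
void toy_obj_teardown(struct toy_obj *obj)
{
    free(obj->pages);
    obj->pages = NULL;
    obj->num_pages = 0;
    obj->be_alloc = false;
}
```

With this shape the error path has nothing to undo on the object itself, which is one way to simplify the error handling mentioned in the reply.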



Re: [Xen-devel] [PATCH 1/2] x86/ACPI: Parse ACPI_FADT_LEGACY_DEVICES

2018-02-26 Thread Rajneesh Bhardwaj
On Tue, Feb 27, 2018 at 11:47:45AM +0530, Anshuman Gupta wrote:
> From: "Luis R. Rodriguez" 
>

Anshuman, looks like you sent this by mistake. Please, be careful!

Everyone, kindly ignore this patch.
 
> ACPI 5.2.9.3 IA-PC Boot Architecture flag ACPI_FADT_LEGACY_DEVICES
> can be used to determine if a system has legacy devices LPC or
> legacy components.
> 
> 2.7.4
> 

-- 
Best Regards,
Rajneesh


[Xen-devel] [PATCH 1/2] x86/ACPI: Parse ACPI_FADT_LEGACY_DEVICES

2018-02-26 Thread Anshuman Gupta
From: "Luis R. Rodriguez" 

ACPI 5.2.9.3 IA-PC Boot Architecture flag ACPI_FADT_LEGACY_DEVICES
can be used to determine if a system has legacy devices LPC or
ISA devices. The x86 platform already has a struct which lists
known associated legacy devices, we start off careful only
by disabling root devices we should not regress with. The struct
and device list can be expanded with time to cover more root
legacy components.

Change-Id: I85ba7dfb405c7faefc0f8e6b43a6e7260a27a1c9
Signed-off-by: Luis R. Rodriguez 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: andrew.coop...@citrix.com
Cc: andriy.shevche...@linux.intel.com
Cc: bige...@linutronix.de
Cc: boris.ostrov...@oracle.com
Cc: david.vra...@citrix.com
Cc: ffaine...@freebox.fr
Cc: george.dun...@citrix.com
Cc: g...@suse.com
Cc: jgr...@suse.com
Cc: j...@suse.com
Cc: j...@joshtriplett.org
Cc: julien.gr...@linaro.org
Cc: konrad.w...@oracle.com
Cc: kozer...@parallels.com
Cc: l...@kernel.org
Cc: lgu...@lists.ozlabs.org
Cc: linux-a...@vger.kernel.org
Cc: lv.zh...@intel.com
Cc: m...@codeblueprint.co.uk
Cc: mbi...@freebox.fr
Cc: r...@rjwysocki.net
Cc: robert.mo...@intel.com
Cc: ru...@rustcorp.com.au
Cc: ti...@suse.de
Cc: toshi.k...@hp.com
Cc: xen-de...@lists.xensource.com
Link: 
http://lkml.kernel.org/r/1460592286-300-13-git-send-email-mcg...@kernel.org
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/acpi/boot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 8816102..7a25121 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -936,6 +936,10 @@ late_initcall(hpet_insert_resource);
 
 static int __init acpi_parse_fadt(struct acpi_table_header *table)
 {
+	if (!(acpi_gbl_FADT.boot_flags & ACPI_FADT_LEGACY_DEVICES)) {
+		pr_debug("ACPI: no legacy devices present\n");
+		x86_platform.legacy.devices.pnpbios = 0;
+	}
 
 	if (acpi_gbl_FADT.header.revision >= FADT2_REVISION_ID &&
 	    !(acpi_gbl_FADT.boot_flags & ACPI_FADT_8042) &&
-- 
2.7.4
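
The flag test in the patch above is a plain bit check on the FADT IA-PC boot architecture flags. A minimal sketch of that check outside the kernel follows. The `TOY_` macros and `toy_` functions are hypothetical names for illustration; the bit positions (bit 0 for legacy devices, bit 1 for the 8042 controller) follow the ACPI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA-PC boot architecture flag bits, named here with a TOY_ prefix. */
#define TOY_FADT_LEGACY_DEVICES (1u << 0)   /* LPC/ISA devices present */
#define TOY_FADT_8042           (1u << 1)   /* 8042 keyboard controller */

/* True when firmware reports user-visible legacy devices. */
bool toy_has_legacy_devices(uint16_t boot_flags)
{
    return (boot_flags & TOY_FADT_LEGACY_DEVICES) != 0;
}

/* True when firmware reports an 8042 controller. */
bool toy_has_8042(uint16_t boot_flags)
{
    return (boot_flags & TOY_FADT_8042) != 0;
}
```

A caller would disable probing of legacy root devices only when `toy_has_legacy_devices()` returns false, which is the conservative direction the commit message describes.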



[Xen-devel] [linux-linus test] 120022: regressions - FAIL

2018-02-26 Thread osstest service owner
flight 120022 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120022/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-xl-raw   7 xen-boot fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start   fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail REGR. vs. 118324
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-pair 10 xen-boot/src_host fail REGR. vs. 118324
 test-amd64-i386-pair 11 xen-boot/dst_host fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-rumprun-i386  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt-xsm   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-freebsd10-i386  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 test-armhf-armhf-libvirt  7 xen-boot fail REGR. vs. 118324

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 118324
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 118324
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 118324
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 118324
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 118324
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 118324
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 

[Xen-devel] [PATCH 2/2] xen: events: free irqs in error condition

2018-02-26 Thread Amit Shah
In case of errors in irq setup for MSI, free up the allocated irqs.

Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi 
CC: 
CC: Roger Pau Monné 
CC: David Vrabel 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 
Signed-off-by: Amit Shah 
---
 drivers/xen/events/events_base.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b6b8b29..96aa575 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -758,6 +758,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 error_irq:
 	for (; i >= 0; i--)
 		__unbind_from_irq(irq + i);
+	xen_free_irq(irq);
 	mutex_unlock(&irq_mapping_update_lock);
 	return ret;
 }
-- 
2.7.3.AMZN

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

[Xen-devel] [PATCH 0/2] xen: fix bugs in error conditions

2018-02-26 Thread Amit Shah
Hello,

These bugs were found during code review.  Details in the commits.

Please review and apply.

CC: Roger Pau Monné 
CC: David Vrabel 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 


Amit Shah (2):
  xen: fix out-of-bounds irq unbind for MSI message groups
  xen: events: free irqs in error condition

 drivers/xen/events/events_base.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.7.3.AMZN


[Xen-devel] [PATCH 1/2] xen: fix out-of-bounds irq unbind for MSI message groups

2018-02-26 Thread Amit Shah
When an MSI descriptor was not available, the error path would try to
unbind an irq that was never acquired - potentially unbinding an
unrelated irq.

Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi 
CC: 
CC: Roger Pau Monné 
CC: David Vrabel 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 
Signed-off-by: Amit Shah 
---
 drivers/xen/events/events_base.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 1ab4bd1..b6b8b29 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -749,6 +749,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 	}
 
 	ret = irq_set_msi_desc(irq, msidesc);
+	i--;
 	if (ret < 0)
 		goto error_irq;
 out:
-- 
2.7.3.AMZN
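
The off-by-one that this patch addresses can be reproduced with a small model. This is only a sketch: `toy_bind_group`, `toy_unbind`, `toy_bound` and `toy_stray_unbinds` are hypothetical names, the failure after the loop is simulated unconditionally, and the `apply_fix` switch stands in for the added `i--`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_IRQS 16

static bool toy_bound[TOY_MAX_IRQS];    /* which toy irqs are bound */
int toy_stray_unbinds;                  /* unbinds of never-bound irqs */

/* Unbind one irq, counting the call if that irq was never bound. */
void toy_unbind(int irq)
{
    if (!toy_bound[irq])
        toy_stray_unbinds++;
    toy_bound[irq] = false;
}

/*
 * Bind nvec consecutive irqs, then simulate a failure in the step that
 * follows the loop and unwind with the same loop shape as the patch.
 */
int toy_bind_group(int irq, int nvec, bool apply_fix)
{
    int i;

    if (irq < 0 || nvec <= 0 || irq + nvec >= TOY_MAX_IRQS)
        return -EINVAL;

    for (i = 0; i < nvec; i++)
        toy_bound[irq + i] = true;

    /* The loop exits with i == nvec, one past the last bound irq. */
    if (apply_fix)
        i--;

    /* Simulated failure: unwind everything from index i down to 0. */
    for (; i >= 0; i--)
        toy_unbind(irq + i);
    return -EIO;
}
```

Without the decrement the unwind loop touches `irq + nvec`, which belongs to nobody in this group; with it, only the irqs that were actually bound are released.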


Re: [Xen-devel] [ping] Re: [PATCH 2/5] x86/pv: Avoid leaking other guests' MSR_TSC_AUX values into PV context

2018-02-26 Thread Tian, Kevin
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Tuesday, February 27, 2018 3:11 AM
> 
> On 26/02/18 11:25, Jan Beulich wrote:
>  On 20.02.18 at 12:58,  wrote:
> >> If the CPU pipeline supports RDTSCP or RDPID, a guest can observe the
> value in
> >> MSR_TSC_AUX, irrespective of whether the relevant CPUID features are
> >> advertised/hidden.
> >>
> >> At the moment, paravirt_ctxt_switch_to() only writes to MSR_TSC_AUX if
> >> TSC_MODE_PVRDTSCP mode is enabled, but this is not the default mode.
> >> Therefore, default PV guests can read the value from a previously
> scheduled
> >> HVM vcpu, or TSC_MODE_PVRDTSCP-enabled PV guest.
> >>
> >> Alter the PV path to always write to MSR_TSC_AUX, using 0 in the
> common
> >> case.
> >>
> >> To amortise overhead cost, introduce wrmsr_tsc_aux() which performs
> a lazy
> >> update of the MSR, and use this function consistently across the
> codebase.
> >>
> >> Signed-off-by: Andrew Cooper 
> > Despite me continuing to think that RDTSCP and RDPID should be
> > fully independent features, this being in line with the SDM:
> > Acked-by: Jan Beulich 
> 
> Thanks.
> 
> Given the importance of this patch, I feel it is time to ping the VT-x
> and SVM maintainers for their input.
> 
> ~Andrew

Reviewed-by: Kevin Tian 
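
The "lazy update" idea mentioned in the quoted commit message can be sketched as a cached write. This is not the Xen implementation: `toy_wrmsr_tsc_aux`, `toy_hw_wrmsr`, `toy_cached_tsc_aux`, `toy_hw_tsc_aux` and `toy_msr_writes` are hypothetical names, the hardware register is modelled as a plain variable, and the cache is assumed to start in sync with it (a real hypervisor would keep one cache per CPU).

```c
#include <assert.h>
#include <stdint.h>

uint32_t toy_hw_tsc_aux;            /* models the MSR contents */
unsigned int toy_msr_writes;        /* counts the expensive writes */
static uint32_t toy_cached_tsc_aux; /* last value written, per CPU in reality */

/* Models the costly hardware write. */
void toy_hw_wrmsr(uint32_t val)
{
    toy_hw_tsc_aux = val;
    toy_msr_writes++;
}

/* Lazy update: skip the hardware write when the value is unchanged. */
void toy_wrmsr_tsc_aux(uint32_t val)
{
    if (toy_cached_tsc_aux != val) {
        toy_hw_wrmsr(val);
        toy_cached_tsc_aux = val;
    }
}
```

Because the common case writes 0 repeatedly, most context switches in this model cost only a compare, while a switch from a vcpu with a nonzero value still forces the register back to 0 so nothing leaks.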

Re: [Xen-devel] [PATCH 5/7] public / x86: introduce __HYPERCALL_iommu_op

2018-02-26 Thread Tian, Kevin
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: Monday, February 26, 2018 5:57 PM
> 
> > -Original Message-
> > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > Sent: 24 February 2018 02:57
> > To: Paul Durrant ; xen-
> de...@lists.xenproject.org
> > Cc: Stefano Stabellini ; Wei Liu
> > ; George Dunlap ;
> > Andrew Cooper ; Ian Jackson
> > ; Tim (Xen.org) ; Jan Beulich
> > ; Daniel De Graaf 
> > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > __HYPERCALL_iommu_op
> >
> > > From: Paul Durrant [mailto:paul.durr...@citrix.com]
> > > Sent: Friday, February 23, 2018 5:41 PM
> > >
> > > > -Original Message-
> > > > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > > > Sent: 23 February 2018 05:17
> > > > To: Paul Durrant ; xen-
> > > de...@lists.xenproject.org
> > > > Cc: Stefano Stabellini ; Wei Liu
> > > > ; George Dunlap ;
> > > > Andrew Cooper ; Ian Jackson
> > > > ; Tim (Xen.org) ; Jan Beulich
> > > > ; Daniel De Graaf 
> > > > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > > > __HYPERCALL_iommu_op
> > > >
> > > > > From: Paul Durrant [mailto:paul.durr...@citrix.com]
> > > > > Sent: Tuesday, February 13, 2018 5:23 PM
> > > > >
> > > > > > -Original Message-
> > > > > > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > > > > > Sent: 13 February 2018 06:43
> > > > > > To: Paul Durrant ; xen-
> > > > > de...@lists.xenproject.org
> > > > > > Cc: Stefano Stabellini ; Wei Liu
> > > > > > ; George Dunlap
> ;
> > > > > > Andrew Cooper ; Ian Jackson
> > > > > > ; Tim (Xen.org) ; Jan
> Beulich
> > > > > > ; Daniel De Graaf 
> > > > > > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > > > > > __HYPERCALL_iommu_op
> > > > > >
> > > > > > > From: Paul Durrant
> > > > > > > Sent: Monday, February 12, 2018 6:47 PM
> > > > > > >
> > > > > > > This patch introduces the boilerplate for a new hypercall to allow a
> > > > > > > domain to control IOMMU mappings for its own pages.
> > > > > > > Whilst there is duplication of code between the native and compat entry
> > > > > > > points which appears ripe for some form of combination, I think it is
> > > > > > > better to maintain the separation as-is because the compat entry point
> > > > > > > will necessarily gain complexity in subsequent patches.
> > > > > > >
> > > > > > > NOTE: This hypercall is only implemented for x86 and is currently
> > > > > > >   restricted by XSM to dom0 since it could be used to cause IOMMU
> > > > > > >   faults which may bring down a host.
> > > > > > >
> > > > > > > Signed-off-by: Paul Durrant 
> > > > > > [...]
> > > > > > > +
> > > > > > > +
> > > > > > > +static bool can_control_iommu(void)
> > > > > > > +{
> > > > > > > +    struct domain *currd = current->domain;
> > > > > > > +
> > > > > > > +    /*
> > > > > > > +     * IOMMU mappings cannot be manipulated if:
> > > > > > > +     * - the IOMMU is not enabled or,
> > > > > > > +     * - the IOMMU is passed through or,
> > > > > > > +     * - shared EPT configured or,
> > > > > > > +     * - Xen is maintaining an identity map.
> > > > > >
> > > > > > "for dom0"
> > > > > >
> > > > > > > +     */
> > > > > > > +    if ( !iommu_enabled || iommu_passthrough ||
> > > > > > > +         iommu_use_hap_pt(currd) || need_iommu(currd) )
> > > > > >
> > > > > > I guess it's clearer to directly check iommu_dom0_strict here
> > > > >
> > > > > Well, the problem with that is that it totally ties this interface to dom0.
> > > > > Whilst, in practice, that is the case at the moment (because of the xsm
> > > > > check) I do want to leave the potential to allow other PV domains to control
> > > > > their IOMMU mappings, if that makes sense in future.
> > > > >
> > > >
> > > > first it's inconsistent from the comments - "Xen is maintaining
> > > > an identity map" which only applies to dom0.
> > >
> > > That's not true. If I assign a PCI device to an HVM domain, for instance,
> > > then need_iommu() is true for that domain and indeed Xen maintains a 1:1
> > > BFN:GFN map for that domain.
> > >
> > > >
> > > > second I'm afraid !need_iommu is not an accurate condition to represent
> > > > PV domain. what about iommu also enabled for future PV domains?
> > > >
> > >
> > > I don't quite follow... need_iommu is a per-domain flag, set for dom0 when

[Xen-devel] [xen-unstable-smoke test] 120051: tolerable all pass - PUSHED

2018-02-26 Thread osstest service owner
flight 120051 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120051/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl      13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl      14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  cd8b749282475caef095ea2f339a01d1ff9714ae
baseline version:
 xen  9683360c574dff909d728cf55a1ed310a8bc60bb

Last test of basis   120044  2018-02-26 18:01:25 Z    0 days
Testing same since   120051  2018-02-27 00:01:15 Z    0 days    1 attempts


People who touched revisions under test:
  Julien Grall 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   9683360c57..cd8b749282  cd8b749282475caef095ea2f339a01d1ff9714ae -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] x86: Handle the Xen MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Tian, Kevin
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Tuesday, February 27, 2018 1:35 AM
> 
> Dispatch from the guest_{rd,wr}msr() functions, after falling through from the
> !is_viridian_domain() case.
> 
> Rename {rd,wr}msr_hypervisor_regs() to guest_{rd,wr}msr_xen() for consistency,
> and because the _regs suffix isn't very appropriate.
> 
> Update them to take a vcpu pointer rather than presuming that they act on
> current, and switch to using X86EMUL_* return values.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Kevin Tian 

Re: [Xen-devel] [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Tian, Kevin
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Tuesday, February 27, 2018 1:35 AM
> 
> Dispatch from the guest_{rd,wr}msr() functions, after confirming that the
> domain is configured to use viridian.  This allows for simplification of the
> viridian helpers as they don't need to cope with the "not a viridian MSR"
> case.  It also means that viridian MSRs which are unimplemented, or excluded
> because of features, don't fall back into default handling path.
> 
> Rename {rd,wr}msr_viridian_regs() to guest_{rd,wr}msr_viridian() for
> consistency, and because the _regs suffix isn't very appropriate.
> 
> Update them to take a vcpu pointer rather than presuming that they act on
> current, which is safe for all implemented operations.  Also update them to
> use X86EMUL_* return values.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Kevin Tian 

Re: [Xen-devel] [PATCH 8/9] drm/xen-front: Implement GEM operations

2018-02-26 Thread Boris Ostrovsky
On 02/23/2018 10:35 AM, Oleksandr Andrushchenko wrote:
> On 02/23/2018 05:26 PM, Boris Ostrovsky wrote:
>> On 02/21/2018 03:03 AM, Oleksandr Andrushchenko wrote:
>>> +static struct xen_gem_object *gem_create(struct drm_device *dev, size_t size)
>>> +{
>>> +	struct xen_drm_front_drm_info *drm_info = dev->dev_private;
>>> +	struct xen_gem_object *xen_obj;
>>> +	int ret;
>>> +
>>> +	size = round_up(size, PAGE_SIZE);
>>> +	xen_obj = gem_create_obj(dev, size);
>>> +	if (IS_ERR_OR_NULL(xen_obj))
>>> +		return xen_obj;
>>> +
>>> +	if (drm_info->cfg->be_alloc) {
>>> +		/*
>>> +		 * backend will allocate space for this buffer, so
>>> +		 * only allocate array of pointers to pages
>>> +		 */
>>> +		xen_obj->be_alloc = true;
>> If be_alloc is a flag (which I am not sure about) --- should it be set
>> to true *after* you've successfully allocated your things?
> this is a configuration option telling about the way
> the buffer gets allocated: either by the frontend or
> backend (be_alloc -> buffer allocated by the backend)


I can see how drm_info->cfg->be_alloc might be a configuration option
but xen_obj->be_alloc is set here and that's not how configuration
options typically behave.


>>
>>> +		ret = gem_alloc_pages_array(xen_obj, size);
>>> +		if (ret < 0) {
>>> +			gem_free_pages_array(xen_obj);
>>> +			goto fail;
>>> +		}
>>> +
>>> +		ret = alloc_xenballooned_pages(xen_obj->num_pages,
>>> +				xen_obj->pages);
>> Why are you allocating balloon pages?
> in this use-case we map pages provided by the backend
> (yes, I know this can be a problem from both security
> POV and that DomU can die holding pages of Dom0 forever:
> but still it is a configuration option, so user decides
> if her use-case needs this and takes responsibility for
> such a decision).


Perhaps I am missing something here but when you say "I know this can be
a problem from both security POV ..." then there is something wrong with
your solution.

-boris

>
> Please see description of the buffering modes in xen_drm_front.h
> specifically for backend allocated buffers:
>  
> ***
>
>  * 2. Buffers allocated by the backend
>  
> ***
>
>  *
>  * This mode of operation is run-time configured via guest domain
> configuration
>  * through XenStore entries.
>  *
>  * For systems which do not provide IOMMU support, but having specific
>  * requirements for display buffers it is possible to allocate such
> buffers
>  * at backend side and share those with the frontend.
>  * For example, if host domain is 1:1 mapped and has DRM/GPU hardware
> expecting
>  * physically contiguous memory, this allows implementing zero-copying
>  * use-cases.
>
>>
>> -boris
>>
>>> +		if (ret < 0) {
>>> +			DRM_ERROR("Cannot allocate %zu ballooned pages: %d\n",
>>> +					xen_obj->num_pages, ret);
>>> +			goto fail;
>>> +		}
>>> +
>>> +		return xen_obj;
>>> +	}
>>> +	/*
>>> +	 * need to allocate backing pages now, so we can share those
>>> +	 * with the backend
>>> +	 */
>>> +	xen_obj->num_pages = DIV_ROUND_UP(size, PAGE_SIZE);
>>> +	xen_obj->pages = drm_gem_get_pages(&xen_obj->base);
>>> +	if (IS_ERR_OR_NULL(xen_obj->pages)) {
>>> +		ret = PTR_ERR(xen_obj->pages);
>>> +		xen_obj->pages = NULL;
>>> +		goto fail;
>>> +	}
>>> +
>>> +	return xen_obj;
>>> +
>>> +fail:
>>> +	DRM_ERROR("Failed to allocate buffer with size %zu\n", size);
>>> +	return ERR_PTR(ret);
>>> +}
>>> +
>>>
>



[Xen-devel] [xen-unstable-smoke test] 120044: tolerable all pass - PUSHED

2018-02-26 Thread osstest service owner
flight 120044 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120044/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl      13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl      14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  9683360c574dff909d728cf55a1ed310a8bc60bb
baseline version:
 xen  a823a5280f25ad19a751dd9a41044f556471e61a

Last test of basis   119966  2018-02-23 14:33:58 Z    3 days
Testing same since   120044  2018-02-26 18:01:25 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monne 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images



Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   a823a5280f..9683360c57  9683360c574dff909d728cf55a1ed310a8bc60bb -> smoke


Re: [Xen-devel] [PATCH 1/2] xen: fix out-of-bounds irq unbind for MSI message groups

2018-02-26 Thread Boris Ostrovsky
On 02/26/2018 12:36 PM, Amit Shah wrote:
> When an MSI descriptor was not available, the error path would try to
> unbind an irq that was never acquired - potentially unbinding an
> unrelated irq.
>
> Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> Reported-by: Hooman Mirhadi 
> CC: 
> CC: Roger Pau Monné 
> CC: David Vrabel 
> CC: Boris Ostrovsky 
> CC: Eduardo Valentin 
> CC: Juergen Gross 
> CC: Thomas Gleixner 
> CC: "K. Y. Srinivasan" 
> CC: Liu Shuo 
> CC: Anoob Soman 
> Signed-off-by: Amit Shah 
> ---
>  drivers/xen/events/events_base.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index 1ab4bd1..b6b8b29 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -749,6 +749,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>   }
>  
>   ret = irq_set_msi_desc(irq, msidesc);
> + i--;
>   if (ret < 0)
>   goto error_irq;

We really only need to do this in case of an error.

(And this patch needs to go to stable trees as well.)

Thanks
-boris
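
For readers following the thread, the off-by-one can be reproduced outside the kernel. What follows is a user-space model, not the events_base.c code: NVEC, bound[], unbind() and error_path() are made-up stand-ins. It assumes the setup loop ran to completion, so the counter equals nvec when irq_set_msi_desc() fails, and it compares the unguarded cleanup loop with one that steps the counter back only on that error path, as suggested in the reply above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NVEC 4              /* hypothetical number of MSI vectors */

/* One extra slot so the model can detect an out-of-range unbind. */
static bool bound[NVEC + 1];
static int  stray_unbinds;  /* unbinds of slots outside the group */

/* Mark slots 0..NVEC-1 as set up; the extra slot belongs to someone else. */
static void reset_model(void)
{
    for (size_t k = 0; k <= NVEC; k++)
        bound[k] = (k < NVEC);
    stray_unbinds = 0;
}

/* Stand-in for __unbind_from_irq(irq + idx). */
static void unbind(int idx)
{
    assert(idx >= 0 && idx <= NVEC);
    if (idx >= NVEC)
        stray_unbinds++;    /* touched an irq that was never acquired */
    bound[idx] = false;
}

/*
 * Model of the error_irq label. 'i' is the loop counter at the time the
 * failure is detected. 'late_failure' is true when the setup loop had
 * already finished (i == NVEC) and a later step failed.
 * Returns the number of out-of-range unbinds performed.
 */
static int error_path(int i, bool late_failure, bool apply_fix)
{
    if (late_failure && apply_fix)
        i--;                /* step back only on this error path */
    for (; i >= 0; i--)
        unbind(i);
    return stray_unbinds;
}
```

Calling error_path(NVEC, true, false) reports one stray unbind, while error_path(NVEC, true, true) reports none and still releases every slot in the group.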




Re: [Xen-devel] [PATCH 2/5] x86/pv: Avoid leaking other guests' MSR_TSC_AUX values into PV context

2018-02-26 Thread Boris Ostrovsky
On 02/20/2018 06:58 AM, Andrew Cooper wrote:
> If the CPU pipeline supports RDTSCP or RDPID, a guest can observe the value in
> MSR_TSC_AUX, irrespective of whether the relevant CPUID features are
> advertised/hidden.
>
> At the moment, paravirt_ctxt_switch_to() only writes to MSR_TSC_AUX if
> TSC_MODE_PVRDTSCP mode is enabled, but this is not the default mode.
> Therefore, default PV guests can read the value from a previously scheduled
> HVM vcpu, or TSC_MODE_PVRDTSCP-enabled PV guest.
>
> Alter the PV path to always write to MSR_TSC_AUX, using 0 in the common case.
>
> To amortise overhead cost, introduce wrmsr_tsc_aux() which performs a lazy
> update of the MSR, and use this function consistently across the codebase.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Jun Nakajima 
> CC: Kevin Tian 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Konrad Rzeszutek Wilk 

Reviewed-by: Boris Ostrovsky 

(Apologies for the delay. I am quite behind with my emails)
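
The lazy update described in the commit message can be sketched in plain C. This is only an illustration under assumptions: the variables below stand in for per-CPU state, hw_wrmsr_tsc_aux() models the expensive hardware write, and none of the names other than the idea of wrmsr_tsc_aux() come from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for per-CPU state; a hypervisor would keep one copy per CPU. */
static uint32_t hw_tsc_aux;       /* models the value held in the MSR */
static uint32_t cached_tsc_aux;   /* last value written by this code */
static unsigned int hw_writes;    /* counts the expensive MSR writes */

/* Models the costly hardware write. */
static void hw_wrmsr_tsc_aux(uint32_t val)
{
    hw_tsc_aux = val;
    hw_writes++;
}

/* Lazy update: skip the hardware write when the MSR already holds 'val'. */
static void lazy_wrmsr_tsc_aux(uint32_t val)
{
    /* The cache is only useful if it always mirrors the hardware. */
    assert(cached_tsc_aux == hw_tsc_aux);

    if (cached_tsc_aux != val) {
        hw_wrmsr_tsc_aux(val);
        cached_tsc_aux = val;
    }
}

/* Switch to a PV vcpu: always write, using 0 in the common case. */
static void switch_to_pv(int pvrdtscp_mode, uint32_t guest_val)
{
    lazy_wrmsr_tsc_aux(pvrdtscp_mode ? guest_val : 0);
}

/* Switch to an HVM vcpu, which carries its own TSC_AUX value. */
static void switch_to_hvm(uint32_t guest_val)
{
    lazy_wrmsr_tsc_aux(guest_val);
}
```

In this model, switching from an HVM vcpu to a default PV vcpu costs one write to clear the value, and back-to-back switches between default PV vcpus cost none.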


Re: [Xen-devel] [RFC PATCH 0/5] x86: Multiple fixes to MSR_TSC_AUX and RDTSCP handling for guests

2018-02-26 Thread Boris Ostrovsky
On 02/26/2018 02:12 PM, Andrew Cooper wrote:
> On 20/02/18 11:58, Andrew Cooper wrote:
>> This rat's nest was discovered when finding that MSR_TSC_AUX leaked into PV
>> guests.  It is RFC because I haven't done extensive testing on the result, and
>> because there are some functional changes for the virtualised TSC modes.
>>
>> Andrew Cooper (5):
>>   x86/hvm: Don't shadow the domain parameter in hvm_save_cpu_msrs()
>>   x86/pv: Avoid leaking other guests' MSR_TSC_AUX values into PV context
>>   x86/time: Rework pv_soft_rdtsc() to aid further cleanup
>>   x86/pv: Remove deferred RDTSC{,P} handling in pv_emulate_privileged_op()
>>   x86: Rework MSR_TSC_AUX handling from scratch.
> Konrad/Boris: Can we have any input WRT TSC_MODE_PVRDTSCP usage?  Are
> you still using the feature, or is it abandoned?


I already asked a few internal teams about it, but haven't heard back.

-boris


Re: [Xen-devel] [PATCH 3/6] x86: Handle the Xen MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Boris Ostrovsky
On 02/26/2018 12:35 PM, Andrew Cooper wrote:
> Dispatch from the guest_{rd,wr}msr() functions, after falling through from the
> !is_viridian_domain() case.
>
> Rename {rd,wr}msr_hypervisor_regs() to guest_{rd,wr}msr_xen() for consistency,
> and because the _regs suffix isn't very appropriate.
>
> Update them to take a vcpu pointer rather than presuming that they act on
> current, and switch to using X86EMUL_* return values.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Jun Nakajima 
> CC: Paul Durrant 
> CC: Kevin Tian 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Sergey Dyasli 
> ---
>  xen/arch/x86/hvm/svm/svm.c  | 25 -
>  xen/arch/x86/hvm/vmx/vmx.c  | 24 
>  xen/arch/x86/msr.c  |  8 
>  xen/arch/x86/pv/emul-priv-op.c  |  6 --
>  xen/arch/x86/traps.c| 33 -
>  xen/include/asm-x86/processor.h |  4 ++--
>  6 files changed, 34 insertions(+), 66 deletions(-)


Reviewed-by: Boris Ostrovsky 


Re: [Xen-devel] [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Boris Ostrovsky
On 02/26/2018 12:35 PM, Andrew Cooper wrote:
> Dispatch from the guest_{rd,wr}msr() functions, after confirming that the
> domain is configured to use viridian.  This allows for simplification of the
> viridian helpers as they don't need to cope with the "not a viridian MSR"
> case.  It also means that viridian MSRs which are unimplemented, or excluded
> because of features, don't fall back into default handling path.
>
> Rename {rd,wr}msr_viridian_regs() to guest_{rd,wr}msr_viridian() for
> consistency, and because the _regs suffix isn't very appropriate.
>
> Update them to take a vcpu pointer rather than presuming that they act on
> current, which is safe for all implemented operations.  Also update them to
> use X86EMUL_* return values.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Jun Nakajima 
> CC: Paul Durrant 
> CC: Kevin Tian 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Sergey Dyasli 

Reviewed-by: Boris Ostrovsky 




Re: [Xen-devel] [RFC PATCH 0/5] x86: Multiple fixes to MSR_TSC_AUX and RDTSCP handling for guests

2018-02-26 Thread Andrew Cooper
On 20/02/18 11:58, Andrew Cooper wrote:
> This rat's nest was discovered when finding that MSR_TSC_AUX leaked into PV
> guests.  It is RFC because I haven't done extensive testing on the result, and
> because there are some functional changes for the virtualised TSC modes.
>
> Andrew Cooper (5):
>   x86/hvm: Don't shadow the domain parameter in hvm_save_cpu_msrs()
>   x86/pv: Avoid leaking other guests' MSR_TSC_AUX values into PV context
>   x86/time: Rework pv_soft_rdtsc() to aid further cleanup
>   x86/pv: Remove deferred RDTSC{,P} handling in pv_emulate_privileged_op()
>   x86: Rework MSR_TSC_AUX handling from scratch.

Konrad/Boris: Can we have any input WRT TSC_MODE_PVRDTSCP usage?  Are
you still using the feature, or is it abandoned?

~Andrew


[Xen-devel] [ping] Re: [PATCH 2/5] x86/pv: Avoid leaking other guests' MSR_TSC_AUX values into PV context

2018-02-26 Thread Andrew Cooper
On 26/02/18 11:25, Jan Beulich wrote:
 On 20.02.18 at 12:58,  wrote:
>> If the CPU pipeline supports RDTSCP or RDPID, a guest can observe the value in
>> MSR_TSC_AUX, irrespective of whether the relevant CPUID features are
>> advertised/hidden.
>>
>> At the moment, paravirt_ctxt_switch_to() only writes to MSR_TSC_AUX if
>> TSC_MODE_PVRDTSCP mode is enabled, but this is not the default mode.
>> Therefore, default PV guests can read the value from a previously scheduled
>> HVM vcpu, or TSC_MODE_PVRDTSCP-enabled PV guest.
>>
>> Alter the PV path to always write to MSR_TSC_AUX, using 0 in the common case.
>>
>> To amortise overhead cost, introduce wrmsr_tsc_aux() which performs a lazy
>> update of the MSR, and use this function consistently across the codebase.
>>
>> Signed-off-by: Andrew Cooper 
> Despite me continuing to think that RDTSCP and RDPID should be
> fully independent features, this being in line with the SDM:
> Acked-by: Jan Beulich 

Thanks.

Given the importance of this patch, I feel it is time to ping the VT-x
and SVM maintainers for their input.

~Andrew


Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-26 Thread Boris Ostrovsky
On 02/26/2018 06:08 AM, Juergen Gross wrote:
> Today the hvc console is added as a preferred console for pv domUs
> only. As this requires a boot parameter for getting dom0 messages per
> default add it for dom0, too.
>
> Signed-off-by: Juergen Gross 
> ---
>  arch/x86/xen/enlighten_pv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
> index c047f42552e1..d27740a80c5e 100644
> --- a/arch/x86/xen/enlighten_pv.c
> +++ b/arch/x86/xen/enlighten_pv.c
> @@ -1377,7 +1377,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
>   if (!xen_initial_domain()) {
>   add_preferred_console("xenboot", 0, NULL);
>   add_preferred_console("tty", 0, NULL);
> - add_preferred_console("hvc", 0, NULL);
>   if (pci_xen)
>   x86_init.pci.arch_init = pci_xen_init;
>   } else {
> @@ -1410,6 +1409,9 @@ asmlinkage __visible void __init xen_start_kernel(void)
>  
>   xen_boot_params_init_edd();
>   }
> +
> + add_preferred_console("hvc", 0, NULL);
> +

Won't this prevent dom0 output from showing up on vga console by default?

-boris

>  #ifdef CONFIG_PCI
>   /* PCI BIOS service won't work from a PV guest. */
>   pci_probe &= ~PCI_PROBE_BIOS;



Re: [Xen-devel] [PATCH] x86/link: Don't merge .init.text and .init.data

2018-02-26 Thread Andrew Cooper
On 12/01/18 16:05, Jan Beulich wrote:
 On 12.01.18 at 16:41,  wrote:
>> On 12/01/18 11:23, Jan Beulich wrote:
>>> Even if you make .init.data its own section again, some
>>> garbage will remain (mainly due to the trampoline data).
>> That is far easier to spot and isolate, because it is tiny and reasonably
>> obviously bounded.
> Well, I've pointed out where the easy to spot bounding is:
>
>>>  If you
>>> care to use tools to find certain patterns in .init, simply discard
>>> everything following _einittext before giving the tool a go. IOW I
> ^^^

Perhaps easy if you're reading the disassembly, and are one half of the
Xen x86 maintainership.

I know about _einittext, but don't consider this easy, and it certainly
isn't reasonable to expect of anyone trying to use xen-syms.

>
>>> would much prefer for things to be left as they are.
>> No.  We should not be deliberately corrupting the binary to work around
>> a theoretical runtime issue for which there is a perfectly fine runtime
>> solution.  This is the real hack.
> Putting all init-time code and data in a single section is a perfectly
> valid thing to do imo.

Having our debug symbols borderline unusable isn't valid.

Certainly not to work around what you yourself identify as a theoretical
issue in the first place.

> We don't care about permissions at that
> point in time, so the resulting RWX section is quite fine (and has a
> name much better suitable for a PE image). Otherwise the fact
> that we merge r/o and r/w init time data into a single section
> would need to be called a hack, too.

It is very much a hack.  I don't see what you're getting at.

The other option would be to #ifdef the section handling based on EFI, or
attempt to merge the sections with objcopy before passing them to mkreloc?

~Andrew


Re: [Xen-devel] [PATCH 2/2] xen: events: free irqs in error condition

2018-02-26 Thread Shah, Amit

On Mon, 2018-02-26 at 18:14 +, Roger Pau Monné wrote:
> On Mon, Feb 26, 2018 at 05:36:35PM +, Amit Shah wrote:
> > 
> > In case of errors in irq setup for MSI, free up the allocated irqs.
> > 
> > Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> > Reported-by: Hooman Mirhadi 
> > CC: 
> > CC: Roger Pau Monné 
> > CC: David Vrabel 
> > CC: Boris Ostrovsky 
> > CC: Eduardo Valentin 
> > CC: Juergen Gross 
> > CC: Thomas Gleixner 
> > CC: "K. Y. Srinivasan" 
> > CC: Liu Shuo 
> > CC: Anoob Soman 
> > Signed-off-by: Amit Shah 
> > ---
> >  drivers/xen/events/events_base.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> > index b6b8b29..96aa575 100644
> > --- a/drivers/xen/events/events_base.c
> > +++ b/drivers/xen/events/events_base.c
> > @@ -758,6 +758,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
> >  error_irq:
> >     for (; i >= 0; i--)
> >     __unbind_from_irq(irq + i);
> > +   xen_free_irq(irq);
> Hm, xen_free_irq calls irq_free_desc, which is irq_free_descs(irq, 1),

Er...  right.

> I think you will have to introduce a new free function:
> 
> xen_free_irqs(unsigned irq, unsigned int nr)
> 
> That calls irq_free_descs(irq, nr)

Actually, xen_free_irq() is already done in __unbind_from_irq(), so
this patch is actually wrong and not needed.


Amit
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at the Charlottenburg district court (Amtsgericht Charlottenburg), HRB 149173 B

[Xen-devel] [PATCH V2] libxl: set channel devid when not provided by application

2018-02-26 Thread Jim Fehlig
Applications like libvirt may not populate a device devid field,
delegating that to libxl. If needed, the application can later
retrieve the libxl-produced devid. Indeed most devices are handled
this way in libvirt, channel devices included.

This works well when only one channel device is defined, but more
than one results in

qemu-system-i386: -chardev socket,id=libxl-channel-1,\
path=/tmp/test-org.qemu.guest_agent.00,server,nowait:
Duplicate ID 'libxl-channel-1' for chardev

Besides the odd '-1' value in the id, multiple channels have the same
id, causing qemu to fail. A simple fix is to set an uninitialized
devid (-1) to the dev_num passed to libxl__init_console_from_channel().

Signed-off-by: Jim Fehlig 
---

V2:
Set console devid to channel devid as part of initializing a console
from a channel.

 tools/libxl/libxl_console.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_console.c b/tools/libxl/libxl_console.c
index 39d8430df8..9a02a23c2a 100644
--- a/tools/libxl/libxl_console.c
+++ b/tools/libxl/libxl_console.c
@@ -401,6 +401,9 @@ int libxl__init_console_from_channel(libxl__gc *gc,
 
 /* Perform validation first, allocate second. */
 
+if (channel->devid == -1)
+channel->devid = dev_num;
+
 if (!channel->name) {
 LOG(ERROR, "channel %d has no name", channel->devid);
 return ERROR_INVAL;
@@ -446,7 +449,7 @@ int libxl__init_console_from_channel(libxl__gc *gc,
 abort();
 }
 
-console->devid = dev_num;
+console->devid = channel->devid;
 console->consback = LIBXL__CONSOLE_BACKEND_IOEMU;
 console->backend_domid = channel->backend_domid;
 console->name = libxl__strdup(NOGC, channel->name);
-- 
2.16.1
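
The collision described in the commit message, and the effect of the fix, can be modelled in a few lines of stand-alone C. This is a sketch under assumptions: struct channel is a trimmed-down stand-in rather than the libxl type, and the "libxl-channel%d" format is inferred from the qemu error quoted above, not copied from libxl. Only the "fill in devid when it is -1" step mirrors the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for a channel device description. */
struct channel {
    int devid;          /* -1 means "not set by the application" */
    const char *name;
};

/* Formats a chardev id in the style seen in the qemu error message. */
static void chardev_id(const struct channel *ch, char *buf, size_t len)
{
    int n = snprintf(buf, len, "libxl-channel%d", ch->devid);

    /* The sketch expects callers to pass a buffer that is large enough. */
    assert(n >= 0 && (size_t)n < len);
}

/* Mirrors the idea of the fix: fill in an unset devid from dev_num. */
static void assign_devid(struct channel *ch, int dev_num)
{
    if (ch->devid == -1)
        ch->devid = dev_num;
}

/* Returns 1 if the two channels would produce the same chardev id. */
static int ids_collide(const struct channel *a, const struct channel *b)
{
    char ida[32], idb[32];

    chardev_id(a, ida, sizeof(ida));
    chardev_id(b, idb, sizeof(idb));
    return strcmp(ida, idb) == 0;
}
```

Two channels left at devid -1 both format to the same id, which is the duplicate qemu rejects. After assign_devid() is called with distinct dev_num values the ids differ, and a devid already chosen by the application is left untouched.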



Re: [Xen-devel] [PATCH] x86/pv: Rename pv/ro-page-fault.c to pv/emul-ro-page-fault.c

2018-02-26 Thread Andrew Cooper
On 05/02/18 13:01, Jan Beulich wrote:
 On 05.02.18 at 13:22,  wrote:
>> On 05/02/18 08:57, Jan Beulich wrote:
>> On 02.02.18 at 17:58,  wrote:
 To match all our other emulation handling.

 No functional change.

 Signed-off-by: Andrew Cooper 
 ---
 CC: Jan Beulich 
 ---
  xen/arch/x86/pv/Makefile  | 2 +-
  xen/arch/x86/pv/{ro-page-fault.c => emul-ro-page-fault.c} | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
  rename xen/arch/x86/pv/{ro-page-fault.c => emul-ro-page-fault.c} (99%)
>>> When this file was introduced, iirc I had specifically asked to drop
>>> the pointless emul- prefix. If you want to make things consistent
>>> again, please instead drop the emul- prefixes of the other files.
>> No.
>>
>> First of all, this file is the most recent to come into existence,
>> around 3 months after the others.
> Right - it was too late for me to realize the needlessly long names
> in those earlier code movement patches.

That is a very subjective point of view which I don't agree with.

Naming is all to do with conveying meaning, and shorter isn't
necessarily better.

>
>> The point of naming things in a consistent fashion is for the benefit of
>> humans, and having the emulation related functionality logically grouped
>> is a benefit, not a detriment.
> They're all quite well grouped now already by being in pv/.

That is not the relevant grouping.  Most of our emulation based logic
has an emul- prefix and this file is an odd one out.

Naming the files without their emul- prefix leaves them with no context
as to what they are doing.  "gate-op.c" or "invl-op.c" are far less
obvious to their purpose than "emul-gate-op.c" and "emul-invl-op.c".

> Otherwise do you mean to also change e.g. gpr_switch.S to emul-gpr_switch.S?

This is extra special, and as soon as I can figure out how it actually
works, I plan to replace it with something comprehensive.

~Andrew


Re: [Xen-devel] [PATCH 2/2] xen: events: free irqs in error condition

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:36:35PM +, Amit Shah wrote:
> In case of errors in irq setup for MSI, free up the allocated irqs.
> 
> Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> Reported-by: Hooman Mirhadi 
> CC: 
> CC: Roger Pau Monné 
> CC: David Vrabel 
> CC: Boris Ostrovsky 
> CC: Eduardo Valentin 
> CC: Juergen Gross 
> CC: Thomas Gleixner 
> CC: "K. Y. Srinivasan" 
> CC: Liu Shuo 
> CC: Anoob Soman 
> Signed-off-by: Amit Shah 
> ---
>  drivers/xen/events/events_base.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index b6b8b29..96aa575 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -758,6 +758,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>  error_irq:
>   for (; i >= 0; i--)
>   __unbind_from_irq(irq + i);
> + xen_free_irq(irq);

Hm, xen_free_irq calls irq_free_desc, which is irq_free_descs(irq, 1),
I think you will have to introduce a new free function:

xen_free_irqs(unsigned irq, unsigned int nr)

That calls irq_free_descs(irq, nr)

Thanks, Roger.


Re: [Xen-devel] [PATCH] common/gnttab: Introduce command line feature controls

2018-02-26 Thread Andrew Cooper
On 05/02/18 13:14, George Dunlap wrote:
> On 02/05/2018 12:56 PM, Jan Beulich wrote:
> On 05.02.18 at 12:55,  wrote:
>>> On 02/02/18 08:51, Jan Beulich wrote:
>>> On 01.02.18 at 15:38,  wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -916,6 +916,19 @@ Controls EPT related features.
>  
>  Specify which console gdbstub should use. See **console**.
>  
> +### gnttab
> +> `= List of [ max_ver:, transitive= ]`
 I realize you don't want to change this as people already use it, but
 I'd still like to give my usual comment: I'd prefer if we could avoid
 introducing further underscore-containing (sub)options. I really don't
 understand why everyone does this: Dashes are easier to type on
 all keyboards I'm aware of, and there's no need to mimic C identifier
 names for command line options.
>>> I can introduce a max-ver alias if you insist, but dropping max_ver here
>>> is going to break users who took this patch for XSA-226.
>> Hence the way I've worded my reply - I don't mean to insist on
>> changing what you have, or the introduction of an alias. I merely
>> wanted to give the comment, in the hope that it helps to avoid
>> future underscores in command line option names.
> FWIW I often end up looking at other options and name things similarly;
> so making the documentation say "max-ver", but accepting both "max-ver"
> and "max_ver", would probably make it more likely that future options
> would start out as having a dash rather than an underscore.
>
> But it's just a suggestion; I wouldn't push for it.

So how to unblock this?  There are no concrete suggestions, and no
concrete objections to the patch in its current form.

~Andrew
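
For reference, the "accept both spellings" idea George describes could look roughly like the helper below. This is a hypothetical sketch, not Xen's actual command line parser: opt_name_matches() and its signature are assumptions made for illustration. It compares a user-supplied sub-option name against the documented spelling while treating '-' and '_' as the same character.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if the first len characters of
 * "given" spell "canonical", treating '-' and '_' as interchangeable.
 * "given" need not be NUL-terminated at len (it may be followed by
 * ':' or '=' and a value).
 */
static bool opt_name_matches(const char *given, size_t len,
                             const char *canonical)
{
    size_t i;

    for (i = 0; i < len; i++) {
        char g = given[i] == '_' ? '-' : given[i];
        char c = canonical[i] == '_' ? '-' : canonical[i];

        /* Canonical name ended early, or the characters differ. */
        if (c == '\0' || g != c)
            return false;
    }

    /* Reject prefixes: the canonical name must end here too. */
    return canonical[i] == '\0';
}
```

A parser built this way could document "max-ver" while still accepting "max_ver" from users who took the XSA-226 patch.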

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 04:20:17AM -0700, Jan Beulich wrote:
> (re-sending with xen-devel re-added; not sure how it got lost)
> 
> >>> On 23.01.18 at 16:07,  wrote:
> > ---
> >  tools/tests/vpci/emul.h   |   1 +
> >  xen/arch/x86/hvm/ioreq.c  |   4 +
> 
> Again the Cc to Paul is missing (no matter that it's just a tiny change).

Sorry, I will run get_maintainer.pl against the whole patchset before
sending a new version.

> > +static int map_range(unsigned long s, unsigned long e, void *data,
> > + unsigned long *c)
> > +{
> > +const struct map_data *map = data;
> > +int rc;
> > +
> > +for ( ; ; )
> > +{
> > +unsigned long size = e - s + 1;
> 
> Considering the lack of a condition in the for() and the inclusiveness
> of the range (which means you can't express an empty range) I don't
> understand how ...
> 
> > +/*
> > + * ARM TODOs:
> > + * - On ARM whether the memory is prefetchable or not should be passed
> > + *   to map_mmio_regions in order to decide which memory attributes
> > + *   should be used.
> > + *
> > + * - {un}map_mmio_regions doesn't support preemption, hence the bodge
> > + *   below in order to limit the amount of mappings to 64 pages for
> > + *   each function call.
> > + */
> > +
> > +#ifdef CONFIG_ARM
> > +size = min(64ul, size);
> > +#endif
> > +
> > +rc = (map->map ? map_mmio_regions : unmap_mmio_regions)
> > + (map->d, _gfn(s), size, _mfn(s));
> > +if ( rc == 0 )
> > +{
> > +*c += size;
> > +#ifdef CONFIG_ARM
> > +rc = -ERESTART;
> > +#endif
> > +break;
> 
> ... this works in the ARM case. If the whole thing doesn't work (and
> doesn't get built) for ARM right now, make this obvious by one or more
> #error directives.

ARM will never iterate, instead it will rely on always returning
-ERESTART and letting the caller of rangeset_consume_ranges deal with
it. What's wrong here is that the call to rangeset_consume_ranges
should be:

while ( rangeset_consume_ranges(...) == -ERESTART );

But that makes the code even more convoluted than what it already is.
I've basically added the ARM part because Julien requested it, but I
think the right way to fix this is to unify the behaviour of
{un}map_mmio_regions between x86 and ARM.

I will drop the ARM chunks.
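
To make the restart pattern above concrete, here is a standalone model of it. It is a sketch under assumptions, not the vPCI code: the model_* names, MODEL_ERESTART and the single-range "rangeset" are all hypothetical. Each call handles at most 64 pages of an inclusive range and returns -MODEL_ERESTART while work remains; the caller loops in the while ( ... == -ERESTART ) form.

```c
/* Arbitrary stand-in for the hypervisor's ERESTART value. */
#define MODEL_ERESTART 85

/* Pages handled per call, mirroring the 64-page limit quoted above. */
#define MODEL_CHUNK 64ul

/* Inclusive range [s, e]; such a range can never be empty. */
struct model_range {
    unsigned long s, e;
};

/*
 * Consume at most MODEL_CHUNK pages from the front of the range.
 * Returns 0 once the range is exhausted, -MODEL_ERESTART otherwise.
 */
static int model_consume(struct model_range *r, unsigned long *done)
{
    unsigned long remaining = r->e - r->s + 1;
    unsigned long size = remaining < MODEL_CHUNK ? remaining : MODEL_CHUNK;

    /* A real implementation would map or unmap [s, s + size) here. */
    *done += size;

    if (size == remaining)
        return 0;

    r->s += size;
    return -MODEL_ERESTART;
}

/* Caller-side loop; returns how many calls were needed. */
static unsigned int model_consume_all(struct model_range *r,
                                      unsigned long *done)
{
    unsigned int calls = 1;

    while (model_consume(r, done) == -MODEL_ERESTART)
        calls++;

    return calls;
}
```

For a 200-page range this takes four calls (64 + 64 + 64 + 8 pages). In the real code the restart would give the scheduler a chance to run between calls rather than spinning as this model does.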

> > +static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom)
> > +{
> > +struct vpci_header *header = &pdev->vpci->header;
> > +uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> > +unsigned int i;
> > +
> > +for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> > +{
> > +if ( rom && header->bars[i].type == VPCI_BAR_ROM )
> > +{
> > +unsigned int rom_pos = (i == 6) ? PCI_ROM_ADDRESS
> 
> I probably should have mentioned this earlier, but I'm really unhappy
> about such literal "magic" numbers. Please use a suitable expression
> or #define.
> 
> > +bool vpci_process_pending(struct vcpu *v)
> > +{
> > +while ( v->vpci.mem )
> > +{
> > +struct map_data data = {
> > +.d = v->domain,
> > +.map = v->vpci.map,
> > +};
> > +
> > +switch ( rangeset_consume_ranges(v->vpci.mem, map_range, &data) )
> > +{
> > +case -ERESTART:
> > +return true;
> > +
> > +default:
> > +if ( v->vpci.map )
> > +{
> > +spin_lock(&v->vpci.pdev->vpci->lock);
> > +modify_decoding(v->vpci.pdev, v->vpci.map, v->vpci.rom);
> > +spin_unlock(&v->vpci.pdev->vpci->lock);
> > +}
> > +/* fallthrough. */
> > +case -ENOMEM:
> 
> You carefully handle this error here.

On second thought, I'm not sure handling ENOMEM separately makes
sense. Unless you object I plan to remove this special casing.

> > +static void maybe_defer_map(struct domain *d, const struct pci_dev *pdev,
> > +struct rangeset *mem, bool map, bool rom)
> > +{
> > +struct vcpu *curr = current;
> > +
> > +if ( is_idle_vcpu(curr) )
> > +{
> > +struct map_data data = { .d = d, .map = true };
> > +
> > +/*
> > + * Only used for domain construction in order to map the BARs
> > + * of devices with memory decoding enabled.
> > + */
> > +ASSERT(map && !rom);
> > +rangeset_consume_ranges(mem, map_range, &data);
> 
> What if this produces -ENOMEM? And despite having looked at
> several revisions of this, I can't make the connection to why this
> is in an is_idle_vcpu() conditional (neither the direct caller nor the
> next level up make this obvious to me). There's clearly a need
> for extending the comment.

I thought the comment above that mentions domain construction would be
enough. I can try to expand this, maybe like:

"This function will only 

[Xen-devel] [linux-3.18 test] 120010: regressions - FAIL

2018-02-26 Thread osstest service owner
flight 120010 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120010/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 17 guest-localmigrate/x10 fail REGR. vs. 119432
 test-amd64-i386-xl-qemut-debianhvm-amd64 16 guest-localmigrate/x10 fail REGR. vs. 119432

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 119432
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 119432
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 119432
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 119432
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119432
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 119432
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119432
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 build-arm64-pvops 6 kernel-build fail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux                f8f8e8c5bbed6c3941845a1b7956bd893818f29f
baseline version:
 linux                0c946219398a3108a9fe8dbc5096586bdcc797d6

Last test of basis   119432  2018-02-16 20:38:28 Z    9 days
Testing same since   120010  2018-02-25 10:46:44 Z    1 days    1 attempts


People who touched revisions under test:
  Al Viro 
  Alex Deucher 
  Aliaksei Karaliou 
  Anand Moon 

[Xen-devel] [PATCH 1/6] x86/vmx: Simplfy the default cases in vmx_msr_{read, write}_intercept()

2018-02-26 Thread Andrew Cooper
The default case of vmx_msr_write_intercept() in particular is very tangled.

First of all, fold long_mode_do_msr_{read,write}() into their callers.  These
functions were split out in the past because of the 32bit build of Xen, but it
is unclear why the cases weren't simply #ifdef'd in place.

Next, invert the vmx_write_guest_msr()/is_last_branch_msr() logic to break if
the condition is satisfied, rather than nesting if it wasn't.  This allows the
wrmsr_hypervisor_regs() call to be un-nested with respect to the other default
logic.

No practical difference from a guest's point of view.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Kevin Tian 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Sergey Dyasli 
---
 xen/arch/x86/hvm/vmx/vmx.c | 214 ++---
 1 file changed, 86 insertions(+), 128 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8f6d87b..e1e4f17 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -482,102 +482,6 @@ static void vmx_vcpu_destroy(struct vcpu *v)
 passive_domain_destroy(v);
 }
 
-static int long_mode_do_msr_read(unsigned int msr, uint64_t *msr_content)
-{
-struct vcpu *v = current;
-
-switch ( msr )
-{
-case MSR_FS_BASE:
-__vmread(GUEST_FS_BASE, msr_content);
-break;
-
-case MSR_GS_BASE:
-__vmread(GUEST_GS_BASE, msr_content);
-break;
-
-case MSR_SHADOW_GS_BASE:
-rdmsrl(MSR_SHADOW_GS_BASE, *msr_content);
-break;
-
-case MSR_STAR:
-*msr_content = v->arch.hvm_vmx.star;
-break;
-
-case MSR_LSTAR:
-*msr_content = v->arch.hvm_vmx.lstar;
-break;
-
-case MSR_CSTAR:
-*msr_content = v->arch.hvm_vmx.cstar;
-break;
-
-case MSR_SYSCALL_MASK:
-*msr_content = v->arch.hvm_vmx.sfmask;
-break;
-
-default:
-return X86EMUL_UNHANDLEABLE;
-}
-
-HVM_DBG_LOG(DBG_LEVEL_MSR, "msr %#x content %#"PRIx64, msr, *msr_content);
-
-return X86EMUL_OKAY;
-}
-
-static int long_mode_do_msr_write(unsigned int msr, uint64_t msr_content)
-{
-struct vcpu *v = current;
-
-HVM_DBG_LOG(DBG_LEVEL_MSR, "msr %#x content %#"PRIx64, msr, msr_content);
-
-switch ( msr )
-{
-case MSR_FS_BASE:
-case MSR_GS_BASE:
-case MSR_SHADOW_GS_BASE:
-if ( !is_canonical_address(msr_content) )
-return X86EMUL_EXCEPTION;
-
-if ( msr == MSR_FS_BASE )
-__vmwrite(GUEST_FS_BASE, msr_content);
-else if ( msr == MSR_GS_BASE )
-__vmwrite(GUEST_GS_BASE, msr_content);
-else
-wrmsrl(MSR_SHADOW_GS_BASE, msr_content);
-
-break;
-
-case MSR_STAR:
-v->arch.hvm_vmx.star = msr_content;
-wrmsrl(MSR_STAR, msr_content);
-break;
-
-case MSR_LSTAR:
-if ( !is_canonical_address(msr_content) )
-return X86EMUL_EXCEPTION;
-v->arch.hvm_vmx.lstar = msr_content;
-wrmsrl(MSR_LSTAR, msr_content);
-break;
-
-case MSR_CSTAR:
-if ( !is_canonical_address(msr_content) )
-return X86EMUL_EXCEPTION;
-v->arch.hvm_vmx.cstar = msr_content;
-break;
-
-case MSR_SYSCALL_MASK:
-v->arch.hvm_vmx.sfmask = msr_content;
-wrmsrl(MSR_SYSCALL_MASK, msr_content);
-break;
-
-default:
-return X86EMUL_UNHANDLEABLE;
-}
-
-return X86EMUL_OKAY;
-}
-
 /*
  * To avoid MSR save/restore at every VM exit/entry time, we restore
  * the x86_64 specific MSRs at domain switch time. Since these MSRs
@@ -2894,6 +2798,35 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 case MSR_IA32_SYSENTER_EIP:
 __vmread(GUEST_SYSENTER_EIP, msr_content);
 break;
+
+case MSR_FS_BASE:
+__vmread(GUEST_FS_BASE, msr_content);
+break;
+
+case MSR_GS_BASE:
+__vmread(GUEST_GS_BASE, msr_content);
+break;
+
+case MSR_SHADOW_GS_BASE:
+rdmsrl(MSR_SHADOW_GS_BASE, *msr_content);
+break;
+
+case MSR_STAR:
+*msr_content = curr->arch.hvm_vmx.star;
+break;
+
+case MSR_LSTAR:
+*msr_content = curr->arch.hvm_vmx.lstar;
+break;
+
+case MSR_CSTAR:
+*msr_content = curr->arch.hvm_vmx.cstar;
+break;
+
+case MSR_SYSCALL_MASK:
+*msr_content = curr->arch.hvm_vmx.sfmask;
+break;
+
 case MSR_IA32_DEBUGCTLMSR:
 __vmread(GUEST_IA32_DEBUGCTL, msr_content);
 break;
@@ -2928,14 +2861,6 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 default:
 if ( passive_domain_do_rdmsr(msr, msr_content) )
 goto done;
-switch ( long_mode_do_msr_read(msr, msr_content) )
-{
-case 

[Xen-devel] [PATCH 3/6] x86: Handle the Xen MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Andrew Cooper
Dispatch from the guest_{rd,wr}msr() functions, after falling through from the
!is_viridian_domain() case.

Rename {rd,wr}msr_hypervisor_regs() to guest_{rd,wr}msr_xen() for consistency,
and because the _regs suffix isn't very appropriate.

Update them to take a vcpu pointer rather than presuming that they act on
current, and switch to using X86EMUL_* return values.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Paul Durrant 
CC: Kevin Tian 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Sergey Dyasli 
---
 xen/arch/x86/hvm/svm/svm.c  | 25 -
 xen/arch/x86/hvm/vmx/vmx.c  | 24 
 xen/arch/x86/msr.c  |  8 
 xen/arch/x86/pv/emul-priv-op.c  |  6 --
 xen/arch/x86/traps.c| 33 -
 xen/include/asm-x86/processor.h |  4 ++--
 6 files changed, 34 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 6d8ed5c..f90a7b4 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1967,9 +1967,6 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 else if ( ret )
 break;
 
-if ( rdmsr_hypervisor_regs(msr, msr_content) )
-break;
-
 if ( rdmsr_safe(msr, *msr_content) == 0 )
 break;
 
@@ -2122,25 +2119,11 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 else if ( ret )
 break;
 
-switch ( wrmsr_hypervisor_regs(msr, msr_content) )
-{
-case -ERESTART:
-result = X86EMUL_RETRY;
-break;
-case 0:
-/*
- * Match up with the RDMSR side for now; ultimately this entire
- * case block should go away.
- */
-if ( rdmsr_safe(msr, msr_content) == 0 )
-break;
-goto gpf;
-case 1:
+/* Match up with the RDMSR side; ultimately this should go away. */
+if ( rdmsr_safe(msr, msr_content) == 0 )
 break;
-default:
-goto gpf;
-}
-break;
+
+goto gpf;
 }
 
 if ( sync )
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5bf6f62..6caaabc 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2871,9 +2871,6 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 break;
 }
 
-if ( rdmsr_hypervisor_regs(msr, msr_content) )
-break;
-
 if ( rdmsr_safe(msr, *msr_content) == 0 )
 break;
 
@@ -3115,24 +3112,11 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
  is_last_branch_msr(msr) )
 break;
 
-switch ( wrmsr_hypervisor_regs(msr, msr_content) )
-{
-case -ERESTART:
-return X86EMUL_RETRY;
-case 0:
-/*
- * Match up with the RDMSR side for now; ultimately this
- * entire case block should go away.
- */
-if ( rdmsr_safe(msr, msr_content) == 0 )
-break;
-goto gp_fault;
-case 1:
+/* Match up with the RDMSR side; ultimately this should go away. */
+if ( rdmsr_safe(msr, msr_content) == 0 )
 break;
-default:
-goto gp_fault;
-}
-break;
+
+goto gp_fault;
 }
 
 return X86EMUL_OKAY;
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 2ff9361..9f20fd8 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -183,6 +183,10 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 }
 
 /* Fallthrough. */
+case 0x40000200 ... 0x400002ff:
+ret = guest_rdmsr_xen(v, msr, val);
+goto out;
+
 default:
 return X86EMUL_UNHANDLEABLE;
 }
@@ -274,6 +278,10 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
 }
 
 /* Fallthrough. */
+case 0x40000200 ... 0x400002ff:
+ret = guest_wrmsr_xen(v, msr, val);
+goto out;
+
 default:
 return X86EMUL_UNHANDLEABLE;
 }
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 17aaf97..97fcac0 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -974,9 +974,6 @@ static int read_msr(unsigned int reg, uint64_t *val,
 }
 /* fall through */
 default:
-if ( rdmsr_hypervisor_regs(reg, val) )
-return X86EMUL_OKAY;
-
 rc = vmce_rdmsr(reg, val);
 if ( rc < 0 )
 

[Xen-devel] [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-26 Thread Andrew Cooper
Dispatch from the guest_{rd,wr}msr() functions, after confirming that the
domain is configured to use viridian.  This allows for simplification of the
viridian helpers as they don't need to cope with the "not a viridian MSR"
case.  It also means that viridian MSRs which are unimplemented, or excluded
because of features, don't fall back into the default handling path.

Rename {rd,wr}msr_viridian_regs() to guest_{rd,wr}msr_viridian() for
consistency, and because the _regs suffix isn't very appropriate.

Update them to take a vcpu pointer rather than presuming that they act on
current, which is safe for all implemented operations.  Also update them to
use X86EMUL_* return values.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Paul Durrant 
CC: Kevin Tian 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Sergey Dyasli 
---
 xen/arch/x86/hvm/svm/svm.c |  6 +
 xen/arch/x86/hvm/viridian.c| 49 ++
 xen/arch/x86/hvm/vmx/vmx.c |  6 +
 xen/arch/x86/msr.c | 41 +++
 xen/include/asm-x86/hvm/viridian.h | 11 ++---
 5 files changed, 64 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 8b4cefd..6d8ed5c 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1967,8 +1967,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 else if ( ret )
 break;
 
-if ( rdmsr_viridian_regs(msr, msr_content) ||
- rdmsr_hypervisor_regs(msr, msr_content) )
+if ( rdmsr_hypervisor_regs(msr, msr_content) )
 break;
 
 if ( rdmsr_safe(msr, *msr_content) == 0 )
@@ -2123,9 +2122,6 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 else if ( ret )
 break;
 
-if ( wrmsr_viridian_regs(msr, msr_content) )
-break;
-
 switch ( wrmsr_hypervisor_regs(msr, msr_content) )
 {
 case -ERESTART:
diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
index 70aab52..23de433 100644
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -554,13 +554,11 @@ static void update_reference_tsc(struct domain *d, bool_t initialize)
 put_page_and_type(page);
 }
 
-int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
+int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx, uint64_t val)
 {
-struct vcpu *v = current;
 struct domain *d = v->domain;
 
-if ( !is_viridian_domain(d) )
-return 0;
+ASSERT(is_viridian_domain(d));
 
 switch ( idx )
 {
@@ -615,7 +613,7 @@ int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
 
 case HV_X64_MSR_REFERENCE_TSC:
 if ( !(viridian_feature_mask(d) & HVMPV_reference_tsc) )
-return 0;
+goto gp_fault;
 
 perfc_incr(mshv_wrmsr_tsc_msr);
 d->arch.hvm_domain.viridian.reference_tsc.raw = val;
@@ -655,14 +653,15 @@ int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
 }
 
 default:
-if ( idx >= VIRIDIAN_MSR_MIN && idx <= VIRIDIAN_MSR_MAX )
-gprintk(XENLOG_WARNING, "write to unimplemented MSR %#x\n",
-idx);
-
-return 0;
+gdprintk(XENLOG_WARNING,
+ "Write %016"PRIx64" to unimplemented MSR %#x\n", val, idx);
+goto gp_fault;
 }
 
-return 1;
+return X86EMUL_OKAY;
+
+ gp_fault:
+return X86EMUL_EXCEPTION;
 }
 
 static int64_t raw_trc_val(struct domain *d)
@@ -698,13 +697,11 @@ void viridian_time_ref_count_thaw(struct domain *d)
 trc->off = (int64_t)trc->val - raw_trc_val(d);
 }
 
-int rdmsr_viridian_regs(uint32_t idx, uint64_t *val)
+int guest_rdmsr_viridian(const struct vcpu *v, uint32_t idx, uint64_t *val)
 {
-struct vcpu *v = current;
 struct domain *d = v->domain;
-
-if ( !is_viridian_domain(d) )
-return 0;
+
+ASSERT(is_viridian_domain(d));
 
 switch ( idx )
 {
@@ -725,7 +722,7 @@ int rdmsr_viridian_regs(uint32_t idx, uint64_t *val)
 
 case HV_X64_MSR_TSC_FREQUENCY:
 if ( viridian_feature_mask(d) & HVMPV_no_freq )
-return 0;
+goto gp_fault;
 
 perfc_incr(mshv_rdmsr_tsc_frequency);
 *val = (uint64_t)d->arch.tsc_khz * 1000ull;
@@ -733,7 +730,7 @@ int rdmsr_viridian_regs(uint32_t idx, uint64_t *val)
 
 case HV_X64_MSR_APIC_FREQUENCY:
 if ( viridian_feature_mask(d) & HVMPV_no_freq )
-return 0;
+goto gp_fault;
 
 perfc_incr(mshv_rdmsr_apic_frequency);
 *val = 1000000000ull / APIC_BUS_CYCLE_NS;
@@ -757,7 +754,7 @@ int 

[Xen-devel] [PATCH 0/6] x86: Switch some bits of MSR handing over to the new infrastructure

2018-02-26 Thread Andrew Cooper
Various changes to MSR handling which don't impact the MSR policy objects
themselves.  See individual patches for details.

Andrew Cooper (6):
  x86/vmx: Simplfy the default cases in vmx_msr_{read,write}_intercept()
  x86/hvm: Handle viridian MSRs via the new guest_{rd,wr}msr() infrastructure
  x86: Handle the Xen MSRs via the new guest_{rd,wr}msr() infrastructure
  x86/hvm: Constify the read side of vlapic handling
  x86/hvm: Handle x2apic MSRs the new guest_{rd,wr}msr() infrastructure
  x86/msr: Blacklist various MSRs which guests definitely shouldn't be using

 xen/arch/x86/hvm/hvm.c |  10 --
 xen/arch/x86/hvm/svm/svm.c |  27 +
 xen/arch/x86/hvm/viridian.c|  49 -
 xen/arch/x86/hvm/vlapic.c  |  74 +++--
 xen/arch/x86/hvm/vmx/vmx.c | 208 +
 xen/arch/x86/hvm/vpt.c |   2 +-
 xen/arch/x86/msr.c | 108 ++-
 xen/arch/x86/pv/emul-priv-op.c |   6 --
 xen/arch/x86/traps.c   |  33 +++---
 xen/include/asm-x86/hvm/hvm.h  |   6 +-
 xen/include/asm-x86/hvm/viridian.h |  11 +-
 xen/include/asm-x86/processor.h|   4 +-
 12 files changed, 268 insertions(+), 270 deletions(-)

-- 
2.1.4


[Xen-devel] [PATCH 6/6] x86/msr: Blacklist various MSRs which guests definitely shouldn't be using

2018-02-26 Thread Andrew Cooper
The main purpose is to blacklist the Intel Resource Director Technology MSRs.
We do not yet virtualise support for guests, but Linux has been observed to
probe for these MSRs without checking CPUID first.

The architecturally inaccessible ranges don't need to fall back into the
legacy ranges, because they are not going to eventually evaluate as
accessible.

The Silicon Debug interface will probably never be virtualised for guests, but
doesn't want to leak through from real hardware.  SGX isn't yet virtualised,
but likely will be in the future.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Sergey Dyasli 
---
 xen/arch/x86/msr.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3cb4158..0237637 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -158,6 +158,10 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 *val = vp->spec_ctrl.raw;
 break;
 
+case 0x8c ... 0x8f: /* SGX Launch Enclave Public Key Hash. */
+/* Not implemented yet. */
+goto gp_fault;
+
 case MSR_INTEL_PLATFORM_INFO:
 if ( !dp->plaform_info.available )
 goto gp_fault;
@@ -183,6 +187,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 ret = guest_rdmsr_x2apic(v, msr, val);
 goto out;
 
+case 0xc80:
+/* Silicon Debug Interface not advertised to guests. */
+goto gp_fault;
+
+case 0xc81 ... 0xc8f: /* Misc RDT MSRs. */
+case 0xc90 ... 0xd8f: /* CAT Mask registers. */
+/* Not implemented yet. */
+goto gp_fault;
+
 case 0x40000000 ... 0x400001ff:
 if ( is_viridian_domain(d) )
 {
@@ -196,6 +209,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 goto out;
 
 default:
+/*
+ * Blacklist the architecturally inaccessible MSRs. No point wandering
+ * the legacy handlers.
+ */
+if ( msr > 0x1fff &&
+ (msr < 0xc0000000 || msr > 0xc0001fff) &&
+ (msr < 0xc0010000 || msr > 0xc0011fff) )
+goto gp_fault;
+
 return X86EMUL_UNHANDLEABLE;
 }
 
@@ -255,6 +277,10 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
 wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
 break;
 
+case 0x8c ... 0x8f: /* SGX Launch Enclave Public Key Hash. */
+/* Not implemented yet. */
+goto gp_fault;
+
 case MSR_INTEL_MISC_FEATURES_ENABLES:
 {
 uint64_t rsvd = ~0ull;
@@ -285,6 +311,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
 ret = guest_wrmsr_x2apic(v, msr, val);
 goto out;
 
+case 0xc80:
+/* Silicon Debug Interface not advertised to guests. */
+goto gp_fault;
+
+case 0xc81 ... 0xc8f: /* Misc RDT MSRs. */
+case 0xc90 ... 0xd8f: /* CAT Mask registers. */
+/* Not implemented yet. */
+goto gp_fault;
+
 case 0x40000000 ... 0x400001ff:
 if ( is_viridian_domain(d) )
 {
@@ -298,6 +333,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
 goto out;
 
 default:
+/*
+ * Blacklist the architecturally inaccessible MSRs. No point wandering
+ * the legacy handlers.
+ */
+if ( msr > 0x1fff &&
+ (msr < 0xc0000000 || msr > 0xc0001fff) &&
+ (msr < 0xc0010000 || msr > 0xc0011fff) )
+goto gp_fault;
+
 return X86EMUL_UNHANDLEABLE;
 }
 
-- 
2.1.4


[Xen-devel] [PATCH 4/6] x86/hvm: Constify the read side of vlapic handling

2018-02-26 Thread Andrew Cooper
This is in preparation to make hvm_x2apic_msr_read() take a const vcpu
pointer.  One modification is to alter vlapic_get_tmcct() to not use current.

This in turn needs an alteration to hvm_get_guest_time_fixed(), which is safe
because the only mutable action it makes is to take the domain plt lock.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Sergey Dyasli 
---
 xen/arch/x86/hvm/vlapic.c | 13 +++--
 xen/arch/x86/hvm/vpt.c|  2 +-
 xen/include/asm-x86/hvm/hvm.h |  2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 7387f91..e715729 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -171,12 +171,12 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
 vcpu_kick(target);
 }
 
-static int vlapic_find_highest_isr(struct vlapic *vlapic)
+static int vlapic_find_highest_isr(const struct vlapic *vlapic)
 {
 return vlapic_find_highest_vector(&vlapic->regs->data[APIC_ISR]);
 }
 
-static uint32_t vlapic_get_ppr(struct vlapic *vlapic)
+static uint32_t vlapic_get_ppr(const struct vlapic *vlapic)
 {
 uint32_t tpr, isrv, ppr;
 int isr;
@@ -550,9 +550,9 @@ void vlapic_ipi(
 }
 }
 
-static uint32_t vlapic_get_tmcct(struct vlapic *vlapic)
+static uint32_t vlapic_get_tmcct(const struct vlapic *vlapic)
 {
-struct vcpu *v = current;
+const struct vcpu *v = const_vlapic_vcpu(vlapic);
 uint32_t tmcct = 0, tmict = vlapic_get_reg(vlapic, APIC_TMICT);
 uint64_t counter_passed;
 
@@ -590,7 +590,8 @@ static void vlapic_set_tdcr(struct vlapic *vlapic, unsigned int val)
 "timer_divisor: %d", vlapic->hw.timer_divisor);
 }
 
-static uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
+static uint32_t vlapic_read_aligned(const struct vlapic *vlapic,
+unsigned int offset)
 {
 switch ( offset )
 {
@@ -680,7 +681,7 @@ int hvm_x2apic_msr_read(struct vcpu *v, unsigned int msr, uint64_t *msr_content)
 REGBLOCK(ISR) | REGBLOCK(TMR) | REGBLOCK(IRR)
 #undef REGBLOCK
 };
-struct vlapic *vlapic = vcpu_vlapic(v);
+const struct vlapic *vlapic = vcpu_vlapic(v);
 uint32_t high = 0, reg = msr - MSR_IA32_APICBASE_MSR, offset = reg << 4;
 
 if ( !vlapic_x2apic_mode(vlapic) ||
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index 181f4cb..862c715 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -35,7 +35,7 @@ void hvm_init_guest_time(struct domain *d)
 pl->last_guest_time = 0;
 }
 
-u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc)
+u64 hvm_get_guest_time_fixed(const struct vcpu *v, u64 at_tsc)
 {
 struct pl_time *pl = v->domain->arch.hvm_domain.pl_time;
 u64 now;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index dd3dd5f..031af12 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -269,7 +269,7 @@ u64 hvm_get_tsc_scaling_ratio(u32 gtsc_khz);
 
 void hvm_init_guest_time(struct domain *d);
 void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
-u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc);
+u64 hvm_get_guest_time_fixed(const struct vcpu *v, u64 at_tsc);
 #define hvm_get_guest_time(v) hvm_get_guest_time_fixed(v, 0)
 
 int vmsi_deliver(
-- 
2.1.4


Re: [Xen-devel] [RFC PATCH 44/49] ARM: new VGIC: vgic-init: register VGIC

2018-02-26 Thread Andre Przywara
Hi,

On 19/02/18 12:39, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> This patch implements the function which is called by Xen when it wants
>> to register the virtual GIC.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic-init.c | 62
>> +++
>>   xen/arch/arm/vgic/vgic.h  |  3 +++
>>   2 files changed, 65 insertions(+)
>>   create mode 100644 xen/arch/arm/vgic/vgic-init.c
>>
>> diff --git a/xen/arch/arm/vgic/vgic-init.c b/xen/arch/arm/vgic/vgic-init.c
>> new file mode 100644
>> index 0000000000..b5f1183a50
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic/vgic-init.c
>> @@ -0,0 +1,62 @@
>> +/*
>> + * Copyright (C) 2015, 2016 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see .
>> + */
>> +
>> +#include 
>> +#include 
>> +
>> +#include "vgic.h"
>> +
>> +/* CREATION */
>> +
>> +/**
>> + * domain_vgic_register: create a virtual GIC
>> + * @d: domain pointer
>> + * @mmio_count: pointer to add number of required MMIO regions
>> + *
>> + * was: kvm_vgic_create
>> + */
>> +int domain_vgic_register(struct domain *d, int *mmio_count)
> 
> mmio_count should be set to the number of I/O regions you will register.
> 
>> +{
>> +    switch ( d->arch.vgic.version )
>> +    {
>> +#ifdef CONFIG_HAS_GICV3
>> +    case GIC_V3:
>> +    d->arch.max_vcpus = VGIC_V3_MAX_CPUS;
>> +    break;
>> +#endif
>> +    case GIC_V2:
>> +    d->arch.max_vcpus = VGIC_V2_MAX_CPUS;
>> +    break;
>> +    }
>> +
>> +    if ( d->max_vcpus > d->arch.max_vcpus )
>> +    return -E2BIG;
>> +
>> +    d->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>> +    d->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>> +    d->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
> 
> Is there any reason to store an address rather than a frame? The latter
> would add a bit more safety.

Possibly, but the existing VGIC doesn't do it either, which means that
at the moment we deal with addresses (and not frame numbers) everywhere
(for instance GUEST_GICD_BASE).
Changing this for the distributor internally already (as requested for
some other patch) caused more changes than I liked, and at the moment I
am trying to make the TODO list smaller, not bigger ;-)

Cheers,
Andre.

>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/vgic/vgic.h b/xen/arch/arm/vgic/vgic.h
>> index b104f8e964..205ce10ffa 100644
>> --- a/xen/arch/arm/vgic/vgic.h
>> +++ b/xen/arch/arm/vgic/vgic.h
>> @@ -20,6 +20,9 @@
>>   #define PRODUCT_ID_KVM  0x4b    /* ASCII code K */
>>   #define IMPLEMENTER_ARM 0x43b
>>   +#define VGIC_ADDR_UNDEF (-1)
> 
> Please use INVALID_PADDR here.
> 
>> +#define IS_VGIC_ADDR_UNDEF(_x)  ((_x) == VGIC_ADDR_UNDEF)
>> +
>>   #define VGIC_PRI_BITS   5
>>     #define vgic_irq_is_sgi(intid) ((intid) < VGIC_NR_SGIS)
>>
> 
> Cheers,
> 
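
The sentinel question above (a bare (-1) versus a typed invalid value) can be
illustrated with a small standalone sketch. It assumes a 64-bit paddr_t and
uses hypothetical sketch_* names throughout; it is not Xen's definition of
INVALID_PADDR, only the same idea: an all-ones value that already has the
address type, so the check does not depend on how a plain int is converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: physical addresses are 64 bits wide in this sketch. */
typedef uint64_t paddr_t;

/* Typed all-ones sentinel, in the spirit of INVALID_PADDR. */
#define SKETCH_INVALID_PADDR (~(paddr_t)0)

/* Hypothetical holder for the three base addresses set up by the patch. */
struct sketch_vgic_bases {
    paddr_t dist_base;
    paddr_t cpu_base;
    paddr_t redist_base;
};

/* Mark every base as "not configured yet". */
static void sketch_bases_init(struct sketch_vgic_bases *b)
{
    assert(b != NULL);
    b->dist_base = SKETCH_INVALID_PADDR;
    b->cpu_base = SKETCH_INVALID_PADDR;
    b->redist_base = SKETCH_INVALID_PADDR;
}

/* True once a real address has replaced the sentinel. */
static bool sketch_base_is_set(paddr_t base)
{
    return base != SKETCH_INVALID_PADDR;
}
```

A caller would run sketch_bases_init() once at registration time and test
each base with sketch_base_is_set() before mapping anything.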

Re: [Xen-devel] [RFC PATCH 38/49] ARM: new VGIC: handle hardware mapped IRQs

2018-02-26 Thread Julien Grall

Hi,

On 02/26/2018 05:19 PM, Andre Przywara wrote:

Hi,

On 26/02/18 16:57, Julien Grall wrote:



On 02/26/2018 04:48 PM, Andre Przywara wrote:

Hi,

On 23/02/18 18:14, Julien Grall wrote:



On 23/02/18 18:02, Andre Przywara wrote:

Hi,


Hi Andre,


On 19/02/18 12:19, Julien Grall wrote:

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

The VGIC supports virtual IRQs to be connected to a hardware IRQ, so
when a guest EOIs the virtual interrupt, it affects the state of that
corresponding interrupt on the hardware side at the same time.
Implement the interface that the Xen arch/core code expects to
connect
the virtual and the physical world.

Signed-off-by: Andre Przywara 
---
     xen/arch/arm/vgic/vgic.c | 63

     1 file changed, 63 insertions(+)

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index dc5e011fa3..8d5260a7db 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -693,6 +693,69 @@ void vgic_kick_vcpus(struct domain *d)
     }
     }
     +struct irq_desc *vgic_get_hw_irq_desc(struct domain *d,
struct vcpu
*v,
+  unsigned int virq)
+{
+    struct irq_desc *desc = NULL;
+    struct vgic_irq *irq = vgic_get_irq(d, v, virq);
+    unsigned long flags;
+
+    if ( !irq )
+    return NULL;
+
+    spin_lock_irqsave(&irq->irq_lock, flags);
+    if ( irq->hw )
+    desc = irq_to_desc(irq->hwintid);


This is not going to work well for PPIs. We should consider adding at
least an ASSERT(...) in the code to prevent bad use of it.


Yeah, done. But I wonder if we eventually should extend the
irq_to_desc() function to take the vCPU, since we will need it anyway
once we use hardware mapped timer IRQs (PPIs) in the future. But this
should not be in this series, I guess.


irq_to_desc only deals with hardware interrupts, so you mean pCPU instead
of vCPU?


Yes, indeed. But I think this points to the problem of this approach:
the virtual IRQ is tied to a VCPU, and we have to make sure that not
only the affinity is updated on a CPU migration (as we do for SPIs), but
actually the interrupt itself is changed: since CPU0/PPI9 has a
different irq_desc* from, say, CPU1/PPI9.
So there is more than just adding a parameter to irq_to_desc().


Change in the irq_to_desc() interface needs to be justified. The use case
I have in mind for PPI is the virtual timer. In that case, you will
always receive the PPI on the right pCPU.


Yes, but the connection between virtual and physical IRQ is realised as
a connection between struct pending_irq/vgic_irq and struct irq_desc.
For an SPI this is always the same irq_desc, regardless of the affinity
or running CPU. But for PPIs you would need to change the actual
irq_desc pointer when changing the affinity. Not really rocket science
(though it may become nasty with the locking), but needs to be implemented.


I don't think it is a big deal. You would remove the "link" when saving 
the vCPU state and add the "link" when restoring. In both cases, you
would be on the right pCPU. Have a look at:


https://lists.xenproject.org/archives/html/xen-devel/2015-11/msg00925.html

Cheers,

--
Julien Grall
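
The save/restore idea above can be sketched in a few lines of standalone C.
Everything named sketch_* is hypothetical and the pCPU count is an arbitrary
assumption; the only architectural fact relied on is that PPIs are INTIDs 16
to 31 and exist once per CPU. The point is that a PPI needs one descriptor
per (pCPU, INTID) pair, so the link is dropped on save and re-created on
restore, when the code is already running on the right pCPU.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_PCPUS 4    /* assumption: arbitrary number of pCPUs */
#define SKETCH_PPI_BASE 16   /* PPIs are INTIDs 16..31 */
#define SKETCH_NR_PPIS  16

/* Hypothetical stand-in for struct irq_desc. */
struct sketch_irq_desc {
    unsigned int irq;
    unsigned int pcpu;
};

/* One descriptor per (pCPU, PPI) pair; an SPI would have a single one. */
static struct sketch_irq_desc sketch_ppi_descs[SKETCH_NR_PCPUS][SKETCH_NR_PPIS];

/* Hypothetical stand-in for the virtual IRQ that is hardware mapped. */
struct sketch_virq {
    unsigned int hwintid;
    struct sketch_irq_desc *desc;   /* NULL while the vCPU is not loaded */
};

static void sketch_descs_init(void)
{
    unsigned int cpu, i;

    for ( cpu = 0; cpu < SKETCH_NR_PCPUS; cpu++ )
        for ( i = 0; i < SKETCH_NR_PPIS; i++ )
        {
            sketch_ppi_descs[cpu][i].irq = SKETCH_PPI_BASE + i;
            sketch_ppi_descs[cpu][i].pcpu = cpu;
        }
}

/* Per-CPU lookup: the same INTID maps to a different descriptor per pCPU. */
static struct sketch_irq_desc *sketch_ppi_to_desc(unsigned int pcpu,
                                                  unsigned int intid)
{
    assert(pcpu < SKETCH_NR_PCPUS);
    assert(intid >= SKETCH_PPI_BASE &&
           intid < SKETCH_PPI_BASE + SKETCH_NR_PPIS);
    return &sketch_ppi_descs[pcpu][intid - SKETCH_PPI_BASE];
}

/* Restoring the vCPU on a pCPU: add the link to that pCPU's descriptor. */
static void sketch_vcpu_restore(struct sketch_virq *virq, unsigned int pcpu)
{
    virq->desc = sketch_ppi_to_desc(pcpu, virq->hwintid);
}

/* Saving the vCPU state: remove the link again. */
static void sketch_vcpu_save(struct sketch_virq *virq)
{
    virq->desc = NULL;
}
```

Locking is left out on purpose; a real implementation would have to take
the relevant IRQ locks around both steps.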

Re: [Xen-devel] [RFC PATCH 38/49] ARM: new VGIC: handle hardware mapped IRQs

2018-02-26 Thread Andre Przywara
Hi,

On 26/02/18 16:57, Julien Grall wrote:
> 
> 
> On 02/26/2018 04:48 PM, Andre Przywara wrote:
>> Hi,
>>
>> On 23/02/18 18:14, Julien Grall wrote:
>>>
>>>
>>> On 23/02/18 18:02, Andre Przywara wrote:
>>>> Hi,
>>>
>>> Hi Andre,
>>>
>>>> On 19/02/18 12:19, Julien Grall wrote:
> Hi,
>
> On 09/02/18 14:39, Andre Przywara wrote:
>> The VGIC supports virtual IRQs to be connected to a hardware IRQ, so
>> when a guest EOIs the virtual interrupt, it affects the state of that
>> corresponding interrupt on the hardware side at the same time.
>> Implement the interface that the Xen arch/core code expects to
>> connect
>> the virtual and the physical world.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>     xen/arch/arm/vgic/vgic.c | 63
>> 
>>     1 file changed, 63 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index dc5e011fa3..8d5260a7db 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -693,6 +693,69 @@ void vgic_kick_vcpus(struct domain *d)
>>     }
>>     }
>>     +struct irq_desc *vgic_get_hw_irq_desc(struct domain *d,
>> struct vcpu
>> *v,
>> +  unsigned int virq)
>> +{
>> +    struct irq_desc *desc = NULL;
>> +    struct vgic_irq *irq = vgic_get_irq(d, v, virq);
>> +    unsigned long flags;
>> +
>> +    if ( !irq )
>> +    return NULL;
>> +
>> +    spin_lock_irqsave(&irq->irq_lock, flags);
>> +    if ( irq->hw )
>> +    desc = irq_to_desc(irq->hwintid);
>
> This is not going to work well for PPIs. We should consider adding at
> least an ASSERT(...) in the code to prevent bad use of it.

>>>> Yeah, done. But I wonder if we eventually should extend the
>>>> irq_to_desc() function to take the vCPU, since we will need it anyway
>>>> once we use hardware mapped timer IRQs (PPIs) in the future. But this
>>>> should not be in this series, I guess.
>>>
>>> irq_to_desc only deals with hardware interrupts, so you mean pCPU instead
>>> of vCPU?
>>
>> Yes, indeed. But I think this points to the problem of this approach:
>> the virtual IRQ is tied to a VCPU, and we have to make sure that not
>> only the affinity is updated on a CPU migration (as we do for SPIs), but
>> actually the interrupt itself is changed: since CPU0/PPI9 has a
>> different irq_desc* from, say, CPU1/PPI9.
>> So there is more than just adding a parameter to irq_to_desc().
> 
> Change in the irq_to_desc() interface needs to be justified. The use case
> I have in mind for PPI is the virtual timer. In that case, you will
> always receive the PPI on the right pCPU.

Yes, but the connection between virtual and physical IRQ is realised as
a connection between struct pending_irq/vgic_irq and struct irq_desc.
For an SPI this is always the same irq_desc, regardless of the affinity
or running CPU. But for PPIs you would need to change the actual
irq_desc pointer when changing the affinity. Not really rocket science
(though it may become nasty with the locking), but needs to be implemented.

> Do you really see a use case where a vCPU is running on pCPU A but the
> PPI is routed to pCPU B?

Not at the moment, I guess the very nature of PPIs would avoid this. The
other PPIs I know about are PMUs and the VGIC. The latter is not of a
concern for us yet (fortunately). PMUs should have the same local
property as the arch timer.

Cheers,
Andre.

>> +    spin_unlock_irqrestore(&irq->irq_lock, flags);
>> +
>> +    vgic_put_irq(d, irq);
>> +
>> +    return desc;
>> +}
>> +
>> +/*
>> + * was:
>> + *  int kvm_vgic_map_phys_irq(struct vcpu *vcpu, u32 virt_irq,
>> u32 phys_irq)
>> + *  int kvm_vgic_unmap_phys_irq(struct vcpu *vcpu, unsigned int
>> virt_irq)
>> + */
>> +int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
>> +    unsigned int virt_irq, struct irq_desc *desc,
>> +    bool connect)
>
> Indentation.
>
>> +{
>> +    struct vgic_irq *irq = vgic_get_irq(d, vcpu, virt_irq);
>> +    unsigned long flags;
>> +    int ret = 0;
>> +
>> +    if ( !irq )
>> +    return -EINVAL;
>> +
>> +    spin_lock_irqsave(&irq->irq_lock, flags);
>> +
>> +    if ( connect )  /* assign a mapped IRQ */
>> +    {
>> +    /* The VIRQ should not be already enabled by the guest */
>> +    if ( !irq->hw && !irq->enabled )
>> +    {
>> +    irq->hw = true;
>> +    irq->hwintid = desc->irq;
>> +    }
>> +    else
>> +    {
>> +    ret = -EBUSY;
>> +    }
>
> I know that it should not matter for SPIs today. But aren't you
> meant to
> get a reference on that interrupt if 

Re: [Xen-devel] [RFC PATCH 41/49] ARM: new VGIC: dump virtual IRQ info

2018-02-26 Thread Julien Grall



On 02/26/2018 04:58 PM, Andre Przywara wrote:

Hi,

On 19/02/18 12:26, Julien Grall wrote:

Hi Andre,

On 09/02/18 14:39, Andre Przywara wrote:

When we dump guest state on the Xen console, we also print the state of
IRQs that are on a VCPU.
Add the code to dump the state of an IRQ handled by the new VGIC.

Signed-off-by: Andre Przywara 
---
   xen/arch/arm/vgic/vgic.c | 13 +
   1 file changed, 13 insertions(+)

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index 3b475ed1a4..97ffdba5ad 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -757,6 +757,19 @@ void vgic_free_virq(struct domain *d, unsigned
int virq)
   clear_bit(virq, d->arch.vgic.allocated_irqs);
   }
   +void gic_dump_vgic_info(struct vcpu *v)
+{
+    struct vgic_cpu *vgic_cpu = &v->arch.vgic_cpu;
+    struct vgic_irq *irq;
+
+    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)


I don't think you can assume that the vCPU is not running somewhere
else. So likely you want to take the lock while dumping the info.


Oh, good point. Totally forgot the locking here :-(
Same for the IRQs within.
Thanks for pointing this out.




+    printk("   on CPU: %s %s irq %u: %spending, %sactive,
%senabled\n",


I am not sure the value of "on CPU".


That is meant to be a short phrase for "being on the ap_list", which is
an implementation specific term. "Active" or "pending" alone are
confusing or misleading. If you have a better term (not too long!), I am
happy to take that.


How about a print before dumping the list? This would avoid the on CPU 
on each line and it can be longer :).


Cheers,

--
Julien Grall
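
Both suggestions in this mail (take the locks while dumping, and print one
heading instead of repeating "on CPU" on every line) can be combined as in
the following standalone sketch. It is only an illustration: the sketch_*
types are hypothetical, a pthread mutex stands in for Xen's spinlocks, a
plain singly linked list stands in for the ap_list, and fprintf() stands in
for printk(). The heading text is made up as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct vgic_irq. */
struct sketch_irq {
    pthread_mutex_t irq_lock;
    unsigned int intid;
    bool hw, level, pending, active, enabled;
    struct sketch_irq *next;
};

/* Hypothetical stand-in for the per-vCPU VGIC state. */
struct sketch_vcpu {
    pthread_mutex_t ap_list_lock;
    struct sketch_irq *ap_list;
};

/* Dump every IRQ on the list; returns how many entries were printed. */
static unsigned int sketch_dump_vgic_info(struct sketch_vcpu *v, FILE *out)
{
    struct sketch_irq *irq;
    unsigned int count = 0;

    assert(v != NULL && out != NULL);

    /* The list lock keeps the list stable while it is walked. */
    pthread_mutex_lock(&v->ap_list_lock);

    /* One heading up front, so it can be as long as needed. */
    fprintf(out, "IRQs on the ap_list of this VCPU:\n");

    for ( irq = v->ap_list; irq != NULL; irq = irq->next )
    {
        /* The per-IRQ lock gives a consistent snapshot of one IRQ. */
        pthread_mutex_lock(&irq->irq_lock);
        fprintf(out, "   %s %s irq %u: %spending, %sactive, %senabled\n",
                irq->hw ? "hardware" : "virtual",
                irq->level ? "level" : "edge",
                irq->intid,
                irq->pending ? "" : "not ",
                irq->active ? "" : "not ",
                irq->enabled ? "" : "not ");
        pthread_mutex_unlock(&irq->irq_lock);
        count++;
    }

    pthread_mutex_unlock(&v->ap_list_lock);

    return count;
}
```

In hypervisor context the outer lock would presumably be the irqsave
variant, as in the other list walkers in this series.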

Re: [Xen-devel] [RFC PATCH 41/49] ARM: new VGIC: dump virtual IRQ info

2018-02-26 Thread Andre Przywara
Hi,

On 19/02/18 12:26, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> When we dump guest state on the Xen console, we also print the state of
>> IRQs that are on a VCPU.
>> Add the code to dump the state of an IRQ handled by the new VGIC.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic.c | 13 +
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index 3b475ed1a4..97ffdba5ad 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -757,6 +757,19 @@ void vgic_free_virq(struct domain *d, unsigned
>> int virq)
>>   clear_bit(virq, d->arch.vgic.allocated_irqs);
>>   }
>>   +void gic_dump_vgic_info(struct vcpu *v)
>> +{
>> +    struct vgic_cpu *vgic_cpu = &v->arch.vgic_cpu;
>> +    struct vgic_irq *irq;
>> +
>> +    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
> 
> I don't think you can assume that the vCPU is not running somewhere
> else. So likely you want to take the lock while dumping the info.

Oh, good point. Totally forgot the locking here :-(
Same for the IRQs within.
Thanks for pointing this out.

> 
>> +    printk("   on CPU: %s %s irq %u: %spending, %sactive,
>> %senabled\n",
> 
> I am not sure the value of "on CPU".

That is meant to be a short phrase for "being on the ap_list", which is
an implementation specific term. "Active" or "pending" alone are
confusing or misleading. If you have a better term (not too long!), I am
happy to take that.

Cheers,
Andre.

> 
>> +   irq->hw ? "hardware" : "virtual",
>> +   irq->config == VGIC_CONFIG_LEVEL ? "level" : "edge",
>> +   irq->intid, irq_is_pending(irq) ? "" : "not ",
>> +   irq->active ? "" : "not ", irq->enabled ? "" : "not ");
>> +}
>> +
>>   struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu *v,
>>     unsigned int virq)
>>   {
>>
> 
> Cheers,
> 

Re: [Xen-devel] [RFC PATCH 38/49] ARM: new VGIC: handle hardware mapped IRQs

2018-02-26 Thread Julien Grall



On 02/26/2018 04:48 PM, Andre Przywara wrote:

Hi,

On 23/02/18 18:14, Julien Grall wrote:



On 23/02/18 18:02, Andre Przywara wrote:

Hi,


Hi Andre,


On 19/02/18 12:19, Julien Grall wrote:

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

The VGIC supports virtual IRQs to be connected to a hardware IRQ, so
when a guest EOIs the virtual interrupt, it affects the state of that
corresponding interrupt on the hardware side at the same time.
Implement the interface that the Xen arch/core code expects to connect
the virtual and the physical world.

Signed-off-by: Andre Przywara 
---
    xen/arch/arm/vgic/vgic.c | 63

    1 file changed, 63 insertions(+)

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index dc5e011fa3..8d5260a7db 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -693,6 +693,69 @@ void vgic_kick_vcpus(struct domain *d)
    }
    }
    +struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu
*v,
+  unsigned int virq)
+{
+    struct irq_desc *desc = NULL;
+    struct vgic_irq *irq = vgic_get_irq(d, v, virq);
+    unsigned long flags;
+
+    if ( !irq )
+    return NULL;
+
+    spin_lock_irqsave(&irq->irq_lock, flags);
+    if ( irq->hw )
+    desc = irq_to_desc(irq->hwintid);


This is not going to work well for PPIs. We should consider adding at
least an ASSERT(...) in the code to prevent bad use of it.


Yeah, done. But I wonder if we eventually should extend the
irq_to_desc() function to take the vCPU, since we will need it anyway
once we use hardware mapped timer IRQs (PPIs) in the future. But this
should not be in this series, I guess.


irq_to_desc only deals with hardware interrupts, so you mean pCPU instead
of vCPU?


Yes, indeed. But I think this points to the problem of this approach:
the virtual IRQ is tied to a VCPU, and we have to make sure that not
only the affinity is updated on a CPU migration (as we do for SPIs), but
actually the interrupt itself is changed: since CPU0/PPI9 has a
different irq_desc* from, say, CPU1/PPI9.
So there is more than just adding a parameter to irq_to_desc().


Change in the irq_to_desc() interface needs to be justified. The use case
I have in mind for PPI is the virtual timer. In that case, you will 
always receive the PPI on the right pCPU.


Do you really see a use case where a vCPU is running on pCPU A but the 
PPI is routed to pCPU B?






+    spin_unlock_irqrestore(&irq->irq_lock, flags);
+
+    vgic_put_irq(d, irq);
+
+    return desc;
+}
+
+/*
+ * was:
+ *  int kvm_vgic_map_phys_irq(struct vcpu *vcpu, u32 virt_irq,
u32 phys_irq)
+ *  int kvm_vgic_unmap_phys_irq(struct vcpu *vcpu, unsigned int
virt_irq)
+ */
+int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
+    unsigned int virt_irq, struct irq_desc *desc,
+    bool connect)


Indentation.


+{
+    struct vgic_irq *irq = vgic_get_irq(d, vcpu, virt_irq);
+    unsigned long flags;
+    int ret = 0;
+
+    if ( !irq )
+    return -EINVAL;
+
+    spin_lock_irqsave(&irq->irq_lock, flags);
+
+    if ( connect )  /* assign a mapped IRQ */
+    {
+    /* The VIRQ should not be already enabled by the guest */
+    if ( !irq->hw && !irq->enabled )
+    {
+    irq->hw = true;
+    irq->hwintid = desc->irq;
+    }
+    else
+    {
+    ret = -EBUSY;
+    }


I know that it should not matter for SPIs today. But aren't you meant to
get a reference on that interrupt if you connect it?


No, the refcount feature is strictly for the pointer to the structure,
not for everything related to this virtual IRQ.
We store only the virtual IRQ number in the dev_id/info members, we will
get the struct vgic_irq pointer via the vIRQ number on do_IRQ().
Does that make sense?


But technically you "allocate" the virtual SPI at that time, right? So
this would mean you need to get a reference, otherwise it might disappear.


We will realise that it has disappeared when vgic_get_irq() called with
that virtual number returns NULL. The refcount is really just to know
when you can free dynamically allocated struct vgic_irqs, so it's
strictly about the *pointer* to the *memory*, not about the logical
entity of that particular virtual IRQ.
Actually it should not really happen that you end up with a hardware IRQ
still assigned to an abandoned virtual IRQ, as you would expect to free
that connection *before* disbanding the virtual IRQ.


So I am not entirely sure why the reference is not necessary here.


Typically to remove a virtual IRQ, you arrange for vgic_get_irq() to
return NULL on that number. Then you "wait" for the refcount to drop to
zero, at which point it's safe to free the memory allocated for that
vgic_irq. As mentioned, only really useful for LPIs, but it's a central
property of the new VGIC architecture, because we need to have those
gets/puts in 

Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum thread stack size for watch thread

2018-02-26 Thread Jim Fehlig

On 02/26/2018 01:46 AM, Juergen Gross wrote:

When creating a pthread in xs_watch() try to get the minimal needed
size of the thread from glibc instead of using a constant. This avoids
problems when the library is used in programs with large per-thread
memory.

Use dlsym() to get the pointer to __pthread_get_minstack() in order to
avoid linkage problems and fall back to the current constant size if
not found.

Signed-off-by: Juergen Gross 
---
V2:
- use _GNU_SOURCE (Wei Liu)
- call __pthread_get_minstack() with parameter
- add -ldl to correct make flags
- ensure not to use a smaller stack size than today
---
  tools/xenstore/Makefile |  4 
  tools/xenstore/xs.c | 21 -
  2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 2b99d2bc1b..0831be0b6f 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -100,6 +100,10 @@ libxenstore.so.$(MAJOR): libxenstore.so.$(MAJOR).$(MINOR)
ln -sf $< $@
  
  xs.opic: CFLAGS += -DUSE_PTHREAD

+ifeq ($(CONFIG_Linux),y)
+xs.opic: CFLAGS += -DUSE_DLSYM
+libxenstore.so.$(MAJOR).$(MINOR): LDFLAGS += -ldl
+endif


Dropping this patch in one of my automated builds caused a libxenstore link 
failure

[   99s] gcc-lsystemd -ldl -pthread -Wl,-soname -Wl,libxenstore.so.3.0 
-shared -o libxenstore.so.3.0.3 xs.opic xs_lib.opic 
/home/abuild/rpmbuild/BUILD/xen-4.10.0-testing/tools/xenstore/../../tools/libs/toolcore/libxentoolcore.so 

[   99s] 
/home/abuild/rpmbuild/BUILD/xen-4.10.0-testing/tools/xenstore/../../tools/xenstore/libxenstore.so: 
undefined reference to `dlsym'


I hacked around it by appending '-ldl' to the end of the subsequent 
libxenstore.so rule.



  libxenstore.so.$(MAJOR).$(MINOR): xs.opic xs_lib.opic
$(CC) $(LDFLAGS) $(PTHREAD_LDFLAGS) -Wl,$(SONAME_LDFLAG) 
-Wl,libxenstore.so.$(MAJOR) $(SHLIB_LDFLAGS) -o $@ $^ $(LDLIBS_libxentoolcore) 
$(SOCKET_LIBS) $(PTHREAD_LIBS) $(APPEND_LDFLAGS)
diff --git a/tools/xenstore/xs.c b/tools/xenstore/xs.c
index abffd9cd80..77700bff2b 100644
--- a/tools/xenstore/xs.c
+++ b/tools/xenstore/xs.c
@@ -16,6 +16,8 @@
  License along with this library; If not, see 
.
  */
  
+#define _GNU_SOURCE

+
  #include 
  #include 
  #include 
@@ -47,6 +49,10 @@ struct xs_stored_msg {
  
  #include 
  
+#ifdef USE_DLSYM

+#include 
+#endif
+
  struct xs_handle {
/* Communications channel to xenstore daemon. */
int fd;
@@ -810,12 +816,25 @@ bool xs_watch(struct xs_handle *h, const char *path, const char *token)
 	if (!h->read_thr_exists) {
 		sigset_t set, old_set;
 		pthread_attr_t attr;
+		static size_t stack_size;
+#ifdef USE_DLSYM
+		size_t (*getsz)(pthread_attr_t *attr);
+#endif
 
 		if (pthread_attr_init(&attr) != 0) {
 			mutex_unlock(&h->request_mutex);
 			return false;
 		}
-		if (pthread_attr_setstacksize(&attr, READ_THREAD_STACKSIZE) != 0) {
+		if (!stack_size) {
+#ifdef USE_DLSYM
+			getsz = dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
+			if (getsz)
+				stack_size = getsz(&attr);
+#endif
+			if (stack_size < READ_THREAD_STACKSIZE)
+				stack_size = READ_THREAD_STACKSIZE;
+		}
+		if (pthread_attr_setstacksize(&attr, stack_size) != 0) {
 			pthread_attr_destroy(&attr);
 			mutex_unlock(&h->request_mutex);
 			return false;


This worked fine, even on the system with the buggy glibc.

Tested-by: Jim Fehlig 

Regards,
Jim
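
For readers who want to try the lookup outside of libxenstore, the approach
described in the commit message can be reduced to the standalone sketch
below. It is not the patch itself: sketch_watch_stack_size() and
SKETCH_FALLBACK_STACKSIZE are hypothetical names, and the 16 KiB fallback is
an assumption standing in for READ_THREAD_STACKSIZE. Only dlsym(),
RTLD_DEFAULT and the pthread_attr_*() calls are real interfaces;
__pthread_get_minstack is a glibc-internal symbol that may be absent, which
is why the result is checked. On older glibc versions this needs -ldl when
linking, which is the failure described above.

```c
#define _GNU_SOURCE            /* for RTLD_DEFAULT */
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

/* Assumption: stand-in for the constant used before the patch. */
#define SKETCH_FALLBACK_STACKSIZE ((size_t)16 * 1024)

/* Pick a stack size: ask glibc if possible, never go below the fallback. */
static size_t sketch_watch_stack_size(void)
{
    size_t (*getsz)(pthread_attr_t *attr) = NULL;
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return SKETCH_FALLBACK_STACKSIZE;

    /* POSIX idiom for storing a dlsym() result in a function pointer. */
    *(void **)(&getsz) = dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
    if (getsz)
        size = getsz(&attr);

    pthread_attr_destroy(&attr);

    /* Symbol missing or value too small: keep the old behaviour. */
    if (size < SKETCH_FALLBACK_STACKSIZE)
        size = SKETCH_FALLBACK_STACKSIZE;

    assert(size >= SKETCH_FALLBACK_STACKSIZE);
    return size;
}
```

The returned value would then be handed to pthread_attr_setstacksize()
before creating the thread, as the hunk above does.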

Re: [Xen-devel] [RFC PATCH 38/49] ARM: new VGIC: handle hardware mapped IRQs

2018-02-26 Thread Andre Przywara
Hi,

On 23/02/18 18:14, Julien Grall wrote:
> 
> 
> On 23/02/18 18:02, Andre Przywara wrote:
>> Hi,
> 
> Hi Andre,
> 
>> On 19/02/18 12:19, Julien Grall wrote:
>>> Hi,
>>>
>>> On 09/02/18 14:39, Andre Przywara wrote:
>>>> The VGIC supports virtual IRQs to be connected to a hardware IRQ, so
>>>> when a guest EOIs the virtual interrupt, it affects the state of that
>>>> corresponding interrupt on the hardware side at the same time.
>>>> Implement the interface that the Xen arch/core code expects to connect
>>>> the virtual and the physical world.
>>>>
>>>> Signed-off-by: Andre Przywara
>>>> ---
>>>>    xen/arch/arm/vgic/vgic.c | 63
>>>>
>>>>    1 file changed, 63 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>>>> index dc5e011fa3..8d5260a7db 100644
>>>> --- a/xen/arch/arm/vgic/vgic.c
>>>> +++ b/xen/arch/arm/vgic/vgic.c
>>>> @@ -693,6 +693,69 @@ void vgic_kick_vcpus(struct domain *d)
>>>>      }
>>>>  }
>>>> +struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu *v,
>>>> +                                      unsigned int virq)
>>>> +{
>>>> +    struct irq_desc *desc = NULL;
>>>> +    struct vgic_irq *irq = vgic_get_irq(d, v, virq);
>>>> +    unsigned long flags;
>>>> +
>>>> +    if ( !irq )
>>>> +        return NULL;
>>>> +
>>>> +    spin_lock_irqsave(&irq->irq_lock, flags);
>>>> +    if ( irq->hw )
>>>> +        desc = irq_to_desc(irq->hwintid);
>>>
>>> This is not going to work well for PPIs. We should consider adding at
>>> least an ASSERT(...) in the code to prevent bad use of it.
>>
>> Yeah, done. But I wonder if we eventually should extend the
>> irq_to_desc() function to take the vCPU, since we will need it anyway
>> once we use hardware mapped timer IRQs (PPIs) in the future. But this
>> should not be in this series, I guess.
> 
> irq_to_desc only deals with hardware interrupts, so you mean pCPU instead
> of vCPU?

Yes, indeed. But I think this points to the problem of this approach:
the virtual IRQ is tied to a VCPU, and we have to make sure that not
only the affinity is updated on a CPU migration (as we do for SPIs), but
actually the interrupt itself is changed: since CPU0/PPI9 has a
different irq_desc* from, say, CPU1/PPI9.
So there is more than just adding a parameter to irq_to_desc().

>>>> +    spin_unlock_irqrestore(&irq->irq_lock, flags);
>>>> +
>>>> +    vgic_put_irq(d, irq);
>>>> +
>>>> +    return desc;
>>>> +}
>>>> +
>>>> +/*
>>>> + * was:
>>>> + *  int kvm_vgic_map_phys_irq(struct vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>>>> + *  int kvm_vgic_unmap_phys_irq(struct vcpu *vcpu, unsigned int virt_irq)
>>>> + */
>>>> +int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
>>>> +    unsigned int virt_irq, struct irq_desc *desc,
>>>> +    bool connect)
>>>
>>> Indentation.
>>>
>>>> +{
>>>> +    struct vgic_irq *irq = vgic_get_irq(d, vcpu, virt_irq);
>>>> +    unsigned long flags;
>>>> +    int ret = 0;
>>>> +
>>>> +    if ( !irq )
>>>> +        return -EINVAL;
>>>> +
>>>> +    spin_lock_irqsave(&irq->irq_lock, flags);
>>>> +
>>>> +    if ( connect )  /* assign a mapped IRQ */
>>>> +    {
>>>> +        /* The VIRQ should not be already enabled by the guest */
>>>> +        if ( !irq->hw && !irq->enabled )
>>>> +        {
>>>> +            irq->hw = true;
>>>> +            irq->hwintid = desc->irq;
>>>> +        }
>>>> +        else
>>>> +        {
>>>> +            ret = -EBUSY;
>>>> +        }
>>>
>>> I know that it should not matter for SPIs today. But aren't you meant to
>>> get a reference on that interrupt if you connect it?
>>
>> No, the refcount feature is strictly for the pointer to the structure,
>> not for everything related to this virtual IRQ.
>> We store only the virtual IRQ number in the dev_id/info members, we will
>> get the struct vgic_irq pointer via the vIRQ number on do_IRQ().
>> Does that make sense?
> 
> But technically you "allocate" the virtual SPI at that time, right? So
> this would mean you need to get a reference, otherwise it might disappear.

We will realise that it has disappeared when vgic_get_irq() called with
that virtual number returns NULL. The refcount is really just to know
when you can free dynamically allocated struct vgic_irqs, so it's
strictly about the *pointer* to the *memory*, not about the logical
entity of that particular virtual IRQ.
Actually it should not really happen that you end up with a hardware IRQ
still assigned to an abandoned virtual IRQ, as you would expect to free
that connection *before* disbanding the virtual IRQ.
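
To make the distinction concrete (the refcount protects the memory behind
the pointer, not the logical virtual IRQ), here is a single-threaded sketch.
All sketch_* names are hypothetical, the fixed-size table is a stand-in for
however the real code looks up dynamically allocated IRQs, and locking is
omitted entirely. It only shows the pattern: removal unpublishes the entry
first, later lookups return NULL, and the memory is freed when the last
holder drops its reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_SLOTS 8    /* assumption: tiny fixed table for the sketch */

/* Hypothetical stand-in for a dynamically allocated struct vgic_irq. */
struct sketch_irq {
    unsigned int intid;
    unsigned int refcount;
};

static struct sketch_irq *sketch_table[SKETCH_NR_SLOTS];
static unsigned int sketch_freed;   /* counts frees, for illustration only */

/* Allocate an IRQ and publish it; the table itself holds one reference. */
static bool sketch_irq_create(unsigned int intid)
{
    struct sketch_irq *irq;

    if ( intid >= SKETCH_NR_SLOTS || sketch_table[intid] != NULL )
        return false;

    irq = malloc(sizeof(*irq));
    if ( irq == NULL )
        return false;

    irq->intid = intid;
    irq->refcount = 1;
    sketch_table[intid] = irq;
    return true;
}

/* Look up by number; returns NULL once the IRQ has been removed. */
static struct sketch_irq *sketch_get_irq(unsigned int intid)
{
    struct sketch_irq *irq = intid < SKETCH_NR_SLOTS ? sketch_table[intid]
                                                     : NULL;

    if ( irq != NULL )
        irq->refcount++;
    return irq;
}

/* Drop one reference; the last one frees the memory. */
static void sketch_put_irq(struct sketch_irq *irq)
{
    assert(irq != NULL && irq->refcount > 0);
    if ( --irq->refcount == 0 )
    {
        free(irq);
        sketch_freed++;
    }
}

/* Removal: unpublish first, then drop the table's own reference. */
static void sketch_irq_remove(unsigned int intid)
{
    struct sketch_irq *irq = intid < SKETCH_NR_SLOTS ? sketch_table[intid]
                                                     : NULL;

    if ( irq == NULL )
        return;
    sketch_table[intid] = NULL;
    sketch_put_irq(irq);
}
```

A holder that obtained its pointer before the removal can keep using it
safely until its own put; anyone arriving later simply sees NULL.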

> So I am not entirely sure why the reference is not necessary here.

Typically to remove a virtual IRQ, you arrange for vgic_get_irq() to
return NULL on that number. Then you "wait" for the refcount to drop to
zero, at which point it's safe to free the memory allocated for that
vgic_irq. As mentioned, only really useful for LPIs, but 

Re: [Xen-devel] [PATCH][next] xen-netback: make function xenvif_rx_skb static

2018-02-26 Thread David Miller
From: Colin King 
Date: Fri, 23 Feb 2018 17:16:57 +

> From: Colin Ian King 
> 
> The function xenvif_rx_skb is local to the source and does not need
> to be in global scope, so make it static.
> 
> Cleans up sparse warning:
> drivers/net/xen-netback/rx.c:422:6: warning: symbol 'xenvif_rx_skb'
> was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King 

Applied.

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-26 Thread Julien Grall



On 02/26/2018 04:25 PM, Andre Przywara wrote:

Hi,

On 26/02/18 15:55, Julien Grall wrote:

Hi,

On 02/26/2018 03:29 PM, Andre Przywara wrote:

On 13/02/18 16:35, Julien Grall wrote:

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index f4f2a04a60..9e7fb1edcb 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -646,6 +646,38 @@ void gic_inject(void)
    vgic_restore_state(current);
    }
    +static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    bool pending = false;
+    unsigned long flags;
+
+    if ( !vcpu->domain->arch.vgic.enabled )
+        return false;
+
+    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
+
+    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
+    {
+        spin_lock(&irq->irq_lock);
+        pending = irq_is_pending(irq) && irq->enabled;
+        spin_unlock(&irq->irq_lock);
+
+        if ( pending )
+            break;
+    }
+
+    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
+
+    return pending;
+}
+
+int gic_events_need_delivery(void)


You probably want to rename that function or just expose
vgic_vcpu_pending_irq().


Rename to what? I need both functions: vgic_vcpu_pending_irq() is also
called by vgic_kick_vcpus() (later in the series).
And gic_events_need_delivery(void) is the interface that the arch code
expects. Shall I rename this there? To what?


Let me start with it is a bit odd to have a function name 'gic_*' in the
virtual GIC code. So at least renaming to vgic_events_need_delivery
would be an improvement.

Regarding the interface itself, it is ARM specific and not set in stone.
It would not be too bad to use vgic_vcpu_pending_irq(current). Is there
any reason for not doing that?


Not really, but I am a bit reluctant to change too much original Xen
code, don't want to step on anyone's toes ;-)


The original code is going to get killed at some point. So better to use a name
that makes sense in the new context. It is quite similar to the
gic_inject change.




But if that's fine with you, I am OK with the renaming - though it adds
yet another patch ;-)


The end goal is a better world, so the number of patches does not matter 
here :).


If you put them at the beginning, we can merge them right away.

Cheers,

--
Julien Grall
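
A rough sketch of what the suggested interface could look like: the per-vCPU
check takes the vCPU explicitly, and the arch-facing entry point becomes a
thin vgic_-prefixed wrapper around it. The sketch_* types, the
sketch_current pointer (standing in for Xen's "current") and the singly
linked list are all hypothetical, and the locking from the patch is left out
to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vgic_irq. */
struct sketch_irq {
    bool pending;
    bool enabled;
    struct sketch_irq *next;
};

/* Hypothetical stand-in for struct vcpu. */
struct sketch_vcpu {
    bool vgic_enabled;
    struct sketch_irq *ap_list;
};

/* Stand-in for "current", the vCPU running on this pCPU. */
static struct sketch_vcpu *sketch_current;

/* Does this vCPU have at least one pending and enabled interrupt? */
static bool sketch_vgic_vcpu_pending_irq(const struct sketch_vcpu *vcpu)
{
    const struct sketch_irq *irq;

    assert(vcpu != NULL);

    if ( !vcpu->vgic_enabled )
        return false;

    for ( irq = vcpu->ap_list; irq != NULL; irq = irq->next )
        if ( irq->pending && irq->enabled )
            return true;

    return false;
}

/* Arch-facing wrapper, named vgic_* rather than gic_* as suggested. */
static bool sketch_vgic_events_need_delivery(void)
{
    return sketch_vgic_vcpu_pending_irq(sketch_current);
}
```

With this shape, callers that already have a vCPU pointer (such as
vgic_kick_vcpus()) use the first function, and the arch code either calls
the wrapper or passes "current" directly.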

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-26 Thread Andre Przywara
Hi,

On 26/02/18 15:55, Julien Grall wrote:
> Hi,
> 
> On 02/26/2018 03:29 PM, Andre Przywara wrote:
>> On 13/02/18 16:35, Julien Grall wrote:
>>>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>>>> index f4f2a04a60..9e7fb1edcb 100644
>>>> --- a/xen/arch/arm/vgic/vgic.c
>>>> +++ b/xen/arch/arm/vgic/vgic.c
>>>> @@ -646,6 +646,38 @@ void gic_inject(void)
>>>>      vgic_restore_state(current);
>>>>  }
>>>> +static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
>>>> +{
>>>> +    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>> +    struct vgic_irq *irq;
>>>> +    bool pending = false;
>>>> +    unsigned long flags;
>>>> +
>>>> +    if ( !vcpu->domain->arch.vgic.enabled )
>>>> +        return false;
>>>> +
>>>> +    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
>>>> +
>>>> +    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
>>>> +    {
>>>> +        spin_lock(&irq->irq_lock);
>>>> +        pending = irq_is_pending(irq) && irq->enabled;
>>>> +        spin_unlock(&irq->irq_lock);
>>>> +
>>>> +        if ( pending )
>>>> +            break;
>>>> +    }
>>>> +
>>>> +    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
>>>> +
>>>> +    return pending;
>>>> +}
>>>> +
>>>> +int gic_events_need_delivery(void)
>>>
>>> You probably want to rename that function or just expose
>>> vgic_vcpu_pending_irq().
>>
>> Rename to what? I need both functions: vgic_vcpu_pending_irq() is also
>> called by vgic_kick_vcpus() (later in the series).
>> And gic_events_need_delivery(void) is the interface that the arch code
>> expects. Shall I rename this there? To what?
> 
> Let me start with it is a bit odd to have a function name 'gic_*' in the
> virtual GIC code. So at least renaming to vgic_events_need_delivery
> would be an improvement.
> 
> Regarding the interface itself, it is ARM specific and not set in stone.
> It would not be too bad to use vgic_vcpu_pending_irq(current). Is there
> any reason for not doing that?

Not really, but I am a bit reluctant to change too much original Xen
code, don't want to step on anyone's toes ;-)

But if that's fine with you, I am OK with the renaming - though it adds
yet another patch ;-)

Cheers,
Andre.

Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-26 Thread Andre Przywara
Hi,

On 26/02/18 15:59, Julien Grall wrote:
> 
> 
> On 02/26/2018 03:16 PM, Andre Przywara wrote:
>> Hi,
> 
> Hi,
> 
>> forgot to mention:
>>
>> On 13/02/18 14:31, Julien Grall wrote:
>>> Hi,
>>>
>>> On 09/02/18 14:39, Andre Przywara wrote:
 Processing maintenance interrupts and accessing the list registers
 are dependent on the host's GIC version.
 Introduce vgic-v2.c to contain GICv2 specific functions.
 Implement the GICv2 specific code for syncing the emulation state
 into the VGIC registers.
 This also adds the hook to let Xen setup the host GIC addresses.

 This is based on Linux commit 140b086dd197, written by Marc Zyngier.

 Signed-off-by: Andre Przywara 
 ---
    xen/arch/arm/vgic/vgic-v2.c | 261
 
    xen/arch/arm/vgic/vgic.c    |  20 
    xen/arch/arm/vgic/vgic.h    |   8 ++
    3 files changed, 289 insertions(+)
    create mode 100644 xen/arch/arm/vgic/vgic-v2.c

 diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
 new file mode 100644
 index 00..10fc467ffa
 --- /dev/null
 +++ b/xen/arch/arm/vgic/vgic-v2.c
>>
>> 
>>
 +void vgic_v2_save_state(struct vcpu *vcpu)
 +{
 +    u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 +
 +    if ( used_lrs )
 +    {
 +    save_lrs(vcpu, gic_v2_hw_data.hbase);
 +    writel_relaxed(0, gic_v2_hw_data.hbase + GICH_HCR);
 +    }
 +}
>>>
>>> I am not entirely convinced that having a separate function to save the
>>> LRs is necessary. This could be done in fold_lr_state().
>>>
 +
 +void vgic_v2_restore_state(struct vcpu *vcpu)
 +{
 +    struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 +    u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 +    int i;
 +
 +    if ( used_lrs )
 +    {
 +    writel_relaxed(cpu_if->vgic_hcr,
 +   gic_v2_hw_data.hbase + GICH_HCR);
 +    for ( i = 0; i < used_lrs; i++ )
 +    writel_relaxed(cpu_if->vgic_lr[i],
 +   gic_v2_hw_data.hbase + GICH_LR0 + (i * 4));
 +    }
>>>
>>> Same here but with populate_lr_state(). This would make the code easier
>>> to follow and also avoid a lot of ifery in the vgic.c code.
>>
>> This is mostly due to KVM's inability to directly access the GICv3 LRs
>> when running in EL1. I will take a look at what it would take to
>> merge this. Sounds tempting, but there might be side effects.
> 
> I am not sure what the side effects would be. You basically
> call save_state and right after fold_lr_state. This would streamline the
> code a bit more.

The possible side effects are that we actually now have a shadow copy of
the LRs in our data structures. I have the gut feeling we don't need
this in Xen, but need to check more thoroughly.

Cheers,
Andre.

Re: [Xen-devel] [Minios-devel] Excited for Xen Project in Outreachy

2018-02-26 Thread Lars Kurth
Hi Kanika,

> On 24 Feb 2018, at 20:40, KANIKA SAINI  wrote:
> 
> Thank you, Lars. 
> The links you have provided have managed to make things clearer to me now.
> 
> I understand how a flag in the makefile can be changed by passing a parameter 
> to the make command in the command line and how 
> "origin" checks where the variable comes from.
> I did realize creating the dependency of the Makefile itself would indeed not 
> be a good idea at the finer level of granularity. 
> I went through the documentation at 
> http://unikraft.neclab.eu/developers-app.html 
>  and can now understand the 
> structure of makefiles of libraries better. 
> 
> I spent some time studying the problem and came across this solution at 
> https://www.cmcrossroads.com/article/rebuilding-when-cppflags-changes 
>  which 
> uses the concept of signatures. 
> I also understood how rules are dynamically being set by the Makefiles in 
> support/build/ directory. 
> 
> To be able to draft a solution there's one more thing I don't yet
> understand, which is the order of execution of the makefiles.
> What calls the dynamic makefiles, when is the main makefile called, and how
> do the dynamic makefiles pass those parameters to the main one?

I am not sure I understand your question.

Generally, you can debug makefiles which can help you understand what is going 
on: e.g. see http://www.oreilly.com/openbook/make3/book/ch12.pdf 


But in a nutshell: you call make ... in your directory and the Makefile is 
invoked
* Other rules, e.g. in support / build are included via include statements
* All the other *.uk makefiles are also included 
=> The makefile in unikraft / unikraft.git drives everything related to 
building and configuration (including menuconfig)

In some cases you have nested makefile execution, where another makefile is 
invoked via "@make" as in the Hello World app

Lars

Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-26 Thread Andre Przywara
Hi,

On 26/02/18 16:02, Julien Grall wrote:
> Hi Andre,
> 
> On 02/26/2018 03:13 PM, Andre Przywara wrote:
>> Hi,
>>
>> On 13/02/18 14:31, Julien Grall wrote:
>>> Hi,
>>>
>>> On 09/02/18 14:39, Andre Przywara wrote:
 Processing maintenance interrupts and accessing the list registers
 are dependent on the host's GIC version.
 Introduce vgic-v2.c to contain GICv2 specific functions.
 Implement the GICv2 specific code for syncing the emulation state
 into the VGIC registers.
 This also adds the hook to let Xen setup the host GIC addresses.

 This is based on Linux commit 140b086dd197, written by Marc Zyngier.

 Signed-off-by: Andre Przywara 
 ---
    xen/arch/arm/vgic/vgic-v2.c | 261
 
    xen/arch/arm/vgic/vgic.c    |  20 
    xen/arch/arm/vgic/vgic.h    |   8 ++
    3 files changed, 289 insertions(+)
    create mode 100644 xen/arch/arm/vgic/vgic-v2.c

 diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
 new file mode 100644
 index 00..10fc467ffa
 --- /dev/null
 +++ b/xen/arch/arm/vgic/vgic-v2.c
 @@ -0,0 +1,261 @@
 +/*
 + * Copyright (C) 2015, 2016 ARM Ltd.
 + * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
 + *
 + * This program is free software; you can redistribute it and/or
 modify
 + * it under the terms of the GNU General Public License version 2 as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License
 + * along with this program.  If not, see
 <http://www.gnu.org/licenses/>.
 + */
 +
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +
 +#include "vgic.h"
 +
 +#define GICH_ELRSR0 0x30
 +#define GICH_ELRSR1 0x34
 +#define GICH_LR0    0x100
 +
 +#define GICH_LR_VIRTUALID   (0x3ff << 0)
 +#define GICH_LR_PHYSID_CPUID_SHIFT  (10)
 +#define GICH_LR_PHYSID_CPUID    (0x3ff <<
 GICH_LR_PHYSID_CPUID_SHIFT)
 +#define GICH_LR_PRIORITY_SHIFT  23
 +#define GICH_LR_STATE   (3 << 28)
 +#define GICH_LR_PENDING_BIT (1 << 28)
 +#define GICH_LR_ACTIVE_BIT  (1 << 29)
 +#define GICH_LR_EOI (1 << 19)
 +#define GICH_LR_HW  (1 << 31)
>>>
>>> Can we define them either in gic.h or in a new header gic-v2.h?
>>
>> Yes, but they clash with some ill-named GICv3 LR bits. So expect another
>> patch which renames GICH_LR_STATE_SHIFT to ICH_LR_STATE_SHIFT. Which is
>> the actual spec name for that system register in GICv3, there is no
>> GICH_LR_ with the GICv3 bit positions.
> 
> While this would be a nice clean-up, wouldn't creating a new gic-v2.h be
> sufficient?

I don't think that would be right. We actually already have some GICH_
definitions in xen/include/asm-arm/gic.h, so just adding the missing
ones there sounds natural. I now remember that I just didn't do this
initially because of the clash, and at the time I just wanted to
make it compile ;-)

And since assigning GICH_ names to GICv3 ICH_ register bits sounds wrong
in the first place, I consider this a good opportunity to fix this.

Cheers,
Andre.
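
As an aside on the GICv2 list register layout under discussion, the following is a small sketch of how the GICH_LR_* fields quoted above pack into one 32-bit value. The LR_* names, lr_encode() and lr_decode() are hypothetical helpers written for this illustration only, and the 5-bit priority width is an assumption (the patch only gives the shift).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow the GICH_LR_* defines quoted above, made unsigned. */
#define LR_VIRTUALID       (0x3ffu << 0)
#define LR_PHYSID_SHIFT    10
#define LR_PHYSID          (0x3ffu << LR_PHYSID_SHIFT)
#define LR_PRIORITY_SHIFT  23
#define LR_PRIORITY_MASK   0x1fu          /* assumption: 5-bit priority */
#define LR_PENDING_BIT     (1u << 28)
#define LR_ACTIVE_BIT      (1u << 29)
#define LR_HW              (1u << 31)

struct lr_fields {
    uint32_t virtual_id;
    uint32_t physical_id;
    uint32_t priority;
    bool pending;
    bool active;
    bool hw;
};

/* Build an LR value for a software (non-HW) interrupt. */
static uint32_t lr_encode(uint32_t virtual_id, uint32_t priority,
                          bool pending, bool active)
{
    uint32_t lr;

    assert(virtual_id <= 0x3ffu && priority <= LR_PRIORITY_MASK);

    lr = virtual_id | (priority << LR_PRIORITY_SHIFT);
    if ( pending )
        lr |= LR_PENDING_BIT;
    if ( active )
        lr |= LR_ACTIVE_BIT;

    return lr;
}

/* Split an LR value back into its fields. */
static struct lr_fields lr_decode(uint32_t lr)
{
    struct lr_fields f;

    f.virtual_id  = lr & LR_VIRTUALID;
    f.physical_id = (lr & LR_PHYSID) >> LR_PHYSID_SHIFT;
    f.priority    = (lr >> LR_PRIORITY_SHIFT) & LR_PRIORITY_MASK;
    f.pending     = (lr & LR_PENDING_BIT) != 0;
    f.active      = (lr & LR_ACTIVE_BIT) != 0;
    f.hw          = (lr & LR_HW) != 0;

    return f;
}
```

The sketch uses unsigned constants because shifting a signed 1 into bit 31 overflows a 32-bit int.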

> 
>>
>>
 +
 +static struct {
 +    bool enabled;
 +    paddr_t dbase;  /* Distributor interface address */
 +    paddr_t cbase;  /* CPU interface address & size */
 +    paddr_t csize;
 +    paddr_t vbase;  /* Virtual CPU interface address */
 +    void __iomem *hbase;    /* Hypervisor control interface */
 +
 +    /* Offset to add to get an 8kB contiguous region if GIC is
 aliased */
 +    uint32_t aliased_offset;
 +} gic_v2_hw_data;
 +
 +void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
 +  paddr_t vbase, void __iomem *hbase,
 +  uint32_t aliased_offset)
 +{
 +    gic_v2_hw_data.enabled = true;
 +    gic_v2_hw_data.dbase = dbase;
 +    gic_v2_hw_data.cbase = cbase;
 +    gic_v2_hw_data.csize = csize;
 +    gic_v2_hw_data.vbase = vbase;
 +    gic_v2_hw_data.hbase = hbase;
 +    gic_v2_hw_data.aliased_offset = aliased_offset;
 +}
 +
 +void vgic_v2_set_underflow(struct vcpu *vcpu)
 +{
 +    gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
 +}
 +
 +/*
 + * transfer the content of the LRs back into the corresponding
 ap_list:
 + * - active bit 

Re: [Xen-devel] [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

2018-02-26 Thread Jan Beulich
>>> On 26.02.18 at 15:08,  wrote:
> Older Xen versions (4.5 and before) might have problems migrating pv
> guests with MSR_IA32_SPEC_CTRL having a non-zero value. So before
> suspending zero that MSR and restore it after being resumed.
> 
> Cc: sta...@vger.kernel.org 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 



Re: [Xen-devel] [RFC Patch v4 8/8] x86/hvm: bump the maximum number of vcpus to 512

2018-02-26 Thread Jan Beulich
>>> On 26.02.18 at 14:11,  wrote:
> On Mon, Feb 26, 2018 at 01:26:42AM -0700, Jan Beulich wrote:
> On 23.02.18 at 19:11,  wrote:
>>> On Wed, Dec 06, 2017 at 03:50:14PM +0800, Chao Gao wrote:
 Signed-off-by: Chao Gao 
 ---
  xen/include/public/hvm/hvm_info_table.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/xen/include/public/hvm/hvm_info_table.h 
>>> b/xen/include/public/hvm/hvm_info_table.h
 index 08c252e..6833a4c 100644
 --- a/xen/include/public/hvm/hvm_info_table.h
 +++ b/xen/include/public/hvm/hvm_info_table.h
 @@ -32,7 +32,7 @@
  #define HVM_INFO_PADDR   ((HVM_INFO_PFN << 12) + HVM_INFO_OFFSET)
  
  /* Maximum we can support with current vLAPIC ID mapping. */
 -#define HVM_MAX_VCPUS128
 +#define HVM_MAX_VCPUS512
>>> 
>>> Wow, that looks like a pretty big jump. I certainly don't have access
>>> to any box with this number of vCPUs, so that's going to be quite hard
>>> to test. What's the reasoning behind this bump? Is hardware with 512
>>> ways expected soon-ish?
>>> 
>>> Also osstest is not even able to test the current limit, so I would
>>> maybe bump this to 256, but as I expressed on other occasions I don't
>>> feel comfortable with having a number of vCPUs that the current test
>>> system doesn't have hardware to test with.
>>
>>I think implementation limit and supported limit need to be clearly
>>distinguished here. Therefore I'd put the question the other way
>>around: What's causing the limit to be 512, rather than 1024,
>>4096, or even 4G-1 (x2APIC IDs are 32 bits wide, after all)?
> 
> TBH, I have no idea. When I chose a value, what came to my mind was
> that the value should be 288, because Intel has the Xeon Phi platform, which
> has 288 physical threads, and some customers want to use this new platform
> for HPC cloud. Furthermore, they request support for a big VM to which
> almost all computing and device resources are assigned. They just
> use virtualization technology to manage the machines. In this situation,
> I chose 512 because I feel much better if the limit is a power of 2.
> 
> You are asking, given that these patches remove limitations imposed by some
> components, which one is the next bottleneck and how many vcpus it
> allows. Maybe it would be the use case. No one is requesting support for
> more than 288 at this moment. So which value do you prefer, 288 or
> 512? Or do you think I should find the next bottleneck in Xen's
> implementation?

Again - here we're talking about implementation limits, not
bottlenecks. So in this context all I'm interested in is whether
(and if so which) implementation limit remains. If an (almost)
arbitrary number is fine, perhaps we'll want to have a Kconfig
option.

I'm also curious - do Phis not come in multi-socket configs? It's
my understanding that 288 is the count for a single socket.

As to bottlenecks - you've been told they exist far below 128.

Jan
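
The distinction between an implementation limit and a supported limit can be illustrated with a small calculation. The sketch below assumes, purely for illustration, a mapping in which vCPU n is given APIC ID n * stride and the all-ones ID is reserved; neither the function name nor these parameters come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vCPUs that fit when vCPU n gets ID n * stride, IDs are id_bits
 * wide, and the all-ones ID is reserved. All of this is an assumption made
 * for the example.
 */
static uint64_t max_vcpus_for_id_width(unsigned int id_bits, uint64_t stride)
{
    uint64_t highest_usable_id;

    assert(id_bits >= 2 && id_bits <= 32 && stride >= 1);

    /* Highest ID that is not the reserved all-ones value. */
    highest_usable_id = ((uint64_t)1 << id_bits) - 2;

    /* Usable IDs are 0, stride, 2 * stride, ... up to highest_usable_id. */
    return highest_usable_id / stride + 1;
}
```

Under those assumptions an 8-bit ID space with a stride of 2 gives 128 vCPUs, while a 32-bit ID space gives 2^31, so the ID width alone would not single out a value such as 512.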


[Xen-devel] [ovmf test] 120014: regressions - FAIL

2018-02-26 Thread osstest service owner
flight 120014 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120014/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 119996

version targeted for testing:
 ovmf f440f7e3caba12c0649c9ce15c33c7ec7aa2a4e8
baseline version:
 ovmf ebfca258f5d7ab59cd1b72ad56f1de0e7a138ba9

Last test of basis   119996  2018-02-24 17:57:18 Z    1 days
Testing same since   120014  2018-02-25 14:21:05 Z    1 days    1 attempts


People who touched revisions under test:
  Feng, YunhuaX 
  Kinney, Michael D 
  Michael D Kinney 
  Yonghong Zhu 
  Yunhua Feng 

jobs:
 build-amd64-xsm                         pass
 build-i386-xsm                          pass
 build-amd64                             pass
 build-i386                              pass
 build-amd64-libvirt                     pass
 build-i386-libvirt                      fail
 build-amd64-pvops                       pass
 build-i386-pvops                        pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64    pass
 test-amd64-i386-xl-qemuu-ovmf-amd64     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit f440f7e3caba12c0649c9ce15c33c7ec7aa2a4e8
Author: Yonghong Zhu 
Date:   Fri Feb 23 13:05:34 2018 +0800

BaseTools: Add *B Flag for the field that from command line

For structure PCD, the field value may override in the command line,
so in the report when we print the field info we add *B Flag for those
field.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Yonghong Zhu 
Reviewed-by: Liming Gao 

commit 3be421e98756efc6d355b45e632c5c7b19b35b9e
Author: Feng, YunhuaX 
Date:   Fri Feb 23 19:47:30 2018 +0800

BaseTools: Update ValueExpressionEx for flexible PCD

1. Byte  array number should less than 0xFF.
2. Add SplitPcdValueString for PCD split

Cc: Liming Gao 
Cc: Yonghong Zhu 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Yunhua Feng 
Reviewed-by: Yonghong Zhu 

commit 8bd72d7c05ff820ee7826809a033fda9b007d18f
Author: Kinney, Michael D 
Date:   Tue Feb 20 20:08:32 2018 -0800

BaseTools/Expression: Use 2nd passes on PCD values

Use 2 passes when evaluating PCD values to discover
all the LABEL() operators and compute the byte offset
of each LABEL().  The 2nd pass then has the information
to replace the OFFSET_OF() operator with the computed
byte offset.  The 2 passes allows OFFSET_OF() to be used
before a LABEL() is declared.

fixes:https://bugzilla.tianocore.org/show_bug.cgi?id=880
Cc: Liming Gao 
Cc: Yonghong Zhu 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Michael D Kinney 
Reviewed-by: Yonghong Zhu 

[Xen-devel] [v2 1/1] xen, mm: Allow deferred page initialization for xen pv domains

2018-02-26 Thread Pavel Tatashin
Juergen Gross noticed that commit
f7f99100d8d ("mm: stop zeroing memory during allocation in vmemmap")
broke XEN PV domains when deferred struct page initialization is enabled.

This is because Xen's PagePinned() flag is getting erased from struct
pages when they are initialized later in boot.

Juergen fixed this problem by disabling deferred pages on xen pv domains.
It is desirable, however, to have this feature available as it reduces boot
time. This fix re-enables the feature for pv domains, and fixes the problem
in the following way:

The fix is to delay setting PagePinned flag until struct pages for all
allocated memory are initialized, i.e. until after free_all_bootmem().

A new x86_init.hyper op init_after_bootmem() is called to let Xen know
that the boot allocator is done, and hence struct pages for all the allocated
memory are now initialized. If deferred page initialization is enabled, the
rest of the struct pages are going to be initialized later in boot once
page_alloc_init_late() is called.

xen_after_bootmem() walks page table's pages and marks them pinned.

Signed-off-by: Pavel Tatashin 
---
 arch/x86/include/asm/x86_init.h |  2 ++
 arch/x86/kernel/x86_init.c  |  1 +
 arch/x86/mm/init_32.c   |  1 +
 arch/x86/mm/init_64.c   |  1 +
 arch/x86/xen/mmu_pv.c   | 38 ++
 mm/page_alloc.c |  4 
 6 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 5ffa116ddb08..c06046e2d3ff 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -122,12 +122,14 @@ struct x86_init_pci {
  * @guest_late_init:   guest late init
  * @x2apic_available:  X2APIC detection
  * @init_mem_mapping:  setup early mappings during init_mem_mapping()
+ * @init_after_bootmem:        guest init after boot allocator is finished
  */
 struct x86_hyper_init {
void (*init_platform)(void);
void (*guest_late_init)(void);
bool (*x2apic_available)(void);
void (*init_mem_mapping)(void);
+   void (*init_after_bootmem)(void);
 };
 
 /**
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index aab817eb05cf..3215bffbf4d1 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -91,6 +91,7 @@ struct x86_init_ops x86_init __initdata = {
.guest_late_init= x86_init_noop,
.x2apic_available   = bool_x86_init_noop,
.init_mem_mapping   = x86_init_noop,
+   .init_after_bootmem = x86_init_noop,
},
 
.acpi = {
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 79cb066f40c0..0b750c845078 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -763,6 +763,7 @@ void __init mem_init(void)
free_all_bootmem();
 
after_bootmem = 1;
+   x86_init.hyper.init_after_bootmem();
 
mem_init_print_info(NULL);
printk(KERN_INFO "virtual kernel memory layout:\n"
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 332f6e25977a..8d60443dd900 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1189,6 +1189,7 @@ void __init mem_init(void)
/* this will put all memory onto the freelists */
free_all_bootmem();
after_bootmem = 1;
+   x86_init.hyper.init_after_bootmem();
 
/*
 * Must be done after boot memory is put on freelist, because here we
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index d20763472920..486c0a34d00b 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -116,6 +116,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3);  /* actual vcpu cr3 */
 
 static phys_addr_t xen_pt_base, xen_pt_size __initdata;
 
+static DEFINE_STATIC_KEY_FALSE(xen_struct_pages_ready);
+
 /*
  * Just beyond the highest usermode address.  STACK_TOP_MAX has a
  * redzone above it, so round it up to a PGD boundary.
@@ -155,11 +157,18 @@ void make_lowmem_page_readwrite(void *vaddr)
 }
 
 
+/*
+ * During early boot all page table pages are pinned, but we do not have struct
+ * pages, so return true until struct pages are ready.
+ */
 static bool xen_page_pinned(void *ptr)
 {
-   struct page *page = virt_to_page(ptr);
+   if (static_branch_likely(&xen_struct_pages_ready)) {
+   struct page *page = virt_to_page(ptr);
 
-   return PagePinned(page);
+   return PagePinned(page);
+   }
+   return true;
 }
 
 static void xen_extend_mmu_update(const struct mmu_update *update)
@@ -836,11 +845,6 @@ void xen_mm_pin_all(void)
       spin_unlock(&pgd_lock);
 }
 
-/*
- * The init_mm pagetable is really pinned as soon as its created, but
- * that's before we have page structures to store the bits.  So do all
- * the book-keeping now.
- */
 static int __init xen_mark_pinned(struct mm_struct *mm, struct page 

[Xen-devel] [v2 0/1] Allow deferred page initialization for xen pv domains

2018-02-26 Thread Pavel Tatashin
Changelog
v1 - v2
- Addressed comment from Juergen Gross: fixed a comment, and moved
  after_bootmem from PV framework to x86_init.hyper.

From this discussion:
https://www.spinics.net/lists/linux-mm/msg145604.html

I investigated whether it is feasible to re-enable deferred page
initialization on Xen's paravirtualized domains. After studying the
code, I found a non-intrusive way to do just that.

All we need to do is to assume that page-table's pages are pinned early in
boot, which is always true, and add a new x86_init.hyper OP call to notify
guests that boot allocator is finished, so we can set all the necessary
fields in already initialized struct pages.
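
The idea described above can be sketched as a small stand-alone model: a hook that defaults to a no-op, a "struct pages ready" flag, and a pinned check that answers true until the flag is set. Every model_* name here is hypothetical, and a plain bool stands in for the static key used in the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the pinned flag is modelled. */
struct model_page {
    bool pinned;
};

static struct model_page model_pt_pages[4];   /* "page-table" pages */
static bool model_struct_pages_ready;         /* stands in for the static key */

static void model_noop(void)
{
}

/* Models the new x86_init.hyper hook; the default does nothing. */
static void (*model_init_after_bootmem)(void) = model_noop;

static bool model_page_pinned(const struct model_page *page)
{
    assert(page != NULL);

    /* Early boot: page flags cannot be trusted yet, so assume "pinned". */
    if (!model_struct_pages_ready)
        return true;

    return page->pinned;
}

/* What a Xen PV guest would install as the hook in this model. */
static void model_xen_after_bootmem(void)
{
    size_t i;

    model_struct_pages_ready = true;
    for (i = 0; i < sizeof(model_pt_pages) / sizeof(model_pt_pages[0]); i++)
        model_pt_pages[i].pinned = true;
}

/* Models mem_init(): the hook runs once the boot allocator is finished. */
static void model_mem_init(void)
{
    model_init_after_bootmem();
}
```

On a non-Xen boot the hook stays a no-op in this model, so only guests that install their own callback pay for the extra bookkeeping.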

I have tested this on my laptop with a 64-bit kernel, but I would appreciate
it if someone could provide more Xen testing.

Apply against: linux-next. Enable the following configs:

CONFIG_XEN_PV=y
CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
The above two are needed to test deferred page initialization on PV Xen
domains. If the fix is applied correctly, dmesg should output line(s) like this
during boot:
[0.266180] node 0 initialised, 717570 pages in 36ms

CONFIG_DEBUG_VM=y
This is needed to poison struct page's memory, otherwise it would be all
zero.

CONFIG_DEBUG_VM_PGFLAGS=y
Verifies that we do not access struct pages flags while memory is still
poisoned (struct pages are not initialized yet).

Pavel Tatashin (1):
  xen, mm: Allow deferred page initialization for xen pv domains

 arch/x86/include/asm/x86_init.h |  2 ++
 arch/x86/kernel/x86_init.c  |  1 +
 arch/x86/mm/init_32.c   |  1 +
 arch/x86/mm/init_64.c   |  1 +
 arch/x86/xen/mmu_pv.c   | 38 ++
 mm/page_alloc.c |  4 
 6 files changed, 31 insertions(+), 16 deletions(-)

-- 
2.16.2


Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-26 Thread Julien Grall

Hi Andre,

On 02/26/2018 03:13 PM, Andre Przywara wrote:

Hi,

On 13/02/18 14:31, Julien Grall wrote:

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

Processing maintenance interrupts and accessing the list registers
are dependent on the host's GIC version.
Introduce vgic-v2.c to contain GICv2 specific functions.
Implement the GICv2 specific code for syncing the emulation state
into the VGIC registers.
This also adds the hook to let Xen setup the host GIC addresses.

This is based on Linux commit 140b086dd197, written by Marc Zyngier.

Signed-off-by: Andre Przywara 
---
   xen/arch/arm/vgic/vgic-v2.c | 261

   xen/arch/arm/vgic/vgic.c    |  20 
   xen/arch/arm/vgic/vgic.h    |   8 ++
   3 files changed, 289 insertions(+)
   create mode 100644 xen/arch/arm/vgic/vgic-v2.c

diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
new file mode 100644
index 00..10fc467ffa
--- /dev/null
+++ b/xen/arch/arm/vgic/vgic-v2.c
@@ -0,0 +1,261 @@
+/*
+ * Copyright (C) 2015, 2016 ARM Ltd.
+ * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vgic.h"
+
+#define GICH_ELRSR0 0x30
+#define GICH_ELRSR1 0x34
+#define GICH_LR0    0x100
+
+#define GICH_LR_VIRTUALID   (0x3ff << 0)
+#define GICH_LR_PHYSID_CPUID_SHIFT  (10)
+#define GICH_LR_PHYSID_CPUID    (0x3ff <<
GICH_LR_PHYSID_CPUID_SHIFT)
+#define GICH_LR_PRIORITY_SHIFT  23
+#define GICH_LR_STATE   (3 << 28)
+#define GICH_LR_PENDING_BIT (1 << 28)
+#define GICH_LR_ACTIVE_BIT  (1 << 29)
+#define GICH_LR_EOI (1 << 19)
+#define GICH_LR_HW  (1 << 31)


Can we define them either in gic.h or in a new header gic-v2.h?


Yes, but they clash with some ill-named GICv3 LR bits. So expect another
patch which renames GICH_LR_STATE_SHIFT to ICH_LR_STATE_SHIFT. Which is
the actual spec name for that system register in GICv3, there is no
GICH_LR_ with the GICv3 bit positions.


While this would be a nice clean-up, wouldn't creating a new gic-v2.h be
sufficient?






+
+static struct {
+    bool enabled;
+    paddr_t dbase;  /* Distributor interface address */
+    paddr_t cbase;  /* CPU interface address & size */
+    paddr_t csize;
+    paddr_t vbase;  /* Virtual CPU interface address */
+    void __iomem *hbase;    /* Hypervisor control interface */
+
+    /* Offset to add to get an 8kB contiguous region if GIC is
aliased */
+    uint32_t aliased_offset;
+} gic_v2_hw_data;
+
+void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
+  paddr_t vbase, void __iomem *hbase,
+  uint32_t aliased_offset)
+{
+    gic_v2_hw_data.enabled = true;
+    gic_v2_hw_data.dbase = dbase;
+    gic_v2_hw_data.cbase = cbase;
+    gic_v2_hw_data.csize = csize;
+    gic_v2_hw_data.vbase = vbase;
+    gic_v2_hw_data.hbase = hbase;
+    gic_v2_hw_data.aliased_offset = aliased_offset;
+}
+
+void vgic_v2_set_underflow(struct vcpu *vcpu)
+{
+    gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
+}
+
+/*
+ * transfer the content of the LRs back into the corresponding ap_list:
+ * - active bit is transferred as is
+ * - pending bit is
+ *   - transferred as is in case of edge sensitive IRQs
+ *   - set to the line-level (resample time) for level sensitive IRQs
+ */
+void vgic_v2_fold_lr_state(struct vcpu *vcpu)


I am wondering how much we could share this code with
vgic_v3_fold_lr_state.


I think we discussed this and dismissed the idea:
- The actual LR encoding is much different between GICv3 and GICv2, up
to the point where we have some fields in one which are not in the
other. That really clutters the code.
- Originally this function was much shorter and didn't have that many
special cases. So the code duplication was really minimal.

I see your point, but don't really want to go there now for two reasons:
- It is probably nasty to implement, since we always have to check which
GIC we are running on when masking the LR value.
- It would deviate further from the KVM implementation, in a core
function. For any bugs introduced we are on our own here.

I will try to bring this up with the KVM people, to see whether it's

Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-26 Thread Julien Grall



On 02/26/2018 03:16 PM, Andre Przywara wrote:

Hi,


Hi,


forgot to mention:

On 13/02/18 14:31, Julien Grall wrote:

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

Processing maintenance interrupts and accessing the list registers
are dependent on the host's GIC version.
Introduce vgic-v2.c to contain GICv2 specific functions.
Implement the GICv2 specific code for syncing the emulation state
into the VGIC registers.
This also adds the hook to let Xen setup the host GIC addresses.

This is based on Linux commit 140b086dd197, written by Marc Zyngier.

Signed-off-by: Andre Przywara 
---
   xen/arch/arm/vgic/vgic-v2.c | 261

   xen/arch/arm/vgic/vgic.c    |  20 
   xen/arch/arm/vgic/vgic.h    |   8 ++
   3 files changed, 289 insertions(+)
   create mode 100644 xen/arch/arm/vgic/vgic-v2.c

diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
new file mode 100644
index 00..10fc467ffa
--- /dev/null
+++ b/xen/arch/arm/vgic/vgic-v2.c





+void vgic_v2_save_state(struct vcpu *vcpu)
+{
+    u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
+
+    if ( used_lrs )
+    {
+    save_lrs(vcpu, gic_v2_hw_data.hbase);
+    writel_relaxed(0, gic_v2_hw_data.hbase + GICH_HCR);
+    }
+}


I am not entirely convinced that having a separate function to save the
LRs is necessary. This could be done in fold_lr_state().


+
+void vgic_v2_restore_state(struct vcpu *vcpu)
+{
+    struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+    u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
+    int i;
+
+    if ( used_lrs )
+    {
+    writel_relaxed(cpu_if->vgic_hcr,
+   gic_v2_hw_data.hbase + GICH_HCR);
+    for ( i = 0; i < used_lrs; i++ )
+    writel_relaxed(cpu_if->vgic_lr[i],
+   gic_v2_hw_data.hbase + GICH_LR0 + (i * 4));
+    }


Same here but with populate_lr_state(). This would make the code easier
to follow and also avoid a lot of ifery in the vgic.c code.


This is mostly due to KVM's inability to directly access the GICv3 LRs
when running in EL1. I will take a look at what it would take to
merge this. Sounds tempting, but there might be side effects.


I am not sure what the side effects would be. You basically
call save_state and right after fold_lr_state. This would streamline the
code a bit more.


Cheers,

--
Julien Grall

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-26 Thread Julien Grall

Hi,

On 02/26/2018 03:29 PM, Andre Przywara wrote:

On 13/02/18 16:35, Julien Grall wrote:

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index f4f2a04a60..9e7fb1edcb 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -646,6 +646,38 @@ void gic_inject(void)
   vgic_restore_state(current);
   }
+static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    bool pending = false;
+    unsigned long flags;
+
+    if ( !vcpu->domain->arch.vgic.enabled )
+        return false;
+
+    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
+
+    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
+    {
+        spin_lock(&irq->irq_lock);
+        pending = irq_is_pending(irq) && irq->enabled;
+        spin_unlock(&irq->irq_lock);
+
+        if ( pending )
+            break;
+    }
+
+    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
+
+    return pending;
+}
+
+int gic_events_need_delivery(void)


You probably want to rename that function or just expose
vgic_vcpu_pending_irq().


Rename to what? I need both functions: vgic_vcpu_pending_irq() is also
called by vgic_kick_vcpus() (later in the series).
And gic_events_need_delivery(void) is the interface that the arch code
expects. Shall I rename this there? To what?


Let me start with the name: it is a bit odd to have a function named 'gic_*'
in the virtual GIC code. So at least renaming it to vgic_events_need_delivery
would be an improvement.


Regarding the interface itself, it is ARM specific and not set in stone. 
It would not be too bad to use vgic_vcpu_pending_irq(current). Is there 
any reason for not doing that?


Cheers,

--
Julien Grall

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-26 Thread Andre Przywara
Hi,

On 13/02/18 16:35, Julien Grall wrote:
> Hi,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> Tell Xen whether a particular VCPU has an IRQ that needs handling
>> in the guest. This is used to decide whether a VCPU is runnable.
>>
>> This is based on Linux commit 90eee56c5f90, written by Eric Auger.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic.c | 32 
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index f4f2a04a60..9e7fb1edcb 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -646,6 +646,38 @@ void gic_inject(void)
>>   vgic_restore_state(current);
>>   }
>> +static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
>> +{
>> +    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +    struct vgic_irq *irq;
>> +    bool pending = false;
>> +    unsigned long flags;
>> +
>> +    if ( !vcpu->domain->arch.vgic.enabled )
>> +        return false;
>> +
>> +    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
>> +
>> +    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
>> +    {
>> +        spin_lock(&irq->irq_lock);
>> +        pending = irq_is_pending(irq) && irq->enabled;
>> +        spin_unlock(&irq->irq_lock);
>> +
>> +        if ( pending )
>> +            break;
>> +    }
>> +
>> +    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
>> +
>> +    return pending;
>> +}
>> +
>> +int gic_events_need_delivery(void)
> 
> You probably want to rename that function or just expose
> vgic_vcpu_pending_irq().

Rename to what? I need both functions: vgic_vcpu_pending_irq() is also
called by vgic_kick_vcpus() (later in the series).
And gic_events_need_delivery(void) is the interface that the arch code
expects. Shall I rename this there? To what?

Cheers,
Andre.

Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-26 Thread Andre Przywara
Hi,

On 13/02/18 14:31, Julien Grall wrote:
> Hi,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> Processing maintenance interrupts and accessing the list registers
>> are dependent on the host's GIC version.
>> Introduce vgic-v2.c to contain GICv2 specific functions.
>> Implement the GICv2 specific code for syncing the emulation state
>> into the VGIC registers.
>> This also adds the hook to let Xen setup the host GIC addresses.
>>
>> This is based on Linux commit 140b086dd197, written by Marc Zyngier.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic-v2.c | 261
>> 
>>   xen/arch/arm/vgic/vgic.c    |  20 
>>   xen/arch/arm/vgic/vgic.h    |   8 ++
>>   3 files changed, 289 insertions(+)
>>   create mode 100644 xen/arch/arm/vgic/vgic-v2.c
>>
>> diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
>> new file mode 100644
>> index 00..10fc467ffa
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic/vgic-v2.c
>> @@ -0,0 +1,261 @@
>> +/*
>> + * Copyright (C) 2015, 2016 ARM Ltd.
>> + * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see .
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "vgic.h"
>> +
>> +#define GICH_ELRSR0 0x30
>> +#define GICH_ELRSR1 0x34
>> +#define GICH_LR0    0x100
>> +
>> +#define GICH_LR_VIRTUALID   (0x3ff << 0)
>> +#define GICH_LR_PHYSID_CPUID_SHIFT  (10)
>> +#define GICH_LR_PHYSID_CPUID    (0x3ff << GICH_LR_PHYSID_CPUID_SHIFT)
>> +#define GICH_LR_PRIORITY_SHIFT  23
>> +#define GICH_LR_STATE   (3 << 28)
>> +#define GICH_LR_PENDING_BIT (1 << 28)
>> +#define GICH_LR_ACTIVE_BIT  (1 << 29)
>> +#define GICH_LR_EOI (1 << 19)
>> +#define GICH_LR_HW  (1 << 31)
> 
> Can we define them in either in gic.h or a new header gic-v2.h?

Yes, but they clash with some ill-named GICv3 LR bits. So expect another
patch which renames GICH_LR_STATE_SHIFT to ICH_LR_STATE_SHIFT. Which is
the actual spec name for that system register in GICv3, there is no
GICH_LR_ with the GICv3 bit positions.
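
For readers who want to see what the GICv2 defines quoted above encode, here is a small sketch that decodes a list register value. The bit positions are copied from the quoted patch; the helper names (`lr_virtual_id` and friends) are hypothetical and not part of the series, and unsigned constants are used so that bit 31 can be shifted safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GICv2 list register fields, with the bit positions from the quoted patch. */
#define GICH_LR_VIRTUALID           (0x3ffU << 0)
#define GICH_LR_PHYSID_CPUID_SHIFT  10
#define GICH_LR_PHYSID_CPUID        (0x3ffU << GICH_LR_PHYSID_CPUID_SHIFT)
#define GICH_LR_STATE               (3U << 28)
#define GICH_LR_PENDING_BIT         (1U << 28)
#define GICH_LR_ACTIVE_BIT          (1U << 29)
#define GICH_LR_HW                  (1U << 31)

/* The state field is exactly the pending and active bits together. */
static_assert((GICH_LR_PENDING_BIT | GICH_LR_ACTIVE_BIT) == GICH_LR_STATE,
              "state mask must cover the pending and active bits");

/* Hypothetical helpers for illustration; not part of the patch. */
static inline uint32_t lr_virtual_id(uint32_t lr)
{
    return lr & GICH_LR_VIRTUALID;
}

static inline uint32_t lr_physical_id(uint32_t lr)
{
    return (lr & GICH_LR_PHYSID_CPUID) >> GICH_LR_PHYSID_CPUID_SHIFT;
}

static inline bool lr_is_pending(uint32_t lr)
{
    return (lr & GICH_LR_PENDING_BIT) != 0;
}

static inline bool lr_is_active(uint32_t lr)
{
    return (lr & GICH_LR_ACTIVE_BIT) != 0;
}
```

Because these masks are specific to the 32-bit GICv2 layout, keeping them under GICv2-only names avoids confusion with the GICv3 list register bits mentioned above.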


>> +
>> +static struct {
>> +    bool enabled;
>> +    paddr_t dbase;  /* Distributor interface address */
>> +    paddr_t cbase;  /* CPU interface address & size */
>> +    paddr_t csize;
>> +    paddr_t vbase;  /* Virtual CPU interface address */
>> +    void __iomem *hbase;    /* Hypervisor control interface */
>> +
>> +    /* Offset to add to get an 8kB contiguous region if GIC is aliased */
>> +    uint32_t aliased_offset;
>> +} gic_v2_hw_data;
>> +
>> +void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
>> +  paddr_t vbase, void __iomem *hbase,
>> +  uint32_t aliased_offset)
>> +{
>> +    gic_v2_hw_data.enabled = true;
>> +    gic_v2_hw_data.dbase = dbase;
>> +    gic_v2_hw_data.cbase = cbase;
>> +    gic_v2_hw_data.csize = csize;
>> +    gic_v2_hw_data.vbase = vbase;
>> +    gic_v2_hw_data.hbase = hbase;
>> +    gic_v2_hw_data.aliased_offset = aliased_offset;
>> +}
>> +
>> +void vgic_v2_set_underflow(struct vcpu *vcpu)
>> +{
>> +    gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
>> +}
>> +
>> +/*
>> + * transfer the content of the LRs back into the corresponding ap_list:
>> + * - active bit is transferred as is
>> + * - pending bit is
>> + *   - transferred as is in case of edge sensitive IRQs
>> + *   - set to the line-level (resample time) for level sensitive IRQs
>> + */
>> +void vgic_v2_fold_lr_state(struct vcpu *vcpu)
> 
> I am wondering how much we could share this code with
> vgic_v3_fold_lr_state.

I think we discussed this and dismissed the idea:
- The actual LR encoding is much different between GICv3 and GICv2, up
to the point where we have some fields in one which are not in the
other. That really clutters the code.
- Originally this function was much shorter and didn't have that many
special cases. So the code duplication was really minimal.

I see your point, but don't really want to go there now for two reasons:
- It is probably nasty to implement, since we always have to check which
GIC we are running on when masking the LR value.
- It would deviate further from 

[Xen-devel] [seabios test] 120002: regressions - FAIL

2018-02-26 Thread osstest service owner
flight 120002 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120002/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop   fail REGR. vs. 115539

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop  fail like 115539
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail like 115539
 test-amd64-i386-xl-qemuu-ws16-amd64  17 guest-stop  fail like 115539
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386  10 windows-install fail never pass

version targeted for testing:
 seabios  af0daeb2687ad2595482b8a71b02a082a5672ceb
baseline version:
 seabios  0ca6d6277dfafc671a5b3718cbeb5c78e2a888ea

Last test of basis   115539  2017-11-03 20:48:58 Z  114 days
Failing since        115733  2017-11-10 17:19:59 Z  107 days  137 attempts
Testing same since   119258  2018-02-15 09:12:54 Z   11 days   15 attempts


People who touched revisions under test:
  Kevin O'Connor 
  Marcel Apfelbaum 
  Michael S. Tsirkin 
  Nikolay Nikolov 
  Paul Menzel 
  Stefan Berger 

jobs:
 build-amd64-xsm                                     pass
 build-i386-xsm                                      pass
 build-amd64                                         pass
 build-i386                                          pass
 build-amd64-libvirt                                 pass
 build-i386-libvirt                                  pass
 build-amd64-pvops                                   pass
 build-i386-pvops                                    pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm       pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm        pass
 test-amd64-amd64-qemuu-nested-amd                   fail
 test-amd64-i386-qemuu-rhel6hvm-amd                  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64           pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64            pass
 test-amd64-amd64-xl-qemuu-win7-amd64                fail
 test-amd64-i386-xl-qemuu-win7-amd64                 fail
 test-amd64-amd64-xl-qemuu-ws16-amd64                fail
 test-amd64-i386-xl-qemuu-ws16-amd64                 fail
 test-amd64-amd64-xl-qemuu-win10-i386                fail
 test-amd64-i386-xl-qemuu-win10-i386                 fail
 test-amd64-amd64-qemuu-nested-intel                 pass
 test-amd64-i386-qemuu-rhel6hvm-intel                pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit af0daeb2687ad2595482b8a71b02a082a5672ceb
Author: Nikolay Nikolov 
Date:   Sat Feb 10 13:52:17 2018 +0200

floppy: Send 4 sense interrupt commands during controller initialization

During initialization, real floppy controllers need 4 sense interrupt 
commands
to clear the interrupt status (this represents the transition from "not 
ready"
to "ready" for each of the four virtual floppy drives), instead of just one.

This is described in detail in section 7.4 - Drive Polling of the Intel 
82077AA
datasheet.

Signed-off-by: Nikolay Nikolov 

commit 2611db472c0f0bad4987c20990a45c175342fc22
Author: Nikolay Nikolov 
Date:   Sat Feb 10 13:52:16 2018 +0200

floppy: Wait for the floppy motor to reach a stable speed, after starting

When starting up the 

Re: [Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 01:08:05PM +, Andrew Cooper wrote:
> On 26/02/18 12:31, Roger Pau Monné wrote:
> > On Mon, Feb 26, 2018 at 11:35:04AM +, Andrew Cooper wrote:
> >> Newer versions of binutils are capable of emitting an exact number of bytes'
> >> worth of optimised nops.  Use this in preference to .skip when available.
> >>
> >> Signed-off-by: Andrew Cooper 
> >> ---
> >> CC: Jan Beulich 
> >> CC: Konrad Rzeszutek Wilk 
> >> CC: Roger Pau Monné 
> >> CC: Wei Liu 
> >>
> >> RFC until support is actually committed to binutils mainline.
> > Since RFC has been dropped from the subject, is this now committed?
> 
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=62a02d25b6e5d9f92c205260daa11355d0c62532

Thanks for the reference.

> >
> >> ---
> >>  xen/arch/x86/Rules.mk |  4 
> >>  xen/include/asm-x86/alternative-asm.h | 14 --
> >>  xen/include/asm-x86/alternative.h | 13 ++---
> >>  3 files changed, 26 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> >> index e169d67..bf5047f 100644
> >> --- a/xen/arch/x86/Rules.mk
> >> +++ b/xen/arch/x86/Rules.mk
> >> @@ -28,6 +28,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
> >>  $(call as-option-add,CFLAGS,CC,\
> >>  ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
> >>  
> >> +# Check to see whether the assembler supports the .nop directive.
> >> +$(call as-option-add,CFLAGS,CC,\
> >> +".L1: .L2: .nop (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
> >> +
> >>  CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
> >>  
> >>  # Xen doesn't use SSE interally.  If the compiler supports it, also skip 
> >> the
> >> diff --git a/xen/include/asm-x86/alternative-asm.h 
> >> b/xen/include/asm-x86/alternative-asm.h
> >> index 25f79fe..9e46bed 100644
> >> --- a/xen/include/asm-x86/alternative-asm.h
> >> +++ b/xen/include/asm-x86/alternative-asm.h
> >> @@ -1,6 +1,8 @@
> >>  #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
> >>  #define _ASM_X86_ALTERNATIVE_ASM_H_
> >>  
> >> +#include 
> >> +
> >>  #ifdef __ASSEMBLY__
> >>  
> >>  /*
> >> @@ -18,6 +20,14 @@
> >>  .byte \pad_len
> >>  .endm
> >>  
> >> +.macro mknops nr_bytes
> >> +#ifdef HAVE_AS_NOP_DIRECTIVE
> >> +.nop \nr_bytes, ASM_NOP_MAX
> > I'm not able to find any online document about the .nop directive, and
> > I cannot really figure out the purpose of the second parameter.
> 
> It's a bit woolly, named "control".  In practice, it is the maximum
> length of an individual nop.  Beyond 11 bytes (iirc), most pipelines
> take a decode stall.

Oh, I've looked at the commit above, and since control is an optional
parameter, why not let the assembler choose the default value? AFAICT
in our case this should be 11 because it's 64bit code.
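
To make the role of the second ("control") parameter concrete, here is a small sketch of how a padding region of a given size can be split into individual nops when each nop is capped at a maximum length. This only illustrates the arithmetic: the function name is hypothetical, and the ordering used here (longest nops first) is an assumption, not a statement about how the assembler actually arranges them.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split nr_bytes of padding into individual nop lengths, none longer than
 * max_len. Stores up to cap lengths in out[] and returns the total number
 * of nops needed. Longest-first ordering is an assumption of this sketch.
 */
size_t split_nops(size_t nr_bytes, size_t max_len, size_t *out, size_t cap)
{
    size_t count = 0;

    assert(max_len > 0);

    while ( nr_bytes > 0 )
    {
        size_t len = nr_bytes < max_len ? nr_bytes : max_len;

        if ( count < cap )
            out[count] = len;

        count++;
        nr_bytes -= len;
    }

    return count;
}
```

For example, 25 bytes of padding with a cap of 11 come out as three nops of 11, 11 and 3 bytes.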

> >
> > Also, after this patch is applied it seems like the padding is not
> > going to be 0x90, because as will already emit optimized nops. Are
> > those nops more optimized than the ones added by the alternatives
> > framework?
> 
> They are the same nops (by and large), although arranged differently. 
> For example, GAS fills backwards rather than forwards, which is as
> recommended in the Intel ORM.

So that 'larger' nop instructions are going to be placed at the end of
the region instead of the beginning of it?

> > I would expect the nops added at runtime would be more optimized than
> > the ones added at build time, because the runtime ones could take into
> > account the specific CPU model Xen is running on.
> 
> There are only a very few 64-bit capable CPUs which prefer K8 nops over
> P6 nops, and they are all old.  The build-time nops are correct for the
> overwhelming majority of hardware.

OK, I have no idea about this, but if you think filling with .nop is
not going to introduce a performance penalty over the run-time filling
then that's fine for me.

Roger.

Re: [Xen-devel] [RFC Patch v4 4/8] hvmloader: boot cpu through broadcast

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 08:33:23PM +0800, Chao Gao wrote:
> On Mon, Feb 26, 2018 at 01:28:07AM -0700, Jan Beulich wrote:
> >>>> On 24.02.18 at 06:49,  wrote:
> >> On Fri, Feb 23, 2018 at 04:42:10PM +, Roger Pau Monné wrote:
> >>>On Wed, Dec 06, 2017 at 03:50:10PM +0800, Chao Gao wrote:
> >>>> Intel SDM Extended XAPIC (X2APIC) -> "Initialization by System Software"
> >>>> has the following description:
> >>>>
> >>>> "The ACPI interfaces for the x2APIC are described in Section 5.2, “ACPI System
> >>>> Description Tables,” of the Advanced Configuration and Power Interface
> >>>> Specification, Revision 4.0a (http://www.acpi.info/spec.htm). The default
> >>>> behavior for BIOS is to pass the control to the operating system with the
> >>>> local x2APICs in xAPIC mode if all APIC IDs reported by CPUID.0BH:EDX are less
> >>>> than 255, and in x2APIC mode if there are any logical processor reporting an
> >>>> APIC ID of 255 or greater."
> >>>>
> >>>> In this patch, hvmloader enables x2apic mode for all vcpus if there are cpus
> >>>> with APIC ID > 255. To wake up processors whose APIC ID is greater than 255,
> >>>> the SIPI is broadcasted to all APs. It is the way how Seabios wakes up APs.
> >>>> APs may compete for the stack, thus a lock is introduced to protect the stack.
> >>>
> >>>Hm, how are we going to deal with this on PVH? hvmloader doesn't run
> >>>for PVH guests, hence it seems like switching to x2APIC mode should be
> >>>done somewhere else that shared between HVM and PVH.
> >>>
> >>>Maybe the hypercall that sets the number of vCPUs should change the
> >>>APIC mode?
> >> 
> >> Yes. A flag can be passed when setting the maximum number of vCPUs. Xen
> >> will switch all local APICs to x2APIC mode or xAPIC mode according to
> >> the flag.
> >
> >A flag? Where? Why isn't 257+ vCPU-s on its own sufficient to tell
> >that the mode needs to be switched?
> 
> In struct xen_domctl_max_vcpus, a flag, like SWITCH_TO_X2APIC_MODE, can
> be used to instruct Xen to initialize vlapic and do this switch.
> 
> Yes, it is another option: Xen can do this switch when need. This
> solution leads to smaller code change compared with introducing a new
> flag when setting the maximum number of vCPUs.

Since APIC ID is currently hardcoded in guest_cpuid as vcpu_id * 2,
IMO Xen should switch to x2APIC mode when it detects that vCPUs > 128,
like Jan has suggest. Then you won't need to modify hvmloader at all,
and the same would work for PVH I assume?
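
The threshold above follows from the APIC ID mapping: with APIC ID = vcpu_id * 2, vCPU 127 gets ID 254, which still fits in xAPIC mode, while vCPU 128 gets ID 256, which does not. A minimal sketch of that check follows; the helper names and the `XAPIC_MAX_ID` constant are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* xAPIC IDs are 8 bits wide and 0xff is the broadcast destination. */
#define XAPIC_MAX_ID 254U

/* Mirrors the mapping mentioned above: APIC ID = vcpu_id * 2. */
static inline uint32_t apic_id_of_vcpu(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* Hypothetical helper: would a guest with nr_vcpus vCPUs need x2APIC mode? */
static inline bool domain_needs_x2apic(uint32_t nr_vcpus)
{
    assert(nr_vcpus > 0);

    /* The highest vCPU ID has the highest APIC ID. */
    return apic_id_of_vcpu(nr_vcpus - 1) > XAPIC_MAX_ID;
}
```

Under these assumptions a 128-vCPU guest stays in xAPIC mode and a 129-vCPU guest does not, which matches the "vCPUs > 128" condition.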

Thanks, Roger.

Re: [Xen-devel] [RFC Patch v4 8/8] x86/hvm: bump the maximum number of vcpus to 512

2018-02-26 Thread Chao Gao
On Mon, Feb 26, 2018 at 01:26:42AM -0700, Jan Beulich wrote:
>>>> On 23.02.18 at 19:11,  wrote:
>> On Wed, Dec 06, 2017 at 03:50:14PM +0800, Chao Gao wrote:
>>> Signed-off-by: Chao Gao 
>>> ---
>>>  xen/include/public/hvm/hvm_info_table.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/xen/include/public/hvm/hvm_info_table.h 
>> b/xen/include/public/hvm/hvm_info_table.h
>>> index 08c252e..6833a4c 100644
>>> --- a/xen/include/public/hvm/hvm_info_table.h
>>> +++ b/xen/include/public/hvm/hvm_info_table.h
>>> @@ -32,7 +32,7 @@
>>>  #define HVM_INFO_PADDR   ((HVM_INFO_PFN << 12) + HVM_INFO_OFFSET)
>>>  
>>>  /* Maximum we can support with current vLAPIC ID mapping. */
>>> -#define HVM_MAX_VCPUS128
>>> +#define HVM_MAX_VCPUS512
>> 
>> Wow, that looks like a pretty big jump. I certainly don't have access
>> to any box with this number of vCPUs, so that's going to be quite hard
>> to test. What the reasoning behind this bump? Is hardware with 512
>> ways expected soon-ish?
>> 
>> Also osstest is not even able to test the current limit, so I would
>> maybe bump this to 256, but as I expressed in other occasions I don't
>> feel comfortable with have a number of vCPUs that the current test
>> system doesn't have hardware to test with.
>
>I think implementation limit and supported limit need to be clearly
>distinguished here. Therefore I'd put the question the other way
>around: What's causing the limit to be 512, rather than 1024,
>4096, or even 4G-1 (x2APIC IDs are 32 bits wide, after all)?

TBH, I have no idea. When I chose a value, what came to mind was that
it should be 288, because Intel has the Xeon Phi platform, which has
288 physical threads, and some customers want to use this new platform
for HPC cloud. Furthermore, they request support for a big VM to which
almost all computing and device resources are assigned. They just use
virtualization technology to manage the machines. Given that, I chose
512 because I feel much better if the limit is a power of 2.

You are asking, now that these patches remove limitations imposed by
some components, which component is the next bottleneck and how many
vCPUs it allows. Maybe it would be the use case: no one is requesting
support for more than 288 at this moment. So which value do you prefer,
288 or 512? Or do you think I should find the next bottleneck in Xen's
implementation?
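
The step from 288 to 512 is simply rounding up to the next power of two. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the nearest power of two (n itself if already one). */
uint32_t round_up_pow2(uint32_t n)
{
    uint32_t p = 1;

    /* Keep the result representable in 32 bits. */
    assert(n > 0 && n <= (UINT32_C(1) << 31));

    while ( p < n )
        p <<= 1;

    return p;
}
```

With this helper, 288 rounds up to 512, while the previous limit of 128 is already a power of two and stays unchanged.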

Thanks
Chao

Re: [Xen-devel] [PATCH RFC 1/3] x86/vpt: execute callbacks for masked interrupts

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 06:04:51AM -0700, Jan Beulich wrote:
> >>> On 26.02.18 at 13:48,  wrote:
> > On Mon, Feb 26, 2018 at 12:35:54PM +, Wei Liu wrote:
> >> On Fri, Feb 23, 2018 at 01:27:41PM +, Roger Pau Monne wrote:
> >> >  int pt_update_irq(struct vcpu *v)
> >> >  {
> >> >  struct list_head *head = &v->arch.hvm_vcpu.tm_list;
> >> > +LIST_HEAD(purged);
> >> 
> >> to_purge?
> > 
> > My point is that they have already been purged from the pt->list, but
> > I really don't have a preference.
> > 
> >> >  struct periodic_time *pt, *temp, *earliest_pt;
> >> >  uint64_t max_lag;
> >> >  int irq, is_lapic, pt_vector;
> >> > @@ -267,7 +289,10 @@ int pt_update_irq(struct vcpu *v)
> >> >  {
> >> >  /* suspend timer emulation */
> >> >  list_del(&pt->list);
> >> > -pt->on_list = 0;
> >> > +if ( pt->cb )
> >> > +list_add(&pt->list, &purged);
> >> > +else
> >> > +pt->on_list = 0;
> >> >  }
> >> >  else
> >> >  {
> >> > @@ -283,6 +308,7 @@ int pt_update_irq(struct vcpu *v)
> >> >  if ( earliest_pt == NULL )
> >> >  {
> >> >  spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> >> > +execute_callbacks(v, &purged);
> >> 
> >> It would be better to check if the list is not empty before calling the
> >> function to avoid the extra lock / unlock.
> > 
> > The lock is also protecting the 'purged' list, so I think that for
> > consistency the lock needs to be held before accessing it.
> 
> But that's a local list, isn't it? No-one else can access it.

destroy_periodic_time can still remove items from this list if a
timer that's on the 'purged' list is destroyed between being added to
the list and its callback being executed.

Roger.

Re: [Xen-devel] [PATCH v2 3/7] x86/alt: Clean up the assembly used to generate alternatives

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 11:35:00AM +, Andrew Cooper wrote:
>  * On the C side, switch to using local labels rather than hardcoded numbers.
>  * Rename parameters and labels to be consistent with alt_instr names, and
>    consistent between the C and asm versions.
>  * On the asm side, factor some expressions out into macros to aid clarity.
>  * Consistently declare section attributes.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

Thanks, Roger.

[Xen-devel] [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

2018-02-26 Thread Juergen Gross
Older Xen versions (4.5 and before) might have problems migrating pv
guests with MSR_IA32_SPEC_CTRL having a non-zero value. So zero that
MSR before suspending and restore it after resuming.

Cc: sta...@vger.kernel.org
Signed-off-by: Juergen Gross 
---
 arch/x86/xen/suspend.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index d9f96cc5d743..1d83152c761b 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -1,12 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -15,6 +18,8 @@
 #include "mmu.h"
 #include "pmu.h"
 
+static DEFINE_PER_CPU(u64, spec_ctrl);
+
 void xen_arch_pre_suspend(void)
 {
xen_save_time_memory_area();
@@ -35,6 +40,9 @@ void xen_arch_post_suspend(int cancelled)
 
 static void xen_vcpu_notify_restore(void *data)
 {
+   if (xen_pv_domain() && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+   wrmsrl(MSR_IA32_SPEC_CTRL, this_cpu_read(spec_ctrl));
+
/* Boot processor notified via generic timekeeping_resume() */
if (smp_processor_id() == 0)
return;
@@ -44,7 +52,15 @@ static void xen_vcpu_notify_restore(void *data)
 
 static void xen_vcpu_notify_suspend(void *data)
 {
+   u64 tmp;
+
tick_suspend_local();
+
+   if (xen_pv_domain() && boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
+   rdmsrl(MSR_IA32_SPEC_CTRL, tmp);
+   this_cpu_write(spec_ctrl, tmp);
+   wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+   }
 }
 
 void xen_arch_resume(void)
-- 
2.13.6
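
The save, zero and restore sequence in the patch above can be modelled in plain user-space C. The sketch below is only an illustration: the arrays stand in for the hardware MSR and the per-CPU `spec_ctrl` variable, the `model_*` function names and `NR_CPUS` value are invented for this example, and the `xen_pv_domain()` and `boot_cpu_has()` checks from the patch are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4   /* arbitrary size for this model */

static uint64_t fake_spec_ctrl_msr[NR_CPUS]; /* models MSR_IA32_SPEC_CTRL */
static uint64_t saved_spec_ctrl[NR_CPUS];    /* models the per-CPU spec_ctrl */

/* Models the suspend path: save the current value, then write zero. */
void model_vcpu_notify_suspend(unsigned int cpu)
{
    assert(cpu < NR_CPUS);

    saved_spec_ctrl[cpu] = fake_spec_ctrl_msr[cpu]; /* rdmsrl + this_cpu_write */
    fake_spec_ctrl_msr[cpu] = 0;                    /* wrmsrl(..., 0) */
}

/* Models the resume path: write the saved value back. */
void model_vcpu_notify_restore(unsigned int cpu)
{
    assert(cpu < NR_CPUS);

    fake_spec_ctrl_msr[cpu] = saved_spec_ctrl[cpu]; /* wrmsrl(..., saved) */
}
```

While a CPU is between the two calls, the modelled MSR reads as zero, which is the state the commit message says older Xen versions need for migration.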


Re: [Xen-devel] (partial) Spectre v2 mitigation without on Skylake IBRS

2018-02-26 Thread Juergen Gross
On 26/02/18 14:11, Jan Beulich wrote:
>>>> On 26.02.18 at 13:36,  wrote:
>> On 26/02/18 11:49, Jan Beulich wrote:
>> On 26.02.18 at 11:18,  wrote:
>>>> If this is the case I believe the easiest solution would be to let the
>>>> kernel set the MSR again after leaving suspended state. suspend/resume
>>>> require hooks in pv kernels after all.
>>>
>>> Hmm, this could be leveraged irrespective of what I've written
>>> above - the kernel could then also clear the MSR during suspend,
>>> thus allowing the check in libxc to pass.
>>
>> Something like the attached patch?
> 
> With proper checking added of whether the MSR actually exists,
> yes, I think so. Will need to see how this can be converted to
> something that works on the old XenoLinux trees.

Okay, will post the (modified) patch to lkml.


Juergen

Re: [Xen-devel] [v1 1/1] xen, mm: Allow deferred page initialization for xen pv domains

2018-02-26 Thread Pavel Tatashin
Hi Juergen,

Thank you for taking a look at this patch, I will address your
comments, and send out an updated patch.

>>  extern void default_banner(void);
>>
>> +static inline void paravirt_after_bootmem(void)
>> +{
>> + pv_init_ops.after_bootmem();
>> +}
>> +
>
> Putting this in the paravirt framework is overkill IMO. There is no need
> to patch the callsites for optimal performance.
>
> I'd put it into struct x86_hyper_init and pre-init it with x86_init_noop

Sure, I will move it into x86_hyper_init.

>>
>> +/*
>> + * During early boot all pages are pinned, but we do not have struct pages,
>> + * so return true until struct pages are ready.
>> + */
>
> Uuh, this comment is just not true.
>
> The "pinned" state for Xen means it is a pv pagetable known to Xen. Such
> pages are read-only for the guest and can be modified via hypercalls
> only.
>
> So either the "pinned" state will be tested for page tables only, in
> which case the comment needs adjustment, or the code is wrong.

The comment should state: During early boot all _page table_ pages are pinned

Thank you,
Pavel

Re: [Xen-devel] [RFC Patch v4 4/8] hvmloader: boot cpu through broadcast

2018-02-26 Thread Chao Gao
On Mon, Feb 26, 2018 at 01:28:07AM -0700, Jan Beulich wrote:
>>>> On 24.02.18 at 06:49,  wrote:
>> On Fri, Feb 23, 2018 at 04:42:10PM +, Roger Pau Monné wrote:
>>>On Wed, Dec 06, 2017 at 03:50:10PM +0800, Chao Gao wrote:
>>>> Intel SDM Extended XAPIC (X2APIC) -> "Initialization by System Software"
>>>> has the following description:
>>>>
>>>> "The ACPI interfaces for the x2APIC are described in Section 5.2, “ACPI System
>>>> Description Tables,” of the Advanced Configuration and Power Interface
>>>> Specification, Revision 4.0a (http://www.acpi.info/spec.htm). The default
>>>> behavior for BIOS is to pass the control to the operating system with the
>>>> local x2APICs in xAPIC mode if all APIC IDs reported by CPUID.0BH:EDX are less
>>>> than 255, and in x2APIC mode if there are any logical processor reporting an
>>>> APIC ID of 255 or greater."
>>>>
>>>> In this patch, hvmloader enables x2apic mode for all vcpus if there are cpus
>>>> with APIC ID > 255. To wake up processors whose APIC ID is greater than 255,
>>>> the SIPI is broadcasted to all APs. It is the way how Seabios wakes up APs.
>>>> APs may compete for the stack, thus a lock is introduced to protect the stack.
>>>
>>>Hm, how are we going to deal with this on PVH? hvmloader doesn't run
>>>for PVH guests, hence it seems like switching to x2APIC mode should be
>>>done somewhere else that shared between HVM and PVH.
>>>
>>>Maybe the hypercall that sets the number of vCPUs should change the
>>>APIC mode?
>> 
>> Yes. A flag can be passed when setting the maximum number of vCPUs. Xen
>> will switch all local APICs to x2APIC mode or xAPIC mode according to
>> the flag.
>
>A flag? Where? Why isn't 257+ vCPU-s on its own sufficient to tell
>that the mode needs to be switched?

In struct xen_domctl_max_vcpus, a flag, like SWITCH_TO_X2APIC_MODE, can
be used to instruct Xen to initialize vlapic and do this switch.

Yes, that is another option: Xen can do this switch when needed. This
solution leads to a smaller code change than introducing a new
flag when setting the maximum number of vCPUs.

Thanks
Chao

Re: [Xen-devel] [PATCH v2] x86/HVM: don't give the wrong impression of WRMSR succeeding

2018-02-26 Thread Andrew Cooper
On 23/02/18 08:36, Jan Beulich wrote:
> ... for non-existent MSRs: wrmsr_hypervisor_regs()'s comment clearly
> says that the function returns 0 for unrecognized MSRs, so
> {svm,vmx}_msr_write_intercept() should not convert this into success. We
> don't want to unconditionally fail the access though, as we can't be
> certain the list of handled MSRs is complete enough for the guest types
> we care about, so instead mirror what we do on the read paths and probe
> the MSR to decide whether to raise #GP.
>
> Signed-off-by: Jan Beulich 

Having thought this through:

At the moment, a write to any unhandled MSR is treated as a silent write
discard.  This is terrible behaviour from the guest's point of view.

With this patch in place, a write to any unreadable MSR yields #GP,
which is better behaviour.

The only write-only MSRs I'm aware of are in the x2apic block, and
MSR_PRED_CMD, all of which are explicitly handled.

Therefore, Reviewed-by: Andrew Cooper , as
this is an improvement in behaviour, even if the result still isn't great.
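
The behaviour described above can be summarised in a short model: for a write to an MSR that no handler recognises, probe whether the MSR is readable, and raise #GP only if it is not. The sketch below is an illustration under stated assumptions; the enum, the function names and the table of "readable" MSR indices are all hypothetical and do not reflect Xen's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum wrmsr_result { WRMSR_DISCARDED, WRMSR_FAULT_GP };

/* Arbitrary example indices on which a probing read is assumed to succeed. */
static const uint32_t readable_msrs[] = { 0x10, 0x1b };

/* Models probing the MSR with a read, as on the read paths. */
static bool probe_msr_readable(uint32_t msr)
{
    size_t n = sizeof(readable_msrs) / sizeof(readable_msrs[0]);

    for ( size_t i = 0; i < n; i++ )
        if ( readable_msrs[i] == msr )
            return true;

    return false;
}

/* Write to an MSR that none of the explicit handlers recognised. */
enum wrmsr_result unhandled_wrmsr(uint32_t msr)
{
    /* Readable MSR: the write is still discarded. Otherwise raise #GP. */
    return probe_msr_readable(msr) ? WRMSR_DISCARDED : WRMSR_FAULT_GP;
}
```

This also shows why write-only MSRs must be handled explicitly before reaching this path: a probing read would fail for them and wrongly turn a legitimate write into #GP.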

~Andrew

Re: [Xen-devel] (partial) Spectre v2 mitigation without on Skylake IBRS

2018-02-26 Thread Jan Beulich
>>> On 26.02.18 at 13:36,  wrote:
> On 26/02/18 11:49, Jan Beulich wrote:
> On 26.02.18 at 11:18,  wrote:
>>> If this is the case I believe the easiest solution would be to let the
>>> kernel set the MSR again after leaving suspended state. suspend/resume
>>> require hooks in pv kernels after all.
>> 
>> Hmm, this could be leveraged irrespective of what I've written
>> above - the kernel could then also clear the MSR during suspend,
>> thus allowing the check in libxc to pass.
> 
> Something like the attached patch?

With proper checking added of whether the MSR actually exists,
yes, I think so. Will need to see how this can be converted to
something that works on the old XenoLinux trees.

Jan


[Xen-devel] [xen-unstable test] 120001: tolerable FAIL - PUSHED

2018-02-26 Thread osstest service owner
flight 120001 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120001/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail like 119713
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 119713
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 119713
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 119713
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 119713
 test-armhf-armhf-libvirt 14 saverestore-support-check fail like 119713
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 119713
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119713
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119713
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 119713
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 14 saverestore-support-check fail never pass
 test-arm64-arm64-xl 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 14 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail never pass
 test-arm64-arm64-xl-credit2 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 14 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  a823a5280f25ad19a751dd9a41044f556471e61a
baseline version:
 xen  8f9ccfe93570ecae18d9cc224931787d0bca9c66

Last test of basis   119713  2018-02-20 07:56:20 Z    6 days
Failing since        119785  2018-02-21 02:46:06 Z    5 days    4 attempts
Testing same since   120001  2018-02-25 03:01:38 Z    1 days    1 attempts


People who touched revisions under test:
  Alan Robinson 
  Alexandru Isaila 
  Andrew Cooper 
  

Re: [Xen-devel] ARM64:Porting xen to new hardware

2018-02-26 Thread bharat gohil
On Mon, Feb 26, 2018 at 3:51 PM, Julien Grall  wrote:
> Hi,
>
> What I meant by using '>' for quoting is all my reply should be prefixed
> with '>'. You write your reply normally.
>
> You can do that in gmail by switching the e-mail from HTML to plain text.
>
Ok. Got it.
> On 26/02/18 07:31, bharat gohil wrote:
>>
>> On Thu, Feb 22, 2018 at 4:57 PM, Julien Grall wrote:
>> This looks quite wrong to me. By modifying the interrupt parent
>> property, you also modify which interrupt controller will be used
>> for routing the interrupt. This is probably the reason of the hang
>> you mention below.
>>
>> What are the interrupts controller you have on your platform?
>>
>>  >It has interrupt controller which change the polarity of SPI IRQ before
>> redirect to GIC-400.
>>  >In DTB debug, I got following trace,
>>  >(XEN) irq 0 not connected to primary controller. Connected to
>> /intpol-controller@10220a80.
>>  >I think Xen skip interrupt controller(if other than GIC) while domain
>> creation.
>
>
> Xen will not try to map interrupt that are routed to a different interrupt
> controller. This is because we don't know how to translate the property
> 'regs' for those interrupts.
>
>>  >Do you have suggestion to solve this?
>>  >Do I need to support custom IRQ controller in Xen or hard code the
>> custom controller register in Xen and modified DTB with GIC as primary
>> controller?
>
>
> If any interrupts used by Xen (e.g UART) are behind that custom IRQ
> controller, then you would need to add the driver in Xen.
>
> There was an attempt to provide a framework for hooking custom IRQ
> controller in Xen (see [1]).
>
> You could probably look for doing something similar for your board.
>
Thanks a lot for your hint. I am now able to get a login prompt for Dom0.

> Cheers,
>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2017-04/msg00991.html
>
>
> --
> Julien Grall

Re: [Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available

2018-02-26 Thread Andrew Cooper
On 26/02/18 12:31, Roger Pau Monné wrote:
> On Mon, Feb 26, 2018 at 11:35:04AM +, Andrew Cooper wrote:
>> Newer versions of binutils are capable of emitting an exact number bytes worth
>> of optimised nops.  Use this in preference to .skip when available.
>>
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Jan Beulich 
>> CC: Konrad Rzeszutek Wilk 
>> CC: Roger Pau Monné 
>> CC: Wei Liu 
>>
>> RFC until support is actually committed to binutils mainline.
> Since RFC has been dropped from the subject, is this now committed?

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=62a02d25b6e5d9f92c205260daa11355d0c62532

>
>> ---
>>  xen/arch/x86/Rules.mk |  4 
>>  xen/include/asm-x86/alternative-asm.h | 14 --
>>  xen/include/asm-x86/alternative.h | 13 ++---
>>  3 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
>> index e169d67..bf5047f 100644
>> --- a/xen/arch/x86/Rules.mk
>> +++ b/xen/arch/x86/Rules.mk
>> @@ -28,6 +28,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
>>  $(call as-option-add,CFLAGS,CC,\
>>  ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
>>  
>> +# Check to see whether the assmbler supports the .nop directive.
>> +$(call as-option-add,CFLAGS,CC,\
>> +".L1: .L2: .nop (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
>> +
>>  CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
>>  
>>  # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
>> diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
>> index 25f79fe..9e46bed 100644
>> --- a/xen/include/asm-x86/alternative-asm.h
>> +++ b/xen/include/asm-x86/alternative-asm.h
>> @@ -1,6 +1,8 @@
>>  #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
>>  #define _ASM_X86_ALTERNATIVE_ASM_H_
>>  
>> +#include 
>> +
>>  #ifdef __ASSEMBLY__
>>  
>>  /*
>> @@ -18,6 +20,14 @@
>>  .byte \pad_len
>>  .endm
>>  
>> +.macro mknops nr_bytes
>> +#ifdef HAVE_AS_NOP_DIRECTIVE
>> +.nop \nr_bytes, ASM_NOP_MAX
> I'm not able to find any online document about the .nop directive, and
> I cannot really figure out the purpose of the second parameter.

It's a bit woolly; the parameter is named "control".  In practice, it is
the maximum length of an individual nop.  Beyond 11 bytes (iirc), most
pipelines take a decode stall.
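
As a rough illustration of that second operand, the sketch below splits a padding request into individual nops, none longer than the cap. The function name split_nops and its interface are hypothetical, and the order in which it emits the nops is illustrative only; it is not claimed to match the arrangement the assembler actually produces.

```c
#include <stddef.h>

/*
 * Split nr_bytes of padding into individual nops of at most max_len bytes.
 * The chosen lengths are written to out[] (at most cap entries) and the
 * number of nops is returned. The ordering here is illustrative only.
 */
size_t split_nops(size_t nr_bytes, size_t max_len, size_t *out, size_t cap)
{
    size_t n = 0;

    if (max_len == 0)
        return 0;

    while (nr_bytes > 0 && n < cap) {
        /* Take as much as the cap allows, or whatever is left. */
        size_t len = nr_bytes < max_len ? nr_bytes : max_len;

        out[n++] = len;
        nr_bytes -= len;
    }

    return n;
}
```

For example, 20 bytes of padding with a cap of 9 comes out as three nops of 9, 9 and 2 bytes in this model.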

>
> Also, after this patch is applied it seems like the padding is not
> going to be 0x90, because GNU as will already emit optimized nops. Are
> those nops more optimized than the ones added by the alternatives
> framework?

They are the same nops (by and large), although arranged differently. 
For example, GAS fills backwards rather than forwards, which is as
recommended in the Intel ORM.

> I would expect the nops added at runtime would be more optimized than
> the ones added at build time, because the runtime ones could take into
> account the specific CPU model Xen is running on.

There are only a very few 64-bit capable CPUs which prefer K8 nops over
P6 nops, and they are all old.  The build-time nops are correct for the
overwhelming majority of hardware.

~Andrew

Re: [Xen-devel] [PATCH RFC 1/3] x86/vpt: execute callbacks for masked interrupts

2018-02-26 Thread Jan Beulich
>>> On 26.02.18 at 13:48,  wrote:
> On Mon, Feb 26, 2018 at 12:35:54PM +, Wei Liu wrote:
>> On Fri, Feb 23, 2018 at 01:27:41PM +, Roger Pau Monne wrote:
>> >  int pt_update_irq(struct vcpu *v)
>> >  {
>> >  struct list_head *head = &v->arch.hvm_vcpu.tm_list;
>> > +LIST_HEAD(purged);
>> 
>> to_purge?
> 
> My point is that they have already been purged from the pt->list, but
> I really don't have a preference.
> 
>> >  struct periodic_time *pt, *temp, *earliest_pt;
>> >  uint64_t max_lag;
>> >  int irq, is_lapic, pt_vector;
>> > @@ -267,7 +289,10 @@ int pt_update_irq(struct vcpu *v)
>> >  {
>> >  /* suspend timer emulation */
>> >  list_del(&pt->list);
>> > -pt->on_list = 0;
>> > +if ( pt->cb )
>> > +list_add(&pt->list, &purged);
>> > +else
>> > +pt->on_list = 0;
>> >  }
>> >  else
>> >  {
>> > @@ -283,6 +308,7 @@ int pt_update_irq(struct vcpu *v)
>> >  if ( earliest_pt == NULL )
>> >  {
>> >  spin_unlock(&v->arch.hvm_vcpu.tm_lock);
>> > +execute_callbacks(v, &purged);
>> 
>> It would be better to check if the list is not empty before calling the
>> function to avoid the extra lock / unlock.
> 
> The lock is also protecting the 'purged' list, so I think that for
> consistency the lock needs to be held before accessing it.

But that's a local list, isn't it? No-one else can access it.

Jan


Re: [Xen-devel] [PATCH RFC 00/10] x86 passthrough code cleanup

2018-02-26 Thread Wei Liu
On Mon, Feb 26, 2018 at 01:47:38AM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Feb 23, 2018 at 10:39:20PM -0600, Doug Goldstein wrote:
> > On 2/22/18 11:12 PM, Tian, Kevin wrote:
> > >> From: Wei Liu
> > >> Sent: Thursday, February 22, 2018 5:47 AM
> > >>
> > >> Hi all
> > >>
> > >> At some point I would like to make CONFIG_HVM and CONFIG_PV work.
> > >> The
> > >> passthrough code is one of the road blocks for that work.
> > > 
> > > Can you elaborate the motivation of this change? why does someone
> > > want to disable HVM or PV logic completely from hypervisor?
> > 
> > I can say I recall advocating for this at Xen Summit in Cambridge. I
> > believe I talked about it in Toronto as well. There are a number of
> > users of Xen that would certainly want to ship without all the code
> > associated with PV compiled in. Given the nature of design "compromises"
> > in many parts of x86 systems there is certainly a non-zero sum of people
> > that would likely utilize the ability to remove code that doesn't need
> > to be there. I think every individual on this list who has been involved
> > in the security has been in a room of @intel.com folks has seen features
> > vs security win out many times.
> > 
> > I don't think its a hard stretch of the imagination to see people
> > disabling PV in data centers running newer workloads on PVH and HVM
> > only.
> 
> Yes, definitely disabling PV will be useful. Right after being able to
> use PCI passthrough with PVH.
> 
> > I can see the real question being why HVM? That I would say lies
> > with the direction of discretionary access controls in Xen vs mandatory
> > access controls. To solve for the lack of functionality we've grown
> > things like "dmops" and I could certainly see a product like Qubes
> > running only PVH domains in the future.
> > 
> > Since I picked on Qubes I've CC'd Marek.
> 
> So, is it going to be an option to have CONFIG_HVM=n and CONFIG_PVH=y at
> the same time? While currently we do support Windows, so need
> CONFIG_HVM=y, but I can see in some future/alternative version we could
> have even that disabled. For example right now we do have
> CONFIG_SHADOW_PAGING disabled.
> 

The hypervisor doesn't distinguish HVM from PVH at this point. More work is
needed there. But I expect the debate of what each option covers will
take longer than actually writing the code.

Wei.

Re: [Xen-devel] [PATCH RFC 1/3] x86/vpt: execute callbacks for masked interrupts

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 12:35:54PM +, Wei Liu wrote:
> On Fri, Feb 23, 2018 at 01:27:41PM +, Roger Pau Monne wrote:
> > Execute periodic_time callbacks even if the interrupt is not actually
> > injected because the IRQ is masked.
> > 
> > Current callbacks from emulated timer devices only update emulated
> > registers, which from my reading of the specs should happen regardless
> > of whether the interrupt has been injected or not.
> > 
> > Signed-off-by: Roger Pau Monné 
> > ---
> > Cc: Jan Beulich 
> > Cc: Andrew Cooper 
> > Cc: Stefan Bader 
> > ---
> >  xen/arch/x86/hvm/vpt.c | 30 +-
> >  1 file changed, 29 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
> > index 181f4cb631..1a24fbaa44 100644
> > --- a/xen/arch/x86/hvm/vpt.c
> > +++ b/xen/arch/x86/hvm/vpt.c
> > @@ -247,9 +247,31 @@ static void pt_timer_fn(void *data)
> >  pt_unlock(pt);
> >  }
> >  
> > +static void execute_callbacks(struct vcpu *v, struct list_head *tm)
> > +{
> > > +spin_lock(&v->arch.hvm_vcpu.tm_lock);
> > > +while ( !list_empty(tm) )
> > > +{
> > > +struct periodic_time *pt = list_first_entry(tm, struct periodic_time,
> > > +list);
> > > +time_cb *cb = pt->cb;
> > > +void *cb_priv = pt->priv;
> > > +
> > > +list_del(&pt->list);
> > > +pt->on_list = 0;
> > > +spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> > > +
> > > +cb(v, cb_priv);
> > > +
> > > +spin_lock(&v->arch.hvm_vcpu.tm_lock);
> > > +}
> > > +spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> > +}
> > +
> >  int pt_update_irq(struct vcpu *v)
> >  {
> >  struct list_head *head = >arch.hvm_vcpu.tm_list;
> > +LIST_HEAD(purged);
> 
> to_purge?

My point is that they have already been purged from the pt->list, but
I really don't have a preference.

> >  struct periodic_time *pt, *temp, *earliest_pt;
> >  uint64_t max_lag;
> >  int irq, is_lapic, pt_vector;
> > @@ -267,7 +289,10 @@ int pt_update_irq(struct vcpu *v)
> >  {
> >  /* suspend timer emulation */
> >  list_del(>list);
> > -pt->on_list = 0;
> > +if ( pt->cb )
> > +list_add(>list, );
> > +else
> > +pt->on_list = 0;
> >  }
> >  else
> >  {
> > @@ -283,6 +308,7 @@ int pt_update_irq(struct vcpu *v)
> >  if ( earliest_pt == NULL )
> >  {
> >  spin_unlock(>arch.hvm_vcpu.tm_lock);
> > +execute_callbacks(v, );
> 
> It would be better to check if the list is not empty before calling the
> function to avoid the extra lock / unlock.

The lock is also protecting the 'purged' list, so I think that for
consistency the lock needs to be held before accessing it.

Since this is only an emptiness check *and* there can't be any
additions to the list at this point I guess it would be safe to test
for emptiness without holding the lock, but I find it kind of
confusing.
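
To make the trade-off concrete, here is a small stand-alone model of the pattern, not the Xen code itself. The struct timer type, the tm_lock()/tm_unlock() helpers and the lock_acquisitions counter are all made up for illustration; the real code uses Xen's list and spinlock primitives.

```c
#include <stddef.h>

/* Hypothetical, simplified timer entry on a singly linked "purged" list. */
struct timer {
    struct timer *next;
    void (*cb)(void *priv);
    void *priv;
};

/* Counts lock acquisitions so the cost of a needless call is visible. */
static unsigned int lock_acquisitions;

void tm_lock(void)   { lock_acquisitions++; }
void tm_unlock(void) { }

/* Drain the list, dropping the lock around every callback invocation. */
void execute_callbacks(struct timer **purged)
{
    tm_lock();
    while (*purged != NULL) {
        struct timer *t = *purged;

        *purged = t->next;
        t->next = NULL;
        tm_unlock();

        t->cb(t->priv);

        tm_lock();
    }
    tm_unlock();
}

/* Caller-side check: the list is local, so peeking at it needs no lock. */
void run_purged(struct timer **purged)
{
    if (*purged != NULL)
        execute_callbacks(purged);
}

/* Example callback: bump the integer that priv points to. */
void count_cb(void *priv)
{
    ++*(int *)priv;
}
```

In this model an empty list costs no lock round trip at all, while a list with two entries takes the lock three times: once on entry and once after each callback.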

Thanks, Roger.

Re: [Xen-devel] [PATCH RFC 1/3] x86/vpt: execute callbacks for masked interrupts

2018-02-26 Thread Wei Liu
On Fri, Feb 23, 2018 at 01:27:41PM +, Roger Pau Monne wrote:
> Execute periodic_time callbacks even if the interrupt is not actually
> injected because the IRQ is masked.
> 
> Current callbacks from emulated timer devices only update emulated
> registers, which from my reading of the specs should happen regardless
> of whether the interrupt has been injected or not.
> 
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> Cc: Stefan Bader 
> ---
>  xen/arch/x86/hvm/vpt.c | 30 +-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
> index 181f4cb631..1a24fbaa44 100644
> --- a/xen/arch/x86/hvm/vpt.c
> +++ b/xen/arch/x86/hvm/vpt.c
> @@ -247,9 +247,31 @@ static void pt_timer_fn(void *data)
>  pt_unlock(pt);
>  }
>  
> +static void execute_callbacks(struct vcpu *v, struct list_head *tm)
> +{
> +spin_lock(&v->arch.hvm_vcpu.tm_lock);
> +while ( !list_empty(tm) )
> +{
> +struct periodic_time *pt = list_first_entry(tm, struct periodic_time,
> +list);
> +time_cb *cb = pt->cb;
> +void *cb_priv = pt->priv;
> +
> +list_del(&pt->list);
> +pt->on_list = 0;
> +spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> +
> +cb(v, cb_priv);
> +
> +spin_lock(&v->arch.hvm_vcpu.tm_lock);
> +}
> +spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> +}
> +
>  int pt_update_irq(struct vcpu *v)
>  {
>  struct list_head *head = &v->arch.hvm_vcpu.tm_list;
> +LIST_HEAD(purged);

to_purge?

>  struct periodic_time *pt, *temp, *earliest_pt;
>  uint64_t max_lag;
>  int irq, is_lapic, pt_vector;
> @@ -267,7 +289,10 @@ int pt_update_irq(struct vcpu *v)
>  {
>  /* suspend timer emulation */
>  list_del(&pt->list);
> -pt->on_list = 0;
> +if ( pt->cb )
> +list_add(&pt->list, &purged);
> +else
> +pt->on_list = 0;
>  }
>  else
>  {
> @@ -283,6 +308,7 @@ int pt_update_irq(struct vcpu *v)
>  if ( earliest_pt == NULL )
>  {
>  spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> +execute_callbacks(v, &purged);

It would be better to check if the list is not empty before calling the
function to avoid the extra lock / unlock.

(Haven't checked if the basic premise of this patch is compatible with
the spec)

Wei.

Re: [Xen-devel] (partial) Spectre v2 mitigation without on Skylake IBRS

2018-02-26 Thread Juergen Gross
On 26/02/18 11:49, Jan Beulich wrote:
 On 26.02.18 at 11:18,  wrote:
>> On 26/02/18 10:44, Jan Beulich wrote:
>>> if running PV Linux on older Xen (4.5 and earlier) is relevant, it may be
>>> necessary to use a mechanism other than IBRS to mitigate Spectre v2
>>> on Skylake. That is because the new MSR value can't be migrated
>>> prior to migration v2. Of course one option would be to retrofit some
>>> mechanism into newer Xen versions that makes them accept whatever
>>> extension to e.g. struct hvm_hw_cpu one might want to invent for
>>> the older Xen versions. But that doesn't seem very desirable.
>>
>> Can you please elaborate a little bit more what the real problem is?
>>
>> I _think_ you are referring to the problem that a pv kernel would want
>> to use IBRS for mitigation of Spectre V2 and after a migration that
>> setting would be lost.
> 
> "Lost" is the wrong term imo: A hypervisor that's been patched for
> Spectre v2 (and that's a prereq anyway, because we want the
> kernel to use IBPB, which utilizes an MSR that doesn't need
> migrating) should at least do _something_ with the MSR (when it's
> non-zero). The most natural thing (imo) is to make those older
> hypervisors support XEN_DOMCTL_{get,set}_vcpu_msrs. That in
> turn calls for the tool stack to gain the check that Andrew had
> added to libxc in db24f7f012 ("libxc: use an explicit check for PV
> MSRs in xc_domain_save() "), causing migration to fail when the
> MSR is non-zero on any of the guest's vCPU-s.
> 
>> If this is the case I believe the easiest solution would be to let the
>> kernel set the MSR again after leaving suspended state. suspend/resume
>> require hooks in pv kernels after all.
> 
> Hmm, this could be leveraged irrespective of what I've written
> above - the kernel could then also clear the MSR during suspend,
> thus allowing the check in libxc to pass.

Something like the attached patch?


Juergen

From 5a03c1e0a21f3a5a3f2228e5df7150d3f3be6e1f Mon Sep 17 00:00:00 2001
From: Juergen Gross 
Date: Mon, 26 Feb 2018 13:10:55 +0100
Subject: [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

Older Xen versions (before 4.5) might have problems migrating pv guests
with MSR_IA32_SPEC_CTRL having a non-zero value. So before suspending
zero that MSR and restore it after being resumed.

Signed-off-by: Juergen Gross 
---
 arch/x86/xen/suspend.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index d9f96cc5d743..2eb3069fd413 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -15,6 +15,8 @@
 #include "mmu.h"
 #include "pmu.h"
 
+static DEFINE_PER_CPU(u64, spec_ctrl);
+
 void xen_arch_pre_suspend(void)
 {
 	xen_save_time_memory_area();
@@ -35,6 +37,9 @@ void xen_arch_post_suspend(int cancelled)
 
 static void xen_vcpu_notify_restore(void *data)
 {
+	if (xen_pv_domain())
+		wrmsrl(MSR_IA32_SPEC_CTRL, this_cpu_read(spec_ctrl));
+
 	/* Boot processor notified via generic timekeeping_resume() */
 	if (smp_processor_id() == 0)
 		return;
@@ -44,7 +49,15 @@ static void xen_vcpu_notify_restore(void *data)
 
 static void xen_vcpu_notify_suspend(void *data)
 {
+	u64 tmp;
+
 	tick_suspend_local();
+
+	if (xen_pv_domain()) {
+		rdmsrl(MSR_IA32_SPEC_CTRL, tmp);
+		this_cpu_write(spec_ctrl, tmp);
+		wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	}
 }
 
 void xen_arch_resume(void)
-- 
2.13.6

Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-26 Thread Juergen Gross
On 26/02/18 13:06, Andrii Anisov wrote:
> Hello Juergen,
> 
> 
> On 26.02.18 13:08, Juergen Gross wrote:
>> Today the hvc console is added as a preferred console for pv domUs
>> only. As this requires a boot parameter for getting dom0 messages per
>> default add it for dom0, too.
>>
>> Signed-off-by: Juergen Gross 
>> ---
>>   arch/x86/xen/enlighten_pv.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
> Is this something x86 specific? Could it be a generic approach?

In case ARM wants something similar I guess the test for
xen_initial_domain() should be dropped in xen_early_init().

Stefano?


Juergen

Re: [Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 11:35:04AM +, Andrew Cooper wrote:
> Newer versions of binutils are capable of emitting an exact number bytes worth
> of optimised nops.  Use this in preference to .skip when available.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Konrad Rzeszutek Wilk 
> CC: Roger Pau Monné 
> CC: Wei Liu 
> 
> RFC until support is actually committed to binutils mainline.

Since RFC has been dropped from the subject, is this now committed?

> ---
>  xen/arch/x86/Rules.mk |  4 
>  xen/include/asm-x86/alternative-asm.h | 14 --
>  xen/include/asm-x86/alternative.h | 13 ++---
>  3 files changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index e169d67..bf5047f 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -28,6 +28,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
>  $(call as-option-add,CFLAGS,CC,\
>  ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
>  
> +# Check to see whether the assmbler supports the .nop directive.
> +$(call as-option-add,CFLAGS,CC,\
> +".L1: .L2: .nop (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
> +
>  CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
>  
>  # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
> diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
> index 25f79fe..9e46bed 100644
> --- a/xen/include/asm-x86/alternative-asm.h
> +++ b/xen/include/asm-x86/alternative-asm.h
> @@ -1,6 +1,8 @@
>  #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
>  #define _ASM_X86_ALTERNATIVE_ASM_H_
>  
> +#include 
> +
>  #ifdef __ASSEMBLY__
>  
>  /*
> @@ -18,6 +20,14 @@
>  .byte \pad_len
>  .endm
>  
> +.macro mknops nr_bytes
> +#ifdef HAVE_AS_NOP_DIRECTIVE
> +.nop \nr_bytes, ASM_NOP_MAX

I'm not able to find any online document about the .nop directive, and
I cannot really figure out the purpose of the second parameter.

Also, after this patch is applied it seems like the padding is not
going to be 0x90, because GNU as will already emit optimized nops. Are
those nops more optimized than the ones added by the alternatives
framework?

I would expect the nops added at runtime would be more optimized than
the ones added at build time, because the runtime ones could take into
account the specific CPU model Xen is running on.

Thanks, Roger.

Re: [Xen-devel] [PATCH v2] x86/alt: Support for automatic padding calculations

2018-02-26 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 11:24:55AM +, Andrew Cooper wrote:
> The correct amount of padding in an origin patch site can be calculated
> automatically, based on the relative lengths of the replacements.
> 
> This requires a bit of trickery to calculate correctly, especially in the
> ALTENRATIVE_2 case where a branchless max() calculation in needed.  The
> calculation is further complicated because GAS's idea of true is -1 rather
> than 1, which is why the extra negations are required.
> 
> Additionally, have apply_alternatives() attempt to optimise the padding nops.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Konrad Rzeszutek Wilk 
> CC: Roger Pau Monné 
> CC: Wei Liu 
> 
> v2: Fix build with Clang.
> ---
>  xen/arch/x86/Rules.mk |  4 +++
>  xen/arch/x86/alternative.c| 32 ---
>  xen/include/asm-x86/alternative-asm.h | 60 +--
>  xen/include/asm-x86/alternative.h | 46 +--
>  4 files changed, 120 insertions(+), 22 deletions(-)
> 
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index 9897dea..e169d67 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -24,6 +24,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
>   -U__OBJECT_LABEL__ -DHAVE_GAS_QUOTED_SYM \
>   '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@')
>  
> +# GCC's idea of true is -1.  Clang's idea is 1

Nit: that should be GNU as rather than GCC itself?

>  #define alt_orig_len   "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
> +#define alt_pad_len"(.LXEN%=_orig_p - .LXEN%=_orig_e)"
> +#define alt_total_len  "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
>  #define alt_repl_s(num)".LXEN%=_repl_s"#num
>  #define alt_repl_e(num)".LXEN%=_repl_e"#num
>  #define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
>  
> +/* GCC's idea of true is -1, while Clang's idea is 1. */
> +#ifdef HAVE_AS_NEGATIVE_TRUE
> +# define AS_TRUE "-"
> +#else
> +# define AS_TRUE ""
> +#endif
> +
> +#define as_max(a, b) "(("a") ^ ((("a") ^ ("b")) & -("AS_TRUE"(("a") < ("b")))))"
> +
> +#define OLDINSTR_1(oldinstr, n1) \
> +".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"  \
> +".LXEN%=_diff = "alt_repl_len(n1)"-"alt_orig_len"\n\t"   \
> +".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
> +".LXEN%=_orig_p:\n\t"
> +
> +#define ALT_PADDING_LEN(n1, n2) \
> +as_max((alt_repl_len(n1), alt_repl_len(n2))"-"alt_orig_len
> +
> +#define OLDINSTR_2(oldinstr, n1, n2) \
> +".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"  \
> +".LXEN%=_diff = "ALT_PADDING_LEN(n1, n2)"\n\t"   \
> +".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
> +".LXEN%=_orig_p:\n\t"

OLDINSTR_1 is mostly the same as OLDINSTR_2, I wonder whether:

#define OLDINSTR(oldinstr, pad)  \
".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"  \
".LXEN%=_diff = " pad "\n\t" \
".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
".LXEN%=_orig_p:\n\t"

and then:

#define OLDINSTR_1(oldinstr, n1) \
OLDINSTR(oldinstr, alt_repl_len(n1)"-"alt_orig_len)
#define OLDINSTR_2(oldinstr, n1, n2) \
OLDINSTR(oldinstr, ALT_PADDING_LEN(n1, n2))

Wouldn't work? That seems to avoid some code repetition.
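
For reference, the arithmetic behind as_max() and the .skip expression in the quoted macros can be modelled in C roughly as follows. This is only a sketch: the truth parameter stands in for the assembler's value of a true comparison (-1 or 1, as the patch describes), and the function names as_cmp_lt, as_max and pad_len are chosen here for illustration.

```c
#include <stdint.h>

/* Model an assembler comparison: yields truth when a < b, 0 otherwise. */
int64_t as_cmp_lt(int64_t a, int64_t b, int64_t truth)
{
    return a < b ? truth : 0;
}

/* Branchless max(): a ^ ((a ^ b) & mask), where mask is all ones iff a < b. */
int64_t as_max(int64_t a, int64_t b, int64_t truth)
{
    int64_t sign = truth < 0 ? -1 : 1;               /* role of AS_TRUE */
    int64_t mask = -(sign * as_cmp_lt(a, b, truth)); /* 0 or all ones */

    return a ^ ((a ^ b) & mask);
}

/* Padding length: (diff > 0) * diff, normalised for the truth value. */
int64_t pad_len(int64_t orig_len, int64_t repl_len, int64_t truth)
{
    int64_t sign = truth < 0 ? -1 : 1;
    int64_t diff = repl_len - orig_len;
    int64_t gt = diff > 0 ? truth : 0;

    return sign * gt * diff;
}
```

Either truth value gives the same result, which is the point of the extra negation: as_max(3, 5, ...) is 5 and a 2-byte original with a 5-byte replacement needs 3 bytes of padding, whichever convention the assembler uses.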

Thanks, Roger.

Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum thread stack size for watch thread

2018-02-26 Thread Ian Jackson
Wei Liu writes ("Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum 
thread stack size for watch thread"):
> It is already enclosed in CONFIG_Linux. I think that should be enough.

Oh, I see.  I had read USE_DLSYM as CONFIG_DLSYM, ie "dlsym is
available".  A better name might be USE_DLSYM_MINSTACK_HACK but I'm
happy with the patch as is.

Acked-by: Ian Jackson 

I would like a test report from Jim, although the thread suggests that
Jim's *actual* problem was something else so that might not be
applicable.

Ian.

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-26 Thread Ian Jackson
Wei Liu writes ("Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static 
shared memory areas during domain destruction"):
> On Mon, Feb 12, 2018 at 03:24:26PM +, Julien Grall wrote:
> > In any case, the worst that could happen is the unmap is called twice on the
> > same region. So you would get spurious error message. Not that bad.
> 
> Yeah, not that bad. Not going to be a security issue, not going to leak
> resources in the end.
> 
> To avoid spurious unmap, can we maybe unmap the pages after the xenstore
> transaction is committed? In that case, only the successful one gets to
> unmap, the ones that aren't committed will bail.
> 
> (Just tossing around ideas)

It should be the other way around.  Because, your way, if your process
crashes for some reason between the xenstore commit and the unmap, the
memory is leaked.

Instead, do the unmap first.  Check the error code to see if it means
"this was already unmapped" and if so report that only via a debug log
message.
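
Roughly, and only as a sketch: the struct region type, region_unmap() and its -ENOENT return value below are hypothetical stand-ins, not libxl or libxc interfaces. The point is merely the ordering and the handling of the "already unmapped" case.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state of one static shared memory area. */
struct region {
    bool mapped;
};

/* Hypothetical unmap primitive: 0 on success, -ENOENT if already unmapped. */
int region_unmap(struct region *r)
{
    if (!r->mapped)
        return -ENOENT;

    r->mapped = false;
    return 0;
}

enum severity { SEV_NONE, SEV_DEBUG, SEV_ERROR };

/*
 * Unmap first, before any bookkeeping is committed. "Already unmapped" is
 * reported at debug level and treated as success; anything else is an error.
 */
enum severity destroy_region(struct region *r, int *rc_out)
{
    int rc = region_unmap(r);

    if (rc == -ENOENT) {
        *rc_out = 0;
        return SEV_DEBUG;
    }

    *rc_out = rc;
    return rc ? SEV_ERROR : SEV_NONE;
}
```

A second destroy attempt on the same region then ends up as a debug message rather than a spurious error, and a crash after the unmap cannot leak the mapping.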

Ian.

Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-26 Thread Andrii Anisov

Hello Juergen,


On 26.02.18 13:08, Juergen Gross wrote:

> Today the hvc console is added as a preferred console for pv domUs
> only. As this requires a boot parameter for getting dom0 messages per
> default add it for dom0, too.
>
> Signed-off-by: Juergen Gross 
> ---
>   arch/x86/xen/enlighten_pv.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c

Is this something x86 specific? Could it be a generic approach?

> index c047f42552e1..d27740a80c5e 100644
> --- a/arch/x86/xen/enlighten_pv.c
> +++ b/arch/x86/xen/enlighten_pv.c
> @@ -1377,7 +1377,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
> if (!xen_initial_domain()) {
> add_preferred_console("xenboot", 0, NULL);
> add_preferred_console("tty", 0, NULL);
> -   add_preferred_console("hvc", 0, NULL);
> if (pci_xen)
> x86_init.pci.arch_init = pci_xen_init;
> } else {
> @@ -1410,6 +1409,9 @@ asmlinkage __visible void __init xen_start_kernel(void)
>
>   		xen_boot_params_init_edd();
> }
> +
> +   add_preferred_console("hvc", 0, NULL);
> +
>   #ifdef CONFIG_PCI
> /* PCI BIOS service won't work from a PV guest. */
> pci_probe &= ~PCI_PROBE_BIOS;


--

*Andrii Anisov*



Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum thread stack size for watch thread

2018-02-26 Thread Wei Liu
On Mon, Feb 26, 2018 at 12:03:29PM +, Ian Jackson wrote:
> Wei Liu writes ("Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get 
> minimum thread stack size for watch thread"):
> > I don't think FreeBSD needs this particular workaround for glibc FWIW.
> 
> Indeed.
> 
> Err, I guess we should have a configure test of some kind then ?
> 

It is already enclosed in CONFIG_Linux. I think that should be enough.

Wei.


Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum thread stack size for watch thread

2018-02-26 Thread Ian Jackson
Wei Liu writes ("Re: [Xen-devel] [PATCH v2] tools/xenstore: try to get minimum 
thread stack size for watch thread"):
> I don't think FreeBSD needs this particular workaround for glibc FWIW.

Indeed.

Err, I guess we should have a configure test of some kind then ?

The patch looks good.

Ian.


Re: [Xen-devel] RTDS with extra time issue

2018-02-26 Thread Andrii Anisov

Hello Dario,

On 22.02.18 19:53, Dario Faggioli wrote:

> As I said already, improving the accounting would be more than welcome.
> If you're planning on doing something like this already, I'll be happy
> to look at the patches. :-)

First I have to document my findings and draw some conclusions about the
applicability of Xen to building systems with real-time requirements.

Then, hopefully, I will get to that.

--

*Andrii Anisov*




Re: [Xen-devel] libxl - avoid calling block script

2018-02-26 Thread Ian Jackson
Marek Marczykowski-Górecki writes ("Re: [Xen-devel] libxl - avoid calling block 
script"):
> On Fri, Feb 09, 2018 at 11:03:55AM +, Roger Pau Monné wrote:
> > Really adding Ian and Wei.
> > 
> > On Fri, Feb 09, 2018 at 10:55:24AM +, Roger Pau Monné wrote:
> > > So the problem is creation time for domains that have quite a lot of
> > > disks attached. Adding Ian and Wei who know more about the async
> > > dispatch system, but I think (at least from a technical PoV) it
> > > should be possible to parallelize device attachment and thus hotplug
> > > script execution. Devices are independent from each other.
> 
> In theory yes, but in practice the block script (at least on Linux) takes a
> lock and serializes execution...

Indeed.

> > > Also the Linux hotplug scripts in general seem extremely convoluted,
> > > I'm not sure whether we could gain some speed there just by
> > > simplification.
> 
> Well, we're comparing a bunch of fork+exec(), including starting bash
> (default /bin/sh on most systems), with just a single stat() call...
> Handling scripts in libxl itself also takes some time (in my case libxl
> lives in libvirt, which may or may not have an impact). For a domU with
> 4 disks, getting rid of hotplug scripts saved about 2s of startup time.

The scripts themselves are terribly, terribly slow.  They are, as Roger
says, incredibly convoluted.  I'm sure your 2s is right, but almost all
of that will be actual script execution.

I am not opposed to moving the functionality for the very simplest case
into libxl.
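
For reference, the "single stat() call" mentioned in the quoted text could look roughly like the C sketch below. `phys_dev_numbers()` is a hypothetical helper, not an existing libxl function, and `<sys/sysmacros.h>` assumes a Linux/glibc system.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: obtain the major/minor numbers of a block device
 * with one stat() call, without forking a script or taking a lock.
 * Returns 0 on success and -1 if the path is missing or not a block device.
 */
static int phys_dev_numbers(const char *path, unsigned int *maj,
                            unsigned int *mnr)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;              /* path missing or not accessible */
    if (!S_ISBLK(st.st_mode))
        return -1;              /* not a block device */

    *maj = major(st.st_rdev);
    *mnr = minor(st.st_rdev);
    return 0;
}
```

This covers only the simplest case of a plain physical block device; anything needing setup (loop devices, network storage) would still need a script.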

But I think from your pov it would be worth trying a simple shell
script which doesn't take a lock, but just provides the physical
device information.

Ian.


Re: [Xen-devel] [PATCH v4] vmx/hap: optimize CR4 trapping

2018-02-26 Thread Tian, Kevin
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: Monday, February 26, 2018 6:05 PM
> 
> There are a bunch of bits in CR4 that should be allowed to be set directly
> by the guest without requiring Xen intervention. Currently this is
> already done by passing through guest writes into the CR4 used when
> running in non-root mode, but taking an expensive vmexit in order to
> do so.
> 
> xenalyze reports the following when running a PV guest in shim mode:
> 
>  CR_ACCESS   3885950  6.41s 17.04%  3957 cyc { 2361| 3378| 7920}
>    cr4       3885940  6.41s 17.04%  3957 cyc { 2361| 3378| 7920}
>    cr3             1  0.00s  0.00%  3480 cyc { 3480| 3480| 3480}
>      *[  0]        1  0.00s  0.00%  3480 cyc { 3480| 3480| 3480}
>    cr0             7  0.00s  0.00%  7112 cyc { 3248| 5960|17480}
>    clts            2  0.00s  0.00%  4588 cyc { 3456| 5720| 5720}
> 
> After this change this turns into:
> 
>  CR_ACCESS        12  0.00s  0.00%  9972 cyc { 3680|11024|24032}
>    cr4             2  0.00s  0.00% 17528 cyc {11024|24032|24032}
>    cr3             1  0.00s  0.00%  3680 cyc { 3680| 3680| 3680}
>      *[  0]        1  0.00s  0.00%  3680 cyc { 3680| 3680| 3680}
>    cr0             7  0.00s  0.00%  9209 cyc { 4184| 7848|17488}
>    clts            2  0.00s  0.00%  8232 cyc { 5352|2|2}
> 
> Note that this optimized trapping is currently only applied to guests
> running with HAP on Intel hardware. If using shadow paging more CR4
> bits need to be unconditionally trapped, which makes this approach
> unlikely to yield any important performance improvements.
> 
> Reported-by: Andrew Cooper 
> Signed-off-by: Roger Pau Monné 
> Acked-by: Razvan Cojocaru 

Reviewed-by: Kevin Tian 
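
For readers unfamiliar with the mechanism, the C sketch below models how the VMX CR4 guest/host mask decides whether a guest write to CR4 exits to the hypervisor. It is a model of the hardware rule only, not code from the patch. The function name is hypothetical, and the mask used in the usage note is an illustration; which bits the patch actually leaves to the guest is not stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_PGE   (UINT64_C(1) << 7)    /* CR4.PGE, bit 7 */
#define CR4_VMXE  (UINT64_C(1) << 13)   /* CR4.VMXE, bit 13 */

/*
 * A bit set in `mask` is owned by the host.  A guest write causes a
 * vmexit only if it changes a host-owned bit relative to the read shadow;
 * writes that touch only guest-owned bits go straight through.
 */
static bool cr4_write_causes_vmexit(uint64_t mask, uint64_t read_shadow,
                                    uint64_t new_val)
{
    return ((new_val ^ read_shadow) & mask) != 0;
}
```

With a mask of only `CR4_VMXE`, a guest toggling `CR4_PGE` does not exit, whereas with an all-ones mask every change exits, which corresponds to the "before" numbers quoted above.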

Re: [Xen-devel] [PATCH 5/7] public / x86: introduce __HYPERCALL_iommu_op

2018-02-26 Thread Tian, Kevin
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: Monday, February 26, 2018 5:57 PM
> 
> > -Original Message-
> > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > Sent: 24 February 2018 02:57
> > To: Paul Durrant ; xen-
> de...@lists.xenproject.org
> > Cc: Stefano Stabellini ; Wei Liu
> > ; George Dunlap ;
> > Andrew Cooper ; Ian Jackson
> > ; Tim (Xen.org) ; Jan Beulich
> > ; Daniel De Graaf 
> > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > __HYPERCALL_iommu_op
> >
> > > From: Paul Durrant [mailto:paul.durr...@citrix.com]
> > > Sent: Friday, February 23, 2018 5:41 PM
> > >
> > > > -Original Message-
> > > > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > > > Sent: 23 February 2018 05:17
> > > > To: Paul Durrant ; xen-
> > > de...@lists.xenproject.org
> > > > Cc: Stefano Stabellini ; Wei Liu
> > > > ; George Dunlap ;
> > > > Andrew Cooper ; Ian Jackson
> > > > ; Tim (Xen.org) ; Jan Beulich
> > > > ; Daniel De Graaf 
> > > > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > > > __HYPERCALL_iommu_op
> > > >
> > > > > From: Paul Durrant [mailto:paul.durr...@citrix.com]
> > > > > Sent: Tuesday, February 13, 2018 5:23 PM
> > > > >
> > > > > > -Original Message-
> > > > > > From: Tian, Kevin [mailto:kevin.t...@intel.com]
> > > > > > Sent: 13 February 2018 06:43
> > > > > > To: Paul Durrant ; xen-
> > > > > de...@lists.xenproject.org
> > > > > > Cc: Stefano Stabellini ; Wei Liu
> > > > > > ; George Dunlap
> ;
> > > > > > Andrew Cooper ; Ian Jackson
> > > > > > ; Tim (Xen.org) ; Jan
> Beulich
> > > > > > ; Daniel De Graaf 
> > > > > > Subject: RE: [Xen-devel] [PATCH 5/7] public / x86: introduce
> > > > > > __HYPERCALL_iommu_op
> > > > > >
> > > > > > > From: Paul Durrant
> > > > > > > Sent: Monday, February 12, 2018 6:47 PM
> > > > > > >
> > > > > > > > This patch introduces the boilerplate for a new hypercall to allow a
> > > > > > > > domain to control IOMMU mappings for its own pages.
> > > > > > > > Whilst there is duplication of code between the native and compat entry
> > > > > > > > points which appears ripe for some form of combination, I think it is
> > > > > > > > better to maintain the separation as-is because the compat entry point
> > > > > > > > will necessarily gain complexity in subsequent patches.
> > > > > > > >
> > > > > > > > NOTE: This hypercall is only implemented for x86 and is currently
> > > > > > > >   restricted by XSM to dom0 since it could be used to cause IOMMU
> > > > > > > >   faults which may bring down a host.
> > > > > > > Signed-off-by: Paul Durrant 
> > > > > > [...]
> > > > > > > +
> > > > > > > +
> > > > > > > +static bool can_control_iommu(void)
> > > > > > > +{
> > > > > > > +struct domain *currd = current->domain;
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * IOMMU mappings cannot be manipulated if:
> > > > > > > + * - the IOMMU is not enabled or,
> > > > > > > + * - the IOMMU is passed through or,
> > > > > > > + * - shared EPT configured or,
> > > > > > > + * - Xen is maintaining an identity map.
> > > > > >
> > > > > > "for dom0"
> > > > > >
> > > > > > > + */
> > > > > > > +if ( !iommu_enabled || iommu_passthrough ||
> > > > > > > + iommu_use_hap_pt(currd) || need_iommu(currd) )
> > > > > >
> > > > > > I guess it's clearer to directly check iommu_dom0_strict here
> > > > >
> > > > > > Well, the problem with that is that it totally ties this interface to dom0.
> > > > > > Whilst, in practice, that is the case at the moment (because of the xsm
> > > > > > check) I do want to leave the potential to allow other PV domains to control
> > > > > > their IOMMU mappings, if that makes sense in future.
> > > > >
> > > >
> > > > first it's inconsistent from the comments - "Xen is maintaining
> > > > an identity map" which only applies to dom0.
> > >
> > > > That's not true. If I assign a PCI device to an HVM domain, for instance,
> > > > then need_iommu() is true for that domain and indeed Xen maintains a 1:1
> > > > BFN:GFN map for that domain.
> > >
> > > >
> > > > > second I'm afraid !need_iommu is not an accurate condition to represent
> > > > > PV domain. what about iommu also enabled for future PV domains?
> > > >
> > >
> > > I don't quite follow... need_iommu is a per-domain flag, set for dom0
> when

Re: [Xen-devel] pvh+vcpus startup issue

2018-02-26 Thread Juergen Gross
On 22/02/18 21:38, x...@randomwebstuff.com wrote:
> 
> On 22/02/18 6:35 PM, Juergen Gross wrote:
>> On 22/02/18 05:37, x...@randomwebstuff.com wrote:
>>> Hi.  I have a domU.  Its params file has: vcpus = 8.  It will start with
>>> pv, but not type="pvh".  It will not start (on pvh) with vcpus = 7 or 6
>>> or 5.  It does start with vcpus = 4.
>>>
>>> I diffed the xl -v create logs, no difference there on either startup.
>>>
>>> I grabbed the domU console output for a vcpus = 5 start (attached).  It
>>> dies right after:
>>>
>>> [    0.007110] cpu 3 spinlock event irq 23
>>> [    0.007336] installing Xen timer for CPU 4
>> Can you please post the hypervisor log ("xl dmesg")?
>>
>>
>> Juergen
> 
> Attached.

Can you please try again with "loglvl=all guest_loglvl=all" in the
hypervisor's boot parameters, and post the "xl dmesg" output captured after
the pvh guest has failed?


Juergen


[Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available

2018-02-26 Thread Andrew Cooper
Newer versions of binutils are capable of emitting an exact number of bytes'
worth of optimised nops.  Use this in preference to .skip when available.
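
To show what "optimised nops" buys over `.skip`, the C sketch below fills a buffer both ways. It is an illustration only: the table holds the commonly documented Intel multi-byte nop encodings, `NOP_MAX` mirrors ASM_NOP_MAX from the patch, and the greedy longest-first strategy is an assumption, since the byte sequences the assembler actually chooses are not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_MAX 9   /* mirrors ASM_NOP_MAX in the patch */

/* Multi-byte nop encodings, indexed by instruction length in bytes. */
static const uint8_t nops[NOP_MAX + 1][NOP_MAX] = {
    [1] = { 0x90 },
    [2] = { 0x66, 0x90 },
    [3] = { 0x0f, 0x1f, 0x00 },
    [4] = { 0x0f, 0x1f, 0x40, 0x00 },
    [5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    [6] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    [7] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
    [8] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
    [9] = { 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* What ".skip len, 0x90" produces: len single-byte nops. */
static void fill_skip(uint8_t *buf, size_t len)
{
    memset(buf, 0x90, len);
}

/*
 * Fill exactly len bytes, longest nop first.  Returns the number of nop
 * instructions written, which is what the CPU has to decode and retire.
 */
static size_t fill_optimised(uint8_t *buf, size_t len)
{
    size_t count = 0;

    while (len) {
        size_t n = len < NOP_MAX ? len : NOP_MAX;

        memcpy(buf, nops[n], n);
        buf += n;
        len -= n;
        count++;
    }

    return count;
}
```

For a 45-byte pad, `fill_skip()` leaves 45 one-byte instructions while `fill_optimised()` writes 5 nine-byte ones.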

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Konrad Rzeszutek Wilk 
CC: Roger Pau Monné 
CC: Wei Liu 

RFC until support is actually committed to binutils mainline.
---
 xen/arch/x86/Rules.mk |  4 
 xen/include/asm-x86/alternative-asm.h | 14 --
 xen/include/asm-x86/alternative.h | 13 ++---
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index e169d67..bf5047f 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -28,6 +28,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
 $(call as-option-add,CFLAGS,CC,\
 ".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
 
+# Check to see whether the assembler supports the .nop directive.
+$(call as-option-add,CFLAGS,CC,\
+".L1: .L2: .nop (.L2 - .L1)$$(comma)9",-DHAVE_AS_NOP_DIRECTIVE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 25f79fe..9e46bed 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_ALTERNATIVE_ASM_H_
 #define _ASM_X86_ALTERNATIVE_ASM_H_
 
+#include 
+
 #ifdef __ASSEMBLY__
 
 /*
@@ -18,6 +20,14 @@
 .byte \pad_len
 .endm
 
+.macro mknops nr_bytes
+#ifdef HAVE_AS_NOP_DIRECTIVE
+.nop \nr_bytes, ASM_NOP_MAX
+#else
+.skip \nr_bytes, 0x90
+#endif
+.endm
+
 #define orig_len   (.L\@_orig_e   - .L\@_orig_s)
 #define pad_len(.L\@_orig_p   - .L\@_orig_e)
 #define total_len  (.L\@_orig_p   - .L\@_orig_s)
@@ -43,7 +53,7 @@
  */
 .L\@_diff = repl_len(1) - orig_len
 
-.skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+ mknops (as_true(.L\@_diff > 0) * .L\@_diff)
 .L\@_orig_p:
 
 .pushsection .altinstructions, "a", @progbits
@@ -76,7 +86,7 @@
  */
 .L\@_diff = as_max(repl_len(1), repl_len(2)) - orig_len
 
- .skip as_true(.L\@_diff > 0) * .L\@_diff, 0x90
+ mknops (as_true(.L\@_diff > 0) * .L\@_diff)
 .L\@_orig_p:
 
 .pushsection .altinstructions, "a", @progbits
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index d53cea0..51538b6 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -2,7 +2,6 @@
 #define __X86_ALTERNATIVE_H__
 
 #include 
-#include 
 
 #ifndef __ASSEMBLY__
 #include 
@@ -27,6 +26,14 @@ extern void apply_alternatives(const struct alt_instr *start,
const struct alt_instr *end);
 extern void alternative_instructions(void);
 
+asm ( ".macro mknops nr_bytes\n\t"
+#ifdef HAVE_AS_NOP_DIRECTIVE
+  ".nop \\nr_bytes, " __stringify(ASM_NOP_MAX) "\n\t"
+#else
+  ".skip \\nr_bytes, 0x90\n\t"
+#endif
+  ".endm\n\t" );
+
 #define alt_orig_len   "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len"(.LXEN%=_orig_p - .LXEN%=_orig_e)"
 #define alt_total_len  "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
@@ -46,7 +53,7 @@ extern void alternative_instructions(void);
 #define OLDINSTR_1(oldinstr, n1) \
 ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"  \
 ".LXEN%=_diff = "alt_repl_len(n1)"-"alt_orig_len"\n\t"   \
-".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+"mknops ("AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff)\n\t"\
 ".LXEN%=_orig_p:\n\t"
 
 #define ALT_PADDING_LEN(n1, n2) \
@@ -55,7 +62,7 @@ extern void alternative_instructions(void);
 #define OLDINSTR_2(oldinstr, n1, n2) \
 ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"  \
 ".LXEN%=_diff = "ALT_PADDING_LEN(n1, n2)"\n\t"   \
-".skip "AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff, 0x90\n\t" \
+"mknops ("AS_TRUE"(.LXEN%=_diff > 0) * .LXEN%=_diff)\n\t"\
 ".LXEN%=_orig_p:\n\t"
 
 #define ALTINSTR_ENTRY(feature, num)\
-- 
2.1.4



[Xen-devel] [PATCH v2 5/7] x86/alt: Support for automatic padding calculations

2018-02-26 Thread Andrew Cooper
The correct amount of padding in an origin patch site can be calculated
automatically, based on the relative lengths of the replacements.

This requires a bit of trickery to calculate correctly, especially in the
ALTERNATIVE_2 case where a branchless max() calculation is needed.  The
calculation is further complicated because GAS's idea of true is -1 rather
than 1, which is why the extra negations are required.
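
The arithmetic can be checked in plain C. The sketch below models an assembler whose comparisons yield -1 for true; `gas_lt()` and `gas_gt()` are hypothetical helpers standing in for the assembler's `<` and `>` operators, while `as_true()`, `as_max()` and the padding expression follow the shape used in the patch.

```c
#include <assert.h>

/* Model of assembler comparisons: true is -1 (all bits set), false is 0. */
static long gas_lt(long a, long b) { return a < b ? -1 : 0; }
static long gas_gt(long a, long b) { return a > b ? -1 : 0; }

/* Normalise a -1/0 truth value to 1/0 (the HAVE_AS_NEGATIVE_TRUE case). */
static long as_true(long x) { return -x; }

/*
 * Branchless max.  When a < b the mask is all ones, so a ^ (a ^ b) == b;
 * otherwise the mask is 0 and the result is a.
 */
static long as_max(long a, long b)
{
    return a ^ ((a ^ b) & -as_true(gas_lt(a, b)));
}

/* Padding needed at the origin site: the difference if positive, else 0. */
static long pad_len(long repl_len, long orig_len)
{
    long diff = repl_len - orig_len;

    return as_true(gas_gt(diff, 0)) * diff;
}
```

The negation in `as_true()` turns -1 into 1 so it can be used as a multiplier, and the second negation in `as_max()` turns it back into an all-ones mask.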

Additionally, have apply_alternatives() attempt to optimise the padding nops.
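
The check that decides whether padding may be rewritten can be sketched on its own. `padding_is_unoptimised()` is a hypothetical helper that mirrors the loop in the diff below, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if the padding after the original instruction is more
 * than one byte long and still consists entirely of single-byte nops
 * (0x90).  Anything else means the site was already patched or optimised
 * and must be left alone.
 */
static bool padding_is_unoptimised(const uint8_t *orig, size_t orig_len,
                                   size_t pad_len)
{
    size_t total_len = orig_len + pad_len;
    size_t i;

    if (pad_len <= 1)
        return false;           /* nothing worth optimising */

    for (i = orig_len; i < total_len; ++i)
        if (orig[i] != 0x90)
            return false;       /* found a byte that is not a plain nop */

    return true;
}
```

This guard matters when several alternatives share one origin site, since an earlier pass may already have replaced the padding.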

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Konrad Rzeszutek Wilk 
CC: Roger Pau Monné 
CC: Wei Liu 

v2: Fix build with Clang.
---
 xen/arch/x86/Rules.mk |  4 +++
 xen/arch/x86/alternative.c| 32 ---
 xen/include/asm-x86/alternative-asm.h | 60 +--
 xen/include/asm-x86/alternative.h | 46 +--
 4 files changed, 120 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 9897dea..e169d67 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -24,6 +24,10 @@ $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \
  -U__OBJECT_LABEL__ -DHAVE_GAS_QUOTED_SYM \
  '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@')
 
+# GAS's idea of true is -1.  Clang's idea is 1
+$(call as-option-add,CFLAGS,CC,\
+".if ((1 > 0) < 0); .error \"\";.endif",,-DHAVE_AS_NEGATIVE_TRUE)
+
 CFLAGS += -mno-red-zone -fpic -fno-asynchronous-unwind-tables
 
 # Xen doesn't use SSE interally.  If the compiler supports it, also skip the
diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 51ca53e..e24db84 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -180,13 +180,37 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
 uint8_t *orig = ALT_ORIG_PTR(a);
 uint8_t *repl = ALT_REPL_PTR(a);
 uint8_t buf[MAX_PATCH_LEN];
+unsigned int total_len = a->orig_len + a->pad_len;
 
-BUG_ON(a->repl_len > a->orig_len);
-BUG_ON(a->orig_len > sizeof(buf));
+BUG_ON(a->repl_len > total_len);
+BUG_ON(total_len > sizeof(buf));
 BUG_ON(a->cpuid >= NCAPINTS * 32);
 
+/* No replacement to make, but try to optimise any padding. */
 if ( !boot_cpu_has(a->cpuid) )
+{
+unsigned int i;
+
+if ( a->pad_len <= 1 )
+continue;
+
+/* Search the padding area for any byte which isn't a nop. */
+for ( i = a->orig_len; i < total_len; ++i )
+if ( orig[i] != 0x90 )
+break;
+
+/*
+ * Only make any changes if all padding bytes are unoptimised
+ * nops.  With multiple alternatives over the same origin site, we
+ * may have already made a replacement, or optimised the nops.
+ */
+if ( i != total_len )
+continue;
+
+add_nops(buf, a->pad_len);
+text_poke(orig + a->orig_len, buf, a->pad_len);
 continue;
+}
 
 memcpy(buf, repl, a->repl_len);
 
@@ -194,8 +218,8 @@ void init_or_livepatch apply_alternatives(const struct alt_instr *start,
 if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
 *(int32_t *)(buf + 1) += repl - orig;
 
-add_nops(buf + a->repl_len, a->orig_len - a->repl_len);
-text_poke(orig, buf, a->orig_len);
+add_nops(buf + a->repl_len, total_len - a->repl_len);
+text_poke(orig, buf, total_len);
 }
 }
 
diff --git a/xen/include/asm-x86/alternative-asm.h b/xen/include/asm-x86/alternative-asm.h
index 150bd1a..25f79fe 100644
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -9,30 +9,55 @@
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig repl feature orig_len repl_len
+.macro altinstruction_entry orig repl feature orig_len repl_len pad_len
 .long \orig - .
 .long \repl - .
 .word \feature
 .byte \orig_len
 .byte \repl_len
+.byte \pad_len
 .endm
 
 #define orig_len   (.L\@_orig_e   - .L\@_orig_s)
+#define pad_len(.L\@_orig_p   - .L\@_orig_e)
+#define total_len  (.L\@_orig_p   - .L\@_orig_s)
 #define repl_len(nr)   (.L\@_repl_e\()nr  - .L\@_repl_s\()nr)
 #define decl_repl(insn, nr) .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:
 
+/* GAS's idea of true is -1, while Clang's idea is 1. */
+#ifdef HAVE_AS_NEGATIVE_TRUE
+# define as_true(x) (-(x))
+#else
+# define as_true(x) (x)
+#endif
+
+#define as_max(a, b)   ((a) ^ (((a) ^ (b)) & -as_true((a) < (b
+
 .macro ALTERNATIVE 

[Xen-devel] [PATCH v2 6/7] x86/alt: Drop explicit padding of origin sites

2018-02-26 Thread Andrew Cooper
Now that the alternatives infrastructure can calculate the required padding
automatically, there is no need to hard code it.

Signed-off-by: Andrew Cooper 
Reviewed-by: Wei Liu 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/x86_64/compat/entry.S  |  2 +-
 xen/arch/x86/x86_64/entry.S |  2 +-
 xen/include/asm-x86/nops.h  |  7 ---
 xen/include/asm-x86/spec_ctrl_asm.h | 19 ---
 4 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 8aba269..f650610 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -134,7 +134,7 @@ ENTRY(compat_restore_all_guest)
 jne   1b
 2:
 .endm
-   ALTERNATIVE_2 ".skip 45, 0x90", \
+   ALTERNATIVE_2 "", \
 alt_cr4_pv32, X86_FEATURE_XEN_SMEP, \
 alt_cr4_pv32, X86_FEATURE_XEN_SMAP
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index e939f20..cc94333 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -564,7 +564,7 @@ handle_exception_saved:
 testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
 jzexception_with_ints_disabled
 
-ALTERNATIVE_2 "jmp .Lcr4_pv32_done; .skip 2, 0x90", \
+ALTERNATIVE_2 "jmp .Lcr4_pv32_done", \
 __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMEP, \
 __stringify(mov VCPU_domain(%rbx), %rax), X86_FEATURE_XEN_SMAP
 
diff --git a/xen/include/asm-x86/nops.h b/xen/include/asm-x86/nops.h
index 61319cc..1a46b97 100644
--- a/xen/include/asm-x86/nops.h
+++ b/xen/include/asm-x86/nops.h
@@ -65,13 +65,6 @@
 #define ASM_NOP8 _ASM_MK_NOP(P6_NOP8)
 #define ASM_NOP9 _ASM_MK_NOP(P6_NOP9)
 
-#define ASM_NOP17 ASM_NOP8; ASM_NOP7; ASM_NOP2
-#define ASM_NOP21 ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP24 ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP29 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP5
-#define ASM_NOP32 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-#define ASM_NOP40 ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8; ASM_NOP8
-
 #define ASM_NOP_MAX 9
 
 #endif /* __X86_ASM_NOPS_H__ */
diff --git a/xen/include/asm-x86/spec_ctrl_asm.h b/xen/include/asm-x86/spec_ctrl_asm.h
index 1f2b6f3..1623fc0 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -216,9 +216,8 @@
 
 /* Use after a VMEXIT from an HVM guest. */
 #define SPEC_CTRL_ENTRY_FROM_VMEXIT \
-ALTERNATIVE __stringify(ASM_NOP40), \
-DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;   \
-ALTERNATIVE_2 __stringify(ASM_NOP32),   \
+ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_VMEXIT;   \
+ALTERNATIVE_2 "",   \
 __stringify(DO_SPEC_CTRL_ENTRY_FROM_VMEXIT  \
 ibrs_val=SPEC_CTRL_IBRS),   \
 X86_FEATURE_XEN_IBRS_SET,   \
@@ -228,9 +227,8 @@
 
 /* Use after an entry from PV context (syscall/sysenter/int80/int82/etc). */
 #define SPEC_CTRL_ENTRY_FROM_PV \
-ALTERNATIVE __stringify(ASM_NOP40), \
-DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;   \
-ALTERNATIVE_2 __stringify(ASM_NOP21),   \
+ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;   \
+ALTERNATIVE_2 "",   \
 __stringify(DO_SPEC_CTRL_ENTRY maybexen=0   \
 ibrs_val=SPEC_CTRL_IBRS),   \
 X86_FEATURE_XEN_IBRS_SET,   \
@@ -239,9 +237,8 @@
 
 /* Use in interrupt/exception context.  May interrupt Xen or PV context. */
 #define SPEC_CTRL_ENTRY_FROM_INTR   \
-ALTERNATIVE __stringify(ASM_NOP40), \
-DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;   \
-ALTERNATIVE_2 __stringify(ASM_NOP29),   \
+ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_RSB_NATIVE;   \
+ALTERNATIVE_2 "",   \
 __stringify(DO_SPEC_CTRL_ENTRY maybexen=1   \
 ibrs_val=SPEC_CTRL_IBRS),   \
 X86_FEATURE_XEN_IBRS_SET,   \
@@ -250,13 +247,13 @@
 
 /* Use when exiting to Xen context. */
 #define SPEC_CTRL_EXIT_TO_XEN   \
-ALTERNATIVE_2 __stringify(ASM_NOP17),   \
+ALTERNATIVE_2 "", 
