[Xen-devel] [linux-4.9 test] 112086: regressions - FAIL

2017-07-21 Thread osstest service owner
flight 112086 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112086/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 111883

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 111843
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 111843
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 111883
 test-amd64-amd64-xl-rtds 10 debian-install fail like 111883
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 14 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 14 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 14 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-install fail never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux c03917de04aa68017a737e90ea01338d991eaff5
baseline version:
 linux f0cd77ded5127168b1b83ca2f366ee17e9c0586f

Last test of basis   111883  2017-07-16 11:10:00 Z    5 days
Testing same since   112086  2017-07-21 06:22:54 Z    0 days    1 attempts


People who touched revisions under test:
  "Eric W. Biederman" 
  Adam Borowski 
  Alban Browaeys 
  Alexei Starovoitov 
  Amit Pundir 
  Andrei Vagin 

[Xen-devel] [linux-linus test] 112083: regressions - FAIL

2017-07-21 Thread osstest service owner
flight 112083 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112083/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-intel  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-qemuu-nested-intel  7 xen-boot  fail REGR. vs. 110515
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-amd64-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 21 guest-start/debian  fail REGR. vs. 110515
 test-amd64-i386-libvirt-xsm  16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-amd64-xl-credit2  15 guest-saverestore fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-pygrub   7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-xsm  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-libvirt 16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-i386-xl   16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-xl-multivcpu 15 guest-saverestore   fail REGR. vs. 110515
 test-amd64-amd64-pair 21 guest-start/debian fail REGR. vs. 110515
 test-amd64-amd64-libvirt-xsm 16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-i386-libvirt  16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-i386-xl-xsm   16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-amd  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-i386-libvirt-pair 21 guest-start/debian   fail REGR. vs. 110515
 test-amd64-i386-pair 21 guest-start/debian   fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-check fail like 110515
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail like 110515
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 110515
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 110515
 test-amd64-amd64-xl-rtds 10 debian-install fail like 110515
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail like 110515
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-arm64-arm64-xl 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 14 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 14 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass

Re: [Xen-devel] [PATCH 09/15] xen: vmx: handle SGX related MSRs

2017-07-21 Thread Huang, Kai



On 7/21/2017 9:42 PM, Huang, Kai wrote:



On 7/20/2017 5:27 AM, Andrew Cooper wrote:

On 09/07/17 09:09, Kai Huang wrote:

This patch handles the IA32_FEATURE_CONTROL and IA32_SGXLEPUBKEYHASHn MSRs.

For IA32_FEATURE_CONTROL, if SGX is exposed to the domain, the SGX_ENABLE
bit is always set. If SGX launch control is also exposed to the domain, and
the physical IA32_SGXLEPUBKEYHASHn MSRs are writable, then the
SGX_LAUNCH_CONTROL_ENABLE bit is always set as well. Writes to
IA32_FEATURE_CONTROL are ignored.

For IA32_SGXLEPUBKEYHASHn, a new 'struct sgx_vcpu' is added for per-vcpu SGX
state; currently it holds the vcpu's virtual ia32_sgxlepubkeyhash[0-3]. Two
booleans, 'readable' and 'writable', are also added to indicate whether the
virtual IA32_SGXLEPUBKEYHASHn MSRs are readable and writable.

When a vcpu is initialized, its virtual ia32_sgxlepubkeyhash values are also
initialized. If the physical IA32_SGXLEPUBKEYHASHn MSRs are writable, then
ia32_sgxlepubkeyhash are set to Intel's default value, since on a physical
machine those MSRs hold Intel's default value after reset. If the physical
MSRs are not writable (they were *locked* by the BIOS before handing over to
Xen), then we try to read those MSRs and use the physical values as the
default values for the virtual MSRs. Note that rdmsr_safe is used: although
the SDM says that if SGX is present, IA32_SGXLEPUBKEYHASHn are available for
read, in reality Skylake client parts (at least some, depending on BIOS) do
not have those MSRs available, so we use rdmsr_safe and set 'readable' to
false if it returns an error code.

For an IA32_SGXLEPUBKEYHASHn MSR read from the guest, if the physical MSRs
are not readable, the guest is not allowed to read either; otherwise the
vcpu's virtual MSR value is returned.

For an IA32_SGXLEPUBKEYHASHn MSR write from the guest, we allow the guest to
write only if the physical MSRs are writable and SGX launch control is
exposed to the domain; otherwise an error is injected.

To make EINIT run successfully in the guest, the vcpu's virtual
IA32_SGXLEPUBKEYHASHn values are written to the physical MSRs when the vcpu
is scheduled in.
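
As a rough sketch of the read/write policy described above (this is an
illustration, not the patch itself; sgxv, to_sgx_vcpu and
domain_has_sgx_launch_control follow the code quoted below, while the
MSR_IA32_SGXLEPUBKEYHASH0 index is assumed from the msr-index.h hunk):

    static int sgx_msr_read_intercept(struct vcpu *v, unsigned int msr,
                                      uint64_t *val)
    {
        struct sgx_vcpu *sgxv = to_sgx_vcpu(v);

        if ( !sgxv->readable )
            return X86EMUL_EXCEPTION;   /* inject #GP to the guest */

        *val = sgxv->ia32_sgxlepubkeyhash[msr - MSR_IA32_SGXLEPUBKEYHASH0];
        return X86EMUL_OKAY;
    }

    static int sgx_msr_write_intercept(struct vcpu *v, unsigned int msr,
                                       uint64_t val)
    {
        struct sgx_vcpu *sgxv = to_sgx_vcpu(v);

        /* Writable only if physical MSRs are unlocked and LC is exposed. */
        if ( !sgxv->writable || !domain_has_sgx_launch_control(v->domain) )
            return X86EMUL_EXCEPTION;

        sgxv->ia32_sgxlepubkeyhash[msr - MSR_IA32_SGXLEPUBKEYHASH0] = val;
        return X86EMUL_OKAY;
    }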

Signed-off-by: Kai Huang 
---
  xen/arch/x86/hvm/vmx/sgx.c | 194 +
  xen/arch/x86/hvm/vmx/vmx.c |  24 +
  xen/include/asm-x86/cpufeature.h   |   3 +
  xen/include/asm-x86/hvm/vmx/sgx.h  |  22 +
  xen/include/asm-x86/hvm/vmx/vmcs.h |   2 +
  xen/include/asm-x86/msr-index.h|   6 ++
  6 files changed, 251 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/sgx.c b/xen/arch/x86/hvm/vmx/sgx.c
index 14379151e8..4944e57aef 100644
--- a/xen/arch/x86/hvm/vmx/sgx.c
+++ b/xen/arch/x86/hvm/vmx/sgx.c
@@ -405,6 +405,200 @@ void hvm_destroy_epc(struct domain *d)
  hvm_reset_epc(d, true);
  }
+/* Whether IA32_SGXLEPUBKEYHASHn are physically *unlocked* by BIOS */
+bool_t sgx_ia32_sgxlepubkeyhash_writable(void)
+{
+    uint64_t sgx_lc_enabled = IA32_FEATURE_CONTROL_SGX_ENABLE |
+                              IA32_FEATURE_CONTROL_SGX_LAUNCH_CONTROL_ENABLE |
+                              IA32_FEATURE_CONTROL_LOCK;
+    uint64_t val;
+
+    rdmsrl(MSR_IA32_FEATURE_CONTROL, val);
+
+    return (val & sgx_lc_enabled) == sgx_lc_enabled;
+}
+
+bool_t domain_has_sgx(struct domain *d)
+{
+    /* hvm_epc_populated(d) implies CPUID has SGX */
+    return hvm_epc_populated(d);
+}
+
+bool_t domain_has_sgx_launch_control(struct domain *d)
+{
+    struct cpuid_policy *p = d->arch.cpuid;
+
+    if ( !domain_has_sgx(d) )
+        return false;
+
+    /* Unnecessary but check anyway */
+    if ( !cpu_has_sgx_launch_control )
+        return false;
+
+    return !!p->feat.sgx_launch_control;
+}


Both of these should be d->arch.cpuid->feat.{sgx,sgx_lc} only, rather
than individual helpers.

The CPUID setup during host boot and domain construction should take
care of setting everything up properly, or hiding the features from the
guest.  The point of the work I've been doing is to prevent situations
where the guest can see SGX but something doesn't work because of Xen
using nested checks like this.


Thanks for the comments. I will change this to a simple check against
d->arch.cpuid->feat.{sgx,sgx_lc}.





+
+/* Digest of Intel signing key. MSR's default value after reset. */
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH0 0xa6053e051270b7ac
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH1 0x6cfbe8ba8b3b413d
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH2 0xc4916d99f2b3735d
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH3 0xd4f8c05909f9bb3b
+
+void sgx_vcpu_init(struct vcpu *v)
+{
+    struct sgx_vcpu *sgxv = to_sgx_vcpu(v);
+
+    memset(sgxv, 0, sizeof(*sgxv));
+
+    if ( sgx_ia32_sgxlepubkeyhash_writable() )
+    {
+        /*
+         * If physical MSRs are writable, set vcpu's default value to
+         * Intel's default value. On a real machine, after reset, the
+         * MSRs contain Intel's default value.
+         */
+        sgxv->ia32_sgxlepubkeyhash[0] = SGX_INTEL_DEFAULT_LEPUBKEYHASH0;
+        sgxv->ia32_sgxlepubkeyhash[1] = SGX_INTEL_DEFAULT_LEPUBKEYHASH1;
+        sgxv->ia32_sgxlepubkeyhash[2] =

Re: [Xen-devel] [PATCH 03/15] xen: x86: add early stage SGX feature detection

2017-07-21 Thread Huang, Kai



On 7/21/2017 9:17 PM, Huang, Kai wrote:



On 7/20/2017 2:23 AM, Andrew Cooper wrote:

On 09/07/17 09:09, Kai Huang wrote:
This patch adds early-stage SGX feature detection via SGX CPUID leaf 0x12.
A function detect_sgx is added to detect SGX info on each CPU (called from
vmx_cpu_up). The SDM says the SGX info returned by CPUID is per-thread, and
we cannot assume all threads will return the same SGX info, so we have to
detect SGX for each CPU. For simplicity, SGX is currently only supported
when all CPUs report the same SGX info.

The SDM also says it is possible to have multiple EPC sections, but this is
only for multi-socket servers, which we don't support now (there are other
things that need to be done first, e.g. NUMA EPC, scheduling, etc.), so
currently only one EPC is supported.

Dedicated files sgx.c and sgx.h are added (under the vmx directory, as SGX
is Intel-specific) for the bulk of the above SGX detection code, and for
further SGX code as well.
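
For reference, a minimal sketch of what the per-CPU detection via CPUID
leaf 0x12 looks like (subleaf 2 reports the EPC section base and size per
the SDM; the sgx_cpuinfo field names here are made up for the sketch):

    static void detect_sgx(struct sgx_cpuinfo *info)
    {
        unsigned int eax, ebx, ecx, edx;

        cpuid_count(0x12, 0, &eax, &ebx, &ecx, &edx);
        info->cap = eax;            /* supported ENCLS leaf functions */

        cpuid_count(0x12, 2, &eax, &ebx, &ecx, &edx);
        if ( (eax & 0xf) == 0x1 )   /* subleaf describes a valid EPC section */
        {
            info->epc_base = ((uint64_t)(ebx & 0xfffff) << 32) |
                             (eax & 0xfffff000);
            info->epc_size = ((uint64_t)(edx & 0xfffff) << 32) |
                             (ecx & 0xfffff000);
        }
    }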

Signed-off-by: Kai Huang 


I am not sure putting this under hvm/ is a sensible move.  Almost
everything in this patch is currently common, and I can foresee us
wanting to introduce PV support, so it would be good to introduce this
in a guest-neutral location to begin with.


Sorry, I forgot to respond to this in my last reply. I looked at the code
again and yes, I think we can move the code to a common place. I will move
the current sgx.c to arch/x86/sgx.c. Thanks for the comments.





---
  xen/arch/x86/hvm/vmx/Makefile |   1 +
  xen/arch/x86/hvm/vmx/sgx.c| 208 ++
  xen/arch/x86/hvm/vmx/vmcs.c   |   4 +
  xen/include/asm-x86/cpufeature.h  |   1 +
  xen/include/asm-x86/hvm/vmx/sgx.h |  45 +
  5 files changed, 259 insertions(+)
  create mode 100644 xen/arch/x86/hvm/vmx/sgx.c
  create mode 100644 xen/include/asm-x86/hvm/vmx/sgx.h

diff --git a/xen/arch/x86/hvm/vmx/Makefile b/xen/arch/x86/hvm/vmx/Makefile
index 04a29ce59d..f6bcf0d143 100644
--- a/xen/arch/x86/hvm/vmx/Makefile
+++ b/xen/arch/x86/hvm/vmx/Makefile
@@ -4,3 +4,4 @@ obj-y += realmode.o
  obj-y += vmcs.o
  obj-y += vmx.o
  obj-y += vvmx.o
+obj-y += sgx.o
diff --git a/xen/arch/x86/hvm/vmx/sgx.c b/xen/arch/x86/hvm/vmx/sgx.c
new file mode 100644
index 00..6b41469371
--- /dev/null
+++ b/xen/arch/x86/hvm/vmx/sgx.c


This file looks like it should be arch/x86/sgx.c, given its current 
content.


Will do.




@@ -0,0 +1,208 @@
+/*
+ * Intel Software Guard Extensions support


Please include a GPLv2 header.


Yes will do.

Thanks,
-Kai



+ *
+ * Author: Kai Huang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct sgx_cpuinfo __read_mostly sgx_cpudata[NR_CPUS];
+static struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;


I don't think any of this is necessary.  The description says that all
EPCs across the server will be reported in CPUID subleaves, and our
implementation gives up if the data are non-identical across CPUs.

Therefore, we only need to keep one copy of the data, and check APs
against the master copy.


Right. boot_sgx_cpudata is what we need. Currently detect_sgx is called
from vmx_cpu_up. How about changing it to be called from identify_cpu,
with something like below?

     if ( c == &boot_cpu_data )
         detect_sgx(&boot_sgx_cpudata);
     else {
         struct sgx_cpuinfo tmp;
         detect_sgx(&tmp);
         if ( memcmp(&boot_sgx_cpudata, &tmp, sizeof (tmp)) )
             /* disable SGX */
     }

Thanks,
-Kai



Let me see about splitting up a few bits of the existing CPUID
infrastructure, so we can use the host cpuid policy more effectively for
Xen related things.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PULL for-2.10 1/2] xen: fix compilation on 32-bit hosts

2017-07-21 Thread Stefano Stabellini
From: Igor Druzhinin 

Signed-off-by: Igor Druzhinin 
Reviewed-by: Stefano Stabellini 
---
 hw/i386/xen/xen-mapcache.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/i386/xen/xen-mapcache.c b/hw/i386/xen/xen-mapcache.c
index 2a1fbd1..bb1078c 100644
--- a/hw/i386/xen/xen-mapcache.c
+++ b/hw/i386/xen/xen-mapcache.c
@@ -527,7 +527,7 @@ static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
 entry = entry->next;
 }
 if (!entry) {
-DPRINTF("Trying to update an entry for %lx " \
+DPRINTF("Trying to update an entry for "TARGET_FMT_plx \
 "that is not in the mapcache!\n", old_phys_addr);
 return NULL;
 }
@@ -535,15 +535,16 @@ static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
 address_index  = new_phys_addr >> MCACHE_BUCKET_SHIFT;
 address_offset = new_phys_addr & (MCACHE_BUCKET_SIZE - 1);
 
-fprintf(stderr, "Replacing a dummy mapcache entry for %lx with %lx\n",
-old_phys_addr, new_phys_addr);
+fprintf(stderr, "Replacing a dummy mapcache entry for "TARGET_FMT_plx \
+" with "TARGET_FMT_plx"\n", old_phys_addr, new_phys_addr);
 
 xen_remap_bucket(entry, entry->vaddr_base,
  cache_size, address_index, false);
 if (!test_bits(address_offset >> XC_PAGE_SHIFT,
 test_bit_size >> XC_PAGE_SHIFT,
 entry->valid_mapping)) {
-DPRINTF("Unable to update a mapcache entry for %lx!\n", old_phys_addr);
+DPRINTF("Unable to update a mapcache entry for "TARGET_FMT_plx"!\n",
+old_phys_addr);
 return NULL;
 }
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PULL for-2.10 0/2] please pull xen-20170721-tag

2017-07-21 Thread Stefano Stabellini
The following changes since commit 91939262ffcd3c85ea6a4793d3029326eea1d649:

  configure: Drop ancient Solaris 9 and earlier support (2017-07-21 15:04:05 +0100)

are available in the git repository at:

  git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20170721-tag

for you to fetch changes up to 7fb394ad8a7c4609cefa2136dec16cf65d028f40:

  xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings (2017-07-21 17:37:06 -0700)


Xen 2017/07/21


Alexey G (1):
  xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

Igor Druzhinin (1):
  xen: fix compilation on 32-bit hosts

 hw/i386/xen/xen-mapcache.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PULL for-2.10 2/2] xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

2017-07-21 Thread Stefano Stabellini
From: Alexey G 

Under certain circumstances normal xen-mapcache functioning may be broken
by the guest's actions. This may lead either to QEMU performing exit() due
to a caught bad pointer (and with the QEMU process gone the guest domain
simply appears hung afterwards) or to actual use of the incorrect pointer
inside QEMU's address space -- a write to unmapped memory is possible. The
bug is hard to reproduce on an i440 machine as multiple DMA sources are
required (though it's possible in theory, using multiple emulated devices),
but it can be reproduced somewhat easily on a Q35 machine using an emulated
AHCI controller -- each NCQ queue command slot may be used as an independent
DMA source, e.g. using the READ FPDMA QUEUED command, so a single storage
device on the AHCI controller port is enough to produce multiple DMAs
(up to 32). A detailed description of the issue follows.

Xen-mapcache provides the ability to map parts of guest memory into
QEMU's own address space to work with.

There are two types of cache lookups:
 - translating a guest physical address into a pointer in QEMU's address
   space, mapping a part of guest domain memory if necessary (while trying
   to reduce a number of such (re)mappings to a minimum)
 - translating a QEMU's pointer back to its physical address in guest RAM

These lookups are managed via two linked-lists of structures.
MapCacheEntry is used for forward cache lookups, while MapCacheRev -- for
reverse lookups.

Every guest physical address is broken down into 2 parts:
address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);

MCACHE_BUCKET_SHIFT depends on the system (32/64-bit) and is equal to 20 on
a 64-bit system (which is assumed for the rest of this description).
Basically, this means that we deal with 1 MB chunks and offsets within
those 1 MB chunks. All mappings are created with 1MB granularity, i.e.
1MB/2MB/3MB etc. Most DMA transfers are typically less than 1MB; however,
if the transfer crosses any 1MB border(s), the next larger mapping size
will be used, so e.g. a 512-byte DMA transfer with the start address
700FFF80h will actually require a 2MB range.
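
To make the arithmetic concrete, here is the bucket math for the 512-byte
transfer at 700FFF80h mentioned above (the rounding mirrors the cache_size
computation visible in xen_replace_cache_entry_unlocked earlier in this
digest):

    /* 512-byte DMA at 0x700FFF80 with 1MB buckets (shift = 20). */
    hwaddr phys_addr      = 0x700FFF80;
    hwaddr size           = 512;
    hwaddr address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;     /* 0x700 */
    hwaddr address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); /* 0xFFF80 */
    hwaddr cache_size     = size + address_offset;                /* 0x100180 */
    if (cache_size % MCACHE_BUCKET_SIZE) {
        cache_size += MCACHE_BUCKET_SIZE - (cache_size % MCACHE_BUCKET_SIZE);
    }
    /* cache_size is now 0x200000: the transfer needs a 2MB mapping. */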

The current implementation assumes that MapCacheEntries are unique for a
given address_index and size pair, and that a single MapCacheEntry may be
reused by multiple requests -- in this case the 'lock' field will be larger
than 1. On the other hand, each requested guest physical address (with the
'lock' flag) is described by its own MapCacheRev. So there may be multiple
MapCacheRev entries corresponding to a single MapCacheEntry. The
xen-mapcache code uses MapCacheRev entries to retrieve the address_index &
size pair, which in turn is used to find the related MapCacheEntry. The
'lock' field within a MapCacheEntry structure is actually a reference
counter which shows the number of corresponding MapCacheRev entries.
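
Schematically, the two structures relate as follows (fields abridged from
the QEMU source; a simplified sketch of the relationship just described,
not the full definitions):

    typedef struct MapCacheEntry {
        hwaddr paddr_index;           /* guest phys addr >> MCACHE_BUCKET_SHIFT */
        uint8_t *vaddr_base;          /* where the chunk is mapped in QEMU */
        unsigned long *valid_mapping; /* bitmap of successfully mapped pages */
        uint32_t lock;                /* refcount of MapCacheRev users */
        hwaddr size;                  /* mapping size, a multiple of 1MB */
        struct MapCacheEntry *next;   /* chain within the same bucket */
    } MapCacheEntry;

    typedef struct MapCacheRev {      /* one per outstanding locked request */
        uint8_t *vaddr_req;           /* pointer handed back to the caller */
        hwaddr paddr_index;           /* used, together with size, to find */
        hwaddr size;                  /*   the owning MapCacheEntry again  */
        QTAILQ_ENTRY(MapCacheRev) next;
    } MapCacheRev;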

The bug lies in the guest's ability to indirectly manipulate the
xen-mapcache MapCacheEntries list via a special sequence of DMA operations,
typically for storage devices. In order to trigger the bug, the guest needs
to issue DMA operations in a specific order and timing. Although
xen-mapcache is protected by a mutex lock, this doesn't help here, as the
bug is not due to a race condition.

Suppose we have 3 DMA transfers, namely A, B and C, where
- transfer A crosses 1MB border and thus uses a 2MB mapping
- transfers B and C are normal transfers within 1MB range
- and all 3 transfers belong to the same address_index

In this case, if all these transfers are to be executed one-by-one
(without overlaps), no special treatment is necessary -- each transfer's
mapping lock will be set and then cleared on unmap before starting
the next transfer.
The situation changes when DMA transfers overlap in time, e.g. like this:

  |= transfer A (2MB) =|
        |= transfer B (1MB) =|
              |= transfer C (1MB) =|
 time --->

In this situation the following sequence of actions happens:

1. transfer A creates a mapping to 2MB area (lock=1)
2. transfer B (1MB) tries to find available mapping but cannot find one
   because transfer A is still in progress, and it has 2MB size + non-zero
   lock. So transfer B creates another mapping -- same address_index,
   but 1MB size.
3. transfer A completes, making 1st mapping entry available by setting its
   lock to 0
4. transfer C starts and tries to find available mapping entry and sees
   that 1st entry has lock=0, so it uses this entry but remaps the mapping
   to a 1MB size
5. transfer B completes and by this time
  - there are two locked entries in the MapCacheEntry list with the SAME
values for both address_index and size
  - the entry for transfer B actually resides farther down the list while
transfer C's entry comes first
6. xen_ram_addr_from_mapcache() for transfer B gets correct address_index
   and size pair from corresponding MapCacheRev entry, but then it starts
   looking for 

Re: [Xen-devel] [PATCH] xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

2017-07-21 Thread Stefano Stabellini
On Thu, 20 Jul 2017, Alexey G wrote:
> On Wed, 19 Jul 2017 11:00:26 -0700 (PDT)
> Stefano Stabellini  wrote:
> 
> > My expectation is that unlocked mappings are much more frequent than
> > locked mappings. Also, I expect that only very rarely we'll be able to
> > reuse locked mappings. Over the course of a VM lifetime, it seems to me
> > that walking the list every time would cost more than it would benefit.
> > 
> > These are only "expectations", I would love to see numbers. Numbers make
> > for better decisions :-)  Would you be up for gathering some of these
> > numbers? Such as how many times you get to reuse locked mappings and how
> > many times we walk items on the list fruitlessly?
> > 
> > Otherwise, would you be up for just testing the modified version of the
> > patch I sent to verify that solves the bug?
> 
> Numbers will show that there is a single entry in the bucket's list
> most of the time. :) Even two entries are rare encounters, typically to be
> seen only when the guest performs some intensive I/O. OK, I'll collect some
> real stats for different scenarios; these are interesting numbers and might
> come in useful for later optimizations.
> 
> The approach you proposed is good, but it allows reusing suitable locked
> entries only when they come first in the list (the existing behavior).
> We can actually reuse a locked entry which comes next (if any) in the
> list as well. When we have the situation where a lock=0 entry comes first
> in the list and a lock=1 entry is second, there is a chance the first
> entry was a 2MB-type one (there must be some reason why a 2nd entry was
> added to the list), so picking it for a lock=0 request might result in
> xen_remap_bucket... which should be avoided. Anyway, there is no big deal
> about which approach is better, as these situations are uncommon. After
> all, mostly it's just a single entry in the bucket's list. 

Given that QEMU is about to release and I have to send a pull request
with another fix now, I am going to also send my version of the fix
right away (keeping you as main author of course).

However, I am more than happy to change the behavior of the algorithm in
the future if the numbers show that your version is better.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PULL for-2.10 6/7] xen/mapcache: introduce xen_replace_cache_entry()

2017-07-21 Thread Stefano Stabellini
On Fri, 21 Jul 2017, Igor Druzhinin wrote:
> On 21/07/17 14:50, Anthony PERARD wrote:
> > On Tue, Jul 18, 2017 at 03:22:41PM -0700, Stefano Stabellini wrote:
> > > From: Igor Druzhinin 
> > 
> > ...
> > 
> > > +static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
> > > + hwaddr new_phys_addr,
> > > + hwaddr size)
> > > +{
> > > +MapCacheEntry *entry;
> > > +hwaddr address_index, address_offset;
> > > +hwaddr test_bit_size, cache_size = size;
> > > +
> > > +address_index  = old_phys_addr >> MCACHE_BUCKET_SHIFT;
> > > +address_offset = old_phys_addr & (MCACHE_BUCKET_SIZE - 1);
> > > +
> > > +assert(size);
> > > +/* test_bit_size is always a multiple of XC_PAGE_SIZE */
> > > +test_bit_size = size + (old_phys_addr & (XC_PAGE_SIZE - 1));
> > > +if (test_bit_size % XC_PAGE_SIZE) {
> > > +test_bit_size += XC_PAGE_SIZE - (test_bit_size % XC_PAGE_SIZE);
> > > +}
> > > +cache_size = size + address_offset;
> > > +if (cache_size % MCACHE_BUCKET_SIZE) {
> > > +cache_size += MCACHE_BUCKET_SIZE - (cache_size %
> > > MCACHE_BUCKET_SIZE);
> > > +}
> > > +
> > > +entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> > > +while (entry && !(entry->paddr_index == address_index &&
> > > +  entry->size == cache_size)) {
> > > +entry = entry->next;
> > > +}
> > > +if (!entry) {
> > > +DPRINTF("Trying to update an entry for %lx " \
> > > +"that is not in the mapcache!\n", old_phys_addr);
> > > +return NULL;
> > > +}
> > > +
> > > +address_index  = new_phys_addr >> MCACHE_BUCKET_SHIFT;
> > > +address_offset = new_phys_addr & (MCACHE_BUCKET_SIZE - 1);
> > > +
> > > +fprintf(stderr, "Replacing a dummy mapcache entry for %lx with %lx\n",
> > > +old_phys_addr, new_phys_addr);
> > 
> > Looks likes this does not build on 32bits.
> > in:
> > http://logs.test-lab.xenproject.org/osstest/logs/112041/build-i386/6.ts-xen-build.log
> > 
> > /home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:
> > In function 'xen_replace_cache_entry_unlocked':
> > /home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
> > error: format '%lx' expects argument of type 'long unsigned int', but
> > argument 3 has type 'hwaddr' [-Werror=format=]
> >   old_phys_addr, new_phys_addr);
> >   ^
> > /home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
> > error: format '%lx' expects argument of type 'long unsigned int', but
> > argument 4 has type 'hwaddr' [-Werror=format=]
> > cc1: all warnings being treated as errors
> >CC  i386-softmmu/target/i386/gdbstub.o
> > /home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/rules.mak:66:
> > recipe for target 'hw/i386/xen/xen-mapcache.o' failed
> > 
> > > +
> > > +xen_remap_bucket(entry, entry->vaddr_base,
> > > + cache_size, address_index, false);
> > > +if (!test_bits(address_offset >> XC_PAGE_SHIFT,
> > > +test_bit_size >> XC_PAGE_SHIFT,
> > > +entry->valid_mapping)) {
> > > +DPRINTF("Unable to update a mapcache entry for %lx!\n", old_phys_addr);
> > > +return NULL;
> > > +}
> > > +
> > > +return entry->vaddr_base + address_offset;
> > > +}
> > > +
> > 
> 
> Please, accept the attached patch to fix the issue.

The patch looks good to me. I'll send it upstream.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 04/13] xen/pvcalls: implement connect command

2017-07-21 Thread Stefano Stabellini
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for
the active socket.

Introduce a data structure to keep track of sockets. Introduce a
waitqueue to allow the frontend to wait on data coming from the backend
on the active socket (recvmsg command).

Two mutexes (one for reads and one for writes) will be used to protect
the active socket in and out rings from concurrent accesses.

sock->sk->sk_send_head is not used for ip sockets: reuse the field to
store a pointer to the struct sock_mapping corresponding to the socket.
This way, we can easily get the struct sock_mapping from the struct
socket.

Convert the struct socket pointer into a uint64_t and use it as the id for
the new socket to pass to the backend.
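
All the commands in this series share the same request/response discipline,
visible in the patches later in this digest: pick the request slot implied
by req_prod_pvt, bail out with -EAGAIN if the ring is full or the response
slot is still in use, push the request, then block until the backend echoes
the request id back. Condensed (error handling trimmed, names as in the
patches):

    spin_lock(&bedata->pvcallss_lock);
    req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
    if (RING_FULL(&bedata->ring) ||
        READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
        spin_unlock(&bedata->pvcallss_lock);
        return -EAGAIN;                 /* slot busy: caller retries */
    }
    req = RING_GET_REQUEST(&bedata->ring, req_id);
    req->req_id = req_id;               /* backend echoes this back */
    /* ... fill in the command-specific fields ... */
    bedata->ring.req_prod_pvt++;
    RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
    spin_unlock(&bedata->pvcallss_lock);
    if (notify)
        notify_remote_via_irq(bedata->irq);

    /* Sleep until the response for this slot arrives. */
    wait_event(bedata->inflight_req,
               READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
    ret = bedata->rsp[req_id].ret;
    /* Read ret, then mark the rsp slot reusable. */
    smp_mb();
    WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);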

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 153 
 drivers/xen/pvcalls-front.h |   2 +
 2 files changed, 155 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 7933c73..0d305e0 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -13,6 +13,8 @@
  */
 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -20,6 +22,8 @@
 #include 
 #include 
 
+#include 
+
 #define PVCALLS_INVALID_ID (UINT_MAX)
 #define RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
 #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
@@ -38,6 +42,24 @@ struct pvcalls_bedata {
 };
 struct xenbus_device *pvcalls_front_dev;
 
+struct sock_mapping {
+   bool active_socket;
+   struct list_head list;
+   struct socket *sock;
+   union {
+   struct {
+   int irq;
+   grant_ref_t ref;
+   struct pvcalls_data_intf *ring;
+   struct pvcalls_data data;
+   struct mutex in_mutex;
+   struct mutex out_mutex;
+
+   wait_queue_head_t inflight_conn_req;
+   } active;
+   };
+};
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
struct xenbus_device *dev = dev_id;
@@ -80,6 +102,18 @@ static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map)
+{
+   struct sock_mapping *map = sock_map;
+
+   if (map == NULL)
+   return IRQ_HANDLED;
+
+   wake_up_interruptible(&map->active.inflight_conn_req);
+
+   return IRQ_HANDLED;
+}
+
 int pvcalls_front_socket(struct socket *sock)
 {
struct pvcalls_bedata *bedata;
@@ -134,6 +168,125 @@ int pvcalls_front_socket(struct socket *sock)
return ret;
 }
 
+static struct sock_mapping *create_active(int *evtchn)
+{
+   struct sock_mapping *map = NULL;
+   void *bytes;
+   int ret, irq = -1, i;
+
+   map = kzalloc(sizeof(*map), GFP_KERNEL);
+   if (map == NULL)
+   return NULL;
+
+   init_waitqueue_head(&map->active.inflight_conn_req);
+
+   map->active.ring = (struct pvcalls_data_intf *)
+   __get_free_page(GFP_KERNEL | __GFP_ZERO);
+   if (map->active.ring == NULL)
+   goto out_error;
+   memset(map->active.ring, 0, XEN_PAGE_SIZE);
+   map->active.ring->ring_order = RING_ORDER;
+   bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+   map->active.ring->ring_order);
+   if (bytes == NULL)
+   goto out_error;
+   for (i = 0; i < (1 << map->active.ring->ring_order); i++)
+   map->active.ring->ref[i] = gnttab_grant_foreign_access(
+   pvcalls_front_dev->otherend_id,
+   pfn_to_gfn(virt_to_pfn(bytes) + i), 0);
+
+   map->active.ref = gnttab_grant_foreign_access(
+   pvcalls_front_dev->otherend_id,
+   pfn_to_gfn(virt_to_pfn((void *)map->active.ring)), 0);
+
+   ret = xenbus_alloc_evtchn(pvcalls_front_dev, evtchn);
+   if (ret)
+   goto out_error;
+   map->active.data.in = bytes;
+   map->active.data.out = bytes +
+   XEN_FLEX_RING_SIZE(map->active.ring->ring_order);
+   irq = bind_evtchn_to_irqhandler(*evtchn, pvcalls_front_conn_handler,
+   0, "pvcalls-frontend", map);
+   if (irq < 0)
+   goto out_error;
+
+   map->active.irq = irq;
+   map->active_socket = true;
+   mutex_init(&map->active.in_mutex);
+   mutex_init(&map->active.out_mutex);
+
+   return map;
+
+out_error:
+   if (irq >= 0)
+   unbind_from_irqhandler(irq, map);
+   else if (*evtchn >= 0)
+   xenbus_free_evtchn(pvcalls_front_dev, *evtchn);
+   kfree(map->active.data.in);
+   kfree(map->active.ring);
+   kfree(map);
+   return NULL;
+}
+
+int pvcalls_front_connect(struct socket *sock, 

[Xen-devel] [PATCH v1 09/13] xen/pvcalls: implement recvmsg

2017-07-21 Thread Stefano Stabellini
Implement recvmsg by copying data from the "in" ring. If not enough data
is available and the recvmsg call is blocking, then wait on the
inflight_conn_req waitqueue. Take the active socket in_mutex so that
only one function can access the ring at any given time.

If not enough data is available on the ring, rather than returning
immediately or sleep-waiting, spin for up to 5000 cycles. This small
optimization turns out to improve performance and latency significantly.
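
The ring arithmetic used by __read_ring()/__write_ring() relies on two
small helpers that are not shown in this excerpt. Assuming the free-running
producer/consumer indices of the pvcalls protocol and a power-of-two ring
size, they amount to (a sketch, names as used in the code):

    static inline RING_IDX pvcalls_queued(RING_IDX prod, RING_IDX cons,
                                          RING_IDX ring_size)
    {
        /* Free-running indices: the difference is the number of queued bytes. */
        return prod - cons;
    }

    static inline RING_IDX pvcalls_mask(RING_IDX idx, RING_IDX ring_size)
    {
        /* Power-of-two ring: masking yields the offset into the data array. */
        return idx & (ring_size - 1);
    }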

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 106 
 drivers/xen/pvcalls-front.h |   4 ++
 2 files changed, 110 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index bf29f40..3d1041a 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -94,6 +94,20 @@ static int pvcalls_front_write_todo(struct sock_mapping *map)
return size - pvcalls_queued(prod, cons, size);
 }
 
+static int pvcalls_front_read_todo(struct sock_mapping *map)
+{
+   struct pvcalls_data_intf *intf = map->active.ring;
+   RING_IDX cons, prod;
+   int32_t error;
+
+   cons = intf->in_cons;
+   prod = intf->in_prod;
+   error = intf->in_error;
+   return (error != 0 ||
+   pvcalls_queued(prod, cons,
+  XEN_FLEX_RING_SIZE(intf->ring_order))) != 0;
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
struct xenbus_device *dev = dev_id;
@@ -413,6 +427,98 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
return tot_sent;
 }
 
+static int __read_ring(struct pvcalls_data_intf *intf,
+  struct pvcalls_data *data,
+  struct iov_iter *msg_iter,
+  size_t len, int flags)
+{
+   RING_IDX cons, prod, size, masked_prod, masked_cons;
+   RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->in_cons;
+   prod = intf->in_prod;
+   error = intf->in_error;
+   /* get pointers before reading from the ring */
+   virt_rmb();
+   if (error < 0)
+   return error;
+
+   size = pvcalls_queued(prod, cons, array_size);
+   masked_prod = pvcalls_mask(prod, array_size);
+   masked_cons = pvcalls_mask(cons, array_size);
+
+   if (size == 0)
+   return 0;
+
+   if (len > size)
+   len = size;
+
+   if (masked_prod > masked_cons) {
+   copy_to_iter(data->in + masked_cons, len, msg_iter);
+   } else {
+   if (len > (array_size - masked_cons)) {
+   copy_to_iter(data->in + masked_cons,
+array_size - masked_cons, msg_iter);
+   copy_to_iter(data->in,
+len - (array_size - masked_cons),
+msg_iter);
+   } else {
+   copy_to_iter(data->in + masked_cons, len, msg_iter);
+   }
+   }
+   /* read data from the ring before increasing the index */
+   virt_mb();
+   if (!(flags & MSG_PEEK))
+   intf->in_cons += len;
+
+   return len;
+}
+
+int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+int flags)
+{
+   struct pvcalls_bedata *bedata;
+   int ret = -EAGAIN;
+   struct sock_mapping *map;
+   int count = 0;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (flags & (MSG_CMSG_CLOEXEC|MSG_ERRQUEUE|MSG_OOB|MSG_TRUNC))
+   return -EOPNOTSUPP;
+
+   mutex_lock(&map->active.in_mutex);
+   if (len > XEN_FLEX_RING_SIZE(map->active.ring->ring_order))
+   len = XEN_FLEX_RING_SIZE(map->active.ring->ring_order);
+
+   while (!(flags & MSG_DONTWAIT) && !pvcalls_front_read_todo(map)) {
+   if (count < PVCALLS_FRON_MAX_SPIN)
+   count++;
+   else
+   wait_event_interruptible(map->active.inflight_conn_req,
+pvcalls_front_read_todo(map));
+   }
+   ret = __read_ring(map->active.ring, &map->active.data,
+ &msg->msg_iter, len, flags);
+
+   if (ret > 0)
+   notify_remote_via_irq(map->active.irq);
+   if (ret == 0)
+   ret = -EAGAIN;
+   if (ret == -ENOTCONN)
+   ret = 0;
+
+   mutex_unlock(&map->active.in_mutex);
+   return ret;
+}
+
 int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 {
struct pvcalls_bedata *bedata;
diff --git 

[Xen-devel] [PATCH v1 11/13] xen/pvcalls: implement release command

2017-07-21 Thread Stefano Stabellini
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both
in_mutex and out_mutex to avoid concurrent accesses. Then, free the
socket.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 86 +
 drivers/xen/pvcalls-front.h |  1 +
 2 files changed, 87 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index b6cfb7d..bd3dfac 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -174,6 +174,24 @@ static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map)
return IRQ_HANDLED;
 }
 
+static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
+  struct sock_mapping *map)
+{
+   int i;
+
+   spin_lock(&bedata->pvcallss_lock);
+   if (!list_empty(&map->list))
+   list_del_init(&map->list);
+   spin_unlock(&bedata->pvcallss_lock);
+
+   /* what if the thread waiting still need access? */
+   for (i = 0; i < (1 << map->active.ring->ring_order); i++)
+   gnttab_end_foreign_access(map->active.ring->ref[i], 0, 0);
+   gnttab_end_foreign_access(map->active.ref, 0, 0);
+   free_page((unsigned long)map->active.ring);
+   unbind_from_irqhandler(map->active.irq, map);
+}
+
 int pvcalls_front_socket(struct socket *sock)
 {
struct pvcalls_bedata *bedata;
@@ -805,6 +823,74 @@ unsigned int pvcalls_front_poll(struct file *file, struct socket *sock,
return pvcalls_front_poll_passive(file, bedata, map, wait);
 }
 
+int pvcalls_front_release(struct socket *sock)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   int req_id, notify;
+   struct xen_pvcalls_request *req;
+
+   if (!pvcalls_front_dev)
+   return -EIO;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+   if (!bedata)
+   return -EIO;
+
+   if (sock->sk == NULL)
+   return 0;
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (map == NULL)
+   return 0;
+   WRITE_ONCE(sock->sk->sk_send_head, NULL);
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_RELEASE;
+   req->u.release.id = (uint64_t)sock;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
+   spin_unlock(&bedata->pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+   READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   if (map->active_socket) {
+   /* 
+* Set in_error and wake up inflight_conn_req to force
+* recvmsg waiters to exit.
+*/
+   map->active.ring->in_error = -EBADF;
+   wake_up_interruptible(&map->active.inflight_conn_req);
+
+   mutex_lock(&map->active.in_mutex);
+   mutex_lock(&map->active.out_mutex);
+   pvcalls_front_free_map(bedata, map);
+   mutex_unlock(&map->active.out_mutex);
+   mutex_unlock(&map->active.in_mutex);
+   kfree(map);
+   } else {
+   spin_lock(&bedata->pvcallss_lock);
+   list_del_init(&map->list);
+   kfree(map);
+   spin_unlock(&bedata->pvcallss_lock);
+   }
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+
+   return 0;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 25e05b8..3332978 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -23,5 +23,6 @@ int pvcalls_front_recvmsg(struct socket *sock,
 unsigned int pvcalls_front_poll(struct file *file,
struct socket *sock,
poll_table *wait);
+int pvcalls_front_release(struct socket *sock);
 
 #endif
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 06/13] xen/pvcalls: implement listen command

2017-07-21 Thread Stefano Stabellini
Send PVCALLS_LISTEN to the backend.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 49 +
 drivers/xen/pvcalls-front.h |  1 +
 2 files changed, 50 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 71619bc..80fd5fb 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -361,6 +361,55 @@ int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
return 0;
 }
 
+int pvcalls_front_listen(struct socket *sock, int backlog)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (map->passive.status != PVCALLS_STATUS_BIND)
+   return -EOPNOTSUPP;
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   bedata->rsp[req_id].req_id != PVCALLS_INVALID_ID) {
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_LISTEN;
+   req->u.listen.id = (uint64_t) sock;
+   req->u.listen.backlog = backlog;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
+   spin_unlock(&bedata->pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   map->passive.status = PVCALLS_STATUS_LISTEN;
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+   return ret;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 8b0a274..aa8fe10 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -9,5 +9,6 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
 int pvcalls_front_bind(struct socket *sock,
   struct sockaddr *addr,
   int addr_len);
+int pvcalls_front_listen(struct socket *sock, int backlog);
 
 #endif
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 08/13] xen/pvcalls: implement sendmsg

2017-07-21 Thread Stefano Stabellini
Send data to an active socket by copying data to the "out" ring. Take
the active socket out_mutex so that only one function can access the
ring at any given time.

If not enough room is available on the ring, rather than returning
immediately or sleep-waiting, spin for up to 5000 cycles. This small
optimization turns out to improve performance significantly.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 109 
 drivers/xen/pvcalls-front.h |   3 ++
 2 files changed, 112 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index f3a04a2..bf29f40 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -27,6 +27,7 @@
 #define PVCALLS_INVALID_ID (UINT_MAX)
 #define RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
 #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
+#define PVCALLS_FRON_MAX_SPIN 5000
 
 struct pvcalls_bedata {
struct xen_pvcalls_front_ring ring;
@@ -77,6 +78,22 @@ struct sock_mapping {
};
 };
 
+static int pvcalls_front_write_todo(struct sock_mapping *map)
+{
+   struct pvcalls_data_intf *intf = map->active.ring;
+   RING_IDX cons, prod, size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->out_cons;
+   prod = intf->out_prod;
+   error = intf->out_error;
+   if (error == -ENOTCONN)
+   return 0;
+   if (error != 0)
+   return error;
+   return size - pvcalls_queued(prod, cons, size);
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
struct xenbus_device *dev = dev_id;
@@ -304,6 +321,98 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
return ret;
 }
 
+static int __write_ring(struct pvcalls_data_intf *intf,
+   struct pvcalls_data *data,
+   struct iov_iter *msg_iter,
+   size_t len)
+{
+   RING_IDX cons, prod, size, masked_prod, masked_cons;
+   RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->out_cons;
+   prod = intf->out_prod;
+   error = intf->out_error;
+   /* read indexes before continuing */
+   virt_mb();
+
+   if (error < 0)
+   return error;
+
+   size = pvcalls_queued(prod, cons, array_size);
+   if (size >= array_size)
+   return 0;
+   if (len > array_size - size)
+   len = array_size - size;
+
+   masked_prod = pvcalls_mask(prod, array_size);
+   masked_cons = pvcalls_mask(cons, array_size);
+
+   if (masked_prod < masked_cons) {
+   copy_from_iter(data->out + masked_prod, len, msg_iter);
+   } else {
+   if (len > array_size - masked_prod) {
+   copy_from_iter(data->out + masked_prod,
+  array_size - masked_prod, msg_iter);
+   copy_from_iter(data->out,
+  len - (array_size - masked_prod),
+  msg_iter);
+   } else {
+   copy_from_iter(data->out + masked_prod, len, msg_iter);
+   }
+   }
+   /* write to ring before updating pointer */
+   virt_wmb();
+   intf->out_prod += len;
+
+   return len;
+}
+
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
+ size_t len)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   int sent = 0, tot_sent = 0;
+   int count = 0, flags;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   flags = msg->msg_flags;
+   if (flags & (MSG_CONFIRM|MSG_DONTROUTE|MSG_EOR|MSG_OOB))
+   return -EOPNOTSUPP;
+
+   mutex_lock(&map->active.out_mutex);
+   if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) {
+   mutex_unlock(&map->active.out_mutex);
+   return -EAGAIN;
+   }
+
+again:
+   count++;
+   sent = __write_ring(map->active.ring,
+   &map->active.data, &msg->msg_iter,
+   len);
+   if (sent > 0) {
+   len -= sent;
+   tot_sent += sent;
+   notify_remote_via_irq(map->active.irq);
+   }
+   if (sent >= 0 && len > 0 && count < PVCALLS_FRON_MAX_SPIN)
+   goto again;
+   if (sent < 0)
+   tot_sent = sent;
+
+   mutex_unlock(&map->active.out_mutex);
+   return tot_sent;
+}
+
 int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 {
struct pvcalls_bedata *bedata;

[Xen-devel] [PATCH v1 13/13] xen: introduce a Kconfig option to enable the pvcalls frontend

2017-07-21 Thread Stefano Stabellini
Also add pvcalls-front to the Makefile.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/Kconfig  | 9 +
 drivers/xen/Makefile | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 4545561..ea5e99f 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -196,6 +196,15 @@ config XEN_PCIDEV_BACKEND
 
  If in doubt, say m.
 
+config XEN_PVCALLS_FRONTEND
+   bool "XEN PV Calls frontend driver"
+   depends on INET && XEN
+   help
+ Experimental frontend for the Xen PV Calls protocol
+ (https://xenbits.xen.org/docs/unstable/misc/pvcalls.html). It
+ sends a small set of POSIX calls to the backend, which
+ implements them.
+
 config XEN_PVCALLS_BACKEND
bool "XEN PV Calls backend driver"
depends on INET && XEN && XEN_BACKEND
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 480b928..afb9e03 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_XEN_EFI) += efi.o
 obj-$(CONFIG_XEN_SCSI_BACKEND) += xen-scsiback.o
 obj-$(CONFIG_XEN_AUTO_XLATE)   += xlate_mmu.o
 obj-$(CONFIG_XEN_PVCALLS_BACKEND)  += pvcalls-back.o
+obj-$(CONFIG_XEN_PVCALLS_FRONTEND) += pvcalls-front.o
 xen-evtchn-y   := evtchn.o
 xen-gntdev-y   := gntdev.o
 xen-gntalloc-y := gntalloc.o
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 00/13] introduce the Xen PV Calls frontend

2017-07-21 Thread Stefano Stabellini
Hi all,

this series introduces the frontend for the newly introduced PV Calls
protocol.

PV Calls is a paravirtualized protocol that allows the implementation of
a set of POSIX functions in a different domain. The PV Calls frontend
sends POSIX function calls to the backend, which implements them, acts
on them, and returns a value to the frontend.

For more information about PV Calls, please read:

https://xenbits.xen.org/docs/unstable/misc/pvcalls.html

This patch series only implements the frontend driver. It doesn't
attempt to redirect POSIX calls to it. The functions exported in
pvcalls-front.h are meant to be used for that. A separate patch series
will be sent to use them and hook them into the system.


Stefano Stabellini (13):
  xen/pvcalls: introduce the pvcalls xenbus frontend
  xen/pvcalls: connect to the backend
  xen/pvcalls: implement socket command and handle events
  xen/pvcalls: implement connect command
  xen/pvcalls: implement bind command
  xen/pvcalls: implement listen command
  xen/pvcalls: implement accept command
  xen/pvcalls: implement sendmsg
  xen/pvcalls: implement recvmsg
  xen/pvcalls: implement poll command
  xen/pvcalls: implement release command
  xen/pvcalls: implement frontend disconnect
  xen: introduce a Kconfig option to enable the pvcalls frontend

 drivers/xen/Kconfig |9 +
 drivers/xen/Makefile|1 +
 drivers/xen/pvcalls-front.c | 1097 +++
 drivers/xen/pvcalls-front.h |   28 ++
 4 files changed, 1135 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.c
 create mode 100644 drivers/xen/pvcalls-front.h

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 05/13] xen/pvcalls: implement bind command

2017-07-21 Thread Stefano Stabellini
Send PVCALLS_BIND to the backend. Introduce a new structure, part of
struct sock_mapping, to store information specific to passive sockets.

Introduce a status field to keep track of the status of the passive
socket.

Introduce a waitqueue for the "accept" command (see the accept command
implementation): it is used to allow only one outstanding accept
command at any given time and to implement polling on the passive
socket. Introduce a flags field to keep track of in-flight accept and
poll commands.

sock->sk->sk_send_head is not used for ip sockets: reuse the field to
store a pointer to the struct sock_mapping corresponding to the socket.

Convert the struct socket pointer into a uint64_t and use it as the id for
the socket to pass to the backend.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 74 +
 drivers/xen/pvcalls-front.h |  3 ++
 2 files changed, 77 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0d305e0..71619bc 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -57,6 +57,23 @@ struct sock_mapping {
 
wait_queue_head_t inflight_conn_req;
} active;
+   struct {
+   /* Socket status */
+#define PVCALLS_STATUS_UNINITALIZED  0
+#define PVCALLS_STATUS_BIND  1
+#define PVCALLS_STATUS_LISTEN2
+   uint8_t status;
+   /*
+* Internal state-machine flags.
+* Only one accept operation can be inflight for a socket.
+* Only one poll operation can be inflight for a given socket.
+*/
+#define PVCALLS_FLAG_ACCEPT_INFLIGHT 0
+#define PVCALLS_FLAG_POLL_INFLIGHT   1
+#define PVCALLS_FLAG_POLL_RET2
+   uint8_t flags;
+   wait_queue_head_t inflight_accept_req;
+   } passive;
};
 };
 
@@ -287,6 +304,63 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
return ret;
 }
 
+int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
+   return -ENOTSUPP;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   map = kzalloc(sizeof(*map), GFP_KERNEL);
+   if (map == NULL)
+   return -ENOMEM;
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
+   kfree(map);
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   map->sock = sock;
+   req->cmd = PVCALLS_BIND;
+   req->u.bind.id = (uint64_t) sock;
+   memcpy(req->u.bind.addr, addr, sizeof(*addr));
+   req->u.bind.len = addr_len;
+
+   init_waitqueue_head(>passive.inflight_accept_req);
+
+   list_add_tail(>list, >socketpass_mappings);
+   WRITE_ONCE(sock->sk->sk_send_head, (void *)map);
+   map->active_socket = false;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
+   spin_unlock(>pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   map->passive.status = PVCALLS_STATUS_BIND;
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+   return ret;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 63b0417..8b0a274 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -6,5 +6,8 @@
 int pvcalls_front_socket(struct socket *sock);
 int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
  int addr_len, int flags);
+int pvcalls_front_bind(struct socket *sock,
+  struct sockaddr *addr,
+  int addr_len);
 
 #endif
-- 
1.9.1




[Xen-devel] [PATCH v1 01/13] xen/pvcalls: introduce the pvcalls xenbus frontend

2017-07-21 Thread Stefano Stabellini
Introduce a xenbus frontend for the pvcalls protocol, as defined by
https://xenbits.xen.org/docs/unstable/misc/pvcalls.html.

This patch only adds the stubs; the code will be added by the following
patches.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 68 +
 1 file changed, 68 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.c

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
new file mode 100644
index 000..173e204
--- /dev/null
+++ b/drivers/xen/pvcalls-front.c
@@ -0,0 +1,68 @@
+/*
+ * (c) 2017 Stefano Stabellini 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+
+#include <xen/events.h>
+#include <xen/grant_table.h>
+#include <xen/xen.h>
+#include <xen/xenbus.h>
+#include <xen/interface/io/pvcalls.h>
+
+static const struct xenbus_device_id pvcalls_front_ids[] = {
+   { "pvcalls" },
+   { "" }
+};
+
+static int pvcalls_front_remove(struct xenbus_device *dev)
+{
+   return 0;
+}
+
+static int pvcalls_front_probe(struct xenbus_device *dev,
+ const struct xenbus_device_id *id)
+{
+   return 0;
+}
+
+static int pvcalls_front_resume(struct xenbus_device *dev)
+{
+   dev_warn(&dev->dev, "suspend/resume unsupported\n");
+   return 0;
+}
+
+static void pvcalls_front_changed(struct xenbus_device *dev,
+   enum xenbus_state backend_state)
+{
+}
+
+static struct xenbus_driver pvcalls_front_driver = {
+   .ids = pvcalls_front_ids,
+   .probe = pvcalls_front_probe,
+   .remove = pvcalls_front_remove,
+   .resume = pvcalls_front_resume,
+   .otherend_changed = pvcalls_front_changed,
+};
+
+static int __init pvcalls_frontend_init(void)
+{
+   if (!xen_domain())
+   return -ENODEV;
+
+   pr_info("Initialising Xen pvcalls frontend driver\n");
+
+   return xenbus_register_frontend(&pvcalls_front_driver);
+}
+
+module_init(pvcalls_frontend_init);
-- 
1.9.1




[Xen-devel] [PATCH v1 10/13] xen/pvcalls: implement poll command

2017-07-21 Thread Stefano Stabellini
For active sockets, check the indexes and use the inflight_conn_req
waitqueue to wait.

For passive sockets, send PVCALLS_POLL to the backend. Use the
inflight_accept_req waitqueue if an accept is outstanding. Otherwise use
the inflight_req waitqueue: inflight_req is woken when a new response
is received; on wakeup we check whether the POLL response has arrived by
looking at the PVCALLS_FLAG_POLL_RET flag. We set the flag from
pvcalls_front_event_handler, if the response was for a POLL command.

In pvcalls_front_event_handler, get the struct socket pointer from the
poll id (we previously converted struct socket* to uint64_t and used it
as id).
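
As a rough standalone illustration of the single-in-flight gating described
above (C11 atomics standing in for the kernel's test_and_set_bit/clear_bit;
names are illustrative, not the driver's):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_flag poll_inflight = ATOMIC_FLAG_INIT;

/* mirrors test_and_set_bit(PVCALLS_FLAG_POLL_INFLIGHT, ...): true means
 * this caller now owns the single in-flight poll request */
static bool try_claim_poll(void)
{
	return !atomic_flag_test_and_set(&poll_inflight);
}

int main(void)
{
	printf("first claim:  %d\n", try_claim_poll());	/* 1: issue request */
	printf("second claim: %d\n", try_claim_poll());	/* 0: wait instead */
	atomic_flag_clear(&poll_inflight);	/* as on the PVCALLS_POLL response */
	return 0;
}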

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 123 
 drivers/xen/pvcalls-front.h |   3 ++
 2 files changed, 115 insertions(+), 11 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 3d1041a..b6cfb7d 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -128,17 +128,29 @@ static irqreturn_t pvcalls_front_event_handler(int irq, 
void *dev_id)
rsp = RING_GET_RESPONSE(&bedata->ring, bedata->ring.rsp_cons);
 
req_id = rsp->req_id;
-   src = (uint8_t *)&bedata->rsp[req_id];
-   src += sizeof(rsp->req_id);
-   dst = (uint8_t *)rsp;
-   dst += sizeof(rsp->req_id);
-   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
-   /*
-* First copy the rest of the data, then req_id. It is
-* paired with the barrier when accessing bedata->rsp.
-*/
-   smp_wmb();
-   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+   if (rsp->cmd == PVCALLS_POLL) {
+   struct socket *sock = (struct socket *) rsp->u.poll.id;
+   struct sock_mapping *map =
+   (struct sock_mapping *)
+   READ_ONCE(sock->sk->sk_send_head);
+
+   set_bit(PVCALLS_FLAG_POLL_RET,
+   (void *)&map->passive.flags);
+   clear_bit(PVCALLS_FLAG_POLL_INFLIGHT,
+ (void *)&map->passive.flags);
+   } else {
+   src = (uint8_t *)&bedata->rsp[req_id];
+   src += sizeof(rsp->req_id);
+   dst = (uint8_t *)rsp;
+   dst += sizeof(rsp->req_id);
+   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
+   /*
+* First copy the rest of the data, then req_id. It is
+* paired with the barrier when accessing bedata->rsp.
+*/
+   smp_wmb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+   }
 
bedata->ring.rsp_cons++;
wake_up(&bedata->inflight_req);
@@ -704,6 +716,95 @@ int pvcalls_front_accept(struct socket *sock, struct 
socket *newsock, int flags)
return ret;
 }
 
+static unsigned int pvcalls_front_poll_passive(struct file *file,
+  struct pvcalls_bedata *bedata,
+  struct sock_mapping *map,
+  poll_table *wait)
+{
+   int notify, req_id;
+   struct xen_pvcalls_request *req;
+
+   if (test_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+(void *)&map->passive.flags)) {
+   poll_wait(file, &map->passive.inflight_accept_req, wait);
+   return 0;
+   }
+
+   if (test_and_clear_bit(PVCALLS_FLAG_POLL_RET,
+  (void *)&map->passive.flags))
+   return POLLIN;
+
+   if (test_and_set_bit(PVCALLS_FLAG_POLL_INFLIGHT,
+(void *)&map->passive.flags)) {
+   poll_wait(file, &bedata->inflight_req, wait);
+   return 0;
+   }
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_POLL;
+   req->u.poll.id = (uint64_t) map->sock;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
+   spin_unlock(&bedata->pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   poll_wait(file, &bedata->inflight_req, wait);
+   return 0;
+}
+
+static unsigned int pvcalls_front_poll_active(struct file *file,
+

[Xen-devel] [PATCH v1 03/13] xen/pvcalls: implement socket command and handle events

2017-07-21 Thread Stefano Stabellini
Send a PVCALLS_SOCKET command to the backend, use the masked
req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0
and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array
ready for the response, and there cannot be two outstanding responses
with the same req_id.

Wait for the response by waiting on the inflight_req waitqueue and
check for the req_id field in rsp[req_id]. Use atomic accesses to
read the field. Once a response is received, clear the corresponding rsp
slot by setting req_id to PVCALLS_INVALID_ID. Note that
PVCALLS_INVALID_ID is invalid only from the frontend point of view. It
is not part of the PVCalls protocol.

pvcalls_front_event_handler is in charge of copying responses from the
ring to the appropriate rsp slot. It is done by copying the body of the
response first, then by copying req_id atomically. After the copies,
wake up anybody waiting on the waitqueue.

pvcallss_lock protects accesses to the ring.
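
The ordering contract can be condensed into a small standalone sketch, with
C11 acquire/release standing in for the smp_wmb()/smp_mb() pairing (names
and sizes are illustrative, not the driver's):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define PVCALLS_INVALID_ID UINT32_MAX

struct rsp_slot {
	_Atomic uint32_t req_id;	/* written last, read first */
	int ret;			/* response body */
};

/* writer (event handler): body first, then req_id with release order */
static void publish(struct rsp_slot *s, uint32_t id, int ret)
{
	s->ret = ret;
	atomic_store_explicit(&s->req_id, id, memory_order_release);
}

/* reader (command issuer): req_id first with acquire order, then body */
static int consume(struct rsp_slot *s, uint32_t id)
{
	while (atomic_load_explicit(&s->req_id, memory_order_acquire) != id)
		;	/* the real code sleeps on inflight_req instead */
	int ret = s->ret;
	/* mark the slot reusable, as done with PVCALLS_INVALID_ID */
	atomic_store_explicit(&s->req_id, PVCALLS_INVALID_ID,
			      memory_order_release);
	return ret;
}

int main(void)
{
	struct rsp_slot slot = { PVCALLS_INVALID_ID, 0 };

	publish(&slot, 3, 0);
	printf("ret=%d\n", consume(&slot, 3));
	return 0;
}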

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 91 +
 drivers/xen/pvcalls-front.h |  8 
 2 files changed, 99 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.h

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index fb08ebf..7933c73 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -40,9 +40,100 @@ struct pvcalls_bedata {
 
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
+   struct xenbus_device *dev = dev_id;
+   struct pvcalls_bedata *bedata;
+   struct xen_pvcalls_response *rsp;
+   uint8_t *src, *dst;
+   int req_id = 0, more = 0;
+
+   if (dev == NULL)
+   return IRQ_HANDLED;
+
+   bedata = dev_get_drvdata(>dev);
+   if (bedata == NULL)
+   return IRQ_HANDLED;
+
+again:
+   while (RING_HAS_UNCONSUMED_RESPONSES(&bedata->ring)) {
+   rsp = RING_GET_RESPONSE(&bedata->ring, bedata->ring.rsp_cons);
+
+   req_id = rsp->req_id;
+   src = (uint8_t *)&bedata->rsp[req_id];
+   src += sizeof(rsp->req_id);
+   dst = (uint8_t *)rsp;
+   dst += sizeof(rsp->req_id);
+   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
+   /*
+* First copy the rest of the data, then req_id. It is
+* paired with the barrier when accessing bedata->rsp.
+*/
+   smp_wmb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+
+   bedata->ring.rsp_cons++;
+   wake_up(&bedata->inflight_req);
+   }
+
+   RING_FINAL_CHECK_FOR_RESPONSES(&bedata->ring, more);
+   if (more)
+   goto again;
return IRQ_HANDLED;
 }
 
+int pvcalls_front_socket(struct socket *sock)
+{
+   struct pvcalls_bedata *bedata;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -EACCES;
+   /*
+* PVCalls only supports domain AF_INET,
+* type SOCK_STREAM and protocol 0 sockets for now.
+*
+* Check socket type here, AF_INET and protocol checks are done
+* by the caller.
+*/
+   if (sock->type != SOCK_STREAM)
+   return -ENOTSUPP;
+
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_SOCKET;
+   req->u.socket.id = (uint64_t) sock;
+   req->u.socket.domain = AF_INET;
+   req->u.socket.type = SOCK_STREAM;
+   req->u.socket.protocol = 0;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
+   spin_unlock(&bedata->pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   if (wait_event_interruptible(bedata->inflight_req,
+   READ_ONCE(bedata->rsp[req_id].req_id) == req_id) != 0)
+   return -EINTR;
+
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+
+   return ret;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
new file mode 100644
index 000..b7dabed
--- /dev/null
+++ b/drivers/xen/pvcalls-front.h
@@ -0,0 +1,8 @@
+#ifndef __PVCALLS_FRONT_H__
+#define __PVCALLS_FRONT_H__
+
+#include <linux/net.h>
+

[Xen-devel] [PATCH v1 02/13] xen/pvcalls: connect to the backend

2017-07-21 Thread Stefano Stabellini
Implement the probe function for the pvcalls frontend. Read the
supported versions, max-page-order and function-calls nodes from
xenstore.

Introduce a data structure named pvcalls_bedata. It contains pointers to
the command ring, the event channel, a list of active sockets and a list
of passive sockets. List accesses are protected by a spinlock.

Introduce a waitqueue to allow waiting for a response on commands sent
to the backend.

Introduce an array of struct xen_pvcalls_response to store command
responses.

Only one frontend<->backend connection is supported at any given time
for a guest. Store the active frontend device in a static pointer.

Introduce a stub function for the event handler.
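
The negotiation amounts to three checks; a standalone sketch with the
xenstore reads replaced by plain parameters (the RING_ORDER value here is
an assumption for the example):

#include <stdio.h>
#include <string.h>

#define RING_ORDER 4	/* assumed frontend requirement */

static int check_backend(const char *versions, unsigned int max_page_order,
			 unsigned int function_calls)
{
	if (strcmp(versions, "1"))		/* only protocol version 1 */
		return -1;
	if (max_page_order < RING_ORDER)	/* backend ring must be big enough */
		return -1;
	if (function_calls != 1)		/* function-calls feature required */
		return -1;
	return 0;
}

int main(void)
{
	printf("%d\n", check_backend("1", 9, 1));	/* 0: acceptable */
	printf("%d\n", check_backend("2", 9, 1));	/* -1: rejected */
	return 0;
}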

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 153 
 1 file changed, 153 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 173e204..fb08ebf 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -20,6 +20,29 @@
 #include <xen/xenbus.h>
 #include <xen/interface/io/pvcalls.h>
 
+#define PVCALLS_INVALID_ID (UINT_MAX)
+#define RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
+#define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
+
+struct pvcalls_bedata {
+   struct xen_pvcalls_front_ring ring;
+   grant_ref_t ref;
+   int irq;
+
+   struct list_head socket_mappings;
+   struct list_head socketpass_mappings;
+   spinlock_t pvcallss_lock;
+
+   wait_queue_head_t inflight_req;
+   struct xen_pvcalls_response rsp[PVCALLS_NR_REQ_PER_RING];
+};
+struct xenbus_device *pvcalls_front_dev;
+
+static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
+{
+   return IRQ_HANDLED;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
@@ -33,7 +56,114 @@ static int pvcalls_front_remove(struct xenbus_device *dev)
 static int pvcalls_front_probe(struct xenbus_device *dev,
  const struct xenbus_device_id *id)
 {
+   int ret = -EFAULT, evtchn, ref = -1, i;
+   unsigned int max_page_order, function_calls, len;
+   char *versions;
+   grant_ref_t gref_head = 0;
+   struct xenbus_transaction xbt;
+   struct pvcalls_bedata *bedata = NULL;
+   struct xen_pvcalls_sring *sring;
+
+   if (pvcalls_front_dev != NULL) {
+   dev_err(>dev, "only one PV Calls connection supported\n");
+   return -EINVAL;
+   }
+
+   versions = xenbus_read(XBT_NIL, dev->otherend, "versions", &len);
+   if (!len)
+   return -EINVAL;
+   if (strcmp(versions, "1")) {
+   kfree(versions);
+   return -EINVAL;
+   }
+   kfree(versions);
+   ret = xenbus_scanf(XBT_NIL, dev->otherend,
+  "max-page-order", "%u", &max_page_order);
+   if (ret <= 0)
+   return -ENODEV;
+   if (max_page_order < RING_ORDER)
+   return -ENODEV;
+   ret = xenbus_scanf(XBT_NIL, dev->otherend,
+  "function-calls", "%u", &function_calls);
+   if (ret <= 0 || function_calls != 1)
+   return -ENODEV;
+   pr_info("%s max-page-order is %u\n", __func__, max_page_order);
+
+   bedata = kzalloc(sizeof(struct pvcalls_bedata), GFP_KERNEL);
+   if (!bedata)
+   return -ENOMEM;
+
+   init_waitqueue_head(&bedata->inflight_req);
+   for (i = 0; i < PVCALLS_NR_REQ_PER_RING; i++)
+   bedata->rsp[i].req_id = PVCALLS_INVALID_ID;
+
+   sring = (struct xen_pvcalls_sring *) __get_free_page(GFP_KERNEL |
+__GFP_ZERO);
+   if (!sring)
+   goto error;
+   SHARED_RING_INIT(sring);
+   FRONT_RING_INIT(&bedata->ring, sring, XEN_PAGE_SIZE);
+
+   ret = xenbus_alloc_evtchn(dev, &evtchn);
+   if (ret)
+   goto error;
+
+   bedata->irq = bind_evtchn_to_irqhandler(evtchn,
+   pvcalls_front_event_handler,
+   0, "pvcalls-frontend", dev);
+   if (bedata->irq < 0) {
+   ret = bedata->irq;
+   goto error;
+   }
+
+   ret = gnttab_alloc_grant_references(1, &gref_head);
+   if (ret < 0)
+   goto error;
+   bedata->ref = ref = gnttab_claim_grant_reference(&gref_head);
+   if (ref < 0)
+   goto error;
+   gnttab_grant_foreign_access_ref(ref, dev->otherend_id,
+   virt_to_gfn((void *)sring), 0);
+
+ again:
+   ret = xenbus_transaction_start(&xbt);
+   if (ret) {
+   xenbus_dev_fatal(dev, ret, "starting transaction");
+   goto error;
+   }
+   ret = xenbus_printf(xbt, dev->nodename, "version", "%u", 1);
+   if (ret)
+   goto error_xenbus;
+   ret = xenbus_printf(xbt, 

[Xen-devel] [PATCH v1 07/13] xen/pvcalls: implement accept command

2017-07-21 Thread Stefano Stabellini
Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make
sure that only one accept command is executed at any given time by
setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the
inflight_accept_req waitqueue.

sock->sk->sk_send_head is not used for IP sockets: reuse the field to
store a pointer to the struct sock_mapping corresponding to the socket.

Convert the new struct socket pointer into a uint64_t and use it as id
for the new socket to pass to the backend.
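
A standalone sketch of the one-accept-at-a-time gate (pthreads standing in
for the kernel's bitops and waitqueue; illustrative only):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq = PTHREAD_COND_INITIALIZER;
static bool accept_inflight;

/* like the wait_event_interruptible(... test_and_set_bit ...) dance:
 * sleep until we win the single accept slot */
static void claim_accept_slot(void)
{
	pthread_mutex_lock(&m);
	while (accept_inflight)
		pthread_cond_wait(&wq, &m);
	accept_inflight = true;
	pthread_mutex_unlock(&m);
}

/* like clear_bit() followed by wake_up(&...inflight_accept_req) */
static void release_accept_slot(void)
{
	pthread_mutex_lock(&m);
	accept_inflight = false;
	pthread_mutex_unlock(&m);
	pthread_cond_broadcast(&wq);
}

int main(void)
{
	claim_accept_slot();	/* first accept proceeds */
	release_accept_slot();	/* response arrived; waiters may retry */
	printf("accept slot cycled\n");
	return 0;
}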

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 79 +
 drivers/xen/pvcalls-front.h |  3 ++
 2 files changed, 82 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 80fd5fb..f3a04a2 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -410,6 +410,85 @@ int pvcalls_front_listen(struct socket *sock, int backlog)
return ret;
 }
 
+int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int 
flags)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   struct sock_mapping *map2 = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret, evtchn;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (map->passive.status != PVCALLS_STATUS_LISTEN)
+   return -EINVAL;
+
+   /*
+* Backend only supports 1 inflight accept request, will return
+* errors for the others
+*/
+   if (test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+(void *)&map->passive.flags)) {
+   if (wait_event_interruptible(map->passive.inflight_accept_req,
+   !test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+ (void *)&map->passive.flags))
+   != 0)
+   return -EINTR;
+   }
+
+
+   newsock->sk = kzalloc(sizeof(*newsock->sk), GFP_KERNEL);
+   if (newsock->sk == NULL)
+   return -ENOMEM;
+
+   spin_lock(&bedata->pvcallss_lock);
+   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
+   BUG_ON(req_id >= PVCALLS_NR_REQ_PER_RING);
+   if (RING_FULL(&bedata->ring) ||
+   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
+   spin_unlock(&bedata->pvcallss_lock);
+   return -EAGAIN;
+   }
+
+   map2 = create_active(&evtchn);
+
+   req = RING_GET_REQUEST(&bedata->ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_ACCEPT;
+   req->u.accept.id = (uint64_t) sock;
+   req->u.accept.ref = map2->active.ref;
+   req->u.accept.id_new = (uint64_t) newsock;
+   req->u.accept.evtchn = evtchn;
+
+   list_add_tail(&map2->list, &bedata->socket_mappings);
+   WRITE_ONCE(newsock->sk->sk_send_head, (void *)map2);
+   map2->sock = newsock;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
+   spin_unlock(&bedata->pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, (void *)&map->passive.flags);
+   wake_up(&map->passive.inflight_accept_req);
+
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+   return ret;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index aa8fe10..ab4f1da 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -10,5 +10,8 @@ int pvcalls_front_bind(struct socket *sock,
   struct sockaddr *addr,
   int addr_len);
 int pvcalls_front_listen(struct socket *sock, int backlog);
+int pvcalls_front_accept(struct socket *sock,
+struct socket *newsock,
+int flags);
 
 #endif
-- 
1.9.1




[Xen-devel] [PATCH v1 12/13] xen/pvcalls: implement frontend disconnect

2017-07-21 Thread Stefano Stabellini
Implement pvcalls frontend removal function. Go through the list of
active and passive sockets and free them all, one at a time.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index bd3dfac..fcc15fb 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -898,6 +898,34 @@ int pvcalls_front_release(struct socket *sock)
 
 static int pvcalls_front_remove(struct xenbus_device *dev)
 {
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL, *n;
+
+   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+   list_for_each_entry_safe(map, n, &bedata->socket_mappings, list) {
+   mutex_lock(&map->active.in_mutex);
+   mutex_lock(&map->active.out_mutex);
+   pvcalls_front_free_map(bedata, map);
+   mutex_unlock(&map->active.out_mutex);
+   mutex_unlock(&map->active.in_mutex);
+   kfree(map);
+   }
+   list_for_each_entry_safe(map, n, &bedata->socketpass_mappings, list) {
+   spin_lock(&bedata->pvcallss_lock);
+   list_del_init(&map->list);
+   spin_unlock(&bedata->pvcallss_lock);
+   kfree(map);
+   }
+   if (bedata->irq > 0)
+   unbind_from_irqhandler(bedata->irq, dev);
+   if (bedata->ref >= 0)
+   gnttab_end_foreign_access(bedata->ref, 0, 0);
+   free_page((unsigned long)bedata->ring.sring);
+   kfree(bedata);
+   dev_set_drvdata(>dev, NULL);
+   xenbus_switch_state(dev, XenbusStateClosed);
+   pvcalls_front_dev = NULL;
return 0;
 }
 
-- 
1.9.1




Re: [Xen-devel] Question about hvm_monitor_interrupt

2017-07-21 Thread Razvan Cojocaru
On 07/22/2017 12:33 AM, Tamas K Lengyel wrote:
> Hey Razvan,

Hello,

> the vm_event that is being generated by doing
> VM_EVENT_FLAG_GET_NEXT_INTERRUPT sends almost all required information
> about the interrupt to the listener to allow it to get reinjected,
> except the instruction length. If the listener wants to reinject the
> interrupt to the guest via xc_hvm_inject_trap the instruction length
> is something needing to be specified. So shouldn't that information be
> included in the vm_event?

We only care about requesting guest page faults (TRAP_page_fault), so
that we may be able to inspect things like swapped-out pages; for that
purpose the instruction length is not necessary. Having said that,
there's nothing against adding the instruction length to the vm_event if
you need it.


Thanks,
Razvan



[Xen-devel] Question about hvm_monitor_interrupt

2017-07-21 Thread Tamas K Lengyel
Hey Razvan,
the vm_event that is being generated by doing
VM_EVENT_FLAG_GET_NEXT_INTERRUPT sends almost all required information
about the interrupt to the listener to allow it to get reinjected,
except the instruction length. If the listener wants to reinject the
interrupt to the guest via xc_hvm_inject_trap, the instruction length
is something that needs to be specified. So shouldn't that information be
included in the vm_event?

Thanks,
Tamas



[Xen-devel] [libvirt test] 112081: tolerable all pass - PUSHED

2017-07-21 Thread osstest service owner
flight 112081 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112081/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112036
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112036
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112036
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  e04d1074f801a211e2767545e2816cc98d820dd3
baseline version:
 libvirt  9af764e86aef7dfb0191a9561bf1d1abf941da05

Last test of basis   112036  2017-07-20 04:21:29 Z1 days
Testing same since   112081  2017-07-21 04:21:50 Z0 days1 attempts


People who touched revisions under test:
  Antoine Millet 
  Chen Hanxiao 
  Cole Robinson 
  Erik Skultety 
  Hao Peng 
  John Ferlan 
  Michal Privoznik 
  Pavel Hrdina 
  Peng Hao 
  Peter Krempa 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of 

Re: [Xen-devel] [GIT PULL] xen: features and fixes for 4.13-rc2

2017-07-21 Thread Linus Torvalds
On Fri, Jul 21, 2017 at 3:17 AM, Juergen Gross  wrote:
>  drivers/xen/pvcalls-back.c | 1236 
> 

This really doesn't look like a fix.

The merge window is over.

So I'm not pulling this without way more explanations of why I should.

 Linus



Re: [Xen-devel] [PATCH] xen: selfballoon: remove unnecessary static in frontswap_selfshrink()

2017-07-21 Thread Gustavo A. R. Silva

Hi Juergen,

On 07/21/2017 02:36 AM, Juergen Gross wrote:

On 04/07/17 20:34, Gustavo A. R. Silva wrote:

Remove the unnecessary static on the local variables last_frontswap_pages
and tgt_frontswap_pages. Both variables are initialized before being used,
on every execution path throughout the function, so the statics have no
benefit and removing them reduces the code size.

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
 when strict
?x = e;
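
To make the pattern concrete, a small standalone example of the kind of
code the rule matches and how it rewrites it (made-up function names, not
taken from the driver):

#include <stdio.h>

static unsigned long compute_target(void) { return 42; }

/* before: 'static' keeps the value in .data/.bss although it is
 * reassigned on every call, so it buys nothing */
static unsigned long shrink_before(void)
{
	static unsigned long tgt_pages;

	tgt_pages = compute_target();
	return tgt_pages;
}

/* after: a plain automatic variable, as the semantic patch rewrites it */
static unsigned long shrink_after(void)
{
	unsigned long tgt_pages = compute_target();

	return tgt_pages;
}

int main(void)
{
	printf("%lu %lu\n", shrink_before(), shrink_after());
	return 0;
}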

You can see a significant difference in the code size after executing
the size command, before and after the code change:

before:
   text    data     bss     dec     hex filename
   5633    3452     384    9469    24fd drivers/xen/xen-selfballoon.o

after:
   text    data     bss     dec     hex filename
   5576    3308     256    9140    23b4 drivers/xen/xen-selfballoon.o

Signed-off-by: Gustavo A. R. Silva 


Reviewed-by: Juergen Gross 



Thank you!

--
Gustavo A. R. Silva



Re: [Xen-devel] [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)

2017-07-21 Thread George Dunlap
On Fri, Jul 21, 2017 at 8:55 PM, Dario Faggioli
 wrote:
> On Fri, 2017-07-21 at 18:19 +0100, George Dunlap wrote:
>> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
>> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
>> > index 4f6330e..85e014d 100644
>> > --- a/xen/common/sched_credit.c
>> > +++ b/xen/common/sched_credit.c
>> > @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct
>> > csched_vcpu *new)
>> >  idlers_empty = cpumask_empty(_mask);
>> >
>> >  /*
>> > + * Exclusive pinning is when a vcpu has hard-affinity with
>> > only one
>> > + * cpu, and there is no other vcpu that has hard-affinity with
>> > that
>> > + * same cpu. This is infrequent, but if it happens, is for
>> > achieving
>> > + * the most possible determinism, and least possible overhead
>> > for
>> > + * the vcpus in question.
>> > + *
>> > + * Try to identify the vast majority of these situations, and
>> > deal
>> > + * with them quickly.
>> > + */
>> > +if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity)
>> > == cpu &&
>>
>> Won't this check entail a full "loop" of the cpumask?  It's cheap
>> enough
>> if nr_cpu_ids is small; but don't we support (theoretically) 4096
>> logical cpus?
>>
>> It seems like having a vcpu flag that identifies a vcpu as being
>> pinned
>> would be a more efficient way to do this.  That way we could run this
>> check once whenever the hard affinity changed, rather than every time
>> we
>> want to think about where to run this vcpu.
>>
>> What do you think?
>>
> Right. We actually should get some help from the hardware (ffs &
> friends)... but I think you're right. Implementing this with a flag, as
> you're suggesting, is most likely better, and easy enough.
>
> I'll go for that!

Cool.  BTW I checked the first 5 in.

 -George
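
For illustration, the gist of the flag approach in standalone form (a
64-bit word stands in for Xen's cpumask_t; all names are made up):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct vcpu { uint64_t hard_affinity; bool pinned; };

/* recompute "exclusively pinned" once, when hard affinity changes,
 * instead of scanning the mask on every scheduling decision */
static void set_hard_affinity(struct vcpu *v, uint64_t mask)
{
	v->hard_affinity = mask;
	/* exactly one bit set <=> mask is a nonzero power of two */
	v->pinned = mask && !(mask & (mask - 1));
}

int main(void)
{
	struct vcpu v;

	set_hard_affinity(&v, 1ULL << 5);
	printf("pinned=%d\n", v.pinned);	/* 1 */
	set_hard_affinity(&v, (1ULL << 5) | (1ULL << 6));
	printf("pinned=%d\n", v.pinned);	/* 0 */
	return 0;
}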



[Xen-devel] [RFC PATCH v2 17/22] ARM: vGIC: introduce vgic_lock_vcpu_irq()

2017-07-21 Thread Andre Przywara
Since a VCPU can own multiple IRQs, the natural locking order is to take
a VCPU lock first, then the individual per-IRQ locks.
However there are situations where the target VCPU is not known without
looking into the struct pending_irq first, which usually means we need to
take the IRQ lock first.
To solve this problem, we provide a function called vgic_lock_vcpu_irq(),
which takes a locked struct pending_irq and returns with *both* the
VCPU and the IRQ lock held.
This is done by looking up the target VCPU, then briefly dropping the
IRQ lock, taking the VCPU lock, then grabbing the per-IRQ lock again.
Before returning there is a check whether something has changed in the
brief period where we didn't hold the IRQ lock, retrying in this (very
rare) case.
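
The drop-retake-revalidate pattern generalizes beyond the vGIC; a
standalone sketch with pthreads (illustrative names, two "VCPUs"):

#include <pthread.h>
#include <stdio.h>

struct irq {
	pthread_mutex_t lock;
	int target;	/* index of the owning "vcpu" */
};

static pthread_mutex_t vcpu_lock[2] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* called with irq->lock held; returns with vcpu_lock[target] *and*
 * irq->lock held, acquired in the correct order (VCPU first) */
static int lock_vcpu_irq(struct irq *irq)
{
	int target = irq->target;

	pthread_mutex_unlock(&irq->lock);
	for (;;) {
		pthread_mutex_lock(&vcpu_lock[target]);
		pthread_mutex_lock(&irq->lock);
		if (irq->target == target)
			return target;	/* nothing changed: done */
		/* target moved while the IRQ lock was dropped: retry */
		int now = irq->target;
		pthread_mutex_unlock(&irq->lock);
		pthread_mutex_unlock(&vcpu_lock[target]);
		target = now;
	}
}

int main(void)
{
	struct irq irq = { PTHREAD_MUTEX_INITIALIZER, 1 };

	pthread_mutex_lock(&irq.lock);
	int t = lock_vcpu_irq(&irq);
	printf("holding vcpu %d and irq locks\n", t);
	pthread_mutex_unlock(&irq.lock);
	pthread_mutex_unlock(&vcpu_lock[t]);
	return 0;
}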

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 1ba0010..0e6dfe5 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -224,6 +224,48 @@ int vcpu_vgic_free(struct vcpu *v)
 return 0;
 }
 
+/**
+ * vgic_lock_vcpu_irq(): lock both the pending_irq and the corresponding VCPU
+ *
+ * @v: the VCPU (for private IRQs)
+ * @p: pointer to the locked struct pending_irq
+ * @flags: pointer to the IRQ flags used when locking the VCPU
+ *
+ * The function takes a locked IRQ and returns with both the IRQ and the
+ * corresponding VCPU locked. This is non-trivial due to the locking order
+ * being actually the other way round (VCPU first, then IRQ).
+ *
+ * Returns: pointer to the VCPU this IRQ is targeting.
+ */
+struct vcpu *vgic_lock_vcpu_irq(struct vcpu *v, struct pending_irq *p,
+unsigned long *flags)
+{
+struct vcpu *target_vcpu;
+
+ASSERT(spin_is_locked(&p->lock));
+
+target_vcpu = vgic_get_target_vcpu(v, p);
+spin_unlock(&p->lock);
+
+do
+{
+struct vcpu *current_vcpu;
+
+spin_lock_irqsave(&target_vcpu->arch.vgic.lock, *flags);
+spin_lock(&p->lock);
+
+current_vcpu = vgic_get_target_vcpu(v, p);
+
+if ( target_vcpu->vcpu_id == current_vcpu->vcpu_id )
+return target_vcpu;
+
+spin_unlock(&p->lock);
+spin_unlock_irqrestore(&target_vcpu->arch.vgic.lock, *flags);
+
+target_vcpu = current_vcpu;
+} while (1);
+}
+
 struct vcpu *vgic_get_target_vcpu(struct vcpu *v, struct pending_irq *p)
 {
 struct vgic_irq_rank *rank = vgic_rank_irq(v, p->irq);
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 18/22] ARM: vGIC: move virtual IRQ target VCPU from rank to pending_irq

2017-07-21 Thread Andre Przywara
The VCPU a shared virtual IRQ is targeting is currently stored in the
irq_rank structure.
For LPIs we already store the target VCPU in struct pending_irq, so
move SPIs over as well.
The ITS code, which was using this field already, was so far using the
VCPU lock to protect the pending_irq, so move this over to the new lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v2.c | 56 +++
 xen/arch/arm/vgic-v3-its.c |  9 +++---
 xen/arch/arm/vgic-v3.c | 69 ---
 xen/arch/arm/vgic.c| 73 +-
 xen/include/asm-arm/vgic.h | 13 +++--
 5 files changed, 96 insertions(+), 124 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 0c8a598..c7ed3ce 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -66,19 +66,22 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t 
csize,
  *
  * Note the byte offset will be aligned to an ITARGETSR boundary.
  */
-static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank *rank,
- unsigned int offset)
+static uint32_t vgic_fetch_itargetsr(struct vcpu *v, unsigned int offset)
 {
 uint32_t reg = 0;
 unsigned int i;
+unsigned long flags;
 
-ASSERT(spin_is_locked(&rank->lock));
-
-offset &= INTERRUPT_RANK_MASK;
 offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
 
 for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++ )
-reg |= (1 << read_atomic(&rank->vcpu[offset])) << (i * NR_BITS_PER_TARGET);
+{
+struct pending_irq *p = irq_to_pending(v, offset);
+
+vgic_irq_lock(p, flags);
+reg |= (1 << p->vcpu_id) << (i * NR_BITS_PER_TARGET);
+vgic_irq_unlock(p, flags);
+}
 
 return reg;
 }
@@ -89,32 +92,29 @@ static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank 
*rank,
  *
  * Note the byte offset will be aligned to an ITARGETSR boundary.
  */
-static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
+static void vgic_store_itargetsr(struct domain *d,
  unsigned int offset, uint32_t itargetsr)
 {
 unsigned int i;
 unsigned int virq;
 
-ASSERT(spin_is_locked(&rank->lock));
-
 /*
  * The ITARGETSR0-7, used for SGIs/PPIs, are implemented RO in the
  * emulation and should never call this function.
  *
- * They all live in the first rank.
+ * They all live in the first four bytes of ITARGETSR.
  */
-BUILD_BUG_ON(NR_INTERRUPT_PER_RANK != 32);
-ASSERT(rank->index >= 1);
+ASSERT(offset >= 4);
 
-offset &= INTERRUPT_RANK_MASK;
+virq = offset;
 offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
 
-virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
-
 for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++, virq++ )
 {
 unsigned int new_target, old_target;
+unsigned long flags;
 uint8_t new_mask;
+struct pending_irq *p = spi_to_pending(d, virq);
 
 /*
  * Don't need to mask as we rely on new_mask to fit for only one
@@ -151,16 +151,14 @@ static void vgic_store_itargetsr(struct domain *d, struct 
vgic_irq_rank *rank,
 /* The vCPU ID always starts from 0 */
 new_target--;
 
-old_target = read_atomic(&rank->vcpu[offset]);
+vgic_irq_lock(p, flags);
+old_target = p->vcpu_id;
 
 /* Only migrate the vIRQ if the target vCPU has changed */
 if ( new_target != old_target )
-{
-if ( vgic_migrate_irq(d->vcpu[old_target],
- d->vcpu[new_target],
- virq) )
-write_atomic(&rank->vcpu[offset], new_target);
-}
+vgic_migrate_irq(p, &flags, d->vcpu[new_target]);
+else
+vgic_irq_unlock(p, flags);
 }
 }
 
@@ -264,11 +262,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
 uint32_t itargetsr;
 
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
-vgic_unlock_rank(v, rank, flags);
+itargetsr = vgic_fetch_itargetsr(v, gicd_reg - GICD_ITARGETSR);
 *r = vreg_reg32_extract(itargetsr, info);
 
 return 1;
@@ -498,14 +492,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 uint32_t itargetsr;
 
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
+

[Xen-devel] [RFC PATCH v2 21/22] ARM: vITS: injecting LPIs: use pending_irq lock

2017-07-21 Thread Andre Przywara
Instead of using an atomic access and hoping for the best, let's use
the new pending_irq lock now to make sure we read a sane version of
the target VCPU.
That still doesn't solve the problem mentioned in the comment, but
paves the way for future improvements.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic-v3-lpi.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 2306b58..9db26ed 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -140,20 +140,22 @@ void vgic_vcpu_inject_lpi(struct domain *d, unsigned int 
virq)
 {
 /*
  * TODO: this assumes that the struct pending_irq stays valid all of
- * the time. We cannot properly protect this with the current locking
- * scheme, but the future per-IRQ lock will solve this problem.
+ * the time. We cannot properly protect this with the current code,
+ * but a future refcounting will solve this problem.
  */
 struct pending_irq *p = irq_to_pending(d->vcpu[0], virq);
+unsigned long flags;
 unsigned int vcpu_id;
 
 if ( !p )
 return;
 
-vcpu_id = ACCESS_ONCE(p->vcpu_id);
-if ( vcpu_id >= d->max_vcpus )
-  return;
+vgic_irq_lock(p, flags);
+vcpu_id = p->vcpu_id;
+vgic_irq_unlock(p, flags);
 
-vgic_vcpu_inject_irq(d->vcpu[vcpu_id], virq);
+if ( vcpu_id < d->max_vcpus )
+vgic_vcpu_inject_irq(d->vcpu[vcpu_id], virq);
 }
 
 /*
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 09/22] ARM: vITS: protect LPI priority update with pending_irq lock

2017-07-21 Thread Andre Przywara
As the priority value is now officially a member of struct pending_irq,
we need to take its lock when manipulating it via ITS commands.
Make sure we take the IRQ lock after the VCPU lock when we need both.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v3-its.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 66095d4..705708a 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -402,6 +402,7 @@ static int update_lpi_property(struct domain *d, struct 
pending_irq *p)
 uint8_t property;
 int ret;
 
+ASSERT(spin_is_locked(&p->lock));
 /*
  * If no redistributor has its LPIs enabled yet, we can't access the
  * property table. In this case we just can't update the properties,
@@ -419,7 +420,7 @@ static int update_lpi_property(struct domain *d, struct 
pending_irq *p)
 if ( ret )
 return ret;
 
-write_atomic(&p->priority, property & LPI_PROP_PRIO_MASK);
+p->priority = property & LPI_PROP_PRIO_MASK;
 
 if ( property & LPI_PROP_ENABLED )
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
@@ -457,7 +458,7 @@ static int its_handle_inv(struct virt_its *its, uint64_t 
*cmdptr)
 uint32_t devid = its_cmd_get_deviceid(cmdptr);
 uint32_t eventid = its_cmd_get_id(cmdptr);
 struct pending_irq *p;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 struct vcpu *vcpu;
 uint32_t vlpi;
 int ret = -1;
@@ -485,7 +486,8 @@ static int its_handle_inv(struct virt_its *its, uint64_t 
*cmdptr)
 if ( unlikely(!p) )
 goto out_unlock_its;
 
-spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+spin_lock_irqsave(&vcpu->arch.vgic.lock, vcpu_flags);
+vgic_irq_lock(p, flags);
 
 /* Read the property table and update our cached status. */
 if ( update_lpi_property(d, p) )
@@ -497,7 +499,8 @@ static int its_handle_inv(struct virt_its *its, uint64_t 
*cmdptr)
 ret = 0;
 
 out_unlock:
-spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+vgic_irq_unlock(p, flags);
+spin_unlock_irqrestore(&vcpu->arch.vgic.lock, vcpu_flags);
 
 out_unlock_its:
 spin_unlock(&its->its_lock);
@@ -517,7 +520,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t 
*cmdptr)
 struct pending_irq *pirqs[16];
 uint64_t vlpi = 0;  /* 64-bit to catch overflows */
 unsigned int nr_lpis, i;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 int ret = 0;
 
 /*
@@ -542,7 +545,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t 
*cmdptr)
 vcpu = get_vcpu_from_collection(its, collid);
spin_unlock(&its->its_lock);
 
-spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+spin_lock_irqsave(&vcpu->arch.vgic.lock, vcpu_flags);
 read_lock(&its->d->arch.vgic.pend_lpi_tree_lock);
 
 do
@@ -555,9 +558,13 @@ static int its_handle_invall(struct virt_its *its, 
uint64_t *cmdptr)
 
 for ( i = 0; i < nr_lpis; i++ )
 {
+vgic_irq_lock(pirqs[i], flags);
 /* We only care about LPIs on our VCPU. */
 if ( pirqs[i]->lpi_vcpu_id != vcpu->vcpu_id )
+{
+vgic_irq_unlock(pirqs[i], flags);
 continue;
+}
 
 vlpi = pirqs[i]->irq;
 /* If that fails for a single LPI, carry on to handle the rest. */
@@ -566,6 +573,8 @@ static int its_handle_invall(struct virt_its *its, uint64_t 
*cmdptr)
 update_lpi_vgic_status(vcpu, pirqs[i]);
 else
 ret = err;
+
+vgic_irq_unlock(pirqs[i], flags);
 }
 /*
  * Loop over the next gang of pending_irqs until we reached the end of
@@ -576,7 +585,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t 
*cmdptr)
   (nr_lpis == ARRAY_SIZE(pirqs)) );
 
 read_unlock(&its->d->arch.vgic.pend_lpi_tree_lock);
-spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+spin_unlock_irqrestore(&vcpu->arch.vgic.lock, vcpu_flags);
 
 return ret;
 }
@@ -712,6 +721,7 @@ static int its_handle_mapti(struct virt_its *its, uint64_t 
*cmdptr)
 uint32_t intid = its_cmd_get_physical_id(cmdptr), _intid;
 uint16_t collid = its_cmd_get_collection(cmdptr);
 struct pending_irq *pirq;
+unsigned long flags;
 struct vcpu *vcpu = NULL;
 int ret = -1;
 
@@ -765,7 +775,9 @@ static int its_handle_mapti(struct virt_its *its, uint64_t 
*cmdptr)
  * We don't need the VGIC VCPU lock here, because the pending_irq isn't
  * in the radix tree yet.
  */
+vgic_irq_lock(pirq, flags);
 ret = update_lpi_property(its->d, pirq);
+vgic_irq_unlock(pirq, flags);
 if ( ret )
 goto out_remove_host_entry;
 
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 14/22] ARM: vGIC: move virtual IRQ configuration from rank to pending_irq

2017-07-21 Thread Andre Przywara
The IRQ configuration (level or edge triggered) for a group of IRQs
is still stored in the irq_rank structure.
Introduce a new bit called GIC_IRQ_GUEST_LEVEL in the "status" field,
which holds that information.
Remove the storage from the irq_rank and use the existing wrappers to
store and retrieve the configuration bit for multiple IRQs.
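
For reference, a standalone sketch of the 2-bits-per-IRQ ICFGR packing this
relies on (assuming the usual GIC encoding, where bit[2*i+1] set means
edge-triggered):

#include <stdint.h>
#include <stdio.h>

#define IRQS_PER_CFGR 16	/* 16 IRQs, 2 config bits each, per 32-bit reg */

static uint32_t pack_icfgr(const int edge[IRQS_PER_CFGR])
{
	uint32_t reg = 0;

	for (int i = 0; i < IRQS_PER_CFGR; i++)
		if (edge[i])
			reg |= 2u << (2 * i);	/* set bit 1 of field i */
	return reg;
}

int main(void)
{
	int edge[IRQS_PER_CFGR] = { [0] = 1, [15] = 1 };

	printf("ICFGR = 0x%08x\n", pack_icfgr(edge));	/* 0x80000002 */
	return 0;
}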

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v2.c | 21 +++-
 xen/arch/arm/vgic-v3.c | 25 --
 xen/arch/arm/vgic.c| 81 +-
 xen/include/asm-arm/vgic.h |  5 ++-
 4 files changed, 73 insertions(+), 59 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index a3fd500..0c8a598 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -278,20 +278,12 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
 goto read_reserved;
 
 case VRANGE32(GICD_ICFGR, GICD_ICFGRN):
-{
-uint32_t icfgr;
-
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-icfgr = rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR, 
DABT_WORD)];
-vgic_unlock_rank(v, rank, flags);
 
-*r = vreg_reg32_extract(icfgr, info);
+irq = (gicd_reg - GICD_ICFGR) * 4;
+*r = vgic_fetch_irq_config(v, irq);
 
 return 1;
-}
 
 case VRANGE32(0xD00, 0xDFC):
 goto read_impl_defined;
@@ -529,13 +521,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 
 case VRANGE32(GICD_ICFGR2, GICD_ICFGRN): /* SPIs */
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-vreg_reg32_update(&rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR,
- DABT_WORD)],
-  r, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICFGR) * 4; /* 2 bit per IRQ */
+vgic_store_irq_config(v, irq, r);
 return 1;
 
 case VRANGE32(0xD00, 0xDFC):
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d3356ae..e9e36eb 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -722,20 +722,11 @@ static int __vgic_v3_distr_common_mmio_read(const char 
*name, struct vcpu *v,
 return 1;
 
 case VRANGE32(GICD_ICFGR, GICD_ICFGRN):
-{
-uint32_t icfgr;
-
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL ) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-icfgr = rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR, DABT_WORD)];
-vgic_unlock_rank(v, rank, flags);
-
-*r = vreg_reg32_extract(icfgr, info);
-
+irq = (reg - GICD_ICFGR) * 4;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_config(v, irq);
 return 1;
-}
 
 default:
 printk(XENLOG_G_ERR
@@ -834,13 +825,9 @@ static int __vgic_v3_distr_common_mmio_write(const char 
*name, struct vcpu *v,
 /* ICFGR1 for PPI's, which is implementation defined
if ICFGR1 is programmable or not. We chose to program */
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
-if ( rank == NULL ) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-vreg_reg32_update(&rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR,
- DABT_WORD)],
-  r, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (reg - GICD_ICFGR) * 4;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_config(v, irq, r);
 return 1;
 
 default:
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index ddcd99b..e5a4765 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -268,6 +268,55 @@ void vgic_store_irq_priority(struct vcpu *v, unsigned int 
nrirqs,
 local_irq_restore(flags);
 }
 
+#define IRQS_PER_CFGR   16
+/**
+ * vgic_fetch_irq_config: assemble the configuration bits for a group of 16 
IRQs
+ * @v: the VCPU for private IRQs, any VCPU of a domain for SPIs
+ * @first_irq: the first IRQ to be queried, must be aligned to 16
+ */
+uint32_t vgic_fetch_irq_config(struct vcpu *v, unsigned int first_irq)
+{
+struct pending_irq *pirqs[IRQS_PER_CFGR];
+unsigned long flags;
+uint32_t ret = 0, i;
+
+local_irq_save(flags);
+vgic_lock_irqs(v, IRQS_PER_CFGR, first_irq, pirqs);
+
+for ( i = 0; i < IRQS_PER_CFGR; i++ )
+if 

[Xen-devel] [RFC PATCH v2 03/22] ARM: vGIC: move gic_raise_inflight_irq() into vgic_vcpu_inject_irq()

2017-07-21 Thread Andre Przywara
Currently there is a gic_raise_inflight_irq(), which serves the very
special purpose of handling a newly injected interrupt while an older
one is still being handled. This has only one user, in vgic_vcpu_inject_irq().

Now with the introduction of the pending_irq lock this will later on
result in a nasty deadlock, which can only be solved properly by
actually embedding the function into the caller (and dropping the lock
later in-between).

This has the admittedly hideous consequence of needing to export
gic_update_one_lr(), but this will go away in a later stage of a rework.
In this respect this patch is more a temporary kludge.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c| 30 +-
 xen/arch/arm/vgic.c   | 11 ++-
 xen/include/asm-arm/gic.h |  2 +-
 3 files changed, 12 insertions(+), 31 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 2c99d71..5bd66a2 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -44,8 +44,6 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
 
 #undef GIC_DEBUG
 
-static void gic_update_one_lr(struct vcpu *v, int i);
-
 static const struct gic_hw_operations *gic_hw_ops;
 
 void register_gic_ops(const struct gic_hw_operations *ops)
@@ -416,32 +414,6 @@ void gic_remove_irq_from_queues(struct vcpu *v, struct 
pending_irq *p)
 gic_remove_from_lr_pending(v, p);
 }
 
-void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
-{
-struct pending_irq *n = irq_to_pending(v, virtual_irq);
-
-/* If an LPI has been removed meanwhile, there is nothing left to raise. */
-if ( unlikely(!n) )
-return;
-
-ASSERT(spin_is_locked(&v->arch.vgic.lock));
-
-/* Don't try to update the LR if the interrupt is disabled */
-if ( !test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) )
-return;
-
-if ( list_empty(&n->lr_queue) )
-{
-if ( v == current )
-gic_update_one_lr(v, n->lr);
-}
-#ifdef GIC_DEBUG
-else
-gdprintk(XENLOG_DEBUG, "trying to inject irq=%u into d%dv%d, when it 
is still lr_pending\n",
- virtual_irq, v->domain->domain_id, v->vcpu_id);
-#endif
-}
-
 /*
  * Find an unused LR to insert an IRQ into, starting with the LR given
  * by @lr. If this new interrupt is a PRISTINE LPI, scan the other LRs to
@@ -503,7 +475,7 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int 
virtual_irq,
 gic_add_to_lr_pending(v, p);
 }
 
-static void gic_update_one_lr(struct vcpu *v, int i)
+void gic_update_one_lr(struct vcpu *v, int i)
 {
 struct pending_irq *p;
 int irq;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 38dacd3..7b122cd 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -536,7 +536,16 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int 
virq)
 
 if ( !list_empty(&n->inflight) )
 {
-gic_raise_inflight_irq(v, virq);
+bool update = test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) &&
+  list_empty(&n->lr_queue) && (v == current);
+
+if ( update )
+gic_update_one_lr(v, n->lr);
+#ifdef GIC_DEBUG
+else
+gdprintk(XENLOG_DEBUG, "trying to inject irq=%u into d%dv%d, when 
it is still lr_pending\n",
+ n->irq, v->domain->domain_id, v->vcpu_id);
+#endif
 goto out;
 }
 
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 6203dc5..cf8b8fb 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -237,12 +237,12 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned 
int virq,
 
 extern void gic_inject(void);
 extern void gic_clear_pending_irqs(struct vcpu *v);
+extern void gic_update_one_lr(struct vcpu *v, int lr);
 extern int gic_events_need_delivery(void);
 
 extern void init_maintenance_interrupt(void);
 extern void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
 unsigned int priority);
-extern void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq);
 extern void gic_remove_from_lr_pending(struct vcpu *v, struct pending_irq *p);
 extern void gic_remove_irq_from_queues(struct vcpu *v, struct pending_irq *p);
 
-- 
2.9.0




[Xen-devel] [RFC PATCH v2 04/22] ARM: vGIC: rename pending_irq->priority to cur_priority

2017-07-21 Thread Andre Przywara
In preparation for storing the virtual interrupt priority in the struct
pending_irq, rename the existing "priority" member to "cur_priority".
This is to signify that this is the current priority of an interrupt
which has been injected to a VCPU. Once this has happened, its priority must
stay fixed at this value; subsequent MMIO accesses to change the priority
can only affect newly triggered interrupts.
Also since the priority is a sorting criterion for the inflight list, it
must not change while it's on a VCPU's list.
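
The sorting invariant the last paragraph relies on, as a standalone sketch
(lower value = higher priority; illustrative types, not Xen's list API):

#include <stdio.h>

struct pirq { int cur_priority; struct pirq *next; };

/* insert n before the first entry with a numerically higher priority
 * value, keeping the list sorted, as gic_add_to_lr_pending() does */
static void insert_sorted(struct pirq **head, struct pirq *n)
{
	while (*head && (*head)->cur_priority <= n->cur_priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

int main(void)
{
	struct pirq a = {32}, b = {48}, c = {40}, *l = NULL;

	insert_sorted(&l, &a);
	insert_sorted(&l, &b);
	insert_sorted(&l, &c);
	for (struct pirq *p = l; p; p = p->next)
		printf("%d ", p->cur_priority);	/* 32 40 48 */
	printf("\n");
	return 0;
}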

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic-v2.c  |  2 +-
 xen/arch/arm/gic-v3.c  |  2 +-
 xen/arch/arm/gic.c | 10 +-
 xen/arch/arm/vgic.c|  6 +++---
 xen/include/asm-arm/vgic.h |  2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index cbe71a9..735e23d 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -437,7 +437,7 @@ static void gicv2_update_lr(int lr, const struct 
pending_irq *p,
 BUG_ON(lr < 0);
 
 lr_reg = (((state & GICH_V2_LR_STATE_MASK) << GICH_V2_LR_STATE_SHIFT)  |
-  ((GIC_PRI_TO_GUEST(p->priority) & GICH_V2_LR_PRIORITY_MASK)
+  ((GIC_PRI_TO_GUEST(p->cur_priority) & GICH_V2_LR_PRIORITY_MASK)
  << GICH_V2_LR_PRIORITY_SHIFT) |
   ((p->irq & GICH_V2_LR_VIRTUAL_MASK) << 
GICH_V2_LR_VIRTUAL_SHIFT));
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f990eae..449bd55 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -961,7 +961,7 @@ static void gicv3_update_lr(int lr, const struct 
pending_irq *p,
 if ( current->domain->arch.vgic.version == GIC_V3 )
 val |= GICH_LR_GRP1;
 
-val |= ((uint64_t)p->priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
+val |= ((uint64_t)p->cur_priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
 val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
 
if ( p->desc != NULL )
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 5bd66a2..8dec736 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -389,7 +389,7 @@ static inline void gic_add_to_lr_pending(struct vcpu *v, 
struct pending_irq *n)
 
list_for_each_entry ( iter, &v->arch.vgic.lr_pending, lr_queue )
 {
-if ( iter->priority > n->priority )
+if ( iter->cur_priority > n->cur_priority )
 {
list_add_tail(&n->lr_queue, &iter->lr_queue);
 return;
@@ -542,7 +542,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
  test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
  !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
-gic_raise_guest_irq(v, irq, p->priority);
+gic_raise_guest_irq(v, irq, p->cur_priority);
 else {
 list_del_init(&p->inflight);
 /*
@@ -610,7 +610,7 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 /* No more free LRs: find a lower priority irq to evict */
 list_for_each_entry_reverse( p_r, inflight_r, inflight )
 {
-if ( p_r->priority == p->priority )
+if ( p_r->cur_priority == p->cur_priority )
 goto out;
if ( test_bit(GIC_IRQ_GUEST_VISIBLE, &p_r->status) &&
 !test_bit(GIC_IRQ_GUEST_ACTIVE, &p_r->status) )
@@ -676,9 +676,9 @@ int gic_events_need_delivery(void)
  * ordered by priority */
list_for_each_entry( p, &v->arch.vgic.inflight_irqs, inflight )
 {
-if ( GIC_PRI_TO_GUEST(p->priority) >= mask_priority )
+if ( GIC_PRI_TO_GUEST(p->cur_priority) >= mask_priority )
 goto out;
-if ( GIC_PRI_TO_GUEST(p->priority) >= active_priority )
+if ( GIC_PRI_TO_GUEST(p->cur_priority) >= active_priority )
 goto out;
if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
 {
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 7b122cd..21b545e 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -395,7 +395,7 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 p = irq_to_pending(v_target, irq);
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
 if ( !list_empty(&p->inflight) && !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-gic_raise_guest_irq(v_target, irq, p->priority);
+gic_raise_guest_irq(v_target, irq, p->cur_priority);
 spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
 if ( p->desc != NULL )
 {
@@ -550,7 +550,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 }
 
 priority = vgic_get_virq_priority(v, virq);
-n->priority = priority;
+n->cur_priority = priority;
 
 /* the irq is enabled */
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) )
@@ -558,7 +558,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned 

[Xen-devel] [RFC PATCH v2 12/22] ARM: vGIC: protect gic_update_one_lr() with pending_irq lock

2017-07-21 Thread Andre Przywara
When we return from a domain with the active bit set in an LR,
we update our pending_irq accordingly. This touches multiple status
bits, so it requires the pending_irq lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 9637682..84b282b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -508,6 +508,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 
 if ( lr_val.state & GICH_LR_ACTIVE )
 {
+vgic_irq_lock(p, flags);
 set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
 if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
 test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
@@ -521,6 +522,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 gdprintk(XENLOG_WARNING, "unable to inject hw irq=%d into 
d%dv%d: already active in LR%d\n",
  irq, v->domain->domain_id, v->vcpu_id, i);
 }
+vgic_irq_unlock(p, flags);
 }
 else if ( lr_val.state & GICH_LR_PENDING )
 {
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 11/22] ARM: vGIC: protect gic_events_need_delivery() with pending_irq lock

2017-07-21 Thread Andre Przywara
gic_events_need_delivery() reads the cur_priority field twice and also
relies on the consistency of the status bits.
So it should take the pending_irq lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index df89530..9637682 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -666,7 +666,7 @@ int gic_events_need_delivery(void)
 {
 struct vcpu *v = current;
 struct pending_irq *p;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 const unsigned long apr = gic_hw_ops->read_apr(0);
 int mask_priority;
 int active_priority;
@@ -675,7 +675,7 @@ int gic_events_need_delivery(void)
 mask_priority = gic_hw_ops->read_vmcr_priority();
 active_priority = find_next_bit(&apr, 32, 0);
 
-spin_lock_irqsave(&v->arch.vgic.lock, flags);
+spin_lock_irqsave(&v->arch.vgic.lock, vcpu_flags);
 
 /* TODO: We order the guest irqs by priority, but we don't change
  * the priority of host irqs. */
@@ -684,19 +684,21 @@ int gic_events_need_delivery(void)
  * ordered by priority */
 list_for_each_entry( p, &v->arch.vgic.inflight_irqs, inflight )
 {
-if ( GIC_PRI_TO_GUEST(p->cur_priority) >= mask_priority )
-goto out;
-if ( GIC_PRI_TO_GUEST(p->cur_priority) >= active_priority )
-goto out;
-if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+vgic_irq_lock(p, flags);
+if ( GIC_PRI_TO_GUEST(p->cur_priority) < mask_priority &&
+ GIC_PRI_TO_GUEST(p->cur_priority) < active_priority &&
+ !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
 {
-rc = 1;
-goto out;
+vgic_irq_unlock(p, flags);
+continue;
 }
+
+rc = test_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+vgic_irq_unlock(p, flags);
+break;
 }
 
-out:
-spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+spin_unlock_irqrestore(&v->arch.vgic.lock, vcpu_flags);
 return rc;
 }
 
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 20/22] ARM: vGIC: move virtual IRQ enable bit from rank to pending_irq

2017-07-21 Thread Andre Przywara
The enabled bits for a group of IRQs are still stored in the irq_rank
structure, although we already have the same information in pending_irq,
in the GIC_IRQ_GUEST_ENABLED bit of the "status" field.
Remove the storage from the irq_rank and just utilize the existing
wrappers to cover enabling/disabling of multiple IRQs.
This also marks the removal of the last member of struct vgic_irq_rank.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v2.c |  41 +++--
 xen/arch/arm/vgic-v3.c |  41 +++--
 xen/arch/arm/vgic.c| 201 +++--
 xen/include/asm-arm/vgic.h |  10 +--
 4 files changed, 152 insertions(+), 141 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index c7ed3ce..3320642 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -166,9 +166,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
register_t *r, void *priv)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
-unsigned long flags;
 unsigned int irq;
 
 perfc_incr(vgicd_reads);
@@ -222,20 +220,16 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
 
 case VRANGE32(GICD_ISENABLER, GICD_ISENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISENABLER, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-*r = vreg_reg32_extract(rank->ienable, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ISENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_enabled(v, irq);
 return 1;
 
 case VRANGE32(GICD_ICENABLER, GICD_ICENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICENABLER, DABT_WORD);
-if ( rank == NULL) goto read_as_zero;
-vgic_lock_rank(v, rank, flags);
-*r = vreg_reg32_extract(rank->ienable, info);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_enabled(v, irq);
 return 1;
 
 /* Read the pending status of an IRQ via GICD is not supported */
@@ -386,10 +380,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 register_t r, void *priv)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
-uint32_t tr;
-unsigned long flags;
 unsigned int irq;
 
 perfc_incr(vgicd_writes);
@@ -426,24 +417,16 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 
 case VRANGE32(GICD_ISENABLER, GICD_ISENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISENABLER, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-tr = rank->ienable;
-vreg_reg32_setbits(&rank->ienable, r, info);
-vgic_enable_irqs(v, (rank->ienable) & (~tr), rank->index);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ISENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_enable(v, irq, r);
 return 1;
 
 case VRANGE32(GICD_ICENABLER, GICD_ICENABLERN):
 if ( dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICENABLER, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-tr = rank->ienable;
-vreg_reg32_clearbits(&rank->ienable, r, info);
-vgic_disable_irqs(v, (~rank->ienable) & tr, rank->index);
-vgic_unlock_rank(v, rank, flags);
+irq = (gicd_reg - GICD_ICENABLER) * 8;
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto write_ignore;
+vgic_store_irq_disable(v, irq, r);
 return 1;
 
 case VRANGE32(GICD_ISPENDR, GICD_ISPENDRN):
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index e9d46af..00cc1e5 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -676,8 +676,6 @@ static int __vgic_v3_distr_common_mmio_read(const char 
*name, struct vcpu *v,
 register_t *r)
 {
 struct hsr_dabt dabt = info->dabt;
-struct vgic_irq_rank *rank;
-unsigned long flags;
 unsigned int irq;
 
 switch ( reg )
@@ -689,20 +687,16 @@ static int __vgic_v3_distr_common_mmio_read(const char 
*name, struct vcpu *v,
 
 case VRANGE32(GICD_ISENABLER, GICD_ISENABLERN):
 

[Xen-devel] [RFC PATCH v2 19/22] ARM: vGIC: rework vgic_get_target_vcpu to take a domain instead of vcpu

2017-07-21 Thread Andre Przywara
For "historical" reasons we used to pass a vCPU pointer to
vgic_get_target_vcpu(), which was only considered to distinguish private
IRQs. Now since we have the unique pending_irq pointer already, we don't
need the vCPU anymore, but just the domain.
So change this function to avoid a rather hackish "d->vcpu[0]" parameter
when looking up SPIs; this also allows our new vgic_lock_vcpu_irq() function
to eventually take a domain parameter (which makes more sense).

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c |  2 +-
 xen/arch/arm/vgic.c| 22 +++---
 xen/include/asm-arm/vgic.h |  3 ++-
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 38e998a..300ce6c 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -559,7 +559,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 smp_wmb();
 if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 {
-struct vcpu *v_target = vgic_get_target_vcpu(v, p);
+struct vcpu *v_target = vgic_get_target_vcpu(v->domain, p);
 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
 clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
 }
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index f6532ee..a49fcde 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -217,7 +217,7 @@ int vcpu_vgic_free(struct vcpu *v)
 /**
  * vgic_lock_vcpu_irq(): lock both the pending_irq and the corresponding VCPU
  *
- * @v: the VCPU (for private IRQs)
+ * @d: the domain the IRQ belongs to
  * @p: pointer to the locked struct pending_irq
  * @flags: pointer to the IRQ flags used when locking the VCPU
  *
@@ -227,14 +227,14 @@ int vcpu_vgic_free(struct vcpu *v)
  *
  * Returns: pointer to the VCPU this IRQ is targeting.
  */
-struct vcpu *vgic_lock_vcpu_irq(struct vcpu *v, struct pending_irq *p,
+struct vcpu *vgic_lock_vcpu_irq(struct domain *d, struct pending_irq *p,
 unsigned long *flags)
 {
 struct vcpu *target_vcpu;
 
 ASSERT(spin_is_locked(&p->lock));
 
-target_vcpu = vgic_get_target_vcpu(v, p);
+target_vcpu = vgic_get_target_vcpu(d, p);
 spin_unlock(&p->lock);
 
 do
@@ -244,7 +244,7 @@ struct vcpu *vgic_lock_vcpu_irq(struct vcpu *v, struct 
pending_irq *p,
 spin_lock_irqsave(&target_vcpu->arch.vgic.lock, *flags);
 spin_lock(&p->lock);
 
-current_vcpu = vgic_get_target_vcpu(v, p);
+current_vcpu = vgic_get_target_vcpu(d, p);
 
 if ( target_vcpu->vcpu_id == current_vcpu->vcpu_id )
 return target_vcpu;
@@ -256,9 +256,9 @@ struct vcpu *vgic_lock_vcpu_irq(struct vcpu *v, struct 
pending_irq *p,
 } while (1);
 }
 
-struct vcpu *vgic_get_target_vcpu(struct vcpu *v, struct pending_irq *p)
+struct vcpu *vgic_get_target_vcpu(struct domain *d, struct pending_irq *p)
 {
-return v->domain->vcpu[p->vcpu_id];
+return d->vcpu[p->vcpu_id];
 }
 
 #define MAX_IRQS_PER_IPRIORITYR 4
@@ -386,7 +386,7 @@ bool vgic_migrate_irq(struct pending_irq *p, unsigned long 
*flags,
 /* If the IRQ is still lr_pending, re-inject it to the new vcpu */
 if ( !list_empty(&p->lr_queue) )
 {
-old = vgic_lock_vcpu_irq(new, p, &old_flags);
+old = vgic_lock_vcpu_irq(new->domain, p, &old_flags);
 gic_remove_irq_from_queues(old, p);
 irq_set_affinity(p->desc, cpumask_of(new->processor));
 
@@ -430,7 +430,7 @@ void arch_move_irqs(struct vcpu *v)
 for ( i = 32; i < vgic_num_irqs(d); i++ )
 {
 p = irq_to_pending(v, i);
-v_target = vgic_get_target_vcpu(v, p);
+v_target = vgic_get_target_vcpu(d, p);
 
 if ( v_target == v && !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 irq_set_affinity(p->desc, cpu_mask);
@@ -453,7 +453,7 @@ void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
 irq = i + (32 * n);
 p = irq_to_pending(v, irq);
-v_target = vgic_get_target_vcpu(v, p);
+v_target = vgic_get_target_vcpu(v->domain, p);
 
 spin_lock_irqsave(&v_target->arch.vgic.lock, flags);
 clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
@@ -507,7 +507,7 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
 irq = i + (32 * n);
 p = irq_to_pending(v, irq);
-v_target = vgic_get_target_vcpu(v, p);
+v_target = vgic_get_target_vcpu(v->domain, p);
 spin_lock_irqsave(&v_target->arch.vgic.lock, vcpu_flags);
 vgic_irq_lock(p, flags);
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
@@ -710,7 +710,7 @@ void vgic_vcpu_inject_spi(struct domain *d, unsigned int 
virq)
 /* the IRQ needs to be an SPI */
 ASSERT(virq >= 32 && virq <= vgic_num_irqs(d));
 
-v = vgic_get_target_vcpu(d->vcpu[0], p);
+v = vgic_get_target_vcpu(d, p);
 

[Xen-devel] [RFC PATCH v2 08/22] ARM: vGIC: move virtual IRQ priority from rank to pending_irq

2017-07-21 Thread Andre Przywara
So far a virtual interrupt's priority is stored in the irq_rank
structure, which covers multiple IRQs and has a single lock for this
group.
Generalize the already existing priority variable in struct pending_irq
to not only cover LPIs, but every IRQ. Access to this value is protected
by the per-IRQ lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v2.c | 34 ++
 xen/arch/arm/vgic-v3.c | 36 
 xen/arch/arm/vgic.c| 41 +
 xen/include/asm-arm/vgic.h | 10 --
 4 files changed, 31 insertions(+), 90 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index cf4ab89..ed7ff3b 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -171,6 +171,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
 struct vgic_irq_rank *rank;
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
 unsigned long flags;
+unsigned int irq;
 
 perfc_incr(vgicd_reads);
 
@@ -250,22 +251,10 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, 
mmio_info_t *info,
 goto read_as_zero;
 
 case VRANGE32(GICD_IPRIORITYR, GICD_IPRIORITYRN):
-{
-uint32_t ipriorityr;
-uint8_t rank_index;
-
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_IPRIORITYR, DABT_WORD);
-if ( rank == NULL ) goto read_as_zero;
-rank_index = REG_RANK_INDEX(8, gicd_reg - GICD_IPRIORITYR, DABT_WORD);
-
-vgic_lock_rank(v, rank, flags);
-ipriorityr = ACCESS_ONCE(rank->ipriorityr[rank_index]);
-vgic_unlock_rank(v, rank, flags);
-*r = vreg_reg32_extract(ipriorityr, info);
-
+irq = gicd_reg - GICD_IPRIORITYR; /* 8 bit per IRQ, so IRQ = offset */
+*r = vgic_fetch_irq_priority(v, (dabt.size == DABT_BYTE) ? 1 : 4, irq);
 return 1;
-}
 
 case VREG32(0x7FC):
 goto read_reserved;
@@ -415,6 +404,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
 uint32_t tr;
 unsigned long flags;
+unsigned int irq;
 
 perfc_incr(vgicd_writes);
 
@@ -498,23 +488,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
mmio_info_t *info,
 goto write_ignore_32;
 
 case VRANGE32(GICD_IPRIORITYR, GICD_IPRIORITYRN):
-{
-uint32_t *ipriorityr, priority;
-
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, gicd_reg - GICD_IPRIORITYR, DABT_WORD);
-if ( rank == NULL) goto write_ignore;
-vgic_lock_rank(v, rank, flags);
-ipriorityr = &rank->ipriorityr[REG_RANK_INDEX(8,
-  gicd_reg - GICD_IPRIORITYR,
-  DABT_WORD)];
-priority = ACCESS_ONCE(*ipriorityr);
-vreg_reg32_update(&priority, r, info);
-ACCESS_ONCE(*ipriorityr) = priority;
 
-vgic_unlock_rank(v, rank, flags);
+irq = gicd_reg - GICD_IPRIORITYR; /* 8 bit per IRQ, so IRQ = offset */
+vgic_store_irq_priority(v, (dabt.size == DABT_BYTE) ? 1 : 4, irq, r);
 return 1;
-}
 
 case VREG32(0x7FC):
 goto write_reserved;
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index ad9019e..e58e77e 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -677,6 +677,7 @@ static int __vgic_v3_distr_common_mmio_read(const char 
*name, struct vcpu *v,
 struct hsr_dabt dabt = info->dabt;
 struct vgic_irq_rank *rank;
 unsigned long flags;
+unsigned int irq;
 
 switch ( reg )
 {
@@ -714,23 +715,11 @@ static int __vgic_v3_distr_common_mmio_read(const char 
*name, struct vcpu *v,
 goto read_as_zero;
 
 case VRANGE32(GICD_IPRIORITYR, GICD_IPRIORITYRN):
-{
-uint32_t ipriorityr;
-uint8_t rank_index;
-
 if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
-rank = vgic_rank_offset(v, 8, reg - GICD_IPRIORITYR, DABT_WORD);
-if ( rank == NULL ) goto read_as_zero;
-rank_index = REG_RANK_INDEX(8, reg - GICD_IPRIORITYR, DABT_WORD);
-
-vgic_lock_rank(v, rank, flags);
-ipriorityr = ACCESS_ONCE(rank->ipriorityr[rank_index]);
-vgic_unlock_rank(v, rank, flags);
-
-*r = vreg_reg32_extract(ipriorityr, info);
-
+irq = reg - GICD_IPRIORITYR; /* 8 bit per IRQ, so IRQ = offset */
+if ( irq >= v->domain->arch.vgic.nr_spis + 32 ) goto read_as_zero;
+*r = vgic_fetch_irq_priority(v, (dabt.size == DABT_BYTE) ? 1 : 4, irq);
 return 1;
-}
 
 case VRANGE32(GICD_ICFGR, GICD_ICFGRN):
 {
@@ -774,6 +763,7 @@ static int __vgic_v3_distr_common_mmio_write(const char 
*name, struct vcpu *v,
 

[Xen-devel] [RFC PATCH v2 10/22] ARM: vGIC: protect gic_set_lr() with pending_irq lock

2017-07-21 Thread Andre Przywara
When putting a (pending) IRQ into an LR, we had better make sure that
no-one changes it behind our back. So make sure we take the pending_irq
lock. This bubbles up to all users of gic_add_to_lr_pending() and
gic_raise_guest_irq().

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 8dec736..df89530 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -383,6 +383,7 @@ static inline void gic_add_to_lr_pending(struct vcpu *v, 
struct pending_irq *n)
 struct pending_irq *iter;
 
 ASSERT(spin_is_locked(&v->arch.vgic.lock));
+ASSERT(spin_is_locked(&n->lock));
 
 if ( !list_empty(&n->lr_queue) )
 return;
@@ -480,6 +481,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 struct pending_irq *p;
 int irq;
 struct gic_lr lr_val;
+unsigned long flags;
 
 ASSERT(spin_is_locked(&v->arch.vgic.lock));
 ASSERT(!local_irq_is_enabled());
@@ -534,6 +536,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 gic_hw_ops->clear_lr(i);
 clear_bit(i, &this_cpu(lr_mask));
 
+vgic_irq_lock(p, flags);
 if ( p->desc != NULL )
 clear_bit(_IRQ_INPROGRESS, &p->desc->status);
 clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
@@ -559,6 +562,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
 }
 }
+vgic_irq_unlock(p, flags);
 }
 }
 
@@ -592,11 +596,11 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 int lr = 0;
 struct pending_irq *p, *t, *p_r;
 struct list_head *inflight_r;
-unsigned long flags;
+unsigned long flags, vcpu_flags;
 unsigned int nr_lrs = gic_hw_ops->info->nr_lrs;
 int lrs = nr_lrs;
 
-spin_lock_irqsave(&v->arch.vgic.lock, flags);
+spin_lock_irqsave(&v->arch.vgic.lock, vcpu_flags);
 
 if ( list_empty(&v->arch.vgic.lr_pending) )
 goto out;
@@ -621,16 +625,20 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 goto out;
 
 found:
+vgic_irq_lock(p_r, flags);
 lr = p_r->lr;
 p_r->lr = GIC_INVALID_LR;
 set_bit(GIC_IRQ_GUEST_QUEUED, &p_r->status);
 clear_bit(GIC_IRQ_GUEST_VISIBLE, &p_r->status);
 gic_add_to_lr_pending(v, p_r);
 inflight_r = &p_r->inflight;
+vgic_irq_unlock(p_r, flags);
 }
 
+vgic_irq_lock(p, flags);
 gic_set_lr(lr, p, GICH_LR_PENDING);
 list_del_init(&p->lr_queue);
+vgic_irq_unlock(p, flags);
 set_bit(lr, &this_cpu(lr_mask));
 
 /* We can only evict nr_lrs entries */
@@ -640,7 +648,7 @@ found:
 }
 
 out:
-spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+spin_unlock_irqrestore(&v->arch.vgic.lock, vcpu_flags);
 }
 
 void gic_clear_pending_irqs(struct vcpu *v)
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 16/22] ARM: vITS: rename lpi_vcpu_id to vcpu_id

2017-07-21 Thread Andre Przywara
Since we will soon store a virtual IRQ's target VCPU in struct pending_irq,
generalise the existing storage for an LPI's target to cover all IRQs.
This just renames "lpi_vcpu_id" to "vcpu_id", but doesn't change anything
else yet.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic-v3-lpi.c  | 2 +-
 xen/arch/arm/vgic-v3-its.c | 7 +++
 xen/arch/arm/vgic.c| 6 +++---
 xen/include/asm-arm/vgic.h | 2 +-
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index c3474f5..2306b58 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -149,7 +149,7 @@ void vgic_vcpu_inject_lpi(struct domain *d, unsigned int 
virq)
 if ( !p )
 return;
 
-vcpu_id = ACCESS_ONCE(p->lpi_vcpu_id);
+vcpu_id = ACCESS_ONCE(p->vcpu_id);
 if ( vcpu_id >= d->max_vcpus )
   return;
 
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 705708a..682ce10 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -560,7 +560,7 @@ static int its_handle_invall(struct virt_its *its, uint64_t 
*cmdptr)
 {
 vgic_irq_lock(pirqs[i], flags);
 /* We only care about LPIs on our VCPU. */
-if ( pirqs[i]->lpi_vcpu_id != vcpu->vcpu_id )
+if ( pirqs[i]->vcpu_id != vcpu->vcpu_id )
 {
 vgic_irq_unlock(pirqs[i], flags);
 continue;
@@ -781,7 +781,7 @@ static int its_handle_mapti(struct virt_its *its, uint64_t 
*cmdptr)
 if ( ret )
 goto out_remove_host_entry;
 
-pirq->lpi_vcpu_id = vcpu->vcpu_id;
+pirq->vcpu_id = vcpu->vcpu_id;
 /*
  * Mark this LPI as new, so any older (now unmapped) LPI in any LR
  * can be easily recognised as such.
@@ -852,8 +852,7 @@ static int its_handle_movi(struct virt_its *its, uint64_t 
*cmdptr)
  */
 spin_lock_irqsave(&ovcpu->arch.vgic.lock, flags);
 
-/* Update our cached vcpu_id in the pending_irq. */
-p->lpi_vcpu_id = nvcpu->vcpu_id;
+p->vcpu_id = nvcpu->vcpu_id;
 
 spin_unlock_irqrestore(&ovcpu->arch.vgic.lock, flags);
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 6722924..1ba0010 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -63,15 +63,15 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, 
unsigned int irq)
 
 void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
 {
-/* The lpi_vcpu_id field must be big enough to hold a VCPU ID. */
-BUILD_BUG_ON(BIT(sizeof(p->lpi_vcpu_id) * 8) < MAX_VIRT_CPUS);
+/* The vcpu_id field must be big enough to hold a VCPU ID. */
+BUILD_BUG_ON(BIT(sizeof(p->vcpu_id) * 8) < MAX_VIRT_CPUS);
 
 memset(p, 0, sizeof(*p));
 INIT_LIST_HEAD(&p->inflight);
 INIT_LIST_HEAD(&p->lr_queue);
 spin_lock_init(&p->lock);
 p->irq = virq;
-p->lpi_vcpu_id = INVALID_VCPU_ID;
+p->vcpu_id = INVALID_VCPU_ID;
 }
 
 static void vgic_rank_init(struct vgic_irq_rank *rank, uint8_t index,
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 7c6067d..ffd9a95 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -81,7 +81,7 @@ struct pending_irq
 uint8_t lr;
 uint8_t cur_priority;   /* Holds the priority of an injected IRQ. */
 uint8_t priority;   /* Holds the priority for any new IRQ. */
-uint8_t lpi_vcpu_id;/* The VCPU for an LPI. */
+uint8_t vcpu_id;/* The VCPU target for any new IRQ. */
 /* inflight is used to append instances of pending_irq to
  * vgic.inflight_irqs */
 struct list_head inflight;
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 05/22] ARM: vITS: rename pending_irq->lpi_priority to priority

2017-07-21 Thread Andre Przywara
Since we will soon store a virtual IRQ's priority in struct pending_irq,
generalise the existing storage for an LPI's priority to cover all IRQs.
This just renames "lpi_priority" to "priority", but doesn't change
anything else yet.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v3-its.c | 4 ++--
 xen/arch/arm/vgic-v3.c | 2 +-
 xen/include/asm-arm/vgic.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 9ef792f..66095d4 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -419,7 +419,7 @@ static int update_lpi_property(struct domain *d, struct 
pending_irq *p)
 if ( ret )
 return ret;
 
-write_atomic(&p->lpi_priority, property & LPI_PROP_PRIO_MASK);
+write_atomic(&p->priority, property & LPI_PROP_PRIO_MASK);
 
 if ( property & LPI_PROP_ENABLED )
 set_bit(GIC_IRQ_GUEST_ENABLED, >status);
@@ -445,7 +445,7 @@ static void update_lpi_vgic_status(struct vcpu *v, struct 
pending_irq *p)
 {
 if ( !list_empty(&p->inflight) &&
  !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-gic_raise_guest_irq(v, p->irq, p->lpi_priority);
+gic_raise_guest_irq(v, p->irq, p->priority);
 }
 else
 gic_remove_from_lr_pending(v, p);
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 48c7682..ad9019e 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1784,7 +1784,7 @@ static int vgic_v3_lpi_get_priority(struct domain *d, 
uint32_t vlpi)
 
 ASSERT(p);
 
-return p->lpi_priority;
+return p->priority;
 }
 
 static const struct vgic_ops v3_ops = {
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 0df4ac7..27b5e37 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -79,7 +79,7 @@ struct pending_irq
 #define GIC_INVALID_LR (uint8_t)~0
 uint8_t lr;
 uint8_t cur_priority;   /* Holds the priority of an injected IRQ. */
-uint8_t lpi_priority;   /* Caches the priority if this is an LPI. */
+uint8_t priority;   /* Holds the priority for any new IRQ. */
 uint8_t lpi_vcpu_id;/* The VCPU for an LPI. */
 /* inflight is used to append instances of pending_irq to
  * vgic.inflight_irqs */
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 01/22] ARM: vGIC: introduce and initialize pending_irq lock

2017-07-21 Thread Andre Przywara
Currently we protect the pending_irq structure with the corresponding
VGIC VCPU lock. There are problems in certain corner cases (for
instance if an IRQ is migrating), so let's introduce a per-IRQ lock,
which will protect the consistency of this structure independent from
any VCPU.
For now this just introduces and initializes the lock, also adds
wrapper macros to simplify its usage (and help debugging).
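
To illustrate the intended usage (just a sketch, not part of the patch;
the function below is made up, only the wrappers and the status bits
are real):

    /* Hypothetical caller, showing how the new wrappers are meant to be used. */
    static void example_update_status(struct pending_irq *p)
    {
        unsigned long flags;

        vgic_irq_lock(p, flags);     /* spin_lock_irqsave(&p->lock, flags) */
        /* Multiple status bits can now be updated consistently. */
        set_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
        clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
        vgic_irq_unlock(p, flags);   /* spin_unlock_irqrestore(&p->lock, flags) */
    }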

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic.c|  1 +
 xen/include/asm-arm/vgic.h | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 1e5107b..38dacd3 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -69,6 +69,7 @@ void vgic_init_pending_irq(struct pending_irq *p, unsigned 
int virq)
 memset(p, 0, sizeof(*p));
 INIT_LIST_HEAD(&p->inflight);
 INIT_LIST_HEAD(&p->lr_queue);
+spin_lock_init(&p->lock);
 p->irq = virq;
 p->lpi_vcpu_id = INVALID_VCPU_ID;
 }
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index d4ed23d..1c38b9a 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -90,6 +90,14 @@ struct pending_irq
  * TODO: when implementing irq migration, taking only the current
  * vgic lock is not going to be enough. */
 struct list_head lr_queue;
+/* The lock protects the consistency of this structure. A single status bit
+ * can be read and/or set without holding the lock using the atomic
+ * set_bit/clear_bit/test_bit functions, however accessing multiple bits or
+ * relating to other members in this struct requires the lock.
+ * The list_head members are protected by their corresponding VCPU lock,
+ * it is not sufficient to hold this pending_irq lock here to query or
+ * change list order or affiliation. */
+spinlock_t lock;
 };
 
 #define NR_INTERRUPT_PER_RANK   32
@@ -156,6 +164,9 @@ struct vgic_ops {
 #define vgic_lock(v)   spin_lock_irq(&(v)->domain->arch.vgic.lock)
 #define vgic_unlock(v) spin_unlock_irq(&(v)->domain->arch.vgic.lock)
 
+#define vgic_irq_lock(p, flags) spin_lock_irqsave(&(p)->lock, flags)
+#define vgic_irq_unlock(p, flags) spin_unlock_irqrestore(&(p)->lock, flags)
+
 #define vgic_lock_rank(v, r, flags)   spin_lock_irqsave(&(r)->lock, flags)
 #define vgic_unlock_rank(v, r, flags) spin_unlock_irqrestore(&(r)->lock, flags)
 
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 22/22] ARM: vGIC: remove remaining irq_rank code

2017-07-21 Thread Andre Przywara
Now that we no longer need the struct vgic_irq_rank, we can remove the
definition and all the helper functions.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic.c  | 54 
 xen/include/asm-arm/domain.h |  6 +
 xen/include/asm-arm/vgic.h   | 48 ---
 3 files changed, 1 insertion(+), 107 deletions(-)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index dd969e2..8ce3ce5 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -32,35 +32,6 @@
 #include 
 #include 
 
-static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
-{
-if ( rank == 0 )
-return v->arch.vgic.private_irqs;
-else if ( rank <= DOMAIN_NR_RANKS(v->domain) )
-return &v->domain->arch.vgic.shared_irqs[rank - 1];
-else
-return NULL;
-}
-
-/*
- * Returns rank corresponding to a GICD_<FOO><n> register for
- * GICD_<FOO> with <b>-bits-per-interrupt.
- */
-struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n,
-  int s)
-{
-int rank = REG_RANK_NR(b, (n >> s));
-
-return vgic_get_rank(v, rank);
-}
-
-struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
-{
-int rank = irq/32;
-
-return vgic_get_rank(v, rank);
-}
-
 void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq,
unsigned int vcpu_id)
 {
@@ -75,14 +46,6 @@ void vgic_init_pending_irq(struct pending_irq *p, unsigned 
int virq,
 p->vcpu_id = vcpu_id;
 }
 
-static void vgic_rank_init(struct vgic_irq_rank *rank, uint8_t index,
-   unsigned int vcpu)
-{
-spin_lock_init(&rank->lock);
-
-rank->index = index;
-}
-
 int domain_vgic_register(struct domain *d, int *mmio_count)
 {
 switch ( d->arch.vgic.version )
@@ -121,11 +84,6 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis)
 
 spin_lock_init(&d->arch.vgic.lock);
 
-d->arch.vgic.shared_irqs =
-xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
-if ( d->arch.vgic.shared_irqs == NULL )
-return -ENOMEM;
-
 d->arch.vgic.pending_irqs =
 xzalloc_array(struct pending_irq, d->arch.vgic.nr_spis);
 if ( d->arch.vgic.pending_irqs == NULL )
@@ -134,9 +92,6 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis)
 /* SPIs are routed to VCPU0 by default */
 for (i=0; i<d->arch.vgic.nr_spis; i++)
 vgic_init_pending_irq(&d->arch.vgic.pending_irqs[i], i + 32, 0);
-/* SPIs are routed to VCPU0 by default */
-for ( i = 0; i < DOMAIN_NR_RANKS(d); i++ )
-vgic_rank_init(&d->arch.vgic.shared_irqs[i], i + 1, 0);
 
 ret = d->arch.vgic.handler->domain_init(d);
 if ( ret )
@@ -178,7 +133,6 @@ void domain_vgic_free(struct domain *d)
 }
 
 d->arch.vgic.handler->domain_free(d);
-xfree(d->arch.vgic.shared_irqs);
 xfree(d->arch.vgic.pending_irqs);
 xfree(d->arch.vgic.allocated_irqs);
 }
@@ -187,13 +141,6 @@ int vcpu_vgic_init(struct vcpu *v)
 {
 int i;
 
-v->arch.vgic.private_irqs = xzalloc(struct vgic_irq_rank);
-if ( v->arch.vgic.private_irqs == NULL )
-  return -ENOMEM;
-
-/* SGIs/PPIs are always routed to this VCPU */
-vgic_rank_init(v->arch.vgic.private_irqs, 0, v->vcpu_id);
-
 v->domain->arch.vgic.handler->vcpu_init(v);
 
 memset(&v->arch.vgic.pending_irqs, 0, sizeof(v->arch.vgic.pending_irqs));
@@ -210,7 +157,6 @@ int vcpu_vgic_init(struct vcpu *v)
 
 int vcpu_vgic_free(struct vcpu *v)
 {
-xfree(v->arch.vgic.private_irqs);
 return 0;
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 8dfc1d1..418400f 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -83,15 +83,12 @@ struct arch_domain
  * shared_irqs where each member contains its own locking.
  *
  * If both class of lock is required then this lock must be
- * taken first. If multiple rank locks are required (including
- * the per-vcpu private_irqs rank) then they must be taken in
- * rank order.
+ * taken first.
  */
 spinlock_t lock;
 uint32_t ctlr;
 int nr_spis; /* Number of SPIs */
 unsigned long *allocated_irqs; /* bitmap of IRQs allocated */
-struct vgic_irq_rank *shared_irqs;
 /*
  * SPIs are domain global, SGIs and PPIs are per-VCPU and stored in
  * struct arch_vcpu.
@@ -248,7 +245,6 @@ struct arch_vcpu
  * struct arch_domain.
  */
 struct pending_irq pending_irqs[32];
-struct vgic_irq_rank *private_irqs;
 
 /* This list is ordered by IRQ priority and it is used to keep
  * track of the IRQs that the VGIC injected into the guest.
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 233ff1f..9c79c5e 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -101,16 

[Xen-devel] [RFC PATCH v2 15/22] ARM: vGIC: rework vgic_get_target_vcpu to take a pending_irq

2017-07-21 Thread Andre Przywara
For now vgic_get_target_vcpu takes a VCPU and an IRQ number, because
this is what we need for finding the proper rank and the VCPU in there.
In the future the VCPU will be looked up in the struct pending_irq.
To avoid locking issues, let's pass the pointer to the pending_irq
instead. We can read the IRQ number from there, and all but one caller
know that pointer already anyway.
This simplifies future code changes.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c |  2 +-
 xen/arch/arm/vgic.c| 22 --
 xen/include/asm-arm/vgic.h |  2 +-
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 84b282b..38e998a 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -559,7 +559,7 @@ void gic_update_one_lr(struct vcpu *v, int i)
 smp_wmb();
 if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 {
-struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
+struct vcpu *v_target = vgic_get_target_vcpu(v, p);
 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
 clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
 }
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index e5a4765..6722924 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -224,10 +224,11 @@ int vcpu_vgic_free(struct vcpu *v)
 return 0;
 }
 
-struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
+struct vcpu *vgic_get_target_vcpu(struct vcpu *v, struct pending_irq *p)
 {
-struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
-int target = read_atomic(&rank->vcpu[virq & INTERRUPT_RANK_MASK]);
+struct vgic_irq_rank *rank = vgic_rank_irq(v, p->irq);
+int target = read_atomic(&rank->vcpu[p->irq & INTERRUPT_RANK_MASK]);
+
 return v->domain->vcpu[target];
 }
 
@@ -391,8 +392,8 @@ void arch_move_irqs(struct vcpu *v)
 
 for ( i = 32; i < vgic_num_irqs(d); i++ )
 {
-v_target = vgic_get_target_vcpu(v, i);
-p = irq_to_pending(v_target, i);
+p = irq_to_pending(v, i);
+v_target = vgic_get_target_vcpu(v, p);
 
 if ( v_target == v && !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 irq_set_affinity(p->desc, cpu_mask);
@@ -414,10 +415,10 @@ void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 
 while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
 irq = i + (32 * n);
-v_target = vgic_get_target_vcpu(v, irq);
+p = irq_to_pending(v, irq);
+v_target = vgic_get_target_vcpu(v, p);
 
 spin_lock_irqsave(&v_target->arch.vgic.lock, flags);
-p = irq_to_pending(v_target, irq);
 clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
 gic_remove_from_lr_pending(v_target, p);
 desc = p->desc;
@@ -468,9 +469,9 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 
 while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
 irq = i + (32 * n);
-v_target = vgic_get_target_vcpu(v, irq);
+p = irq_to_pending(v, irq);
+v_target = vgic_get_target_vcpu(v, p);
 spin_lock_irqsave(&v_target->arch.vgic.lock, vcpu_flags);
-p = irq_to_pending(v_target, irq);
 vgic_irq_lock(p, flags);
 set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
 int_type = test_bit(GIC_IRQ_GUEST_LEVEL, &p->status) ?
@@ -666,12 +667,13 @@ out:
 
 void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq)
 {
+struct pending_irq *p = irq_to_pending(d->vcpu[0], virq);
 struct vcpu *v;
 
 /* the IRQ needs to be an SPI */
 ASSERT(virq >= 32 && virq <= vgic_num_irqs(d));
 
-v = vgic_get_target_vcpu(d->vcpu[0], virq);
+v = vgic_get_target_vcpu(d->vcpu[0], p);
 vgic_vcpu_inject_irq(v, virq);
 }
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 14c22b2..7c6067d 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -213,7 +213,7 @@ enum gic_sgi_mode;
 extern int domain_vgic_init(struct domain *d, unsigned int nr_spis);
 extern void domain_vgic_free(struct domain *d);
 extern int vcpu_vgic_init(struct vcpu *v);
-extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
+extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, struct pending_irq 
*p);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 00/22] ARM: vGIC rework (attempt)

2017-07-21 Thread Andre Przywara
Hi,

this is the first part of the attempt to rewrite the VGIC to solve the
issues we discovered when adding the ITS emulation.
The problems we identified resulted in the following list of things that
need fixing:
1) introduce a per-IRQ lock
2) remove the IRQ rank scheme (of storing IRQ properties)
3) simplify the VCPU IRQ lists (getting rid of lr_queue)
4) introduce reference counting for struct pending_irq's
5) properly handle level triggered IRQs

This series addresses the first two points. I tried to move point 3) up
and fix that first, but that turned out to somehow depend on both
points 1) and 2), so we have this order now. Still having the two lists
makes things somewhat more complicated, though, but I think this is as
best as it can get. After addressing point 3) (in a later post) the end
result will look much better. I have some code for 3) and 5), mostly, but
we need to agree on the first steps first.

This is a bit of an open-heart surgery, as we try to change a locking
scheme while staying bisectable (both in terms of compilability *and*
runnability) and still having reviewable chunks.
To help reviewing I tried to split the patches up as much as possible.
Changes which are independent or introduce new functions are separate,
the motivation for some of them becomes apparent only later.
The rough idea of this series is to introduce the VGIC IRQ lock itself
first, then move each of the rank members into struct pending_irq, adjusting
the locking for that at the same time. To make the changes a bit smaller, I
fixed some read locks in separate patches after the "move" patch.
Also patch 09 adjusts the locking for setting the priority in the ITS,
which is technially needed in patch 08 already, but moved out for the sake
of reviewability. It might be squashed into patch 08 upon merging.
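
To illustrate the lock nesting the series converges on (just a sketch,
mirroring what vgic_enable_irqs() does by the end of the series): where
a VCPU lock is still required, it is taken first, and the new per-IRQ
lock nests inside it:

    spin_lock_irqsave(&v_target->arch.vgic.lock, vcpu_flags);
    vgic_irq_lock(p, flags);
    set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
    /* ... further pending_irq manipulation ... */
    vgic_irq_unlock(p, flags);
    spin_unlock_irqrestore(&v_target->arch.vgic.lock, vcpu_flags);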

As hinted above, still having to cope with two lists leads to some atrocities,
namely patch 03. This hideousness will vanish when the whole requirement of
queueing an IRQ in that early state goes away.

This is still somewhat work-in-progress, but I wanted to share the code
anyway, since I spent way too much time on it (rewriting it several times
on the way) and I am interested in some fresh pair of eyes to have a look.
Currently the target VCPU move (patch 18) leads to a deadlock and I just ran
out of time (before going on holidays) to debug this.
So if someone could have a look to see if this approach in general looks
good, I'd be grateful. I know that there is optimization potential (some
functions can surely be refactored), but I'd rather do one step after the
other.

Cheers,
Andre.

Andre Przywara (22):
  ARM: vGIC: introduce and initialize pending_irq lock
  ARM: vGIC: route/remove_irq: replace rank lock with IRQ lock
  ARM: vGIC: move gic_raise_inflight_irq() into vgic_vcpu_inject_irq()
  ARM: vGIC: rename pending_irq->priority to cur_priority
  ARM: vITS: rename pending_irq->lpi_priority to priority
  ARM: vGIC: introduce locking routines for multiple IRQs
  ARM: vGIC: introduce priority setter/getter
  ARM: vGIC: move virtual IRQ priority from rank to pending_irq
  ARM: vITS: protect LPI priority update with pending_irq lock
  ARM: vGIC: protect gic_set_lr() with pending_irq lock
  ARM: vGIC: protect gic_events_need_delivery() with pending_irq lock
  ARM: vGIC: protect gic_update_one_lr() with pending_irq lock
  ARM: vITS: remove no longer needed lpi_priority wrapper
  ARM: vGIC: move virtual IRQ configuration from rank to pending_irq
  ARM: vGIC: rework vgic_get_target_vcpu to take a pending_irq
  ARM: vITS: rename lpi_vcpu_id to vcpu_id
  ARM: vGIC: introduce vgic_lock_vcpu_irq()
  ARM: vGIC: move virtual IRQ target VCPU from rank to pending_irq
  ARM: vGIC: rework vgic_get_target_vcpu to take a domain instead of
vcpu
  ARM: vGIC: move virtual IRQ enable bit from rank to pending_irq
  ARM: vITS: injecting LPIs: use pending_irq lock
  ARM: vGIC: remove remaining irq_rank code

 xen/arch/arm/gic-v2.c|   2 +-
 xen/arch/arm/gic-v3-lpi.c|  14 +-
 xen/arch/arm/gic-v3.c|   2 +-
 xen/arch/arm/gic.c   |  96 
 xen/arch/arm/vgic-v2.c   | 161 -
 xen/arch/arm/vgic-v3-its.c   |  42 ++--
 xen/arch/arm/vgic-v3.c   | 182 +--
 xen/arch/arm/vgic.c  | 521 +++
 xen/include/asm-arm/domain.h |   6 +-
 xen/include/asm-arm/gic.h|   2 +-
 xen/include/asm-arm/vgic.h   | 114 +++---
 11 files changed, 540 insertions(+), 602 deletions(-)

-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 13/22] ARM: vITS: remove no longer needed lpi_priority wrapper

2017-07-21 Thread Andre Przywara
For LPIs we stored the priority value in struct pending_irq, but all
other types of IRQs were using the irq_rank structure for that.
Now that every IRQ is using pending_irq, we can remove the special handling
we had in place for LPIs and just use the now unified access wrappers.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic-v2.c |  7 ---
 xen/arch/arm/vgic-v3.c | 11 ---
 xen/include/asm-arm/vgic.h |  1 -
 3 files changed, 19 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index ed7ff3b..a3fd500 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -690,18 +690,11 @@ static struct pending_irq *vgic_v2_lpi_to_pending(struct 
domain *d,
 BUG();
 }
 
-static int vgic_v2_lpi_get_priority(struct domain *d, unsigned int vlpi)
-{
-/* Dummy function, no LPIs on a VGICv2. */
-BUG();
-}
-
 static const struct vgic_ops vgic_v2_ops = {
 .vcpu_init   = vgic_v2_vcpu_init,
 .domain_init = vgic_v2_domain_init,
 .domain_free = vgic_v2_domain_free,
 .lpi_to_pending = vgic_v2_lpi_to_pending,
-.lpi_get_priority = vgic_v2_lpi_get_priority,
 .max_vcpus = 8,
 };
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index e58e77e..d3356ae 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1757,23 +1757,12 @@ static struct pending_irq 
*vgic_v3_lpi_to_pending(struct domain *d,
 return pirq;
 }
 
-/* Retrieve the priority of an LPI from its struct pending_irq. */
-static int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)
-{
-struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
-
-ASSERT(p);
-
-return p->priority;
-}
-
 static const struct vgic_ops v3_ops = {
 .vcpu_init   = vgic_v3_vcpu_init,
 .domain_init = vgic_v3_domain_init,
 .domain_free = vgic_v3_domain_free,
 .emulate_reg  = vgic_v3_emulate_reg,
 .lpi_to_pending = vgic_v3_lpi_to_pending,
-.lpi_get_priority = vgic_v3_lpi_get_priority,
 /*
  * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
  * that can be supported is up to 4096(==256*16) in theory.
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 59d52c6..6343c95 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -143,7 +143,6 @@ struct vgic_ops {
 bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
 /* lookup the struct pending_irq for a given LPI interrupt */
 struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int vlpi);
-int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
 /* Maximum number of vCPU supported */
 const unsigned int max_vcpus;
 };
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 02/22] ARM: vGIC: route/remove_irq: replace rank lock with IRQ lock

2017-07-21 Thread Andre Przywara
So far the rank lock is protecting the physical IRQ routing for a
particular virtual IRQ (though this doesn't seem to be documented
anywhere). So although these functions don't really touch the rank
structure, the lock prevents them from running concurrently.
This seems a bit like a kludge; as we now have our newly introduced
per-IRQ lock, we can use that instead to get more natural protection
(and remove the first rank user).

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6c803bf..2c99d71 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -139,9 +139,7 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int 
virq,
 unsigned long flags;
 /* Use vcpu0 to retrieve the pending_irq struct. Given that we only
  * route SPIs to guests, it doesn't make any difference. */
-struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
-struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
-struct pending_irq *p = irq_to_pending(v_target, virq);
+struct pending_irq *p = irq_to_pending(d->vcpu[0], virq);
 int res = -EBUSY;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -150,7 +148,7 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int 
virq,
 ASSERT(virq < vgic_num_irqs(d));
 ASSERT(!is_lpi(virq));
 
-vgic_lock_rank(v_target, rank, flags);
+vgic_irq_lock(p, flags);
 
 if ( p->desc ||
  /* The VIRQ should not be already enabled by the guest */
@@ -168,7 +166,7 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int 
virq,
 res = 0;
 
 out:
-vgic_unlock_rank(v_target, rank, flags);
+vgic_irq_unlock(p, flags);
 
 return res;
 }
@@ -177,9 +175,7 @@ out:
 int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
   struct irq_desc *desc)
 {
-struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
-struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
-struct pending_irq *p = irq_to_pending(v_target, virq);
+struct pending_irq *p = irq_to_pending(d->vcpu[0], virq);
 unsigned long flags;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -187,7 +183,7 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned 
int virq,
 ASSERT(p->desc == desc);
 ASSERT(!is_lpi(virq));
 
-vgic_lock_rank(v_target, rank, flags);
+vgic_irq_lock(p, flags);
 
 if ( d->is_dying )
 {
@@ -207,7 +203,7 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned 
int virq,
 if ( test_bit(_IRQ_INPROGRESS, &desc->status) ||
  !test_bit(_IRQ_DISABLED, &desc->status) )
 {
-vgic_unlock_rank(v_target, rank, flags);
+vgic_irq_unlock(p, flags);
 return -EBUSY;
 }
 }
@@ -217,7 +213,7 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned 
int virq,
 
 p->desc = NULL;
 
-vgic_unlock_rank(v_target, rank, flags);
+vgic_irq_unlock(p, flags);
 
 return 0;
 }
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 07/22] ARM: vGIC: introduce priority setter/getter

2017-07-21 Thread Andre Przywara
Since the GIC's MMIO accesses always cover a number of IRQs at once,
introduce wrapper functions which loop over those IRQs, take their
locks and read or update the priority values.
This will be used in a later patch.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic.c| 37 +
 xen/include/asm-arm/vgic.h |  5 +
 2 files changed, 42 insertions(+)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 434b7e2..b2c9632 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -243,6 +243,43 @@ static int vgic_get_virq_priority(struct vcpu *v, unsigned 
int virq)
 return ACCESS_ONCE(rank->priority[virq & INTERRUPT_RANK_MASK]);
 }
 
+#define MAX_IRQS_PER_IPRIORITYR 4
+uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
+ unsigned int first_irq)
+{
+struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
+unsigned long flags;
+uint32_t ret = 0, i;
+
+local_irq_save(flags);
+vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
+
+for ( i = 0; i < nrirqs; i++ )
+ret |= pirqs[i]->priority << (i * 8);
+
+vgic_unlock_irqs(pirqs, nrirqs);
+local_irq_restore(flags);
+
+return ret;
+}
+
+void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
+ unsigned int first_irq, uint32_t value)
+{
+struct pending_irq *pirqs[MAX_IRQS_PER_IPRIORITYR];
+unsigned long flags;
+unsigned int i;
+
+local_irq_save(flags);
+vgic_lock_irqs(v, nrirqs, first_irq, pirqs);
+
+for ( i = 0; i < nrirqs; i++, value >>= 8 )
+pirqs[i]->priority = value & 0xff;
+
+vgic_unlock_irqs(pirqs, nrirqs);
+local_irq_restore(flags);
+}
+
 bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
 {
 unsigned long flags;
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index ecf4969..f3791c8 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -198,6 +198,11 @@ void vgic_lock_irqs(struct vcpu *v, unsigned int nrirqs, 
unsigned int first_irq,
 struct pending_irq **pirqs);
 void vgic_unlock_irqs(struct pending_irq **pirqs, unsigned int nrirqs);
 
+uint32_t vgic_fetch_irq_priority(struct vcpu *v, unsigned int nrirqs,
+ unsigned int first_irq);
+void vgic_store_irq_priority(struct vcpu *v, unsigned int nrirqs,
+ unsigned int first_irq, uint32_t reg);
+
 enum gic_sgi_mode;
 
 /*
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 06/22] ARM: vGIC: introduce locking routines for multiple IRQs

2017-07-21 Thread Andre Przywara
When replacing the rank lock with individual per-IRQ locks soon, we will
still need the ability to lock multiple IRQs.
Provide two helper routines which lock and unlock a number of consecutive
IRQs in the right order.
Looking forward, the locking function fills an array of pending_irq
pointers, so the lookup only has to be done once.
These routines expect that local_irq_save() has been called before the
lock routine and the respective local_irq_restore() after the unlock
function.
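
For illustration, a caller ends up looking roughly like this (a sketch
along the lines of the priority getter introduced in the next patch;
v, nrirqs and first_irq are assumed to be provided by the caller):

    struct pending_irq *pirqs[4];
    unsigned long flags;
    uint32_t value = 0;
    unsigned int i;

    local_irq_save(flags);
    vgic_lock_irqs(v, nrirqs, first_irq, pirqs);  /* locks in ascending IRQ order */
    for ( i = 0; i < nrirqs; i++ )
        value |= pirqs[i]->priority << (i * 8);
    vgic_unlock_irqs(pirqs, nrirqs);              /* unlocks in reverse order */
    local_irq_restore(flags);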

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/vgic.c| 20 
 xen/include/asm-arm/vgic.h |  4 
 2 files changed, 24 insertions(+)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 21b545e..434b7e2 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -375,6 +375,26 @@ static inline unsigned int vgic_get_virq_type(struct vcpu 
*v, int n, int index)
 return IRQ_TYPE_LEVEL_HIGH;
 }
 
+void vgic_lock_irqs(struct vcpu *v, unsigned int nrirqs,
+unsigned int first_irq, struct pending_irq **pirqs)
+{
+unsigned int i;
+
+for ( i = 0; i < nrirqs; i++ )
+{
+pirqs[i] = irq_to_pending(v, first_irq + i);
 spin_lock(&pirqs[i]->lock);
+}
+}
+
+void vgic_unlock_irqs(struct pending_irq **pirqs, unsigned int nrirqs)
+{
+int i;
+
+for ( i = nrirqs - 1; i >= 0; i-- )
 spin_unlock(&pirqs[i]->lock);
+}
+
 void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 {
 const unsigned long mask = r;
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 27b5e37..ecf4969 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -194,6 +194,10 @@ static inline int REG_RANK_NR(int b, uint32_t n)
 }
 }
 
+void vgic_lock_irqs(struct vcpu *v, unsigned int nrirqs, unsigned int 
first_irq,
+struct pending_irq **pirqs);
+void vgic_unlock_irqs(struct pending_irq **pirqs, unsigned int nrirqs);
+
 enum gic_sgi_mode;
 
 /*
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)

2017-07-21 Thread Dario Faggioli
On Fri, 2017-07-21 at 18:19 +0100, George Dunlap wrote:
> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> > index 4f6330e..85e014d 100644
> > --- a/xen/common/sched_credit.c
> > +++ b/xen/common/sched_credit.c
> > @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct
> > csched_vcpu *new)
>  idlers_empty = cpumask_empty(&idle_mask);
> >  
> >  /*
> > + * Exclusive pinning is when a vcpu has hard-affinity with
> > only one
> > + * cpu, and there is no other vcpu that has hard-affinity with
> > that
> > + * same cpu. This is infrequent, but if it happens, is for
> > achieving
> > + * the most possible determinism, and least possible overhead
> > for
> > + * the vcpus in question.
> > + *
> > + * Try to identify the vast majority of these situations, and
> > deal
> > + * with them quickly.
> > + */
> > +if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) 
> > == cpu &&
> 
> Won't this check entail a full "loop" of the cpumask?  It's cheap
> enough
> if nr_cpu_ids is small; but don't we support (theoretically) 4096
> logical cpus?
> 
> It seems like having a vcpu flag that identifies a vcpu as being
> pinned
> would be a more efficient way to do this.  That way we could run this
> check once whenever the hard affinity changed, rather than every time
> we
> want to think about where to run this vcpu.
> 
> What do you think?
> 
Right. We actually should get some help from the hardware (ffs &
friends)... but I think you're right. Implementing this with a flag, as
 you're suggesting, is most likely better, and easy enough.
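
Roughly something like this is what I have in mind (just a sketch; the
flag name and the helper are made up here):

    /* Run when hard affinity changes, instead of on every tickle. */
    static void vcpu_check_excl_pinning(struct vcpu *v)
    {
        /* Hypothetical flag: set iff the hard affinity is a single cpu. */
        if ( cpumask_weight(v->cpu_hard_affinity) == 1 )
            set_bit(_VPF_excl_pinned, &v->pause_flags);
        else
            clear_bit(_VPF_excl_pinned, &v->pause_flags);
    }

__runq_tickle() would then just test_bit() that flag instead of cycling
through the whole cpumask (the "no other vcpu pinned to that cpu" part
would need some extra bookkeeping on top).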

I'll go for that!

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/6] xen: credit2: rearrange members of control structures

2017-07-21 Thread Dario Faggioli
On Fri, 2017-07-21 at 18:05 +0100, George Dunlap wrote:
> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> > 
> > While there, improve the wording, style and alignment
> > of comments too.
> > 
> > Signed-off-by: Dario Faggioli 
> 
> I haven't taken a careful look at these; the idea sounds good and
> I'll
> trust that you've taken a careful look at them:
> 
Hehe... thanks! :-)

I've even done the whole thing twice. In fact, I was about to submit
the series, when I discovered that I did optimize the cache layout of a
debug build, and hence had to redo everything from the beginning! :-P

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/6] xen: RTDS: rearrange members of control structures

2017-07-21 Thread Dario Faggioli
On Fri, 2017-07-21 at 13:51 -0400, Meng Xu wrote:
> On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli
>  wrote:
> > 
> > Nothing changed in `pahole` output, in terms of holes
> > and padding, but some fields have been moved, to put
> > related members in same cache line.
> > 
> > Signed-off-by: Dario Faggioli 
> > ---
> > Cc: Meng Xu 
> > Cc: George Dunlap 
> > ---
> >  xen/common/sched_rt.c |   13 -
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > index 1b30014..39f6bee 100644
> > --- a/xen/common/sched_rt.c
> > +++ b/xen/common/sched_rt.c
> > @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
> >  struct rt_private {
> >  spinlock_t lock;/* the global coarse-grained lock
> > */
> >  struct list_head sdom;  /* list of availalbe domains, used
> > for dump */
> > +
> >  struct list_head runq;  /* ordered list of runnable vcpus
> > */
> >  struct list_head depletedq; /* unordered list of depleted
> > vcpus */
> > +
> > +struct timer *repl_timer;   /* replenishment timer */
> >  struct list_head replq; /* ordered list of vcpus that need
> > replenishment */
> > +
> >  cpumask_t tickled;  /* cpus been tickled */
> > -struct timer *repl_timer;   /* replenishment timer */
> >  };
> > 
> >  /*
> > @@ -185,10 +188,6 @@ struct rt_vcpu {
> >  struct list_head q_elem; /* on the runq/depletedq list */
> >  struct list_head replq_elem; /* on the replenishment events
> > list */
> > 
> > -/* Up-pointers */
> > -struct rt_dom *sdom;
> > -struct vcpu *vcpu;
> > -
> >  /* VCPU parameters, in nanoseconds */
> >  s_time_t period;
> >  s_time_t budget;
> > @@ -198,6 +197,10 @@ struct rt_vcpu {
> >  s_time_t last_start; /* last start time */
> >  s_time_t cur_deadline;   /* current deadline for EDF */
> > 
> > +/* Up-pointers */
> > +struct rt_dom *sdom;
> > +struct vcpu *vcpu;
> > +
> >  unsigned flags;  /* mark __RTDS_scheduled, etc..
> > */
> >  };
> > 
> 
> Reviewed-by: Meng Xu 
> 
> BTW, Dario, I'm wondering if you used any tool to give hints about
> how
> to arrange the fields in a structure or you just did it manually?
> 
I used pahole for figuring out the cache layout, but just that. So,
basically, I --manually-- moved the fields around, and checked
the result with pahole (and then did it again, and again. :-D).
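
(As a hedged illustration of that workflow, with a toy structure rather
than the real Xen one: build with debug info at the optimization level
you actually care about, run pahole on the resulting binary, move
fields, and repeat.)

    /*
     * Toy example of the iteration: inspect the layout with, e.g.,
     *
     *     pahole -C rt_private xen/xen-syms
     *
     * which reports size, holes, padding and cacheline boundaries; then
     * group the fields that are accessed together and re-check.
     */
    struct toy_private {
        /* Hot on every scheduling decision: keep these together. */
        spinlock_t lock;
        struct list_head runq;

        /* Only touched on replenishment events. */
        struct timer *repl_timer;
        struct list_head replq;
    };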

TBH, the improvement for RTDS is probably not even noticeable, as we
access almost all the fields anyway. But it still makes sense, IMO.

Thanks for the review,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



[Xen-devel] [xen-unstable-smoke test] 112104: tolerable trouble: broken/pass - PUSHED

2017-07-21 Thread osstest service owner
flight 112104 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112104/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  647de517b08e77b9b5f76d6853dddc759b8df0b4
baseline version:
 xen  73771b89fd9d89a23d5c7b760056fdaf94946be9

Last test of basis   112062  2017-07-20 18:14:31 Z1 days
Testing same since   112104  2017-07-21 18:18:21 Z0 days1 attempts


People who touched revisions under test:
  Dario Faggioli 
  George Dunlap 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=647de517b08e77b9b5f76d6853dddc759b8df0b4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
647de517b08e77b9b5f76d6853dddc759b8df0b4
+ branch=xen-unstable-smoke
+ revision=647de517b08e77b9b5f76d6853dddc759b8df0b4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' x647de517b08e77b9b5f76d6853dddc759b8df0b4 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : 

Re: [Xen-devel] [PATCH 22/25 v6] xen/arm: vpl011: Add support for vuart console in xenconsole

2017-07-21 Thread Stefano Stabellini
On Fri, 21 Jul 2017, Julien Grall wrote:
> Hi,
> 
> On 18/07/17 21:07, Stefano Stabellini wrote:
> > On Mon, 17 Jul 2017, Bhupinder Thakur wrote:
> > > This patch finally adds the support for vuart console. It adds
> > > two new fields in the console initialization:
> > > 
> > > - optional
> > > - prefer_gnttab
> > > 
> > > optional flag tells whether the console is optional.
> > > 
> > > prefer_gnttab tells whether the ring buffer should be allocated using
> > > grant table.
> > > 
> > > Signed-off-by: Bhupinder Thakur 
> > > ---
> > > CC: Ian Jackson 
> > > CC: Wei Liu 
> > > CC: Stefano Stabellini 
> > > CC: Julien Grall 
> > > 
> > > Changes since v4:
> > > - Renamed VUART_CFLAGS- to CFLAGS_vuart- in the Makefile as per the
> > > convention.
> > > 
> > >  config/arm32.mk   |  1 +
> > >  config/arm64.mk   |  1 +
> > >  tools/console/Makefile|  3 ++-
> > >  tools/console/daemon/io.c | 29 -
> > >  4 files changed, 32 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/config/arm32.mk b/config/arm32.mk
> > > index f95228e..b9f23fe 100644
> > > --- a/config/arm32.mk
> > > +++ b/config/arm32.mk
> > > @@ -1,5 +1,6 @@
> > >  CONFIG_ARM := y
> > >  CONFIG_ARM_32 := y
> > > +CONFIG_VUART_CONSOLE := y
> > >  CONFIG_ARM_$(XEN_OS) := y
> > > 
> > >  CONFIG_XEN_INSTALL_SUFFIX :=
> > 
> > What about leaving this off for ARM32 by default?
> 
> Why? This will only disable xenconsole changes and not the hypervisor. The
> changes are quite tiny, so I would even be in favor of enabling for all
> architectures.
> 
> Or are you suggesting to disable the VPL011 emulation in the hypervisor? But I
> don't see the emulation AArch64 specific, and a user could disable it if he
> doesn't want it...

I was thinking that the virtual pl011 is mostly useful for SBSA
compliance, which doesn't really apply to ARM32 (there are no ARM32 SBSA
compliant platforms as far as I am aware).

Given that we don't need vpl011 on ARM32, I thought we might as well
disable it. The less code, the better. I wouldn't go as far as introducing
more #ifdefs to disable it, but I would make use of the existing config
options to turn it off by default on ARM32. Does that make sense?

That said, you are right that there is no point in disabling only
CONFIG_VUART_CONSOLE, which affects the tools only. We should really
disable SBSA_VUART_CONSOLE by default on ARM32. In fact, ideally
CONFIG_VUART_CONSOLE would be set depending on the value of
SBSA_VUART_CONSOLE. What do you think?



Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-21 Thread Stefano Stabellini
On Fri, 21 Jul 2017, Julien Grall wrote:
> > >   @x86_cacheattrcan be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'.
> > > Default
> > > is 'wb'.
> > 
> > Also here, I would write:
> > 
> > @x86_cacheattr  Only 'wb' (write-back) is supported today.
> > 
> > Like you wrote later, begin and end addresses need to be multiple of 4K.
> 
> This is not true. The addresses should be a multiple of the hypervisor page
> granularity.
> 
> It will not be possible to map a 4K chunk in stage-2 when the hypervisor is
> using 16K or 64K page granularity.

Yes, but there are no 16K or 64K hypervisor pages now. So far, we have
not really attempted to say "granularity" for hypervisor pages rather than
4K, given that 4K has always been a solid assumption. But this doc could
be the right time to start doing that :-)
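
(A hedged sketch of what a granularity-aware check could look like; the
helper name is made up, and PAGE_SIZE is assumed to reflect the
hypervisor's page granularity, 4K everywhere today:)

    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed to reflect the hypervisor page granularity (4K today). */
    #define PAGE_SIZE 4096ULL

    /*
     * Illustrative only: validate that a shared-memory range from the
     * xl config is aligned to the page granularity, rather than to a
     * hard-coded 4K constant sprinkled around the code.
     */
    static bool shm_range_valid(uint64_t begin, uint64_t end)
    {
        return begin < end &&
               !(begin & (PAGE_SIZE - 1)) &&
               !(end & (PAGE_SIZE - 1));
    }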



Re: [Xen-devel] [PATCH] xen/pvcalls: use WARN_ON(1) instead of __WARN()

2017-07-21 Thread Stefano Stabellini
On Fri, 21 Jul 2017, Arnd Bergmann wrote:
> __WARN() is an internal helper that is only available on
> some architectures, but causes a build error e.g. on ARM64
> in some configurations:
> 
> drivers/xen/pvcalls-back.c: In function 'set_backend_state':
> drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function 
> '__WARN' [-Werror=implicit-function-declaration]
> 
> Unfortunately, there is no equivalent of BUG() that takes no
> arguments, but WARN_ON(1) is commonly used in other drivers
> and works on all configurations.
> 
> Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling")
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Stefano Stabellini 


> ---
>  drivers/xen/pvcalls-back.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
> index d6c4c4aecb41..00c1a2344330 100644
> --- a/drivers/xen/pvcalls-back.c
> +++ b/drivers/xen/pvcalls-back.c
> @@ -1094,7 +1094,7 @@ static void set_backend_state(struct xenbus_device *dev,
>   xenbus_switch_state(dev, XenbusStateClosing);
>   break;
>   default:
> - __WARN();
> + WARN_ON(1);
>   }
>   break;
>   case XenbusStateInitWait:
> @@ -1109,7 +1109,7 @@ static void set_backend_state(struct xenbus_device *dev,
>   xenbus_switch_state(dev, XenbusStateClosing);
>   break;
>   default:
> - __WARN();
> + WARN_ON(1);
>   }
>   break;
>   case XenbusStateConnected:
> @@ -1123,7 +1123,7 @@ static void set_backend_state(struct xenbus_device *dev,
>   xenbus_switch_state(dev, XenbusStateClosing);
>   break;
>   default:
> - __WARN();
> + WARN_ON(1);
>   }
>   break;
>   case XenbusStateClosing:
> @@ -1134,11 +1134,11 @@ static void set_backend_state(struct xenbus_device 
> *dev,
>   xenbus_switch_state(dev, XenbusStateClosed);
>   break;
>   default:
> - __WARN();
> + WARN_ON(1);
>   }
>   break;
>   default:
> - __WARN();
> + WARN_ON(1);
>   }
>   }
>  }
> -- 
> 2.9.0
> 



Re: [Xen-devel] [PULL for-2.10 6/7] xen/mapcache: introduce xen_replace_cache_entry()

2017-07-21 Thread Igor Druzhinin

On 21/07/17 14:50, Anthony PERARD wrote:

On Tue, Jul 18, 2017 at 03:22:41PM -0700, Stefano Stabellini wrote:

From: Igor Druzhinin 


...


+static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
+ hwaddr new_phys_addr,
+ hwaddr size)
+{
+MapCacheEntry *entry;
+hwaddr address_index, address_offset;
+hwaddr test_bit_size, cache_size = size;
+
+address_index  = old_phys_addr >> MCACHE_BUCKET_SHIFT;
+address_offset = old_phys_addr & (MCACHE_BUCKET_SIZE - 1);
+
+assert(size);
+/* test_bit_size is always a multiple of XC_PAGE_SIZE */
+test_bit_size = size + (old_phys_addr & (XC_PAGE_SIZE - 1));
+if (test_bit_size % XC_PAGE_SIZE) {
+test_bit_size += XC_PAGE_SIZE - (test_bit_size % XC_PAGE_SIZE);
+}
+cache_size = size + address_offset;
+if (cache_size % MCACHE_BUCKET_SIZE) {
+cache_size += MCACHE_BUCKET_SIZE - (cache_size % MCACHE_BUCKET_SIZE);
+}
+
+entry = &mapcache->entry[address_index % mapcache->nr_buckets];
+while (entry && !(entry->paddr_index == address_index &&
+  entry->size == cache_size)) {
+entry = entry->next;
+}
+if (!entry) {
+DPRINTF("Trying to update an entry for %lx " \
+"that is not in the mapcache!\n", old_phys_addr);
+return NULL;
+}
+
+address_index  = new_phys_addr >> MCACHE_BUCKET_SHIFT;
+address_offset = new_phys_addr & (MCACHE_BUCKET_SIZE - 1);
+
+fprintf(stderr, "Replacing a dummy mapcache entry for %lx with %lx\n",
+old_phys_addr, new_phys_addr);


Looks like this does not build on 32 bits.
in: 
http://logs.test-lab.xenproject.org/osstest/logs/112041/build-i386/6.ts-xen-build.log

/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:
 In function 'xen_replace_cache_entry_unlocked':
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
 error: format '%lx' expects argument of type 'long unsigned int', but argument 
3 has type 'hwaddr' [-Werror=format=]
  old_phys_addr, new_phys_addr);
  ^
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
 error: format '%lx' expects argument of type 'long unsigned int', but argument 
4 has type 'hwaddr' [-Werror=format=]
cc1: all warnings being treated as errors
   CC  i386-softmmu/target/i386/gdbstub.o
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/rules.mak:66: 
recipe for target 'hw/i386/xen/xen-mapcache.o' failed


+
+xen_remap_bucket(entry, entry->vaddr_base,
+ cache_size, address_index, false);
+if (!test_bits(address_offset >> XC_PAGE_SHIFT,
+test_bit_size >> XC_PAGE_SHIFT,
+entry->valid_mapping)) {
+DPRINTF("Unable to update a mapcache entry for %lx!\n", old_phys_addr);
+return NULL;
+}
+
+return entry->vaddr_base + address_offset;
+}
+




Please, accept the attached patch to fix the issue.

Igor
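
(For context, a minimal illustration of the portability rule involved,
hedged: HWADDR_PRIx is QEMU's generic format macro for hwaddr, while the
attached patch uses the target-specific TARGET_FMT_plx instead:)

    /* hwaddr is 64 bits even on 32-bit hosts, so "%lx" breaks there. */
    #include "qemu/osdep.h"
    #include "exec/hwaddr.h"

    static void print_entry(hwaddr addr)
    {
        fprintf(stderr, "entry at 0x%" HWADDR_PRIx "\n", addr);
    }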
From 69a3afa453e283e92ddfd76109b203a20a02524c Mon Sep 17 00:00:00 2001
From: Igor Druzhinin 
Date: Fri, 21 Jul 2017 19:27:41 +0100
Subject: [PATCH] xen: fix compilation on 32-bit hosts

Signed-off-by: Igor Druzhinin 
---
 hw/i386/xen/xen-mapcache.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/i386/xen/xen-mapcache.c b/hw/i386/xen/xen-mapcache.c
index 84cc4a2..540406a 100644
--- a/hw/i386/xen/xen-mapcache.c
+++ b/hw/i386/xen/xen-mapcache.c
@@ -529,7 +529,7 @@ static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
 entry = entry->next;
 }
 if (!entry) {
-DPRINTF("Trying to update an entry for %lx " \
+DPRINTF("Trying to update an entry for "TARGET_FMT_plx \
 "that is not in the mapcache!\n", old_phys_addr);
 return NULL;
 }
@@ -537,15 +537,16 @@ static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
 address_index  = new_phys_addr >> MCACHE_BUCKET_SHIFT;
 address_offset = new_phys_addr & (MCACHE_BUCKET_SIZE - 1);
 
-fprintf(stderr, "Replacing a dummy mapcache entry for %lx with %lx\n",
-old_phys_addr, new_phys_addr);
+fprintf(stderr, "Replacing a dummy mapcache entry for "TARGET_FMT_plx \
+" with "TARGET_FMT_plx"\n", old_phys_addr, new_phys_addr);
 
 xen_remap_bucket(entry, entry->vaddr_base,
  cache_size, address_index, false);
if (!test_bits(address_offset >> XC_PAGE_SHIFT,
 test_bit_size >> XC_PAGE_SHIFT,
 entry->valid_mapping)) {
-DPRINTF("Unable to update a mapcache entry for %lx!\n", old_phys_addr);
+DPRINTF("Unable to update a mapcache entry for "TARGET_FMT_plx"!\n",
+old_phys_addr);
 

Re: [Xen-devel] [PATCH] xen/pvcalls: use WARN_ON(1) instead of __WARN()

2017-07-21 Thread Boris Ostrovsky
On 07/21/2017 12:17 PM, Arnd Bergmann wrote:
> __WARN() is an internal helper that is only available on
> some architectures, but causes a build error e.g. on ARM64
> in some configurations:
>
> drivers/xen/pvcalls-back.c: In function 'set_backend_state':
> drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function 
> '__WARN' [-Werror=implicit-function-declaration]
>
> Unfortunately, there is no equivalent of BUG() that takes no
> arguments, but WARN_ON(1) is commonly used in other drivers
> and works on all configurations.
>
> Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling")
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Boris Ostrovsky 





Re: [Xen-devel] [PATCH 5/6] xen: RTDS: rearrange members of control structures

2017-07-21 Thread Meng Xu
On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli
 wrote:
>
> Nothing changed in `pahole` output, in terms of holes
> and padding, but some fields have been moved, to put
> related members in same cache line.
>
> Signed-off-by: Dario Faggioli 
> ---
> Cc: Meng Xu 
> Cc: George Dunlap 
> ---
>  xen/common/sched_rt.c |   13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 1b30014..39f6bee 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
>  struct rt_private {
>  spinlock_t lock;/* the global coarse-grained lock */
>  struct list_head sdom;  /* list of availalbe domains, used for dump 
> */
> +
>  struct list_head runq;  /* ordered list of runnable vcpus */
>  struct list_head depletedq; /* unordered list of depleted vcpus */
> +
> +struct timer *repl_timer;   /* replenishment timer */
>  struct list_head replq; /* ordered list of vcpus that need 
> replenishment */
> +
>  cpumask_t tickled;  /* cpus been tickled */
> -struct timer *repl_timer;   /* replenishment timer */
>  };
>
>  /*
> @@ -185,10 +188,6 @@ struct rt_vcpu {
>  struct list_head q_elem; /* on the runq/depletedq list */
>  struct list_head replq_elem; /* on the replenishment events list */
>
> -/* Up-pointers */
> -struct rt_dom *sdom;
> -struct vcpu *vcpu;
> -
>  /* VCPU parameters, in nanoseconds */
>  s_time_t period;
>  s_time_t budget;
> @@ -198,6 +197,10 @@ struct rt_vcpu {
>  s_time_t last_start; /* last start time */
>  s_time_t cur_deadline;   /* current deadline for EDF */
>
> +/* Up-pointers */
> +struct rt_dom *sdom;
> +struct vcpu *vcpu;
> +
>  unsigned flags;  /* mark __RTDS_scheduled, etc.. */
>  };
>

Reviewed-by: Meng Xu 

BTW, Dario, I'm wondering if you used any tool to give hints about how
to arrange the fields in a structure or you just did it manually?

Thanks,

Meng

---
Meng Xu
PhD Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/



[Xen-devel] [linux-3.18 test] 112085: regressions - trouble: blocked/broken/fail/pass

2017-07-21 Thread osstest service owner
flight 112085 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112085/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-arndale   4 host-install(4)broken REGR. vs. 111920
 test-armhf-armhf-libvirt-raw  7 xen-boot fail REGR. vs. 111920
 test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-localmigrate/x10 fail REGR. 
vs. 111920

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail like 111893
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 111893
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 111893
 test-amd64-i386-freebsd10-amd64 19 guest-start/freebsd.repeat fail like 111920
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 111920
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 111920
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 111920
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 111920
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 build-arm64-pvops 6 kernel-build fail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linuxdd8b674caeef9381345a6369fba29d425ff433f3
baseline version:
 linux4d29e8c0e9319ce9d391c57d3133306c05b6cef5

Last test of basis   111920  2017-07-17 06:21:48 Z4 days
Testing same since   112085  2017-07-21 06:22:28 Z0 days1 attempts


People who touched revisions under test:
  Adam Borowski 
  Amit Pundir 
  Andrew Morton 
  Andrey Konovalov 
  Arend van Spriel 
  Ben Hutchings 
  Cong Wang 
  Cyril Bur 
  Dan Carpenter 
  David Ahern 

Re: [Xen-devel] [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)

2017-07-21 Thread George Dunlap
On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Exclusive pinning of vCPUs is used, sometimes, for
> achieving the highest level of determinism, and the
> least possible overhead, for the vCPUs in question.
> 
> Although static 1:1 pinning is not recommended, for
> general use cases, optimizing the tickling code (of
> Credit1 and Credit2) is easy and cheap enough, so go
> for it.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Anshul Makkar 
> ---
>  xen/common/sched_credit.c|   19 +++
>  xen/common/sched_credit2.c   |   21 -
>  xen/include/xen/perfc_defn.h |1 +
>  3 files changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 4f6330e..85e014d 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct csched_vcpu *new)
>  idlers_empty = cpumask_empty(&idle_mask);
>  
>  /*
> + * Exclusive pinning is when a vcpu has hard-affinity with only one
> + * cpu, and there is no other vcpu that has hard-affinity with that
> + * same cpu. This is infrequent, but if it happens, is for achieving
> + * the most possible determinism, and least possible overhead for
> + * the vcpus in question.
> + *
> + * Try to identify the vast majority of these situations, and deal
> + * with them quickly.
> + */
> +if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu &&

Won't this check entail a full "loop" of the cpumask?  It's cheap enough
if nr_cpu_ids is small; but don't we support (theoretically) 4096
logical cpus?

It seems like having a vcpu flag that identifies a vcpu as being pinned
would be a more efficient way to do this.  That way we could run this
check once whenever the hard affinity changed, rather than every time we
want to think about where to run this vcpu.

What do you think?

 -George



[Xen-devel] [PATCH] xen-blkfront: Fix handling of non-supported operations

2017-07-21 Thread Bart Van Assche
This patch fixes the following sparse warnings:

drivers/block/xen-blkfront.c:916:45: warning: incorrect type in argument 2 
(different base types)
drivers/block/xen-blkfront.c:916:45:expected restricted blk_status_t 
[usertype] error
drivers/block/xen-blkfront.c:916:45:got int [signed] error
drivers/block/xen-blkfront.c:1599:47: warning: incorrect type in assignment 
(different base types)
drivers/block/xen-blkfront.c:1599:47:expected int [signed] error
drivers/block/xen-blkfront.c:1599:47:got restricted blk_status_t [usertype] 

drivers/block/xen-blkfront.c:1607:55: warning: incorrect type in assignment 
(different base types)
drivers/block/xen-blkfront.c:1607:55:expected int [signed] error
drivers/block/xen-blkfront.c:1607:55:got restricted blk_status_t [usertype] 

drivers/block/xen-blkfront.c:1625:55: warning: incorrect type in assignment 
(different base types)
drivers/block/xen-blkfront.c:1625:55:expected int [signed] error
drivers/block/xen-blkfront.c:1625:55:got restricted blk_status_t [usertype] 

drivers/block/xen-blkfront.c:1628:62: warning: restricted blk_status_t degrades 
to integer

Compile-tested only.
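
(For background, a hedged sketch of why sparse complains: blk_status_t
is a "bitwise" restricted type, declared in the very commit named below,
so implicit mixing with plain int is flagged. Types and macros here are
as provided by the kernel headers:)

    /* As in <linux/blk_types.h>: a sparse-checked, restricted type. */
    typedef u8 __bitwise blk_status_t;

    static void example(void)
    {
        blk_status_t ok = BLK_STS_NOTSUPP;  /* fine: same restricted type */
        int bad = ok;                       /* sparse: different base types */
        (void)bad;
    }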

Fixes: commit 2a842acab109 ("block: introduce new block status code type")
Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Konrad Rzeszutek Wilk 
Cc: Roger Pau Monné 
Cc: 
Cc: 
---
 drivers/block/xen-blkfront.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index c852ed3c01d5..1799bba74390 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -111,7 +111,7 @@ struct blk_shadow {
 };
 
 struct blkif_req {
-   int error;
+   blk_status_terror;
 };
 
 static inline struct blkif_req *blkif_req(struct request *rq)
@@ -1616,7 +1616,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) {
printk(KERN_WARNING "blkfront: %s: %s op 
failed\n",
   info->gd->disk_name, 
op_name(bret->operation));
-   blkif_req(req)->error = -EOPNOTSUPP;
+   blkif_req(req)->error = BLK_STS_NOTSUPP;
}
if (unlikely(bret->status == BLKIF_RSP_ERROR &&
 rinfo->shadow[id].req.u.rw.nr_segments == 
0)) {
-- 
2.13.2




Re: [Xen-devel] [PATCH 4/6] xen: credit2: rearrange members of control structures

2017-07-21 Thread George Dunlap
On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> With the aim of improving memory size and layout, and
> at the same time trying to make related fields reside
> in the same cacheline.
> 
> Here's a summary of the output of `pahole`, with and
> without this patch, for the affected data structures.
> 
> csched2_runqueue_data:
>  * Before:
> size: 216, cachelines: 4, members: 14
> sum members: 208, holes: 2, sum holes: 8
> last cacheline: 24 bytes
>  * After:
> size: 208, cachelines: 4, members: 14
> last cacheline: 16 bytes
> 
> csched2_private:
>  * Before:
> size: 120, cachelines: 2, members: 8
> sum members: 112, holes: 1, sum holes: 4
> padding: 4
> last cacheline: 56 bytes
>  * After:
> size: 112, cachelines: 2, members: 8
> last cacheline: 48 bytes
> 
> csched2_vcpu:
>  * Before:
> size: 112, cachelines: 2, members: 14
> sum members: 108, holes: 1, sum holes: 4
> last cacheline: 48 bytes
>  * After:
> size: 112, cachelines: 2, members: 14
> padding: 4
> last cacheline: 48 bytes
> 
> While there, improve the wording, style and alignment
> of comments too.
> 
> Signed-off-by: Dario Faggioli 

I haven't taken a careful look at these; the idea sounds good and I'll
trust that you've taken a careful look at them:

Acked-by: George Dunlap 



Re: [Xen-devel] [PATCH 5/6] xen: RTDS: rearrange members of control structures

2017-07-21 Thread George Dunlap
On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Nothing changed in `pahole` output, in terms of holes
> and padding, but some fields have been moved, to put
> related members in same cache line.
> 
> Signed-off-by: Dario Faggioli 

Acked-by: George Dunlap 

> ---
> Cc: Meng Xu 
> Cc: George Dunlap 
> ---
>  xen/common/sched_rt.c |   13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 1b30014..39f6bee 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
>  struct rt_private {
>  spinlock_t lock;/* the global coarse-grained lock */
>  struct list_head sdom;  /* list of availalbe domains, used for dump 
> */
> +
>  struct list_head runq;  /* ordered list of runnable vcpus */
>  struct list_head depletedq; /* unordered list of depleted vcpus */
> +
> +struct timer *repl_timer;   /* replenishment timer */
>  struct list_head replq; /* ordered list of vcpus that need 
> replenishment */
> +
>  cpumask_t tickled;  /* cpus been tickled */
> -struct timer *repl_timer;   /* replenishment timer */
>  };
>  
>  /*
> @@ -185,10 +188,6 @@ struct rt_vcpu {
>  struct list_head q_elem; /* on the runq/depletedq list */
>  struct list_head replq_elem; /* on the replenishment events list */
>  
> -/* Up-pointers */
> -struct rt_dom *sdom;
> -struct vcpu *vcpu;
> -
>  /* VCPU parameters, in nanoseconds */
>  s_time_t period;
>  s_time_t budget;
> @@ -198,6 +197,10 @@ struct rt_vcpu {
>  s_time_t last_start; /* last start time */
>  s_time_t cur_deadline;   /* current deadline for EDF */
>  
> +/* Up-pointers */
> +struct rt_dom *sdom;
> +struct vcpu *vcpu;
> +
>  unsigned flags;  /* mark __RTDS_scheduled, etc.. */
>  };
>  
> 




Re: [Xen-devel] [PATCH] docs: fix superpage default value

2017-07-21 Thread Konrad Rzeszutek Wilk
On Fri, Jul 21, 2017 at 05:51:02PM +0100, Wei Liu wrote:
> On Fri, Jul 21, 2017 at 12:44:18PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Thu, Jul 20, 2017 at 01:57:17PM +0100, Wei Liu wrote:
> > > On Thu, Jul 20, 2017 at 12:49:37PM +0100, Andrew Cooper wrote:
> > > > On 20/07/17 12:47, Wei Liu wrote:
> > > > > On Thu, Jul 20, 2017 at 12:45:38PM +0100, Roger Pau Monné wrote:
> > > > > > On Thu, Jul 20, 2017 at 12:35:56PM +0100, Wei Liu wrote:
> > > > > > > The code says it defaults to false.
> > > > > > > 
> > > > > > > Signed-off-by: Wei Liu 
> > > > > > > ---
> > > > > > > Cc: Andrew Cooper 
> > > > > > > Cc: George Dunlap 
> > > > > > > Cc: Ian Jackson 
> > > > > > > Cc: Jan Beulich 
> > > > > > > Cc: Konrad Rzeszutek Wilk 
> > > > > > > Cc: Stefano Stabellini 
> > > > > > > Cc: Tim Deegan 
> > > > > > > Cc: Wei Liu 
> > > > > > > ---
> > > > > > >   docs/misc/xen-command-line.markdown | 2 +-
> > > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/docs/misc/xen-command-line.markdown 
> > > > > > > b/docs/misc/xen-command-line.markdown
> > > > > > > index 3f90c3b7a8..f524294aa6 100644
> > > > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > > > @@ -136,7 +136,7 @@ mode during S3 resume.
> > > > > > >   ### allowsuperpage
> > > > > > >   > `= `
> > > > > > > -> Default: `true`
> > > > > > > +> Default: `false`
> > > > > > >   Permit Xen to use superpages when performing memory management.
> > > > > > I'm not an expert on Xen MM code, but isn't this intended for PV
> > > > > > guests? The description above makes it look like this is for Xen
> > > > > > itself, but AFAICT from skimming over the code this seems to be a PV
> > > > > > feature, in which case the text above should be fixed to prevent
> > > > > > confusion.
> > > > > I believe it is PV only, but I'm not 100% sure.
> > > > > 
> > > > > I would love to fix the text as well if possible.
> > > > 
> > > > I'm fairly sure this option applies exclusively to PV superpages. Double
> > > > check the logic through the code, but I think (since dropping 32bit
> > > > support), we have no configuration where Xen might not be able to use
> > > > superpages.
> > > > 
> > > 
> > > So we can just delete this option and make Xen always use superpage?
> > > That would be fine by me, too.
> > 
> > Can we just nuke the code altogther?
> > 
> > Oracle is not using it anymore.
> 
> Sure! I was about to ask you about that.
> 
> I'm happy to submit patches to nuke it from both the hypervisor and toolstack.

Feel free to add Acked-by: Konrad Rzeszutek Wilk 
on them :-)



Re: [Xen-devel] [xen-devel][xen/Arm]xen fail to boot on omap5 board

2017-07-21 Thread Andrii Anisov

Hello Julien,


On 21.07.17 15:52, Julien Grall wrote:
This is very early boot in head.S so having the full log will not 
really help here...


What is more interesting is where the different modules have been 
loaded in memory:

- Device Tree
- Kernel
- Xen
- Initramfs (if any)
Well, actually I suspect HYP mode is not enabled. Enabling it was tricky
some time ago, and I am not sure that support was ever upstreamed to u-boot.

But yes, the print mentioned above comes after the HYP mode check.
IMHO a log starting from the moment the board is powered on would give
more precise information about the situation.


--

*Andrii Anisov*





Re: [Xen-devel] [PATCH 3/6] xen: credit: rearrange members of control structures

2017-07-21 Thread George Dunlap
On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> With the aim of improving memory size and layout, and
> at the same time trying to make related fields reside
> in the same cacheline.
> 
> Here's a summary of the output of `pahole`, with and
> without this patch, for the affected data structures.
> 
> csched_pcpu:
>  * Before:
> size: 88, cachelines: 2, members: 6
> sum members: 80, holes: 1, sum holes: 4
> padding: 4
> paddings: 1, sum paddings: 5
> last cacheline: 24 bytes
>  * After:
> size: 80, cachelines: 2, members: 6
> paddings: 1, sum paddings: 5
> last cacheline: 16 bytes
> 
> csched_vcpu:
>  * Before:
> size: 72, cachelines: 2, members: 9
> padding: 2
> last cacheline: 8 bytes
>  * After:
> same numbers, but move some fields to put
> related fields in same cache line.
> 
> csched_private:
>  * Before:
> size: 152, cachelines: 3, members: 17
> sum members: 140, holes: 2, sum holes: 8
> padding: 4
> paddings: 1, sum paddings: 5
> last cacheline: 24 bytes
>  * After:
> same numbers, but move some fields to put
> related fields in same cache line.
> 
> Signed-off-by: Dario Faggioli 

Acked-by: George Dunlap 

> ---
> Cc: George Dunlap 
> Cc: Anshul Makkar 
> ---
>  xen/common/sched_credit.c |   41 ++---
>  1 file changed, 26 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index efdf6bf..4f6330e 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -169,10 +169,12 @@ integer_param("sched_credit_tslice_ms", 
> sched_credit_tslice_ms);
>  struct csched_pcpu {
>  struct list_head runq;
>  uint32_t runq_sort_last;
> -struct timer ticker;
> -unsigned int tick;
> +
>  unsigned int idle_bias;
>  unsigned int nr_runnable;
> +
> +unsigned int tick;
> +struct timer ticker;
>  };
>  
>  /*
> @@ -181,13 +183,18 @@ struct csched_pcpu {
>  struct csched_vcpu {
>  struct list_head runq_elem;
>  struct list_head active_vcpu_elem;
> +
> +/* Up-pointers */
>  struct csched_dom *sdom;
>  struct vcpu *vcpu;
> -atomic_t credit;
> -unsigned int residual;
> +
>  s_time_t start_time;   /* When we were scheduled (used for credit) */
>  unsigned flags;
> -int16_t pri;
> +int pri;
> +
> +atomic_t credit;
> +unsigned int residual;
> +
>  #ifdef CSCHED_STATS
>  struct {
>  int credit_last;
> @@ -219,21 +226,25 @@ struct csched_dom {
>  struct csched_private {
>  /* lock for the whole pluggable scheduler, nests inside cpupool_lock */
>  spinlock_t lock;
> -struct list_head active_sdom;
> -uint32_t ncpus;
> -struct timer  master_ticker;
> -unsigned int master;
> +
>  cpumask_var_t idlers;
>  cpumask_var_t cpus;
> +uint32_t *balance_bias;
> +uint32_t runq_sort;
> +unsigned int ratelimit_us;
> +
> +/* Period of master and tick in milliseconds */
> +unsigned int tslice_ms, tick_period_us, ticks_per_tslice;
> +uint32_t ncpus;
> +
> +struct list_head active_sdom;
>  uint32_t weight;
>  uint32_t credit;
>  int credit_balance;
> -uint32_t runq_sort;
> -uint32_t *balance_bias;
> -unsigned ratelimit_us;
> -/* Period of master and tick in milliseconds */
> -unsigned tslice_ms, tick_period_us, ticks_per_tslice;
> -unsigned credits_per_tslice;
> +unsigned int credits_per_tslice;
> +
> +unsigned int master;
> +struct timer master_ticker;
>  };
>  
>  static void csched_tick(void *_cpu);
> 




Re: [Xen-devel] [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu

2017-07-21 Thread George Dunlap
On 06/23/2017 11:54 AM, Dario Faggioli wrote:
> Instead of keeping an NR_CPUS big array of int-s,
> directly inside csched2_private, use a per-cpu
> variable.
> 
> That's especially beneficial (in terms of saved
> memory) when there are more instances of Credit2 (in
> different cpupools), and also helps fit
> csched2_private itself into CPU caches.
> 
> Signed-off-by: Dario Faggioli 

Sounds good:

Acked-by: George Dunlap 

> ---
> Cc: George Dunlap 
> Cc: Anshul Makkar 
> ---
>  xen/common/sched_credit2.c |   33 -
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 10d9488..15862f2 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -383,7 +383,6 @@ struct csched2_private {
>  
>  struct list_head sdom; /* Used mostly for dump keyhandler. */
>  
> -int runq_map[NR_CPUS];
>  cpumask_t active_queues; /* Queues which may have active cpus */
>  struct csched2_runqueue_data *rqd;
>  
> @@ -393,6 +392,14 @@ struct csched2_private {
>  };
>  
>  /*
> + * Physical CPU
> + *
> + * The only per-pCPU information we need to maintain is of which runqueue
> + * each CPU is part of.
> + */
> +static DEFINE_PER_CPU(int, runq_map);
> +
> +/*
>   * Virtual CPU
>   */
>  struct csched2_vcpu {
> @@ -448,16 +455,16 @@ static inline struct csched2_dom *csched2_dom(const 
> struct domain *d)
>  }
>  
>  /* CPU to runq_id macro */
> -static inline int c2r(const struct scheduler *ops, unsigned int cpu)
> +static inline int c2r(unsigned int cpu)
>  {
> -return csched2_priv(ops)->runq_map[(cpu)];
> +return per_cpu(runq_map, cpu);
>  }
>  
>  /* CPU to runqueue struct macro */
>  static inline struct csched2_runqueue_data *c2rqd(const struct scheduler 
> *ops,
>unsigned int cpu)
>  {
> -return &csched2_priv(ops)->rqd[c2r(ops, cpu)];
> +return &csched2_priv(ops)->rqd[c2r(cpu)];
>  }
>  
>  /*
> @@ -1082,7 +1089,7 @@ runq_insert(const struct scheduler *ops, struct 
> csched2_vcpu *svc)
>  ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
>  
>  ASSERT(!vcpu_on_runq(svc));
> -ASSERT(c2r(ops, cpu) == c2r(ops, svc->vcpu->processor));
> +ASSERT(c2r(cpu) == c2r(svc->vcpu->processor));
>  
>  ASSERT(&svc->rqd->runq == runq);
>  ASSERT(!is_idle_vcpu(svc->vcpu));
> @@ -1733,7 +1740,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  if ( min_rqi == -1 )
>  {
>  new_cpu = get_fallback_cpu(svc);
> -min_rqi = c2r(ops, new_cpu);
> +min_rqi = c2r(new_cpu);
>  min_avgload = prv->rqd[min_rqi].b_avgload;
>  goto out_up;
>  }
> @@ -2622,7 +2629,7 @@ csched2_schedule(
>  unsigned tasklet:8, idle:8, smt_idle:8, tickled:8;
>  } d;
>  d.cpu = cpu;
> -d.rq_id = c2r(ops, cpu);
> +d.rq_id = c2r(cpu);
>  d.tasklet = tasklet_work_scheduled;
>  d.idle = is_idle_vcpu(current);
> d.smt_idle = cpumask_test_cpu(cpu, &rqd->smt_idle);
> @@ -2783,7 +2790,7 @@ dump_pcpu(const struct scheduler *ops, int cpu)
>  #define cpustr keyhandler_scratch
>  
>  cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, 
> cpu));
> -printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(ops, cpu), cpustr);
> +printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(cpu), cpustr);
>  cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu));
>  printk("core=%s\n", cpustr);
>  
> @@ -2930,7 +2937,7 @@ init_pdata(struct csched2_private *prv, unsigned int 
> cpu)
>  }
>  
>  /* Set the runqueue map */
> -prv->runq_map[cpu] = rqi;
> +per_cpu(runq_map, cpu) = rqi;
>  
>  __cpumask_set_cpu(cpu, &rqd->idle);
>  __cpumask_set_cpu(cpu, &rqd->active);
> @@ -3034,7 +3041,7 @@ csched2_deinit_pdata(const struct scheduler *ops, void 
> *pcpu, int cpu)
>  ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
>  
>  /* Find the old runqueue and remove this cpu from it */
> -rqi = prv->runq_map[cpu];
> +rqi = per_cpu(runq_map, cpu);
>  
>  rqd = prv->rqd + rqi;
>  
> @@ -3055,6 +3062,8 @@ csched2_deinit_pdata(const struct scheduler *ops, void 
> *pcpu, int cpu)
>  else if ( rqd->pick_bias == cpu )
>  rqd->pick_bias = cpumask_first(&rqd->active);
>  
> +per_cpu(runq_map, cpu) = -1;
> +
>  spin_unlock(>lock);
>  
>  __cpumask_clear_cpu(cpu, &prv->initialized);
> @@ -3121,10 +3130,8 @@ csched2_init(struct scheduler *ops)
>  return -ENOMEM;
>  }
>  for ( i = 0; i < nr_cpu_ids; i++ )
> -{
> -prv->runq_map[i] = -1;
>  prv->rqd[i].id = -1;
> -}
> +
>  /* initialize ratelimit */
>  prv->ratelimit_us = sched_ratelimit_us;
>  
> 



Re: [Xen-devel] [PATCH] docs: fix superpage default value

2017-07-21 Thread Wei Liu
On Fri, Jul 21, 2017 at 12:44:18PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 20, 2017 at 01:57:17PM +0100, Wei Liu wrote:
> > On Thu, Jul 20, 2017 at 12:49:37PM +0100, Andrew Cooper wrote:
> > > On 20/07/17 12:47, Wei Liu wrote:
> > > > On Thu, Jul 20, 2017 at 12:45:38PM +0100, Roger Pau Monné wrote:
> > > > > On Thu, Jul 20, 2017 at 12:35:56PM +0100, Wei Liu wrote:
> > > > > > The code says it defaults to false.
> > > > > > 
> > > > > > Signed-off-by: Wei Liu 
> > > > > > ---
> > > > > > Cc: Andrew Cooper 
> > > > > > Cc: George Dunlap 
> > > > > > Cc: Ian Jackson 
> > > > > > Cc: Jan Beulich 
> > > > > > Cc: Konrad Rzeszutek Wilk 
> > > > > > Cc: Stefano Stabellini 
> > > > > > Cc: Tim Deegan 
> > > > > > Cc: Wei Liu 
> > > > > > ---
> > > > > >   docs/misc/xen-command-line.markdown | 2 +-
> > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/docs/misc/xen-command-line.markdown 
> > > > > > b/docs/misc/xen-command-line.markdown
> > > > > > index 3f90c3b7a8..f524294aa6 100644
> > > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > > @@ -136,7 +136,7 @@ mode during S3 resume.
> > > > > >   ### allowsuperpage
> > > > > >   > `= `
> > > > > > -> Default: `true`
> > > > > > +> Default: `false`
> > > > > >   Permit Xen to use superpages when performing memory management.
> > > > > I'm not an expert on Xen MM code, but isn't this intended for PV
> > > > > guests? The description above makes it look like this is for Xen
> > > > > itself, but AFAICT from skimming over the code this seems to be a PV
> > > > > feature, in which case the text above should be fixed to prevent
> > > > > confusion.
> > > > I believe it is PV only, but I'm not 100% sure.
> > > > 
> > > > I would love to fix the text as well if possible.
> > > 
> > > I'm fairly sure this option applies exclusively to PV superpages. Double
> > > check the logic through the code, but I think (since dropping 32bit
> > > support), we have no configuration where Xen might not be able to use
> > > superpages.
> > > 
> > 
> > So we can just delete this option and make Xen always use superpage?
> > That would be fine by me, too.
> 
> Can we just nuke the code altogther?
> 
> Oracle is not using it anymore.

Sure! I was about to ask you about that.

I'm happy to submit patches to nuke it from both the hypervisor and toolstack.



Re: [Xen-devel] [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically

2017-07-21 Thread George Dunlap
On 06/23/2017 11:54 AM, Dario Faggioli wrote:
> Instead of keeping an NR_CPUS big array of csched2_runqueue_data
> elements, directly inside the csched2_private structure, allocate
> it dynamically.
> 
> This has two positive effects:
> - reduces the size of csched2_private considerably, which is
>   especially good in case there are more instances of Credit2
>   (in different cpupools), and is also good from the point
>   of view of fitting the struct into CPU caches;
> - we can use nr_cpu_ids as array size, which may be sensibly
>   smaller than NR_CPUS
> 
> Signed-off-by: Dario Faggioli 

Looks good, thanks:

Acked-by: George Dunlap 



> ---
> Cc: George Dunlap 
> Cc: Anshul Makkar 
> ---
>  xen/common/sched_credit2.c |   16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 126417c..10d9488 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -385,7 +385,7 @@ struct csched2_private {
>  
>  int runq_map[NR_CPUS];
>  cpumask_t active_queues; /* Queues which may have active cpus */
> -struct csched2_runqueue_data rqd[NR_CPUS];
> +struct csched2_runqueue_data *rqd;
>  
>  unsigned int load_precision_shift;
>  unsigned int load_window_shift;
> @@ -3099,9 +3099,11 @@ csched2_init(struct scheduler *ops)
>  printk(XENLOG_INFO "load tracking window length %llu ns\n",
> 1ULL << opt_load_window_shift);
>  
> -/* Basically no CPU information is available at this point; just
> +/*
> + * Basically no CPU information is available at this point; just
>   * set up basic structures, and a callback when the CPU info is
> - * available. */
> + * available.
> + */
>  
>  prv = xzalloc(struct csched2_private);
>  if ( prv == NULL )
> @@ -3111,7 +3113,13 @@ csched2_init(struct scheduler *ops)
>  rwlock_init(&prv->lock);
>  INIT_LIST_HEAD(&prv->sdom);
>  
> -/* But un-initialize all runqueues */
> +/* Allocate all runqueues and mark them as un-initialized */
> +prv->rqd = xzalloc_array(struct csched2_runqueue_data, nr_cpu_ids);
> +if ( !prv->rqd )
> +{
> +xfree(prv);
> +return -ENOMEM;
> +}
>  for ( i = 0; i < nr_cpu_ids; i++ )
>  {
>  prv->runq_map[i] = -1;
> 




Re: [Xen-devel] [PATCH] docs: fix superpage default value

2017-07-21 Thread Konrad Rzeszutek Wilk
On Thu, Jul 20, 2017 at 01:57:17PM +0100, Wei Liu wrote:
> On Thu, Jul 20, 2017 at 12:49:37PM +0100, Andrew Cooper wrote:
> > On 20/07/17 12:47, Wei Liu wrote:
> > > On Thu, Jul 20, 2017 at 12:45:38PM +0100, Roger Pau Monné wrote:
> > > > On Thu, Jul 20, 2017 at 12:35:56PM +0100, Wei Liu wrote:
> > > > > The code says it defaults to false.
> > > > > 
> > > > > Signed-off-by: Wei Liu 
> > > > > ---
> > > > > Cc: Andrew Cooper 
> > > > > Cc: George Dunlap 
> > > > > Cc: Ian Jackson 
> > > > > Cc: Jan Beulich 
> > > > > Cc: Konrad Rzeszutek Wilk 
> > > > > Cc: Stefano Stabellini 
> > > > > Cc: Tim Deegan 
> > > > > Cc: Wei Liu 
> > > > > ---
> > > > >   docs/misc/xen-command-line.markdown | 2 +-
> > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/docs/misc/xen-command-line.markdown 
> > > > > b/docs/misc/xen-command-line.markdown
> > > > > index 3f90c3b7a8..f524294aa6 100644
> > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > @@ -136,7 +136,7 @@ mode during S3 resume.
> > > > >   ### allowsuperpage
> > > > >   > `= `
> > > > > -> Default: `true`
> > > > > +> Default: `false`
> > > > >   Permit Xen to use superpages when performing memory management.
> > > > I'm not an expert on Xen MM code, but isn't this intended for PV
> > > > guests? The description above makes it look like this is for Xen
> > > > itself, but AFAICT from skimming over the code this seems to be a PV
> > > > feature, in which case the text above should be fixed to prevent
> > > > confusion.
> > > I believe it is PV only, but I'm not 100% sure.
> > > 
> > > I would love to fix the text as well if possible.
> > 
> > I'm fairly sure this option applies exclusively to PV superpages. Double
> > check the logic through the code, but I think (since dropping 32bit
> > support), we have no configuration where Xen might not be able to use
> > superpages.
> > 
> 
> So we can just delete this option and make Xen always use superpage?
> That would be fine by me, too.

Can we just nuke the code altogther?

Oracle is not using it anymore.



Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-21 Thread Andrew Cooper

On 21/07/17 11:43, Julien Grall wrote:



On 20/07/17 17:54, Wei Liu wrote:

On Thu, Jul 20, 2017 at 05:46:50PM +0100, Wei Liu wrote:

CC relevant maintainers

On Thu, Jul 20, 2017 at 05:20:43PM +0200, David Woodhouse wrote:

From: David Woodhouse 

This includes stuff lke the hypercall tables which we really want


lke -> like


to be read-only. And they were going into .data.read-mostly.

Signed-off-by: David Woodhouse 


Reviewed-by: Wei Liu 


Acked-by: Julien Grall 


Acked-by: Andrew Cooper 



Re: [Xen-devel] Regarding hdmi sharing in xen

2017-07-21 Thread Andrii Anisov

Dear George,

First, I would define the terms as follows:
* Sharing HW - using the same hardware from different domains through PV
drivers: one domain accesses the HW directly and serves the other
domains.
* Assigning HW - giving some particular domain access to some particular
piece of HW. E.g. peripherals are assigned to Dom0 by default, but using
passthrough some could be assigned to a DomU.


On 19.07.17 07:41, George John wrote:
Our plan is to run Linux as Dom0 and Android as DomU. The Linux
portion will have 1 HDMI display and the Android portion will have
1 HDMI.

Can we share the DU and use the HDMI port as it is in the guests?
IIRC, last year a setup was shown on a Salvator-X board where one
HDMI display was *assigned* to Linux (Dom0) and one HDMI display was
*assigned* to Android. So such a setup is technically feasible.
If a domain has a display assigned for its sole use, it can share that
display with other domains using the displif protocol [1].


[1] 
https://lists.xenproject.org/archives/html/xen-devel/2017-04/msg00470.html


--

*Andrii Anisov*





Re: [Xen-devel] [PATCH XTF v3] Implement pv_read_some

2017-07-21 Thread Andrew Cooper

On 21/07/17 08:01, Felix Schmoll wrote:



Much better.  Just one final question.  Do you intend this
function to block until data becomes available?  (because that
appears to be how it behaves.)


Yes. I could split it up into two functions if that bothers you. Or do 
you just want me to include that in the comment?


Just include it in the comment.

~Andrew
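
For readers following along, a hedged sketch of the kind of function
under discussion (a blocking read from the PV console input ring; the
ring layout and MASK_XENCONS_IDX() come from the public console ABI,
ACCESS_ONCE()/rmb()/smp_mb() are the usual Xen-style primitives, and
wait_for_console_event() is a placeholder for whatever the test
framework provides -- this is not the submitted patch):

    #include <xen/io/console.h>

    /*
     * Sketch only: read up to @len bytes from the console input ring,
     * blocking until at least one byte is available.
     */
    static size_t pv_read_some(struct xencons_interface *ring,
                               char *buf, size_t len)
    {
        XENCONS_RING_IDX cons, prod;
        size_t i = 0;

        /* Block until the backend has produced some data. */
        while ( (prod = ACCESS_ONCE(ring->in_prod)) == ring->in_cons )
            wait_for_console_event();          /* placeholder */

        rmb();              /* read the index before the contents */

        for ( cons = ring->in_cons; cons != prod && i < len; )
            buf[i++] = ring->in[MASK_XENCONS_IDX(cons++, ring->in)];

        smp_mb();           /* consume contents before moving cons */
        ring->in_cons = cons;

        return i;
    }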


Re: [Xen-devel] [PATCH XTF] Functional: Add a UMIP test

2017-07-21 Thread Andrew Cooper

On 21/07/17 02:42, Boqun Feng wrote:

On Thu, Jul 20, 2017 at 10:38:59AM +0100, Andrew Cooper wrote:

On 20/07/17 06:29, Boqun Feng (Intel) wrote:

Add a "umip" test for the User-Model Instruction Prevention. The test
simply tries to run sgdt/sidt/sldt/str/smsw in guest user-mode with
CR4_UMIP = 1.

Signed-off-by: Boqun Feng (Intel) 

Thankyou very much for providing a test.

As a general remark, how have you found XTF to use?


Great tool! Especially when you need to run Xen in a simulated
environment like simics and want to test something, bringing up even a
simple Linux domU would be a lot of pain. ;-) XTF, on the other hand,
works like a charm and makes it easy to write a test case, though
judging by your comments I'm not very good at it yet ;-)


I'm glad to hear this.




+void test_main(void)
+{
+unsigned long exp;
+unsigned long cr4 = read_cr4();

This is all good.  However, it is insufficient to properly test the UMIP
behaviour.  Please look at the cpuid-faulting to see how I structured
things.

In particular, you should:

1) Test the regular behaviour of the instructions.
2) Search for UMIP, skipping if it isn't available.
3) Enable UMIP.

Maybe I also need to provide a write_cr4_safe(), similar to wrmsr_safe(),
in case CPUID indicates UMIP is supported while the UMIP CR4 bit cannot
be set, which would indicate a bug?


Yes.  You are entirely correct.  Feel free to put write_cr4_safe() in 
lib.h along with the other *_safe() variants.
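
(A hedged sketch of such a helper, following the shape of the other
*_safe() variants; the exception-fixup plumbing (_ASM_EXTABLE_HANDLER,
ex_record_fault_eax, exinfo_t) is assumed to match what lib.h already
provides:)

    /* Sketch: 0 on success, or exception info if the write faulted. */
    static inline exinfo_t write_cr4_safe(unsigned long cr4)
    {
        exinfo_t fault = 0;

        asm volatile ("1: mov %[cr4], %%cr4; 2:"
                      _ASM_EXTABLE_HANDLER(1b, 2b, %P[rec])
                      : "+a" (fault)
                      : [cr4] "r" (cr4),
                        [rec] "p" (ex_record_fault_eax));

        return fault;
    }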





4) Test the instructions again, this time checking for #GP in userspace.
5) Disable UMIP.
6) Check again for regular behaviour.

This way, you also check that turning it off works as well as turning it on.

In addition, each test needs to check more than just the block of tests
below.

1) The tests should run the instructions natively, and forced through the
instruction emulator.  See the FPU Exception Emulation test which is along
the same lines.  One thing to be aware of though is that in older versions
of Xen, the s??? instructions weren't implemented in the instruction
emulator, so the test should tolerate and skip if it gets #UD back.


Roger that.


:)

Roger.

~Andrew
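
Putting the six steps above together, a hedged sketch of the resulting
test flow (run_tests() and cpu_has_umip are hypothetical; read_cr4(),
write_cr4() and the xtf_*() reporting helpers are the usual XTF ones;
X86_CR4_UMIP is the architectural bit, assumed defined; and
write_cr4_safe() is the helper discussed above):

    /* Sketch of the enable/check/disable flow; not the actual test. */
    void test_main(void)
    {
        unsigned long cr4 = read_cr4();

        run_tests(false);                 /* 1) regular behaviour */

        if ( !cpu_has_umip )              /* 2) skip if unavailable */
        {
            xtf_skip("Skip: UMIP unavailable\n");
            return;
        }

        if ( write_cr4_safe(cr4 | X86_CR4_UMIP) )   /* 3) enable */
        {
            xtf_failure("Fail: unable to set CR4.UMIP\n");
            return;
        }

        run_tests(true);                  /* 4) expect #GP from user mode */

        write_cr4(cr4);                   /* 5) disable */

        run_tests(false);                 /* 6) regular behaviour again */
    }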



[Xen-devel] [ovmf test] 112091: all pass - PUSHED

2017-07-21 Thread osstest service owner
flight 112091 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112091/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 1683ecec41a7c944783c51efa75375f1e0a71d08
baseline version:
 ovmf 79aac4dd756bb2809cdcb74f7d2ae8a630457c99

Last test of basis   112039  2017-07-20 06:18:11 Z1 days
Testing same since   112091  2017-07-21 10:17:54 Z0 days1 attempts


People who touched revisions under test:
  Star Zeng 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=1683ecec41a7c944783c51efa75375f1e0a71d08
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
1683ecec41a7c944783c51efa75375f1e0a71d08
+ branch=ovmf
+ revision=1683ecec41a7c944783c51efa75375f1e0a71d08
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.9-testing
+ '[' x1683ecec41a7c944783c51efa75375f1e0a71d08 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : 

[Xen-devel] [qemu-mainline test] 112072: regressions - FAIL

2017-07-21 Thread osstest service owner
flight 112072 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112072/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-xsm6 xen-buildfail REGR. vs. 111765
 build-i3866 xen-buildfail REGR. vs. 111765
 build-armhf-xsm   6 xen-buildfail REGR. vs. 111765
 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install  fail REGR. vs. 111765
 build-armhf   6 xen-buildfail REGR. vs. 111765

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win10-i386  1 build-check(1)  blocked n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 111765
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuu25d0233c1ac6cd14a15fcc834f1de3b179037b1d
baseline version:
 qemuu31fe1c414501047cbb91b695bdccc0068496dcf6

Last test of basis   111765  2017-07-13 10:20:16 Z8 days
Failing since111790  2017-07-14 04:20:46 Z7 days   10 attempts
Testing same since   112072  2017-07-21 00:49:48 Z0 days1 attempts


People who touched revisions under test:
  Alex Bennée 
  Alex Williamson 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alistair Francis 

Re: [Xen-devel] [PATCH] docs: fix superpage default value

2017-07-21 Thread Wei Liu
On Fri, Jul 21, 2017 at 05:21:26PM +0100, Andrew Cooper wrote:
> On 20/07/17 13:57, Wei Liu wrote:
> > On Thu, Jul 20, 2017 at 12:49:37PM +0100, Andrew Cooper wrote:
> > > On 20/07/17 12:47, Wei Liu wrote:
> > > > On Thu, Jul 20, 2017 at 12:45:38PM +0100, Roger Pau Monné wrote:
> > > > > On Thu, Jul 20, 2017 at 12:35:56PM +0100, Wei Liu wrote:
> > > > > > The code says it defaults to false.
> > > > > > 
> > > > > > Signed-off-by: Wei Liu 
> > > > > > ---
> > > > > > Cc: Andrew Cooper 
> > > > > > Cc: George Dunlap 
> > > > > > Cc: Ian Jackson 
> > > > > > Cc: Jan Beulich 
> > > > > > Cc: Konrad Rzeszutek Wilk 
> > > > > > Cc: Stefano Stabellini 
> > > > > > Cc: Tim Deegan 
> > > > > > Cc: Wei Liu 
> > > > > > ---
> > > > > >docs/misc/xen-command-line.markdown | 2 +-
> > > > > >1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/docs/misc/xen-command-line.markdown 
> > > > > > b/docs/misc/xen-command-line.markdown
> > > > > > index 3f90c3b7a8..f524294aa6 100644
> > > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > > @@ -136,7 +136,7 @@ mode during S3 resume.
> > > > > >### allowsuperpage
> > > > > >> `= `
> > > > > > -> Default: `true`
> > > > > > +> Default: `false`
> > > > > >Permit Xen to use superpages when performing memory management.
> > > > > I'm not an expert on Xen MM code, but isn't this intended for PV
> > > > > guests? The description above makes it look like this is for Xen
> > > > > itself, but AFAICT from skimming over the code this seems to be a PV
> > > > > feature, in which case the text above should be fixed to prevent
> > > > > confusion.
> > > > I believe it is PV only, but I'm not 100% sure.
> > > > 
> > > > I would love to fix the text as well if possible.
> > > I'm fairly sure this option applies exclusively to PV superpages. Double
> > > check the logic through the code, but I think (since dropping 32bit
> > > support), we have no configuration where Xen might not be able to use
> > > superpages.
> > > 
> > So we can just delete this option and make Xen always use superpage?
> > That would be fine by me, too.
> 
> No - my point was that this option now exclusively controls PV superpages,
> IIRC.
> 

OK. I misunderstood.

In that case. We can change the text to:

  Permit PV guests to use superpages.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 4/4] Xentrace: add support for HVM's PI blocking list operation

2017-07-21 Thread George Dunlap
On Fri, Jul 7, 2017 at 7:49 AM, Chao Gao  wrote:
> In order to analyze PI blocking list operation frequence and obtain
> the list length, add some relevant events to xentrace and some
> associated code in xenalyze. Event ASYNC_PI_LIST_DEL may happen in interrupt
> context, which incurs current assumptions checked in toplevel_assert_check()
> are not suitable any more. Thus, this patch extends the 
> toplevel_assert_check()
> to remove such assumptions for events of type ASYNC_PI_LIST_DEL.
>
> Signed-off-by: Chao Gao 

Hey Chao Gao,

Thanks for doing the work to add this tracing support to xentrace --
and in particular taking the effort to adapt the assert mechanism to
be able to handle asynchronous events.

I think in this case though, having a separate HVM sub-class for
asynchronous events isn't really the right approach.  The main purpose
of sub-classes is to help filter the events you want; and I can't
think of any time you'd want to trace PI_LIST_DEL and not PI_LIST_ADD
(or vice versa).  Secondly, the "asynchronous event" problem will be
an issue for other contexts as well, and the solution will be the
same.

I think a better solution would be to do something similar to
TRC_64_FLAG and TRC_HVM_IOMEM_[read,write], and claim another bit to
create a TRC_ASYNC_FLAG (0x400 probably).  Then we can filter the
"not_idle_domain" and "vcpu_data_mode" asserts on that.

What do you think?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] docs: fix superpage default value

2017-07-21 Thread Andrew Cooper

On 20/07/17 13:57, Wei Liu wrote:

On Thu, Jul 20, 2017 at 12:49:37PM +0100, Andrew Cooper wrote:

On 20/07/17 12:47, Wei Liu wrote:

On Thu, Jul 20, 2017 at 12:45:38PM +0100, Roger Pau Monné wrote:

On Thu, Jul 20, 2017 at 12:35:56PM +0100, Wei Liu wrote:

The code says it defaults to false.

Signed-off-by: Wei Liu 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
   docs/misc/xen-command-line.markdown | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 3f90c3b7a8..f524294aa6 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -136,7 +136,7 @@ mode during S3 resume.
   ### allowsuperpage
   > `= `
-> Default: `true`
+> Default: `false`
   Permit Xen to use superpages when performing memory management.

I'm not an expert on Xen MM code, but isn't this intended for PV
guests? The description above makes it look like this is for Xen
itself, but AFAICT from skimming over the code this seems to be a PV
feature, in which case the text above should be fixed to prevent
confusion.

I believe it is PV only, but I'm not 100% sure.

I would love to fix the text as well if possible.

I'm fairly sure this option applies exclusively to PV superpages. Double
check the logic through the code, but I think (since dropping 32bit
support), we have no configuration where Xen might not be able to use
superpages.


So we can just delete this option and make Xen always use superpage?
That would be fine by me, too.


No - my point was that this option now exclusively controls PV 
superpages, IIRC.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen/pvcalls: use WARN_ON(1) instead of __WARN()

2017-07-21 Thread Arnd Bergmann
__WARN() is an internal helper that is only available on
some architectures, but causes a build error e.g. on ARM64
in some configurations:

drivers/xen/pvcalls-back.c: In function 'set_backend_state':
drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function 
'__WARN' [-Werror=implicit-function-declaration]

Unfortunately, there is no warning equivalent of BUG() that takes no
arguments, but WARN_ON(1) is commonly used in other drivers
and works in all configurations.

Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling")
Signed-off-by: Arnd Bergmann 
---
 drivers/xen/pvcalls-back.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index d6c4c4aecb41..00c1a2344330 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -1094,7 +1094,7 @@ static void set_backend_state(struct xenbus_device *dev,
xenbus_switch_state(dev, XenbusStateClosing);
break;
default:
-   __WARN();
+   WARN_ON(1);
}
break;
case XenbusStateInitWait:
@@ -1109,7 +1109,7 @@ static void set_backend_state(struct xenbus_device *dev,
xenbus_switch_state(dev, XenbusStateClosing);
break;
default:
-   __WARN();
+   WARN_ON(1);
}
break;
case XenbusStateConnected:
@@ -1123,7 +1123,7 @@ static void set_backend_state(struct xenbus_device *dev,
xenbus_switch_state(dev, XenbusStateClosing);
break;
default:
-   __WARN();
+   WARN_ON(1);
}
break;
case XenbusStateClosing:
@@ -1134,11 +1134,11 @@ static void set_backend_state(struct xenbus_device *dev,
xenbus_switch_state(dev, XenbusStateClosed);
break;
default:
-   __WARN();
+   WARN_ON(1);
}
break;
default:
-   __WARN();
+   WARN_ON(1);
}
}
 }
-- 
2.9.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/4] VT-d PI: restrict the vcpu number on a given pcpu

2017-07-21 Thread George Dunlap
On Fri, Jul 7, 2017 at 7:48 AM, Chao Gao  wrote:
> Currently, a blocked vCPU is put in its pCPU's pi blocking list. If
> too many vCPUs are blocked on a given pCPU, it will incur that the list
> grows too long. After a simple analysis, there are 32k domains and
> 128 vcpu per domain, thus about 4M vCPUs may be blocked in one pCPU's
> PI blocking list. When a wakeup interrupt arrives, the list is
> traversed to find some specific vCPUs to wake them up. This traversal in
> that case would consume much time.
>
> To mitigate this issue, this patch limits the number of vCPUs tracked on a
> given pCPU's blocking list, taking factors such as perfomance of common case,
> current hvm vCPU count and current pCPU count into consideration. With this
> method, for the common case, it works fast and for some extreme cases, the
> list length is under control.
>
> With this patch, when a vcpu is to be blocked, we check whether the pi
> blocking list's length of the pcpu where the vcpu is running exceeds
> the limit which is the average vcpus per pcpu ratio plus a constant.
> If no, the vcpu is added to this pcpu's pi blocking list. Otherwise,
> another online pcpu is chosen to accept the vcpu.
>
> Signed-off-by: Chao Gao 
> ---
> v4:
>  - use a new lock to avoid adding a blocked vcpu to a offline pcpu's blocking
>  list.
>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 136 
> +
>  1 file changed, 114 insertions(+), 22 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index ecd6485..04e9aa6 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -95,22 +95,91 @@ static DEFINE_PER_CPU(struct vmx_pi_blocking_vcpu, 
> vmx_pi_blocking);
>  uint8_t __read_mostly posted_intr_vector;
>  static uint8_t __read_mostly pi_wakeup_vector;
>
> +/*
> + * Protect critical sections to avoid adding a blocked vcpu to a destroyed
> + * blocking list.
> + */
> +static DEFINE_SPINLOCK(remote_pbl_operation);
> +
> +#define remote_pbl_operation_begin(flags)   \
> +({  \
> +spin_lock_irqsave(&remote_pbl_operation, flags);\
> +})
> +
> +#define remote_pbl_operation_done(flags)\
> +({  \
> +spin_unlock_irqrestore(&remote_pbl_operation, flags);   \
> +})
> +
>  void vmx_pi_per_cpu_init(unsigned int cpu)
>  {
>  INIT_LIST_HEAD(&per_cpu(vmx_pi_blocking, cpu).list);
>  spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>  }
>
> +/*
> + * By default, the local pcpu (means the one the vcpu is currently running 
> on)
> + * is chosen as the destination of wakeup interrupt. But if the vcpu number 
> of
> + * the pcpu exceeds a limit, another pcpu is chosen until we find a suitable
> + * one.
> + *
> + * Currently, choose (v_tot/p_tot) + K as the limit of vcpu count, where
> + * v_tot is the total number of hvm vcpus on the system, p_tot is the total
> + * number of pcpus in the system, and K is a fixed number. An experment on a
> + * skylake server which has 112 cpus and 64G memory shows the maximum time to
> + * wakeup a vcpu from a 128-entry blocking list takes about 22us, which is
> + * tolerable. So choose 128 as the fixed number K.
> + *
> + * This policy makes sure:
> + * 1) for common cases, the limit won't be reached and the local pcpu is used
> + * which is beneficial to performance (at least, avoid an IPI when unblocking
> + * vcpu).
> + * 2) for the worst case, the blocking list length scales with the vcpu count
> + * divided by the pcpu count.
> + */
> +#define PI_LIST_FIXED_NUM 128
> +#define PI_LIST_LIMIT (atomic_read(&num_hvm_vcpus) / num_online_cpus() + \
> +   PI_LIST_FIXED_NUM)
> +static inline bool pi_over_limit(int cpu)
> +{
> +return per_cpu(vmx_pi_blocking, cpu).counter > PI_LIST_LIMIT;

Is there any reason to hide this calculation behind a #define, when
it's only used once anyway?

Also -- the vast majority of the time, .counter will be <
PI_LIST_FIXED_NUM; there's no reason to do an atomic read and an
integer division in that case.  I would do this:

if ( likely(per_cpu(vmx_pi_blocking, cpu).counter <= PI_LIST_FIXED_NUM) )
  return 0;

return per_cpu(vmx_pi_blocking, cpu).counter > PI_LIST_FIXED_NUM +
    (atomic_read(&num_hvm_vcpus) / num_online_cpus());

Also, I personally think it would make the code more readable to say,
"pi_under_limit()" instead; that way...

> +}
> +
>  static void vmx_vcpu_block(struct vcpu *v)
>  {
> -unsigned long flags;
> -unsigned int dest;
> +unsigned long flags[2];
> +unsigned int dest, pi_cpu;
>  spinlock_t *old_lock;
> -spinlock_t *pi_blocking_list_lock =
> -   &per_cpu(vmx_pi_blocking, v->processor).lock;
>  struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> +spinlock_t *pi_blocking_list_lock;
> +bool in_remote_operation = false;
> +
> +pi_cpu 

[Xen-devel] Notes from Design Session: Solving Community Problems: Patch Volume vs Review Bandwidth, Community Meetings ... and other problems

2017-07-21 Thread Lars Kurth
Hi all,
please find attached my notes. 
Lars

Session URL: http://sched.co/AjB3

ACTIONS on Lars, Andy and Juergen
ACTIONS on Stefano and Julien

Community Call
==
This was a discussion about whether we should do more community calls 
in critical areas. The background was whether we should have an x86 call 
to mirror the ARM call.

Jan and Andy asked whether the ARM calls are useful

Julien: 
They are very useful. On average about 10 people attend.
On ARM we don't yet have a real plan of what's needed for the future.
We are hoping to use the call to establish a firmer plan.

Lars:
Was asking whether we always have an agenda at the beginning.

Julien:
Sometimes, but often the agenda is established/refined in the first
5 minutes of the call. Typically Julien or Stefano
handle this

Lars asks whether we need one for tools
Ian: there is currently not much of a need for technical coordination

Lars: it feels that a call on x86 would be helpful.
But we can only cover non-NDA information, as with the other calls

Jan and Andy agree that they are happy to try this, but are concerned
that it may fizzle out. Also neither want to own agenda and note-taking
(notes and call info are posted on xen-devel@)

ACTION: Lars to work with Intel on setting this up
(note, I was asked by Susie Li to include John Ji and Chao Peng on this
thread and discuss with them at a separate call)

Timing-wise, a call from 9-10 UK time once a month should work. 

Example of ARM call minutes:
* http://markmail.org/message/myjllcngy3lqveji
* http://markmail.org/message/d4kuqxxhj6dfnf23
* There also ought to be a reminder of call details (someone to
  highlight an example)

Contributions vs. Review Bandwidth 
==

A potential bottleneck issue was raised in the area of ARM and x86
 
ARM
---
Lars asks what issue have been observed

Julien: 
Lots of new features and lots of design discussion

Stefano: 
Design discussions are creating trouble: sometimes we have complex 
proposals without a clear answer on the right way forward.

Complicated design 
=> 2/3 options 
=> not clear which way is the best forward 
=> ARM maintainers can provide advice, can say what is going to work

Right now ARM maintainers expect the contributor to lead and
drive it (e.g. an example where we got stuck was big.LITTLE support)

A pattern we have seen is:
- Complex problem
- Not an obviously clear answer
- Gets stuck
- Design discussion fizzles out without an artefact in the codebase
  (in other words, there is an unfinished mail thread)

Lars:
Asks whether maybe the issue is one of insufficient confidence on the
contributor's side to move the discussion forward, or whether expectations
were not communicated clearly (e.g. tell contributors to pick a
solution and move forward).

Stefano and Julien: 
Agree that this may indeed be the case

It is unusual to be in a technical leadership position when it comes
to driving designs and new solutions, but not from a process perspective.
Contributors need to be reminded of that.

It is also possible that embedded vendors may want to contribute,
but have only a small time window to do this.

Agreements:
* Create a couple of boilerplate mails or checklists to set 
  expectations better

ACTION: on ARM maintainers to trial

* Agreed to allow draft design into the git tree, as long as 
  interface status (Draft and unresolved issues) are clearly 
  documented. In that case, contributors can show progress
  and others - even if a design is not finished - can build on
  it. Feature docs already allow for that and so do Design
  Docs (although there is no example).

ACTION: on ARM maintainers to trial and pick a suitable
location in tree.

x86
---

Lars prompts Jan, Andy on some of the challenges

Jan, Andy:
A Typically series are large and fully formed
  (e.g. a 30-patch series)
B Often we don't have enough context to understand the design behind the code
  This has improved through Hackathons, meetings under NDA, ...
C In the past, series have existed for 2 years in private
  (e.g. SGX was developed against 4.6) and are posted against a newer version.
  At that point, some assumptions may have changed: e.g. on 5-level-paging
  we agreed at the summit that PV support is not needed (only HVM and PVH)
D There is not normally a lack of driving and managing the submission of an issue

Roger: 
feels that when he is reviewing x86 stuff it does not actually take work off 
Jan or Andrew, as sometimes one of them will pick it up and re-review it. That 
sometimes puts him off.

Jan: that is a risk to take and shouldn't put you off. 

Wei: says that when his responsibility on a patch is not clear, he says 
"subject to the agreement of XXX". That sets expectations with other 
maintainers and contributors.

Then we went a little bit onto reasons behind bandwidth issues

Jan: large series are often hard to understand and consume. Also, sometimes
there is a lack of understanding that there 

Re: [Xen-devel] [PATCH v4 1/4] VT-d PI: track the vcpu number on pi blocking list

2017-07-21 Thread George Dunlap
On Fri, Jul 7, 2017 at 7:48 AM, Chao Gao  wrote:
> This patch adds a field, counter, in struct vmx_pi_blocking_vcpu to track
> how many entries are on the pi blocking list.
>
> Signed-off-by: Chao Gao 

Minor nit:  The grammar in the title isn't quite right; "vcpu number"
would be "the number identifying a particular vcpu", not "the number
of vcpus".  It should be, "VT-d PI: Track the number of vcpus on pi
blocking list".

With that:

Reviewed-by: George Dunlap 

> ---
> v4:
>  - non-trace part of Patch 1 in v3
>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 69ce3aa..ecd6485 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -83,6 +83,7 @@ static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>  struct vmx_pi_blocking_vcpu {
>  struct list_head list;
>  spinlock_t   lock;
> +unsigned int counter;
>  };
>
>  /*
> @@ -120,6 +121,7 @@ static void vmx_vcpu_block(struct vcpu *v)
>   */
>  ASSERT(old_lock == NULL);
>
> +per_cpu(vmx_pi_blocking, v->processor).counter++;
>  list_add_tail(&v->arch.hvm_vmx.pi_blocking.list,
>   &per_cpu(vmx_pi_blocking, v->processor).list);
>  spin_unlock_irqrestore(pi_blocking_list_lock, flags);
> @@ -187,6 +189,8 @@ static void vmx_pi_unblock_vcpu(struct vcpu *v)
>  {
>  ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
>  list_del(&v->arch.hvm_vmx.pi_blocking.list);
> +container_of(pi_blocking_list_lock,
> + struct vmx_pi_blocking_vcpu, lock)->counter--;
>  v->arch.hvm_vmx.pi_blocking.lock = NULL;
>  }
>
> @@ -235,6 +239,7 @@ void vmx_pi_desc_fixup(unsigned int cpu)
>  if ( pi_test_on(&vmx->pi_desc) )
>  {
>  list_del(>pi_blocking.list);
> +per_cpu(vmx_pi_blocking, cpu).counter--;
>  vmx->pi_blocking.lock = NULL;
>  vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
>  }
> @@ -259,6 +264,8 @@ void vmx_pi_desc_fixup(unsigned int cpu)
>
>  list_move(&vmx->pi_blocking.list,
>   &per_cpu(vmx_pi_blocking, new_cpu).list);
> +per_cpu(vmx_pi_blocking, cpu).counter--;
> +per_cpu(vmx_pi_blocking, new_cpu).counter++;
>  vmx->pi_blocking.lock = new_lock;
>
>  spin_unlock(new_lock);
> @@ -2358,9 +2365,9 @@ static struct hvm_function_table __initdata 
> vmx_function_table = {
>  static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
>  {
>  struct arch_vmx_struct *vmx, *tmp;
> -spinlock_t *lock = &per_cpu(vmx_pi_blocking, smp_processor_id()).lock;
> -struct list_head *blocked_vcpus =
> -   &per_cpu(vmx_pi_blocking, smp_processor_id()).list;
> +unsigned int cpu = smp_processor_id();
> +spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> +struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
>
>  ack_APIC_irq();
>  this_cpu(irq_count)++;
> @@ -2377,6 +2384,7 @@ static void pi_wakeup_interrupt(struct cpu_user_regs 
> *regs)
>  if ( pi_test_on(&vmx->pi_desc) )
>  {
>  list_del(>pi_blocking.list);
> +per_cpu(vmx_pi_blocking, cpu).counter--;
>  ASSERT(vmx->pi_blocking.lock == lock);
>  vmx->pi_blocking.lock = NULL;
>  vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
> --
> 1.8.3.1
>
>
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 112065: regressions - FAIL

2017-07-21 Thread osstest service owner
flight 112065 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112065/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
112004

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 112004

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop   fail blocked in 112004
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112004
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112004
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 112004
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112004
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112004
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112004
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass

version targeted for testing:
 xen  64c3fce24585740a43eb0d589de6e329ca454502
baseline version:
 xen  d535d8922f571502252deaf607e82e7475cd1728

Last test of basis   112004  2017-07-19 06:51:03 Z2 days
Failing since112033  2017-07-20 02:24:27 Z1 days2 attempts
Testing same since   112065  2017-07-20 19:20:15 Z0 days1 attempts


People who touched revisions under test:
  

Re: [Xen-devel] [xen-unstable test] 112033: regressions - trouble: broken/fail/pass

2017-07-21 Thread Julien Grall
Hi,

On 20/07/17 20:01, osstest service owner wrote:
> flight 112033 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/112033/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-xl-qemuu-ovmf-amd64  4 host-install(4) broken REGR. vs. 
> 112004
>  test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail REGR. vs. 
> 112004

I have looked at the failure for this test. It is happening on one of the
cubietrucks and seems to reproduce fairly reliably ([1]).

It is failing when creating the 6th domain. Looking at the guest console
logs, I only see logs for 5 domains. Nothing for the 6th.

The guest seems to have received a prefetch abort (see trace below), probably
after a data abort. I am not sure I understand why, and the stack trace seems
awfully blank.

I've looked at other available logs with similar failures. All end up with a
prefetch abort, although not necessarily after a data abort.

Ian, I am wondering if I could borrow one of the cubietruck on Monday to try and
reproduce the bug?

Cheers,

Jul 20 06:44:03.543038 (XEN) [ Xen-4.10-unstable  arm32  debug=y   Not 
tainted ]
Jul 20 06:44:03.548785 (XEN) CPU:0
Jul 20 06:44:03.550283 (XEN) PC: 000c
Jul 20 06:44:03.552407 (XEN) CPSR:   61d7 MODE:32-bit Guest ABT
Jul 20 06:44:03.556288 (XEN)  R0: dcffe000 R1: 5c00065f R2:  R3: 
c031c4a8
Jul 20 06:44:03.561910 (XEN)  R4: dc00 R5:  R6: c0f4d264 R7: 
dc001000
Jul 20 06:44:03.567413 (XEN)  R8: dcffe000 R9: 0005c000 R10:dc20 
R11:c0f4d000 R12:
Jul 20 06:44:03.574161 (XEN) USR: SP:  LR: 
Jul 20 06:44:03.577411 (XEN) SVC: SP: c1201e60 LR: c1007d68 SPSR:41d3
Jul 20 06:44:03.581903 (XEN) ABT: SP: c1318acc LR: 0010 SPSR:61d7
Jul 20 06:44:03.586403 (XEN) UND: SP: c1318ad8 LR: c1318ad8 SPSR:
Jul 20 06:44:03.590909 (XEN) IRQ: SP: c1318ac0 LR: c1318ac0 SPSR:
Jul 20 06:44:03.595404 (XEN) FIQ: SP: c1318ae4 LR: c1318ae4 SPSR:
Jul 20 06:44:03.599909 (XEN) FIQ: R8:  R9:  R10: 
R11: R12:
Jul 20 06:44:03.606657 (XEN) 
Jul 20 06:44:03.607279 (XEN)  SCTLR: 10c5387d
Jul 20 06:44:03.609775 (XEN)TCR: 
Jul 20 06:44:03.612153 (XEN)  TTBR0: 4020406a
Jul 20 06:44:03.615282 (XEN)  TTBR1: 4020406a
Jul 20 06:44:03.618404 (XEN)   IFAR: 000c, IFSR: 0007
Jul 20 06:44:03.622166 (XEN)   DFAR: dcffe000, DFSR: 0805
Jul 20 06:44:03.626073 (XEN) 
Jul 20 06:44:03.626683 (XEN)   VTCR_EL2: 80003558
Jul 20 06:44:03.629208 (XEN)  VTTBR_EL2: 0002bff24000
Jul 20 06:44:03.632310 (XEN) 
Jul 20 06:44:03.632931 (XEN)  SCTLR_EL2: 30cd187f
Jul 20 06:44:03.635432 (XEN)HCR_EL2: 0038663f
Jul 20 06:44:03.638549 (XEN)  TTBR0_EL2: bff12000
Jul 20 06:44:03.641663 (XEN) 
Jul 20 06:44:03.642421 (XEN)ESR_EL2: 07e0
Jul 20 06:44:03.644790 (XEN)  HPFAR_EL2: 0001c810
Jul 20 06:44:03.647919 (XEN)  HDFAR: e0800f00
Jul 20 06:44:03.650295 (XEN)  HIFAR: 5cf18882
Jul 20 06:44:03.652665 (XEN) 
Jul 20 06:44:03.653526 (XEN) Guest stack trace from sp=c1318acc:
Jul 20 06:44:03.657187 (XEN)     
  
Jul 20 06:44:03.664298 (XEN)     
  
Jul 20 06:44:03.671297 (XEN)     
  
Jul 20 06:44:03.678425 (XEN)     
  
Jul 20 06:44:03.685539 (XEN)     
  
Jul 20 06:44:03.692645 (XEN)     
  
Jul 20 06:44:03.699805 (XEN)     
  
Jul 20 06:44:03.706937 (XEN)     
  
Jul 20 06:44:03.714064 (XEN)     
  
Jul 20 06:44:03.721049 (XEN)     
  
Jul 20 06:44:03.728168 (XEN)     
  
Jul 20 06:44:03.735291 (XEN)     
  
Jul 20 06:44:03.742417 (XEN)     
  
Jul 20 06:44:03.749529 (XEN)     
  
Jul 20 06:44:03.756662 (XEN)     
  
Jul 20 06:44:03.763787 (XEN)     
  
Jul 20 06:44:03.770791 (XEN)     
  
Jul 20 

Re: [Xen-devel] [Bug] Intel RMRR support with upstream Qemu

2017-07-21 Thread Alexey G
> On Fri, 21 Jul 2017 10:57:55 +
> "Zhang, Xiong Y"  wrote:
> 
> > On an intel skylake machine with upstream qemu, if I add
> > "rdm=strategy=host, policy=strict" to hvm.cfg, win 8.1 DomU couldn't
> > boot up and continues reboot.
> > 
> > Steps to reproduce this issue:
> > 
> > 1)   Boot xen with iommu=1 to enable iommu
> > 2)   hvm.cfg contain:
> > 
> > builder="hvm"
> > 
> > memory=
> > 
> > disk=['win8.1 img']
> > 
> > device_model_override='qemu-system-i386'
> > 
> > device_model_version='qemu-xen'
> > 
> > rdm="strategy=host,policy=strict"
> > 
> > 3)   xl cr hvm.cfg
> > 
> > Conditions to reproduce this issue:
> > 
> > 1)   DomU memory size > the top address of RMRR. Otherwise, this
> > issue will disappear.
> > 2)   rdm=" strategy=host,policy=strict" should exist
> > 3)   Windows DomU.  Linux DomU doesn't have such issue.
> > 4)   Upstream qemu.  Traditional qemu doesn't have such issue.
> > 
> > In this situation, hvmloader will relocate some guest ram below RMRR to
> > high memory, and it seems window guest access an invalid address. Could
> > someone give me some suggestions on how to debug this ?  
> 
> You're likely have RMRR range(s) below 2GB boundary.
> 
> You may try the following:
> 
> 1. Specify some large 'mmio_hole' value in your domain configuration file,
> ex. mmio_hole=2560
> 2. If it won't help, 'xl dmesg' output might come useful
> 
> Right now upstream QEMU still doesn't support relocation of parts
> of guest RAM to >4GB boundary if they were overlapped by MMIO ranges.
> AFAIR forcing allow_memory_relocate to 1 for hvmloader didn't bring
> anything good for HVM guest.
> 
> Setting the mmio_hole size manually allows to create a "predefined"
> memory/MMIO hole layout for both QEMU (via 'max-ram-below-4g') and
> hvmloader (via a XenStore param), effectively avoiding MMIO/RMRR overlaps
> or RAM relocation in hvmloader, so this might help.

Wrote too soon: "policy=strict" means that you won't be able to create a
DomU if the RMRR was below 2G... so it actually should be above 2GB. Anyway,
try setting the mmio_hole size.
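
For example, the domain config from the report would gain one line
(illustrative; the memory size is a placeholder and mmio_hole is in MiB):

builder="hvm"
memory=4096
disk=['win8.1 img']
device_model_override='qemu-system-i386'
device_model_version='qemu-xen'
rdm="strategy=host,policy=strict"
mmio_hole=2560    # size of the MMIO hole below 4GiB, in MiB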

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PULL for-2.10 6/7] xen/mapcache: introduce xen_replace_cache_entry()

2017-07-21 Thread Anthony PERARD
On Tue, Jul 18, 2017 at 03:22:41PM -0700, Stefano Stabellini wrote:
> From: Igor Druzhinin 

...

> +static uint8_t *xen_replace_cache_entry_unlocked(hwaddr old_phys_addr,
> + hwaddr new_phys_addr,
> + hwaddr size)
> +{
> +MapCacheEntry *entry;
> +hwaddr address_index, address_offset;
> +hwaddr test_bit_size, cache_size = size;
> +
> +address_index  = old_phys_addr >> MCACHE_BUCKET_SHIFT;
> +address_offset = old_phys_addr & (MCACHE_BUCKET_SIZE - 1);
> +
> +assert(size);
> +/* test_bit_size is always a multiple of XC_PAGE_SIZE */
> +test_bit_size = size + (old_phys_addr & (XC_PAGE_SIZE - 1));
> +if (test_bit_size % XC_PAGE_SIZE) {
> +test_bit_size += XC_PAGE_SIZE - (test_bit_size % XC_PAGE_SIZE);
> +}
> +cache_size = size + address_offset;
> +if (cache_size % MCACHE_BUCKET_SIZE) {
> +cache_size += MCACHE_BUCKET_SIZE - (cache_size % MCACHE_BUCKET_SIZE);
> +}
> +
> +entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> +while (entry && !(entry->paddr_index == address_index &&
> +  entry->size == cache_size)) {
> +entry = entry->next;
> +}
> +if (!entry) {
> +DPRINTF("Trying to update an entry for %lx " \
> +"that is not in the mapcache!\n", old_phys_addr);
> +return NULL;
> +}
> +
> +address_index  = new_phys_addr >> MCACHE_BUCKET_SHIFT;
> +address_offset = new_phys_addr & (MCACHE_BUCKET_SIZE - 1);
> +
> +fprintf(stderr, "Replacing a dummy mapcache entry for %lx with %lx\n",
> +old_phys_addr, new_phys_addr);

Looks like this does not build on 32-bit.
in: 
http://logs.test-lab.xenproject.org/osstest/logs/112041/build-i386/6.ts-xen-build.log

/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:
 In function 'xen_replace_cache_entry_unlocked':
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
 error: format '%lx' expects argument of type 'long unsigned int', but argument 
3 has type 'hwaddr' [-Werror=format=]
 old_phys_addr, new_phys_addr);
 ^
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/hw/i386/xen/xen-mapcache.c:539:13:
 error: format '%lx' expects argument of type 'long unsigned int', but argument 
4 has type 'hwaddr' [-Werror=format=]
cc1: all warnings being treated as errors
  CC  i386-softmmu/target/i386/gdbstub.o
/home/osstest/build.112041.build-i386/xen/tools/qemu-xen-dir/rules.mak:66: 
recipe for target 'hw/i386/xen/xen-mapcache.o' failed
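
The usual fix is to print hwaddr, which is always 64-bit in QEMU, with the
HWADDR_PRIx macro instead of %lx, e.g.:

    fprintf(stderr, "Replacing a dummy mapcache entry for %"HWADDR_PRIx
            " with %"HWADDR_PRIx"\n", old_phys_addr, new_phys_addr);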

> +
> +xen_remap_bucket(entry, entry->vaddr_base,
> + cache_size, address_index, false);
> +if (!test_bits(address_offset >> XC_PAGE_SHIFT,
> +test_bit_size >> XC_PAGE_SHIFT,
> +entry->valid_mapping)) {
> +DPRINTF("Unable to update a mapcache entry for %lx!\n", 
> old_phys_addr);
> +return NULL;
> +}
> +
> +return entry->vaddr_base + address_offset;
> +}
> +

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [Bug] Intel RMRR support with upstream Qemu

2017-07-21 Thread Alexey G
Hi,

On Fri, 21 Jul 2017 10:57:55 +
"Zhang, Xiong Y"  wrote:

> On an intel skylake machine with upstream qemu, if I add
> "rdm=strategy=host, policy=strict" to hvm.cfg, win 8.1 DomU couldn't boot
> up and continues reboot.
> 
> Steps to reproduce this issue:
> 
> 1)   Boot xen with iommu=1 to enable iommu
> 2)   hvm.cfg contain:
> 
> builder="hvm"
> 
> memory=
> 
> disk=['win8.1 img']
> 
> device_model_override='qemu-system-i386'
> 
> device_model_version='qemu-xen'
> 
> rdm="strategy=host,policy=strict"
> 
> 3)   xl cr hvm.cfg
> 
> Conditions to reproduce this issue:
> 
> 1)   DomU memory size > the top address of RMRR. Otherwise, this
> issue will disappear.
> 2)   rdm=" strategy=host,policy=strict" should exist
> 3)   Windows DomU.  Linux DomU doesn't have such issue.
> 4)   Upstream qemu.  Traditional qemu doesn't have such issue.
> 
> In this situation, hvmloader will relocate some guest ram below RMRR to
> high memory, and it seems window guest access an invalid address. Could
> someone give me some suggestions on how to debug this ?

You likely have RMRR range(s) below the 2GB boundary.

You may try the following:

1. Specify some large 'mmio_hole' value in your domain configuration file,
ex. mmio_hole=2560
2. If it won't help, 'xl dmesg' output might come useful

Right now upstream QEMU still doesn't support relocation of parts
of guest RAM to >4GB boundary if they were overlapped by MMIO ranges.
AFAIR forcing allow_memory_relocate to 1 for hvmloader didn't bring anything
good for HVM guest.

Setting the mmio_hole size manually allows to create a "predefined"
memory/MMIO hole layout for both QEMU (via 'max-ram-below-4g') and
hvmloader (via a XenStore param), effectively avoiding MMIO/RMRR overlaps
or RAM relocation in hvmloader, so this might help.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Notes from Design Summit Hypervisor Fuzzing Session

2017-07-21 Thread Lars Kurth
Hi all,
please find attached my notes. A lot of it went over my head, so I may have 
gotten things wrong and some are missing
Feel free to modify, chip in, clarify, as needed
Lars

Session URL: http://sched.co/AjHN

OPTION 1: Userspace Approach


 Dom0  Domu
[AFL] [VM nested with Xen and XTF]
[Xen ]

Would need 
1. nested HVM support
2. VM forking

Not an option as too hard

OPTION 2:
=

 Dom0DomU
[AFL   ][VM XTF   ] 
[  ] <> [  [e]] e = executor
   /\  ||
   ||  \/
[Xen  ]

This approach would need

1. Tracing (instrument binary and write to shared memory for AFL)

Almost done, but not completely deterministic yet
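
For context, the coverage signal AFL consumes is its documented shared-bitmap
edge counter; roughly (names are illustrative):

    extern uint8_t *afl_area;   /* 64KB bitmap shared with AFL */
    static uint16_t prev_loc;

    static inline void afl_trace_edge(uint16_t cur_loc)
    {
        /* Bump the counter for the (prev, cur) edge. */
        afl_area[cur_loc ^ prev_loc]++;
        /* Shift so A->B and B->A hit different bitmap entries. */
        prev_loc = cur_loc >> 1;
    }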

2. Implemented a special hypercall that returns return code that can be 
converted into expected AFL output for branching info

Submitted

3. Communication channel between AFL and XTF

Almost done

4. Using XTF because it should be the fastest option and allows us to restrict 
the scope of what to fuzz

Key challenge: not making unnecessary non-deterministic hypercalls in the 
background
Use of XTF constrains the degrees of freedom and focusses the fuzzing

5. Need some way to feed info back into AFL

I believe there was some discussion around this, which I did not get

Discussion
==

Dismissed Option 1. All agreed that Option 2 is best.

I missed quite a bit of this, because the discussion was quite fast at times

George: 
recommends testing one thing at a time to reduce the problem space
Such as iteration, feedback, ...  
Based on the outcome, iterate

There was a little bit of discussion around determinism:

Andy: blacklist SCHEDOP_??? with ??? = shutdown, suspend, watchdog, ... 
Possibly there are some more functions that need to be blacklisted
This should help with determinism

Andy: Going to have problems such as dealing with partial hypercall operations
Wei: Already included this - only 1 thread in XTF => deterministic
Andy: What happens if the HV gets interrupted
Juergen: put XTF into null scheduler pool to minimise risk of interrupts and 
increase determinism
Wei: That would exclude IRQs in such a scenario

There was a little bit of around feedback loop and protocol between AFL and XTF

Andy: easiest way to get a feedback loop starting: XTF to boot, wait on event 
channel (a SCHEDOP call with a 0 timeout)
AFL does the hypercall with edge tracing, ...

Juergen: starting measurement can be initiated by AFL (Dom0), and disabled 
from XTF (DomU)
Wei: follow the same pattern as xl already does (I don't know the sample code 
though)

There was a bit of discussion on the impact of QEMU

Wei: can't use QEMU to emulate a machine with vhdx (following on from a 
question by Ian)

Ian: this will be fast, not quite so reliable. But a good first step

And some other topics

Andy: there is also syzkaller, with the fuzzing entity being userspace calls
Wei: used as reference material, as Oracle did something similar
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen:Kconfig: Make SCIF built by default for ARM

2017-07-21 Thread Julien Grall

Hi Andrii,

Please CC the relevant maintainers when sending a patch (or questions 
regarding a specific subsystems) on the ML.


On 18/07/17 17:45, Andrii Anisov wrote:

From: Andrii Anisov 

Both Renesas R-Car Gen2(ARM32) and Gen3(ARM64) are utilizing SCIF IP,
so make its serial driver built by default for ARM.

Signed-off-by: Andrii Anisov 


Acked-by: Julien Grall 


---
 xen/drivers/char/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/char/Kconfig b/xen/drivers/char/Kconfig
index 51343d0..fb53dd8 100644
--- a/xen/drivers/char/Kconfig
+++ b/xen/drivers/char/Kconfig
@@ -39,10 +39,10 @@ config HAS_OMAP
 config HAS_SCIF
bool
default y
-   depends on ARM_32
+   depends on ARM
help
  This selects the SuperH SCI(F) UART. If you have a SuperH based board,
- say Y.
+ or Renesas R-Car Gen 2/3 based board say Y.

 config HAS_EHCI
bool



Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-devel][xen/Arm]xen fail to boot on omap5 board

2017-07-21 Thread Julien Grall



On 18/07/17 10:50, Andrii Anisov wrote:

Dear Shishir,


On 18.07.17 12:05, shishir tiwari wrote:

Hi

I want test and understand xen hypervisor implementation with dom0 and
domU on omap5 board.

I followed
https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions/OMAP5432_uEVM

with latest kernel(4.11.7) and xen(4.9.0) and device tree and but
unable to boot dom0.

xen stop on "Turning on pages".


I guess you mean "- Turning on paging -"


Please drop the whole log.


This is very early boot in head.S so having the full log will not really 
help here...


What is more interesting is where the different modules have been loaded 
in memory:

- Device Tree
- Kernel
- Xen
- Initramfs (if any)




please tell what version on Xen and kernel is tested on omap5 board.

IIRC it was Xen 4.5 and LK 3.18. Old and outdated stuff. The same as the
OMAP5, which was discontinued maybe three years ago.


Even though OMAP5 is not sold anymore, we should still be able to boot 
Xen 4.9 on it. If it is not the case, then there is a bug in the code.




BTW, I'm really surprised you have an OMAP5 based board. Which one do you
actually have?



Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-21 Thread Julien Grall



On 21/07/17 12:10, Vijay Kilari wrote:

Hi Julien,

On Thu, Jul 20, 2017 at 4:56 PM, Julien Grall  wrote:



On 19/07/17 19:39, Julien Grall wrote:


 cell = (const __be32 *)prop->data;
 banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));

-for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS;
i++ )
+for ( i = 0; i < banks; i++ )
 {
device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
 if ( !size )
 continue;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
-bootinfo.mem.nr_banks++;
+if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks <
NR_MEM_BANKS )
+{
+bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
+bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
+bootinfo.mem.nr_banks++;
+}



This change should be split.



I thought a bit more about this code during the week. I think it would be
nicer to write:

#ifdef CONFIG_NUMA
dt_numa_process_memory_node(nid, start, size);
#endif

if ( !efi_enabled(EFI_BOOT) )
  continue;


Should be if ( efi_enabled(EFI_BOOT) ) ?


if ( bootinfo.mem.nr_banks < NR_MEM_BANKS )


Should be if ( bootinfo.mem.nr_banks >= NR_MEM_BANKS ) ?


Yes for both. I wrote this e-mail too quickly.
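
For clarity, the loop body with the corrected conditions folded in would be
(a sketch; it assumes the dt_numa_process_memory_node() stub for !CONFIG_NUMA
suggested earlier in the thread):

    for ( i = 0; i < banks; i++ )
    {
        device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
        if ( !size )
            continue;

        dt_numa_process_memory_node(nid, start, size);

        /* On EFI boot the memory banks come from the EFI memory map instead. */
        if ( efi_enabled(EFI_BOOT) )
            continue;

        if ( bootinfo.mem.nr_banks >= NR_MEM_BANKS )
            break;

        bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
        bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
        bootinfo.mem.nr_banks++;
    }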

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-21 Thread Vijay Kilari
Hi Julien,

On Thu, Jul 20, 2017 at 4:56 PM, Julien Grall  wrote:
>
>
> On 19/07/17 19:39, Julien Grall wrote:
>>>
>>>  cell = (const __be32 *)prop->data;
>>>  banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
>>>
>>> -for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS;
>>> i++ )
>>> +for ( i = 0; i < banks; i++ )
>>>  {
>>>  device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
>>>  if ( !size )
>>>  continue;
>>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>>> -bootinfo.mem.nr_banks++;
>>> +if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks <
>>> NR_MEM_BANKS )
>>> +{
>>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>>> +bootinfo.mem.nr_banks++;
>>> +}
>>
>>
>> This change should be split.
>
>
> I thought a bit more about this code during the week. I think it would be
> nicer to write:
>
> #ifdef CONFIG_NUMA
> dt_numa_process_memory_node(nid, start, size);
> #endif
>
> if ( !efi_enabled(EFI_BOOT) )
>   continue;

Should be if ( efi_enabled(EFI_BOOT) ) ?
>
> if ( bootinfo.mem.nr_banks < NR_MEM_BANKS )

Should be if ( bootinfo.mem.nr_banks >= NR_MEM_BANKS ) ?

>   break;
>
> bootinfo.mem.bank[];
> 
>
> Also, you may want to add a stub for dt_numa_process_memory_node rather than
> #ifdef in the code.
>
> Cheers,
>
> --
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [Bug] Intel RMRR support with upstream Qemu

2017-07-21 Thread Zhang, Xiong Y
On an Intel Skylake machine with upstream qemu, if I add "rdm=strategy=host, 
policy=strict" to hvm.cfg, a win 8.1 DomU can't boot up and continuously reboots.

Steps to reproduce this issue:

1)   Boot xen with iommu=1 to enable iommu

2)   hvm.cfg contain:

builder="hvm"

memory=

disk=['win8.1 img']

device_model_override='qemu-system-i386'

device_model_version='qemu-xen'

rdm="strategy=host,policy=strict"

3)   xl cr hvm.cfg

Conditions to reproduce this issue:

1)   DomU memory size > the top address of RMRR. Otherwise, this issue will 
disappear.

2)   rdm=" strategy=host,policy=strict" should exist

3)   Windows DomU.  Linux DomU doesn't have such issue.

4)   Upstream qemu.  Traditional qemu doesn't have such issue.

In this situation, hvmloader will relocate some guest RAM below the RMRR to high 
memory, and it seems the Windows guest accesses an invalid address.
Could someone give me some suggestions on how to debug this ?

thanks
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-21 Thread Julien Grall



On 20/07/17 17:54, Wei Liu wrote:

On Thu, Jul 20, 2017 at 05:46:50PM +0100, Wei Liu wrote:

CC relevant maintainers

On Thu, Jul 20, 2017 at 05:20:43PM +0200, David Woodhouse wrote:

From: David Woodhouse 

This includes stuff lke the hypercall tables which we really want


lke -> like


to be read-only. And they were going into .data.read-mostly.

Signed-off-by: David Woodhouse 


Reviewed-by: Wei Liu 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-21 Thread Julien Grall

Hi,

On 18/07/17 19:30, Zhongze Liu wrote:


1. Motivation and Description

Virtual machines use grant table hypercalls to setup a share page for
inter-VMs communications. These hypercalls are used by all PV
protocols today. However, very simple guests, such as baremetal
applications, might not have the infrastructure to handle the grant table.
This project is about setting up several shared memory areas for inter-VMs
communications directly from the VM config file.
So that the guest kernel doesn't have to have grant table support (in the
embedded space, this is not unusual) to be able to communicate with
other guests.


2. Implementation Plan:


==
2.1 Introduce a new VM config option in xl:
==

2.1.1 Design Goals
~~~

The shared areas should be shareable among several (>=2) VMs, so every shared
physical memory area is assigned to a set of VMs. Therefore, a “token” or
“identifier” should be used here to uniquely identify a backing memory area.
A string no longer than 128 bytes is used here to serve the purpose.

The backing area would be taken from one domain, which we will regard
as the "master domain", and this domain should be created prior to any
other "slave domain"s. Again, we have to use some kind of tag to tell who
is the "master domain".

And the ability to specify the permissions and cacheability (and shareability
for arm HVM's) of the pages to be shared should be also given to the user.


s/arm/ARM/. Furthermore it is called ARM guest and not HVM.



2.2.2 Syntax and Behavior
~
The following example illustrates the syntax of the proposed config entry:

In xl config file of vm1:

   static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
   arm_shareattr=inner, arm_inner_cacheattr=wb,
   arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=ro',

   'id=ID2, begin=0x30, end=0x40, role=master,
   arm_shareattr=inner, arm_inner_cacheattr=wb,
   arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=rw' ]

In xl config file of vm2:

static_shm = [ 'id=ID1, begin=0x50, end=0x60, role=slave, prot=ro' ]

In xl config file of vm3:

static_shm = [ 'id=ID2, begin=0x70, end=0x80, role=slave, prot=ro' ]

where:
  @id   can be any string that matches the regexp "[^ \t\n,]+"
and no logner than 128 characters


s/logner/longer/


  @begin/endcan be decimals or hexidemicals of the form "0x2".


s/hexidemicals/hexadecimals/


  @role can only be 'master' or 'slave'
  @prot can be 'n', 'r', 'ro', 'w', 'wo', 'x', 'xo', 'rw', 'rx',
'wx' or 'rwx'. Default is 'rw'.
  @arm_shareattrcan be 'inner' our 'outter', this will be ignored and


s/outter/outer/. If you really want to support shareability, you want to 
provide non-shareable too.


But I think, as suggested in the answer to Stefano, it would be easier if 
we provide a set of policies that will configure the guest correctly. 
This would avoid having to do sanity checks on the options used by the user.




a warning will be printed out to the screen if it
is specified in an x86 HVM config file.
Default is 'inner'
  @arm_outer_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa', this will
be ignored and a warning will be printed out to the
screen if it is specified in an x86 HVM config file.
Default is 'inner'


I guess you took the names from asm-arm/page.h? Those attributes are for 
stage-1 page-tables and not stage-2 (i.e. used for translating an 
intermediate physical address to a physical address). Actually, nowhere do 
you explain that this will be used to configure the mapping in stage-2.


The possibilities to configure the mappings are very different (see D4.5 
in ARM DDI0487B.a). You can configure cacheability but not cache 
allocation hints. For instance wa (write-allocate) is a hint.


You will also want to warn the user that this may not prevent memory 
attribute mismatches depending on the cacheability policy.



  @arm_inner_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa'. Default
is 'wb'.
  @x86_cacheattrcan be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'. Default
is 'wb'.


Besides, the sizes of the areas specified by @begin and @end in the slave
domain's config file should be smaller than the corresponding sizes specified
in its master's domain. And overlapping backing memory areas are allowed.

In the example 
