Re: [Xen-devel] [PATCH] xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

2017-07-19 Thread Alexey G
On Wed, 19 Jul 2017 11:00:26 -0700 (PDT)
Stefano Stabellini  wrote:

> My expectation is that unlocked mappings are much more frequent than
> locked mappings. Also, I expect that only very rarely we'll be able to
> reuse locked mappings. Over the course of a VM lifetime, it seems to me
> that walking the list every time would cost more than it would benefit.
> 
> These are only "expectations", I would love to see numbers. Numbers make
> for better decisions :-)  Would you be up for gathering some of these
> numbers? Such as how many times you get to reuse locked mappings and how
> many times we walk items on the list fruitlessly?
> 
> Otherwise, would you be up for just testing the modified version of the
> patch I sent to verify that solves the bug?

Numbers will show that there is a single entry in the bucket's list
most of the time. :) Even two entries are rare encounters, typically
seen only when the guest performs some intensive I/O. OK, I'll collect some
real stats for different scenarios; these are interesting numbers and might
come in useful for later optimizations.

The approach you proposed is good, but it allows reusing suitable
locked entries only when they come first in the list (the existing behavior).
But we can actually reuse a locked entry that comes later (if any) in
the list as well. When a lock=0 entry comes first in the list and a
lock=1 entry is second, there is a chance the first entry was a
2MB-type (there must be some reason why the 2nd entry was added to the
list), so picking it for a lock0-request might result in
xen_remap_bucket... which should be avoided. Anyway, it's no big deal
which approach is better, as these situations are uncommon. After all,
mostly it's just a single entry in the bucket's list.
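
To make the "reuse an entry that comes later in the list" idea concrete, here is a minimal sketch. It is only an illustration: ToyEntry and toy_find_reusable() are hypothetical names, not the real MapCacheEntry or any existing xen-mapcache function, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a mapcache entry (not QEMU's struct). */
typedef struct ToyEntry {
    uint64_t paddr_index;   /* which guest address range the entry maps */
    uint64_t size;          /* size of the current mapping */
    unsigned lock;          /* > 0 while a lock=1 user holds the mapping */
    struct ToyEntry *next;  /* next entry in the same bucket's list */
} ToyEntry;

/*
 * Walk the whole bucket list and return the first entry that can serve the
 * request as-is, whether it is locked or not and wherever it sits in the
 * list. NULL means no entry fits, so the caller would have to remap an
 * unlocked entry or add a new one.
 */
static ToyEntry *toy_find_reusable(ToyEntry *head, uint64_t index,
                                   uint64_t cache_size)
{
    assert(cache_size > 0);

    for (ToyEntry *e = head; e != NULL; e = e->next) {
        if (e->paddr_index == index && e->size == cache_size) {
            return e;
        }
    }
    return NULL;
}
```

With a lock=0 entry of another size first and a matching lock=1 entry second, the lookup returns the second entry instead of remapping the first.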

> > One possible minor optimization for xen-mapcache would be to reuse
> > larger mappings for mappings of lesser cache_size. Right now existing
> > code does checks in the "entry->size == cache_size" manner, while we
> > can use "entry->size >= cache_size" here. However, we may end up with
> > resident MapCacheEntries being mapped to a bigger mapping sizes than
> > necessary and thus might need to add remapping back to the normal size
> > in xen_invalidate_map_cache_entry_unlocked() when there are no other
> > mappings.  
> 
> Yes, I thought about it, that would be a good improvement to have.

Well, it appears there is a lot of room for improvement in xen-mapcache
usage. Probably getting rid of the lock0/lock1-request separation will
allow us to drastically reduce the number of xen_remap_bucket calls.


There also might be a possible bug for lock0-mappings being remapped by
concurrent xen-mapcache requests. 

The whole xen_map_cache(addr, 0, lock=0) thing looks very strange. It
seems the idea was to have a way to receive a temporary mapping, read
some tiny item from the guest's RAM and then leave this mapping on its
own without bothering to unmap it. So it will be either reused later by
some other lock0/1-request or even remapped.

It appears that lock=0 mappings are very fragile. Their typical usage
scenario is like this:

rcu_read_lock();
...
ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
memcpy(buf, ptr, len);
...
rcu_read_unlock();

Here qemu_map_ram_ptr calls xen_map_cache(lock=0), which returns the actual
ptr. This scenario assumes there will be no intervention between
qemu_map_ram_ptr and rcu_read_unlock, so that ptr stays valid.

This might be OK for QEMU alone, but with xen-mapcache underneath, the
RCU read lock seems to be assumed to provide protection against
concurrent remappings of ptr's MapCacheEntry... which it obviously doesn't.
The problem is that rcu_read_lock() seems to be used to protect QEMU stuff
only, leaving us only mapcache_(un)lock to sync execution. But upon return
from qemu_map_ram_ptr we don't hold the xen-mapcache lock anymore, so the
question is how the RCU read lock is supposed to save us from concurrent
qemu_map_ram_ptr/qemu_ram_ptr_length calls. If there is some DMA mapping
(lock=1) for that address_index, or even another lock0-read (of a
different size), they will see an unlocked entry which can be remapped
without hesitation, breaking the ptr mapping which might still be in use.
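
The suspected sequence can be modelled with a toy, single-threaded sketch. Everything here is hypothetical (ToyMapping, ToyPtr, toy_map, toy_ptr_valid are made-up names), and a generation counter stands in for the host address that changes on remap; it is not how xen-mapcache is implemented, just the interleaving I have in mind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of one mapcache entry; not QEMU's real structure. */
typedef struct {
    uint64_t size;        /* size of the current mapping */
    unsigned lock;        /* number of lock=1 users */
    unsigned generation;  /* bumped on every remap; stands in for the host address */
} ToyMapping;

/* What a caller keeps: the entry plus the generation it observed. */
typedef struct {
    const ToyMapping *entry;
    unsigned generation;
} ToyPtr;

/*
 * Toy version of a map request. An unlocked entry of the wrong size is
 * remapped in place. (A locked entry of the wrong size is simply left
 * alone here; real code would have to use another entry instead.)
 */
static ToyPtr toy_map(ToyMapping *m, uint64_t size, bool lock)
{
    assert(m != 0 && size > 0);

    if (m->size != size && m->lock == 0) {
        m->size = size;
        m->generation++;
    }
    if (lock) {
        m->lock++;
    }
    return (ToyPtr){ m, m->generation };
}

/* A pointer is still good only if no remap happened since it was handed out. */
static bool toy_ptr_valid(ToyPtr p)
{
    return p.entry->generation == p.generation;
}
```

A lock=0 reader that calls toy_map(), followed by a lock=1 request of a different size on the same entry, ends up holding a ToyPtr for which toy_ptr_valid() is false, even though the reader never left its RCU read-side section.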

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-19 Thread Zhongze Liu
Hi Stefano,

I missed some of your comments in my last reply. Adding responses to them here.

2017-07-20 2:47 GMT+08:00 Stefano Stabellini :
> On Wed, 19 Jul 2017, Zhongze Liu wrote:
>> 
>> 1. Motivation and Description
>> 
>> Virtual machines use grant table hypercalls to setup a share page for
>> inter-VMs communications. These hypercalls are used by all PV
>> protocols today. However, very simple guests, such as baremetal
>> applications, might not have the infrastructure to handle the grant table.
>> This project is about setting up several shared memory areas for inter-VMs
>> communications directly from the VM config file.
>> So that the guest kernel doesn't have to have grant table support (in the
>> embedded space, this is not unusual) to be able to communicate with
>> other guests.
>>
>> 
>> 2. Implementation Plan:
>> 
>>
>> ==
>> 2.1 Introduce a new VM config option in xl:
>> ==
>>
>> 2.1.1 Design Goals
>> ~~~
>>
>> The shared areas should be shareable among several (>=2) VMs, so every shared
>> physical memory area is assigned to a set of VMs. Therefore, a “token” or
>> “identifier” should be used here to uniquely identify a backing memory area.
>> A string no longer than 128 bytes is used here to serve the purpose.
>>
>> The backing area would be taken from one domain, which we will regard
>> as the "master domain", and this domain should be created prior to any
>> other "slave domain"s. Again, we have to use some kind of tag to tell who
>> is the "master domain".
>>
>> And the ability to specify the permissions and cacheability (and shareability
>> for arm HVM's) of the pages to be shared should be also given to the user.
>>
>> 2.2.2 Syntax and Behavior
>> ~
>> The following example illustrates the syntax of the proposed config entry:
>>
>> In xl config file of vm1:
>>
>>static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
>>arm_shareattr=inner, arm_inner_cacheattr=wb,
>>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=ro',
>>
>>'id=ID2, begin=0x30, end=0x40, role=master,
>>arm_shareattr=inner, arm_inner_cacheattr=wb,
>>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=rw' ]
>
> Probably not a good idea to mix x86 and arm attributes in the example :-)
> Just make a couple of examples instead.

OK. I'll separate this into two examples.

>
>
>> In xl config file of vm2:
>>
>> static_shm = [ 'id=ID1, begin=0x50, end=0x60, role=slave, prot=ro' ]
>>
>> In xl config file of vm3:
>>
>> static_shm = [ 'id=ID2, begin=0x70, end=0x80, role=slave, prot=ro' ]
>>
>> where:
>>   @id                   can be any string that matches the regexp "[^ \t\n,]+"
>>                         and no longer than 128 characters
>>   @begin/end            can be decimals or hexadecimals of the form "0x2".
>>   @role                 can only be 'master' or 'slave'
>>   @prot                 can be 'n', 'r', 'ro', 'w', 'wo', 'x', 'xo', 'rw', 'rx',
>>                         'wx' or 'rwx'. Default is 'rw'.
>>   @arm_shareattr        can be 'inner' or 'outer', this will be ignored and
>>                         a warning will be printed out to the screen if it
>>                         is specified in an x86 HVM config file.
>>                         Default is 'inner'
>>   @arm_outer_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa', this will
>>                         be ignored and a warning will be printed out to the
>>                         screen if it is specified in an x86 HVM config file.
>>                         Default is 'inner'
>>   @arm_inner_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa'. Default
>>                         is 'wb'.
>
> I don't think we need both @arm_outer_cacheattr and
> @arm_inner_cacheattr: a single @arm_cacheattr should suffice.
>
> Also, we need to explain what each of these values mean. Instead, I
> would only say that today we only support write-back:
>
> @arm_cacheattr  Only 'wb' (write-back) is supported today.
>
> In the code I would check that arm_cacheattr is either missing, or set
> to 'wb'. Throw an error in all other cases.

I'm not sure whether I should first list out all the flags that are *supposed
to be* accepted and then mark some of the flags unavailable or just simply
list only the flags that are currently available.

>
>
>>   @x86_cacheattr        can be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'. Default
>>                         is 'wb'.
>
> Also here, I would write:
>
> @x86_cacheattr  Only 'wb' (write-back) is supported today.
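
As a rough illustration of the @id and @begin/@end rules quoted above, the checks could look something like the sketch below. The helper names (shm_id_is_valid, shm_parse_addr) and the SHM_ID_MAXLEN macro are made up for this example and are not part of xl or libxl; the only things taken from the proposal are the regexp "[^ \t\n,]+", the 128-character limit, and the decimal/hexadecimal number forms.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SHM_ID_MAXLEN 128  /* limit taken from the proposal text */

/* True if id matches [^ \t\n,]+ and is at most SHM_ID_MAXLEN characters. */
static bool shm_id_is_valid(const char *id)
{
    assert(id != NULL);

    size_t len = strlen(id);
    if (len == 0 || len > SHM_ID_MAXLEN) {
        return false;
    }
    /* strcspn() stops at the first forbidden character, if any. */
    return strcspn(id, " \t\n,") == len;
}

/* Parse a decimal or 0x-prefixed hexadecimal number; reject anything else. */
static bool shm_parse_addr(const char *s, unsigned long long *out)
{
    assert(s != NULL && out != NULL);

    /* Require a leading digit so signs and whitespace are not accepted. */
    if (!isdigit((unsigned char)s[0])) {
        return false;
    }
    int base = (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) ? 16 : 10;

    char *end;
    errno = 0;
    unsigned long long v = strtoull(s, &end, base);
    if (errno != 0 || *end != '\0') {
        return false;
    }
    *out = v;
    return true;
}
```

For instance, shm_id_is_valid("ID1") holds while "bad id" and "a,b" are rejected, and shm_parse_addr() accepts both "0x10" and "4096".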

[Xen-devel] [PATCH XTF] Functional: Add a UMIP test

2017-07-19 Thread Boqun Feng (Intel)
Add a "umip" test for User-Mode Instruction Prevention. The test
simply tries to run sgdt/sidt/sldt/str/smsw in guest user mode with
CR4.UMIP = 1.

Signed-off-by: Boqun Feng (Intel) 
---
 docs/all-tests.dox  |   2 +
 tests/umip/Makefile |   9 
 tests/umip/main.c   | 120 
 3 files changed, 131 insertions(+)
 create mode 100644 tests/umip/Makefile
 create mode 100644 tests/umip/main.c

diff --git a/docs/all-tests.dox b/docs/all-tests.dox
index 01a7a572f472..ec5328b50189 100644
--- a/docs/all-tests.dox
+++ b/docs/all-tests.dox
@@ -109,4 +109,6 @@ guest breakout.
 @section index-in-development In Development
 
 @subpage test-vvmx - Nested VT-x tests.
+
+@subpage test-umip - User-Mode Instruction Prevention
 */
diff --git a/tests/umip/Makefile b/tests/umip/Makefile
new file mode 100644
index ..0248c8b247a0
--- /dev/null
+++ b/tests/umip/Makefile
@@ -0,0 +1,9 @@
+include $(ROOT)/build/common.mk
+
+NAME  := umip
+CATEGORY  := functional
+TEST-ENVS := hvm32 hvm64
+
+obj-perenv += main.o
+
+include $(ROOT)/build/gen.mk
diff --git a/tests/umip/main.c b/tests/umip/main.c
new file mode 100644
index ..27b7d44f4b98
--- /dev/null
+++ b/tests/umip/main.c
@@ -0,0 +1,120 @@
+/**
+ * @file tests/umip/main.c
+ * @ref test-umip
+ *
+ * @page test-umip umip
+ *
+ * @todo Docs for test-umip
+ *
+ * @see tests/umip/main.c
+ */
+#include 
+#include 
+
+const char test_title[] = "User-Mode Instruction Prevention Test";
+bool test_wants_user_mapping = true;
+
+unsigned long umip_sgdt(void)
+{
+    unsigned long fault = 0;
+    unsigned long tmp;
+
+    asm volatile("1: sgdt %[tmp]; 2:"
+                 _ASM_EXTABLE_HANDLER(1b,2b, ex_record_fault_edi)
+                 : "+D" (fault), [tmp] "=m" (tmp)
+                 :);
+
+    return fault;
+}
+
+unsigned long umip_sldt(void)
+{
+    unsigned long fault = 0;
+    unsigned long tmp;
+
+    asm volatile("1: sldt %[tmp]; 2:"
+                 _ASM_EXTABLE_HANDLER(1b,2b, ex_record_fault_edi)
+                 : "+D" (fault), [tmp] "=m" (tmp)
+                 :);
+
+    return fault;
+}
+
+unsigned long umip_sidt(void)
+{
+    unsigned long fault = 0;
+    unsigned long tmp;
+
+    asm volatile("1: sidt %[tmp]; 2:"
+                 _ASM_EXTABLE_HANDLER(1b,2b, ex_record_fault_edi)
+                 : "+D" (fault), [tmp] "=m" (tmp)
+                 :);
+
+    return fault;
+}
+
+unsigned long umip_str(void)
+{
+    unsigned long fault = 0;
+    unsigned long tmp;
+
+    asm volatile("1: str %[tmp]; 2:"
+                 _ASM_EXTABLE_HANDLER(1b,2b, ex_record_fault_edi)
+                 : "+D" (fault), [tmp] "=m" (tmp)
+                 :);
+
+    return fault;
+}
+
+unsigned long umip_smsw(void)
+{
+    unsigned long fault = 0;
+    unsigned long tmp;
+
+    asm volatile("1: smsw %[tmp]; 2:"
+                 _ASM_EXTABLE_HANDLER(1b,2b, ex_record_fault_edi)
+                 : "+D" (fault), [tmp] "=m" (tmp)
+                 :);
+
+    return fault;
+}
+
+void test_main(void)
+{
+    unsigned long exp;
+    unsigned long cr4 = read_cr4();
+
+    if ( !cpu_has_umip )
+        xtf_failure("Fail: UMIP is not supported\n");
+
+    write_cr4(cr4 | X86_CR4_UMIP);
+
+    exp = EXINFO_SYM(GP, 0);
+
+    if ( exec_user(umip_sgdt) != exp )
+        xtf_failure("Fail: sgdt didn't trigger #GP\n");
+
+    if ( exec_user(umip_sldt) != exp )
+        xtf_failure("Fail: sldt didn't trigger #GP\n");
+
+    if ( exec_user(umip_sidt) != exp )
+        xtf_failure("Fail: sidt didn't trigger #GP\n");
+
+    if ( exec_user(umip_str) != exp )
+        xtf_failure("Fail: str didn't trigger #GP\n");
+
+    if ( exec_user(umip_smsw) != exp )
+        xtf_failure("Fail: smsw didn't trigger #GP\n");
+
+    xtf_success(NULL);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.13.2




[Xen-devel] [PATCH] x86/cpufeatures: Expose UMIP to HVM guest

2017-07-19 Thread Boqun Feng (Intel)
User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel processors. With this feature, when the UMIP bit in CR4 is set,
the following instructions cannot be executed if CPL > 0: SGDT, SIDT,
SLDT, SMSW, and STR. An attempt at such execution causes a
general-protection exception (#GP).

This patch simply adds the necessary definitions to expose this feature to
HVM guests.

Signed-off-by: Boqun Feng (Intel) 
Cc: Jan Beulich 
---
This patch is basically based on Jan Beulich's patch:

https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00552.html

I simply picked up the exposing bits of that patch and ran some tests in our
Simics environment. If any SoB adjustment is needed, please let me know.

Another patch for XTF is sent out along with this patch; it adds a
new test for UMIP.
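
For readers who want the rule from the description spelled out, here is a small stand-alone sketch (not part of the patch). The macro and function names are hypothetical; the bit positions are CR4 bit 11 for UMIP and CPUID.(EAX=7,ECX=0):ECX bit 2 for the feature flag, the latter matching the 6*32+2 slot used in cpufeatureset.h below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CR4_UMIP        (1u << 11)  /* CR4.UMIP */
#define TOY_CPUID7_ECX_UMIP (1u << 2)   /* CPUID.(EAX=7,ECX=0):ECX bit 2 */

/* True if the given leaf-7 ECX value advertises UMIP. */
static bool toy_cpu_has_umip(uint32_t cpuid7_ecx)
{
    return (cpuid7_ecx & TOY_CPUID7_ECX_UMIP) != 0;
}

/*
 * True if SGDT/SIDT/SLDT/SMSW/STR would raise #GP: UMIP must be enabled
 * in CR4 and the current privilege level must be greater than 0.
 */
static bool toy_umip_blocks(uint64_t cr4, unsigned cpl)
{
    assert(cpl <= 3);
    return (cr4 & TOY_CR4_UMIP) != 0 && cpl > 0;
}
```

So ring 3 with CR4.UMIP set faults, while ring 0 (or any ring with the bit clear) does not.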

 xen/arch/x86/hvm/hvm.c  | 1 +
 xen/include/public/arch-x86/cpufeatureset.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 814538574725..1284460cda8e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -960,6 +960,7 @@ unsigned long hvm_cr4_guest_valid_bits(const struct vcpu *v, bool restore)
 (p->basic.xsave   ? X86_CR4_OSXSAVE   : 0) |
 (p->feat.smep ? X86_CR4_SMEP  : 0) |
 (p->feat.smap ? X86_CR4_SMAP  : 0) |
+(p->feat.umip ? X86_CR4_UMIP  : 0) |
 (p->feat.pku  ? X86_CR4_PKE   : 0));
 }
 
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 97dd3534c573..0ee3ea350fc9 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -225,6 +225,7 @@ XEN_CPUFEATURE(AVX512VL,  5*32+31) /*A  AVX-512 Vector Length Extensions */
 /* Intel-defined CPU features, CPUID level 0x0007:0.ecx, word 6 */
 XEN_CPUFEATURE(PREFETCHWT1,   6*32+ 0) /*A  PREFETCHWT1 instruction */
 XEN_CPUFEATURE(AVX512VBMI,6*32+ 1) /*A  AVX-512 Vector Byte Manipulation Instrs */
+XEN_CPUFEATURE(UMIP,  6*32+ 2) /*S  User Mode Instruction Prevention */
 XEN_CPUFEATURE(PKU,   6*32+ 3) /*H  Protection Keys for Userspace */
 XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*!  OS Protection Keys Enable */
 XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A  POPCNT for vectors of DW/QW */
-- 
2.13.3




[Xen-devel] [qemu-mainline bisection] complete test-amd64-i386-xl-qemuu-ws16-amd64

2017-07-19 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-i386-xl-qemuu-ws16-amd64
testid windows-install

Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  qemuu git://git.qemu.org/qemu.git
  Bug introduced:  04bf2526ce87f21b32c9acba1c5518708c243ad0
  Bug not present: 1a29cc8f5ebd657e159dbe4be340102595846d42
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/112031/


  commit 04bf2526ce87f21b32c9acba1c5518708c243ad0
  Author: Prasad J Pandit 
  Date:   Wed Jul 12 18:08:40 2017 +0530
  
  exec: use qemu_ram_ptr_length to access guest ram
  
  When accessing guest's ram block during DMA operation, use
  'qemu_ram_ptr_length' to get ram block pointer. It ensures
  that DMA operation of given length is possible; And avoids
  any OOB memory access situations.
  
  Reported-by: Alex 
  Signed-off-by: Prasad J Pandit 
  Message-Id: <20170712123840.29328-1-ppan...@redhat.com>
  Signed-off-by: Paolo Bonzini 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-mainline/test-amd64-i386-xl-qemuu-ws16-amd64.windows-install.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/qemu-mainline/test-amd64-i386-xl-qemuu-ws16-amd64.windows-install
 --summary-out=tmp/112031.bisection-summary --basis-template=111765 
--blessings=real,real-bisect qemu-mainline test-amd64-i386-xl-qemuu-ws16-amd64 
windows-install
Searching for failure / basis pass:
 111986 fail [host=fiano0] / 111790 ok.
Failure / basis pass flights: 111986 / 111790
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: xen git://xenbits.xen.org/xen.git
Latest b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
ff3351d4495c07501aa75e46aec3f494f51d29e1 
2b8a8a03f56e21381c7dd560b081002d357639e2
Basis pass b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
49bcce4b9c11759678fd223aefb48691c4959d4f 
614a14736e33fb84872eb00f08799ebbc73a96c6
Generating revisions with ./adhoc-revtuple-generator  
git://xenbits.xen.org/linux-pvops.git#b65f2f457c49b2cfd7967c34b7a0b04c25587f13-b65f2f457c49b2cfd7967c34b7a0b04c25587f13
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/qemu-xen-traditional.git#8051789e982499050680a26febeada7467e18a8d-8051789e982499050680a26febeada7467e18a8d
 
git://git.qemu.org/qemu.git#49bcce4b9c11759678fd223aefb48691c4959d4f-ff3351d4495c07501aa75e46aec3f494f51d29e1
 
git://xenbits.xen.org/xen.git#614a14736e33fb84872eb00f08799ebbc73a96c6-2b8a8a03f56e21381c7dd560b081002d357639e2
Loaded 8988 nodes in revision graph
Searching for test results:
 111790 pass b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
49bcce4b9c11759678fd223aefb48691c4959d4f 
614a14736e33fb84872eb00f08799ebbc73a96c6
 111817 fail irrelevant
 111848 fail irrelevant
 111889 fail irrelevant
 111926 fail irrelevant
 111944 fail irrelevant
 111940 pass b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
49bcce4b9c11759678fd223aefb48691c4959d4f 
614a14736e33fb84872eb00f08799ebbc73a96c6
 111951 fail irrelevant
 111983 fail b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
04bf2526ce87f21b32c9acba1c5518708c243ad0 
614a14736e33fb84872eb00f08799ebbc73a96c6
 111964 fail irrelevant
 111977 pass b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
92ddfade9f619977d47399bd360c03626629b1e2 
614a14736e33fb84872eb00f08799ebbc73a96c6
 111971 fail b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 
98fab4c163adb980568afa40824208edbcd6d70c 
614a14736e33fb84872eb00f08799ebbc73a96c6
 111986 fail b65f2f457c49b2cfd7967c34b7a0b04c25587f13 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
8051789e982499050680a26febeada7467e18a8d 

Re: [Xen-devel] [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread H. Peter Anvin

On July 19, 2017 3:58:07 PM PDT, Ard Biesheuvel wrote:
>On 19 July 2017 at 23:27, H. Peter Anvin wrote:
>> On 07/19/17 08:40, Thomas Garnier wrote:
>>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>>> per-cpu section is an absolute, zero-based section and not subject to
>>>> relocation.
>>>
>>> PIE does not respect the zero-based section, it tries to have
>>> everything relative. Patch 16/22 also adapts per-cpu to work with PIE
>>> (while keeping the zero absolute design by default).
>>>
>>
>> This is silly.  The right thing is for PIE to be explicitly absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is just
>> an optimization.
>>
>
>Sadly, there is an issue in binutils that may prevent us from doing
>this as cleanly as we would want.
>
>For historical reasons, bfd.ld emits special symbols like
>_GLOBAL_OFFSET_TABLE_ as absolute symbols with a section index of
>SHN_ABS, even though it is quite obvious that they are relative like
>any other symbol that points into the image. Unfortunately, this means
>that binutils needs to emit R_X86_64_RELATIVE relocations even for
>SHN_ABS symbols, which means we lose the ability to use both absolute
>and relocatable symbols in the same PIE image (unless the reloc tool
>can filter them out)
>
>More info here:
>https://sourceware.org/bugzilla/show_bug.cgi?id=19818

The reloc tool already has the ability to filter symbols.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: [Xen-devel] [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread H. Peter Anvin

On July 19, 2017 4:25:56 PM PDT, Thomas Garnier wrote:
>On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin wrote:
>> On 07/19/17 15:47, Thomas Garnier wrote:
>>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin wrote:
>>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>>> The x86 relocation tool generates a list of 32-bit signed integers. There
>>>>> was no need to use 64-bit integers because all addresses were above the 2G
>>>>> top of the memory.
>>>>>
>>>>> This change adds a large-reloc option to generate 64-bit unsigned integers.
>>>>> It can be used when the kernel plans to go below the top 2G and 32-bit
>>>>> integers are not enough.
>>>>
>>>> Why on Earth?  This would only be necessary if the *kernel itself* was
>>>> more than 2G, which isn't going to happen for the foreseeable future.
>>>
>>> Because the relocation integer is an absolute address, not an offset
>>> in the binary. Next iteration, I can try using a 32-bit offset for
>>> everyone.
>>
>> It is an absolute address *as the kernel was originally linked*, for
>> obvious reasons.
>
>Sure when the kernel was just above 0x8000, it doesn't
>work when it goes down to 0x. That's why using an
>offset might make more sense in general.
>
>>
>> -hpa
>>

What is the motivation for changing the pre-linked address at all?
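
For reference, the 32-bit signed representation under discussion only works for addresses that survive sign extension, i.e. addresses in the top 2G (or bottom 2G) of the 64-bit space. A minimal sketch of that constraint follows; the function names are hypothetical and are not taken from the relocs tool, and any addresses used with it are purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if addr can be stored as a sign-extended 32-bit value, i.e. bits
 * 63..31 are either all zero or all one.
 */
static bool toy_fits_in_s32(uint64_t addr)
{
    uint64_t top = addr >> 31;   /* the upper 33 bits */
    return top == 0 || top == 0x1ffffffffULL;
}

/* Store an address as a 32-bit signed relocation entry. */
static int32_t toy_encode_reloc32(uint64_t addr)
{
    assert(toy_fits_in_s32(addr));
    return (int32_t)(uint32_t)addr;
}

/* Recover the full 64-bit address by sign extension. */
static uint64_t toy_decode_reloc32(int32_t rel)
{
    return (uint64_t)(int64_t)rel;
}
```

An address 2G below the top of the address space round-trips through toy_encode_reloc32()/toy_decode_reloc32(), whereas one 4G below the top fails toy_fits_in_s32(), which is the case an offset-based encoding would avoid.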
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-19 Thread Manish Jaggi

Hi Punit,

On 7/19/2017 8:11 PM, Punit Agrawal wrote:

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got sparser
towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in advance.

Was curious if any discussions happened on the RC Emu (config space
emulation) as per slide 18:

https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf

Although the session was well attended, some of the more active
discussions involved Julien Grall, Stefano Stabellini, Roger Pau
Monné, Jan Beulich, Vikram Sethi. I'm sure I am missing some folks here.

Please do point out any mistakes I've made for the audience's benefit.

* Discovery of PCI hostbridges
   - Dom0 will be responsible for scanning the ECAM for devices and
 register them with Xen. This approach is chosen due to variety of
 non-standard PCI controllers on ARM platforms and the desire to
 not duplicate driver code between Linux and Xen.
   - Jan, Roger: Bus scan needs to happen before device discovery,
 otherwise there is a small window where Xen doesn't know which host
 bridge the device is registered on (as it'll likely only refer to the
 segment number).
   - Roger: Registering config space with Xen before device discovery
 will allow the hypervisor to set access traps for certain
 functionality as appropriate.
   - Jan: Xen and Dom0 have to agree on the PCI segment number mapping
 to host bridges. This is so that for future calls, Dom0 and
 hypervisor can communicate using sBDF without ambiguity.
   - Julien: Dom0 will register config space address and segment
 number. mcfg_add will be used to pass the segment to Xen.
   - PCI segment - it's purely a software construct to identify
 different host bridges.
   - Some discussion on whether boot devices need to be on
 Segment 0. Technically, MCFG is only required to describe Segment
 0 - other host bridges can be described in AML.

* Configuration accesses for non-ECAM-compliant host bridges
   - Julien proposed these to be forwarded to Dom0 for handling.
   - Audience: What kind of non-compliance are we talking about? If
 they are simple, can they be implemented in Xen in a few lines of
 code?
   - A few different types
 - restrictions on access size, e.g., only certain sizes supported
 - register multiplexing via a window; similar to legacy x86 PCI
   access mechanism
 - ECAM compliant but with special casing for different devices

* Support on 32bit platforms
   - Is there enough address space to map ECAM into Dom0? Maximum ECAM
 size is 256MB.

* PCI ACS support
   - Vikram: Xen needs to be aware of the PCI device topology to
 correctly setup device groups for passthrough
   - Jan, Roger: IIRC, Xen is already aware of the device topology
 though it doesn't use ACS to work out which devices need to be
 passed to guest as a group.
   - Stefano: There was support in xend (previous Xen toolstack) but the
 functionality has not yet been ported to libxl.

* Implementation milestones
   - Julien provided a summary of breakdown
 - M0 - design document, currently under discussion on xen-devel
 - M1 - PCI support in Xen
   - Xen aware of PCI devices (via Dom0 registration)
 - M2 - Guest PCIe passthrough
   - Julien: Some complexity in dealing with Legacy interrupts as they can
 be shared.
   - Roger: MSIs mandatory for PCIe. So legacy interrupts can be
 tackled at a later stage.
 - M3 - testing
   - fuzzing. Jan: If implemented it'll be better than what x86
 currently has.
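
As a side note on the sBDF and "Maximum ECAM size is 256MB" points above, the numbers follow from the standard ECAM layout (4 KiB of config space per function, 8 functions per device, 32 devices per bus, 256 buses per segment). A rough sketch, with hypothetical helper names that are not Xen or Linux APIs:

```c
#include <assert.h>
#include <stdint.h>

/* Offset of a register within one segment's ECAM window. */
static uint64_t toy_ecam_offset(unsigned bus, unsigned dev, unsigned fn,
                                unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
    return ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
           ((uint64_t)fn << 12) | reg;
}

/* One possible packing of segment + bus/device/function into 32 bits. */
static uint32_t toy_sbdf_pack(uint16_t seg, unsigned bus, unsigned dev,
                              unsigned fn)
{
    assert(bus < 256 && dev < 32 && fn < 8);
    return ((uint32_t)seg << 16) | (bus << 8) | (dev << 3) | fn;
}
```

The largest offset, toy_ecam_offset(255, 31, 7, 4095), is one byte short of 256 MiB, which is where the per-segment maximum comes from.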





Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 19:21, H. Peter Anvin wrote:
> On 07/19/17 16:33, H. Peter Anvin wrote:
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
> 
> Grump.  It turns out that the compiler doesn't do the right thing for
> symbols marked with the __seg_[fg]s markers.  __thread does the right
> thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
> still cache %seg:0 arbitrarily long.

I filed this bug report for gcc:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81490

It might still be possible to work around this by playing really ugly
games with __thread, but I haven't yet figured out how best to do that.

-hpa



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 16:33, H. Peter Anvin wrote:
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
> 
> Why should the way compiler generates code affect the way we do things
> in assembly?
> 
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier.  That should let
> us drop the __percpu_prefix and just use variables directly.  I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
> 
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
> 

Grump.  It turns out that the compiler doesn't do the right thing for
symbols marked with the __seg_[fg]s markers.  __thread does the right
thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
still cache %seg:0 arbitrarily long.

-hpa




[Xen-devel] [xen-unstable test] 112004: tolerable FAIL - PUSHED

2017-07-19 Thread osstest service owner
flight 112004 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112004/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64  18 guest-start/win.repeat  fail  like 111836
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-localmigrate/x10  fail  like 111912
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  like 111957
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail  like 111957
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop  fail  like 111957
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  like 111957
 test-armhf-armhf-libvirt-xsm  14 saverestore-support-check  fail  like 111957
 test-amd64-amd64-xl-rtds  10 debian-install  fail  like 111957
 test-amd64-amd64-xl-qemut-ws16-amd64  10 windows-install  fail  never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64  10 windows-install  fail  never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64  13 guest-saverestore  fail  never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-raw  12 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  14 saverestore-support-check  fail  never pass
 test-amd64-i386-xl-qemut-ws16-amd64  13 guest-saverestore  fail  never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check  fail  never pass
 test-amd64-amd64-xl-qemuu-win10-i386  10 windows-install  fail  never pass
 test-amd64-amd64-xl-qemut-win10-i386  10 windows-install  fail  never pass
 test-amd64-i386-xl-qemuu-win10-i386  10 windows-install  fail  never pass
 test-amd64-i386-xl-qemut-win10-i386  10 windows-install  fail  never pass
 test-armhf-armhf-xl-cubietruck  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  14 saverestore-support-check  fail  never pass

version targeted for testing:
 xen  d535d8922f571502252deaf607e82e7475cd1728
baseline version:
 xen  2b8a8a03f56e21381c7dd560b081002d357639e2

Last test of basis   111957  2017-07-18 02:04:13 Z    1 days
Failing since        111981  2017-07-18 13:27:36 Z    1 days    2 attempts
Testing same since   112004  2017-07-19 06:51:03 Z    0 days    1 attempts


People who touched revisions under test:
  Haozhong Zhang 
  Julien Grall 
  Sergej Proskurin 
  Wei Liu 

jobs:
 build-amd64-xsm 

Re: [Xen-devel] [RFC PATCH] tools/libxl : add struct and parsing utils for the 'static_shm' xl config entry

2017-07-19 Thread Zhongze Liu
Hi Stefano,

2017-07-20 3:24 GMT+08:00 Stefano Stabellini :
> On Wed, 19 Jul 2017, Zhongze Liu wrote:
>> Add a new struct libxl_static_shm in the libxl IDL for the proposed new xl
>> config entry 'static_shm' (see [1]), which allow the user to set up shared
>> memory areas among several VMs for communication.
>>
>> Add related parsing code to the libxl/libxlu_* family and xl/xl_parse.c
>>
>> [1]: [RFC v3]Proposal to allow setting up shared memory areas between VMs 
>> from xl config file,
>>  
>> https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg01741.html
>>
>> Signed-off-by: Zhongze Liu 
>> ---
>> Cc: Wei Liu 
>> Cc: Ian Jackson 
>> Cc: Stefano Stabellini 
>> Cc: Julien Grall 
>> Cc: xen-devel@lists.xen.org
>> ---
>>  tools/libxl/Makefile|   2 +-
>>  tools/libxl/libxl.h |  10 ++
>>  tools/libxl/libxl_types.idl |  52 +
>>  tools/libxl/libxlu_sshm.c   | 274 
>> 
>>  tools/libxl/libxlutil.h |   6 +
>>  tools/xl/xl_parse.c |  24 +++-
>>  6 files changed, 366 insertions(+), 2 deletions(-)
>>  create mode 100644 tools/libxl/libxlu_sshm.c
>>
>> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
>> index 2ffb78f5c4..b7effb188b 100644
>> --- a/tools/libxl/Makefile
>> +++ b/tools/libxl/Makefile
>> @@ -175,7 +175,7 @@ AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h _libxl_list.h 
>> _paths.h \
>>  AUTOSRCS= libxlu_cfg_y.c libxlu_cfg_l.c
>>  AUTOSRCS += _libxl_save_msgs_callout.c _libxl_save_msgs_helper.c
>>  LIBXLU_OBJS = libxlu_cfg_y.o libxlu_cfg_l.o libxlu_cfg.o \
>> - libxlu_disk_l.o libxlu_disk.o libxlu_vif.o libxlu_pci.o
>> + libxlu_disk_l.o libxlu_disk.o libxlu_vif.o libxlu_pci.o libxlu_sshm.o
>>  $(LIBXLU_OBJS): CFLAGS += $(CFLAGS_libxenctrl) # For xentoollog.h
>>
>>  $(TEST_PROG_OBJS) _libxl.api-for-check: CFLAGS += $(CFLAGS_libxentoollog)
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> index 7cf0f31f68..cf3cbe1ba1 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -2228,6 +2228,16 @@ int libxl_fd_set_nonblock(libxl_ctx *ctx, int fd, int 
>> nonblock);
>>  int libxl_qemu_monitor_command(libxl_ctx *ctx, uint32_t domid,
>> const char *command_line, char **output);
>>
>> +
>> +/* Functions to stattically set up shared memory regions between two  
>> domains
>  ^ statically  
> ^double space
>

Sorry for the typos.

>
>> + * for shm-based communication. */
>> +
>> +#define LIBXL_SSHM_RANGE_UNKNOWN UINT64_MAX
>> +
>> +/* TODO: int libxl_sshm_add(libxl_ctx *ctx, uint32_t domid,
>> + *  libxl_static_shm *sshm);
>> + */
>> +
>>  #include 
>>
>>  #endif /* LIBXL_H */
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 8a9849c643..8c68b45add 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -779,6 +779,57 @@ libxl_device_channel = Struct("device_channel", [
>> ])),
>>  ])
>>
>> +# static shared memory cacheability attributes
>> +libxl_sshm_cacheattr = Enumeration("sshm_cacheattr", [
>> +(-1, "UNKNOWN"),
>> +(0, "UC"),
>> +(1, "WC"),  #x86 only
>> +(4, "WT"),
>> +(5, "WP"),  #x86 only
>> +(6, "WB"),
>> +(7, "SUC"), #x86 only
>> +(8, "BUFFERABLE"),  #ARM only
>> +(9, "WA"),  #ARM only
>> +], init_val = "LIBXL_SSHM_CACHEATTR_UNKNOWN")
>
> I would only specify UNKNOWN and WB for now.

Here and below, I actually want to leave the checks for 'not implemented'
errors to later stages of handling. The typical call flow of xl is like below:

xl --> libxlu_* --> xl --> libxl_* --> hypercalls

I was planning to check for options that are not implemented currently
in the libxl_*.
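
A minimal sketch of what such a late check could look like, assuming the
parser accepts every syntactically valid value and the libxl_* stage then
rejects anything that is not implemented yet. The enum, the error codes and
the function name below are hypothetical and only mirror a subset of the
values from the IDL in the patch; they are not part of libxl.

```c
#include <assert.h>

/* Hypothetical mirror of a few cacheability values from the IDL above. */
enum sshm_cacheattr {
    SSHM_CACHEATTR_UNKNOWN = -1,   /* not specified in the config */
    SSHM_CACHEATTR_UC = 0,
    SSHM_CACHEATTR_WT = 4,
    SSHM_CACHEATTR_WB = 6,
};

/* Hypothetical return codes, not the real libxl error values. */
enum { SSHM_OK = 0, SSHM_ERR_NOT_IMPLEMENTED = -1 };

/*
 * Late check, run after parsing: only "unspecified" and write-back are
 * accepted; every other value parsed fine but is reported as not
 * implemented.
 */
static int sshm_check_cacheattr(enum sshm_cacheattr attr)
{
    switch (attr) {
    case SSHM_CACHEATTR_UNKNOWN:
    case SSHM_CACHEATTR_WB:
        return SSHM_OK;
    default:
        return SSHM_ERR_NOT_IMPLEMENTED;
    }
}
```

With this split, the full enumeration can stay in the IDL while the set of
accepted values is narrowed in one place.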

>
>
>> +# static shared memory shareability attributes
>> +libxl_sshm_shareattr = Enumeration("sshm_shareattr", [
>> +(-1, "UNKNOWN"),
>> +(0, "NON"),
>> +(2, "OUTER"),
>> +(3, "INNER"),
>> +], init_val = "LIBXL_SSHM_SHAREATTR_UNKNOWN")
>> +
>> +libxl_sshm_prot = Enumeration("sshm_prot", [
>> +(-1, "UNKNOWN"),
>> +(0, "N"),
>> +(1, "R"),
>> +(2, "W"),
>> +(4, "X"),
>> +(3, "RW"),
>> +(5, "RX"),
>> +(6, "WX"),
>> +(7, "RWX"),
>> +], init_val = "LIBXL_SSHM_PROT_UNKNOWN")
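
The numeric values in this enumeration form a bitmask (R=1, W=2, X=4), so a
permission string can be folded into the value bit by bit. The following is
only a sketch under that assumption; the constant names and the function
sshm_parse_prot() are hypothetical, and it handles just the short spellings
listed in the IDL above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants using the same numbers as the IDL above. */
enum {
    SSHM_PROT_UNKNOWN = -1,
    SSHM_PROT_N = 0,
    SSHM_PROT_R = 1,
    SSHM_PROT_W = 2,
    SSHM_PROT_X = 4,
};

/*
 * Turn a string such as "rw" or "rwx" into the matching bitmask.
 * Returns SSHM_PROT_UNKNOWN for empty input, unknown letters or
 * repeated letters. Letter order is not enforced in this sketch.
 */
static int sshm_parse_prot(const char *s)
{
    int prot = 0;

    if (s == NULL || *s == '\0')
        return SSHM_PROT_UNKNOWN;
    if (strcmp(s, "n") == 0)
        return SSHM_PROT_N;

    for (; *s != '\0'; s++) {
        int bit;

        switch (*s) {
        case 'r': bit = SSHM_PROT_R; break;
        case 'w': bit = SSHM_PROT_W; break;
        case 'x': bit = SSHM_PROT_X; break;
        default:  return SSHM_PROT_UNKNOWN;
        }
        if (prot & bit)            /* reject duplicates such as "rr" */
            return SSHM_PROT_UNKNOWN;
        prot |= bit;
    }
    return prot;
}
```
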
>> +
>> +libxl_sshm_role = Enumeration("sshm_role", [
>> +(-1, "UNKNOWN"),
>> +(0, "MASTER"),
>> +(1, "SLAVE"),
>> +], init_val = "LIBXL_SSHM_ROLE_UNKNOWN")
>> +
>> +libxl_static_shm = Struct("static_shm", [
>> +("id", string),
>> +("begin", uint64, {'init_val': 'LIBXL_SSHM_RANGE_UNKNOWN'}),
>> +("end", uint64, {'init_val': 'LIBXL_SSHM_RANGE_UNKNOWN'}),
>> +("prot", libxl_sshm_prot),
>> +("arm_shareattr", libxl_sshm_shareattr),
>> +("arm_inner_cacheattr", 

Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-19 Thread Zhongze Liu
2017-07-20 2:47 GMT+08:00 Stefano Stabellini :
> On Wed, 19 Jul 2017, Zhongze Liu wrote:
>> 
>> 1. Motivation and Description
>> 
>> Virtual machines use grant table hypercalls to setup a share page for
>> inter-VMs communications. These hypercalls are used by all PV
>> protocols today. However, very simple guests, such as baremetal
>> applications, might not have the infrastructure to handle the grant table.
>> This project is about setting up several shared memory areas for inter-VMs
>> communications directly from the VM config file.
>> So that the guest kernel doesn't have to have grant table support (in the
>> embedded space, this is not unusual) to be able to communicate with
>> other guests.
>>
>> 
>> 2. Implementation Plan:
>> 
>>
>> ==
>> 2.1 Introduce a new VM config option in xl:
>> ==
>>
>> 2.1.1 Design Goals
>> ~~~
>>
>> The shared areas should be shareable among several (>=2) VMs, so every shared
>> physical memory area is assigned to a set of VMs. Therefore, a “token” or
>> “identifier” should be used here to uniquely identify a backing memory area.
>> A string no longer than 128 bytes is used here to serve the purpose.
>>
>> The backing area would be taken from one domain, which we will regard
>> as the "master domain", and this domain should be created prior to any
>> other "slave domain"s. Again, we have to use some kind of tag to tell who
>> is the "master domain".
>>
>> And the ability to specify the permissions and cacheability (and shareability
>> for arm HVM's) of the pages to be shared should be also given to the user.
>>
>> 2.2.2 Syntax and Behavior
>> ~
>> The following example illustrates the syntax of the proposed config entry:
>>
>> In xl config file of vm1:
>>
>>static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
>>arm_shareattr=inner, arm_inner_cacheattr=wb,
>>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=ro',
>>
>>'id=ID2, begin=0x30, end=0x40, role=master,
>>arm_shareattr=inner, arm_inner_cacheattr=wb,
>>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=rw' ]
>
> Probably not a good idea to mix x86 and arm attributes in the example :-)
> Just make a couple of examples instead.
>
>
>> In xl config file of vm2:
>>
>> static_shm = [ 'id=ID1, begin=0x50, end=0x60, role=slave, prot=ro' ]
>>
>> In xl config file of vm3:
>>
>> static_shm = [ 'id=ID2, begin=0x70, end=0x80, role=slave, prot=ro' ]
>>
>> where:
>>   @id                   can be any string that matches the regexp "[^ \t\n,]+"
>>                         and no longer than 128 characters
>>   @begin/end            can be decimals or hexadecimals of the form "0x2".
>>   @role                 can only be 'master' or 'slave'
>>   @prot                 can be 'n', 'r', 'ro', 'w', 'wo', 'x', 'xo', 'rw', 'rx',
>>                         'wx' or 'rwx'. Default is 'rw'.
>>   @arm_shareattr        can be 'inner' or 'outer', this will be ignored and
>>                         a warning will be printed out to the screen if it
>>                         is specified in an x86 HVM config file.
>>                         Default is 'inner'
>>   @arm_outer_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa', this will
>>                         be ignored and a warning will be printed out to the
>>                         screen if it is specified in an x86 HVM config file.
>>                         Default is 'inner'
>>   @arm_inner_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa'. Default
>>                         is 'wb'.
>
> I don't think we need both @arm_outer_cacheattr and
> @arm_inner_cacheattr: a single @arm_cacheattr should suffice.
>
> Also, we need to explain what each of these values mean. Instead, I
> would only say that today we only support write-back:
>
> @arm_cacheattr  Only 'wb' (write-back) is supported today.
>
> In the code I would check that arm_cacheattr is either missing, or set
> to 'wb'. Throw an error in all other cases.
>
>
>>   @x86_cacheattr        can be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'. Default
>>                         is 'wb'.
>
> Also here, I would write:
>
> @x86_cacheattr  Only 'wb' (write-back) is supported today.
>
> Like you wrote later, begin and end addresses need to be multiple of 4K.
>
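
A minimal sketch of the alignment check mentioned above, assuming @end is
exclusive and a 4K page size. The macro and function names are hypothetical
and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SSHM_PAGE_SIZE 0x1000ULL   /* 4K, per the comment above */

/*
 * True when [begin, end) is non-empty and both bounds are multiples
 * of 4K. Treating @end as exclusive is an assumption of this sketch.
 */
static bool sshm_range_is_valid(uint64_t begin, uint64_t end)
{
    if (begin >= end)
        return false;
    return (begin % SSHM_PAGE_SIZE) == 0 && (end % SSHM_PAGE_SIZE) == 0;
}
```
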
>
>> Besides, the sizes of the areas specified by @begin and @end in the slave
>> domain's config file should be smaller than the corresponding sizes specified
>> in its master's domain. And overlapping backing memory areas are allowed.
>>
>> In the example 

Re: [Xen-devel] [PATCH] xen-blkfront: fix mq start/stop race

2017-07-19 Thread Junxiao Bi
On 07/19/2017 10:08 PM, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 19, 2017 at 03:51:48PM +0800, Junxiao Bi wrote:
>> Hi Konrad,
>>
>> On 07/19/2017 03:37 PM, Roger Pau Monné wrote:
>>> On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote:
>>>> Hi Roger,
>>>>
>>>> On 06/23/2017 08:57 PM, Roger Pau Monné wrote:
>>>>> On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote:
>>>>>> When ring buf full, hw queue will be stopped. While blkif interrupt consume
>>>>>> request and make free space in ring buf, hw queue will be started again.
>>>>>> But since start queue is protected by spin lock while stop not, that will
>>>>>> cause a race.
>>>>>>
>>>>>> interrupt:                                      process:
>>>>>> blkif_interrupt()                               blkif_queue_rq()
>>>>>>  kick_pending_request_queues_locked()
>>>>>>   blk_mq_start_stopped_hw_queues()
>>>>>>    clear_bit(BLK_MQ_S_STOPPED, &hctx->state)
>>>>>>                                                 blk_mq_stop_hw_queue(hctx)
>>>>>>    blk_mq_run_hw_queue(hctx, async)
>>>>>>
>>>>>> If ring buf is made empty in this case, interrupt will never come, then the
>>>>>> hw queue will be stopped forever, all processes waiting for the pending io
>>>>>> in the queue will hung.
>>>>>>
>>>>>> Signed-off-by: Junxiao Bi 
>>>>>> Reviewed-by: Ankur Arora 
>>>>>
>>>>> Acked-by: Roger Pau Monné 
>>>> Looks patch not in mainline. Can you please help merge it?
>>>
>>> I'm afraid this needs to be done by Konrad or one of the Linux
>>> maintainers, I don't have an account on kernel.org in order to send
>>> pull requests to Jens.
>> Can you pls help merge it?
> 
> Could you kindly repost it with the updated tags _and_ against Linus's latest
> branch?
Sure, v2 sent. Please check.

Thanks,
Junxiao.
> 
> I get:
> [konrad@char linux]$ git am -s < /tmp/a
> Applying: xen-blkfront: fix mq start/stop race
> error: patch failed: drivers/block/xen-blkfront.c:912
> error: drivers/block/xen-blkfront.c: patch does not apply
> Patch failed at 0001 xen-blkfront: fix mq start/stop race
> The copy of the patch that failed is found in: .git/rebase-apply/patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> 
>>
>> Thanks,
>> Junxiao.
>>>
>>> Roger.
>>>
>>




[Xen-devel] [PATCH v2] xen-blkfront: fix mq start/stop race

2017-07-19 Thread Junxiao Bi
When ring buf full, hw queue will be stopped. While blkif interrupt consume
request and make free space in ring buf, hw queue will be started again.
But since start queue is protected by spin lock while stop not, that will
cause a race.

interrupt:                                      process:
blkif_interrupt()                               blkif_queue_rq()
 kick_pending_request_queues_locked()
  blk_mq_start_stopped_hw_queues()
   clear_bit(BLK_MQ_S_STOPPED, &hctx->state)
                                                blk_mq_stop_hw_queue(hctx)
   blk_mq_run_hw_queue(hctx, async)

If ring buf is made empty in this case, interrupt will never come, then the
hw queue will be stopped forever, all processes waiting for the pending io
in the queue will hung.

Signed-off-by: Junxiao Bi 
Reviewed-by: Ankur Arora 
Acked-by: Roger Pau Monné 
---
 drivers/block/xen-blkfront.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index c852ed3c01d5..5468be4f8075 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -906,8 +906,8 @@ static blk_status_t blkif_queue_rq(struct blk_mq_hw_ctx *hctx,
 		return BLK_STS_IOERR;
 
 out_busy:
-	spin_unlock_irqrestore(&rinfo->ring_lock, flags);
 	blk_mq_stop_hw_queue(hctx);
+	spin_unlock_irqrestore(&rinfo->ring_lock, flags);
 	return BLK_STS_RESOURCE;
 }
 
-- 
1.7.9.5
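
The ordering problem fixed by this patch can be illustrated with a small
single-threaded model. This is only a sketch, not kernel code: struct hwq and
all function names are hypothetical, and the interrupt is "run" by hand at
the one point where each ordering lets it in.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one hardware queue and its ring. */
struct hwq {
    bool stopped;    /* stands in for BLK_MQ_S_STOPPED */
    int  inflight;   /* requests the backend still has to complete */
};

/* Interrupt side: complete everything, restart the queue if it is stopped. */
static void irq_complete_and_kick(struct hwq *q)
{
    q->inflight = 0;
    if (q->stopped)
        q->stopped = false;
}

/* Stopped with nothing in flight: no further interrupt will restart it. */
static bool queue_is_stuck(const struct hwq *q)
{
    return q->stopped && q->inflight == 0;
}

/* Old order: the lock is dropped first, so the interrupt fits in the gap. */
static bool busy_path_unlock_then_stop(struct hwq *q)
{
    /* the unlock would happen here */
    irq_complete_and_kick(q);   /* interrupt runs inside the window */
    q->stopped = true;          /* the stop lands after the kick */
    return queue_is_stuck(q);
}

/* New order: stop under the lock, the interrupt can only run afterwards. */
static bool busy_path_stop_then_unlock(struct hwq *q)
{
    q->stopped = true;          /* stop while still holding the lock */
    /* the unlock would happen here */
    irq_complete_and_kick(q);   /* interrupt now sees the stopped queue */
    return queue_is_stuck(q);
}
```

Starting from a running queue with a few requests in flight, the first
function leaves the queue stopped with an empty ring, while the second one
ends with the queue running again.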




Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 11:26, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>>> Perpcu uses a clever design where the .percu ELF section has a virtual
>>> address of zero and the relocation code avoid relocating specific
>>> symbols. It makes the code simple and easily adaptable with or without
>>> SMP support.
>>>
>>> This design is incompatible with PIE because generated code always try to
>>> access the zero virtual address relative to the default mapping address.
>>> It becomes impossible when KASLR is configured to go below -2G. This
>>> patch solves this problem by removing the zero mapping and adapting the GS
>>> base to be relative to the expected address. These changes are done only
>>> when PIE is enabled. The original implementation is kept as-is
>>> by default.
>>
>> The reason the per-cpu section is zero-based on x86-64 is to
>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
> 
> Ok, that make sense. I don't want this feature to not work with
> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
> entry for gs so gs:40 points to the correct memory address and
> gs:[rip+XX] works correctly through the MSR.

What are you talking about?  A GDT entry and the MSR do the same thing,
except that a GDT entry is limited to an offset of 0-0x (which
doesn't work for us, obviously.)

> Given the separate
> discussion on mcmodel, I am going first to check if we can move from
> PIE to PIC with a mcmodel=small or medium that would remove the percpu
> change requirement. I tried before without success but I understand
> better percpu and other components so maybe I can make it work.

>> This is silly.  The right thing is for PIE is to be explicitly absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is just
>> an optimization.
> 
> I agree that it is odd but that's how the compiler generates code. I
> will re-explore PIC options with mcmodel=small or medium, as mentioned
> on other threads.

Why should the way compiler generates code affect the way we do things
in assembly?

That being said, the compiler now has support for generating this kind
of code explicitly via the __seg_gs pointer modifier.  That should let
us drop the __percpu_prefix and just use variables directly.  I suspect
we want to declare percpu variables as "volatile __seg_gs" to account
for the possibility of CPU switches.

Older compilers won't be able to work with this, of course, but I think
that it is acceptable for those older compilers to not be able to
support PIE.

-hpa




Re: [Xen-devel] [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread H. Peter Anvin
On 07/19/17 15:47, Thomas Garnier wrote:
> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin  wrote:
>> On 07/18/17 15:33, Thomas Garnier wrote:
>>> The x86 relocation tool generates a list of 32-bit signed integers. There
>>> was no need to use 64-bit integers because all addresses where above the 2G
>>> top of the memory.
>>>
>>> This change add a large-reloc option to generate 64-bit unsigned integers.
>>> It can be used when the kernel plan to go below the top 2G and 32-bit
>>> integers are not enough.
>>
>> Why on Earth?  This would only be necessary if the *kernel itself* was
>> more than 2G, which isn't going to happen for the forseeable future.
> 
> Because the relocation integer is an absolute address, not an offset
> in the binary. Next iteration, I can try using a 32-bit offset for
> everyone.

It is an absolute address *as the kernel was originally linked*, for
obvious reasons.

-hpa




Re: [Xen-devel] [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread Thomas Garnier
On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin  wrote:
> On 07/19/17 15:47, Thomas Garnier wrote:
>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin  wrote:
>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>> The x86 relocation tool generates a list of 32-bit signed integers. There
>>>> was no need to use 64-bit integers because all addresses where above the 2G
>>>> top of the memory.
>>>>
>>>> This change add a large-reloc option to generate 64-bit unsigned integers.
>>>> It can be used when the kernel plan to go below the top 2G and 32-bit
>>>> integers are not enough.
>>>
>>> Why on Earth?  This would only be necessary if the *kernel itself* was
>>> more than 2G, which isn't going to happen for the forseeable future.
>>
>> Because the relocation integer is an absolute address, not an offset
>> in the binary. Next iteration, I can try using a 32-bit offset for
>> everyone.
>
> It is an absolute address *as the kernel was originally linked*, for
> obvious reasons.

Sure when the kernel was just above 0x8000, it doesn't
work when it goes down to 0x. That's why using an
offset might make more sense in general.
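
To make the difference concrete, here is a small sketch, assuming each
relocation entry is stored as a 32-bit signed integer and sign-extended to
64 bits when it is used. The function names are hypothetical, and any
addresses passed in are illustrative values, not numbers taken from this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a 64-bit address survives being stored as a sign-extended
 * 32-bit entry, i.e. bits 63..31 are all zeros or all ones.
 */
static bool addr_fits_in_s32(uint64_t addr)
{
    uint64_t top = addr >> 31;              /* the upper 33 bits */
    return top == 0 || top == 0x1ffffffffULL;
}

/*
 * True when the address can be stored as a non-negative 32-bit offset
 * from the address the image was linked at.
 */
static bool offset_fits_in_s32(uint64_t addr, uint64_t link_base)
{
    uint64_t off = addr - link_base;        /* wraps if addr < link_base */
    return off <= (uint64_t)INT32_MAX;
}
```

An address in the top 2G, for example 0xffffffff81000000, passes the first
check. An address linked 2G lower, for example 0xffffffff01000000, fails it,
but still fits as an offset from a link base of 0xffffffff00000000.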

>
> -hpa
>



-- 
Thomas.



Re: [Xen-devel] [RFC 07/22] x86: relocate_kernel - Adapt assembly for PIE support

2017-07-19 Thread Thomas Garnier
On Wed, Jul 19, 2017 at 3:58 PM, H. Peter Anvin  wrote:
> On 07/18/17 15:33, Thomas Garnier wrote:
>> Change the assembly code to use only relative references of symbols for the
>> kernel to be PIE compatible.
>>
>> Position Independent Executable (PIE) support will allow to extended the
>> KASLR randomization range below the -2G memory limit.
>>
>> Signed-off-by: Thomas Garnier 
>> ---
>>  arch/x86/kernel/relocate_kernel_64.S | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/relocate_kernel_64.S 
>> b/arch/x86/kernel/relocate_kernel_64.S
>> index 98111b38ebfd..da817d1628ac 100644
>> --- a/arch/x86/kernel/relocate_kernel_64.S
>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>> @@ -186,7 +186,7 @@ identity_mapped:
>>   movq%rax, %cr3
>>   lea PAGE_SIZE(%r8), %rsp
>>   callswap_pages
>> - movq$virtual_mapped, %rax
>> + leaqvirtual_mapped(%rip), %rax
>>   pushq   %rax
>>   ret
>>
>
> This is completely wrong.  The whole point is that %rip here is on an
> identity-mapped page, which means that its offset to the actual symbol
> is ill-defined.
>
> The use of pushq/ret to do an indirect jump is bizarre, though, instead of:
>
> pushq %r8
> ret
>
> one ought to simply do
>
> jmpq *%r8
>
> I think the author of this code was confused by the fact that we have to
> use this construct to do a *far* jump.
>
> There are some other very bizarre constructs in this file, that I can
> only assume comes from clumsy porting from 32 bits, for example:
>
> call 1f
> 1:
> popq %r8
> subq $(1b - relocate_kernel), %r8
>
> ... instead of the much simpler ...
>
> leaq relocate_kernel(%rip), %r8
>
> With this value in %r8 anyway, you can simply do:
>
> leaq (virtual_mapped - relocate_kernel)(%r8), %rax
> jmpq *%rax
>

Thanks I will look into that.

> This patchset scares me.  There seems to be a lot of places where you
> have not been very aware of what is actually happening in the code, nor
> have done research about how the ABIs actually work and affect things.

There is a lot of assembly that needed to be changed. It was easier to
understand parts that are directly exercised like boot or percpu.
That's why I value people's feedback and will improve the patchset.

Thanks!

>
> Sorry.
>
> -hpa



-- 
Thomas



Re: [Xen-devel] [RFC 07/22] x86: relocate_kernel - Adapt assembly for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/18/17 15:33, Thomas Garnier wrote:
> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible.
> 
> Position Independent Executable (PIE) support will allow to extended the
> KASLR randomization range below the -2G memory limit.
> 
> Signed-off-by: Thomas Garnier 
> ---
>  arch/x86/kernel/relocate_kernel_64.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/relocate_kernel_64.S 
> b/arch/x86/kernel/relocate_kernel_64.S
> index 98111b38ebfd..da817d1628ac 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -186,7 +186,7 @@ identity_mapped:
>   movq%rax, %cr3
>   lea PAGE_SIZE(%r8), %rsp
>   callswap_pages
> - movq$virtual_mapped, %rax
> + leaqvirtual_mapped(%rip), %rax
>   pushq   %rax
>   ret
>  

This is completely wrong.  The whole point is that %rip here is on an
identity-mapped page, which means that its offset to the actual symbol
is ill-defined.

The use of pushq/ret to do an indirect jump is bizarre, though, instead of:

pushq %r8
ret

one ought to simply do

jmpq *%r8

I think the author of this code was confused by the fact that we have to
use this construct to do a *far* jump.

There are some other very bizarre constructs in this file, that I can
only assume comes from clumsy porting from 32 bits, for example:

call 1f
1:
popq %r8
subq $(1b - relocate_kernel), %r8

... instead of the much simpler ...

leaq relocate_kernel(%rip), %r8

With this value in %r8 anyway, you can simply do:

leaq (virtual_mapped - relocate_kernel)(%r8), %rax
jmpq *%rax

This patchset scares me.  There seems to be a lot of places where you
have not been very aware of what is actually happening in the code, nor
have done research about how the ABIs actually work and affect things.

Sorry.

-hpa



Re: [Xen-devel] [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread Ard Biesheuvel
On 19 July 2017 at 23:27, H. Peter Anvin  wrote:
> On 07/19/17 08:40, Thomas Garnier wrote:
>>>
>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>> per-cpu section is an absolute, zero-based section and not subject to
>>> relocation.
>>
>> PIE does not respect the zero-based section, it tries to have
>> everything relative. Patch 16/22 also adapt per-cpu to work with PIE
>> (while keeping the zero absolute design by default).
>>
>
> This is silly.  The right thing is for PIE is to be explicitly absolute,
> without (%rip).  The use of (%rip) memory references for percpu is just
> an optimization.
>

Sadly, there is an issue in binutils that may prevent us from doing
this as cleanly as we would want.

For historical reasons, bfd.ld emits special symbols like
__GLOBAL_OFFSET_TABLE__ as absolute symbols with a section index of
SHN_ABS, even though it is quite obvious that they are relative like
any other symbol that points into the image. Unfortunately, this means
that binutils needs to emit R_X86_64_RELATIVE relocations even for
SHN_ABS symbols, which means we lose the ability to use both absolute
and relocatable symbols in the same PIE image (unless the reloc tool
can filter them out)

More info here:
https://sourceware.org/bugzilla/show_bug.cgi?id=19818



Re: [Xen-devel] [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread H. Peter Anvin
On 07/18/17 15:33, Thomas Garnier wrote:
> The x86 relocation tool generates a list of 32-bit signed integers. There
> was no need to use 64-bit integers because all addresses where above the 2G
> top of the memory.
> 
> This change add a large-reloc option to generate 64-bit unsigned integers.
> It can be used when the kernel plan to go below the top 2G and 32-bit
> integers are not enough.

Why on Earth?  This would only be necessary if the *kernel itself* was
more than 2G, which isn't going to happen for the foreseeable future.

-hpa




Re: [Xen-devel] [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 08:40, Thomas Garnier wrote:
>>
>> This doesn't look right.  It's accessing a per-cpu variable.  The
>> per-cpu section is an absolute, zero-based section and not subject to
>> relocation.
> 
> PIE does not respect the zero-based section, it tries to have
> everything relative. Patch 16/22 also adapt per-cpu to work with PIE
> (while keeping the zero absolute design by default).
> 

This is silly.  The right thing is for PIE is to be explicitly absolute,
without (%rip).  The use of (%rip) memory references for percpu is just
an optimization.

-hpa




Re: [Xen-devel] [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations

2017-07-19 Thread Thomas Garnier
On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin  wrote:
> On 07/18/17 15:33, Thomas Garnier wrote:
>> The x86 relocation tool generates a list of 32-bit signed integers. There
>> was no need to use 64-bit integers because all addresses where above the 2G
>> top of the memory.
>>
>> This change add a large-reloc option to generate 64-bit unsigned integers.
>> It can be used when the kernel plan to go below the top 2G and 32-bit
>> integers are not enough.
>
> Why on Earth?  This would only be necessary if the *kernel itself* was
> more than 2G, which isn't going to happen for the forseeable future.

Because the relocation integer is an absolute address, not an offset
in the binary. Next iteration, I can try using a 32-bit offset for
everyone.
>
> -hpa
>



-- 
Thomas



Re: [Xen-devel] [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread Thomas Garnier
On Wed, Jul 19, 2017 at 3:27 PM, H. Peter Anvin  wrote:
> On 07/19/17 08:40, Thomas Garnier wrote:
>>>
>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>> per-cpu section is an absolute, zero-based section and not subject to
>>> relocation.
>>
>> PIE does not respect the zero-based section, it tries to have
>> everything relative. Patch 16/22 also adapt per-cpu to work with PIE
>> (while keeping the zero absolute design by default).
>>
>
> This is silly.  The right thing is for PIE is to be explicitly absolute,
> without (%rip).  The use of (%rip) memory references for percpu is just
> an optimization.

I agree that it is odd but that's how the compiler generates code. I
will re-explore PIC options with mcmodel=small or medium, as mentioned
on other threads.

>
> -hpa
>



-- 
Thomas



[Xen-devel] [libvirt test] 112002: tolerable all pass - PUSHED

2017-07-19 Thread osstest service owner
flight 112002 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112002/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  like 111966
 test-armhf-armhf-libvirt-xsm  14 saverestore-support-check  fail  like 111966
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  like 111966
 test-amd64-i386-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-qcow2  12 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-qcow2  13 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt  13 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-raw  12 migrate-support-check  fail  never pass

version targeted for testing:
 libvirt  b494e09d058f09b48d0fd8855edd557101294671
baseline version:
 libvirt  dae23ec3456011f86086db76d45d8d0d266f7b9f

Last test of basis   111966  2017-07-18 04:24:43 Z    1 days
Testing same since   112002  2017-07-19 04:21:09 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Boris Fiuczynski 
  Jim Fehlig 
  John Ferlan 
  Michal Privoznik 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-arm64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair  pass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master

[Xen-devel] [linux-next test] 112009: regressions - FAIL

2017-07-19 Thread osstest service owner
flight 112009 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112009/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-xsm  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-pair  10 xen-boot/src_host  fail REGR. vs. 111972
 test-amd64-i386-pair  11 xen-boot/dst_host  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-freebsd10-i386  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-libvirt-pair  10 xen-boot/src_host  fail REGR. vs. 111972
 test-amd64-i386-libvirt-pair  11 xen-boot/dst_host  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-examine  7 reboot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-amd64-rumprun-amd64  7 xen-boot  fail REGR. vs. 111972
 test-amd64-amd64-pair  10 xen-boot/src_host  fail REGR. vs. 111972
 test-amd64-amd64-pair  11 xen-boot/dst_host  fail REGR. vs. 111972
 test-amd64-amd64-examine  7 reboot  fail REGR. vs. 111972
 test-amd64-i386-rumprun-i386  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-xl-raw  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-libvirt-xsm  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot  fail REGR. vs. 111972
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 111972
 build-arm64-pvops  6 kernel-build  fail REGR. vs. 111972
 test-amd64-amd64-xl-qemuu-win7-amd64  16 guest-localmigrate/x10  fail REGR. vs. 111972
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 111972

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  18 guest-start/win.repeat  fail  blocked in 111972
 test-amd64-amd64-libvirt-pair  21 guest-start/debian  fail  like 111972
 test-armhf-armhf-libvirt-xsm  14 saverestore-support-check  fail  like 111972
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  like 111972
 test-amd64-amd64-amd64-pvgrub  7 xen-boot  fail  like 111972
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  like 111972
 test-amd64-amd64-xl-rtds  10 debian-install  fail  like 111972
 test-armhf-armhf-xl-rtds  16 guest-start/debian.repeat  fail  like 111972
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-qemut-ws16-amd64  10 windows-install  fail  never pass
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail  never pass
 test-armhf-armhf-xl-multivcpu  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  14 saverestore-support-check  fail  never pass
 

[Xen-devel] [PATCH] hvmloader, libxl: use the correct ACPI settings depending on device model

2017-07-19 Thread Igor Druzhinin
We need to choose ACPI tables and ACPI IO port location
properly depending on the device model version we are running.
Previously, this decision was made by BIOS type specific
code in hvmloader, e.g. always load QEMU traditional specific
tables if it's ROMBIOS and always load QEMU Xen specific
tables if it's SeaBIOS.

This change preserves that behavior but adds an additional way
(xenstore key) to specify the correct device model if we
happen to run a non-default one. Toolstack bit makes use of it.

Signed-off-by: Igor Druzhinin 
---
 tools/firmware/hvmloader/hvmloader.c |  2 --
 tools/firmware/hvmloader/ovmf.c  |  2 ++
 tools/firmware/hvmloader/rombios.c   |  2 ++
 tools/firmware/hvmloader/seabios.c   |  3 +++
 tools/firmware/hvmloader/util.c  | 24 
 tools/libxl/libxl_create.c   |  2 ++
 6 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/firmware/hvmloader/hvmloader.c b/tools/firmware/hvmloader/hvmloader.c
index f603f68..db11ab1 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -405,8 +405,6 @@ int main(void)
 }
 
 acpi_enable_sci();
-
-hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 1);
 }
 
 init_vm86_tss();
diff --git a/tools/firmware/hvmloader/ovmf.c b/tools/firmware/hvmloader/ovmf.c
index 4ff7f1d..ebadc64 100644
--- a/tools/firmware/hvmloader/ovmf.c
+++ b/tools/firmware/hvmloader/ovmf.c
@@ -127,6 +127,8 @@ static void ovmf_acpi_build_tables(void)
 .dsdt_15cpu_len = 0
 };
 
+hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 1);
+
 hvmloader_acpi_build_tables(&config, ACPI_PHYSICAL_ADDRESS);
 }
 
diff --git a/tools/firmware/hvmloader/rombios.c b/tools/firmware/hvmloader/rombios.c
index 56b39b7..31a7c65 100644
--- a/tools/firmware/hvmloader/rombios.c
+++ b/tools/firmware/hvmloader/rombios.c
@@ -181,6 +181,8 @@ static void rombios_acpi_build_tables(void)
 .dsdt_15cpu_len = dsdt_15cpu_len,
 };
 
+hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 0);
+
 hvmloader_acpi_build_tables(&config, ACPI_PHYSICAL_ADDRESS);
 }
 
diff --git a/tools/firmware/hvmloader/seabios.c b/tools/firmware/hvmloader/seabios.c
index 870576a..5878eff 100644
--- a/tools/firmware/hvmloader/seabios.c
+++ b/tools/firmware/hvmloader/seabios.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 
 extern unsigned char dsdt_anycpu_qemu_xen[];
 extern int dsdt_anycpu_qemu_xen_len;
@@ -99,6 +100,8 @@ static void seabios_acpi_build_tables(void)
 .dsdt_15cpu_len = 0,
 };
 
+hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 1);
+
 hvmloader_acpi_build_tables(&config, rsdp);
 add_table(rsdp);
 }
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index db5f240..45b777c 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -31,6 +31,9 @@
 #include 
 #include 
 
+extern unsigned char dsdt_anycpu_qemu_xen[], dsdt_anycpu[], dsdt_15cpu[];
+extern int dsdt_anycpu_qemu_xen_len, dsdt_anycpu_len, dsdt_15cpu_len;
+
 /*
  * Check whether there exists overlap in the specified memory range.
  * Returns true if exists, else returns false.
@@ -897,6 +900,27 @@ void hvmloader_acpi_build_tables(struct acpi_config *config,
 /* Allocate and initialise the acpi info area. */
 mem_hole_populate_ram(ACPI_INFO_PHYSICAL_ADDRESS >> PAGE_SHIFT, 1);
 
+/* If the device model is specified switch to the corresponding tables */
+s = xenstore_read("platform/device-model", "");
+if ( !strncmp(s, "qemu_xen_traditional", 21) )
+{
+config->dsdt_anycpu = dsdt_anycpu;
+config->dsdt_anycpu_len = dsdt_anycpu_len;
+config->dsdt_15cpu = dsdt_15cpu;
+config->dsdt_15cpu_len = dsdt_15cpu_len;
+
+hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 0);
+}
+else if ( !strncmp(s, "qemu_xen", 10) )
+{
+config->dsdt_anycpu = dsdt_anycpu_qemu_xen;
+config->dsdt_anycpu_len = dsdt_anycpu_qemu_xen_len;
+config->dsdt_15cpu = NULL;
+config->dsdt_15cpu_len = 0;
+
+hvm_param_set(HVM_PARAM_ACPI_IOPORTS_LOCATION, 1);
+}
+
 config->lapic_base_address = LAPIC_BASE_ADDRESS;
 config->lapic_id = acpi_lapic_id;
 config->ioapic_base_address = ioapic_base_address;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 1158303..8dc8186 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -472,6 +472,8 @@ int libxl__domain_build(libxl__gc *gc,
info->u.hvm.mmio_hole_memkb << 10);
 }
 }
+localents[i++] = "platform/device-model";
+localents[i++] = (char *) libxl_device_model_version_to_string(info->device_model_version);
 
 break;
 case LIBXL_DOMAIN_TYPE_PV:
-- 
2.7.4
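
For readers following the util.c hunk, the selection it performs can be
sketched in isolation. This is a standalone illustration under assumptions,
not hvmloader code: `enum acpi_flavour` and `pick_acpi_flavour()` are
hypothetical names, and exact matching with strcmp() is a simplification
(the patch itself uses strncmp() with explicit lengths).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the decision made in the patch. */
enum acpi_flavour {
    ACPI_FLAVOUR_DEFAULT,     /* key absent or unknown: keep the BIOS-specific choice */
    ACPI_FLAVOUR_TRADITIONAL, /* qemu_xen_traditional: dsdt_anycpu/dsdt_15cpu, location 0 */
    ACPI_FLAVOUR_QEMU_XEN     /* qemu_xen: dsdt_anycpu_qemu_xen, location 1 */
};

/* Map the value of the "platform/device-model" key to a table flavour. */
static enum acpi_flavour pick_acpi_flavour(const char *device_model)
{
    if ( device_model == NULL )
        return ACPI_FLAVOUR_DEFAULT;

    if ( strcmp(device_model, "qemu_xen_traditional") == 0 )
        return ACPI_FLAVOUR_TRADITIONAL;

    if ( strcmp(device_model, "qemu_xen") == 0 )
        return ACPI_FLAVOUR_QEMU_XEN;

    return ACPI_FLAVOUR_DEFAULT;
}
```

An absent or unrecognised key falls through to the default, which
corresponds to keeping whatever the BIOS-specific code already selected.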




Re: [Xen-devel] [RFC PATCH] tools/libxl : add struct and parsing utils for the 'static_shm' xl config entry

2017-07-19 Thread Stefano Stabellini
On Wed, 19 Jul 2017, Zhongze Liu wrote:
> Add a new struct libxl_static_shm in the libxl IDL for the proposed new xl
> config entry 'static_shm' (see [1]), which allows the user to set up shared
> memory areas among several VMs for communication.
> 
> Add related parsing code to the libxl/libxlu_* family and xl/xl_parse.c
> 
> [1]: [RFC v3]Proposal to allow setting up shared memory areas between VMs 
> from xl config file,
>  
> https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg01741.html
> 
> Signed-off-by: Zhongze Liu 
> ---
> Cc: Wei Liu 
> Cc: Ian Jackson 
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> Cc: xen-devel@lists.xen.org
> ---
>  tools/libxl/Makefile|   2 +-
>  tools/libxl/libxl.h |  10 ++
>  tools/libxl/libxl_types.idl |  52 +
>  tools/libxl/libxlu_sshm.c   | 274 
> 
>  tools/libxl/libxlutil.h |   6 +
>  tools/xl/xl_parse.c |  24 +++-
>  6 files changed, 366 insertions(+), 2 deletions(-)
>  create mode 100644 tools/libxl/libxlu_sshm.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index 2ffb78f5c4..b7effb188b 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -175,7 +175,7 @@ AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h _libxl_list.h _paths.h \
>  AUTOSRCS= libxlu_cfg_y.c libxlu_cfg_l.c
>  AUTOSRCS += _libxl_save_msgs_callout.c _libxl_save_msgs_helper.c
>  LIBXLU_OBJS = libxlu_cfg_y.o libxlu_cfg_l.o libxlu_cfg.o \
> - libxlu_disk_l.o libxlu_disk.o libxlu_vif.o libxlu_pci.o
> + libxlu_disk_l.o libxlu_disk.o libxlu_vif.o libxlu_pci.o libxlu_sshm.o
>  $(LIBXLU_OBJS): CFLAGS += $(CFLAGS_libxenctrl) # For xentoollog.h
>  
>  $(TEST_PROG_OBJS) _libxl.api-for-check: CFLAGS += $(CFLAGS_libxentoollog)
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 7cf0f31f68..cf3cbe1ba1 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -2228,6 +2228,16 @@ int libxl_fd_set_nonblock(libxl_ctx *ctx, int fd, int nonblock);
>  int libxl_qemu_monitor_command(libxl_ctx *ctx, uint32_t domid,
> const char *command_line, char **output);
>  
> +
> +/* Functions to stattically set up shared memory regions between two  domains
 ^ statically  ^double space


> + * for shm-based communication. */
> +
> +#define LIBXL_SSHM_RANGE_UNKNOWN UINT64_MAX
> +
> +/* TODO: int libxl_sshm_add(libxl_ctx *ctx, uint32_t domid,
> + *  libxl_static_shm *sshm);
> + */
> +
>  #include 
>  
>  #endif /* LIBXL_H */
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 8a9849c643..8c68b45add 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -779,6 +779,57 @@ libxl_device_channel = Struct("device_channel", [
> ])),
>  ])
>  
> +# static shared memory cacheability attributes
> +libxl_sshm_cacheattr = Enumeration("sshm_cacheattr", [
> +(-1, "UNKNOWN"),
> +(0, "UC"),
> +(1, "WC"),  #x86 only
> +(4, "WT"),
> +(5, "WP"),  #x86 only
> +(6, "WB"),
> +(7, "SUC"), #x86 only
> +(8, "BUFFERABLE"),  #ARM only
> +(9, "WA"),  #ARM only
> +], init_val = "LIBXL_SSHM_CACHEATTR_UNKNOWN")

I would only specify UNKNOWN and WB for now.


> +# static shared memory shareability attributes
> +libxl_sshm_shareattr = Enumeration("sshm_shareattr", [
> +(-1, "UNKNOWN"),
> +(0, "NON"),
> +(2, "OUTER"),
> +(3, "INNER"),
> +], init_val = "LIBXL_SSHM_SHAREATTR_UNKNOWN")
> +
> +libxl_sshm_prot = Enumeration("sshm_prot", [
> +(-1, "UNKNOWN"),
> +(0, "N"),
> +(1, "R"),
> +(2, "W"),
> +(4, "X"),
> +(3, "RW"),
> +(5, "RX"),
> +(6, "WX"),
> +(7, "RWX"),
> +], init_val = "LIBXL_SSHM_PROT_UNKNOWN")
> +
> +libxl_sshm_role = Enumeration("sshm_role", [
> +(-1, "UNKNOWN"),
> +(0, "MASTER"),
> +(1, "SLAVE"),
> +], init_val = "LIBXL_SSHM_ROLE_UNKNOWN")
> +
> +libxl_static_shm = Struct("static_shm", [
> +("id", string),
> +("begin", uint64, {'init_val': 'LIBXL_SSHM_RANGE_UNKNOWN'}),
> +("end", uint64, {'init_val': 'LIBXL_SSHM_RANGE_UNKNOWN'}),
> +("prot", libxl_sshm_prot),
> +("arm_shareattr", libxl_sshm_shareattr),
> +("arm_inner_cacheattr", libxl_sshm_cacheattr),
> +("arm_outer_cacheattr", libxl_sshm_cacheattr),

I would have a single arm_cacheattr


> +("x86_cacheattr", libxl_sshm_cacheattr),
> +("role", libxl_sshm_role),
> +])
> +
>  libxl_domain_config = Struct("domain_config", [
>  ("c_info", libxl_domain_create_info),
>  ("b_info", libxl_domain_build_info),
> @@ -797,6 +848,7 @@ libxl_domain_config = Struct("domain_config", [
>  ("channels", Array(libxl_device_channel, "num_channels")),
>  ("usbctrls", 

Re: [Xen-devel] [kernel-hardening] Re: x86: PIE support and option to extend KASLR randomization

2017-07-19 Thread Kees Cook
On Wed, Jul 19, 2017 at 7:08 AM, Christopher Lameter  wrote:
> On Tue, 18 Jul 2017, Thomas Garnier wrote:
>
>> Performance/Size impact:
>> Hackbench (50% and 1600% loads):
>>  - PIE enabled: 7% to 8% on half load, 10% on heavy load.
>> slab_test (average of 10 runs):
>>  - PIE enabled: 3% to 4%
>> Kernbench (average of 10 Half and Optimal runs):
>>  - PIE enabled: 5% to 6%
>>
>> Size of vmlinux (Ubuntu configuration):
>>  File size:
>>  - PIE disabled: 472928672 bytes (-0.000169% from baseline)
>>  - PIE enabled: 216878461 bytes (-54.14% from baseline)
>
> Maybe we need something like CONFIG_PARANOIA so that we can determine at
> build time how much performance we want to sacrifice for security?
>
> It's going to be difficult to understand what all these hardening config
> options do.

This kind of thing got discussed recently, and like
CONFIG_EXPERIMENTAL, a global config doesn't really work. The best
thing to do is to document each config as well as possible and system
builders can decide.

-Kees

-- 
Kees Cook
Pixel Security



[Xen-devel] [xen-unstable-smoke test] 112023: tolerable trouble: broken/pass - PUSHED

2017-07-19 Thread osstest service owner
flight 112023 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112023/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail  never pass

version targeted for testing:
 xen  7868654ff7fe5e4a2eeae2b277644fa884a5031e
baseline version:
 xen  5efaeaa8235d9f16fa2711efe22b8f2bd54a182b

Last test of basis   112017  2017-07-19 15:02:34 Z    0 days
Testing same since   112023  2017-07-19 18:01:09 Z    0 days    1 attempts


People who touched revisions under test:
  Owen Smith 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=7868654ff7fe5e4a2eeae2b277644fa884a5031e
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
7868654ff7fe5e4a2eeae2b277644fa884a5031e
+ branch=xen-unstable-smoke
+ revision=7868654ff7fe5e4a2eeae2b277644fa884a5031e
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' x7868654ff7fe5e4a2eeae2b277644fa884a5031e = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : 

Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-19 Thread Stefano Stabellini
On Wed, 19 Jul 2017, Zhongze Liu wrote:
> 
> 1. Motivation and Description
> 
> Virtual machines use grant table hypercalls to set up a shared page for
> inter-VM communication. These hypercalls are used by all PV
> protocols today. However, very simple guests, such as baremetal
> applications, might not have the infrastructure to handle the grant table.
> This project is about setting up several shared memory areas for inter-VM
> communication directly from the VM config file, so that the guest kernel
> doesn't have to have grant table support (in the embedded space, this is
> not unusual) to be able to communicate with other guests.
> 
> 
> 2. Implementation Plan:
> 
> 
> ==
> 2.1 Introduce a new VM config option in xl:
> ==
> 
> 2.1.1 Design Goals
> ~~~
> 
> The shared areas should be shareable among several (>=2) VMs, so every shared
> physical memory area is assigned to a set of VMs. Therefore, a “token” or
> “identifier” should be used here to uniquely identify a backing memory area.
> A string no longer than 128 bytes is used here to serve the purpose.
> 
> The backing area would be taken from one domain, which we will regard
> as the "master domain", and this domain should be created prior to any
> other "slave domain"s. Again, we have to use some kind of tag to tell who
> is the "master domain".
> 
> And the ability to specify the permissions and cacheability (and shareability
> for arm HVM's) of the pages to be shared should be also given to the user.
> 
> 2.2.2 Syntax and Behavior
> ~
> The following example illustrates the syntax of the proposed config entry:
> 
> In xl config file of vm1:
> 
>static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
>arm_shareattr=inner, arm_inner_cacheattr=wb,
>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=ro',
> 
>'id=ID2, begin=0x30, end=0x40, role=master,
>arm_shareattr=inner, arm_inner_cacheattr=wb,
>arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=rw' ]

Probably not a good idea to mix x86 and arm attributes in the example :-)
Just make a couple of examples instead.


> In xl config file of vm2:
> 
> static_shm = [ 'id=ID1, begin=0x50, end=0x60, role=slave, prot=ro' ]
> 
> In xl config file of vm3:
> 
> static_shm = [ 'id=ID2, begin=0x70, end=0x80, role=slave, prot=ro' ]
> 
> where:
>   @id                   can be any string that matches the regexp "[^ \t\n,]+"
>                         and is no longer than 128 characters
>   @begin/end            can be decimals or hexadecimals of the form "0x2".
>   @role                 can only be 'master' or 'slave'
>   @prot                 can be 'n', 'r', 'ro', 'w', 'wo', 'x', 'xo', 'rw', 'rx',
>                         'wx' or 'rwx'. Default is 'rw'.
>   @arm_shareattr        can be 'inner' or 'outer', this will be ignored and
>                         a warning will be printed out to the screen if it
>                         is specified in an x86 HVM config file.
>                         Default is 'inner'
>   @arm_outer_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa', this will
>                         be ignored and a warning will be printed out to the
>                         screen if it is specified in an x86 HVM config file.
>                         Default is 'inner'
>   @arm_inner_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa'. Default
>                         is 'wb'.

I don't think we need both @arm_outer_cacheattr and
@arm_inner_cacheattr: a single @arm_cacheattr should suffice.

Also, we need to explain what each of these values mean. Instead, I
would only say that today we only support write-back:

@arm_cacheattr  Only 'wb' (write-back) is supported today.

In the code I would check that arm_cacheattr is either missing, or set
to 'wb'. Throw an error in all other cases.
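
Something along these lines would do (an untested sketch; the function
name sshm_check_cacheattr() is made up for illustration, and a NULL
argument is assumed to mean "option not given"):

```c
#include <stddef.h>
#include <string.h>

/*
 * Accept a cacheability attribute only if it is missing or 'wb'.
 * Returns 0 when the value is acceptable, -1 otherwise.
 */
static int sshm_check_cacheattr(const char *value)
{
    if ( value == NULL )            /* not specified: write-back is implied */
        return 0;

    if ( strcmp(value, "wb") == 0 ) /* the only supported value today */
        return 0;

    return -1;                      /* everything else is an error */
}
```

The same helper could serve both the ARM and the x86 option while 'wb' is
the only value supported on either.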


>   @x86_cacheattr        can be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'. Default
>                         is 'wb'.

Also here, I would write:

@x86_cacheattr  Only 'wb' (write-back) is supported today.

Like you wrote later, begin and end addresses need to be multiple of 4K.
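
As a rough sketch of what the parser could do for @begin/@end (untested;
sshm_parse_addr() and SSHM_PAGE_SIZE are hypothetical names, and the 4K
granularity is hard-coded here as an assumption):

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SSHM_PAGE_SIZE 4096ULL /* assumed 4K granularity */

/*
 * Parse a decimal or 0x-prefixed address and require 4K alignment.
 * Returns 0 and stores the value in *out on success, -1 on any error.
 */
static int sshm_parse_addr(const char *str, uint64_t *out)
{
    char *end;
    unsigned long long val;

    /* Insist on a leading digit so strtoull() cannot skip whitespace
     * or silently accept a sign. */
    if ( str == NULL || !isdigit((unsigned char)str[0]) )
        return -1;

    errno = 0;
    /* Base 0 accepts decimal and 0x... hex (note: a leading 0 means octal). */
    val = strtoull(str, &end, 0);
    if ( errno != 0 || *end != '\0' )
        return -1;

    if ( val % SSHM_PAGE_SIZE != 0 )
        return -1;

    *out = val;
    return 0;
}
```

With this, "0x1000" and "8192" are accepted, while "0x10", "abc" and
"0x1000z" are rejected.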


> Besides, the sizes of the areas specified by @begin and @end in the slave
> domain's config file should be smaller than the corresponding sizes specified
> in its master's domain. And overlapping backing memory areas are allowed.
> 
> In the example above. A memory area ID1 will be shared between vm1 and vm2.
> This area will be taken from vm1 and mapped into vm2's stage-2 page table.
> The parameter "prot=ro" means that this memory area are 

Re: [Xen-devel] [RFC 13/22] x86/power/64: Adapt assembly for PIE support

2017-07-19 Thread Pavel Machek
On Tue 2017-07-18 15:33:24, Thomas Garnier wrote:
> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible.
> 
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below the -2G memory limit.
> 
> Signed-off-by: Thomas Garnier 

Acked-by: Pavel Machek 

(But not tested; testing it would be nice).
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Parse memory node and fetch numa-node-id information.
For each memory range, store in node_memblk_range[]
along with node id.

When booting in UEFI mode, UEFI passes memory information
to Dom0 using EFI memory descriptor table and deletes the
memory nodes from the host DT. However to fetch the memory
numa node id, memory DT node should not be deleted by EFI stub.
With this patch, do not delete memory node from FDT.

NUMA info of memory is extracted from process_memory_node()
instead of parsing the DT again during numa_init().


This patch does too much and needs to be split. The splitting would be 
at least:


- EFI mode change
- Numa change



Signed-off-by: Vijaya Kumar K 
---
v3: - Set numa_off in numa_failed() and drop dt_numa variable
---
 xen/arch/arm/bootfdt.c  | 25 +
 xen/arch/arm/efi/efi-boot.h | 25 -
 xen/arch/arm/numa/dt_numa.c | 32 
 xen/arch/arm/numa/numa.c|  5 +
 xen/include/asm-arm/numa.h  |  2 ++
 5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 6e8251b..b3a132c 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -13,6 +13,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 


Please add the headers in alphabetical order.


 #include 
 #include 

@@ -146,6 +148,9 @@ static void __init process_memory_node(const void *fdt, int node,
 const __be32 *cell;
 paddr_t start, size;
 u32 reg_cells = address_cells + size_cells;
+#ifdef CONFIG_NUMA
+uint32_t nid;
+#endif

 if ( address_cells < 1 || size_cells < 1 )
 {
@@ -154,24 +159,36 @@ static void __init process_memory_node(const void *fdt, int node,
 return;
 }

+#ifdef CONFIG_NUMA
+nid = device_tree_get_u32(fdt, node, "numa-node-id", NR_NODE_MEMBLKS);


Shouldn't you use MAX_NUMNODES rather than NR_NODE_MEMBLKS?

Also, where is the sanity check?
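
What I have in mind is something like the following (a sketch only, not
compiled against the tree; dt_numa_nid_valid() is a made-up helper and the
MAX_NUMNODES value below is a placeholder so the snippet is
self-contained):

```c
#include <stdint.h>

#define MAX_NUMNODES 64 /* placeholder value for this sketch only */

/*
 * With MAX_NUMNODES passed as the default to device_tree_get_u32(), a
 * missing "numa-node-id" property and an out-of-range one both end up
 * as nid >= MAX_NUMNODES, so a single comparison covers both cases.
 * Returns 1 if nid is usable, 0 otherwise.
 */
static int dt_numa_nid_valid(uint32_t nid)
{
    return nid < MAX_NUMNODES;
}
```

When the check fails, the caller would call numa_failed() and bail out
instead of passing the bogus id further down.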


+#endif
 prop = fdt_get_property(fdt, node, "reg", NULL);
 if ( !prop )
 {
 printk("fdt: node `%s': missing `reg' property\n", name);
+#ifdef CONFIG_NUMA
+   numa_failed();


This file uses soft tabs, not hard tabs.


+#endif
 return;
 }

 cell = (const __be32 *)prop->data;
 banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));

-for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
+for ( i = 0; i < banks; i++ )
 {
 device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
 if ( !size )
 continue;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
-bootinfo.mem.nr_banks++;
+if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
+{
+bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
+bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
+bootinfo.mem.nr_banks++;
+}


This change should be split.


+#ifdef CONFIG_NUMA
+dt_numa_process_memory_node(nid, start, size);
+#endif
 }
 }

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 56de26e..a8bde68 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -194,33 +194,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
 int status;
 u32 fdt_val32;
 u64 fdt_val64;
-int prev;
 int num_rsv;

-/*
- * Delete any memory nodes present.  The EFI memory map is the only
- * memory description provided to Xen.
- */
-prev = 0;
-for (;;)
-{
-const char *type;
-int len;
-
-node = fdt_next_node(fdt, prev, NULL);
-if ( node < 0 )
-break;
-
-type = fdt_getprop(fdt, node, "device_type", &len);
-if ( type && strncmp(type, "memory", len) == 0 )
-{
-fdt_del_node(fdt, node);
-continue;
-}
-
-prev = node;
-}
-


That chunk should move to the same patch as the EFI check.


/*
 * Delete all memory reserve map entries. When booting via UEFI,
 * kernel will use the UEFI memory map to find reserved regions.
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 963bb40..84030e7 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -58,6 +58,38 @@ static int __init dt_numa_process_cpu_node(const void *fdt)
 return 0;
 }

+void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
+   paddr_t size)
+{
+struct node *nd;
+int i;
+
+i = conflicting_memblks(start, start + size);
+if ( i < 0 )
+{
+ if ( numa_add_memblk(nid, start, size) )
+ {
+ printk(XENLOG_WARNING "DT: NUMA: node-id 

Re: [Xen-devel] [RFC PATCH v3 12/24] ARM: NUMA: DT: Parse CPU NUMA information

2017-07-19 Thread Julien Grall

Hi,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Parse CPU node and fetch numa-node-id information.
For each node-id found, update nodemask_t mask.
Refer to Documentation/devicetree/bindings/numa.txt
in linux kernel.

Signed-off-by: Vijaya Kumar K 
---
v3: - Parse cpu nodes under path /cpus
- Move changes to bootfdt.c as separate patch
- Set numa_off on dt_numa_init() failure
---
 xen/arch/arm/Makefile   |  1 +
 xen/arch/arm/numa/Makefile  |  2 ++
 xen/arch/arm/numa/dt_numa.c | 77 +
 xen/arch/arm/numa/numa.c| 48 
 xen/arch/arm/setup.c|  4 +++
 xen/include/asm-arm/numa.h  | 10 +-
 6 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 49e1fb2..a89be66 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
 subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
+subdir-$(CONFIG_NUMA) += numa

 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
new file mode 100644
index 000..3af3aff
--- /dev/null
+++ b/xen/arch/arm/numa/Makefile
@@ -0,0 +1,2 @@
+obj-y += dt_numa.o
+obj-y += numa.o
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
new file mode 100644
index 000..963bb40
--- /dev/null
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -0,0 +1,77 @@
+/*
+ * OF NUMA Parsing support.
+ *
+ * Copyright (C) 2015 - 2016 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 


Again, this include should not be there as the device tree is not yet 
parsed.



+#include 
+#include 


Again, please order the includes alphabetically...


+
+/*
+ * Even though we connect cpus to numa domains later in SMP
+ * init, we need to know the node ids now for all cpus.
+ */
+static int __init dt_numa_process_cpu_node(const void *fdt)
+{
+int node, offset;
+uint32_t nid;
+
+offset = fdt_path_offset(fdt, "/cpus");
+if ( offset < 0 )
+return -EINVAL;
+
+node = fdt_first_subnode(fdt, offset);
+if ( node == -FDT_ERR_NOTFOUND )
+return -EINVAL;
+
+do {
+if ( device_tree_type_matches(fdt, node, "cpu") )
+{
+nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+if ( nid >= MAX_NUMNODES )
+printk(XENLOG_WARNING
+   "NUMA: Node id %u exceeds maximum value\n", nid);
+else
+node_set(nid, processor_nodes_parsed);
+}
+
+offset = node;
+node = fdt_next_subnode(fdt, offset);
+} while (node != -FDT_ERR_NOTFOUND);
+
+return 0;
+}
+
+int __init dt_numa_init(void)
+{
+int ret;
+
+ret = dt_numa_process_cpu_node((void *)device_tree_flattened);
+
+return ret;


return dt_numa_process_cpu_node();

But I am still not sure I understand why you can't parse the NUMA node 
directly in bootfdt.c, as you do for the memory.



+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
new file mode 100644
index 000..45cc418
--- /dev/null
+++ b/xen/arch/arm/numa/numa.c
@@ -0,0 +1,48 @@
+/*
+ * ARM NUMA Implementation
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+void __init numa_init(void)
+{
+int ret = 0;
+
+nodes_clear(processor_nodes_parsed);


Why do you need to clear processor_nodes_parsed? It should already be 
all zeroed.



+if ( numa_off )
+goto no_numa;
+
+ret = dt_numa_init();
+if ( ret )
+{
+

Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread Thomas Garnier
On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>> Percpu uses a clever design where the .percpu ELF section has a virtual
>> address of zero and the relocation code avoids relocating specific
>> symbols. It makes the code simple and easily adaptable with or without
>> SMP support.
>>
>> This design is incompatible with PIE because generated code always tries to
>> access the zero virtual address relative to the default mapping address.
>> It becomes impossible when KASLR is configured to go below -2G. This
>> patch solves this problem by removing the zero mapping and adapting the GS
>> base to be relative to the expected address. These changes are done only
>> when PIE is enabled. The original implementation is kept as-is
>> by default.
>
> The reason the per-cpu section is zero-based on x86-64 is to
> work around GCC hardcoding the stack protector canary at %gs:40.  So
> this patch is incompatible with CONFIG_STACK_PROTECTOR.

OK, that makes sense. I don't want this feature to break
CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
entry for gs so gs:40 points to the correct memory address and
gs:[rip+XX] works correctly through the MSR. Given the separate
discussion on mcmodel, I am first going to check if we can move from
PIE to PIC with mcmodel=small or medium, which would remove the percpu
change requirement. I tried before without success, but I understand
percpu and other components better now, so maybe I can make it work.

Thanks a lot for the feedback.

>
> --
> Brian Gerst



-- 
Thomas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/2] x86/mm: Change default value for suppress #VE in set_mem_access()

2017-07-19 Thread Tamas K Lengyel
On Wed, Jul 19, 2017 at 5:47 AM, Adrian Pop  wrote:
> Hello,
>
> On Tue, Jul 18, 2017 at 11:26:45AM -0600, Tamas K Lengyel wrote:
>> On Tue, Jul 18, 2017 at 9:25 AM, Adrian Pop  wrote:
>> > From: Vlad Ioan Topan 
>> >
>> > The default value for the "suppress #VE" bit set by set_mem_access()
>> > currently depends on whether the call is made from the same domain (the
>> > bit is set when called from another domain and cleared if called from
>> > the same domain). This patch changes that behavior to inherit the old
>> > suppress #VE bit value if it is already set and to set it to 1
>> > otherwise, which is safer and more reliable.
>>
>> With the way things are currently if the in-guest tool calls
>> set_mem_access for an altp2m view, it implies it wants to receive #VE
>> for it. Wouldn't this change in this patch effectively make it
>> impossible for an in-guest tool to decide which pages it wants to
>> receive #VE for? The new HVMOP you are introducing is only accessible
>> from a privileged domain..
>
> Yes, this change, along with the restrictions from the new HVMOP would
> virtually prevent a guest from changing the suppress #VE bit for its
> pages. The current set_mem_access functionality, if I'm not mistaken,
> is a bit odd since the guest can only clear the sve, but to set it,
> another domain would have to call set_mem_access for it.

I would have wanted to see that change stated explicitly in the patch
message.

Calling set_mem_access from the guest itself by design clears the SVE
bit, which makes sense. The in-guest tool doesn't know whether there
is an external mem_access listener, so the only thing it should be
allowed to do is to signal to the hypervisor that when it changes EPT
permissions, violations on those pages need to be injected into the
guest with #VE. If you don't want to allow a domain to make changes
like that, you need to restrict altp2m ops to be issued from the
domain completely.
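
To make the two defaults being compared concrete, here is a minimal
sketch in C. It is not the Xen code: the function names and parameters
are made up for illustration, and the "proposed" rule assumes that
"already set" in the patch description means "the EPT entry already
exists".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Current default, as described in this thread: the suppress #VE bit is
 * set when set_mem_access() is called from another domain and cleared
 * when the domain calls it on itself.
 * (Hypothetical helper, not a Xen function.)
 */
static bool sve_default_current(int caller_domid, int target_domid)
{
    return caller_domid != target_domid;
}

/*
 * Proposed default, under the assumption stated above: keep the old bit
 * if the entry already exists, otherwise suppress #VE.
 * (Hypothetical helper, not a Xen function.)
 */
static bool sve_default_proposed(bool entry_valid, bool old_sve)
{
    return entry_valid ? old_sve : true;
}
```

With the current rule an in-guest caller always ends up with SVE
cleared; with the proposed rule the caller's identity no longer matters,
which is why the in-guest tool loses the ability to opt in to #VE.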

>
> I think the issue would be whether to allow a domain to set/clear the
> suppress #VE bit for its pages by calling the new HVMOP on itself.

This problem is not limited to setting the SVE bit. It also applies to
swapping altp2m views. Pretty much all altp2m HVMOPs can be issued
from a user-space program without any way to check whether that
process is allowed to do that or not. If you don't think it is safe
for a domain to set SVE, then none of the altp2m ops are safe for the
domain to issue on itself. If we could, say, ensure only the kernel can
issue the hvmops, that would be OK. But that's not possible at the
moment AFAICT.

Tamas



Re: [Xen-devel] [PATCH] xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

2017-07-19 Thread Stefano Stabellini
On Wed, 19 Jul 2017, Alexey G wrote:
> Stefano,
> 
> On Tue, 18 Jul 2017 15:17:25 -0700 (PDT)
> Stefano Stabellini  wrote:
> 
> > > The patch modifies the behavior in which MapCacheEntry's are added to
> > > the list, avoiding duplicates.  
> > 
> > I take that the idea is to always go through the whole list to check for
> > duplicate locked entries, right?
>  
> That's a short list.
> 
> In fact, it's not easy to generate multiple linked entries in this list --
> normally, entries will be added, used and then removed immediately by
> xen_invalidate_map_cache(). Specific conditions are required to make the
> list grow -- like simultaneous DMA operations (of different cache_size)
> originating the same address_index or presence of the Option ROM mapping in
> the area. 
> 
> So normally we deal with just 1-2 entries in the list. Even three entries
> are likely to be generated only intentionally and with a bit of luck as it
> depends a lot on the host's performance/workload. Also, a good cache_size
> diversity is required to produce entries in the list, but we are actually
> limited to only a few multiples of MCACHE_BUCKET_SIZE due to the maximum DMA
> size limitations of emulated devices.
> 
> > Yes, I think this would work, but we should make sure to scan the whole
> > list only when lock ==  true. Something like the following:
> > 
> > -while (entry && entry->lock && entry->vaddr_base &&
> > +while (entry && (lock || entry->lock) && entry->vaddr_base &&
> >  (entry->paddr_index != address_index || entry->size !=
> > cache_size || !test_bits(address_offset >> XC_PAGE_SHIFT,
> >   test_bit_size >> XC_PAGE_SHIFT,
> >   entry->valid_mapping))) {
> > +if (!free_entry && !entry->lock) {
> > +free_entry = entry;
> > +free_pentry = pentry;
> > +}
> >  pentry = entry;
> >  entry = entry->next;
> >  }
> > 
> > Would this work?
> 
> This would, but the question is whether there will be a benefit. This way we
> avoid traversing the rest of the list (few entries, if any) if we asked
> for some lock=0 mapping and found such an entry before the reusable lock=n
> entry. We win a few iterations of quick checks, but on the other hand risk
> having to execute xen_remap_bucket() for this entry (with a lot of fairly slow
> stuff). If there was a reusable entry later in the list, using it instead
> of (possibly) remapping an entry would be faster, so there are pros and cons
> here.

My expectation is that unlocked mappings are much more frequent than
locked mappings. Also, I expect that only very rarely we'll be able to
reuse locked mappings. Over the course of a VM lifetime, it seems to me
that walking the list every time would cost more than it would benefit.

These are only "expectations", I would love to see numbers. Numbers make
for better decisions :-)  Would you be up for gathering some of these
numbers? Such as how many times you get to reuse locked mappings and how
many times we walk items on the list fruitlessly?

Otherwise, would you be up for just testing the modified version of the
patch I sent to verify that it solves the bug?



> We can use locked entry for "non-locked" request as it is protected by the
> same (kinda suspicious) rcu_read_lock/rcu_read_unlock mechanism above. The
> big question here is whether rcu_read_(un)lock is enough at all
> for underneath xen-mapcache usage -- seems like the xen-mapcache-related
> code in QEMU expects RCU read lock to work like a plain critical section...
> although this needs to be checked.
> 
> One possible minor optimization for xen-mapcache would be to reuse larger
> mappings for mappings of lesser cache_size. Right now the existing code
> checks in the "entry->size == cache_size" manner, while we could use
> "entry->size >= cache_size" here. However, we may end up with resident
> MapCacheEntries being mapped to bigger mapping sizes than necessary and
> thus might need to add remapping back to the normal size in
> xen_invalidate_map_cache_entry_unlocked() when there are no other mappings.

Yes, I thought about it, that would be a good improvement to have.
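
For anyone following the trade-off, here is a stripped-down sketch in C
of the walk I proposed above. It is not the QEMU code: struct
entry_sketch, lookup_sketch() and demo_choice() are made-up names, the
"mapped" flag stands in for vaddr_base, the valid_mapping bitmap test is
left out, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for MapCacheEntry; only the fields the walk needs. */
struct entry_sketch {
    unsigned long paddr_index;
    unsigned long size;
    unsigned int lock;            /* non-zero while a locked user holds it */
    bool mapped;                  /* stands in for vaddr_base != NULL */
    struct entry_sketch *next;
};

static bool entry_matches(const struct entry_sketch *e,
                          unsigned long index, unsigned long size)
{
    return e->paddr_index == index && e->size == size;
}

/*
 * Walk the bucket list. A lock=0 request stops at the first unlocked
 * entry; a lock=1 request keeps going until it finds an exact match,
 * remembering the first unlocked entry as a fallback to remap.
 */
static struct entry_sketch *lookup_sketch(struct entry_sketch *head,
                                          unsigned long index,
                                          unsigned long size, bool lock)
{
    struct entry_sketch *entry = head, *free_entry = NULL;

    while (entry && (lock || entry->lock) && entry->mapped &&
           !entry_matches(entry, index, size)) {
        if (!free_entry && !entry->lock)
            free_entry = entry;
        entry = entry->next;
    }
    return entry ? entry : free_entry;
}

/*
 * Two-entry bucket: position 0 is unlocked with a different size,
 * position 1 is locked and matches exactly. Returns the chosen position.
 */
static int demo_choice(bool lock)
{
    struct entry_sketch second = { 7, 4096, 1, true, NULL };
    struct entry_sketch first = { 7, 8192, 0, true, &second };
    struct entry_sketch *hit = lookup_sketch(&first, 7, 4096, lock);

    return hit == &first ? 0 : hit == &second ? 1 : -1;
}
```

In this toy bucket a lock=0 request settles on the first entry, which
would then need remapping, while a lock=1 request reaches the exact
match behind it. That is the case Alexey describes; how often it occurs
is what the numbers should tell us.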



Re: [Xen-devel] [RFC PATCH v3 10/24] NUMA: Allow numa initialization with DT

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

The common code allows numa initialization only when
ACPI_NUMA config is enabled. Allow initialization when
NUMA config is enabled for DT.

In this patch, along with acpi_numa, check for acpi_disabled
is added.

Signed-off-by: Vijaya Kumar K 
---
 xen/common/numa.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/common/numa.c b/xen/common/numa.c
index 74c4697..5e985d2 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -324,7 +324,7 @@ static int __init numa_scan_nodes(paddr_t start, paddr_t end)
 for ( i = 0; i < MAX_NUMNODES; i++ )
 cutoff_node(i, start, end);

-if ( acpi_numa <= 0 )
+if ( !acpi_disabled && acpi_numa <= 0 )


I am struggling to understand this change. You likely want a similar 
variable for DT to say whether NUMA is available or has failed.


This also changes the semantics for x86 quite a bit because you will now 
continue if acpi_disabled is set and acpi_numa = 0. The code seems to allow it, 
but I don't know if we support it.
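
To spell out the semantic change, here is a small sketch in C that
compares the old and new conditions as plain predicates. The function
names are made up for illustration, and treating the guarded statement
as an early exit is an assumption, since the hunk only shows the
condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before the patch: taken whenever ACPI NUMA is not usable. */
static bool bails_out_before(int acpi_numa)
{
    return acpi_numa <= 0;
}

/* Condition after the patch: never taken once ACPI is disabled. */
static bool bails_out_after(bool acpi_disabled, int acpi_numa)
{
    return !acpi_disabled && acpi_numa <= 0;
}
```

The only inputs where the two differ are those with acpi_disabled set
and acpi_numa <= 0: the old check fires, the new one does not. That is
the x86 case I am worried about.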


Cheers,

--
Julien Grall



[Xen-devel] [xen-unstable-smoke test] 112017: tolerable trouble: broken/pass - PUSHED

2017-07-19 Thread osstest service owner
flight 112017 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112017/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm       1 build-check(1)               blocked  n/a
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  5efaeaa8235d9f16fa2711efe22b8f2bd54a182b
baseline version:
 xen  ab48596654ca20bd45eee4bdc1252188e9beb5a5

Last test of basis   112012  2017-07-19 10:03:30 Z    0 days
Testing same since   112017  2017-07-19 15:02:34 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=5efaeaa8235d9f16fa2711efe22b8f2bd54a182b
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
5efaeaa8235d9f16fa2711efe22b8f2bd54a182b
+ branch=xen-unstable-smoke
+ revision=5efaeaa8235d9f16fa2711efe22b8f2bd54a182b
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' x5efaeaa8235d9f16fa2711efe22b8f2bd54a182b = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : 

Re: [Xen-devel] [RFC PATCH v3 08/24] NUMA: x86: Move numa code and make it generic

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Move code from xen/arch/x86/numa.c to xen/common/numa.c
so that it can be used by other archs.

The following changes are done:
- A few generic static functions in x86/numa.c are made
  non-static in common/numa.c
- The generic contents of header file asm-x86/numa.h
  are moved to xen/numa.h.
- The header file includes are reordered and externs are
  dropped.
- Moved acpi_numa from asm-x86/acpi.h to xen/acpi.h
- Coding style of code moved to common/numa.c is changed
  to Xen style.
- numa_add_cpu() and numa_set_node() are moved to the header
  file, with inline stubs added for the case where CONFIG_NUMA
  is not enabled, because these functions are called from
  generic code without any config check.

Also the node_online_map is defined in x86/numa.c for x86
and arm/smpboot.c for ARM. For x86 it is moved to x86/smpboot.c
If moved to common code the compilation fails because
common/numa.c is compiled only when NUMA is enabled.


I would much prefer that this patch do one thing: move code. The rest 
should be split out to help review and allow us to easily verify that you 
only moved code...



+#define NODE_DATA(nid)  (&(node_data[nid]))
+
+#define node_start_pfn(nid) NODE_DATA(nid)->node_start_pfn
+#define node_spanned_pages(nid) NODE_DATA(nid)->node_spanned_pages
+#define node_end_pfn(nid)   NODE_DATA(nid)->node_start_pfn + \
+ NODE_DATA(nid)->node_spanned_pages
+
+void numa_add_cpu(int cpu);
+void numa_set_node(int cpu, nodeid_t node);
+#else
+static inline void numa_add_cpu(int cpu) { }
+static inline void numa_set_node(int cpu, nodeid_t node) { }


I am not sure why you need to define a stub, at least for numa_set_node... 
I can't see any use in non-NUMA code. I will comment about numa_add_cpu 
later.


Cheers,

--
Julien Grall



Re: [Xen-devel] [RFC 21/22] x86/module: Add support for mcmodel large and PLTs

2017-07-19 Thread Brian Gerst
On Wed, Jul 19, 2017 at 11:58 AM, Thomas Garnier  wrote:
> On Tue, Jul 18, 2017 at 8:59 PM, Brian Gerst  wrote:
>> On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin  wrote:
>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>> With PIE support and KASLR extended range, the modules may be further
>>>> away from the kernel than before breaking mcmodel=kernel expectations.
>>>>
>>>> Add an option to build modules with mcmodel=large. The modules generated
>>>> code will make no assumptions on placement in memory.
>>>>
>>>> Despite this option, modules still expect kernel functions to be within
>>>> 2G and generate relative calls. To solve this issue, the PLT arm64 code
>>>> was adapted for x86_64. When a relative relocation go outside its range,
>>>> a dynamic PLT entry is used to correctly jump to the destination.
>>>
>>> Why large as opposed to medium or medium-PIC?
>>
>> Or for that matter, why not small-PIC?  We aren't changing the size of
>> the kernel to be larger than 2G text or data.  Small-PIC would still
>> allow it to be placed anywhere in the address space, and would
>> generate far better code.
>
> My understanding was that small=PIC and medium=PIC assume that the
> module code is in the lower 2G of memory. I will do additional testing
> on the modules to confirm that.

That is only for small/medium absolute (non-PIC) code.  Think about
userspace shared libraries.  They are not limited to being mapped in
the lower 2G of the address space.

--
Brian Gerst



Re: [Xen-devel] [PATCH 09/15] xen: vmx: handle SGX related MSRs

2017-07-19 Thread Andrew Cooper
On 09/07/17 09:09, Kai Huang wrote:
> This patch handles IA32_FEATURE_CONTROL and IA32_SGXLEPUBKEYHASHn MSRs.
>
> For IA32_FEATURE_CONTROL, if SGX is exposed to domain, then SGX_ENABLE bit
> is always set. If SGX launch control is also exposed to domain, and physical
> IA32_SGXLEPUBKEYHASHn are writable, then SGX_LAUNCH_CONTROL_ENABLE bit is
> also always set. Write to IA32_FEATURE_CONTROL is ignored.
>
> For IA32_SGXLEPUBKEYHASHn, a new 'struct sgx_vcpu' is added for per-vcpu SGX
> stuff, and currently it has vcpu's virtual ia32_sgxlepubkeyhash[0-3]. Two
> booleans, 'readable' and 'writable', are also added to indicate whether virtual
> IA32_SGXLEPUBKEYHASHn are readable and writable.
>
> When a vcpu is initialized, virtual ia32_sgxlepubkeyhash are also initialized.
> If physical IA32_SGXLEPUBKEYHASHn are writable, then ia32_sgxlepubkeyhash are
> set to Intel's default value, as for physical machine, those MSRs will have
> Intel's default value. If physical MSRs are not writable (they are *locked* by
> BIOS before handing over to Xen), then we try to read those MSRs and use physical
> values as default value for virtual MSRs. One thing is rdmsr_safe is used, as
> although SDM says if SGX is present, IA32_SGXLEPUBKEYHASHn are available for
> read, in reality, skylake client (at least some, depending on BIOS) doesn't
> have those MSRs available, so we use rdmsr_safe and set readable to false if
> it returns error code.
>
> For IA32_SGXLEPUBKEYHASHn MSR read from guest, if physical MSRs are not
> readable, guest is not allowed to read either, otherwise vcpu's virtual MSR
> value is returned.
>
> For IA32_SGXLEPUBKEYHASHn MSR write from guest, we allow guest to write if
> both physical MSRs are writable and SGX launch control is exposed to domain,
> otherwise error is injected.
>
> To make EINIT run successfully in guest, vcpu's virtual IA32_SGXLEPUBKEYHASHn
> will be updated to physical MSRs when vcpu is scheduled in.
>
> Signed-off-by: Kai Huang 
> ---
>  xen/arch/x86/hvm/vmx/sgx.c | 194 +
>  xen/arch/x86/hvm/vmx/vmx.c |  24 +
>  xen/include/asm-x86/cpufeature.h   |   3 +
>  xen/include/asm-x86/hvm/vmx/sgx.h  |  22 +
>  xen/include/asm-x86/hvm/vmx/vmcs.h |   2 +
>  xen/include/asm-x86/msr-index.h|   6 ++
>  6 files changed, 251 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmx/sgx.c b/xen/arch/x86/hvm/vmx/sgx.c
> index 14379151e8..4944e57aef 100644
> --- a/xen/arch/x86/hvm/vmx/sgx.c
> +++ b/xen/arch/x86/hvm/vmx/sgx.c
> @@ -405,6 +405,200 @@ void hvm_destroy_epc(struct domain *d)
>  hvm_reset_epc(d, true);
>  }
>  
> +/* Whether IA32_SGXLEPUBKEYHASHn are physically *unlocked* by BIOS */
> +bool_t sgx_ia32_sgxlepubkeyhash_writable(void)
> +{
> +uint64_t sgx_lc_enabled = IA32_FEATURE_CONTROL_SGX_ENABLE |
> +  IA32_FEATURE_CONTROL_SGX_LAUNCH_CONTROL_ENABLE |
> +  IA32_FEATURE_CONTROL_LOCK;
> +uint64_t val;
> +
> +rdmsrl(MSR_IA32_FEATURE_CONTROL, val);
> +
> +return (val & sgx_lc_enabled) == sgx_lc_enabled;
> +}
> +
> +bool_t domain_has_sgx(struct domain *d)
> +{
> +/* hvm_epc_populated(d) implies CPUID has SGX */
> +return hvm_epc_populated(d);
> +}
> +
> +bool_t domain_has_sgx_launch_control(struct domain *d)
> +{
> +struct cpuid_policy *p = d->arch.cpuid;
> +
> +if ( !domain_has_sgx(d) )
> +return false;
> +
> +/* Unnecessary but check anyway */
> +if ( !cpu_has_sgx_launch_control )
> +return false;
> +
> +return !!p->feat.sgx_launch_control;
> +}

Both of these should be plain checks of d->arch.cpuid->feat.{sgx,sgx_lc},
without individual helpers.

The CPUID setup during host boot and domain construction should take
care of setting everything up properly, or hiding the features from the
guest.  The point of the work I've been doing is to prevent situations
where the guest can see SGX but something doesn't work because of Xen
using nested checks like this.
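
As a rough illustration of the shape I mean, here is a sketch in C where
every precondition is folded into the policy once, at construction time,
so later code only ever reads the policy bits. The struct, the function
and its parameters are invented for this example and are not the real
Xen cpuid_policy interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down policy holding only the two bits discussed here. */
struct policy_sketch {
    bool sgx;
    bool sgx_lc;
};

/*
 * Build the per-domain policy once. Host capability, EPC availability
 * and the toolstack's request are all resolved here, so a feature is
 * either fully usable or hidden from the guest.
 */
static struct policy_sketch build_policy_sketch(bool host_sgx,
                                                bool host_sgx_lc,
                                                bool epc_populated,
                                                bool want_sgx_lc)
{
    struct policy_sketch p;

    p.sgx = host_sgx && epc_populated;
    p.sgx_lc = p.sgx && host_sgx_lc && want_sgx_lc;
    return p;
}
```

Runtime paths such as the MSR handlers would then test p.sgx or p.sgx_lc
directly and never re-derive them from hardware or EPC state.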

> +
> +/* Digest of Intel signing key. MSR's default value after reset. */
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH0 0xa6053e051270b7ac
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH1 0x6cfbe8ba8b3b413d
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH2 0xc4916d99f2b3735d
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH3 0xd4f8c05909f9bb3b
> +
> +void sgx_vcpu_init(struct vcpu *v)
> +{
> +struct sgx_vcpu *sgxv = to_sgx_vcpu(v);
> +
> +memset(sgxv, 0, sizeof (*sgxv));
> +
> +if ( sgx_ia32_sgxlepubkeyhash_writable() )
> +{
> +/*
> + * If physical MSRs are writable, set vcpu's default value to Intel's
> + * default value. For real machine, after reset, MSRs contain Intel's
> + * default value.
> + */
> +sgxv->ia32_sgxlepubkeyhash[0] = SGX_INTEL_DEFAULT_LEPUBKEYHASH0;
> +sgxv->ia32_sgxlepubkeyhash[1] = SGX_INTEL_DEFAULT_LEPUBKEYHASH1;
> +sgxv->ia32_sgxlepubkeyhash[2] = 

Re: [Xen-devel] [RFC PATCH v3 06/24] x86: NUMA: Rename some generic functions

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Rename some functions in the ACPI code as follows:
 - Rename setup_node to acpi_setup_node
 - Rename bad_srat to numa_failed
 - Rename nodes_cover_memory to arch_sanitize_nodes_memory
   and changed return type to bool
 - Rename acpi_scan_nodes to numa_scan_nodes

Also introduce reset_pxm2node() to reset pxm2node variable.
This avoids exporting pxm2node.


This does not belong in this patch.

Cheers,

--
Julien Grall



Re: [Xen-devel] [PATCH v2] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Juergen Gross
On 19/07/17 17:17, Greg KH wrote:
> On Wed, Jul 19, 2017 at 04:51:02PM +0200, Juergen Gross wrote:
>> On 19/07/17 16:43, Greg KH wrote:
>>> From: Greg Kroah-Hartman 
>>>
>>> It's better to be explicit and use the DRIVER_ATTR_RW() and
>>> DRIVER_ATTR_RO() macros when defining a driver's sysfs file.
>>>
>>> Bonus is this fixes up a checkpatch.pl warning.
>>>
>>> This is part of a series to drop DRIVER_ATTR() from the tree entirely.
>>>
>>> Cc: Boris Ostrovsky 
>>> Cc: Juergen Gross 
>>> Signed-off-by: Greg Kroah-Hartman 
>>
>> Reviewed-by: Juergen Gross 
>>
>> I'll take this through the Xen tree, unless you want to use your tree.
> 
> If I can take it through mine, then I could drop DRIVER_ATTR() from the
> whole tree for the next kernel release, which would be ideal.
> 
> But if you want to take it, that's fine, I can wait another release, no
> rush.

In this case just use your tree. I don't think there are any pending
conflicting patches right now.


Juergen



Re: [Xen-devel] [RFC PATCH v3 05/24] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-07-19 Thread Julien Grall



On 19/07/17 07:40, Vijay Kilari wrote:

On Tue, Jul 18, 2017 at 8:59 PM, Wei Liu  wrote:

On Tue, Jul 18, 2017 at 05:11:27PM +0530, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Add accessors for nodes[] and other static variables and
use those accessors. These variables are later accessed
outside the file when the code is made generic in later
patches. However, the coding style is not changed.

Signed-off-by: Vijaya Kumar K 
---
v3: - Changed accessors parameter from int to unsigned int
- Updated commit message
- Fixed wrong indentation
---
 xen/arch/x86/srat.c | 106 +++-
 1 file changed, 81 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 535c9d7..42cca5a 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -41,6 +41,44 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
 static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);

+static struct node *get_numa_node(unsigned int id)
+{
+ return &nodes[id];
+}
+
+static nodeid_t get_memblk_nodeid(unsigned int id)
+{
+ return memblk_nodeid[id];
+}
+
+static nodeid_t *get_memblk_nodeid_map(void)
+{
+ return &memblk_nodeid[0];
+}
+
+static struct node *get_node_memblk_range(unsigned int memblk)
+{
+ return &node_memblk_range[memblk];
+}
+
+static int get_num_node_memblks(void)
+{
+ return num_node_memblks;
+}


They should all be inline functions. And maybe lift them to a header right
away and add a proper prefix, since you mention they are going to be used later.


Currently these are static variables in x86/srat.c file.
In patch #9 I move them to common/numa.c file and make these functions
non-static.

If I lift them to a header file and make them inline, then I have to make
these global variables.


As I said on v2, I am not sure I understand the usefulness of those 
accessors over global variables...


You don't have any kind of sanity check, so they would do exactly the 
same job. The global variables would avoid so much churn.


Moreover, you tend to sometimes use globals and other times static helpers...
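
For contrast, here is a sketch in C of what an accessor would have to do
to earn its keep, namely validate its argument. Everything in it is
made up for illustration (names, the array size, the "no node" value);
it is not what the patch implements.

```c
#include <assert.h>

/* Hypothetical sizes and sentinel, chosen only for this example. */
#define NR_NODE_MEMBLKS_SKETCH 4
#define NUMA_NO_NODE_SKETCH    0xff

typedef unsigned char nodeid_sketch_t;

static nodeid_sketch_t memblk_nodeid_sketch[NR_NODE_MEMBLKS_SKETCH] = {
    0, 0, 1, 1
};

/* Accessor with a bounds check: out-of-range ids yield the sentinel. */
static nodeid_sketch_t get_memblk_nodeid_checked(unsigned int id)
{
    if (id >= NR_NODE_MEMBLKS_SKETCH)
        return NUMA_NO_NODE_SKETCH;

    return memblk_nodeid_sketch[id];
}
```

Without a check like this, the accessor is only an extra name for the
array access.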

Cheers,

--
Julien Grall



Re: [Xen-devel] [RFC PATCH v3 04/24] x86: NUMA: Rename and sanitize memnode shift code

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

memnode_shift variable is changed from int to unsigned int.
With this change, compute_memnode_shift() returns error value
instead of returning shift value. The memnode_shift is updated inside
compute_memnode_shift().

Also, following changes are made
  - Rename compute_hash_shift to compute_memnode_shift
  - Update int to unsigned int for params in extract_lsb_from_nodes()
  - Return values of populate_memnodemap() is changed


I am not sure I understand the rationale behind changing the return 
value of populate_memnodemap. This likely needs a bit more description in 
the commit message.


Cheers,

--
Julien Grall



Re: [Xen-devel] [PULL for-2.0 0/7] please pull xen-20170718-tag

2017-07-19 Thread Peter Maydell
On 18 July 2017 at 23:20, Stefano Stabellini  wrote:
> The following changes since commit f9dada2baabb639feb988b3a564df7a06d214e18:
>
>   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into 
> staging (2017-07-18 20:29:36 +0100)
>
> are available in the git repository at:
>
>
>   git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20170718-tag
>
> for you to fetch changes up to 331b5189d756d431b1d18ae7097527ba3d3ea809:
>
>   xen: don't use xenstore to save/restore physmap anymore (2017-07-18 
> 14:16:52 -0700)
>
> 
> Xen 2017/07/18
>
> 
> Igor Druzhinin (4):
>   xen: move physmap saving into a separate function
>   xen/mapcache: add an ability to create dummy mappings
>   xen/mapcache: introduce xen_replace_cache_entry()
>   xen: don't use xenstore to save/restore physmap anymore
>
> Peter Maydell (1):
>   xen_pt_msi.c: Check for xen_host_pci_get_* failures in 
> xen_pt_msix_init()
>
> Stefano Stabellini (1):
>   xen-platform: separate unplugging of NVMe disks
>
> Xiong Zhang (1):
>   hw/xen: Set emu_mask for igd_opregion register

Applied, thanks.

-- PMM



Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-19 Thread Julien Grall

Hi,

On 19/07/17 17:27, Wei Liu wrote:

On Wed, Jul 19, 2017 at 05:23:43PM +0100, Julien Grall wrote:


This is also more than cosmetics and I think the reviewed-by from Wei should
have been carried.


should *not* have been carried.


That's what I meant but failed to write the not.



And I agree.



Cheers,

--
Julien Grall



Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-19 Thread Wei Liu
On Wed, Jul 19, 2017 at 05:23:43PM +0100, Julien Grall wrote:
> 
> This is also more than cosmetics and I think the reviewed-by from Wei should
> have been carried.

should *not* have been carried.

And I agree.



Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-19 Thread Julien Grall

Hi Vijay,

On 18/07/17 12:41, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Fix coding style, trailing spaces, tabs in NUMA code.
Also drop unused macros and functions.
There is no functional change.

Signed-off-by: Vijaya Kumar K 
Reviewed-by: Wei Liu 
---
v3: - Change commit message
- Changed VIRTUAL_BUG_ON to ASSERT


Looking at the commit message you don't mention any renaming...


- Dropped useless inner paranthesis for some macros


[...]


diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 3cf26c2..c0de57b 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -1,8 +1,11 @@
-#ifndef _ASM_X8664_NUMA_H
+#ifndef _ASM_X8664_NUMA_H
 #define _ASM_X8664_NUMA_H 1

 #include 

+#define MAX_NUMNODES    NR_NODES
+#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)


I don't understand why this suddenly appears in the code when you moved 
it away to xen/numa.h in patch #1.


[...]


@@ -57,21 +55,23 @@ struct node_data {

 extern struct node_data node_data[];

-static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
-{
-   nodeid_t nid;
-   VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
-   nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
-   VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
-   return nid;
-}
-
-#define NODE_DATA(nid) (&(node_data[nid]))
-
-#define node_start_pfn(nid)     (NODE_DATA(nid)->node_start_pfn)
-#define node_spanned_pages(nid) (NODE_DATA(nid)->node_spanned_pages)
-#define node_end_pfn(nid)   (NODE_DATA(nid)->node_start_pfn + \
-NODE_DATA(nid)->node_spanned_pages)
+static inline __attribute_pure__ nodeid_t phys_to_nid(paddr_t addr)
+{
+   nodeid_t nid;
+
+   ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
+   nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+   ASSERT(nid <= MAX_NUMNODES || !node_data[nid].node_start_pfn);
+
+   return nid;
+}
+
+#define NODE_DATA(nid)  (&(node_data[nid]))


I understand Jan asked to remove the inner parentheses here. And you 
didn't do it. However ...



+
+#define node_start_pfn(nid) NODE_DATA(nid)->node_start_pfn
+#define node_spanned_pages(nid) NODE_DATA(nid)->node_spanned_pages
+#define node_end_pfn(nid)   NODE_DATA(nid)->node_start_pfn + \
+ NODE_DATA(nid)->node_spanned_pages


... here it is totally wrong to remove the parentheses. Imagine you do:

node_end_pfn(nid) * 2

This will now turn into

NODE_DATA(nid)->node_start_pfn + NODE_DATA(nid)->node_spanned_pages * 2

The precedence is no longer correct and this will result in a wrong 
computation. You should keep the outer parentheses *everywhere* for 
safety and remove only the inner ones in NODE_DATA.
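
The precedence problem described above can be reproduced outside Xen with a
small sketch. The struct below is a hypothetical stand-in that models only the
two fields the macros touch, and the _unsafe/_safe macro names and the sample
values (start 100, span 50) are assumptions made for illustration, not names
from the patch.

```c
/* Hypothetical stand-in for Xen's struct node_data (only two fields). */
struct node_data {
    unsigned long node_start_pfn;
    unsigned long node_spanned_pages;
};

/* One node starting at pfn 100 and spanning 50 pages (sample values). */
static struct node_data node_data[1] = { { 100, 50 } };

#define NODE_DATA(nid) (&node_data[nid])

/* Shape used in the patch: no outer parentheses around the sum. */
#define node_end_pfn_unsafe(nid) NODE_DATA(nid)->node_start_pfn + \
                                 NODE_DATA(nid)->node_spanned_pages

/* Shape requested in the review: outer parentheses kept. */
#define node_end_pfn_safe(nid) (NODE_DATA(nid)->node_start_pfn + \
                                NODE_DATA(nid)->node_spanned_pages)

/* Expands to 100 + 50 * 2: the multiplication binds to the last term only. */
unsigned long twice_end_unsafe(void)
{
    return node_end_pfn_unsafe(0) * 2;
}

/* Expands to (100 + 50) * 2: the whole sum is doubled. */
unsigned long twice_end_safe(void)
{
    return node_end_pfn_safe(0) * 2;
}
```

With these sample values the unparenthesised macro gives 200 where the caller
meant 300, which is the wrong computation the review warns about.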


This is also more than cosmetics and I think the reviewed-by from Wei 
should have been carried.




 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);

diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 6bba29e..3bb4afc 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -6,9 +6,6 @@
 #define NUMA_NO_NODE 0xFF
 #define NUMA_NO_DISTANCE 0xFF

-#define MAX_NUMNODES    NR_NODES
-#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
-


See my comment above.


 #define vcpu_to_node(v) (cpu_to_node((v)->processor))

 #define domain_to_node(d) \



Cheers,

--
Julien Grall



Re: [Xen-devel] [RFC 21/22] x86/module: Add support for mcmodel large and PLTs

2017-07-19 Thread Thomas Garnier
On Tue, Jul 18, 2017 at 8:59 PM, Brian Gerst  wrote:
> On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin  wrote:
>> On 07/18/17 15:33, Thomas Garnier wrote:
>>> With PIE support and KASLR extended range, the modules may be further
>>> away from the kernel than before breaking mcmodel=kernel expectations.
>>>
>>> Add an option to build modules with mcmodel=large. The modules generated
>>> code will make no assumptions on placement in memory.
>>>
>>> Despite this option, modules still expect kernel functions to be within
>>> 2G and generate relative calls. To solve this issue, the PLT arm64 code
>>> was adapted for x86_64. When a relative relocation go outside its range,
>>> a dynamic PLT entry is used to correctly jump to the destination.
>>
>> Why large as opposed to medium or medium-PIC?
>
> Or for that matter, why not small-PIC?  We aren't changing the size of
> the kernel to be larger than 2G text or data.  Small-PIC would still
> allow it to be placed anywhere in the address space, and would
> generate far better code.

My understanding was that small-PIC and medium-PIC assume that the
module code is in the lower 2G of memory. I will do additional testing
on the modules to confirm that.

>
> --
> Brian Gerst



-- 
Thomas



Re: [Xen-devel] [RFC PATCH v3 01/24] NUMA: Make number of NUMA nodes configurable

2017-07-19 Thread Julien Grall

Hi Vijay,

On 19/07/2017 08:00, Vijay Kilari wrote:

On Tue, Jul 18, 2017 at 11:25 PM, Julien Grall  wrote:

Hi,


On 18/07/17 12:41, vijay.kil...@gmail.com wrote:


From: Vijaya Kumar K 

Introduce NR_NODES config option to specify number
of NUMA nodes supported. By default value is set at
64 for x86 and 8 for arm. Dropped NODES_SHIFT macro.

Also move NR_NODE_MEMBLKS from asm-x86/acpi.h to xen/numa.h

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/Kconfig   | 7 +++
 xen/include/asm-x86/acpi.h | 1 -
 xen/include/asm-x86/numa.h | 2 --
 xen/include/xen/config.h   | 1 +
 xen/include/xen/numa.h | 7 ++-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index cf0acb7..9c2a4e2 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -6,3 +6,10 @@ config NR_CPUS
default "128" if ARM
---help---
  Specifies the maximum number of physical CPUs which Xen will
support.
+
+config NR_NODES
+   int "Maximum number of NUMA nodes"
+   default "64" if X86
+   default "8" if ARM



3rd time I am asking it... Why the difference between x86 and ARM?


AFAIK, there is currently no ARM platform with more than 8 NUMA nodes.
ThunderX has only 2 nodes.
So I kept the value low for ARM to avoid unnecessary memory allocation.

Do you want me to keep it the same as x86?


Well, you say it is for saving memory allocation but you don't give any 
number on how much you can save by reducing the default from 64 to 8...


Looking at it, MAX_NUMNODES is used for some static allocation and also 
for the bitmap nodemask_t.


Because our bitmap is based on unsigned long, you would use the same 
quantity of memory for AArch64, for AArch32 the quantity will be divided 
by two. Still nodemask_t does not seem to be widely used.


In the case of the static allocation, I spot ~40 bytes per NUMA node. So 
8 nodes will use ~320 bytes and 64 nodes ~2560.
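
The arithmetic behind these two estimates can be checked with a short sketch.
The 40 bytes per node is the rough figure quoted above, not a measured value,
and the helper names are hypothetical. The bitmap helper only models rounding
a bit count up to whole words, which is how a bitmap built on unsigned long
behaves.

```c
#include <limits.h>

/* Rough per-node static footprint taken from the estimate above. */
#define STATIC_BYTES_PER_NODE 40u

/* Bytes of per-node static data for a given maximum node count. */
unsigned int static_bytes(unsigned int nr_nodes)
{
    return nr_nodes * STATIC_BYTES_PER_NODE;
}

/*
 * Bytes used by a bitmap of nr_bits bits stored in words of word_bytes
 * bytes, rounded up to a whole number of words.
 */
unsigned int bitmap_bytes(unsigned int nr_bits, unsigned int word_bytes)
{
    unsigned int bits_per_word = word_bytes * CHAR_BIT;

    return ((nr_bits + bits_per_word - 1) / bits_per_word) * word_bytes;
}
```

With 8-byte words (AArch64) both 8 and 64 nodes fit in a single word, so the
bitmap size does not change. With 4-byte words (AArch32) 8 nodes need one
word and 64 nodes need two, so the saving there is a factor of two.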


NUMA is likely going to be used on servers; don't tell me you are 2k 
short of memory? If it is an issue, it is better to think about how to limit 
the number of static variables rather than putting a low limit here.


For Embedded use case, they will likely want to put the default to 1 but 
I would not worry about them as they are likely going to tweak the Kconfig.






Also, you likely want to set to 1 if NUMA is not enabled.


I don't see any dependency of NR_NODES on the NUMA config.
So it is always set to the default value, isn't it?


Well, what is the point to allow more than 1 node when NUMA is not 
supported?


Not to mention that it is quite confusing for a user to be allowed to set 
the maximum number of nodes if the architecture does not support NUMA...


For instance, this is the case today on ARM because, without this 
series, we don't support NUMA.








+   ---help---
+ Specifies the maximum number of NUMA nodes which Xen will
support.
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 27ecc65..15be784 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -105,7 +105,6 @@ extern void acpi_reserve_bootmem(void);

 extern s8 acpi_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
-#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)

 #ifdef CONFIG_ACPI_SLEEP

diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index bada2c0..3cf26c2 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -3,8 +3,6 @@

 #include 

-#define NODES_SHIFT 6
-
 typedef u8 nodeid_t;

 extern int srat_rev;
diff --git a/xen/include/xen/config.h b/xen/include/xen/config.h
index a1d0f97..0f1a029 100644
--- a/xen/include/xen/config.h
+++ b/xen/include/xen/config.h
@@ -81,6 +81,7 @@

 /* allow existing code to work with Kconfig variable */
 #define NR_CPUS CONFIG_NR_CPUS
+#define NR_NODES CONFIG_NR_NODES

 #ifndef CONFIG_DEBUG
 #define NDEBUG
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a8..6bba29e 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -3,14 +3,11 @@

 #include 

-#ifndef NODES_SHIFT
-#define NODES_SHIFT 0
-#endif
-
 #define NUMA_NO_NODE 0xFF
 #define NUMA_NO_DISTANCE 0xFF

-#define MAX_NUMNODES    (1 << NODES_SHIFT)
+#define MAX_NUMNODES    NR_NODES
+#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)


Also, I don't understand why you move this define from asm-x86/numa.h to 
xen/numa.h. At least, this does not seem related to this patch...


Cheers,


--
Julien Grall



Re: [Xen-devel] [PATCH v5 15/17] osstest: add support for FreeBSD buildjobs to sg-run-job

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 15/17] osstest: add support for FreeBSD 
buildjobs to sg-run-job"):
> Add support and introduce a FreeBSD build job to sg-run-job.
...
> +switch -exact $ostype {
> +FREEBSD { run-ts broken = ts-freebsd-set-hostflags --share }
 ^ +
Actually, I just acked this, but I think you want to
add a plus there to make the testid not contain --share.

Ian.



Re: [Xen-devel] [PATCH v5 15/17] osstest: add support for FreeBSD buildjobs to sg-run-job

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 15/17] osstest: add support for FreeBSD 
buildjobs to sg-run-job"):
> Add support and introduce a FreeBSD build job to sg-run-job.

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH v5 14/17] osstest: change the meaning of need_build_host

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 14/17] osstest: change the meaning of 
need_build_host"):
> Make need_build_host store a string instead of a boolean. This is
> later going to be expanded to handle the FreeBSD build jobs.

This is all fine, but I have two style comments:

> +if {[string match BUILD_* $nh]} {
>  set need_xen_hosts {}
> -set need_build_host 1
> +set need_build_host [string range $nh [expr [string first _ $nh] + 
> 1] end]

This string range stuff is rather clunky.  How about

   if {[regsub {^BUILD_(.*)} $nh {\1} need_build_host]} {

?

> -if {$need_build_host} { catching-otherwise broken prepare-build-host }
> +if {[llength $need_build_host]} {
> +catching-otherwise broken {
> +prepare-build-host-[string tolower $need_build_host]

I might be tempted to not bother with the `string tolower' and simply
let the functions have SHOUTING in their names.  Up to you.

Thanks,
Ian.



Re: [Xen-devel] [RFC PATCH v3 01/24] NUMA: Make number of NUMA nodes configurable

2017-07-19 Thread Julien Grall



On 19/07/17 09:17, Wei Liu wrote:

On Tue, Jul 18, 2017 at 06:52:11PM +0100, Julien Grall wrote:

Hi,

On 18/07/17 16:29, Wei Liu wrote:

On Tue, Jul 18, 2017 at 05:11:23PM +0530, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

Introduce NR_NODES config option to specify number
of NUMA nodes supported. By default value is set at
64 for x86 and 8 for arm. Dropped NODES_SHIFT macro.

Also move NR_NODE_MEMBLKS from asm-x86/acpi.h to xen/numa.h

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/Kconfig   | 7 +++
 xen/include/asm-x86/acpi.h | 1 -
 xen/include/asm-x86/numa.h | 2 --
 xen/include/xen/config.h   | 1 +
 xen/include/xen/numa.h | 7 ++-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index cf0acb7..9c2a4e2 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -6,3 +6,10 @@ config NR_CPUS
default "128" if ARM
---help---
  Specifies the maximum number of physical CPUs which Xen will support.
+
+config NR_NODES
+   int "Maximum number of NUMA nodes"
+   default "64" if X86
+   default "8" if ARM
+   ---help---
+ Specifies the maximum number of NUMA nodes which Xen will support.


Since this can now be specified by the user but the definition of
NUMA_NO_NODE is not changed, I think you need to sanitise the value
provided somewhere.

Maybe introduce a build time check? There are some examples in tree. See
cpuid.c:build_assertions.


You can do bound-checking in Kconfig:

range 1 254



Oh, good to know. Yes this is the way to go.
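
For comparison, the build-time check mentioned earlier in the thread could
look roughly like the sketch below. It uses C11 _Static_assert in place of
Xen's own build-assertion helpers, and the fallback value of 64 and the
nodeid_is_valid() helper are assumptions made so that the fragment stands
alone. The upper bound mirrors the "range 1 254" suggestion.

```c
#include <stdint.h>

/* Hypothetical fallback; in a real build Kconfig would provide this. */
#ifndef CONFIG_NR_NODES
#define CONFIG_NR_NODES 64
#endif

#define NR_NODES     CONFIG_NR_NODES
#define NUMA_NO_NODE 0xFF

typedef uint8_t nodeid_t;

/* Valid node IDs are 0 .. NR_NODES - 1 and must never reach NUMA_NO_NODE. */
_Static_assert(NR_NODES >= 1, "at least one node is required");
_Static_assert(NR_NODES <= 254, "node IDs must stay below NUMA_NO_NODE");

/* Returns 1 when nid is a usable node ID, 0 otherwise. */
int nodeid_is_valid(nodeid_t nid)
{
    return nid < NR_NODES;
}
```

The Kconfig range has the advantage of rejecting a bad value at configuration
time, before any code is compiled.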


(Not directed to you Wei :))

Actually, looking again at Xen, we are moving away from a power of 2. So 
what is the rationale behind that? Have you looked at why describing the 
number of nodes in terms of a power of 2 was chosen?


Cheers,

--
Julien Grall



[Xen-devel] [linux-linus test] 111995: regressions - FAIL

2017-07-19 Thread osstest service owner
flight 111995 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/111995/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-intel  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-qemuu-nested-intel  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-amd64-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-i386-libvirt-xsm  16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-amd64-xl-xsm  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-libvirt 16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-amd64-xl-credit2  15 guest-saverestore fail REGR. vs. 110515
 test-amd64-i386-xl   16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 21 guest-start/debian  fail REGR. vs. 110515
 test-amd64-amd64-xl-multivcpu 15 guest-saverestore   fail REGR. vs. 110515
 test-amd64-amd64-pair 21 guest-start/debian   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-xsm 16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-i386-libvirt  16 guest-saverestore.2  fail REGR. vs. 110515
 test-amd64-i386-xl-xsm   16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-amd  16 guest-localmigrate   fail REGR. vs. 110515
 test-amd64-i386-libvirt-pair 21 guest-start/debian   fail REGR. vs. 110515
 test-amd64-i386-pair 21 guest-start/debian   fail REGR. vs. 110515
 test-amd64-amd64-pygrub   7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail REGR. vs. 110515
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 110515
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 110515
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 110515
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 110515
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 110515
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail  like 110515
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-rtds

Re: [Xen-devel] [PATCH v5 13/17] osstest: introduce a script to set the runtime hostflags runvar for FreeBSD jobs

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 13/17] osstest: introduce a script to set 
the runtime hostflags runvar for FreeBSD jobs"):
> Due to the nature of the FreeBSD install media, which is
> self-generated from the ts-freebsd-build script, the hostflags runvar
> set to FreeBSD jobs are related to the current version under test.
> 
> The following hostflags might need to be fetched from the runvars of a
> previous build-$arch-freebsd job:
...
> +our $share;
> +if (@ARGV && $ARGV[0] eq "--share") {
> +$share = 1;
> +shift @ARGV;
> +}

I think the remaining arguments should be host idents.

Also you should check that the first ident doesn't start with -.
(simply calling die if it does is fine).

> +my $version = get_freebsd_version();
> +set_runtime_hostflag("host", "freebsd-$version");

Specifically, you should iterate that, and this ...

> +if ($share) {
> +my $hash = get_freebsd_image_hash();
> +
> +set_runtime_hostflag("host", "share-build-freebsd-$hash");
> +}

for each entry in @ARGV.

That way this script can be used for pair tests etc.

ian.



Re: [Xen-devel] [PATCH v5 12/17] osstest: add support for runtime_IDENT_hostflags

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 12/17] osstest: add support for 
runtime_IDENT_hostflags"):
> This is required for FreeBSD, that will need to set some of the
> hostflags at runtime. The current IDENT_hostflags will be keep as-is,
> and they should only be set at job creation time.
> 
> Also introduce a helper to set the runtime hostflags.

Acked-by: Ian Jackson 



Re: [Xen-devel] [RFC 06/22] kvm: Adapt assembly for PIE support

2017-07-19 Thread Thomas Garnier
On Tue, Jul 18, 2017 at 7:49 PM, Brian Gerst  wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>> Change the assembly code to use only relative references of symbols for the
>> kernel to be PIE compatible. The new __ASM_GET_PTR_PRE macro is used to
>> get the address of a symbol on both 32 and 64-bit with PIE support.
>>
>> Position Independent Executable (PIE) support will allow to extended the
>> KASLR randomization range below the -2G memory limit.
>>
>> Signed-off-by: Thomas Garnier 
>> ---
>>  arch/x86/include/asm/kvm_host.h | 6 --
>>  arch/x86/kernel/kvm.c   | 6 --
>>  arch/x86/kvm/svm.c  | 4 ++--
>>  3 files changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h 
>> b/arch/x86/include/asm/kvm_host.h
>> index 87ac4fba6d8e..3041201a3aeb 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1352,9 +1352,11 @@ asmlinkage void kvm_spurious_fault(void);
>> ".pushsection .fixup, \"ax\" \n" \
>> "667: \n\t" \
>> cleanup_insn "\n\t"   \
>> -   "cmpb $0, kvm_rebooting \n\t" \
>> +   "cmpb $0, kvm_rebooting" __ASM_SEL(,(%%rip)) " \n\t" \
>> "jne 668b \n\t"   \
>> -   __ASM_SIZE(push) " $666b \n\t"\
>> +   __ASM_SIZE(push) "%%" _ASM_AX " \n\t"   \
>> +   __ASM_GET_PTR_PRE(666b) "%%" _ASM_AX "\n\t" \
>> +   "xchg %%" _ASM_AX ", (%%" _ASM_SP ") \n\t"  \
>> "call kvm_spurious_fault \n\t"\
>> ".popsection \n\t" \
>> _ASM_EXTABLE(666b, 667b)
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index 71c17a5be983..53b8ad162589 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -618,8 +618,10 @@ asm(
>>  ".global __raw_callee_save___kvm_vcpu_is_preempted;"
>>  ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
>>  "__raw_callee_save___kvm_vcpu_is_preempted:"
>> -"movq  __per_cpu_offset(,%rdi,8), %rax;"
>> -"cmpb  $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
>> +"leaq  __per_cpu_offset(%rip), %rax;"
>> +"movq  (%rax,%rdi,8), %rax;"
>> +"addq  " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
>
> This doesn't look right.  It's accessing a per-cpu variable.  The
> per-cpu section is an absolute, zero-based section and not subject to
> relocation.
>

PIE does not respect the zero-based section; it tries to have
everything relative. Patch 16/22 also adapts per-cpu to work with PIE
(while keeping the zero absolute design by default).

>> +"cmpb  $0, (%rax);"
>>  "setne %al;"
>>  "ret;"
>>  ".popsection");
>
> --
> Brian Gerst



-- 
Thomas



Re: [Xen-devel] [PATCH v5 11/17] osstest: introduce a FreeBSD build script

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 11/17] osstest: introduce a FreeBSD build 
script"):
> In order to generate the FreeBSD installer image and the install
> media.

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH v5 10/17] osstest: add support for the FreeBSD package manager

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 10/17] osstest: add support for the FreeBSD 
package manager"):
> FreeBSD support is added to target_install_packages and
> target_install_packages_norec, although there's no equivalent to the
> --no-install-recommends in the FreeBSD package manager.
> 
> Signed-off-by: Roger Pau Monné 
...
> +sub package_install_cmd {
  ^
   (;$)

Should have a prototype.  Sorry for not spotting this before.

With that fixed,

Acked-by: Ian Jackson 

Thanks,
Ian.



Re: [Xen-devel] [PATCH v5 08/17] osstest: add a FreeBSD host install script

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 08/17] osstest: add a FreeBSD host install 
script"):
> The installation is performed using the bsdinstall tool, which is part
> of the FreeBSD base system. The installer image is setup with the
> osstest ssh keys and sshd enabled by default, which allows the test
> harness to just ssh into the box, create the install config file and
> launch the scripted install.

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH v5 07/17] osstest: introduce rename_shared_mark_ready

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 07/17] osstest: introduce 
rename_shared_mark_ready"):
> That allows marking a host as ready to be shared. Replace the current
> caller that open-codes it.

You got the Subject wrong.  "rename" for "resource".  :-)

With that fixed:

Acked-by: Ian Jackson 

Thanks,
Ian.



Re: [Xen-devel] [PATCH v5 06/17] osstest: add executive prefix to resource_shared_mark_ready

2017-07-19 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v5 06/17] osstest: add executive prefix to 
resource_shared_mark_ready"):
> This is a non-functional change in preparation for introducing a
> resource_shared_mark_ready in TestSupport.
> 
> Signed-off-by: Roger Pau Monné 

Acked-by: Ian Jackson 



Re: [Xen-devel] Xen 4.10 Development Update

2017-07-19 Thread Daniel Kiper
Hey Julien,

On Mon, Jul 17, 2017 at 02:26:22PM +0100, Julien Grall wrote:
> This email only tracks big items for xen.git tree. Please reply for items you
> would like to see in 4.10 so that people have an idea what is going on and
> prioritise accordingly.
>
> You're welcome to provide description and use cases of the feature you're
> working on.
>
> = Timeline =
>
> We now adopt a fixed cut-off date scheme. We will release twice a
> year. The upcoming 4.10 timeline is as follows:
>
> * Last posting date: September 15th, 2017
> * Hard code freeze: September 29th, 2017
> * RC1: TBD
> * Release: December 2, 2017
>
> Note that we don't have freeze exception scheme anymore. All patches
> that wish to go into 4.10 must be posted no later than the last posting
> date. All patches posted after that date will be automatically queued
> into next release.
>
> RCs will be arranged immediately after freeze.
>
> We recently introduced a jira instance to track all the tasks (not only big)
> for the project. See: https://xenproject.atlassian.net/projects/XEN/issues.
>
> Most of the tasks tracked by this e-mail also have a corresponding jira task
> referred by XEN-N.
>
> I have started to include the version number of series associated to each
> feature. Can each owner send an update on the version number if the series
> was posted upstream?
>
> = Projects =
>
> == Hypervisor ==
>
> *  Per-cpu tasklet
>   -  XEN-28
>   -  Konrad Rzeszutek Wilk
>
> *  Add support of rcu_idle_{enter,exit}
>   -  XEN-27
>   -  Dario Faggioli
>
> === x86 ===

Could you add the following project to the list?

*  Change xen.efi build and add SHIM_LOCK verification into efi_multiboot2()
  -  Daniel Kiper

This is probably more 4.11 material but let's have it on the 4.10 list too.
Who knows what may happen.

Thanks,

Daniel



Re: [Xen-devel] [PATCH 00/25 v6] SBSA UART emulation support in Xen

2017-07-19 Thread Julien Grall
Hi Bhupinder,

I've tried this series today on an ARM64 platform. When I enable pl011 for the
guest, I am not able to fully destroy the guest. It stays in zombie mode:

42sh> xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  3072     2  r-----     62.1
(null)       6     0     2  --p--d      1.5

It does not happen when I don't have pl011 enabled in the guest config.

The steps to reproduce it are:

42sh> xl create guest.cfg
Parsing config from guest.cfg
42sh> xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  3072     2  r-----     12.9
guest        1  3265     1  r-----      0.1
42sh> xl destroy guest
42sh> xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  3072     2  r-----     14.9
(null)       1     0     1  --p--d      0.1

And my guest.cfg is:

42sh> cat guest.cfg
kernel="/home/julien/works/guest/Image"
name="guest"
memory=3265
vcpus=2
vuart="sbsa_uart"

I haven't dug into the problem but I would look at how you unmap the ring from
Xen and xenconsole. Likely we still have a reference on it.

Let me know if you need any help.

Cheers,

On 17/07/17 14:06, Bhupinder Thakur wrote:
> SBSA UART emulation for guests in Xen
> =====================================
> Linaro has published VM System specification for ARM Processors, which
> provides a set of guidelines for both guest OS and hypervisor implementations,
> such that building OS images according to these guidelines guarantees
> that those images can also run on hypervisors compliant with this
> specification.
> 
> One of the spec requirements is that the hypervisor must provide an
> emulated SBSA UART as a serial console which meets the minimum requirements
> in SBSA UART as defined in appendix B of the following
> ARM Server Base Architecture Document:
> 
> https://static.docs.arm.com/den0029/a/Server_Base_System_Architecture_v3_1_ARM_DEN_0029A.pdf.
> 
> This feature allows Xen guests to use an SBSA-compliant pl011 UART as
> a console.
> 
> Note that SBSA pl011 UART is a subset of full featured ARM pl011 UART and
> supports only a subset of registers as mentioned below. It does not support
> rx/tx DMA.
> 
> Currently, Xen supports paravirtualized (aka PV console) and emulated serial
> consoles. This feature will expose an emulated SBSA pl011 UART console to the
> guest, which a user can access using xenconsole.
> 
> The device tree passed to the guest VM will contain the pl011 MMIO address 
> range and an irq for receiving rx/tx pl011 interrupts. The device tree format 
> is specified in Documentation/devicetree/bindings/serial/arm_sbsa_uart.txt.
> 
> The Xen hypervisor will expose two types of interfaces to the backend and 
> domU. 
> 
> The interface exposed to domU will be an emulated pl011 UART by emulating the 
> access to the following pl011 registers by the guest.
> 
> - Data register (DR)                     - RW
> - Raw interrupt status register (RIS)    - RO
> - Masked interrupt status register (MIS) - RO
> - Interrupt Mask (IMSC)                  - RW
> - Interrupt Clear (ICR)                  - WO
> 
> It will also inject the pl011 interrupts to the guest in the following 
> conditions:
> 
> - incoming data in the rx buffer for the guest
> - there is space in the tx buffer for the guest to write more data
> 
> The interface exposed to the backend will be the same PV console interface, 
> which minimizes the changes required in xenconsole to support a new pl011 
> console.
> 
> This interface has rx and tx ring buffers and an event channel for 
> sending/receiving events from the backend. 
> 
> So essentially Xen handles the data on behalf of domU and the backend. Any
> data written by domU is captured by Xen and written to the TX (OUT) ring
> buffer and a pl011 event is raised to the backend to read the TX ring buffer.
>
> Similarly on receiving a pl011 event, Xen injects an interrupt to guest to
> indicate there is data available in the RX (IN) ring buffer.
> 
> The pl011 UART state is completely captured in the set of registers
> mentioned above and this state is updated every time there is an event from
> the backend or there is register read/write access from domU.
> 
> For example, if domU has masked the rx interrupt in the IMSC register, then
> Xen will not inject an interrupt to guest and will just update the RIS
> register. Once the interrupt is unmasked by guest, the interrupt will be
> delivered to the guest.
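
For illustration only (this is not code from the series): the masking rule
described above can be sketched in a few lines of C. The struct, the demo_*
names and the DEMO_RX_INT bit position are assumptions made for this sketch.
The only behaviour taken from the text is that RIS always records the event,
while an interrupt is injected only when the guest has enabled it in IMSC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit for the rx interrupt; treat the position as an assumption. */
#define DEMO_RX_INT (1u << 4)

/* Hypothetical per-domain state; not the structure used by the patches. */
struct demo_vuart {
    uint32_t ris;   /* raw interrupt status: recorded whether masked or not */
    uint32_t imsc;  /* interrupts the guest has enabled */
};

/* MIS is derived state: raw interrupts that are also enabled. */
uint32_t demo_mis(const struct demo_vuart *u)
{
    return u->ris & u->imsc;
}

/* Event from the backend: always update RIS, inject only if unmasked. */
bool demo_backend_event(struct demo_vuart *u, uint32_t bits)
{
    u->ris |= bits;
    return demo_mis(u) != 0;
}

/* Guest write to IMSC: a pending raw interrupt becomes deliverable. */
bool demo_write_imsc(struct demo_vuart *u, uint32_t val)
{
    u->imsc = val;
    return demo_mis(u) != 0;
}

/* Guest write to ICR: clear the selected raw status bits. */
void demo_write_icr(struct demo_vuart *u, uint32_t val)
{
    u->ris &= ~val;
}
```

Both demo_backend_event() and demo_write_imsc() return whether an interrupt
would need to be injected, which covers the "masked, then unmasked later"
case from the example.
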
> 
> Changes summary:
> 
> Xen Hypervisor
> ===
> 
> 1. Add emulation code to emulate read/write access to pl011 registers and
>    pl011 interrupts:
> - It emulates DR read/write by reading and writing from/to the IN 

Re: [Xen-devel] [PATCH v2] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Greg KH
On Wed, Jul 19, 2017 at 04:51:02PM +0200, Juergen Gross wrote:
> On 19/07/17 16:43, Greg KH wrote:
> > From: Greg Kroah-Hartman 
> > 
> > It's better to be explicit and use the DRIVER_ATTR_RW() and
> > DRIVER_ATTR_RO() macros when defining a driver's sysfs file.
> > 
> > Bonus is this fixes up a checkpatch.pl warning.
> > 
> > This is part of a series to drop DRIVER_ATTR() from the tree entirely.
> > 
> > Cc: Boris Ostrovsky 
> > Cc: Juergen Gross 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> Reviewed-by: Juergen Gross 
> 
> I'll take this through the Xen tree, unless you want to use your tree.

If I can take it through mine, then I could drop DRIVER_ATTR() from the
whole tree for the next kernel release, which would be ideal.

But if you want to take it, that's fine, I can wait another release, no
rush.

thanks,

greg k-h

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Juergen Gross
On 19/07/17 16:43, Greg KH wrote:
> From: Greg Kroah-Hartman 
> 
> It's better to be explicit and use the DRIVER_ATTR_RW() and
> DRIVER_ATTR_RO() macros when defining a driver's sysfs file.
> 
> Bonus is this fixes up a checkpatch.pl warning.
> 
> This is part of a series to drop DRIVER_ATTR() from the tree entirely.
> 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Signed-off-by: Greg Kroah-Hartman 

Reviewed-by: Juergen Gross 

I'll take this through the Xen tree, unless you want to use your tree.


Thanks,

Juergen



[Xen-devel] [PATCH v2] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Greg KH
From: Greg Kroah-Hartman 

It's better to be explicit and use the DRIVER_ATTR_RW() and
DRIVER_ATTR_RO() macros when defining a driver's sysfs file.

Bonus is this fixes up a checkpatch.pl warning.

This is part of a series to drop DRIVER_ATTR() from the tree entirely.

Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Signed-off-by: Greg Kroah-Hartman 
---
v2: fix build error (quirks_store was wrong), thanks to Juergen for the
catch, it's now correctly build tested locally...

 drivers/xen/xen-pciback/pci_stub.c |   44 -
 1 file changed, 20 insertions(+), 24 deletions(-)


--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -1172,8 +1172,8 @@ out:
return err;
 }
 
-static ssize_t pcistub_slot_add(struct device_driver *drv, const char *buf,
-   size_t count)
+static ssize_t new_slot_store(struct device_driver *drv, const char *buf,
+ size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1189,10 +1189,10 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(new_slot, S_IWUSR, NULL, pcistub_slot_add);
+static DRIVER_ATTR_WO(new_slot);
 
-static ssize_t pcistub_slot_remove(struct device_driver *drv, const char *buf,
-  size_t count)
+static ssize_t remove_slot_store(struct device_driver *drv, const char *buf,
+size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1208,9 +1208,9 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(remove_slot, S_IWUSR, NULL, pcistub_slot_remove);
+static DRIVER_ATTR_WO(remove_slot);
 
-static ssize_t pcistub_slot_show(struct device_driver *drv, char *buf)
+static ssize_t slots_show(struct device_driver *drv, char *buf)
 {
struct pcistub_device_id *pci_dev_id;
size_t count = 0;
@@ -1231,9 +1231,9 @@ static ssize_t pcistub_slot_show(struct
 
return count;
 }
-static DRIVER_ATTR(slots, S_IRUSR, pcistub_slot_show, NULL);
+static DRIVER_ATTR_RO(slots);
 
-static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
+static ssize_t irq_handlers_show(struct device_driver *drv, char *buf)
 {
struct pcistub_device *psdev;
struct xen_pcibk_dev_data *dev_data;
@@ -1260,11 +1260,10 @@ static ssize_t pcistub_irq_handler_show(
spin_unlock_irqrestore(_devices_lock, flags);
return count;
 }
-static DRIVER_ATTR(irq_handlers, S_IRUSR, pcistub_irq_handler_show, NULL);
+static DRIVER_ATTR_RO(irq_handlers);
 
-static ssize_t pcistub_irq_handler_switch(struct device_driver *drv,
- const char *buf,
- size_t count)
+static ssize_t irq_handler_state_store(struct device_driver *drv,
+  const char *buf, size_t count)
 {
struct pcistub_device *psdev;
struct xen_pcibk_dev_data *dev_data;
@@ -1301,11 +1300,10 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(irq_handler_state, S_IWUSR, NULL,
-  pcistub_irq_handler_switch);
+static DRIVER_ATTR_WO(irq_handler_state);
 
-static ssize_t pcistub_quirk_add(struct device_driver *drv, const char *buf,
-size_t count)
+static ssize_t quirks_store(struct device_driver *drv, const char *buf,
+   size_t count)
 {
int domain, bus, slot, func, reg, size, mask;
int err;
@@ -1323,7 +1321,7 @@ out:
return err;
 }
 
-static ssize_t pcistub_quirk_show(struct device_driver *drv, char *buf)
+static ssize_t quirks_show(struct device_driver *drv, char *buf)
 {
int count = 0;
unsigned long flags;
@@ -1366,11 +1364,10 @@ out:
 
return count;
 }
-static DRIVER_ATTR(quirks, S_IRUSR | S_IWUSR, pcistub_quirk_show,
-  pcistub_quirk_add);
+static DRIVER_ATTR_RW(quirks);
 
-static ssize_t permissive_add(struct device_driver *drv, const char *buf,
- size_t count)
+static ssize_t permissive_store(struct device_driver *drv, const char *buf,
+   size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1431,8 +1428,7 @@ static ssize_t permissive_show(struct de
spin_unlock_irqrestore(_devices_lock, flags);
return count;
 }
-static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show,
-  permissive_add);
+static DRIVER_ATTR_RW(permissive);
 
 static void pcistub_exit(void)
 {



[Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-19 Thread Punit Agrawal

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got sparser
towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in advance.

Although the session was well attended, some of the more active
discussions involved - Julien Grall, Stefano Stabellini, Roger Pau
Monné, Jan Beulich, Vikram Sethi. I'm sure I am missing some folks here.

Please do point out any mistakes I've made for the audience's benefit.

* Discovery of PCI hostbridges
  - Dom0 will be responsible for scanning the ECAM for devices and
    registering them with Xen. This approach is chosen due to the variety
    of non-standard PCI controllers on ARM platforms and the desire to
    not duplicate driver code between Linux and Xen.
  - Jan, Roger: Bus scan needs to happen before device discovery;
    otherwise there is a small window where Xen doesn't know which host
    bridge the device is registered on (as it'll likely only refer to
    the segment number).
  - Roger: Registering config space with Xen before device discovery
    will allow the hypervisor to set access traps for certain
    functionality as appropriate.
  - Jan: Xen and Dom0 have to agree on the PCI segment number mapping
    to host bridges. This is so that for future calls, Dom0 and
    hypervisor can communicate using sBDF without ambiguity.
  - Julien: Dom0 will register config space address and segment
    number. mcfg_add will be used to pass the segment to Xen.
  - PCI segment - it's purely a software construct to identify
    different host bridges.
  - Some discussion on whether boot devices need to be on
    Segment 0. Technically, MCFG is only required to describe Segment
    0 - other host bridges can be described in AML.

* Configuration accesses for non-ECAM compliant host bridge
  - Julien proposed these to be forwarded to Dom0 for handling.
  - Audience: What kind of non-compliance are we talking about? If
    they are simple, can they be implemented in Xen in a few lines of
    code?
  - A few different types
    - restrictions on access size, e.g., only certain sizes supported
    - register multiplexing via a window; similar to legacy x86 PCI
      access mechanism
    - ECAM compliant but with special casing for different devices

* Support on 32bit platforms
  - Is there enough address space to map ECAM into Dom0? Maximum ECAM
    size is 256MB.
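
For reference, the 256MB figure follows from the standard ECAM layout: every
function gets a 4KB configuration window, and there are up to 256 buses, 32
devices per bus and 8 functions per device in a segment. A small sketch of
the offset arithmetic (the function names are made up for this example; they
are not Xen or Linux helpers):

```c
#include <stdint.h>

/* Offset of a config register inside one segment's ECAM window. */
uint64_t ecam_offset(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    return ((uint64_t)(bus & 0xff) << 20) |  /* 1MB per bus      */
           ((uint64_t)(dev & 0x1f) << 15) |  /* 32KB per device  */
           ((uint64_t)(fn & 0x7) << 12) |    /* 4KB per function */
           (uint64_t)(reg & 0xfff);          /* register offset  */
}

/* Size of the ECAM window needed to cover nr_buses buses. */
uint64_t ecam_size(unsigned nr_buses)
{
    return (uint64_t)nr_buses << 20;
}
```

A host bridge that decodes fewer than 256 buses needs a proportionally
smaller window, which may matter for the 32bit address space question above.
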

* PCI ACS support
  - Vikram: Xen needs to be aware of the PCI device topology to
    correctly set up device groups for passthrough.
  - Jan, Roger: IIRC, Xen is already aware of the device topology
    though it doesn't use ACS to work out which devices need to be
    passed to guest as a group.
  - Stefano: There was support in xend (previous Xen toolstack) but the
    functionality has not yet been ported to libxl.

* Implementation milestones
  - Julien provided a summary of the breakdown
    - M0 - design document, currently under discussion on xen-devel
    - M1 - PCI support in Xen
      - Xen aware of PCI devices (via Dom0 registration)
    - M2 - Guest PCIe passthrough
      - Julien: Some complexity in dealing with Legacy interrupts as they
        can be shared.
      - Roger: MSIs mandatory for PCIe. So legacy interrupts can be
        tackled at a later stage.
    - M3 - testing
      - fuzzing. Jan: If implemented it'll be better than what x86
        currently has.



Re: [Xen-devel] [PATCH 03/15] xen: x86: add early stage SGX feature detection

2017-07-19 Thread Andrew Cooper
On 09/07/17 09:09, Kai Huang wrote:
> This patch adds early stage SGX feature detection via SGX CPUID 0x12. Function
> detect_sgx is added to detect SGX info on each CPU (called from vmx_cpu_up).
> SDM says SGX info returned by CPUID is per-thread, and we cannot assume all
> threads will return the same SGX info, so we have to detect SGX for each CPU.
> For simplicity, currently SGX is only supported when all CPUs report the same
> SGX info.
>
> SDM also says it's possible to have multiple EPC sections but this is only for
> multiple-socket server, which we don't support now (there are other things
> that need to be done, e.g., NUMA EPC, scheduling, etc, as well), so currently
> only one EPC is supported.
>
> Dedicated files sgx.c and sgx.h are added (under vmx directory as SGX is Intel
> specific) for bulk of above SGX detection code, and for further
> SGX code as well.
>
> Signed-off-by: Kai Huang 

I am not sure putting this under hvm/ is a sensible move.  Almost
everything in this patch is currently common, and I can foresee us
wanting to introduce PV support, so it would be good to introduce this
in a guest-neutral location to begin with.

> ---
>  xen/arch/x86/hvm/vmx/Makefile |   1 +
>  xen/arch/x86/hvm/vmx/sgx.c| 208 
> ++
>  xen/arch/x86/hvm/vmx/vmcs.c   |   4 +
>  xen/include/asm-x86/cpufeature.h  |   1 +
>  xen/include/asm-x86/hvm/vmx/sgx.h |  45 +
>  5 files changed, 259 insertions(+)
>  create mode 100644 xen/arch/x86/hvm/vmx/sgx.c
>  create mode 100644 xen/include/asm-x86/hvm/vmx/sgx.h
>
> diff --git a/xen/arch/x86/hvm/vmx/Makefile b/xen/arch/x86/hvm/vmx/Makefile
> index 04a29ce59d..f6bcf0d143 100644
> --- a/xen/arch/x86/hvm/vmx/Makefile
> +++ b/xen/arch/x86/hvm/vmx/Makefile
> @@ -4,3 +4,4 @@ obj-y += realmode.o
>  obj-y += vmcs.o
>  obj-y += vmx.o
>  obj-y += vvmx.o
> +obj-y += sgx.o
> diff --git a/xen/arch/x86/hvm/vmx/sgx.c b/xen/arch/x86/hvm/vmx/sgx.c
> new file mode 100644
> index 00..6b41469371
> --- /dev/null
> +++ b/xen/arch/x86/hvm/vmx/sgx.c

This file looks like it should be arch/x86/sgx.c, given its current content.

> @@ -0,0 +1,208 @@
> +/*
> + * Intel Software Guard Extensions support

Please include a GPLv2 header.

> + *
> + * Author: Kai Huang 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static struct sgx_cpuinfo __read_mostly sgx_cpudata[NR_CPUS];
> +static struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;

I don't think any of this is necessary.  The description says that all
EPCs across the server will be reported in CPUID subleaves, and our
implementation gives up if the data are non-identical across CPUs.

Therefore, we only need to keep one copy of the data, and check
APs against the master copy.
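
A rough sketch of the single-copy idea, to make the suggestion concrete. The
struct layout and all demo_* names are hypothetical; they are not taken from
the patch, and the real data would come from CPUID leaf 0x12.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the SGX data reported by CPUID on one CPU. */
struct demo_sgx_info {
    uint32_t cap;
    uint64_t epc_base;
    uint64_t epc_size;
};

static struct demo_sgx_info demo_master;  /* the only stored copy */
static bool demo_sgx_usable;

/* Compare field by field so struct padding never affects the result. */
static bool demo_sgx_same(const struct demo_sgx_info *a,
                          const struct demo_sgx_info *b)
{
    return a->cap == b->cap &&
           a->epc_base == b->epc_base &&
           a->epc_size == b->epc_size;
}

/*
 * Called as each CPU comes up.  The boot CPU records the master copy;
 * every AP is only compared against it.  A single mismatch disables
 * SGX for good, matching the "give up if non-identical" behaviour.
 */
bool demo_sgx_cpu_up(const struct demo_sgx_info *info, bool is_boot_cpu)
{
    if (is_boot_cpu) {
        demo_master = *info;
        demo_sgx_usable = true;
        return true;
    }

    if (!demo_sgx_same(&demo_master, info))
        demo_sgx_usable = false;

    return demo_sgx_usable;
}
```

This drops the NR_CPUS-sized array entirely; the cost is that a mismatching
AP cannot be reported in detail later, only at the time it is detected.
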


Let me see about splitting up a few bits of the existing CPUID
infrastructure, so we can use the host cpuid policy more effectively for
Xen related things.

~Andrew



Re: [Xen-devel] [PATCH] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Greg KH
On Wed, Jul 19, 2017 at 03:17:53PM +0200, Juergen Gross wrote:
> On 19/07/17 14:58, Greg KH wrote:
> > From: Greg Kroah-Hartman 
> > 
> > It's better to be explicit and use the DRIVER_ATTR_RW() and
> > DRIVER_ATTR_RO() macros when defining a driver's sysfs file.
> > 
> > Bonus is this fixes up a checkpatch.pl warning.
> > 
> > This is part of a series to drop DRIVER_ATTR() from the tree entirely.
> > 
> > Cc: Boris Ostrovsky 
> > Cc: Juergen Gross 
> > Signed-off-by: Greg Kroah-Hartman 
> > 
> > ---
> >  drivers/xen/xen-pciback/pci_stub.c |   44 
> > -
> >  1 file changed, 20 insertions(+), 24 deletions(-)
> > 
> > 
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> > @@ -1172,8 +1172,8 @@ out:
> > return err;
> >  }
> >  
> > -static ssize_t pcistub_slot_add(struct device_driver *drv, const char *buf,
> > -   size_t count)
> > +static ssize_t new_slot_store(struct device_driver *drv, const char *buf,
> > + size_t count)
> >  {
> > int domain, bus, slot, func;
> > int err;
> > @@ -1189,10 +1189,10 @@ out:
> > err = count;
> > return err;
> >  }
> > -static DRIVER_ATTR(new_slot, S_IWUSR, NULL, pcistub_slot_add);
> > +static DRIVER_ATTR_WO(new_slot);
> >  
> > -static ssize_t pcistub_slot_remove(struct device_driver *drv, const char *buf,
> > -  size_t count)
> > +static ssize_t remove_slot_store(struct device_driver *drv, const char *buf,
> > +size_t count)
> >  {
> > int domain, bus, slot, func;
> > int err;
> > @@ -1208,9 +1208,9 @@ out:
> > err = count;
> > return err;
> >  }
> > -static DRIVER_ATTR(remove_slot, S_IWUSR, NULL, pcistub_slot_remove);
> > +static DRIVER_ATTR_WO(remove_slot);
> >  
> > -static ssize_t pcistub_slot_show(struct device_driver *drv, char *buf)
> > +static ssize_t slots_show(struct device_driver *drv, char *buf)
> >  {
> > struct pcistub_device_id *pci_dev_id;
> > size_t count = 0;
> > @@ -1231,9 +1231,9 @@ static ssize_t pcistub_slot_show(struct
> >  
> > return count;
> >  }
> > -static DRIVER_ATTR(slots, S_IRUSR, pcistub_slot_show, NULL);
> > +static DRIVER_ATTR_RO(slots);
> >  
> > -static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
> > +static ssize_t irq_handlers_show(struct device_driver *drv, char *buf)
> >  {
> > struct pcistub_device *psdev;
> > struct xen_pcibk_dev_data *dev_data;
> > @@ -1260,11 +1260,10 @@ static ssize_t pcistub_irq_handler_show(
> > spin_unlock_irqrestore(_devices_lock, flags);
> > return count;
> >  }
> > -static DRIVER_ATTR(irq_handlers, S_IRUSR, pcistub_irq_handler_show, NULL);
> > +static DRIVER_ATTR_RO(irq_handlers);
> >  
> > -static ssize_t pcistub_irq_handler_switch(struct device_driver *drv,
> > - const char *buf,
> > - size_t count)
> > +static ssize_t irq_handler_state_store(struct device_driver *drv,
> > +  const char *buf, size_t count)
> >  {
> > struct pcistub_device *psdev;
> > struct xen_pcibk_dev_data *dev_data;
> > @@ -1301,11 +1300,10 @@ out:
> > err = count;
> > return err;
> >  }
> > -static DRIVER_ATTR(irq_handler_state, S_IWUSR, NULL,
> > -  pcistub_irq_handler_switch);
> > +static DRIVER_ATTR_WO(irq_handler_state);
> >  
> > -static ssize_t pcistub_quirk_add(struct device_driver *drv, const char *buf,
> > -size_t count)
> > +static ssize_t quirks_add(struct device_driver *drv, const char *buf,
> > + size_t count)
> 
> Shouldn't this be named quirks_store()?

Odd, yes, it should, I don't know how my build-testing didn't catch
that, sorry.
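
For anyone following along, the reason the name matters: DRIVER_ATTR_RW(quirks)
builds its handler names by token pasting, so it expects functions called
quirks_show and quirks_store. Below is a userspace sketch of that convention.
The DEMO_ macro, the demo_* types and the mode value are stand-ins written for
this example, not the kernel definitions.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct demo_driver;

struct demo_driver_attribute {
    const char *name;
    unsigned short mode;
    long (*show)(struct demo_driver *drv, char *buf);
    long (*store)(struct demo_driver *drv, const char *buf, size_t count);
};

/* Same naming convention as DRIVER_ATTR_RW(): <name>_show and <name>_store. */
#define DEMO_DRIVER_ATTR_RW(_name)                                  \
    struct demo_driver_attribute demo_attr_##_name = {              \
        #_name, 0644, _name##_show, _name##_store }

static char demo_quirks[32] = "none";

long quirks_show(struct demo_driver *drv, char *buf)
{
    (void)drv;
    strcpy(buf, demo_quirks);
    return (long)strlen(buf);
}

long quirks_store(struct demo_driver *drv, const char *buf, size_t count)
{
    (void)drv;
    if (count >= sizeof(demo_quirks))
        return -1;
    memcpy(demo_quirks, buf, count);
    demo_quirks[count] = '\0';
    return (long)count;
}

/*
 * Expands to references to quirks_show and quirks_store.  If the store
 * handler were named quirks_add instead, quirks_store would be undeclared
 * here and the build would fail.
 */
DEMO_DRIVER_ATTR_RW(quirks);
```
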

Let me go make a v2...

thanks,

greg k-h



Re: [Xen-devel] [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation

2017-07-19 Thread Andrew Cooper
On 19/07/17 15:12, Roger Pau Monné wrote:
> On Wed, Jul 19, 2017 at 02:27:31PM +0100, Andrew Cooper wrote:
>> These functions don't need is_hvm_{vcpu,domain}() predicates.
>>
>> hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
>> caller accessing the hvm union, but swap the copy_from_guest() and
>> is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
>> not going to use it.
> IC, certain HVMOPs are available to PV guests (ie: the control domain).

At the very least, the control domain needs to use HVMOP_getparam for
construction and migration purposes.  As a result, PV guests have always
had blanket reign on HVMOPs.

>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Roger Pau Monné 
>
> Thanks. Just one style nit.
>
>> CC: George Dunlap 
>> CC: Jan Beulich 
>> CC: Wei Liu 
>> CC: Paul Durrant 
>> CC: Roger Pau Monné 
>> ---
>>  xen/arch/x86/hvm/hvm.c | 15 ++-
>>  1 file changed, 6 insertions(+), 9 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 8145385..4fef616 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -506,8 +506,7 @@ void hvm_do_resume(struct vcpu *v)
>>  {
>>  check_wakeup_from_wait();
>>  
>> -if ( is_hvm_domain(v->domain) )
>> -pt_restore_timer(v);
>> +pt_restore_timer(v);
>>  
>>  if ( !handle_hvm_io_completion(v) )
>>  return;
>> @@ -1544,8 +1543,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
>>  tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
>>  hvm_funcs.vcpu_destroy(v);
>>  
>> -if ( is_hvm_vcpu(v) )
>> -vlapic_destroy(v);
>> +vlapic_destroy(v);
>>  
>>  hvm_vcpu_cacheattr_destroy(v);
>>  }
>> @@ -1711,7 +1709,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
>> long gla,
>>   * - newer Windows (like Server 2012) for HPET accesses.
>>   */
>>  if ( !nestedhvm_vcpu_in_guestmode(curr)
>> - && is_hvm_domain(currd)
>>   && hvm_mmio_internal(gpa) )
> Can this be moved to the previous line?

Will fix on commit.

~Andrew



Re: [Xen-devel] [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation

2017-07-19 Thread Roger Pau Monné
On Wed, Jul 19, 2017 at 02:27:31PM +0100, Andrew Cooper wrote:
> These functions don't need is_hvm_{vcpu,domain}() predicates.
> 
> hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
> caller accessing the hvm union, but swap the copy_from_guest() and
> is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
> not going to use it.

IC, certain HVMOPs are available to PV guests (ie: the control domain).

> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

Thanks. Just one style nit.

> CC: George Dunlap 
> CC: Jan Beulich 
> CC: Wei Liu 
> CC: Paul Durrant 
> CC: Roger Pau Monné 
> ---
>  xen/arch/x86/hvm/hvm.c | 15 ++-
>  1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 8145385..4fef616 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -506,8 +506,7 @@ void hvm_do_resume(struct vcpu *v)
>  {
>  check_wakeup_from_wait();
>  
> -if ( is_hvm_domain(v->domain) )
> -pt_restore_timer(v);
> +pt_restore_timer(v);
>  
>  if ( !handle_hvm_io_completion(v) )
>  return;
> @@ -1544,8 +1543,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
> >  tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
>  hvm_funcs.vcpu_destroy(v);
>  
> -if ( is_hvm_vcpu(v) )
> -vlapic_destroy(v);
> +vlapic_destroy(v);
>  
>  hvm_vcpu_cacheattr_destroy(v);
>  }
> @@ -1711,7 +1709,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
> long gla,
>   * - newer Windows (like Server 2012) for HPET accesses.
>   */
>  if ( !nestedhvm_vcpu_in_guestmode(curr)
> - && is_hvm_domain(currd)
>   && hvm_mmio_internal(gpa) )

Can this be moved to the previous line?

Roger.



Re: [Xen-devel] x86: PIE support and option to extend KASLR randomization

2017-07-19 Thread Christopher Lameter
On Tue, 18 Jul 2017, Thomas Garnier wrote:

> Performance/Size impact:
> Hackbench (50% and 1600% loads):
>  - PIE enabled: 7% to 8% on half load, 10% on heavy load.
> slab_test (average of 10 runs):
>  - PIE enabled: 3% to 4%
> Kernbench (average of 10 Half and Optimal runs):
>  - PIE enabled: 5% to 6%
>
> Size of vmlinux (Ubuntu configuration):
>  File size:
>  - PIE disabled: 472928672 bytes (-0.000169% from baseline)
>  - PIE enabled: 216878461 bytes (-54.14% from baseline)

Maybe we need something like CONFIG_PARANOIA so that we can determine at
build time how much performance we want to sacrifice for security?

It's going to be difficult to understand what all these hardening config
options do.




Re: [Xen-devel] [PATCH] xen-blkfront: fix mq start/stop race

2017-07-19 Thread Konrad Rzeszutek Wilk
On Wed, Jul 19, 2017 at 03:51:48PM +0800, Junxiao Bi wrote:
> Hi Konrad,
> 
> On 07/19/2017 03:37 PM, Roger Pau Monné wrote:
> > On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote:
> >> Hi Roger,
> >>
> >> On 06/23/2017 08:57 PM, Roger Pau Monné wrote:
> >>> On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote:
> >>>> When ring buf full, hw queue will be stopped. While blkif interrupt
> >>>> consumes request and make free space in ring buf, hw queue will be
> >>>> started again. But since start queue is protected by spin lock while
> >>>> stop not, that will cause a race.
> >>>>
> >>>> interrupt:                                  process:
> >>>> blkif_interrupt()                           blkif_queue_rq()
> >>>>  kick_pending_request_queues_locked()
> >>>>   blk_mq_start_stopped_hw_queues()
> >>>>    clear_bit(BLK_MQ_S_STOPPED, &hctx->state)
> >>>>                                             blk_mq_stop_hw_queue(hctx)
> >>>>    blk_mq_run_hw_queue(hctx, async)
> >>>>
> >>>> If ring buf is made empty in this case, interrupt will never come, then
> >>>> the hw queue will be stopped forever, all processes waiting for the
> >>>> pending io in the queue will hang.
> >>>>
> >>>> Signed-off-by: Junxiao Bi
> >>>> Reviewed-by: Ankur Arora
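
To make the interleaving in the diagram easier to follow, here is a
single-threaded simulation of it. It is only a model: the demo_* names are
invented, the "stopped" flag stands in for BLK_MQ_S_STOPPED, and the steps
are called by hand in the order the diagram shows rather than from real
interrupt and process contexts.

```c
#include <stdbool.h>

struct demo_hctx {
    bool stopped;    /* stands in for BLK_MQ_S_STOPPED */
    int dispatched;  /* how many times the queue actually ran */
};

/* Process side: ring is full, stop the queue. */
void demo_stop_queue(struct demo_hctx *h)
{
    h->stopped = true;
}

/* Running a stopped queue does nothing. */
void demo_run_queue(struct demo_hctx *h)
{
    if (!h->stopped)
        h->dispatched++;
}

/* Interrupt side, split in two so that a stop can land in between. */
void demo_start_clear(struct demo_hctx *h)
{
    h->stopped = false;
}

void demo_start_run(struct demo_hctx *h)
{
    demo_run_queue(h);
}

/* The order from the diagram: the stop lands between clear and run. */
bool demo_racy_order_stalls(void)
{
    struct demo_hctx h = { .stopped = true, .dispatched = 0 };

    demo_start_clear(&h);  /* interrupt: clear the stopped flag */
    demo_stop_queue(&h);   /* process: stops the queue again    */
    demo_start_run(&h);    /* interrupt: run is a no-op now     */

    return h.stopped && h.dispatched == 0;
}

/* With stop and start serialized, the queue ends up running. */
bool demo_serialized_order_runs(void)
{
    struct demo_hctx h = { .stopped = false, .dispatched = 0 };

    demo_stop_queue(&h);
    demo_start_clear(&h);
    demo_start_run(&h);

    return !h.stopped && h.dispatched == 1;
}
```

In the racy order the queue is left stopped with nothing dispatched, and if
the ring is already empty no further interrupt arrives to restart it.
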
> >>>
> >>> Acked-by: Roger Pau Monné 
> >> Looks patch not in mainline. Can you please help merge it?
> > 
> > I'm afraid this needs to be done by Konrad or one of the Linux
> > maintainers, I don't have an account on kernel.org in order to send
> > pull requests to Jens.
> Can you pls help merge it?

Could you kindly repost it with the updated tags _and_ against Linus's latest
branch?

I get:
[konrad@char linux]$ git am -s < /tmp/a
Applying: xen-blkfront: fix mq start/stop race
error: patch failed: drivers/block/xen-blkfront.c:912
error: drivers/block/xen-blkfront.c: patch does not apply
Patch failed at 0001 xen-blkfront: fix mq start/stop race
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


> 
> Thanks,
> Junxiao.
> > 
> > Roger.
> > 
> 



Re: [Xen-devel] [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB

2017-07-19 Thread Baoquan He
On 07/19/17 at 08:10pm, Baoquan He wrote:
> On 07/18/17 at 03:33pm, Thomas Garnier wrote:
> 
> >  quiet_cmd_relocs = RELOCS  $@
> >cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
> >  $(obj)/vmlinux.relocs: vmlinux FORCE
> > diff --git a/arch/x86/boot/compressed/misc.c 
> > b/arch/x86/boot/compressed/misc.c
> > index a0838ab929f2..0a0c80ab1842 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -170,10 +170,18 @@ void __puthex(unsigned long value)
> >  }
> >  
> >  #if CONFIG_X86_NEED_RELOCS
> > +
> > +/* Large randomization go lower than -2G and use large relocation table */
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +typedef long rel_t;
> > +#else
> > +typedef int rel_t;
> > +#endif
> > +
> >  static void handle_relocations(void *output, unsigned long output_len,
> >unsigned long virt_addr)
> >  {
> > -   int *reloc;
> > +   rel_t *reloc;
> > unsigned long delta, map, ptr;
> > unsigned long min_addr = (unsigned long)output;
> > unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
> > diff --git a/arch/x86/include/asm/page_64_types.h 
> > b/arch/x86/include/asm/page_64_types.h
> > index 3f5f08b010d0..6b65f846dd64 100644
> > --- a/arch/x86/include/asm/page_64_types.h
> > +++ b/arch/x86/include/asm/page_64_types.h
> > @@ -48,7 +48,11 @@
> >  #define __PAGE_OFFSET   __PAGE_OFFSET_BASE
> >  #endif /* CONFIG_RANDOMIZE_MEMORY */
> >  
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +#define __START_KERNEL_map _AC(0x, UL)
> > +#else
> >  #define __START_KERNEL_map _AC(0x8000, UL)
> > +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
> >  
> >  /* See Documentation/x86/x86_64/mm.txt for a description of the memory 
> > map. */
> >  #ifdef CONFIG_X86_5LEVEL
> > @@ -65,9 +69,14 @@
> >   * 512MiB by default, leaving 1.5GiB for modules once the page tables
> >   * are fully set up. If kernel ASLR is configured, it can extend the
> >   * kernel page table mapping, reducing the size of the modules area.
> > + * On PIE, we relocate the binary 2G lower so add this extra space.
> >   */
> >  #if defined(CONFIG_RANDOMIZE_BASE)
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +#define KERNEL_IMAGE_SIZE  (_AC(3, UL) * 1024 * 1024 * 1024)
> > +#else
> >  #define KERNEL_IMAGE_SIZE  (1024 * 1024 * 1024)
> > +#endif
> >  #else
> >  #define KERNEL_IMAGE_SIZE  (512 * 1024 * 1024)
> >  #endif
> > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > index 4103e90ff128..235c3f7b46c7 100644
> > --- a/arch/x86/kernel/head64.c
> > +++ b/arch/x86/kernel/head64.c
> > @@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
> >  pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | 
> > _PAGE_NX);
> >  
> >  #define __head __section(.head.text)
> > +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> 
> > PUD_SHIFT)
> >  
> >  static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
> >  {
> > @@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - 
> > __START_KERNEL_map);
> >  void __head notrace __startup_64(unsigned long physaddr)
> >  {
> > unsigned long load_delta, *p;
> > +   unsigned long level3_kernel_start, level3_kernel_count;
> > +   unsigned long level3_fixmap_start;
> > pgdval_t *pgd;
> > p4dval_t *p4d;
> > pudval_t *pud;
> > @@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
> > if (load_delta & ~PMD_PAGE_MASK)
> > for (;;);
> >  
> > +   /* Look at the randomization spread to adapt page table used */
> > +   level3_kernel_start = pud_index(__START_KERNEL_map);
> > +   level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
> > +   level3_fixmap_start = level3_kernel_start + level3_kernel_count;
> > +
> > /* Fixup the physical addresses in the page table */
> >  
> > pgd = fixup_pointer(_top_pgt, physaddr);
> > @@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
> > }
> >  
> > pud = fixup_pointer(_kernel_pgt, physaddr);
> > -   pud[510] += load_delta;
> > -   pud[511] += load_delta;
> > +   for (i = 0; i < level3_kernel_count; i++)
> > +   pud[level3_kernel_start + i] += load_delta;
> > +   pud[level3_fixmap_start] += load_delta;
> >  
> > pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
> > pmd[506] += load_delta;
> > @@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
> >  */
> >  
> > pmd = fixup_pointer(level2_kernel_pgt, physaddr);
> > -   for (i = 0; i < PTRS_PER_PMD; i++) {
> > +   for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
> > if (pmd[i] & _PAGE_PRESENT)
> > pmd[i] += load_delta;
> 
> Wow, this is dangerous. Three pud entries of level3_kernel_pgt all point
> to level2_kernel_pgt, so this loop runs out of bounds of level2_kernel_pgt
> and overwrites the data that follows it.
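
The arithmetic behind that concern, as a standalone sketch. The DEMO_
constants assume the usual x86-64 4-level layout (1GiB per PUD entry, 512
entries per table), and 0xffffffff80000000 is used as the non-large
__START_KERNEL_map because it matches the pud[510] index in the diff; the
demo_* helpers are written for this example only.

```c
#include <stdint.h>

#define DEMO_PUD_SHIFT     30
#define DEMO_PUD_SIZE      (UINT64_C(1) << DEMO_PUD_SHIFT)
#define DEMO_PTRS_PER_PUD  512
#define DEMO_PTRS_PER_PMD  512

/* Same rounding as the pud_count() macro in the patch. */
uint64_t demo_pud_count(uint64_t size)
{
    return ((size + (DEMO_PUD_SIZE - 1)) & ~(DEMO_PUD_SIZE - 1))
           >> DEMO_PUD_SHIFT;
}

/* Index of the PUD entry covering addr. */
unsigned demo_pud_index(uint64_t addr)
{
    return (unsigned)((addr >> DEMO_PUD_SHIFT) & (DEMO_PTRS_PER_PUD - 1));
}

/* Number of PMD entries the fixup loop walks for a given image size. */
uint64_t demo_pmd_fixup_entries(uint64_t image_size)
{
    return DEMO_PTRS_PER_PMD * demo_pud_count(image_size);
}
```

With KERNEL_IMAGE_SIZE at 3GiB, demo_pud_count() gives 3, so the loop bound
becomes 3 * 512 = 1536 entries, while a single page of level2_kernel_pgt
holds 512.
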
> 
> And if only use one page for level2_kernel_pgt, and 

Re: [Xen-devel] [OSSTEST PATCH v12 10/21] ts-openstack-deploy: Increase open fd limit for RabbitMQ

2017-07-19 Thread Ian Jackson
Anthony PERARD writes ("Re: [OSSTEST PATCH v12 10/21] ts-openstack-deploy: 
Increase open fd limit for RabbitMQ"):
> I've created a bug report for openstack.
> https://bugs.launchpad.net/devstack/+bug/1703651

Thanks,
Ian.



Re: [Xen-devel] [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation

2017-07-19 Thread Wei Liu
On Wed, Jul 19, 2017 at 02:27:31PM +0100, Andrew Cooper wrote:
> These functions don't need is_hvm_{vcpu,domain}() predicates.
> 
> hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
> caller accessing the hvm union, but swap the copy_from_guest() and
> is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
> not going to use it.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Wei Liu 



Re: [Xen-devel] [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation

2017-07-19 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 19 July 2017 14:28
> To: Xen-devel 
> Cc: Andrew Cooper ; George Dunlap
> ; Jan Beulich ; Wei Liu
> ; Paul Durrant ; Roger Pau
> Monne 
> Subject: [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation
> 
> These functions don't need is_hvm_{vcpu,domain}() predicates.
> 
> hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
> caller accessing the hvm union, but swap the copy_from_guest() and
> is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
> not going to use it.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Paul Durrant 

> ---
> CC: George Dunlap 
> CC: Jan Beulich 
> CC: Wei Liu 
> CC: Paul Durrant 
> CC: Roger Pau Monné 
> ---
>  xen/arch/x86/hvm/hvm.c | 15 ++-
>  1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 8145385..4fef616 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -506,8 +506,7 @@ void hvm_do_resume(struct vcpu *v)
>  {
>  check_wakeup_from_wait();
> 
> -if ( is_hvm_domain(v->domain) )
> -pt_restore_timer(v);
> +pt_restore_timer(v);
> 
>  if ( !handle_hvm_io_completion(v) )
>  return;
> @@ -1544,8 +1543,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
>  tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
>  hvm_funcs.vcpu_destroy(v);
> 
> -if ( is_hvm_vcpu(v) )
> -vlapic_destroy(v);
> +vlapic_destroy(v);
> 
>  hvm_vcpu_cacheattr_destroy(v);
>  }
> @@ -1711,7 +1709,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
> unsigned long gla,
>   * - newer Windows (like Server 2012) for HPET accesses.
>   */
>  if ( !nestedhvm_vcpu_in_guestmode(curr)
> - && is_hvm_domain(currd)
>   && hvm_mmio_internal(gpa) )
>  {
>  if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
> @@ -3139,7 +3136,7 @@ static enum hvm_copy_result __hvm_copy(
>   * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
>   * - newer Windows (like Server 2012) for HPET accesses.
>   */
> -if ( v == current && is_hvm_vcpu(v)
> +if ( v == current
>   && !nestedhvm_vcpu_in_guestmode(v)
>   && hvm_mmio_internal(gpa) )
>  return HVMCOPY_bad_gfn_to_mfn;
> @@ -3971,12 +3968,12 @@ static int hvmop_set_evtchn_upcall_vector(
>  struct domain *d = current->domain;
>  struct vcpu *v;
> 
> -if ( copy_from_guest(&op, uop, 1) )
> -return -EFAULT;
> -
>  if ( !is_hvm_domain(d) )
>  return -EINVAL;
> 
> +if ( copy_from_guest(&op, uop, 1) )
> +return -EFAULT;
> +
>  if ( op.vector < 0x10 )
>  return -EINVAL;
> 
> --
> 2.1.4



Re: [Xen-devel] [PATCH 2/6] x86/vpmu: Use vmx_{clear, set}_msr_intercept() rather than opencoding them

2017-07-19 Thread Boris Ostrovsky
On 07/19/2017 07:57 AM, Andrew Cooper wrote:
> No functional change.
>
> Signed-off-by: Andrew Cooper 

Reviewed-by: Boris Ostrovsky 




[Xen-devel] [PATCH] x86/hvm: Drop more remains of the PVHv1 implementation

2017-07-19 Thread Andrew Cooper
These functions don't need is_hvm_{vcpu,domain}() predicates.

hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
caller accessing the hvm union, but swap the copy_from_guest() and
is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
not going to use it.

Signed-off-by: Andrew Cooper 
---
CC: George Dunlap 
CC: Jan Beulich 
CC: Wei Liu 
CC: Paul Durrant 
CC: Roger Pau Monné 
---
 xen/arch/x86/hvm/hvm.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8145385..4fef616 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -506,8 +506,7 @@ void hvm_do_resume(struct vcpu *v)
 {
 check_wakeup_from_wait();
 
-if ( is_hvm_domain(v->domain) )
-pt_restore_timer(v);
+pt_restore_timer(v);
 
 if ( !handle_hvm_io_completion(v) )
 return;
@@ -1544,8 +1543,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
 hvm_funcs.vcpu_destroy(v);
 
-if ( is_hvm_vcpu(v) )
-vlapic_destroy(v);
+vlapic_destroy(v);
 
 hvm_vcpu_cacheattr_destroy(v);
 }
@@ -1711,7 +1709,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
  * - newer Windows (like Server 2012) for HPET accesses.
  */
 if ( !nestedhvm_vcpu_in_guestmode(curr)
- && is_hvm_domain(currd)
  && hvm_mmio_internal(gpa) )
 {
 if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
@@ -3139,7 +3136,7 @@ static enum hvm_copy_result __hvm_copy(
  * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
  * - newer Windows (like Server 2012) for HPET accesses.
  */
-if ( v == current && is_hvm_vcpu(v)
+if ( v == current
  && !nestedhvm_vcpu_in_guestmode(v)
  && hvm_mmio_internal(gpa) )
 return HVMCOPY_bad_gfn_to_mfn;
@@ -3971,12 +3968,12 @@ static int hvmop_set_evtchn_upcall_vector(
 struct domain *d = current->domain;
 struct vcpu *v;
 
-if ( copy_from_guest(&op, uop, 1) )
-return -EFAULT;
-
 if ( !is_hvm_domain(d) )
 return -EINVAL;
 
+if ( copy_from_guest(&op, uop, 1) )
+return -EFAULT;
+
 if ( op.vector < 0x10 )
 return -EINVAL;
 
-- 
2.1.4
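
The reordering in hvmop_set_evtchn_upcall_vector() can be illustrated outside
of Xen. The following is a minimal user-space sketch, assuming hypothetical
stand-ins (struct fake_domain, struct fake_op, fake_copy_from_guest() and
set_upcall_vector()) that are not Xen APIs. It only shows the ordering: reject
the wrong kind of caller first, and read the caller-supplied parameter only
when it is going to be used.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; none of these are Xen APIs. */
struct fake_domain { bool is_hvm; };
struct fake_op { unsigned int vcpu; unsigned int vector; };

static int copies_performed; /* counts reads of the caller-supplied buffer */

/* Mimics the copy_from_guest() convention: returns nonzero on failure. */
static int fake_copy_from_guest(struct fake_op *dst, const struct fake_op *src)
{
    copies_performed++;
    if (src == NULL)
        return 1;
    memcpy(dst, src, sizeof(*dst));
    return 0;
}

/* Check the domain type first, and only then read the parameter. */
static int set_upcall_vector(const struct fake_domain *d,
                             const struct fake_op *uop)
{
    struct fake_op op;

    if (!d->is_hvm)
        return -EINVAL;

    if (fake_copy_from_guest(&op, uop))
        return -EFAULT;

    if (op.vector < 0x10)
        return -EINVAL;

    return 0;
}
```

With this ordering a non-HVM caller is rejected without the parameter buffer
ever being touched, which is the behaviour the commit message describes.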




Re: [Xen-devel] [OSSTEST PATCH v12 10/21] ts-openstack-deploy: Increase open fd limit for RabbitMQ

2017-07-19 Thread Anthony PERARD
On Wed, Jul 19, 2017 at 02:05:44PM +0100, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [OSSTEST PATCH v12 10/21] ts-openstack-deploy: 
> Increase open fd limit for RabbitMQ"):
> > On Wed, Jul 19, 2017 at 11:28:29AM +0100, Ian Jackson wrote:
> > > Anthony PERARD writes ("[OSSTEST PATCH v12 10/21] ts-openstack-deploy: 
> > > Increase open fd limit for RabbitMQ"):
> > > > +target_putfilecontents_root_stash($ho, 100,
> > > > +< > > > +ulimit -n 65536
> > > 
> > > Is the lack of this not an upstream bug of some kind ?
> > 
> > I don't know.
> 
> OK, then.  I think it probably is.  Feel free to try to convince me
> otherwise...
> 
> > FYI, when rabbitmq is installed on Debian, we have:
> > cat /etc/default/rabbitmq-server
> 
> > # This file is sourced by /etc/init.d/rabbitmq-server. Its primary
> > # reason for existing is to allow adjustment of system limits for the
> > # rabbitmq-server process.
> > #
> > # Maximum number of open file handles. This will need to be increased
> > # to handle many simultaneous connections. Refer to the system
> > # documentation for ulimit (in man bash) for more information.
> > #
> > #ulimit -n 1024
> 
> That's rather mysterious.
> 
> > > And, for osstest, why 65536 and not, say, "unlimited" ?
> > 
> > I've just reproduced the number from the OpenStack CI loop. It is the one
> > found in rabbitmq-server.service, which I think comes from the Ubuntu
> > package of rabbitmq.
> 
> None of this seems to explain why this isn't a configuration which
> should be supplied or arranged by upstream.
> 
> (I noticed when looking at my previous reviews that I made a similar
> point last time I saw this hunk...)

But the hunk is different. I've created a bug report for OpenStack:
https://bugs.launchpad.net/devstack/+bug/1703651

-- 
Anthony PERARD



Re: [Xen-devel] [PATCH] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Juergen Gross
On 19/07/17 14:58, Greg KH wrote:
> From: Greg Kroah-Hartman 
> 
> It's better to be explicit and use the DRIVER_ATTR_RW() and
> DRIVER_ATTR_RO() macros when defining a driver's sysfs file.
> 
> Bonus is this fixes up a checkpatch.pl warning.
> 
> This is part of a series to drop DRIVER_ATTR() from the tree entirely.
> 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  drivers/xen/xen-pciback/pci_stub.c |   44 
> -
>  1 file changed, 20 insertions(+), 24 deletions(-)
> 
> 
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -1172,8 +1172,8 @@ out:
>   return err;
>  }
>  
> -static ssize_t pcistub_slot_add(struct device_driver *drv, const char *buf,
> - size_t count)
> +static ssize_t new_slot_store(struct device_driver *drv, const char *buf,
> +   size_t count)
>  {
>   int domain, bus, slot, func;
>   int err;
> @@ -1189,10 +1189,10 @@ out:
>   err = count;
>   return err;
>  }
> -static DRIVER_ATTR(new_slot, S_IWUSR, NULL, pcistub_slot_add);
> +static DRIVER_ATTR_WO(new_slot);
>  
> -static ssize_t pcistub_slot_remove(struct device_driver *drv, const char 
> *buf,
> -size_t count)
> +static ssize_t remove_slot_store(struct device_driver *drv, const char *buf,
> +  size_t count)
>  {
>   int domain, bus, slot, func;
>   int err;
> @@ -1208,9 +1208,9 @@ out:
>   err = count;
>   return err;
>  }
> -static DRIVER_ATTR(remove_slot, S_IWUSR, NULL, pcistub_slot_remove);
> +static DRIVER_ATTR_WO(remove_slot);
>  
> -static ssize_t pcistub_slot_show(struct device_driver *drv, char *buf)
> +static ssize_t slots_show(struct device_driver *drv, char *buf)
>  {
>   struct pcistub_device_id *pci_dev_id;
>   size_t count = 0;
> @@ -1231,9 +1231,9 @@ static ssize_t pcistub_slot_show(struct
>  
>   return count;
>  }
> -static DRIVER_ATTR(slots, S_IRUSR, pcistub_slot_show, NULL);
> +static DRIVER_ATTR_RO(slots);
>  
> -static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
> +static ssize_t irq_handlers_show(struct device_driver *drv, char *buf)
>  {
>   struct pcistub_device *psdev;
>   struct xen_pcibk_dev_data *dev_data;
> @@ -1260,11 +1260,10 @@ static ssize_t pcistub_irq_handler_show(
>   spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>   return count;
>  }
> -static DRIVER_ATTR(irq_handlers, S_IRUSR, pcistub_irq_handler_show, NULL);
> +static DRIVER_ATTR_RO(irq_handlers);
>  
> -static ssize_t pcistub_irq_handler_switch(struct device_driver *drv,
> -   const char *buf,
> -   size_t count)
> +static ssize_t irq_handler_state_store(struct device_driver *drv,
> +const char *buf, size_t count)
>  {
>   struct pcistub_device *psdev;
>   struct xen_pcibk_dev_data *dev_data;
> @@ -1301,11 +1300,10 @@ out:
>   err = count;
>   return err;
>  }
> -static DRIVER_ATTR(irq_handler_state, S_IWUSR, NULL,
> -pcistub_irq_handler_switch);
> +static DRIVER_ATTR_WO(irq_handler_state);
>  
> -static ssize_t pcistub_quirk_add(struct device_driver *drv, const char *buf,
> -  size_t count)
> +static ssize_t quirks_add(struct device_driver *drv, const char *buf,
> +   size_t count)

Shouldn't this be named quirks_store()?


Juergen
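
The naming point can be shown with a reduced user-space mock of the macros.
This is only a sketch: the struct layouts below are simplified stand-ins, not
the kernel's definitions, and `long` is used in place of ssize_t to keep the
example portable. What it does mirror is the token pasting, which ties an
attribute called "quirks" to functions named quirks_show and quirks_store.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for illustration; not the kernel's real types. */
struct device_driver { const char *name; };

struct driver_attribute {
    const char *name;
    long (*show)(struct device_driver *drv, char *buf);
    long (*store)(struct device_driver *drv, const char *buf, size_t count);
};

/* Token pasting ties attribute "foo" to functions foo_show / foo_store. */
#define DRIVER_ATTR_RO(_name) \
    struct driver_attribute driver_attr_##_name = \
        { #_name, _name##_show, NULL }
#define DRIVER_ATTR_WO(_name) \
    struct driver_attribute driver_attr_##_name = \
        { #_name, NULL, _name##_store }
#define DRIVER_ATTR_RW(_name) \
    struct driver_attribute driver_attr_##_name = \
        { #_name, _name##_show, _name##_store }

static long quirks_show(struct device_driver *drv, char *buf)
{
    (void)drv;
    strcpy(buf, "none\n");
    return (long)strlen(buf);
}

/* Must be called quirks_store: DRIVER_ATTR_RW(quirks) references that name. */
static long quirks_store(struct device_driver *drv, const char *buf,
                         size_t count)
{
    (void)drv;
    (void)buf;
    return (long)count;
}

static DRIVER_ATTR_RW(quirks);
```

Renaming quirks_store() to quirks_add() in this sketch makes the last line
fail to compile, because the macro expands to a reference to quirks_store.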



Re: [Xen-devel] [PATCH v2] xenconsole: Add pipe option

2017-07-19 Thread Ian Jackson
Felix Schmoll writes ("Re: [PATCH v2] xenconsole: Add pipe option"):
> As there is already an interactive variable in the code, it seems
> like a rather strange overloading to call the option interactive
> that directly affects a different variable (currently pipe). The
> name seems to make sense however,

Right, I think the UI should be driven by the needs of the human
who'll use it, not by the variable names in the code.

> so I propose to simplify the code
> by removing the isatty-check from line 349 and moving it to line
> 472, resulting in the following:
> 
> 472 if (isatty(STDIN_FILENO) && isatty(STDOUT_FILENO)) {
> 473 interactive = 1;
> 474 init_term(STDIN_FILENO, &stdin_old_attr);
> 475 atexit(restore_term_stdin); /* if this fails, oh dear */ 
> 476 }
> 
> Then the interactive-variable is free for my purposes, so there is no need to
> introduce a new variable at all.
> 
> Or is there anything that requires the check to be at the top?

I doubt it.  Doing it after the option parsing loop would be much more
conventional.

> As the new commit message I suggest:
> 
> Add option to xenconsole to always forward console input
> 
> Currently the default behaviour of the xenconsole client is to
> ignore any input to stdin, unless stdin and stdout are both
> ttys. The new option allows this to be manually overridden, causing the
> client to forward input regardless.

SGTM.

Thanks,
Ian.



Re: [Xen-devel] [OSSTEST PATCH v12 10/21] ts-openstack-deploy: Increase open fd limit for RabbitMQ

2017-07-19 Thread Ian Jackson
Anthony PERARD writes ("Re: [OSSTEST PATCH v12 10/21] ts-openstack-deploy: 
Increase open fd limit for RabbitMQ"):
> On Wed, Jul 19, 2017 at 11:28:29AM +0100, Ian Jackson wrote:
> > Anthony PERARD writes ("[OSSTEST PATCH v12 10/21] ts-openstack-deploy: 
> > Increase open fd limit for RabbitMQ"):
> > > +target_putfilecontents_root_stash($ho, 100,
> > > +< > > +ulimit -n 65536
> > 
> > Is the lack of this not an upstream bug of some kind ?
> 
> I don't know.

OK, then.  I think it probably is.  Feel free to try to convince me
otherwise...

> FYI, when rabbitmq is installed on Debian, we have:
> cat /etc/default/rabbitmq-server

> # This file is sourced by /etc/init.d/rabbitmq-server. Its primary
> # reason for existing is to allow adjustment of system limits for the
> # rabbitmq-server process.
> #
> # Maximum number of open file handles. This will need to be increased
> # to handle many simultaneous connections. Refer to the system
> # documentation for ulimit (in man bash) for more information.
> #
> #ulimit -n 1024

That's rather mysterious.

> > And, for osstest, why 65536 and not, say, "unlimited" ?
> 
> I've just reproduced the number from the OpenStack CI loop. It is the one
> found in rabbitmq-server.service, which I think comes from the Ubuntu
> package of rabbitmq.

None of this seems to explain why this isn't a configuration which
should be supplied or arranged by upstream.

(I noticed when looking at my previous reviews that I made a similar
point last time I saw this hunk...)

Thanks,
Ian.



[Xen-devel] [PATCH] xen: xen-pciback: remove DRIVER_ATTR() usage

2017-07-19 Thread Greg KH
From: Greg Kroah-Hartman 

It's better to be explicit and use the DRIVER_ATTR_RW() and
DRIVER_ATTR_RO() macros when defining a driver's sysfs file.

Bonus is this fixes up a checkpatch.pl warning.

This is part of a series to drop DRIVER_ATTR() from the tree entirely.

Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/xen/xen-pciback/pci_stub.c |   44 -
 1 file changed, 20 insertions(+), 24 deletions(-)


--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -1172,8 +1172,8 @@ out:
return err;
 }
 
-static ssize_t pcistub_slot_add(struct device_driver *drv, const char *buf,
-   size_t count)
+static ssize_t new_slot_store(struct device_driver *drv, const char *buf,
+ size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1189,10 +1189,10 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(new_slot, S_IWUSR, NULL, pcistub_slot_add);
+static DRIVER_ATTR_WO(new_slot);
 
-static ssize_t pcistub_slot_remove(struct device_driver *drv, const char *buf,
-  size_t count)
+static ssize_t remove_slot_store(struct device_driver *drv, const char *buf,
+size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1208,9 +1208,9 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(remove_slot, S_IWUSR, NULL, pcistub_slot_remove);
+static DRIVER_ATTR_WO(remove_slot);
 
-static ssize_t pcistub_slot_show(struct device_driver *drv, char *buf)
+static ssize_t slots_show(struct device_driver *drv, char *buf)
 {
struct pcistub_device_id *pci_dev_id;
size_t count = 0;
@@ -1231,9 +1231,9 @@ static ssize_t pcistub_slot_show(struct
 
return count;
 }
-static DRIVER_ATTR(slots, S_IRUSR, pcistub_slot_show, NULL);
+static DRIVER_ATTR_RO(slots);
 
-static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
+static ssize_t irq_handlers_show(struct device_driver *drv, char *buf)
 {
struct pcistub_device *psdev;
struct xen_pcibk_dev_data *dev_data;
@@ -1260,11 +1260,10 @@ static ssize_t pcistub_irq_handler_show(
spin_unlock_irqrestore(&pcistub_devices_lock, flags);
return count;
 }
-static DRIVER_ATTR(irq_handlers, S_IRUSR, pcistub_irq_handler_show, NULL);
+static DRIVER_ATTR_RO(irq_handlers);
 
-static ssize_t pcistub_irq_handler_switch(struct device_driver *drv,
- const char *buf,
- size_t count)
+static ssize_t irq_handler_state_store(struct device_driver *drv,
+  const char *buf, size_t count)
 {
struct pcistub_device *psdev;
struct xen_pcibk_dev_data *dev_data;
@@ -1301,11 +1300,10 @@ out:
err = count;
return err;
 }
-static DRIVER_ATTR(irq_handler_state, S_IWUSR, NULL,
-  pcistub_irq_handler_switch);
+static DRIVER_ATTR_WO(irq_handler_state);
 
-static ssize_t pcistub_quirk_add(struct device_driver *drv, const char *buf,
-size_t count)
+static ssize_t quirks_add(struct device_driver *drv, const char *buf,
+ size_t count)
 {
int domain, bus, slot, func, reg, size, mask;
int err;
@@ -1323,7 +1321,7 @@ out:
return err;
 }
 
-static ssize_t pcistub_quirk_show(struct device_driver *drv, char *buf)
+static ssize_t quirks_show(struct device_driver *drv, char *buf)
 {
int count = 0;
unsigned long flags;
@@ -1366,11 +1364,10 @@ out:
 
return count;
 }
-static DRIVER_ATTR(quirks, S_IRUSR | S_IWUSR, pcistub_quirk_show,
-  pcistub_quirk_add);
+static DRIVER_ATTR_RW(quirks);
 
-static ssize_t permissive_add(struct device_driver *drv, const char *buf,
- size_t count)
+static ssize_t permissive_store(struct device_driver *drv, const char *buf,
+   size_t count)
 {
int domain, bus, slot, func;
int err;
@@ -1431,8 +1428,7 @@ static ssize_t permissive_show(struct de
spin_unlock_irqrestore(&pcistub_devices_lock, flags);
return count;
 }
-static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show,
-  permissive_add);
+static DRIVER_ATTR_RW(permissive);
 
 static void pcistub_exit(void)
 {



Re: [Xen-devel] [OSSTEST PATCH v12 20/21] Create a flight to test OpenStack with xen-unstable and libvirt

2017-07-19 Thread Ian Jackson
Anthony PERARD writes ("[OSSTEST PATCH v12 20/21] Create a flight to test 
OpenStack with xen-unstable and libvirt"):
> This patch creates a flight "openstack-ocata", with those jobs:
...

I think it would help if you split apart the changes to make-flight
and mfi-* from the ones to cr-daily-* and ap-*.

> OpenStack has many different repos which should be kept in sync, so we should
> attempt to grab the revisions of the stable branch of every OpenStack
> tree. For now, the REVISION_* runvars of trees other than nova are set to
> "origin/stable/ocata", except for Tempest, which does not have a stable
> branch and should be able to test any OpenStack version.

Do you intend to provide a version of this patch which maintains a
tested branch for all of these different trees ?

And now some details:

> +openstack-ocata)
> +os_release="${branch##*-}"

Can you please call this variable openstack_release ?


I think for now we don't have the capacity to add openstack testing on
ARM.  Can you please arrange to suppress those, in a way that means we
can add them in later ?  Best would be a separate commit which we can
later revert.

> +$os_runvars \

Again, please use "openstack_..." not "os_...".  "os" means "operating
system" to me.

Thanks,
Ian.



Re: [Xen-devel] [PATCH RFC] tools: Drop xc_cpuid_check() and bindings

2017-07-19 Thread Boris Ostrovsky
On 07/19/2017 06:43 AM, Juergen Gross wrote:
> On 19/07/17 12:32, Wei Liu wrote:
>> On Mon, Jul 17, 2017 at 01:38:03PM +0100, Andrew Cooper wrote:
>>> There are no current users which I can locate.  One piece of xend which 
>>> didn't
>>> move forwards into xl/libxl is this:
>>>
>>>   #   Configure host CPUID consistency checks, which must be satisfied for 
>>> this
>>>   #   VM to be allowed to run on this host's processor type:
>>>   #cpuid_check=[ '1:ecx=xx1x' ]
>>>   # - Host must have VMX feature flag set
>>>
>>> The implementation of xc_cpuid_check() is conceptually broken.  Dom0's view 
>>> of
>>> CPUID is not the appropriate view to check, and will be wrong in the presence
>>> of CPUID masking/faulting, and for HVM-based toolstack domains.
>>>
>>> If it turns out that the functionality is required, it should be implemented
>>> in terms of XEN_SYSCTL_get_cpuid_policy to use the proper CPUID view.
>>>
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Ian Jackson 
>>> CC: Wei Liu 
>>> CC: Marek Marczykowski-Górecki 
>>> CC: David Scott 
>>> CC: Christian Lindig 
>>> CC: Juergen Gross 
>>> CC: Jim Fehlig 
>>> CC: Boris Ostrovsky 
>>> CC: Konrad Rzeszutek Wilk 
>>>
>>> RFC initially for feedback, and to see if anyone does expect to be using 
>>> this
>>> call.  It turns out that Xapi has a library function using it, but that
>>> function is dead so can be removed.
>> FAOD I am still waiting for Oracle and Suse folks to express their
>> opinions.
> No objection from me.
>


Or from me.

-boris



Re: [Xen-devel] [PATCH 2/6] x86/vpmu: Use vmx_{clear, set}_msr_intercept() rather than opencoding them

2017-07-19 Thread Andrew Cooper
On 19/07/17 12:57, Andrew Cooper wrote:
> No functional change.
>
> Signed-off-by: Andrew Cooper 

I have just realised I can now drop msraddr_to_bitpos(), so have folded
the additional hunk into this patch.

diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index d58eca3..207e2e7 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -225,12 +225,6 @@ static int is_core2_vpmu_msr(u32 msr_index, int *type, int *index)
 }
 }
 
-static inline int msraddr_to_bitpos(int x)
-{
-ASSERT(x == (x & 0x1fff));
-return x;
-}
-
 static void core2_vpmu_set_msr_bitmap(struct vcpu *v)
 {
 unsigned int i;




Re: [Xen-devel] ARM: Adjusting guest memory size through xl mem-{set|max} fails

2017-07-19 Thread Sergej Proskurin
On 07/19/2017 01:57 PM, Wei Liu wrote:
> On Wed, Jul 19, 2017 at 01:52:08PM +0200, Sergej Proskurin wrote:
>> ---
>> root@avocet:~# xl list
>> Name        ID   Mem VCPUs      State   Time(s)
>> Domain-0     0  1024     6     r-----      38.9
>> domu1        1   511     2     -b----       0.3
>> root@avocet:~# xl mem-max 1 550m
>> root@avocet:~# xl mem-set 1 520m
>> libxl: error: libxl_mem.c:272:libxl_set_memory_target: Domain
>> 1:memory_dynamic_max must be less than or equal to memory_static_max
>>
> This is a bit strange. What is the maxmem= in your domain config?
>
> I'm not too sure if you can just use xl mem-max. It's a bit messy in
> that area.

As far as I remember, it was possible before (at least on Xen 4.7 and
4.8). I have not set the maxmem= option in the domain config at all. I
just specify the amount of memory by means of memory=.

>> cannot set domid 1 dynamic max memory to : 520m
>> ---
>>
>> According to the error messages from above, I assume this patch will not
>> fix the issues on ARMv7 yet, right?
>>
> The error you saw on ARMv7 is different from the one above afaict. Not
> sure if my patch would fix ARMv7. I'm not too familiar with the inner
> working of ARM guests.

Alright. Anyway, I will try your patch also on ARMv7; just to be sure.
But I also don't think that it'll fix the issue.

Thanks,
Sergej




Re: [Xen-devel] [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB

2017-07-19 Thread Baoquan He
On 07/18/17 at 03:33pm, Thomas Garnier wrote:

>  quiet_cmd_relocs = RELOCS  $@
>cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
>  $(obj)/vmlinux.relocs: vmlinux FORCE
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index a0838ab929f2..0a0c80ab1842 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -170,10 +170,18 @@ void __puthex(unsigned long value)
>  }
>  
>  #if CONFIG_X86_NEED_RELOCS
> +
> +/* Large randomization go lower than -2G and use large relocation table */
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +typedef long rel_t;
> +#else
> +typedef int rel_t;
> +#endif
> +
>  static void handle_relocations(void *output, unsigned long output_len,
>  unsigned long virt_addr)
>  {
> - int *reloc;
> + rel_t *reloc;
>   unsigned long delta, map, ptr;
>   unsigned long min_addr = (unsigned long)output;
>   unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
> diff --git a/arch/x86/include/asm/page_64_types.h 
> b/arch/x86/include/asm/page_64_types.h
> index 3f5f08b010d0..6b65f846dd64 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -48,7 +48,11 @@
>  #define __PAGE_OFFSET   __PAGE_OFFSET_BASE
>  #endif /* CONFIG_RANDOMIZE_MEMORY */
>  
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +#define __START_KERNEL_map   _AC(0xffffffff00000000, UL)
> +#else
>  #define __START_KERNEL_map   _AC(0xffffffff80000000, UL)
> +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
>  
>  /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. 
> */
>  #ifdef CONFIG_X86_5LEVEL
> @@ -65,9 +69,14 @@
>   * 512MiB by default, leaving 1.5GiB for modules once the page tables
>   * are fully set up. If kernel ASLR is configured, it can extend the
>   * kernel page table mapping, reducing the size of the modules area.
> + * On PIE, we relocate the binary 2G lower so add this extra space.
>   */
>  #if defined(CONFIG_RANDOMIZE_BASE)
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +#define KERNEL_IMAGE_SIZE(_AC(3, UL) * 1024 * 1024 * 1024)
> +#else
>  #define KERNEL_IMAGE_SIZE(1024 * 1024 * 1024)
> +#endif
>  #else
>  #define KERNEL_IMAGE_SIZE(512 * 1024 * 1024)
>  #endif
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 4103e90ff128..235c3f7b46c7 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
>  pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
>  
>  #define __head   __section(.head.text)
> +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> 
> PUD_SHIFT)
>  
>  static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
>  {
> @@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - 
> __START_KERNEL_map);
>  void __head notrace __startup_64(unsigned long physaddr)
>  {
>   unsigned long load_delta, *p;
> + unsigned long level3_kernel_start, level3_kernel_count;
> + unsigned long level3_fixmap_start;
>   pgdval_t *pgd;
>   p4dval_t *p4d;
>   pudval_t *pud;
> @@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
>   if (load_delta & ~PMD_PAGE_MASK)
>   for (;;);
>  
> + /* Look at the randomization spread to adapt page table used */
> + level3_kernel_start = pud_index(__START_KERNEL_map);
> + level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
> + level3_fixmap_start = level3_kernel_start + level3_kernel_count;
> +
>   /* Fixup the physical addresses in the page table */
>  
>   pgd = fixup_pointer(&early_top_pgt, physaddr);
> @@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
>   }
>  
>   pud = fixup_pointer(&level3_kernel_pgt, physaddr);
> - pud[510] += load_delta;
> - pud[511] += load_delta;
> + for (i = 0; i < level3_kernel_count; i++)
> + pud[level3_kernel_start + i] += load_delta;
> + pud[level3_fixmap_start] += load_delta;
>  
>   pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
>   pmd[506] += load_delta;
> @@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
>*/
>  
>   pmd = fixup_pointer(level2_kernel_pgt, physaddr);
> - for (i = 0; i < PTRS_PER_PMD; i++) {
> + for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
>   if (pmd[i] & _PAGE_PRESENT)
>   pmd[i] += load_delta;

Wow, this is dangerous. Three pud entries of level3_kernel_pgt all point
to level2_kernel_pgt, so this loop runs out of bounds of level2_kernel_pgt
and overwrites the data that follows it.

And if only one page is used for level2_kernel_pgt, and the kernel is
randomized so that it crosses a pud entry boundary within -4G to -1G, it
won't work well.
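
To put numbers on the concern, here is a small user-space sketch of the
arithmetic. It assumes the usual x86-64 4-level paging constants (1GiB per
pud entry, 512 entries per pmd page) and assumes level2_kernel_pgt occupies a
single page, as the comment above implies. pmd_overrun() is a hypothetical
helper added for illustration; pud_count() uses the same rounding as the macro
in the quoted patch.

```c
/* Assumed x86-64 4-level paging constants, restated for a user-space sketch. */
#define PUD_SHIFT     30
#define PUD_SIZE      (1UL << PUD_SHIFT)
#define PTRS_PER_PMD  512UL

/* Same rounding as the pud_count() macro in the quoted patch. */
static unsigned long pud_count(unsigned long size)
{
    return ((size + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT;
}

/* Entries the patched loop touches beyond a one-page (512-entry) pmd table. */
static unsigned long pmd_overrun(unsigned long image_size)
{
    unsigned long touched = PTRS_PER_PMD * pud_count(image_size);

    return touched > PTRS_PER_PMD ? touched - PTRS_PER_PMD : 0;
}
```

With KERNEL_IMAGE_SIZE at 3GiB, pud_count() gives 3, so the loop bound becomes
PTRS_PER_PMD * 3 = 1536 entries, which is 1024 entries past the end of a
512-entry table under these assumptions.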

>   }
> @@ -268,7 +277,8 @@ asmlinkage __visible void __init x86_64_start_kernel(char 
> * real_mode_data)
>*/
>   

[Xen-devel] [xen-unstable-smoke test] 112012: tolerable trouble: broken/pass - PUSHED

2017-07-19 Thread osstest service owner
flight 112012 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112012/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  ab48596654ca20bd45eee4bdc1252188e9beb5a5
baseline version:
 xen  d535d8922f571502252deaf607e82e7475cd1728

Last test of basis   111993  2017-07-18 23:01:14 Z0 days
Testing same since   112012  2017-07-19 10:03:30 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=ab48596654ca20bd45eee4bdc1252188e9beb5a5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
ab48596654ca20bd45eee4bdc1252188e9beb5a5
+ branch=xen-unstable-smoke
+ revision=ab48596654ca20bd45eee4bdc1252188e9beb5a5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' xab48596654ca20bd45eee4bdc1252188e9beb5a5 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : 

Re: [Xen-devel] [PATCH 4.12 26/84] x86/xen/efi: Initialize only the EFI struct members used by Xen

2017-07-19 Thread Daniel Kiper
On Wed, Jul 19, 2017 at 01:19:58PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 19, 2017 at 01:12:14PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Jul 19, 2017 at 12:37:47PM +0200, Daniel Kiper wrote:
> > > Hey Greg,
> > >
> > > On Wed, Jul 19, 2017 at 11:43:32AM +0200, Greg Kroah-Hartman wrote:
> > > > 4.12-stable review patch.  If anyone has any objections, please let me 
> > > > know.
> > >
> > > Why did you skip this patch for 4.11? IMO it should be applied there too.
> >
> > Are you sure it actually applied?  (hint, it did not...)
> >
> > If you want it in 4.11, or older kernels, please provide a working
> > backport.
>
> And, in the future, if you want it to be applied to older kernels, or be
> notified if it can not be, please add a kernel version number in the
> stable marking:
>   Cc: sta...@vger.kernel.org # 4.0+
> or use the Fixes: tag:
>   Fixes: SHASHAHSA ("short description")
> which I pick up on and let you know if the patch does not actually apply
> back to the kernel that the fixes: tag was in.
>
> hope this helps,

Sure thing! Thanks a lot!

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/6] x86/vpmu: Use vmx_{clear, set}_msr_intercept() rather than opencoding them

2017-07-19 Thread Andrew Cooper
No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Kevin Tian 
CC: Boris Ostrovsky 
---
 xen/arch/x86/cpu/vpmu_intel.c | 64 ---
 1 file changed, 23 insertions(+), 41 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index 6d768cb..d58eca3 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -231,68 +231,50 @@ static inline int msraddr_to_bitpos(int x)
     return x;
 }
 
-static void core2_vpmu_set_msr_bitmap(unsigned long *msr_bitmap)
+static void core2_vpmu_set_msr_bitmap(struct vcpu *v)
 {
-    int i;
+    unsigned int i;
 
     /* Allow Read/Write PMU Counters MSR Directly. */
     for ( i = 0; i < fixed_pmc_cnt; i++ )
-    {
-        clear_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR0 + i), msr_bitmap);
-        clear_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR0 + i),
-                  msr_bitmap + 0x800/BYTES_PER_LONG);
-    }
+        vmx_clear_msr_intercept(v, MSR_CORE_PERF_FIXED_CTR0 + i, VMX_MSR_RW);
+
     for ( i = 0; i < arch_pmc_cnt; i++ )
     {
-        clear_bit(msraddr_to_bitpos(MSR_IA32_PERFCTR0+i), msr_bitmap);
-        clear_bit(msraddr_to_bitpos(MSR_IA32_PERFCTR0+i),
-                  msr_bitmap + 0x800/BYTES_PER_LONG);
+        vmx_clear_msr_intercept(v, MSR_IA32_PERFCTR0 + i, VMX_MSR_RW);
 
         if ( full_width_write )
-        {
-            clear_bit(msraddr_to_bitpos(MSR_IA32_A_PERFCTR0 + i), msr_bitmap);
-            clear_bit(msraddr_to_bitpos(MSR_IA32_A_PERFCTR0 + i),
-                      msr_bitmap + 0x800/BYTES_PER_LONG);
-        }
+            vmx_clear_msr_intercept(v, MSR_IA32_A_PERFCTR0 + i, VMX_MSR_RW);
     }
 
     /* Allow Read PMU Non-global Controls Directly. */
     for ( i = 0; i < arch_pmc_cnt; i++ )
-         clear_bit(msraddr_to_bitpos(MSR_P6_EVNTSEL(i)), msr_bitmap);
+        vmx_clear_msr_intercept(v, MSR_P6_EVNTSEL(i), VMX_MSR_R);
 
-    clear_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR_CTRL), msr_bitmap);
-    clear_bit(msraddr_to_bitpos(MSR_IA32_DS_AREA), msr_bitmap);
+    vmx_clear_msr_intercept(v, MSR_CORE_PERF_FIXED_CTR_CTRL, VMX_MSR_R);
+    vmx_clear_msr_intercept(v, MSR_IA32_DS_AREA, VMX_MSR_R);
 }
 
-static void core2_vpmu_unset_msr_bitmap(unsigned long *msr_bitmap)
+static void core2_vpmu_unset_msr_bitmap(struct vcpu *v)
 {
-    int i;
+    unsigned int i;
 
     for ( i = 0; i < fixed_pmc_cnt; i++ )
-    {
-        set_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR0 + i), msr_bitmap);
-        set_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR0 + i),
-                msr_bitmap + 0x800/BYTES_PER_LONG);
-    }
+        vmx_set_msr_intercept(v, MSR_CORE_PERF_FIXED_CTR0 + i, VMX_MSR_RW);
+
     for ( i = 0; i < arch_pmc_cnt; i++ )
     {
-        set_bit(msraddr_to_bitpos(MSR_IA32_PERFCTR0 + i), msr_bitmap);
-        set_bit(msraddr_to_bitpos(MSR_IA32_PERFCTR0 + i),
-                msr_bitmap + 0x800/BYTES_PER_LONG);
+        vmx_set_msr_intercept(v, MSR_IA32_PERFCTR0 + i, VMX_MSR_RW);
 
         if ( full_width_write )
-        {
-            set_bit(msraddr_to_bitpos(MSR_IA32_A_PERFCTR0 + i), msr_bitmap);
-            set_bit(msraddr_to_bitpos(MSR_IA32_A_PERFCTR0 + i),
-                      msr_bitmap + 0x800/BYTES_PER_LONG);
-        }
+            vmx_set_msr_intercept(v, MSR_IA32_A_PERFCTR0 + i, VMX_MSR_RW);
     }
 
     for ( i = 0; i < arch_pmc_cnt; i++ )
-        set_bit(msraddr_to_bitpos(MSR_P6_EVNTSEL(i)), msr_bitmap);
+        vmx_set_msr_intercept(v, MSR_P6_EVNTSEL(i), VMX_MSR_R);
 
-    set_bit(msraddr_to_bitpos(MSR_CORE_PERF_FIXED_CTR_CTRL), msr_bitmap);
-    set_bit(msraddr_to_bitpos(MSR_IA32_DS_AREA), msr_bitmap);
+    vmx_set_msr_intercept(v, MSR_CORE_PERF_FIXED_CTR_CTRL, VMX_MSR_R);
+    vmx_set_msr_intercept(v, MSR_IA32_DS_AREA, VMX_MSR_R);
 }
 
 static inline void __core2_vpmu_save(struct vcpu *v)
@@ -327,7 +309,7 @@ static int core2_vpmu_save(struct vcpu *v, bool_t to_guest)
     /* Unset PMU MSR bitmap to trap lazy load. */
     if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && is_hvm_vcpu(v) &&
          cpu_has_vmx_msr_bitmap )
-        core2_vpmu_unset_msr_bitmap(v->arch.hvm_vmx.msr_bitmap);
+        core2_vpmu_unset_msr_bitmap(v);
 
     if ( to_guest )
     {
@@ -541,9 +523,9 @@ static int core2_vpmu_msr_common_check(u32 msr_index, int *type, int *index)
     {
         __core2_vpmu_load(current);
         vpmu_set(vpmu, VPMU_CONTEXT_LOADED);
-        if ( is_hvm_vcpu(current) &&
-             cpu_has_vmx_msr_bitmap )
-            core2_vpmu_set_msr_bitmap(current->arch.hvm_vmx.msr_bitmap);
+
+        if ( is_hvm_vcpu(current) && cpu_has_vmx_msr_bitmap )
+            core2_vpmu_set_msr_bitmap(current);
     }
     return 1;
 }
@@ -860,7 +842,7 @@ static void core2_vpmu_destroy(struct vcpu *v)
     xfree(vpmu->priv_context);
     vpmu->priv_context 
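
For readers following the patch above: the open-coded lines it removes touch two bits per MSR, one in the read half of the bitmap and one 0x800 bytes further on in the write half. Below is a minimal, self-contained sketch of that layout as the Intel SDM describes it (read-low at 0x000, read-high at 0x400, write-low at 0x800, write-high at 0xc00). All `sketch_*` / `SKETCH_*` names are hypothetical and exist only for illustration; this is not the implementation of vmx_clear_msr_intercept() or vmx_set_msr_intercept().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MSR_BITMAP_BYTES 4096u

/* Hypothetical access-type flags, modelled on the R/W/RW split in the patch. */
enum sketch_msr_access {
    SKETCH_MSR_R  = 1,
    SKETCH_MSR_W  = 2,
    SKETCH_MSR_RW = 3,
};

/*
 * Locate the bit controlling one MSR in a 4 KiB VMX MSR bitmap.
 * Returns 0 on success, -1 if the MSR is outside both covered ranges
 * (0x0-0x1fff and 0xc0000000-0xc0001fff).
 */
static int sketch_msr_bit(uint32_t msr, bool write,
                          size_t *byte, unsigned int *bit)
{
    size_t base;

    if ( msr <= 0x1fffu )
        base = write ? 0x800 : 0x000;
    else if ( msr >= 0xc0000000u && msr <= 0xc0001fffu )
        base = write ? 0xc00 : 0x400;
    else
        return -1;

    msr &= 0x1fffu;
    *byte = base + msr / 8;
    *bit = msr % 8;

    return 0;
}

/* Clear the intercept bit(s): the guest may access the MSR directly. */
static void sketch_clear_msr_intercept(uint8_t *bitmap, uint32_t msr,
                                       unsigned int type)
{
    size_t byte;
    unsigned int bit;

    assert(bitmap != NULL);

    if ( (type & SKETCH_MSR_R) && sketch_msr_bit(msr, false, &byte, &bit) == 0 )
        bitmap[byte] &= (uint8_t)~(1u << bit);
    if ( (type & SKETCH_MSR_W) && sketch_msr_bit(msr, true, &byte, &bit) == 0 )
        bitmap[byte] &= (uint8_t)~(1u << bit);
}

/* Set the intercept bit(s): accesses to the MSR cause a VM exit again. */
static void sketch_set_msr_intercept(uint8_t *bitmap, uint32_t msr,
                                     unsigned int type)
{
    size_t byte;
    unsigned int bit;

    assert(bitmap != NULL);

    if ( (type & SKETCH_MSR_R) && sketch_msr_bit(msr, false, &byte, &bit) == 0 )
        bitmap[byte] |= (uint8_t)(1u << bit);
    if ( (type & SKETCH_MSR_W) && sketch_msr_bit(msr, true, &byte, &bit) == 0 )
        bitmap[byte] |= (uint8_t)(1u << bit);
}

/* MSRs outside the covered ranges are always intercepted. */
static bool sketch_msr_intercepted(const uint8_t *bitmap, uint32_t msr,
                                   bool write)
{
    size_t byte;
    unsigned int bit;

    if ( sketch_msr_bit(msr, write, &byte, &bit) != 0 )
        return true;

    return (bitmap[byte] >> bit) & 1u;
}
```

Clearing with SKETCH_MSR_R only, as the patch does with VMX_MSR_R for the control MSRs, leaves writes intercepted while letting reads through.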

[Xen-devel] [PATCH 5/6] x86/vvmx: Fix handling of the MSR_BITMAP field with VMCS shadowing

2017-07-19 Thread Andrew Cooper
Currently, the following sequence of actions:

 * VMPTRLD (creates a mapping, likely pointing at gfn 0 for an empty vmcs)
 * VMWRITE CPU_BASED_VM_EXEC_CONTROL (completed by hardware)
 * VMWRITE MSR_BITMAP (completed by hardware)
 * VMLAUNCH

results in an L2 guest running with ACTIVATE_MSR_BITMAP set, but Xen using a
stale mapping (likely gfn 0) when reading the interception bitmap.  The
MSR_BITMAP field needs unconditionally intercepting even with VMCS shadowing,
so Xen's mapping of the bitmap can be updated.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Kevin Tian 
---
 xen/arch/x86/hvm/vmx/vvmx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 0d08789..f84478e 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -98,13 +98,15 @@ int nvmx_vcpu_initialise(struct vcpu *v)
         clear_page(vw);
 
         /*
-         * For the following 4 encodings, we need to handle them in VMM.
+         * For the following 6 encodings, we need to handle them in VMM.
          * Let them vmexit as usual.
          */
         set_bit(IO_BITMAP_A, vw);
         set_bit(VMCS_HIGH(IO_BITMAP_A), vw);
         set_bit(IO_BITMAP_B, vw);
         set_bit(VMCS_HIGH(IO_BITMAP_B), vw);
+        set_bit(MSR_BITMAP, vw);
+        set_bit(VMCS_HIGH(MSR_BITMAP), vw);
 
         unmap_domain_page(vw);
     }
-- 
2.1.4
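
As an illustration of the mechanism the patch above relies on: with VMCS shadowing, hardware consults a 4 KiB VMWRITE bitmap indexed by the low 15 bits of the field encoding, and a set bit makes the VMWRITE exit to the hypervisor instead of completing in hardware. The following is a rough standalone sketch of that lookup, not Xen code. The `sketch_*` / `SKETCH_*` names are hypothetical; the encodings 0x2000, 0x2002 and 0x2004 are the SDM values for the I/O bitmap A, I/O bitmap B and MSR bitmap address fields, and the "high" half of a 64-bit field is assumed to be the encoding with bit 0 set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Field encodings per the Intel SDM; names are local to this sketch. */
#define SKETCH_IO_BITMAP_A  0x2000u
#define SKETCH_IO_BITMAP_B  0x2002u
#define SKETCH_MSR_BITMAP   0x2004u
/* The high 32 bits of a 64-bit field use the encoding with bit 0 set. */
#define SKETCH_VMCS_HIGH(enc) ((enc) | 1u)

#define SKETCH_BITMAP_BYTES 4096u
#define SKETCH_BITMAP_BITS  (SKETCH_BITMAP_BYTES * 8u)

struct sketch_vmwrite_bitmap {
    uint8_t bits[SKETCH_BITMAP_BYTES];
};

/* Mark one field encoding so that a VMWRITE to it causes a VM exit. */
static void sketch_intercept_field(struct sketch_vmwrite_bitmap *bm,
                                   uint32_t enc)
{
    assert(bm != NULL && enc < SKETCH_BITMAP_BITS);
    bm->bits[enc / 8] |= (uint8_t)(1u << (enc % 8));
}

/* Would a VMWRITE to this encoding exit rather than hit the shadow VMCS? */
static bool sketch_vmwrite_exits(const struct sketch_vmwrite_bitmap *bm,
                                 uint32_t enc)
{
    /* Encodings beyond the bitmap's range always exit. */
    if ( enc >= SKETCH_BITMAP_BITS )
        return true;

    return (bm->bits[enc / 8] >> (enc % 8)) & 1u;
}

/* Start from "nothing intercepted", then mark the six encodings. */
static void sketch_init_vmwrite_bitmap(struct sketch_vmwrite_bitmap *bm)
{
    memset(bm, 0, sizeof(*bm));

    sketch_intercept_field(bm, SKETCH_IO_BITMAP_A);
    sketch_intercept_field(bm, SKETCH_VMCS_HIGH(SKETCH_IO_BITMAP_A));
    sketch_intercept_field(bm, SKETCH_IO_BITMAP_B);
    sketch_intercept_field(bm, SKETCH_VMCS_HIGH(SKETCH_IO_BITMAP_B));
    sketch_intercept_field(bm, SKETCH_MSR_BITMAP);
    sketch_intercept_field(bm, SKETCH_VMCS_HIGH(SKETCH_MSR_BITMAP));
}
```

Without the last two calls, a VMWRITE to the MSR bitmap field would be completed by hardware and the hypervisor would never get the chance to refresh its own mapping, which is the stale-mapping sequence described in the commit message.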



Re: [Xen-devel] ARM: Adjusting guest memory size through xl mem-{set|max} fails

2017-07-19 Thread Wei Liu
On Wed, Jul 19, 2017 at 01:52:08PM +0200, Sergej Proskurin wrote:
> 
> ---
> root@avocet:~# xl list
> Name        ID   Mem  VCPUs  State  Time(s)
> Domain-0     0  1024      6  r-        38.9
> domu1        1   511      2  -b         0.3
> root@avocet:~# xl mem-max 1 550m
> root@avocet:~# xl mem-set 1 520m
> libxl: error: libxl_mem.c:272:libxl_set_memory_target: Domain 1:memory_dynamic_max must be less than or equal to memory_static_max
> 

This is a bit strange. What is the maxmem= in your domain config?

I'm not too sure if you can just use xl mem-max. It's a bit messy in
that area.

> cannot set domid 1 dynamic max memory to : 520m
> ---
> 
> According to the error messages from above, I assume this patch will not
> fix the issues on ARMv7 yet, right?
> 

The error you saw on ARMv7 is different from the one above afaict. Not
sure if my patch would fix ARMv7. I'm not too familiar with the inner
workings of ARM guests.


[Xen-devel] [PATCH 6/6] x86/vvmx: Fix auditing of MSR_BITMAP parameter

2017-07-19 Thread Andrew Cooper
The MSR_BITMAP field is required to be page aligned.  Also switch gpa to be a
uint64_t, as the MSR_BITMAP is strictly a 64bit VMCS field.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Jun Nakajima 
CC: Kevin Tian 
---
 xen/arch/x86/hvm/vmx/vvmx.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index f84478e..6ee5385 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -754,14 +754,27 @@ static void __clear_current_vvmcs(struct vcpu *v)
         __vmpclear(nvcpu->nv_n2vmcx_pa);
 }
 
-static bool_t __must_check _map_msr_bitmap(struct vcpu *v)
+/*
+ * Refreshes the MSR bitmap mapping for the current nested vcpu.  Returns true
+ * for a success mapping, and returns false for MSR_BITMAP parameter errors or
+ * gfn mapping errors.
+ */
+static bool __must_check _map_msr_bitmap(struct vcpu *v)
 {
     struct nestedvmx *nvmx = _2_nvmx(v);
-    unsigned long gpa;
+    uint64_t gpa;
 
     if ( nvmx->msrbitmap )
+    {
         hvm_unmap_guest_frame(nvmx->msrbitmap, 1);
+        nvmx->msrbitmap = NULL;
+    }
+
     gpa = get_vvmcs(v, MSR_BITMAP);
+
+    if ( !IS_ALIGNED(gpa, PAGE_SIZE) )
+        return false;
+
     nvmx->msrbitmap = hvm_map_guest_frame_ro(gpa >> PAGE_SHIFT, 1);
 
     return nvmx->msrbitmap != NULL;
-- 
2.1.4
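
The audit added above boils down to two steps: reject a guest physical address that is not page aligned, then convert the accepted address to a frame number by shifting. A small standalone sketch of just that arithmetic follows, assuming 4 KiB pages. The `sketch_*` / `SKETCH_*` names are hypothetical stand-ins for illustration and are not Xen's IS_ALIGNED(), PAGE_SIZE or PAGE_SHIFT definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 4 KiB pages for this sketch. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* True if val is a multiple of align; align must be a power of two. */
#define SKETCH_IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)

/*
 * Validate a guest physical address supplied as the MSR bitmap pointer.
 * On success, stores the guest frame number in *gfn and returns true.
 * Returns false, leaving *gfn untouched, for a misaligned address.
 */
static bool sketch_msr_bitmap_gfn(uint64_t gpa, uint64_t *gfn)
{
    assert(gfn != NULL);

    if ( !SKETCH_IS_ALIGNED(gpa, SKETCH_PAGE_SIZE) )
        return false;

    *gfn = gpa >> SKETCH_PAGE_SHIFT;

    return true;
}
```

Holding the address in a uint64_t, as the patch does, keeps the full width of the 64-bit VMCS field through both the alignment test and the shift.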


