Re: [Xen-devel] [PATCH v1 00/15] arm64: Mediate access to GICv3 sysregs at EL2

2018-03-22 Thread Julien Grall
(Sorry for the formatting)

On 23 Mar 2018 14:46, "Manish Jaggi"  wrote:



On 03/21/2018 03:26 PM, Julien Grall wrote:

> Hi Manish,
>
> On 03/21/2018 09:38 AM, Manish Jaggi wrote:
>
>>
>>
>> On 03/21/2018 02:15 PM, Julien Grall wrote:
>>
>>>
>>>
>>> On 03/21/2018 04:58 AM, Manish Jaggi wrote:
>>>

 Hi Julien,

 On 03/20/2018 01:16 PM, Julien Grall wrote:

>
>
> On 03/16/2018 11:58 AM, Manish Jaggi wrote:
>
>> This patchset is a Xen port of Marc's patchset.
>> arm64: KVM: Mediate access to GICv3 sysregs at EL2 [1]
>>
>> The current RFC patchset is a subset of [1], as it handles only
>> Group1 traps
>> as a PoC. Most of the trap code is added in vsysreg.c. Trap handler
>> function is kept
>> independent of the usual guest trap handling code.
>> Looking for feedback on this approach.
>>
>
> This cover letter does not seem to match the series. Please update it
> every time you send a series.
>
 %s/vsysreg.c/vgic-v3-sr..

 Could you please review the other patches in the series, so that I can
 send v2.

>>>
>>> Here are the major comments for the series (included patch not reviewed):
>>> 1) You seem to miss some patches from Linux. I would like to
>>> understand why they are not there.
>>>
>> if code is ported to xen, it is perfectly fine to take only relevant
>> patches.
>>
>
> It is usually expected from the contributor to have some sort of
> explanation in the cover letter. In particular when you are based on a
> series from Linux.
>
> Where I am more worried is that there are patches on top in Linux that you
> didn't backport. So it would be really nice to understand why those patches
> are not in Xen.
>
> A non-exhaustive list:
> - KVM: arm64: Log an error if trapping a write-to-read-only GICv3
> access
> - KVM: arm64: Log an error if trapping a read-from-write-only
> GICv3 access
>
>
>> For instance we are not providing any command line option to individually
>> enable group1/group0 traps.
>>
>
> I think the command line option could be useful for testing. Developers
> don't necessarily have a Thunder-X at hand.
>
>>> 2) Strangely, some commits do not match the Linux ones either in order
>>> or content (I am not speaking about the changes required by Xen). For
>>> instance this is the case of patch #14 "arm64: vgic-v3: Add
>>> ICV_AP(0/1)Rn_EL1 handler". If you port commits from Linux, then you should
>>> follow the same order and content. This helps a lot with review.
>>>
>> Since we are not individually enabling group0/1, it doesn't make
>> sense to have two sets of patches for ICV_AP0 / ICV_AP1. So I merged them.
>>
>
> Sorry, but it does not make sense. Looking at the series you pointed to, I
> don't see a patch just for ICV_AP0. Instead it is part of "KVM: arm64:
> vgic-v3: Enable trapping of Group-0 system registers". You ported that
> patch to Xen.
>
> If you see this patch, you will find this one specifically for ICV_AP1
https://lists.cs.columbia.edu/pipermail/kvmarm/2017-June/026040.html


You didn't get my point... You still don't explain why you moved the
ICV_AP0 handling from "Enable trapping of Group-0 system registers" to that patch.
If you take commits from Linux, then don't move code between commits
unless there is a good reason.

Please try to make the review a bit easier...

Cheers,
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 00/15] arm64: Mediate access to GICv3 sysregs at EL2

2018-03-22 Thread Manish Jaggi



On 03/21/2018 03:26 PM, Julien Grall wrote:

Hi Manish,

On 03/21/2018 09:38 AM, Manish Jaggi wrote:



On 03/21/2018 02:15 PM, Julien Grall wrote:



On 03/21/2018 04:58 AM, Manish Jaggi wrote:


Hi Julien,

On 03/20/2018 01:16 PM, Julien Grall wrote:



On 03/16/2018 11:58 AM, Manish Jaggi wrote:

This patchset is a Xen port of Marc's patchset.
arm64: KVM: Mediate access to GICv3 sysregs at EL2 [1]

The current RFC patchset is a subset of [1], as it handles only
Group1 traps
as a PoC. Most of the trap code is added in vsysreg.c. Trap 
handler function is kept

independent of the usual guest trap handling code.
Looking for feedback on this approach.


This cover letter does not seem to match the series. Please update
it every time you send a series.

%s/vsysreg.c/vgic-v3-sr..

Could you please review the other patches in the series, so that I 
can send v2.


Here are the major comments for the series (included patch not reviewed):
1) You seem to miss some patches from Linux. I would like to 
understand why they are not there.
if code is ported to xen, it is perfectly fine to take only relevant 
patches.


It is usually expected from the contributor to have some sort of 
explanation in the cover letter. In particular when you are based on a 
series from Linux.


Where I am more worried is that there are patches on top in Linux that you
didn't backport. So it would be really nice to understand why those
patches are not in Xen.


A non-exhaustive list:
- KVM: arm64: Log an error if trapping a write-to-read-only GICv3 
access
    - KVM: arm64: Log an error if trapping a read-from-write-only 
GICv3 access



For instance we are not providing any command line option to
individually enable group1/group0 traps.


I think the command line option could be useful for testing. Developers
don't necessarily have a Thunder-X at hand.


2) Strangely, some commits do not match the Linux ones either in
order or content (I am not speaking about the changes required by
Xen). For instance this is the case of patch #14 "arm64: vgic-v3:
Add ICV_AP(0/1)Rn_EL1 handler". If you port commits from Linux, then
you should follow the same order and content. This helps a lot with review.
Since we are not individually enabling group0/1, it doesn't
make sense to have two sets of patches for ICV_AP0 / ICV_AP1. So I
merged them.


Sorry, but it does not make sense. Looking at the series you pointed to,
I don't see a patch just for ICV_AP0. Instead it is part of "KVM:
arm64: vgic-v3: Enable trapping of Group-0 system registers". You
ported that patch to Xen.



If you see this patch, you will find this one specifically for ICV_AP1
https://lists.cs.columbia.edu/pipermail/kvmarm/2017-June/026040.html

Cheers,





[Xen-devel] [ovmf test] 121046: all pass - PUSHED

2018-03-22 Thread osstest service owner
flight 121046 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121046/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 2f1b849dc82f01ba5df198715f52dd6a0a8051c0
baseline version:
 ovmf df67a480eb81821ba21ad6909e2fda287e745834

Last test of basis   120991  2018-03-20 12:58:39 Z    2 days
Testing same since   121046  2018-03-22 03:46:49 Z    1 days    1 attempts


People who touched revisions under test:
  Achin Gupta 
  Carsey, Jaben 
  Chao Zhang 
  Jaben Carsey 
  Jiaxin Wu 
  Laszlo Ersek 
  Yonghong Zhu 
  Zhang, Chao B 

jobs:
 build-amd64-xsm                         pass
 build-i386-xsm                          pass
 build-amd64                             pass
 build-i386                              pass
 build-amd64-libvirt                     pass
 build-i386-libvirt                      pass
 build-amd64-pvops                       pass
 build-i386-pvops                        pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64    pass
 test-amd64-i386-xl-qemuu-ovmf-amd64     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   df67a480eb..2f1b849dc8  2f1b849dc82f01ba5df198715f52dd6a0a8051c0 -> xen-tested-master


Re: [Xen-devel] [PATCH resend 03/13] acpi: arm: Code to generate Hardware Domains IORT

2018-03-22 Thread Julien Grall

Hi,

On 03/23/2018 04:17 AM, Manish Jaggi wrote:

On 03/23/2018 06:58 AM, Julien Grall wrote:

On 03/13/2018 03:20 PM, mja...@caviumnetworks.com wrote:

+ *
+ * fwits_node - ITS Node pointer in Firmware IORT
+ * offset - offset of the equivalent its node to be stored in
+ *  hwdom's IORT
+ */
+static int is_uniq_fwits_node(struct acpi_iort_node *fwits_node,


The name is a bit odd given that you add the ITS node. On the previous 
version, I requested to document that behavior...
I think the name is quite appropriate. Also, in this patch I have added a
description of the flow, so this should be fairly intuitive.
Could you please let me know the specific point you don't understand? I
can explain that.


The fact that a function named is_* will add the element to the list.
An is_* function should only check whether the element is in the list.


So yes, it is not intuitive for me.



But you likely want to rename the function to add_fwits_node(...) or 
something similar.

I think the name is quite appropriate.


See above.
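
To illustrate the naming point above, here is a minimal sketch, not code from the patch, that keeps the membership check and the insertion in separate functions and also shows one place where the entries could be freed. It assumes a plain singly linked list and calloc()/free() in place of Xen's list_head and xzalloc(); fwits_map_entry, fwits_node_is_mapped() and free_fwits_map() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the patch's fwits_hwits_map. */
struct fwits_map_entry {
    const void *fwits_node;        /* firmware ITS group node */
    unsigned int hwitsnode_offset; /* offset of its copy in hwdom's IORT */
    struct fwits_map_entry *next;
};

static struct fwits_map_entry *fwits_map_head;

/* Pure predicate: reports membership and never modifies the list. */
static bool fwits_node_is_mapped(const void *fwits_node)
{
    const struct fwits_map_entry *e;

    for ( e = fwits_map_head; e != NULL; e = e->next )
        if ( e->fwits_node == fwits_node )
            return true;

    return false;
}

/*
 * Records fwits_node if it is not mapped yet.
 * Returns 1 if added, 0 if already present, -ENOMEM on allocation failure.
 */
static int add_fwits_node(const void *fwits_node, unsigned int offset)
{
    struct fwits_map_entry *e;

    assert(fwits_node != NULL);

    if ( fwits_node_is_mapped(fwits_node) )
        return 0;

    e = calloc(1, sizeof(*e));
    if ( e == NULL )
        return -ENOMEM;

    e->fwits_node = fwits_node;
    e->hwitsnode_offset = offset;
    e->next = fwits_map_head; /* prepends; the patch appends with list_add_tail() */
    fwits_map_head = e;

    return 1;
}

/* Releases every entry, e.g. once the hwdom IORT has been written. */
static void free_fwits_map(void)
{
    while ( fwits_map_head != NULL )
    {
        struct fwits_map_entry *e = fwits_map_head;

        fwits_map_head = e->next;
        free(e);
    }
}
```

With this split, the name of each function says whether the list may change, and the three return values of the add function are documented in one place.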




+  unsigned int offset)
+{
+    struct fwits_hwits_map *map;
+
+    list_for_each_entry(map, &fwits_hwits_list, entry)
+    {
+    if ( map->fwits_node == fwits_node )
+    return 0;
+    }
+
+    map = xzalloc(struct fwits_hwits_map);


Where is this memory going to be freed?

It is not freed, since this list can be used multiple times even after creation of the IORT
for dom0, e.g. thinking ahead for domUs.


IORT for DomU will not rely on the host firmware and will be created by 
the toolstack. It does not make sense to keep that around.



+    if ( !map )
+    return -ENOMEM;
+
+    map->fwits_node = fwits_node;
+    map->hwitsnode_offset = offset;
+    list_add_tail(&map->entry, &fwits_hwits_list);
+
+    return 1;
+}
+
+/*
+ * Returns the offset of corresponding its node to fwits_node
+ * written in hwdom's IORT.
+ *
+ * This function would be used when write hwdoms pcirc nodes' idmap
+ * entries.
+ */
+static
+unsigned int hwitsnode_offset_from_map(struct acpi_iort_node *fwits_node)
+{
+    struct fwits_hwits_map *map;
+
+    list_for_each_entry(map, &fwits_hwits_list, entry)
+    {
+    if ( map->fwits_node == fwits_node )
+    return map->hwitsnode_offset;
+    }
+
+    return 0;


0 could never be a valid offset, right?

Yes
See a bug_on when it is used.


I don't much care about the BUG_ON()... I was only checking what the 
spec says here.


But... you documented the return value really well on the previous function and
here it seems to be forgotten.





+}
+
+static void write_hwits_nodes(u8 *iort, unsigned int *offset,


Please avoid using u* and use uint_* instead. I expect that you fix
all of them for the next version.


Here, I think you want to use void *.


+  unsigned int *num_nodes)
+{
+    struct rid_devid_map *rmap;
+    unsigned int of = *offset;


Please name it off. This is clearer that it is an offset. But as I 
said on the previous version, why can't you just re-use offset?

I will change it to off, will not break anything.


It does not answer my question.




+    int n = 0;
+
+    /*
+ * rid_devid_list is iterated to get unique its group nodes
+ * Each unique ITS group node is written in hardware domains IORT
+ * by using some values from the firmware ITS group node.
+ */
+    list_for_each_entry(rmap, &rid_devid_list, entry)
+    {
+    struct acpi_iort_node *node;
+    struct acpi_iort_its_group *grp;
+    struct acpi_iort_its_group *fw_grp;
+
+    /* save its_node_offset_map in a list uniquely */
+    if ( is_uniq_fwits_node(rmap->its_node, of) == 1 )


If the function returns -ENOMEM, then you will ignore the node
without a warning. It is going to be a real pain to find out that an ITS
node is not present if that happens.


Here, you should propagate the error if something goes wrong.

ok.
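
As an illustration of the error propagation requested above, here is a minimal sketch under stated assumptions; it is not the patch's code. track_node() is a hypothetical stand-in for the helper that returns 1, 0 or -ENOMEM, and a fixed-size table (MAX_TRACKED) stands in for running out of memory. The caller distinguishes the three outcomes and returns the error instead of treating it like a duplicate.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_TRACKED 2 /* hypothetical capacity, models allocation failure */

static const void *tracked[MAX_TRACKED];
static size_t nr_tracked;

/* Returns 1 if node was added, 0 if already tracked, -ENOMEM if full. */
static int track_node(const void *node)
{
    size_t i;

    for ( i = 0; i < nr_tracked; i++ )
        if ( tracked[i] == node )
            return 0;

    if ( nr_tracked == MAX_TRACKED )
        return -ENOMEM;

    tracked[nr_tracked++] = node;
    return 1;
}

/*
 * Counts the nodes newly tracked by this call.
 * Returns 0 on success or the negative error code of the first failure;
 * *num_nodes is only updated on success.
 */
static int write_unique_nodes(const void *const *nodes, size_t count,
                              unsigned int *num_nodes)
{
    unsigned int n = 0;
    size_t i;

    assert(num_nodes != NULL);

    for ( i = 0; i < count; i++ )
    {
        int rc = track_node(nodes[i]);

        if ( rc < 0 )
            return rc;   /* propagate instead of silently skipping */
        if ( rc == 0 )
            continue;    /* duplicate, nothing to write */

        n++;             /* a real implementation would emit the node here */
    }

    *num_nodes = n;
    return 0;
}
```

Changing the writer from void to int is what lets its caller fail the domain build with a clear error rather than produce an IORT with a missing ITS node.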



+    {
+    node = (struct acpi_iort_node *) &iort[of];
+    grp = (struct acpi_iort_its_group *)(&node->node_data);
+
+    node->type = ACPI_IORT_NODE_ITS_GROUP;
+    node->length = sizeof(struct acpi_iort_node) +
+   sizeof(struct acpi_iort_its_group) -
+   sizeof(node->node_data);


While the subtraction is correct, this is odd enough to warrant a
comment. But you likely want to provide macros for defining
the length. This will clean up the code a lot.
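
A minimal sketch of what such length macros could look like. The struct layouts below are simplified, packed stand-ins and not the real ACPICA definitions; IORT_NODE_HDR_LEN, IORT_NODE_LEN() and iort_its_group_node_len() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layouts; the real ones live in the ACPICA headers. */
#pragma pack(push, 1)
struct iort_node {
    uint8_t type;
    uint16_t length;
    uint8_t revision;
    uint32_t reserved;
    uint32_t mapping_count;
    uint32_t mapping_offset;
    char node_data[1]; /* placeholder for the type-specific payload */
};

struct iort_its_group {
    uint32_t its_count;
    uint32_t identifiers[1];
};
#pragma pack(pop)

/*
 * node_data[] is only a placeholder for the payload, so its size must not
 * be counted twice: subtract it from the generic header size.
 */
#define IORT_NODE_HDR_LEN \
    (sizeof(struct iort_node) - sizeof(((struct iort_node *)0)->node_data))

/* Total length of a node carrying one payload of the given type. */
#define IORT_NODE_LEN(payload_type) (IORT_NODE_HDR_LEN + sizeof(payload_type))

/* Length of an ITS group node with a single identifier. */
static size_t iort_its_group_node_len(void)
{
    size_t len = IORT_NODE_LEN(struct iort_its_group);

    assert(len <= UINT16_MAX); /* must fit the 16-bit length field */
    return len;
}
```

The subtraction then appears once, next to the comment that explains it, and each call site reads as a single macro invocation.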



+
+    node->revision = rmap->its_node->revision;


I am not sure this is right. You rewrite the IORT based on a given
revision. Imagine the host IORT gets updated to a newer spec but not Xen.

Not sure if I follow your comment here.
Xen gets host IORT from firmware, how will it get updated?


The host IORT will be built on top of a revision X. For the hwdom IORT, 
at the moment, you are always building on top of revision 0.


While today X == 0, this may change in the future and will prevent 
proper booting.



+
+

Re: [Xen-devel] [PATCH resend 03/13] acpi: arm: Code to generate Hardware Domains IORT

2018-03-22 Thread Manish Jaggi



On 03/23/2018 06:58 AM, Julien Grall wrote:

Hi Manish,

On 03/13/2018 03:20 PM, mja...@caviumnetworks.com wrote:

From: Manish Jaggi 

Structure of Hardware domain's (hwdom) IORT

hwdom's IORT will only have PCIRC nodes and ITS group nodes
in the following order. SMMU nodes as they are hidden from hardware
domain.

[IORT Header]
[ITS Group 1 ]
...
[ITS Group n ]
[PCIRC Node 1]
   [PCIRC IDMAP entry 1]
   ...
   [PCIRC IDMAP entry m]
...
[PCIRC Node p]
   [PCIRC IDMAP entry 1]
   ...
   [PCIRC IDMAP entry q]
...
*n,m,p are variable.

requesterid-deviceid mapping list (rid_devid_list) populated by
parsing IORT is used to generate hwdom IORT.

As the rid_devid_list is populated from firmware IORT, IDMAP entry
would have output references offsets based on firmware's IORT.
It is required to fixup node offset of ITS Group Nodes in the PCIRC
idmap (output_reference)

First write all the ITS group nodes in the hwdom's IORT. For this
write_hwits_nodes is called, which parses the rid_devid_list and for
each unique its_node in firmware IORT create a its_node in hwdom's
IORT and also creates and entry in fwits_hwits_map.

fwits_hwits_map is a mapping between firmware IORT's its node
and the node offset of the corresponding its_node stored in the
hwdom's IORT.

This map can later be used to set output reference value in hwdom's
pcirc node's idmap entries.

Signed-off-by: Manish Jaggi 
---
  xen/arch/arm/acpi/gen-iort.c        | 299
  xen/arch/arm/domain_build.c         |  35 +
  xen/include/asm-arm/acpi.h          |   1 +
  xen/include/asm-arm/acpi/gen-iort.h |  11 ++
  4 files changed, 346 insertions(+)

diff --git a/xen/arch/arm/acpi/gen-iort.c b/xen/arch/arm/acpi/gen-iort.c
index 687c4f18ee..251a9771e3 100644
--- a/xen/arch/arm/acpi/gen-iort.c
+++ b/xen/arch/arm/acpi/gen-iort.c
@@ -19,6 +19,305 @@
    #include 
  #include 
+#include 
+
+/*
+ * Structure of Hardware domain's (hwdom) IORT
+ * ---
+ *
+ * hwdom's IORT will only have PCIRC nodes and ITS group nodes
+ * in the following order.
+ *
+ * [IORT Header]
+ * [ITS Group 1 ]
+ * ...
+ * [ITS Group N ]
+ * [PCIRC Node 1]
+ * [PCIRC IDMAP entry 1]
+ * ...
+ * [PCIRC IDMAP entry N]
+ * ...
+ * [PCIRC Node N]
+ *
+ * requesterid-deviceid mapping list (rid_devid_list) populated by parsing IORT
+ * is used to generate hwdom IORT.
+ *
+ * One of the challanges is to fixup node offset of ITS Group Nodes


s/challanges/challenges/


+ * in the PCIRC idmap (output_reference)
+ *
+ * In rid_devid_map firmware IORT's ITS group node pointer in stored.
+ *
+ * We first write all the ITS group nodes in the hwdom's IORT. For this
+ * write_hwits_nodes is called, which parses the rid_devid_list and for
+ * each unique its_node in firmware IORT create a its_node in hwdom's IORT
+ * and also creates and entry in fwits_hwits_map.
+ *
+ * fwits_hwits_map is a mapping between firmware IORT's its node
+ * and the node offset of the corresponding its_node stored in the
+ * hwdom's IORT.
+ *
+ * This map can be later used to set output reference value in hwdom's
+ * pcirc node's idmap entries.
+ *
+ */
+
+/*
+ * Stores the mapping between firmware tables its group node
+ * to the offset of the equivalent its node to be stored in
+ * hwdom's IORT.
+ */
+struct fwits_hwits_map
+{
+    struct acpi_iort_node *fwits_node;
+    unsigned int hwitsnode_offset;
+    struct list_head entry;
+};
+
+LIST_HEAD(fwits_hwits_list);


As said in the previous version, I think this should be static.


+
+/*
+ * is_uniq_fwits_node
+ *
+ * returns 1 - if fwits_node is not already in the its_map_list
+ * 0 - if it is present already


It also returns -ENOMEM when you can't allocate memory.


+ *
+ * fwits_node - ITS Node pointer in Firmware IORT
+ * offset - offset of the equivalent its node to be stored in
+ *  hwdom's IORT
+ */
+static int is_uniq_fwits_node(struct acpi_iort_node *fwits_node,


The name is a bit odd given that you add the ITS node. On the previous 
version, I requested to document that behavior...
I think the name is quite appropriate. Also, in this patch I have added a
description of the flow, so this should be fairly intuitive.
Could you please let me know the specific point you don't understand? I
can explain that.


But you likely want to rename the function to add_fwits_node(...) or 
something similar.

I think the name is quite appropriate.



+  unsigned int offset)
+{
+    struct fwits_hwits_map *map;
+
+    list_for_each_entry(map, &fwits_hwits_list, entry)
+    {
+    if ( map->fwits_node == fwits_node )
+    return 0;
+    }
+
+    map = xzalloc(struct fwits_hwits_map);


Where is this memory going to be freed?

It is not freed, since this list can be used multiple times even after creation of the IORT
for dom0, e.g. thinking ahead for domUs.

+    if ( !map )
+    return -ENOMEM;
+
+    map->fwits_node = fwits_node;
+    map->hwitsnode_offset = offset;
+   

[Xen-devel] [xen-4.9-testing test] 121015: tolerable FAIL - PUSHED

2018-03-22 Thread osstest service owner
flight 121015 xen-4.9-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121015/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 12 guest-start  fail REGR. vs. 12

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 12
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop            fail blocked in 12
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop             fail like 119954
 test-amd64-amd64-xl-qemuu-ws16-amd64 16 guest-localmigrate/x10 fail like 119954
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop            fail like 12
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop            fail like 12
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop             fail like 12
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop             fail like 12
 test-amd64-amd64-libvirt     13 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt      13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl          13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check     fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt     13 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt     14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm      13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm      14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check   fail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install       fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install        fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install       fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install        fail never pass

version targeted for testing:
 xen  6f8eed4d934b53012c079cb2fca3866e56bf7d25
baseline version:
 xen  88fbabc49158b0b858248fa124ef590c5df7782f

Last test of basis   12  2018-02-24 21:12:43 Z   26 days
Failing since        120063  2018-02-27 13:55:23 Z   23 days   13 attempts
Testing same since   121015  2018-03-21 03:34:22 Z    1 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Boris Ostrovsky 
  Daniel Sabogal 
  George Dunlap 
  Haozhong Zhang 
  Igor Druzhinin 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Liran Alon 
  Martin Cerveny 
  Ross Lagerwall 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-amd64-xsm  p

Re: [Xen-devel] Passthrough a device to DomU on a ARM platform

2018-03-22 Thread Julien Grall

Hello,

Apologies for the late answer.

On 03/16/2018 02:39 PM, Naveed Asmat wrote:

Hi,


I am new to Xen and trying to understand how VGA passthrough
will work on ARM-based hardware.


I am not sure what you mean by VGA passthrough. Do you want to
pass through the graphics device? If so, is it a PCI device or an integrated one?


Cheers,

--
Julien Grall


[Xen-devel] [linux-linus test] 121012: regressions - trouble: blocked/broken/fail/pass

2018-03-22 Thread osstest service owner
flight 121012 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121012/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt  broken
 test-amd64-i386-xl-xsm        7 xen-boot                 fail REGR. vs. 118324
 test-amd64-i386-libvirt       7 xen-boot                 fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-freebsd10-amd64  7 xen-boot              fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot     fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot           fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot           fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start           fail REGR. vs. 118324
 test-amd64-i386-xl-raw        7 xen-boot                 fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start             fail REGR. vs. 118324
 test-amd64-i386-examine       8 reboot                   fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot         fail REGR. vs. 118324
 test-amd64-i386-pair         10 xen-boot/src_host        fail REGR. vs. 118324
 test-amd64-i386-pair         11 xen-boot/dst_host        fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 10 xen-boot/src_host        fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_host        fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-rumprun-i386  7 xen-boot                 fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot         fail REGR. vs. 118324
 test-amd64-i386-freebsd10-i386  7 xen-boot               fail REGR. vs. 118324
 test-amd64-i386-xl            7 xen-boot                 fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot          fail REGR. vs. 118324
 test-amd64-i386-libvirt-xsm   7 xen-boot                 fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot     fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 build-armhf-libvirt           5 host-build-prep          fail REGR. vs. 118324

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt      1 build-check(1)           blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)           blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)           blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop       fail like 118324
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop       fail like 118324
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop       fail like 118324
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop       fail like 118324
 test-amd64-i386-xl-pvshim     7 xen-boot                  fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl          13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check     fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm      13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm      14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migr

Re: [Xen-devel] [PATCH resend 03/13] acpi: arm: Code to generate Hardware Domains IORT

2018-03-22 Thread Julien Grall

Hi Manish,

On 03/13/2018 03:20 PM, mja...@caviumnetworks.com wrote:

From: Manish Jaggi 

Structure of Hardware domain's (hwdom) IORT

hwdom's IORT will only have PCIRC nodes and ITS group nodes
in the following order. SMMU nodes as they are hidden from hardware
domain.

[IORT Header]
[ITS Group 1 ]
...
[ITS Group n ]
[PCIRC Node 1]
   [PCIRC IDMAP entry 1]
   ...
   [PCIRC IDMAP entry m]
...
[PCIRC Node p]
   [PCIRC IDMAP entry 1]
   ...
   [PCIRC IDMAP entry q]
...
*n,m,p are variable.

requesterid-deviceid mapping list (rid_devid_list) populated by
parsing IORT is used to generate hwdom IORT.

As the rid_devid_list is populated from firmware IORT, IDMAP entry
would have output references offsets based on firmware's IORT.
It is required to fixup node offset of ITS Group Nodes in the PCIRC
idmap (output_reference)

First write all the ITS group nodes in the hwdom's IORT. For this
write_hwits_nodes is called, which parses the rid_devid_list and for
each unique its_node in firmware IORT create a its_node in hwdom's
IORT and also creates and entry in fwits_hwits_map.

fwits_hwits_map is a mapping between firmware IORT's its node
and the node offset of the corresponding its_node stored in the
hwdom's IORT.

This map can later be used to set output reference value in hwdom's
pcirc node's idmap entries.

Signed-off-by: Manish Jaggi 
---
  xen/arch/arm/acpi/gen-iort.c        | 299
  xen/arch/arm/domain_build.c         |  35 +
  xen/include/asm-arm/acpi.h          |   1 +
  xen/include/asm-arm/acpi/gen-iort.h |  11 ++
  4 files changed, 346 insertions(+)

diff --git a/xen/arch/arm/acpi/gen-iort.c b/xen/arch/arm/acpi/gen-iort.c
index 687c4f18ee..251a9771e3 100644
--- a/xen/arch/arm/acpi/gen-iort.c
+++ b/xen/arch/arm/acpi/gen-iort.c
@@ -19,6 +19,305 @@
  
  #include 

  #include 
+#include 
+
+/*
+ * Structure of Hardware domain's (hwdom) IORT
+ * ---
+ *
+ * hwdom's IORT will only have PCIRC nodes and ITS group nodes
+ * in the following order.
+ *
+ * [IORT Header]
+ * [ITS Group 1 ]
+ * ...
+ * [ITS Group N ]
+ * [PCIRC Node 1]
+ * [PCIRC IDMAP entry 1]
+ * ...
+ * [PCIRC IDMAP entry N]
+ * ...
+ * [PCIRC Node N]
+ *
+ * requesterid-deviceid mapping list (rid_devid_list) populated by parsing IORT
+ * is used to generate hwdom IORT.
+ *
+ * One of the challanges is to fixup node offset of ITS Group Nodes


s/challanges/challenges/


+ * in the PCIRC idmap (output_reference)
+ *
+ * In rid_devid_map firmware IORT's ITS group node pointer in stored.
+ *
+ * We first write all the ITS group nodes in the hwdom's IORT. For this
+ * write_hwits_nodes is called, which parses the rid_devid_list and for
+ * each unique its_node in firmware IORT create a its_node in hwdom's IORT
+ * and also creates and entry in fwits_hwits_map.
+ *
+ * fwits_hwits_map is a mapping between firmware IORT's its node
+ * and the node offset of the corresponding its_node stored in the
+ * hwdom's IORT.
+ *
+ * This map can be later used to set output reference value in hwdom's
+ * pcirc node's idmap entries.
+ *
+ */
+
+/*
+ * Stores the mapping between firmware tables its group node
+ * to the offset of the equivalent its node to be stored in
+ * hwdom's IORT.
+ */
+struct fwits_hwits_map
+{
+struct acpi_iort_node *fwits_node;
+unsigned int hwitsnode_offset;
+struct list_head entry;
+};
+
+LIST_HEAD(fwits_hwits_list);


As said in the previous version, I think this should be static.


+
+/*
+ * is_uniq_fwits_node
+ *
+ * returns 1 - if fwits_node is not already in the its_map_list
+ * 0 - if it is present already


It also returns -ENOMEM when you can't allocate memory.


+ *
+ * fwits_node - ITS Node pointer in Firmware IORT
+ * offset - offset of the equivalent its node to be stored in
+ *  hwdom's IORT
+ */
+static int is_uniq_fwits_node(struct acpi_iort_node *fwits_node,


The name is a bit odd given that you add the ITS node. On the previous 
version, I requested to document that behavior...


But you likely want to rename the function to add_fwits_node(...) or 
something similar.



+  unsigned int offset)
+{
+struct fwits_hwits_map *map;
+
+list_for_each_entry(map, &fwits_hwits_list, entry)
+{
+if ( map->fwits_node == fwits_node )
+return 0;
+}
+
+map = xzalloc(struct fwits_hwits_map);


Where is this memory going to be freed?


+if ( !map )
+return -ENOMEM;
+
+map->fwits_node = fwits_node;
+map->hwitsnode_offset = offset;
+list_add_tail(&map->entry, &fwits_hwits_list);
+
+return 1;
+}
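
To make the three return values and the missing free path concrete, here is a minimal standalone sketch in plain C. It is not Xen code and only illustrates the shape I have in mind: calloc()/free() stand in for xzalloc()/xfree(), a hand-rolled singly linked list stands in for list_head, and the names add_fwits_node(), free_fwits_map() and struct fwits_map_entry are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fwits_hwits_map. */
struct fwits_map_entry {
    const void *fwits_node;         /* ITS node in the firmware IORT */
    unsigned int hwitsnode_offset;  /* offset of its copy in hwdom's IORT */
    struct fwits_map_entry *next;
};

/* File-local list head, i.e. static as requested above. */
static struct fwits_map_entry *fwits_map_head;

/*
 * Returns  1       if fwits_node was not yet known and has been added,
 *          0       if fwits_node is already in the map,
 *         -ENOMEM  if the new entry could not be allocated.
 */
int add_fwits_node(const void *fwits_node, unsigned int offset)
{
    struct fwits_map_entry **pp = &fwits_map_head;
    struct fwits_map_entry *e;

    /* Walk to the tail, bailing out early on a duplicate. */
    for ( ; *pp; pp = &(*pp)->next )
        if ( (*pp)->fwits_node == fwits_node )
            return 0;

    e = calloc(1, sizeof(*e));
    if ( !e )
        return -ENOMEM;

    e->fwits_node = fwits_node;
    e->hwitsnode_offset = offset;
    *pp = e;                        /* append at the tail */

    return 1;
}

/* Releases every entry; to be called once the hwdom IORT is written. */
void free_fwits_map(void)
{
    while ( fwits_map_head )
    {
        struct fwits_map_entry *e = fwits_map_head;

        fwits_map_head = e->next;
        free(e);
    }
}
```

With this shape the caller has to treat a negative return as an error rather than as "not unique", and there is one obvious place to drop the map when it is no longer needed.
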
+
+/*
+ * Returns the offset of corresponding its node to fwits_node
+ * written in hwdom's IORT.
+ *
+ * This function would be used when writing hwdom's pcirc nodes' idmap
+ * entries.
+ */
+static
+unsigned int hwitsnode_offset_from_map(struct acpi_iort_node *fwits_node)
+{
+struct fwits_hwits_map *map;
+
+list_for_each_entry(map, &fwits_hwits

Re: [Xen-devel] [PATCH v12 10/12] vpci: add a priority parameter to the vPCI register initializer

2018-03-22 Thread Julien Grall

Hi Roger,

On 03/22/2018 01:58 PM, Roger Pau Monne wrote:

This is needed for MSI-X, since MSI-X will need to be initialized
before parsing the BARs, so that the header BAR handlers are aware of
the MSI-X related holes and make sure they are not mapped in order for
the trap handlers to work properly.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 


For ARM bits:

Acked-by: Julien Grall 

Cheers,


---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Tim Deegan 
Cc: Wei Liu 
---
Changes since v4:
  - Add a middle priority and add the PCI header to it.

Changes since v3:
  - Add a numerical suffix to the section used to store the pointer to
each initializer function, and sort them at link time.
---
  xen/arch/arm/xen.lds.S| 4 ++--
  xen/arch/x86/xen.lds.S| 4 ++--
  xen/drivers/vpci/header.c | 2 +-
  xen/drivers/vpci/msi.c| 2 +-
  xen/include/xen/vpci.h| 8 ++--
  5 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 49cae2af71..245a0e0e85 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -69,7 +69,7 @@ SECTIONS
  #if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
 . = ALIGN(POINTER_ALIGN);
 __start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
 __end_vpci_array = .;
  #endif
} :text
@@ -182,7 +182,7 @@ SECTIONS
  #if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
 . = ALIGN(POINTER_ALIGN);
 __start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
 __end_vpci_array = .;
  #endif
} :text
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 7bd6fb51c3..70afedd31d 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -139,7 +139,7 @@ SECTIONS
  #if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
 . = ALIGN(POINTER_ALIGN);
 __start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
 __end_vpci_array = .;
  #endif
} :text
@@ -246,7 +246,7 @@ SECTIONS
  #if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
 . = ALIGN(POINTER_ALIGN);
 __start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
 __end_vpci_array = .;
  #endif
} :text
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 25d8ec0507..9fa07992cc 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -532,7 +532,7 @@ static int init_bars(struct pci_dev *pdev)
  
  return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;

  }
-REGISTER_VPCI_INIT(init_bars);
+REGISTER_VPCI_INIT(init_bars, VPCI_PRIORITY_MIDDLE);
  
  /*

   * Local variables:
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index c3c69ec453..de4ddf562e 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -267,7 +267,7 @@ static int init_msi(struct pci_dev *pdev)
  
  return 0;

  }
-REGISTER_VPCI_INIT(init_msi);
+REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
  
  void vpci_dump_msi(void)

  {
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 116b93f519..7266c17679 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -15,9 +15,13 @@ typedef void vpci_write_t(const struct pci_dev *pdev, 
unsigned int reg,
  
  typedef int vpci_register_init_t(struct pci_dev *dev);
  
-#define REGISTER_VPCI_INIT(x)   \

+#define VPCI_PRIORITY_HIGH  "1"
+#define VPCI_PRIORITY_MIDDLE"5"
+#define VPCI_PRIORITY_LOW   "9"
+
+#define REGISTER_VPCI_INIT(x, p)\
static vpci_register_init_t *const x##_entry  \
-   __used_section(".data.vpci") = x
+   __used_section(".data.vpci." p) = x
  
  /* Add vPCI handlers to device. */

  int __must_check vpci_add_handlers(struct pci_dev *dev);
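
For anyone wondering how the string priorities translate into call order: as far as I understand, SORT() in the linker script orders the input sections by name in ascending order, so the initializer placed in ".data.vpci.1" ends up before the one in ".data.vpci.5" in the array. The sketch below only models that ordering in plain C with qsort() and strcmp(); it does not use the linker at all, and struct vpci_init_model, sort_by_section() and the init_msix name are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one REGISTER_VPCI_INIT() entry. */
struct vpci_init_model {
    const char *section;  /* e.g. ".data.vpci." VPCI_PRIORITY_MIDDLE */
    const char *name;     /* name of the initializer, for illustration */
};

/* Compare two entries by section name, byte by byte. */
static int cmp_section(const void *a, const void *b)
{
    const struct vpci_init_model *x = a;
    const struct vpci_init_model *y = b;

    return strcmp(x->section, y->section);
}

/* Order entries the way a sort-by-name over the section names would. */
void sort_by_section(struct vpci_init_model *entries, size_t n)
{
    qsort(entries, n, sizeof(*entries), cmp_section);
}
```

Sorting { ".data.vpci.9", ".data.vpci.5", ".data.vpci.1" } this way yields 1, 5, 9. One consequence of comparing names as strings is that a two-digit suffix such as "10" would sort before "5", so the single-digit values in the patch are the safe choice.
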



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3a 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Julien Grall

Hi,

On 03/22/2018 03:12 PM, Andre Przywara wrote:

Hi,

On 22/03/18 14:06, Julien Grall wrote:

Hi Andre,

On 03/22/2018 11:56 AM, Andre Przywara wrote:

+    /* The locking order forces us to drop and re-take the locks
here. */
+    if ( irq->hw )
+    {
+    spin_unlock(&irq->irq_lock);
+
+    desc = irq_to_desc(irq->hwintid);
+    spin_lock(&desc->lock);
+    spin_lock(&irq->irq_lock);
+
+    /* This h/w IRQ should still be assigned to the virtual
IRQ. */
+    ASSERT(irq->hw && desc->irq == irq->hwintid);
+
+    have_desc_lock = true;
+    }


I am a bit concerned about this dance in fold_lr_state(). This looks
awfully complex but I don't have a better solution here.
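
For reference, the pattern in the hunk above boils down to the following standalone sketch. It uses pthread mutexes instead of Xen spinlocks, and struct hw_desc, struct virq and relock_in_order() are hypothetical names. One deliberate difference: where the patch has an ASSERT(), the sketch re-checks the state after re-taking the lock and backs out if it changed, which is just one possible way to handle the window.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for irq_desc and struct vgic_irq. */
struct hw_desc {
    pthread_mutex_t lock;
    unsigned int irq;
};

struct virq {
    pthread_mutex_t irq_lock;
    bool hw;
    unsigned int hwintid;
};

/*
 * Lock order is desc->lock first, then v->irq_lock.
 * Called with v->irq_lock held; returns with v->irq_lock held.
 * Returns true if desc->lock is held as well on return.
 */
bool relock_in_order(struct virq *v, struct hw_desc *desc)
{
    if ( !v->hw )
        return false;

    /* Drop the inner lock so the outer one can be taken first. */
    pthread_mutex_unlock(&v->irq_lock);
    pthread_mutex_lock(&desc->lock);
    pthread_mutex_lock(&v->irq_lock);

    /* The state may have changed while irq_lock was dropped. */
    if ( !v->hw || desc->irq != v->hwintid )
    {
        pthread_mutex_unlock(&desc->lock);
        return false;
    }

    return true;
}
```

The part that makes this uncomfortable is the window between the unlock and the second lock: anything read under irq_lock before that point has to be treated as stale afterwards.
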


I agree.


I still don't have much of an idea how to solve that nicely. Maybe Stefano has?

Meanwhile, I would be happy to get that in Xen:

Acked-by: Julien Grall 


I will have a think during the night.

However, this is not going to solve the race condition I mentioned
between clearing _IRQ_INPROGRESS here and setting _IRQ_INPROGRESS in
do_IRQ. This is because you don't know the order they are going to be
executed.

I wanted to make sure you didn't intend to solve that one. Am I correct?


This is right, this is orthogonal and not addressed by this patch. I
have a hunch we need to solve this in irq.c instead.


I guess this should be logged on Jira with the rest of the open items.



Cheers,
Andre.



--
Julien Grall

[Xen-devel] [xen-4.6-testing test] 121031: regressions - FAIL

2018-03-22 Thread osstest service owner
flight 121031 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121031/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-xtf-amd64-amd64-3 50 xtf/test-hvm64-lbr-tsx-vmentry fail REGR. vs. 119227
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 16 guest-localmigrate/x10 fail REGR. vs. 119227
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 119227

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-2  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 119187
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail  like 119187
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 119227
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 119227
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 119227
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119227
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 119227
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 119227
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119227
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 119227
 test-xtf-amd64-amd64-2   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-3   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-1   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-5   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-2   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-3   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-1   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-5   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-4   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-4   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install  

Re: [Xen-devel] X86 Community Call - Wed Apr 11, 14:00 - 15:00 UTC - Call for Agenda Items

2018-03-22 Thread Stefano Stabellini
On Thu, 22 Mar 2018, Lars Kurth wrote:
> On 22/03/2018, 14:49, "Julien Grall"  wrote:
> 
> >> -
> >>
> >> I think we need to discuss PCI emulation and our future direction. Our 
> current hybrid with QEMU is becoming increasingly problematic.
> > 
> > +1
> 
> I think it would be worth for Stefano and I to join this discussion. 
> Ideally, we want to use a common solution between Arm and x86.
> 
> Not sure the time will fit for Stefano though.
> 
> It's at 7am Pacific, which is a little early for Stefano. I can't really move 
> the call: it was quite hard to agree a time-slot.
> But we could aim to schedule this discussion for say 7:30 or 7:45, which 
> makes this easier for Stefano

Yes, indeed it is very early for Stefano :-)

But I can do 7:30-7:45 for once.

In general, for things that interest both x86 and Arm, and PCI
passthrough is a great example, I think it would be best to organize
topic specific calls (that I would love to push to 8AM or later ;-)

[Xen-devel] [rumprun test] 121042: regressions - FAIL

2018-03-22 Thread osstest service owner
flight 121042 rumprun real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121042/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumprun   6 rumprun-build fail REGR. vs. 106754
 build-i386-rumprun    6 rumprun-build fail REGR. vs. 106754

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumprun-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-rumprun-i386  1 build-check(1)   blocked  n/a

version targeted for testing:
 rumprun  94bdf32ac57b84c1b42150d21f0ad79b3b5dd99c
baseline version:
 rumprun  c7f2f016becc1cd0e85da6e1b25a8e7f9fb2aa74

Last test of basis   106754  2017-03-18 04:21:25 Z  369 days
Testing same since   120360  2018-03-09 04:19:20 Z   13 days   10 attempts


People who touched revisions under test:
  Kent McLeod 
  Kent McLeod 
  Naja Melan 
  Sebastian Wicki 
  Wei Liu 

jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-pvops pass
 build-i386-pvops pass
 build-amd64-rumprun  fail
 build-i386-rumprun   fail
 test-amd64-amd64-rumprun-amd64   blocked 
 test-amd64-i386-rumprun-i386 blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 94bdf32ac57b84c1b42150d21f0ad79b3b5dd99c
Merge: 8fe40c8 b3c1033
Author: Kent McLeod 
Date:   Fri Feb 16 09:15:45 2018 +1100

Merge pull request #118 from kent-mcleod/stretch-linking-defaultpie

Fix linking on Debian Stretch (gcc-6)

commit b3c1033b090b65e8e86999ddd063c174502aa3f0
Author: Kent McLeod 
Date:   Wed Feb 14 16:43:16 2018 +1100

Add further -no-pie checks to Rumprun build tools

This builds upon the previous commit to add -no-pie anywhere the
relocatable flag (-Wl,-r) is used to handle compilers that enable -pie
by default (Such as Debian Stretch).

commit 8fe40c84edddfbf472b4a7cce960df749701174c
Merge: c7f2f01 685f4ab
Author: Sebastian Wicki 
Date:   Fri Jan 5 15:04:18 2018 +0100

Merge pull request #112 from najamelan/bugfix/gcc7-fallthrough

Add the -Wimplicit-fallthrough=0 flag to allow compiling with GCC7

commit 685f4ab3b74b6f1e1b40bdd3d2c42efa44bf385d
Author: Naja Melan 
Date:   Thu Jan 4 16:07:46 2018 +

Make the disabling of the fallthrough warning dependent on GCC version

This should prevent older gcc versions from choking on unknown argument.

I have not tested this, just wrote the code directly on github. Use with 
caution.

commit 34056451174e8722b972229fefc1bf9e0b89a7da
Author: Naja Melan 
Date:   Wed Jan 3 18:57:50 2018 +

Add the -Wimplicit-fallthrough=0 flag to allow compiling with GCC7

GCC7 comes with a new warning "implicit-fallthrough" which will prevent 
building the netbsd-src.

For more information: 
https://dzone.com/articles/implicit-fallthrough-in-gcc-7

commit 35d81194b7feb75d20af3ba4fdb45ea76230852f
Author: Wei Liu 
Date:   Wed Jun 7 16:30:00 2017 +0100

Fix linking on Debian Stretch

Provide cc-option. Use that to check if -no-pie is available and
append it when necessary.

Signed-off-by: Wei Liu 

[Xen-devel] [xen-unstable-smoke test] 121068: tolerable all pass - PUSHED

2018-03-22 Thread osstest service owner
flight 121068 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121068/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass

version targeted for testing:
 xen  e633b13a18f7a7e407cba2de42a5a2a86aaec9c1
baseline version:
 xen  6161d9f27fcb6c48021e6928bb240dfa39d9f1d3

Last test of basis   121065  2018-03-22 16:01:10 Z    0 days
Testing same since   121068  2018-03-22 19:05:15 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   6161d9f27f..e633b13a18  e633b13a18f7a7e407cba2de42a5a2a86aaec9c1 -> smoke

Re: [Xen-devel] [PATCH v2] qemu: replace "" with <> in headers

2018-03-22 Thread Paolo Bonzini
On 22/03/2018 20:29, Michael S. Tsirkin wrote:
> On Wed, Mar 21, 2018 at 05:22:03PM +0100, Kevin Wolf wrote:
>>> It's all still very much a non-standard convention and so less robust
>>> than prefixing file name with a project-specific prefix.
>> I've always had the impression that it's by far the most common
>> convention, to the point that I'd blindly assume it when joining a new
>> project.
> 
> Any examples?

GCC - https://github.com/gcc-mirror/gcc/blob/master/gcc/reload.c
Libvirt - https://github.com/libvirt/libvirt/blob/master/src/util/virprocess.c
SDL - https://github.com/SDL-mirror/SDL/blob/master/src/core/unix/SDL_poll.c

Anything but Linux really.

I find  verbose and unnecessary.  The only advantage
of your proposal is that files included from the source directory would be
clearly noticeable.  That said, it's easy to add a checkpatch.pl rule that
detects when ".../..." is used on a file not under include/.

Paolo

Re: [Xen-devel] [PATCH v2 1/2] make: move generated headers to qemu-build/

2018-03-22 Thread Michael S. Tsirkin
On Thu, Mar 22, 2018 at 02:42:55PM -0500, Eric Blake wrote:
> On 03/22/2018 02:27 PM, Michael S. Tsirkin wrote:
> > Make sure all generated files go into qemu-build subdirectory.
> > We can then include them like this:
> >   #include "qemu-build/trace.h"
> > 
> > This serves two purposes:
> > - make it easy to detect which files are in the source
> >directory (a bit more work for writers, easier for readers)
> > - reduce chances of conflicts with possible stale files in source
> >directory (which could be left over from e.g. old patches, etc)
> > 
> > This patch needs to be merged with patch 2  of series updating all
> > files: sending it separately to avoid spamming the list.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> 
> > +++ b/Makefile
> > @@ -89,102 +89,102 @@ endif
> >   include $(SRC_PATH)/rules.mak
> > -GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
> > -GENERATED_FILES += qapi/qapi-builtin-types.h qapi/qapi-builtin-types.c
> 
> Uggh - I really need to follow up on my threat to make smarter use of make
> variables and string manipulation to cut down on the boilerplate involved
> here.  Sadly, I'm not convinced that doing so is a 2.12 bugfix priority, so
> it isn't at the top of my work queue.
> 
> Overall, the patch is an interesting idea.  I'm still not 100% sold on it
> (as you say, it's now slightly more work for writers), but I'm not coming up
> with any solid reasons why it should not be applied (at least, for 2.13 -
> doing it during freeze for 2.12 is a bit harder to justify).

It's up to Peter really: it helps reduce conflicts if we apply patches
like this during freeze.  But with enough effort on Peter's part it's
not a huge deal.

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org

Re: [Xen-devel] [PATCH v2 1/2] make: move generated headers to qemu-build/

2018-03-22 Thread Eric Blake

On 03/22/2018 02:27 PM, Michael S. Tsirkin wrote:

Make sure all generated files go into qemu-build subdirectory.
We can then include them like this:
  #include "qemu-build/trace.h"

This serves two purposes:
- make it easy to detect which files are in the source
   directory (a bit more work for writers, easier for readers)
- reduce chances of conflicts with possible stale files in source
   directory (which could be left over from e.g. old patches, etc)

This patch needs to be merged with patch 2  of series updating all
files: sending it separately to avoid spamming the list.

Signed-off-by: Michael S. Tsirkin 
---



+++ b/Makefile
@@ -89,102 +89,102 @@ endif
  
  include $(SRC_PATH)/rules.mak
  
-GENERATED_FILES = qemu-version.h config-host.h qemu-options.def

-GENERATED_FILES += qapi/qapi-builtin-types.h qapi/qapi-builtin-types.c


Uggh - I really need to follow up on my threat to make smarter use of 
make variables and string manipulation to cut down on the boilerplate 
involved here.  Sadly, I'm not convinced that doing so is a 2.12 bugfix 
priority, so it isn't at the top of my work queue.


Overall, the patch is an interesting idea.  I'm still not 100% sold on 
it (as you say, it's now slightly more work for writers), but I'm not 
coming up with any solid reasons why it should not be applied (at least, 
for 2.13 - doing it during freeze for 2.12 is a bit harder to justify).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Xen-devel] [PATCH v2] qemu: replace "" with <> in headers

2018-03-22 Thread Michael S. Tsirkin
On Wed, Mar 21, 2018 at 05:22:03PM +0100, Kevin Wolf wrote:
> > It's all still very much a non-standard convention and so less robust
> > than prefixing file name with a project-specific prefix.
> 
> I've always had the impression that it's by far the most common
> convention, to the point that I'd blindly assume it when joining a new
> project.

Any examples?

> > > > As another example of problems, a header by the same name in the source
> > > > directory will always be picked up first - before any headers in
> > > > the include directory.
> > > > 
> > > > Let's change the scheme: make sure all headers that are not
> > > > in the source directory are included through a path
> > > > starting with qemu/ , thus:
> > > > 
> > > >  #include <>
> > > > 
> > > > headers in the same directory as source are included with
> > > > 
> > > >  #include ""
> > > > 
> > > > as per standard.
> > > > 
> > > > This (untested) patch is just to start the discussion and does not
> > > > change all of the codebase. If there's agreement, this will be
> > > > run on all code to convert it to this scheme.
> > > 
> > > Renaming files is always painful. If that's the fix, the cure might be
> > > worse than the disease. As far as I know, the conflict is only
> > > theoretical, so in that case I'd say: If it ain't broke, don't fix it.
> > > 
> > > Kevin
> > 
> > It's broke I think, it's very hard for new people to contribute to QEMU.
> > Look e.g. at rdma which all has messed up includes - and that's from an
> > experienced contributor who just isn't an experienced maintainer.
> 
> I don't think the problem is that the convention is hard to apply (it's
> definitely not). It's knowing about the convention. This problem isn't
> going away by switching to a different, less common convention. We're
> only going to see more offenders then.

Not if we have some automatic tools to catch violators.

> > Amount of time spent on teaching new people trivia about our
> > conventions just isn't funny. They should be self-documenting
> > and violations should cause the build to fail.
> 
> Yes, but your proposal doesn't achieve this. You can still use
> "qemu/foo.h" instead of  and it will build successfully.
> That's something we can't change, as far as I know, because the include
> path for "foo.h" is always a superset of .

If the rule is that "" is only for files in the current directory
then we can easily code up a checkpatch script to catch violators.

> If anything, this means that we should prefer "foo.h" for local headers
> (i.e. the way it currently is) because we can let the compiler enforce
> it:  for "foo.h" can become a build error, and does so with your
> -iquote patch, but the other way round doesn't work.
> 
> Then it's only system headers that you can possibly get wrong, but for
> those everyone should be used to using  anyway.
> 
> Kevin

If my proposal to prefix all include directories with qemu/
is accepted, then we can solve the stale file problem
by prohibiting a directory named qemu everywhere in source.


-- 
MST

[Xen-devel] [PATCH v2 1/2] make: move generated headers to qemu-build/

2018-03-22 Thread Michael S. Tsirkin
Make sure all generated files go into qemu-build subdirectory.
We can then include them like this:
 #include "qemu-build/trace.h"

This serves two purposes:
- make it easy to detect which files are in the source
  directory (a bit more work for writers, easier for readers)
- reduce chances of conflicts with possible stale files in source
  directory (which could be left over from e.g. old patches, etc)

This patch needs to be merged with patch 2  of series updating all
files: sending it separately to avoid spamming the list.

Signed-off-by: Michael S. Tsirkin 
---
 configure   |   6 +-
 Makefile| 412 +++-
 rules.mak   |   5 +-
 .gitignore  |   1 +
 Makefile.objs   | 144 +-
 Makefile.target |  21 +--
 trace/Makefile.objs |  15 +-
 7 files changed, 313 insertions(+), 291 deletions(-)

diff --git a/configure b/configure
index 23a4f3b..7b0a183 100755
--- a/configure
+++ b/configure
@@ -6638,6 +6638,8 @@ if test "$gcov" = "yes" ; then
   echo "GCOV=$gcov_tool" >> $config_host_mak
 fi
 
+mkdir -p qemu-build
+
 # use included Linux headers
 if test "$linux" = "yes" ; then
   mkdir -p linux-headers
@@ -7046,10 +7048,10 @@ echo "QEMU_CFLAGS+=$cflags" >> $config_target_mak
 done # for target in $targets
 
 if [ "$dtc_internal" = "yes" ]; then
-  echo "config-host.h: subdir-dtc" >> $config_host_mak
+  echo "qemu-build/config-host.h: subdir-dtc" >> $config_host_mak
 fi
 if [ "$capstone" = "git" -o "$capstone" = "internal" ]; then
-  echo "config-host.h: subdir-capstone" >> $config_host_mak
+  echo "qemu-build/config-host.h: subdir-capstone" >> $config_host_mak
 fi
 if test -n "$LIBCAPSTONE"; then
   echo "LIBCAPSTONE=$LIBCAPSTONE" >> $config_host_mak
diff --git a/Makefile b/Makefile
index f799390..6fd90a8 100644
--- a/Makefile
+++ b/Makefile
@@ -89,102 +89,102 @@ endif
 
 include $(SRC_PATH)/rules.mak
 
-GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
-GENERATED_FILES += qapi/qapi-builtin-types.h qapi/qapi-builtin-types.c
-GENERATED_FILES += qapi/qapi-types.h qapi/qapi-types.c
-GENERATED_FILES += qapi/qapi-types-block-core.h qapi/qapi-types-block-core.c
-GENERATED_FILES += qapi/qapi-types-block.h qapi/qapi-types-block.c
-GENERATED_FILES += qapi/qapi-types-char.h qapi/qapi-types-char.c
-GENERATED_FILES += qapi/qapi-types-common.h qapi/qapi-types-common.c
-GENERATED_FILES += qapi/qapi-types-crypto.h qapi/qapi-types-crypto.c
-GENERATED_FILES += qapi/qapi-types-introspect.h qapi/qapi-types-introspect.c
-GENERATED_FILES += qapi/qapi-types-migration.h qapi/qapi-types-migration.c
-GENERATED_FILES += qapi/qapi-types-misc.h qapi/qapi-types-misc.c
-GENERATED_FILES += qapi/qapi-types-net.h qapi/qapi-types-net.c
-GENERATED_FILES += qapi/qapi-types-rocker.h qapi/qapi-types-rocker.c
-GENERATED_FILES += qapi/qapi-types-run-state.h qapi/qapi-types-run-state.c
-GENERATED_FILES += qapi/qapi-types-sockets.h qapi/qapi-types-sockets.c
-GENERATED_FILES += qapi/qapi-types-tpm.h qapi/qapi-types-tpm.c
-GENERATED_FILES += qapi/qapi-types-trace.h qapi/qapi-types-trace.c
-GENERATED_FILES += qapi/qapi-types-transaction.h qapi/qapi-types-transaction.c
-GENERATED_FILES += qapi/qapi-types-ui.h qapi/qapi-types-ui.c
-GENERATED_FILES += qapi/qapi-builtin-visit.h qapi/qapi-builtin-visit.c
-GENERATED_FILES += qapi/qapi-visit.h qapi/qapi-visit.c
-GENERATED_FILES += qapi/qapi-visit-block-core.h qapi/qapi-visit-block-core.c
-GENERATED_FILES += qapi/qapi-visit-block.h qapi/qapi-visit-block.c
-GENERATED_FILES += qapi/qapi-visit-char.h qapi/qapi-visit-char.c
-GENERATED_FILES += qapi/qapi-visit-common.h qapi/qapi-visit-common.c
-GENERATED_FILES += qapi/qapi-visit-crypto.h qapi/qapi-visit-crypto.c
-GENERATED_FILES += qapi/qapi-visit-introspect.h qapi/qapi-visit-introspect.c
-GENERATED_FILES += qapi/qapi-visit-migration.h qapi/qapi-visit-migration.c
-GENERATED_FILES += qapi/qapi-visit-misc.h qapi/qapi-visit-misc.c
-GENERATED_FILES += qapi/qapi-visit-net.h qapi/qapi-visit-net.c
-GENERATED_FILES += qapi/qapi-visit-rocker.h qapi/qapi-visit-rocker.c
-GENERATED_FILES += qapi/qapi-visit-run-state.h qapi/qapi-visit-run-state.c
-GENERATED_FILES += qapi/qapi-visit-sockets.h qapi/qapi-visit-sockets.c
-GENERATED_FILES += qapi/qapi-visit-tpm.h qapi/qapi-visit-tpm.c
-GENERATED_FILES += qapi/qapi-visit-trace.h qapi/qapi-visit-trace.c
-GENERATED_FILES += qapi/qapi-visit-transaction.h qapi/qapi-visit-transaction.c
-GENERATED_FILES += qapi/qapi-visit-ui.h qapi/qapi-visit-ui.c
-GENERATED_FILES += qapi/qapi-commands.h qapi/qapi-commands.c
-GENERATED_FILES += qapi/qapi-commands-block-core.h 
qapi/qapi-commands-block-core.c
-GENERATED_FILES += qapi/qapi-commands-block.h qapi/qapi-commands-block.c
-GENERATED_FILES += qapi/qapi-commands-char.h qapi/qapi-commands-char.c
-GENERATED_FILES += qapi/qapi-commands-common.h qapi/qapi-commands-common.c
-GENERATED_FILES += qapi/qapi-commands-crypto.h qapi/qapi-commands-crypto.c
-GENERATED_FILES += qapi/qapi-commands-introsp

[Xen-devel] [PATCH] x86/pv: Fix the handing of writes to %dr7

2018-03-22 Thread Andrew Cooper
c/s 65e35549 "x86/PV: support data breakpoint extension registers"
accidentally broke the handling of writes.  The call to activate_debugregs()
doesn't write %dr7 as v->arch.debugreg[7] hasn't been updated yet, and the
break skips the intended write to %dr7.

Remove the break, causing execution to hit the write_debugreg(7, value); in
context at the bottom of the hunk, which in turn causes hardware to be updated
appropriately.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
---
 xen/arch/x86/traps.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 37210da..4bed9de 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2074,14 +2074,11 @@ long set_debugreg(struct vcpu *v, unsigned int reg, 
unsigned long value)
 /*
  * If DR7 was previously clear then we need to load all other
  * debug registers at this point as they were not restored during
- * context switch.
+ * context switch.  Updating DR7 itself happens later.
  */
 if ( (v == curr) &&
  !(v->arch.debugreg[7] & DR7_ACTIVE_MASK) )
-{
 activate_debugregs(v);
-break;
-}
 }
 if ( v == curr )
 write_debugreg(7, value);
-- 
2.1.4
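
For readers following the control flow, here is a minimal stand-alone model of the bug described in the commit message. It is not the Xen code: `struct model_vcpu`, `model_activate()`, `model_set_dr7()` and `MODEL_DR7_ACTIVE_MASK` are hypothetical stand-ins, and the assumption that the cached register copy is only updated after the switch is taken from the commit message's statement that v->arch.debugreg[7] "hasn't been updated yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: placeholder value, standing in for DR7_ACTIVE_MASK. */
#define MODEL_DR7_ACTIVE_MASK 0xffUL

/* Toy vCPU: a cached copy of %dr7 plus the "hardware" register. */
struct model_vcpu {
    unsigned long cached_dr7;   /* models v->arch.debugreg[7] */
    unsigned long hw_dr7;       /* models the real %dr7 */
};

/* Models activate_debugregs(): loads hardware from the (stale) cache. */
static void model_activate(struct model_vcpu *v)
{
    v->hw_dr7 = v->cached_dr7;
}

/*
 * Models the DR7 path of set_debugreg() for v == curr.
 * with_break == true reproduces the pre-patch control flow.
 * Returns the value that ends up in the modelled hardware register.
 */
static unsigned long model_set_dr7(struct model_vcpu *v, unsigned long value,
                                   bool with_break)
{
    assert(v != NULL);

    do {
        if ( (value & MODEL_DR7_ACTIVE_MASK) &&
             !(v->cached_dr7 & MODEL_DR7_ACTIVE_MASK) )
        {
            model_activate(v);
            if ( with_break )
                break;            /* skips the hardware write below */
        }
        v->hw_dr7 = value;        /* models write_debugreg(7, value) */
    } while ( 0 );

    v->cached_dr7 = value;        /* the cache is only updated afterwards */
    return v->hw_dr7;
}
```

With `with_break` set, a first write of an active value leaves the modelled hardware register at its old contents; without it, the write lands.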


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 121065: tolerable all pass - PUSHED

2018-03-22 Thread osstest service owner
flight 121065 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121065/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  6161d9f27fcb6c48021e6928bb240dfa39d9f1d3
baseline version:
 xen  8df3821c08d024684a6c83659d8d794b565067f9

Last test of basis   121043  2018-03-21 21:04:22 Z    0 days
Testing same since   121056  2018-03-22 10:01:22 Z    0 days    3 attempts


People who touched revisions under test:
  Andrew Cooper 
  Doug Goldstein 
  Jan Beulich 
  Joe Jin 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8df3821c08..6161d9f27f  6161d9f27fcb6c48021e6928bb240dfa39d9f1d3 -> smoke


[Xen-devel] [PATCH] docs/qemu-deprivilege: Revise and update with status and future plans

2018-03-22 Thread George Dunlap
docs/qemu-deprivilege.txt had some basic instructions for using
dm_restrict, but it was incomplete, misleading, and stale.

Update the docs in a number of ways.

Introduce a section mentioning minimum versions of Linux, Xen, and
qemu required (TBD).

Fix the discussion of qemu userid.  Mention xen-qemuuser-range-base,
and provide example shell code that actually has some hope of working
(instead of failing out after creating 900 userids).

Describe how to enable restrictions, as well as features which
probably don't or definitely don't work.

Introduce a "Technical Details" section which describes specifically
what restrictions are currently done, and also what restrictions we
are looking at doing in the future.

The idea here is that as we implement the various items for the
future, we move them from "Restrictions still to do" to "Restrictions
done".  This can also act as a design document -- a place for public
discussion of what can or should be done and how.

Signed-off-by: George Dunlap 
---
Thank you to Ross Lagerwall, whose description of what XenServer is
doing formed much of the basis for the text here.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Konrad Wilk 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Anthony Perard 
CC: Ross Lagerwall 
---
 docs/misc/qemu-deprivilege.txt | 259 -
 1 file changed, 233 insertions(+), 26 deletions(-)

diff --git a/docs/misc/qemu-deprivilege.txt b/docs/misc/qemu-deprivilege.txt
index 58b86a3908..9a5627350a 100644
--- a/docs/misc/qemu-deprivilege.txt
+++ b/docs/misc/qemu-deprivilege.txt
@@ -1,36 +1,243 @@
-For security reasons, libxl tries to pass a non-root username to QEMU as
-argument. During initialization QEMU calls setuid and setgid with the
-user ID and the group ID of the user passed as argument.
-Libxl looks for the following users in this order:
-
-1) a user named "xen-qemuuser-domid$domid",
-Where $domid is the domid of the domain being created.
-This requires the reservation of 65535 uids from xen-qemuuser-domid1
-to xen-qemuuser-domid65535. To use this mechanism, you might want to
-create a large number of users at installation time. For example:
-
-for ((i=1; i<65536; i++))
+# Introduction
+
+# Setup
+
+## Getting the right versions of software
+
+Linux 4.XX
+
+Xen 4.XX
+
+Qemu: Requires patches not yet in any release
+
+## Setting up a userid range
+
+For maximum security, libxl needs to run the devicemodel for each
+domain under a user id (UID) corresponding to its domain id.  There
+are 32752 possible domain IDs, and so libxl needs 32752 user ids set
+aside for it.
+
+The simplest and most effective way to do this is to allocate a
+contiguous block of UIDs, and create a single user named
+`xen-qemuuser-range-base` with the first UID.  For example, under Debian:
+
+adduser --no-create-home --uid 65536 --system xen-qemuuser-range-base
+
+An alternate way is to create 32752 distinct users with the name
+`xen-qemuuser-domid$domid`, doing something like the following:
+
+for ((i=1; i<=32751; i++))
 do
-adduser --no-create-home --system xen-qemuuser-domid$i
+adduser --no-create-home --system --uid $(($i-1+65536)) xen-qemuuser-domid$i
 done
 
-You might want to consider passing --group to adduser to create a new
-group for each new user.
+FIXME: Test the above script to see if it works
+
+NOTE: Most modern systems have 32-bit UIDs, and so can in theory go up
+to 2^31 (or 2^32 if uids are unsigned).  POSIX only guarantees 16-bit
+UIDs however.  UID 65535 is reserved for an invalid value, and 65534
+is normally allocated to "nobody".
+
+Another, less-secure way is to run all QEMUs as the same UID.  To do
+this, create a user named `xen-qemuuser-shared`; for example:
+
+adduser --no-create-home --system xen-qemuuser-shared
+
+## Domain config changes
+
+The core domain config change is to add the following line to the
+domain configuration:
+
+dm_restrict=1
+
+This will perform a number of restrictions, outlined below in the
+'Technical details' section.
+
+Remove non-functioning default features:
+
+vga="none"
+
+Other features expected not to work include:
+* Inserting a new cdrom while the guest is running (xl cdrom-insert)
+* migration / save / restore
+* PCI passthrough
+
+# Technical details
+
+## Restrictions done
+
+### Having qemu switch user
+
+'''Description''': As mentioned above, having qemu switch to a non-root user, one per
+domain id.
+
+'''Implementation''': The toolstack adds the following to the qemu command-line:
+
+-runas :
+
+'''Testing Status''': Not tested
+
+### Xen restrictions
+
+'''Description''': Close and restrict Xen-related file descriptors.
+Specifically, make sure that only one `privcmd` instance is open, and
+that the IOCTL_EVTCHN_RESTRICT_DOMID ioctl has been called.
+
+XXX Also, make sure that only one `xenstore` fd remains open, and that
+it's restricted.
+
+'''Implementation''': Toolstack adds the following to t
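
The UID arithmetic in the example shell loop above can be sketched in C as follows. This only mirrors the `$(($i-1+65536))` expression and the 1..32751 loop bounds from the script; whether libxl's `xen-qemuuser-range-base` mode applies the same offset is not stated in the text, so `model_domid_to_uid()` and the two bounds macros are hypothetical names used purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DOMID_MIN 1u       /* lower bound of the script's loop */
#define MODEL_DOMID_MAX 32751u   /* upper bound of the script's loop */

/*
 * Map a domid to a UID the way the example script does:
 * uid = range_base + domid - 1.
 * Returns 0 on success, -1 if the domid is out of range or the
 * addition would overflow a 32-bit UID.
 */
static int model_domid_to_uid(uint32_t range_base, uint32_t domid,
                              uint32_t *uid)
{
    assert(uid != NULL);

    if ( domid < MODEL_DOMID_MIN || domid > MODEL_DOMID_MAX )
        return -1;
    if ( range_base > UINT32_MAX - (domid - 1) )
        return -1;

    *uid = range_base + domid - 1;
    return 0;
}
```

With a base of 65536 the block of UIDs starts just above the 16-bit range mentioned in the NOTE, so it avoids both 65534 ("nobody") and 65535 (invalid).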

Re: [Xen-devel] [PATCH v3 5/7] xen/x86: disable global pages for domains with XPTI active

2018-03-22 Thread Juergen Gross
On 22/03/18 17:30, Jan Beulich wrote:
>>>> On 21.03.18 at 13:51,  wrote:
>> Instead of flushing the TLB from global pages when switching address
>> spaces with XPTI being active just disable global pages via %cr4
>> completely when a domain subject to XPTI is active. This avoids the
>> need for extra TLB flushes as loading %cr3 will remove all TLB
>> entries.
> 
> I continue to be not entirely convinced of this move. I had an
> alternative in mind: Since retaining global pages is particularly
> relevant for switches between guest user and guest kernel
> modes, what if we made a shortcut from e.g. lstar_enter through
> switch_to_kernel to restore_all_guest without ever switching to
> the full page Xen tables?

With patch 7 of this series in mind I'm not convinced the extra effort
is really making sense. Today most processors do have PCID support so
for that old hardware I don't think we need to make the handling even
more complex.

> 
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -508,18 +508,23 @@ void make_cr3(struct vcpu *v, mfn_t mfn)
>>  void write_ptbase(struct vcpu *v)
>>  {
>>  struct cpu_info *cpu_info = get_cpu_info();
>> +unsigned long new_cr4;
>> +
>> +new_cr4 = (is_pv_vcpu(v) && !is_idle_vcpu(v))
>> +  ? pv_guest_cr4_to_real_cr4(v) : mmu_cr4_features;
> 
> I'm not overly happy to see any new uses of mmu_cr4_features.
> This should really only be used for priming certain values imo,
> which isn't the case here (otoh pv_guest_cr4_to_real_cr4() does
> so too, and perhaps better wouldn't). Hence I wonder whether
> this shouldn't be read_cr4() | X86_CR4_PGE, not the least
> because we've just got rid of the blanket reversion to
> mmu_cr4_features in VMX code.

I do understand that using mmu_cr4_features isn't the best way to set
cr4. But I think it is a good idea to have a default value which should
normally be used instead of only switching various bits on and off.

In case cr4 is loaded with a strange value in some corner case that
value might be used from then on instead of being repaired by loading a
dedicated value at certain points in time, e.g. when doing a context
switch.

So maybe we should introduce cr4_default which is derived from
mmu_cr4_features? mmu_cr4_features would contain all bits which are
allowed on the current processor with the current command line options,
while cr4_default would be a subset of mmu_cr4_features.


Juergen
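
To make the cr4_default idea concrete, here is a rough stand-alone sketch contrasting "load a dedicated value" with "toggle bits on whatever is currently loaded". All `model_*` names are hypothetical and the second helper is only a loose model of a read-modify-write approach; the one hardware fact relied on is that CR4.PGE is bit 7.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CR4_PGE (1UL << 7)   /* CR4.PGE is bit 7 on x86 */

/* Hypothetical: derive the default from the set of permitted bits. */
static unsigned long model_cr4_default(unsigned long mmu_features,
                                       unsigned long wanted)
{
    unsigned long def = mmu_features & wanted;

    assert((def & ~mmu_features) == 0);   /* default is always a subset */
    return def;
}

/* Option A: load a dedicated value; the previous cr4 contents are ignored. */
static unsigned long model_new_cr4_dedicated(unsigned long cr4_default,
                                             bool xpti_active)
{
    return xpti_active ? (cr4_default & ~MODEL_CR4_PGE) : cr4_default;
}

/* Option B: toggle one bit on top of the current register contents. */
static unsigned long model_new_cr4_toggle(unsigned long current_cr4,
                                          bool xpti_active)
{
    return xpti_active ? (current_cr4 & ~MODEL_CR4_PGE)
                       : (current_cr4 | MODEL_CR4_PGE);
}
```

In this model a stray bit in the current register value survives option B indefinitely, while option A drops it at the next context switch, which is the "repair" effect described above.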


Re: [Xen-devel] Deprecated option -usbdevice in QEMU

2018-03-22 Thread Wei Liu
On Wed, Mar 14, 2018 at 02:29:17PM +, Anthony PERARD wrote:
> Hi,
> 
> In an xl guest config, we have the "usbdevice" option. It is just
> passthrough to QEMU "-usbdevice" without parsing. The QEMU option is now
> deprecated. v2.11 (to be released with Xen 4.11) is the last version of
> QEMU to have the option.
> 
> Unfortunately, our documentation relies on QEMU's documentation, so there
> would be a lot to parse if we want to keep the option.
> 
> I propose that we also deprecate the "usbdevice" option and find a
> suitable alternative (or we have to parse usbdevice in libxl).

If we want to be backward compatible we need to parse the string anyway,
regardless of whatever new thing we recommend, right?

> 
> "usbdev" seems to be a good fit for that, and should already handle
> "usbdevice='host:bus.addr', but would be written:
> "usbdev=['type=hostdev,hostbus=bus,hostaddr=addr']"
> 
> The other uses of "usbdevice" documented in the man page are:
> - tablet
> - host:vendor_id:product_id
> 
> Other usage of "usbdevice" documented in the QEMU documentation:
> - mouse
> - disk:[format=format]:file
> - serial:[vendorid=vendor_id][,productid=product_id]:dev
> - braille
> - net:options
> 
> 
> From QEMU's perspective, those options can be replaced by the following
> cmdline options or QMP command equivalents (I haven't checked everything):
> * -device usb-tablet
> * -device usb-host,vendorid=vendor,productid=product
> * -device usb-mouse
> * -drive if=none,id=drive_id,file=file  -device usb-storage,drive=drive_id
> * -chardev x,id=id -device usb-serial,chardev=id
> * -device usb-braille
> * -netdev x,id=id -device usb-net,netdev=id
> 
> 
> How could the original "usbdevice" be translated into a USBDEV_SPEC for
> "usbdev"? Maybe, e.g.:
> 'type=tablet'
> 'type=storage,file=file,format=format'
> 
> I don't know if anybody would be using "usbdevice='net:...'" or
> "usbdevice='serial:...'".
> 
> What do you think?

I would be fine with restricting usage to a few known devices and reject
the rest. There is already a kitchen sink device_model_args that we can
turn to.

Wei.


> 
> Thanks,
> 
> -- 
> Anthony PERARD
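
As an illustration of "restrict usage to a few known devices and reject the rest", a translation helper could look roughly like the sketch below. It is not libxl code: `model_translate_usbdevice()` is a hypothetical name, the set of accepted strings is just the simple cases from the lists above, and the `-device` spellings are copied from Anthony's mail without checking them against QEMU (the vendor/product IDs are passed through verbatim, with no claim about the number format QEMU expects).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Translate a legacy "usbdevice" string into a QEMU "-device" argument.
 * Returns 0 and fills 'out' on success, -1 for anything not recognised
 * (callers would then point the user at device_model_args).
 */
static int model_translate_usbdevice(const char *legacy, char *out,
                                     size_t outlen)
{
    static const char *const simple[] = { "tablet", "mouse", "braille" };
    char vendor[16], product[16];
    int consumed = 0, len;
    size_t i;

    assert(legacy != NULL && out != NULL && outlen > 0);

    /* Devices that take no parameters: usbdevice=X becomes usb-X. */
    for ( i = 0; i < sizeof(simple) / sizeof(simple[0]); i++ )
    {
        if ( strcmp(legacy, simple[i]) == 0 )
        {
            len = snprintf(out, outlen, "-device usb-%s", simple[i]);
            return (len >= 0 && (size_t)len < outlen) ? 0 : -1;
        }
    }

    /* host:vendor_id:product_id; the whole string must be consumed. */
    if ( sscanf(legacy, "host:%15[0-9a-fA-F]:%15[0-9a-fA-F]%n",
                vendor, product, &consumed) == 2 &&
         consumed > 0 && legacy[consumed] == '\0' )
    {
        len = snprintf(out, outlen,
                       "-device usb-host,vendorid=%s,productid=%s",
                       vendor, product);
        return (len >= 0 && (size_t)len < outlen) ? 0 : -1;
    }

    return -1;   /* net:, serial:, disk:, host:bus.addr, ... are rejected */
}
```

The host:bus.addr form is deliberately left out here, since the mail above says "usbdev" should already handle it.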


[Xen-devel] [qemu-mainline test] 120994: regressions - FAIL

2018-03-22 Thread osstest service owner
flight 120994 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120994/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 14 xen-boot/l1   fail REGR. vs. 120095
 test-amd64-amd64-qemuu-nested-amd 14 xen-boot/l1 fail REGR. vs. 120095

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check    fail  like 120095
 test-armhf-armhf-libvirt     14 saverestore-support-check    fail  like 120095
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop            fail like 120095
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop             fail like 120095
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check    fail  like 120095
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop            fail like 120095
 test-amd64-i386-xl-pvshim    12 guest-start                  fail   never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start               fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start                 fail  never pass
 test-amd64-i386-libvirt      13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check    fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check      fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check       fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check   fail  never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check    fail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop             fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install        fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install       fail never pass

version targeted for testing:
 qemuu                036793aebfc1dd0ce124fa278d7668d89b5da936
baseline version:
 qemuu                6697439794f72b3501ee16bb95d16854f9981421

Last test of basis   120095  2018-02-28 13:46:33 Z   22 days
Failing since        120146  2018-03-02 10:10:57 Z   20 days   11 attempts
Testing same since   120994  2018-03-20 14:08:16 Z    2 days    1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Bennée 
  Alex Bennée 
  Alex Williamson 
  Alexey Kardashevskiy 
  Alistair Francis 
  Alistair Francis 
  Andrey Smirnov 
  Anton Nefedov 
  BALATON Zoltan 
  Bastian Koppelmann 
  Bastian Koppelmann  (tricore)
  Bill Paul 
  Brijesh Singh 
  Bruce Rogers 
  Chao Peng 
  Christian Borntraeger 
  Claudio Imbrenda 
  Collin L. Walling 
  Corey Minyard 
  Cor

Re: [Xen-devel] [PATCH v3 3/7] xen/x86: support per-domain flag for xpti

2018-03-22 Thread Juergen Gross
On 22/03/18 16:44, Jan Beulich wrote:
>>>> On 22.03.18 at 16:29,  wrote:
>> On 22/03/18 16:26, Jan Beulich wrote:
>>>>>> On 21.03.18 at 13:51,  wrote:
>>>> +void xpti_domain_init(struct domain *d)
>>>> +{
>>>> +if ( !is_pv_domain(d) || is_pv_32bit_domain(d) )
>>>> +return;
>>>
>>> As you rely on the zero-initialization of the field here, ...
>>>
>>>> +switch ( opt_xpti )
>>>> +{
>>>> +case XPTI_OFF:
>>>> +d->arch.pv_domain.xpti = false;
>>>
>>> ... this could go away as well.
>>
>> I wanted to make the switch statement complete. No problem to drop
>> setting of xpti here if you like that better.
> 
> FAOD I didn't mean dropping the entire case block.

Of course not. This would just be wrong with the current default block.


Juergen


Re: [Xen-devel] [PATCH v3 4/7] xen/x86: use invpcid for flushing the TLB

2018-03-22 Thread Juergen Gross
On 22/03/18 16:35, Jan Beulich wrote:
>>>> On 21.03.18 at 13:51,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1380,6 +1380,14 @@ Because responsibility for APIC setup is shared 
>> between Xen and the
>>  domain 0 kernel this option is automatically propagated to the domain
>>  0 command line.
>>  
>> +### noinvpcid (x86)
>> +> `= `
>> +
>> +Disable using the INVPCID instruction for flushing TLB entries.
>> +This should only be used in case of known issues on the current platform
>> +with that instruction. Disabling INVPCID will normally result in a slightly
>> +degraded performance.
> 
> At the first glance this looks as if it wants to be a cpuid=
> sub-option. However, that would disable use by both Xen and
> (HVM) guests. Andrew, what are your plans here as to
> distinguishing the "Xen uses a feature" from the "disable use of
> a feature altogether"?
> 
> If we stay with a separate option, then please make this a
> normal boolean one (i.e. drop the "no" prefix), as "no-noinvpcid"
> is rather ugly.

Okay.

> 
>> @@ -457,7 +472,6 @@ static void generic_set_all(void)
>>  set_bit(count, &smp_changes_mask);
>>  mask >>= 1;
>>  }
>> -
>>  }
>>  
>>  static void generic_set_mtrr(unsigned int reg, unsigned long base,
> 
> I don't mind this line being dropped, but in general please avoid
> stray changes which aren't assimilated into changes you do anyway.

The main reason I did drop this line was the trailing tab. I just took
the risk of someone complaining.


Juergen



Re: [Xen-devel] [PATCH v3 5/7] xen/x86: disable global pages for domains with XPTI active

2018-03-22 Thread Jan Beulich
>>> On 21.03.18 at 13:51,  wrote:
> Instead of flushing the TLB from global pages when switching address
> spaces with XPTI being active just disable global pages via %cr4
> completely when a domain subject to XPTI is active. This avoids the
> need for extra TLB flushes as loading %cr3 will remove all TLB
> entries.

I continue to be not entirely convinced of this move. I had an
alternative in mind: Since retaining global pages is particularly
relevant for switches between guest user and guest kernel
modes, what if we made a shortcut from e.g. lstar_enter through
switch_to_kernel to restore_all_guest without ever switching to
the full page Xen tables?

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -508,18 +508,23 @@ void make_cr3(struct vcpu *v, mfn_t mfn)
>  void write_ptbase(struct vcpu *v)
>  {
>  struct cpu_info *cpu_info = get_cpu_info();
> +unsigned long new_cr4;
> +
> +new_cr4 = (is_pv_vcpu(v) && !is_idle_vcpu(v))
> +  ? pv_guest_cr4_to_real_cr4(v) : mmu_cr4_features;

I'm not overly happy to see any new uses of mmu_cr4_features.
This should really only be used for priming certain values imo,
which isn't the case here (otoh pv_guest_cr4_to_real_cr4() does
so too, and perhaps better wouldn't). Hence I wonder whether
this shouldn't be read_cr4() | X86_CR4_PGE, not the least
because we've just got rid of the blanket reversion to
mmu_cr4_features in VMX code.

Jan



Re: [Xen-devel] [PATCH v3a 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Andre Przywara
Hi,

On 22/03/18 14:06, Julien Grall wrote:
> Hi Andre,
> 
> On 03/22/2018 11:56 AM, Andre Przywara wrote:
>> +    /* The locking order forces us to drop and re-take the locks
>> here. */
>> +    if ( irq->hw )
>> +    {
>> +    spin_unlock(&irq->irq_lock);
>> +
>> +    desc = irq_to_desc(irq->hwintid);

Argh, those two lines should be swapped, I guess.
I guess that doesn't really matter with our current "stick with that
hardware mapped IRQ forever" approach, but should be more future proof
anyway and is more correct.

Cheers,
Andre.

>> +    spin_lock(&desc->lock);
>> +    spin_lock(&irq->irq_lock);
>> +
>> +    /* This h/w IRQ should still be assigned to the virtual
>> IRQ. */
>> +    ASSERT(irq->hw && desc->irq == irq->hwintid);
>> +
>> +    have_desc_lock = true;
>> +    }
> 
> I am a bit concerned of this dance in fold_lr_state(). This looks
> awfully complex but I don't have better solution here. I will have a
> think during the night.
> 
> However, this is not going to solve the race condition I mentioned
> between clearing _IRQ_INPROGRESS here and setting _IRQ_INPROGRESS in
> do_IRQ. This is because you don't know the order they are going to be
> executed.
> 
> I wanted to make sure you didn't intend to solve that one. Am I correct?
> 
> Cheers,
> 
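
The drop-and-retake dance, with the irq_to_desc() lookup moved ahead of the unlock as suggested above, can be modelled with plain pthread mutexes like this. It is a sketch only: the `model_*` types and the four-entry descriptor table are hypothetical, and it does nothing about the _IRQ_INPROGRESS race Julien mentions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_desc {
    pthread_mutex_t lock;
    unsigned int irq;
};

struct model_virq {
    pthread_mutex_t irq_lock;
    bool hw;
    unsigned int hwintid;
};

/* Hypothetical descriptor table standing in for irq_to_desc(). */
static struct model_desc model_descs[4] = {
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 1 },
    { PTHREAD_MUTEX_INITIALIZER, 2 },
    { PTHREAD_MUTEX_INITIALIZER, 3 },
};

static struct model_desc *model_irq_to_desc(unsigned int hwintid)
{
    assert(hwintid < 4);
    return &model_descs[hwintid];
}

/*
 * Called with irq->irq_lock held.  Returns with irq->irq_lock held and,
 * for a hardware-mapped IRQ, desc->lock held as well (desc is returned);
 * returns NULL if there is no descriptor lock to take.
 */
static struct model_desc *model_take_desc_lock(struct model_virq *irq)
{
    struct model_desc *desc;

    if ( !irq->hw )
        return NULL;

    /* Read hwintid while irq_lock still protects it, then drop the lock. */
    desc = model_irq_to_desc(irq->hwintid);
    pthread_mutex_unlock(&irq->irq_lock);

    /* Locking order: descriptor lock first, then the virtual IRQ lock. */
    pthread_mutex_lock(&desc->lock);
    pthread_mutex_lock(&irq->irq_lock);

    /* The h/w IRQ should still be assigned to the virtual IRQ. */
    assert(irq->hw && desc->irq == irq->hwintid);
    return desc;
}
```

The caller is expected to release both locks afterwards when a non-NULL descriptor comes back.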


Re: [Xen-devel] [PATCH v3 3/7] xen/x86: support per-domain flag for xpti

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 16:29,  wrote:
> On 22/03/18 16:26, Jan Beulich wrote:
>>>>> On 21.03.18 at 13:51,  wrote:
>>> +void xpti_domain_init(struct domain *d)
>>> +{
>>> +if ( !is_pv_domain(d) || is_pv_32bit_domain(d) )
>>> +return;
>> 
>> As you rely on the zero-initialization of the field here, ...
>> 
>>> +switch ( opt_xpti )
>>> +{
>>> +case XPTI_OFF:
>>> +d->arch.pv_domain.xpti = false;
>> 
>> ... this could go away as well.
> 
> I wanted to make the switch statement complete. No problem to drop
> setting of xpti here if you like that better.

FAOD I didn't mean dropping the entire case block.

Jan



Re: [Xen-devel] [PATCH v3 1/7] x86/xpti: avoid copying L4 page table contents when possible

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 16:26,  wrote:
> On 22/03/18 15:31, Jan Beulich wrote:
>>>>> On 21.03.18 at 13:51,  wrote:
>>>  void write_ptbase(struct vcpu *v)
>>>  {
>>> +if ( this_cpu(root_pgt) && is_pv_vcpu(v) && !is_pv_32bit_vcpu(v) )
>>> +get_cpu_info()->root_pgt_changed = true;
>>>  write_cr3(v->arch.cr3);
>> 
>> When you come here from e.g. __sync_local_execstate(), you
>> don't really need to set the flag. Of course you'll come here again
>> before the next 64-bit PV vCPU will make it to restore_all_guest,
>> so by the time we make it there the flag will be set anyway.
>> However, if you already use such a subtlety, then there's also
>> no point excluding 32-bit vCPU-s here (nor in make_cr3()), as
>> those will never make it to restore_all_guest. Same then for
>> excluding HVM vCPU-s. And I then wonder whether (here or
>> more likely in a later patch) the root_pgt check couldn't go away
>> as well.
> 
> I'm not sure this is worth it. Patch 3 will re-introduce a conditional
> here and it will look rather different (e.g. without the root_pgt
> check). So micro-optimizing this patch barely makes any sense.

Yes, I've seen that once I made it there. Perhaps worth dropping
the two is_*() checks here, but not worry about the root_pgt one.

Jan



Re: [Xen-devel] [PATCH v3 4/7] xen/x86: use invpcid for flushing the TLB

2018-03-22 Thread Jan Beulich
>>> On 21.03.18 at 13:51,  wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1380,6 +1380,14 @@ Because responsibility for APIC setup is shared 
> between Xen and the
>  domain 0 kernel this option is automatically propagated to the domain
>  0 command line.
>  
> +### noinvpcid (x86)
> +> `= `
> +
> +Disable using the INVPCID instruction for flushing TLB entries.
> +This should only be used in case of known issues on the current platform
> +with that instruction. Disabling INVPCID will normally result in a slightly
> +degraded performance.

At the first glance this looks as if it wants to be a cpuid=
sub-option. However, that would disable use by both Xen and
(HVM) guests. Andrew, what are your plans here as to
distinguishing the "Xen uses a feature" from the "disable use of
a feature altogether"?

If we stay with a separate option, then please make this a
normal boolean one (i.e. drop the "no" prefix), as "no-noinvpcid"
is rather ugly.

> @@ -457,7 +472,6 @@ static void generic_set_all(void)
>   set_bit(count, &smp_changes_mask);
>   mask >>= 1;
>   }
> - 
>  }
>  
>  static void generic_set_mtrr(unsigned int reg, unsigned long base,

I don't mind this line being dropped, but in general please avoid
stray changes which aren't assimilated into changes you do anyway.

Jan
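
To spell out why the "no" prefix in the option name is awkward, here is a small model of a boolean option parser that accepts a "no-" prefix for negation. It is not Xen's parser: `model_parse_boolean()` is a hypothetical helper that handles only the prefix form and ignores any "=value" syntax.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Parse "name" (true) or "no-name" (false) for a boolean option.
 * Returns 0 and sets *val on a match, -1 otherwise.
 */
static int model_parse_boolean(const char *arg, const char *name, bool *val)
{
    bool enable = true;

    assert(arg != NULL && name != NULL && val != NULL);

    if ( strncmp(arg, "no-", 3) == 0 )
    {
        enable = false;
        arg += 3;
    }
    if ( strcmp(arg, name) != 0 )
        return -1;

    *val = enable;
    return 0;
}
```

With the option named "invpcid", users write "invpcid" or "no-invpcid"; with it named "noinvpcid", the negated form becomes the double negative "no-noinvpcid".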



Re: [Xen-devel] possible I/O emulation state machine issue

2018-03-22 Thread Andrew Cooper
On 22/03/18 15:12, Jan Beulich wrote:
> Paul,
>
> our PV driver person has found a reproducible crash with ws2k8,
> triggered by one of the WHQL tests. The guest gets crashed because
> the re-issue check of an ioreq close to the top of hvmemul_do_io()
> fails. I've handed him a first debugging patch, output of which
> suggests that we're dealing with a completely new request, which
> in turn would mean that we've run into stale STATE_IORESP_READY
> state:
>
> (XEN) d2v3: t=0/1 a=3c4/fed000f0 s=2/4 c=1/1 d=0/1 f=0/0 p=0/0 v=100/831873f27a30
> (XEN) [ Xen-4.10.0_15-0  x86_64  debug=n   Tainted:  C   ]

Irrespective of the issue at hand, can testing be tried with a debug
build to see if any of the assertions are hit?

~Andrew


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 12:44:02 +
Roger Pau Monné  wrote:

>On Thu, Mar 22, 2018 at 10:29:22PM +1000, Alexey G wrote:
>> On Thu, 22 Mar 2018 09:57:16 +
>> Roger Pau Monné  wrote:
>> [...]  
>> >> Yes, and it is still needed as we have two distinct (and not
>> >> equal) interfaces to PCI conf space. Apart from 0..FFh range
>> >> overlapping they can be considered very different interfaces. And
>> >> whether it is a real system or emulated -- we can use either one
>> >> of these two interfaces or both.
>> >
>> >The legacy PCI config space accesses and the MCFG config space
>> >access are just different methods of accessing the PCI
>> >configuration space, but the data _must_ be exactly the same. I
>> >don't see how a device would care about where the access to the
>> >config space originated.  
>> 
>> If they were different methods of accessing the same thing, they
>> could've been used interchangeably. When we've got a PCI conf ioreq
>> which has offset>100h we know we cannot just pass it to emulated
>> CF8/CFC but have to emulate this specifically.  
>
>This is already not the best approach to dispatch PCI config space
>access in QEMU. I think the interface in QEMU should be:
>
>pci_conf_space_{read/write}(sbdf, register, size , data)
>
>And this would go directly into the device. But I assume this involves
>a non-trivial amount of work to be implemented. Hence xen-hvm.c usage
>of the IO port access replay.

Yes, it's a helpful shortcut. The only drawback is that we can't use
it for PCI extended config accesses; for those, a memory address within
the emulated MMCONFIG is much preferable in the current architecture.

>> >OK, so you don't want to reconstruct the access, fine.
>> >
>> >Then just inject it using pcie_mmcfg_data_{read/write} or some
>> >similar wrapper. My suggestion was just to try to use the easier
>> >way to get this injected into QEMU.  
>> 
>> QEMU knows its position, the problem is that xen-hvm.c (ioreq
>> processor) is rather isolated from MMCONFIG emulation.
>> 
>> If you check the pcie_mmcfg_data_read/write MMCONFIG handlers in
>> QEMU, you can see this:
>> 
>> static uint64_t pcie_mmcfg_data_read(void *opaque, <...>
>> {
>> PCIExpressHost *e = opaque;
>> ...
>> 
>> We know this 'opaque' when we do MMIO-style MMCONFIG handling as
>> pcie_mmcfg_data_read/write are actual handlers.
>> 
>> But xen-hvm.c needs to gain access to PCIExpressHost out of nowhere,
>> which is possible but considered a hack by QEMU. We can also insert
>> some code to MMCONFIG emulation which will store info we need to some
>> global variables to be used across wildly different and unrelated
>> modules. It will work, but anyone who sees it will have bad thoughts
>> on his mind.  
>
>Since you need to notify Xen the MCFG area address, why not just store
>the MCFG address while doing this operation? You could do this with a
>helper in xen-hvm.c, and keep the variable locally to that file.
>
>In any case, this is a QEMU implementation detail. IMO the IOREQ
>interface is clear and should not be bended like this just because
>'this is easier to implement in QEMU'.

That's a bit of a hack too, but it might work. Anyway, it's extra work we can
avoid if we simply skip PCI conf translation for MMCONFIG MMIO ioreqs
targeting QEMU. I completely agree that we need to translate these
accesses into PCI conf ioreqs for device DMs, but for QEMU it is an
unwanted and redundant step.

AFAIK (Paul might correct me here) the multiple device emulators
feature already makes use of the primary (aka default) DM and
device-specific DM distinction, so in theory it should be possible to
provide that translation only for device-specific DMs (which function
apart from the emulated machine and cannot use its facilities).
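
For reference, the two access methods discussed in this thread decode to the same bus/device/function/register tuple, which is what an interface along the lines of Roger's pci_conf_space_{read/write}(sbdf, register, size, data) would take. The sketch below uses the standard CF8 address layout and the standard ECAM offset layout; `struct model_cfg_addr` and both decode helpers are hypothetical names, not QEMU or Xen functions.

```c
#include <assert.h>
#include <stdint.h>

struct model_cfg_addr {
    uint8_t bus, dev, fn;
    uint16_t reg;
};

/*
 * Legacy method: the dword written to port 0xCF8 selects the register,
 * and the byte offset within ports 0xCFC..0xCFF selects the byte lane.
 * Only registers 0x00..0xFF are reachable this way.
 */
static struct model_cfg_addr model_decode_cf8(uint32_t cf8,
                                              unsigned int cfc_offset)
{
    struct model_cfg_addr a;

    assert(cfc_offset < 4);
    a.bus = (cf8 >> 16) & 0xff;
    a.dev = (cf8 >> 11) & 0x1f;
    a.fn  = (cf8 >> 8) & 0x7;
    a.reg = (uint16_t)((cf8 & 0xfc) + cfc_offset);
    return a;
}

/*
 * MMCONFIG (ECAM) method: the offset from the MMCONFIG base encodes
 * bus[27:20], device[19:15], function[14:12] and a 12-bit register,
 * so the extended range 0x100..0xFFF is reachable too.
 */
static struct model_cfg_addr model_decode_ecam(uint64_t offset)
{
    struct model_cfg_addr a;

    a.bus = (offset >> 20) & 0xff;
    a.dev = (offset >> 15) & 0x1f;
    a.fn  = (offset >> 12) & 0x7;
    a.reg = offset & 0xfff;
    return a;
}
```

Both decoders agree for registers below 0x100; only the ECAM form can express an offset of 0x100 or above, which is the case the CF8/CFC replay cannot cover.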


[Xen-devel] [xen-unstable-smoke test] 121061: trouble: blocked/broken/pass

2018-03-22 Thread osstest service owner
flight 121061 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121061/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf  broken
 build-armhf                   4 host-install(4)        broken REGR. vs. 121043

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  6161d9f27fcb6c48021e6928bb240dfa39d9f1d3
baseline version:
 xen  8df3821c08d024684a6c83659d8d794b565067f9

Last test of basis   121043  2018-03-21 21:04:22 Z    0 days
Testing same since   121056  2018-03-22 10:01:22 Z    0 days    2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Doug Goldstein 
  Jan Beulich 
  Joe Jin 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  broken  
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  blocked 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

broken-job build-armhf broken
broken-step build-armhf host-install(4)

Not pushing.

(No revision log; it would be 318 lines long.)

Re: [Xen-devel] [PATCH v3 3/7] xen/x86: support per-domain flag for xpti

2018-03-22 Thread Juergen Gross
On 22/03/18 16:26, Jan Beulich wrote:
>>>> On 21.03.18 at 13:51,  wrote:
>> +void xpti_domain_init(struct domain *d)
>> +{
>> +if ( !is_pv_domain(d) || is_pv_32bit_domain(d) )
>> +return;
> 
> As you rely on the zero-initialization of the field here, ...
> 
>> +switch ( opt_xpti )
>> +{
>> +case XPTI_OFF:
>> +d->arch.pv_domain.xpti = false;
> 
> ... this could go away as well.

I wanted to make the switch statement complete. No problem to drop
setting of xpti here if you like that better.

> 
>> @@ -1050,8 +1050,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>>  panic("Error %d setting up PV root page table\n", rc);
>>  if ( per_cpu(root_pgt, 0) )
>>  {
>> -get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
>> -
>> +get_cpu_info()->pv_cr3 = 0;
>>  /*
>>   * All entry points which may need to switch page tables have to 
>> start
>>   * with interrupts off. Re-write what pv_trap_init() has put there.
> 
> Please don't drop the blank line.

Okay.

> 
>> @@ -36,7 +38,8 @@ static inline void pv_vcpu_destroy(struct vcpu *v) {}
>>  static inline int pv_vcpu_initialise(struct vcpu *v) { return -EOPNOTSUPP; }
>>  static inline void pv_domain_destroy(struct domain *d) {}
>>  static inline int pv_domain_initialise(struct domain *d) { return 
>> -EOPNOTSUPP; }
>> -
>> +static inline void xpti_init(void) {}
>> +static inline void xpti_domain_init(struct domain *d) {}
>>  #endif  /* CONFIG_PV */
> 
> Same here. With that
> Reviewed-by: Jan Beulich 

Thanks,


Juergen


Re: [Xen-devel] [PATCH v3 1/7] x86/xpti: avoid copying L4 page table contents when possible

2018-03-22 Thread Juergen Gross
On 22/03/18 15:31, Jan Beulich wrote:
>>>> On 21.03.18 at 13:51,  wrote:
>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -158,6 +158,9 @@ unsigned int flush_area_local(const void *va, unsigned 
>> int flags)
>>  }
>>  }
>>  
>> +if ( flags & FLUSH_ROOT_PGTBL )
>> +get_cpu_info()->root_pgt_changed = true;
>> +
>>  local_irq_restore(irqfl);
>>  
>>  return flags;
> 
> Does this really need to sit inside the interrupts disabled section?

Hmm, no, I don't think so. I'll move it below local_irq_restore().

> Thinking about it I even wonder whether the cache flush part needs
> to be. Even for the INVLPG portion of the TLB flush part I can't
> seem to see a need for IRQs to be off. I think it's really just the
> pre_flush() / post_flush() pair which needs to be inside such a
> section. I'll prepare a patch (for after 4.11). I think some of the
> changes later in your series will actually further ease this.
> 
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -499,10 +499,15 @@ void free_shared_domheap_page(struct page_info *page)
>>  void make_cr3(struct vcpu *v, mfn_t mfn)
>>  {
>>  v->arch.cr3 = mfn_x(mfn) << PAGE_SHIFT;
>> +if ( v == current && this_cpu(root_pgt) && is_pv_vcpu(v) &&
>> + !is_pv_32bit_vcpu(v) )
>> +get_cpu_info()->root_pgt_changed = true;
>>  }
> 
> As this doesn't actually update CR3, setting the flag shouldn't
> generally be necessary if the caller then invokes write_ptbase().
> Isn't setting the flag here needed solely in the case of
> _toggle_guest_pt() being up the call tree? In which case it would
> perhaps better be set there (and in turn some or even all of the
> conditional around it could be dropped)?

Yes, you are right.

> 
>>  void write_ptbase(struct vcpu *v)
>>  {
>> +if ( this_cpu(root_pgt) && is_pv_vcpu(v) && !is_pv_32bit_vcpu(v) )
>> +get_cpu_info()->root_pgt_changed = true;
>>  write_cr3(v->arch.cr3);
> 
> When you come here from e.g. __sync_local_execstate(), you
> don't really need to set the flag. Of course you'll come here again
> before the next 64-bit PV vCPU will make it to restore_all_guest,
> so by the time we make it there the flag will be set anyway.
> However, if you already use such a subtlety, then there's also
> no point excluding 32-bit vCPU-s here (nor in make_cr3()), as
> those will never make it to restore_all_guest. Same then for
> excluding HVM vCPU-s. And I then wonder whether (here or
> more likely in a later patch) the root_pgt check couldn't go away
> as well.

I'm not sure this is worth it. Patch 3 will re-introduce a conditional
here and it will look rather different (e.g. without the root_pgt
check). So micro-optimizing this patch barely makes any sense.

> 
>> @@ -3698,18 +3703,29 @@ long do_mmu_update(
>>  break;
>>  rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>>cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>> -/*
>> - * No need to sync if all uses of the page can be 
>> accounted
>> - * to the page lock we hold, its pinned status, and 
>> uses on
>> - * this (v)CPU.
>> - */
>> -if ( !rc && !cpu_has_no_xpti &&
>> - ((page->u.inuse.type_info & PGT_count_mask) >
>> -  (1 + !!(page->u.inuse.type_info & PGT_pinned) +
>> -   (pagetable_get_pfn(curr->arch.guest_table) == 
>> mfn) 
>> +
>> -   (pagetable_get_pfn(curr->arch.guest_table_user) 
>> ==
>> -mfn))) )
>> -sync_guest = true;
>> +if ( !rc && !cpu_has_no_xpti )
>> +{
>> +bool local_in_use = false;
>> +
>> +if ( (pagetable_get_pfn(curr->arch.guest_table) ==
>> +  mfn) ||
>> + 
>> (pagetable_get_pfn(curr->arch.guest_table_user) ==
>> +  mfn) )
>> +{
>> +local_in_use = true;
>> +get_cpu_info()->root_pgt_changed = true;
>> +}
> 
> The conditional causes root_pgt_changed to get set even in cases
> where what CR3 points to doesn't actually change (if it's the user
> page tables that get modified). I think you want to check
> curr->arch.cr3 here, or only curr->arch.guest_table (as user mode
> can't invoke hypercalls).

I'll go with curr->arch.guest_table.

> 
>> +/*
>> + * No need to sync if all uses of the page can be
>> + * accounted to the page lock we hold, its pinned
>> + * status, and uses on this (v)CPU.
>> + */
>> +if ( (page->u.inuse

Re: [Xen-devel] [PATCH v3 3/7] xen/x86: support per-domain flag for xpti

2018-03-22 Thread Jan Beulich
>>> On 21.03.18 at 13:51,  wrote:
> +void xpti_domain_init(struct domain *d)
> +{
> +if ( !is_pv_domain(d) || is_pv_32bit_domain(d) )
> +return;

As you rely on the zero-initialization of the field here, ...

> +switch ( opt_xpti )
> +{
> +case XPTI_OFF:
> +d->arch.pv_domain.xpti = false;

... this could go away as well.

> @@ -1050,8 +1050,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  panic("Error %d setting up PV root page table\n", rc);
>  if ( per_cpu(root_pgt, 0) )
>  {
> -get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
> -
> +get_cpu_info()->pv_cr3 = 0;
>  /*
>   * All entry points which may need to switch page tables have to 
> start
>   * with interrupts off. Re-write what pv_trap_init() has put there.

Please don't drop the blank line.

> @@ -36,7 +38,8 @@ static inline void pv_vcpu_destroy(struct vcpu *v) {}
>  static inline int pv_vcpu_initialise(struct vcpu *v) { return -EOPNOTSUPP; }
>  static inline void pv_domain_destroy(struct domain *d) {}
>  static inline int pv_domain_initialise(struct domain *d) { return 
> -EOPNOTSUPP; }
> -
> +static inline void xpti_init(void) {}
> +static inline void xpti_domain_init(struct domain *d) {}
>  #endif   /* CONFIG_PV */

Same here. With that
Reviewed-by: Jan Beulich 

Jan


Re: [Xen-devel] [PATCH v3a 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Andre Przywara
Hi,

On 22/03/18 14:06, Julien Grall wrote:
> Hi Andre,
> 
> On 03/22/2018 11:56 AM, Andre Przywara wrote:
>> +    /* The locking order forces us to drop and re-take the locks
>> here. */
>> +    if ( irq->hw )
>> +    {
>> +    spin_unlock(&irq->irq_lock);
>> +
>> +    desc = irq_to_desc(irq->hwintid);
>> +    spin_lock(&desc->lock);
>> +    spin_lock(&irq->irq_lock);
>> +
>> +    /* This h/w IRQ should still be assigned to the virtual
>> IRQ. */
>> +    ASSERT(irq->hw && desc->irq == irq->hwintid);
>> +
>> +    have_desc_lock = true;
>> +    }
> 
> I am a bit concerned about this dance in fold_lr_state(). This looks
> awfully complex but I don't have a better solution here.

I agree.

> I will have a think during the night.
> 
> However, this is not going to solve the race condition I mentioned
> between clearing _IRQ_INPROGRESS here and setting _IRQ_INPROGRESS in
> do_IRQ. This is because you don't know the order they are going to be
> executed.
> 
> I wanted to make sure you didn't intend to solve that one. Am I correct?

This is right, this is orthogonal and not addressed by this patch. I
have a hunch we need to solve this in irq.c instead.

Cheers,
Andre.
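
For readers of the archive, the drop-and-re-take pattern discussed above can be sketched in isolation. This is only an illustration under stated assumptions: the lock order (descriptor lock first, then the per-IRQ lock) is taken from the quoted hunk, while `toy_lock_t`, `toy_desc`, `toy_virq` and `retake_in_order()` are hypothetical stand-ins, not Xen types or functions. It does not address the _IRQ_INPROGRESS race mentioned in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy spinlock for the sketch; not Xen's spinlock implementation. */
typedef struct { atomic_flag held; } toy_lock_t;

static void toy_lock(toy_lock_t *l)
{
    while ( atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire) )
        ; /* spin until the flag was previously clear */
}

static void toy_unlock(toy_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Simplified stand-ins for the IRQ descriptor and the virtual IRQ. */
struct toy_desc { toy_lock_t lock; unsigned int irq; };
struct toy_virq { toy_lock_t irq_lock; bool hw; unsigned int hwintid; };

/*
 * Called with irq->irq_lock held. Assumed lock order: desc->lock first,
 * then irq->irq_lock. Returns the locked descriptor, or NULL when no
 * hardware IRQ is attached. irq->irq_lock is held on return either way.
 */
static struct toy_desc *retake_in_order(struct toy_virq *irq,
                                        struct toy_desc *descs)
{
    struct toy_desc *desc;

    if ( !irq->hw )
        return NULL;

    toy_unlock(&irq->irq_lock);     /* drop the inner lock ... */
    desc = &descs[irq->hwintid];
    toy_lock(&desc->lock);          /* ... take the outer one ... */
    toy_lock(&irq->irq_lock);       /* ... and re-take the inner one. */

    /* State may have changed while unlocked, so re-validate it. */
    assert(irq->hw && desc->irq == irq->hwintid);

    return desc;
}
```

The re-validation after re-taking the inner lock is the part that makes the pattern feel complex: anything observed before the unlock can no longer be trusted.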

[Xen-devel] possible I/O emulation state machine issue

2018-03-22 Thread Jan Beulich
Paul,

our PV driver person has found a reproducible crash with ws2k8,
triggered by one of the WHQL tests. The guest gets crashed because
the re-issue check of an ioreq close to the top of hvmemul_do_io()
fails. I've handed him a first debugging patch, output of which
suggests that we're dealing with a completely new request, which
in turn would mean that we've run into stale STATE_IORESP_READY
state:

(XEN) d2v3: t=0/1 a=3c4/fed000f0 s=2/4 c=1/1 d=0/1 f=0/0 p=0/0 v=100/831873f27a30
(XEN) [ Xen-4.10.0_15-0  x86_64  debug=n   Tainted:  C   ]
(XEN) CPU:39
(XEN) RIP:e008:[] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN) RFLAGS: 00010292   CONTEXT: hypervisor (d2v3)
(XEN) rax: 8308797d802c   rbx: 0004   rcx: 
(XEN) rdx: 831873f27fff   rsi: 000a   rdi: 82d0804433b8
(XEN) rbp: 830007d28000   rsp: 831873f27728   r8:  0027
(XEN) r9:  0010   r10: 0400   r11: 82d08035bd40
(XEN) r12: 0001   r13:    r14: 0001
(XEN) r15: 831873f278e0   cr0: 80050033   cr4: 26e0
(XEN) cr3: 003794f02000   cr2: fa6000fae10e
(XEN) fsb:    gsb:    gss: 07fdd000
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  (emulate.c#hvmemul_do_io+0x1b1/0x640):
(XEN)  54 24 70 e8 cf 87 f7 ff <0f> 0b 48 8d 3d 16 b6 0b 00 48 8d 35 88 f8 0c 00
(XEN) Xen stack trace from rsp=831873f27728:
(XEN)0002 0004 0001 0001
(XEN) 0001  
(XEN)  0100 831873f27a30
(XEN)83283fe74010 83284ad22000  0001
(XEN)831873f277d8 831873f277e0 03c4 0100
(XEN)00020001  8317f8e5b000 0004
(XEN)0001 831873f27a30 831873f27a30 fed000f0
(XEN)830007d289c8 82d0802d578e  831873f27a30
(XEN) 0004 0004 
(XEN)831873f27a30 82d0802d64dd 831873f27a30 831873f27d10
(XEN)fed000f0 831873f27a30 0103 
(XEN)831873f278e0 ffd070f0 00040004 0004
(XEN)0001 831873f27c78 831873f278d8 831873f278d0
(XEN)831873f27938 fed000f0 0001 0001
(XEN)82d080350ecb 0004 0001 831873f27c78
(XEN)831873f27a30 0002 830007d28000 82d0802d69f1
(XEN)0001 82d0802a313d ffd070f0 0001
(XEN) 00f0 82d080350ecb 831873f27aa0
(XEN) 831873f27c78 831873f27a28 830007d28a60
(XEN)82d0803a7620 82d0802a4aad 831873f279c8 831873f27ac0
(XEN) Xen call trace:
(XEN)[] emulate.c#hvmemul_do_io+0x1b1/0x640
(XEN)[] emulate.c#hvmemul_do_io_buffer+0x2e/0x70
(XEN)[] emulate.c#hvmemul_linear_mmio_access+0x24d/0x540
(XEN)[] common_interrupt+0x9b/0x120
(XEN)[] emulate.c#__hvmemul_read+0x221/0x230
(XEN)[] x86_emulate.c#x86_decode+0xe2d/0x1e50
(XEN)[] common_interrupt+0x9b/0x120
(XEN)[] x86_emulate+0x94d/0x19150
(XEN)[] __get_gfn_type_access+0x101/0x290
(XEN)[] emulate.c#_hvm_emulate_one+0x4a/0x1e0
(XEN)[] vmx.c#vmx_get_interrupt_shadow+0/0x10
(XEN)[] hvm_emulate_init_once+0x7e/0xb0
(XEN)[] hvm_emulate_one_insn+0x3b/0x120
(XEN)[] x86_insn_is_mem_access+0/0xc0
(XEN)[] hvm_hap_nested_page_fault+0x138/0x710
(XEN)[] timer.c#add_entry+0x50/0xc0
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_vmexit_handler+0x8ae/0x1960
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0x9f/0x240
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] vmx_asm_vmexit_handler+0xe2/0x240
(XEN) 
(XEN) domain_crash called from emulate.c:171
(XEN) Domain 2 (vcpu#3) crashed on cpu#39:
(XEN) [ Xen-4.10.0_15-0  x86_64  debug=n   Tainted:  C   ]
(XEN) CPU:39
(XEN) RIP:0010:[]
(XEN) RFLAGS: 00010286   CONTEXT: hvm guest (d2v3)
(XEN) rax: ffd07000   rbx: 0003   rcx: 000

Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 08:42:09 -0600
"Jan Beulich"  wrote:

>>>> On 22.03.18 at 15:34,  wrote:
>> On Thu, 22 Mar 2018 07:20:00 -0600
>> "Jan Beulich"  wrote:
>>
>>>>>> On 22.03.18 at 14:05,  wrote:
>>>> On Thu, 22 Mar 2018 06:09:44 -0600
>>>> "Jan Beulich"  wrote:
>>>>
>>>>>>>> On 22.03.18 at 12:56,  wrote:
>>>>>> I really don't understand why some people have that fear of
>>>>>> emulated MMCONFIG -- it's really the same thing as any other MMIO
>>>>>> range QEMU already emulates via map_io_range_to_ioreq_server().
>>>>>> No sensitive information exposed. It is related only to emulated
>>>>>> PCI conf space which QEMU already knows about and uses, providing
>>>>>> emulated PCI devices for it.
>>>>>
>>>>>You continue to ignore the routing requirement multiple ioreq
>>>>>servers impose.
>>>>
>>>> If the emulated MMCONFIG approach is modified to become
>>>> fully compatible with multiple ioreq servers (whatever they are
>>>> used for), I assume there will be no objections to using emulated
>>>> MMCONFIG?
>>>> I just want to clarify this point -- why people think that
>>>> a completely emulated MMIO range, not related in any
>>>> way to the host's MMCONFIG, may compromise something.
>>>
>>>Compromise? All that was said so far - afair - was that this is the
>>>wrong way round design wise.  
>> 
>> I assume it's all about emulating some real system for HVM; for other
>> goals PV/PVH are available. What proper, design-wise way to emulate
>> the MMIO-based MMCONFIG range Q35 provides do you have in mind?
>> 
>> Here is what I've heard so far in this thread:
>> 
>> 1. Add a completely new dmop/hypercall so that QEMU can tell Xen
>> where the emulated MMCONFIG MMIO area is located and at the same time
>> map it for MMIO trapping to intercept accesses. The latter action is
>> the same as what map_io_range_to_ioreq_server() does, but let's
>> ignore that for now because there was an opinion that we need to
>> stick to a distinct hypercall.
>> 
>> 2. Upon trapping accesses to this emulated range, Xen will pretend
>> that QEMU didn't just tell it about the MMCONFIG location and size,
>> and will instead convert the MMIO access into a PCI conf one and send
>> the ioreq to QEMU or some other DM.
>> 
>> 3. If there is a PCIEXBAR relocation (OVMF currently does this for
>> MMCONFIG usage, but we must later teach it non-QEMU manners), QEMU
>> must immediately inform Xen about any changes in MMCONFIG
>> location/status.
>> 
>> 4. QEMU receives a PCI conf access while expecting an MMIO address,
>> so xen-hvm.c has to deal with it somehow, either obtaining the
>> MMCONFIG base and recreating the emulated MMIO access from BDF/reg,
>> or doing the dirty work of finding the PCIBus/PCIDevice target
>> itself, as it cannot use the emulated CF8/CFC ports due to the legacy
>> PCI conf size limitation.
>> 
>> Please confirm whether this is the preferred solution, or tell me if
>> something is missing.
>
>I'm afraid this is only part of the picture, as you've been told by
>others before. We first of all need to settle on who emulates
>the core chipset registers. Depending on that will be how Xen
>would learn about the MCFG location inside the guest.

A few related thoughts:

1. The MMCONFIG address is chipset-specific. On Q35 it's PCIEXBAR, on
other x86 systems it may be HECBASE or something else. So we can assume
it is bound to the emulated machine.

2. We rely on QEMU to emulate different machines for us.

3. There are users that touch the chipset-specific PCIEXBAR directly if
they see a Q35 system (OVMF so far).

It seems we're pretty limited in our freedom of choice under these
conditions, I'm afraid.

Re: [Xen-devel] [PATCH] x86emul: fix #XM delivery typo

2018-03-22 Thread Andrew Cooper
On 22/03/18 14:41, Roger Pau Monné wrote:
> On Thu, Mar 22, 2018 at 08:40:04AM -0600, Jan Beulich wrote:
>> This clearly wasn't meant the way it was originally written.
>>
>> Reported-by: Roger Pau Monné 
>> Signed-off-by: Jan Beulich 
> Reviewed-by: Roger Pau Monné 

Acked-by: Andrew Cooper 

Re: [Xen-devel] [PATCH v3 2/7] x86/xpti: don't flush TLB twice when switching to 64-bit pv context

2018-03-22 Thread Jan Beulich
>>> On 21.03.18 at 13:51,  wrote:
> When switching to a 64-bit pv context the TLB is flushed twice today:
> the first time when switching to the new address space in
> write_ptbase(), the second time when switching to guest mode in
> restore_to_guest.
> 
> Avoid the first TLB flush in that case.
> 
> Signed-off-by: Juergen Gross 
> ---
> V3:
> - omit setting root_pgt_changed to false (Jan Beulich)
> ---
>  xen/arch/x86/mm.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 352600ad73..8c944b33c9 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -123,6 +123,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -507,8 +508,14 @@ void make_cr3(struct vcpu *v, mfn_t mfn)
>  void write_ptbase(struct vcpu *v)
>  {
>  if ( this_cpu(root_pgt) && is_pv_vcpu(v) && !is_pv_32bit_vcpu(v) )
> +{
>  get_cpu_info()->root_pgt_changed = true;
> -write_cr3(v->arch.cr3);
> +asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
> +}
> +else
> +{
> +write_cr3(v->arch.cr3);
> +}

Unnecessary braces. With that
Reviewed-by: Jan Beulich 
(This could be taken care of while committing, but the patch
depends on patch 1 anyway, which may see further
transformation.)

Jan


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 15:34,  wrote:
> On Thu, 22 Mar 2018 07:20:00 -0600
> "Jan Beulich"  wrote:
> 
>>>>> On 22.03.18 at 14:05,  wrote:
>>> On Thu, 22 Mar 2018 06:09:44 -0600
>>> "Jan Beulich"  wrote:
>>>
>>>>>>> On 22.03.18 at 12:56,  wrote:
>>>>> I really don't understand why some people have that fear of
>>>>> emulated MMCONFIG -- it's really the same thing as any other MMIO
>>>>> range QEMU already emulates via map_io_range_to_ioreq_server(). No
>>>>> sensitive information exposed. It is related only to emulated PCI
>>>>> conf space which QEMU already knows about and uses, providing
>>>>> emulated PCI devices for it.
>>>>
>>>>You continue to ignore the routing requirement multiple ioreq
>>>>servers impose.
>>> 
>>> If the emulated MMCONFIG approach is modified to become
>>> fully compatible with multiple ioreq servers (whatever they are
>>> used for), I assume there will be no objections to using emulated
>>> MMCONFIG?
>>> I just want to clarify this point -- why people think that
>>> a completely emulated MMIO range, not related in any
>>> way to the host's MMCONFIG, may compromise something.
>>
>>Compromise? All that was said so far - afair - was that this is the
>>wrong way round design wise.
> 
> I assume it's all about emulating some real system for HVM; for other
> goals PV/PVH are available. What proper, design-wise way to emulate
> the MMIO-based MMCONFIG range Q35 provides do you have in mind?
> 
> Here is what I've heard so far in this thread:
> 
> 1. Add a completely new dmop/hypercall so that QEMU can tell Xen where
> the emulated MMCONFIG MMIO area is located and at the same time map it
> for MMIO trapping to intercept accesses. The latter action is the same
> as what map_io_range_to_ioreq_server() does, but let's ignore that for
> now because there was an opinion that we need to stick to a distinct
> hypercall.
> 
> 2. Upon trapping accesses to this emulated range, Xen will pretend that
> QEMU didn't just tell it about the MMCONFIG location and size, and will
> instead convert the MMIO access into a PCI conf one and send the ioreq
> to QEMU or some other DM.
> 
> 3. If there is a PCIEXBAR relocation (OVMF currently does this for
> MMCONFIG usage, but we must later teach it non-QEMU manners), QEMU must
> immediately inform Xen about any changes in MMCONFIG location/status.
> 
> 4. QEMU receives a PCI conf access while expecting an MMIO address, so
> xen-hvm.c has to deal with it somehow, either obtaining the MMCONFIG
> base and recreating the emulated MMIO access from BDF/reg, or doing the
> dirty work of finding the PCIBus/PCIDevice target itself, as it cannot
> use the emulated CF8/CFC ports due to the legacy PCI conf size
> limitation.
> 
> Please confirm whether this is the preferred solution, or tell me if
> something is missing.

I'm afraid this is only part of the picture, as you've been told by
others before. We first of all need to settle on who emulates
the core chipset registers. Depending on that will be how Xen
would learn about the MCFG location inside the guest.

Jan


Re: [Xen-devel] [PATCH] x86emul: fix #XM delivery typo

2018-03-22 Thread Roger Pau Monné
On Thu, Mar 22, 2018 at 08:40:04AM -0600, Jan Beulich wrote:
> This clearly wasn't meant the way it was originally written.
> 
> Reported-by: Roger Pau Monné 
> Signed-off-by: Jan Beulich 

Reviewed-by: Roger Pau Monné 

[Xen-devel] [PATCH] x86emul: fix #XM delivery typo

2018-03-22 Thread Jan Beulich
This clearly wasn't meant the way it was originally written.

Reported-by: Roger Pau Monné 
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -8662,7 +8662,7 @@ x86_emulate(
 {
 unsigned long cr4;
 
-if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )
+if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
 cr4 = X86_CR4_OSXMMEXCPT;
 generate_exception(cr4 & X86_CR4_OSXMMEXCPT ? EXC_XM : EXC_UD);
 }



Re: [Xen-devel] [PATCH v5 05/14] x86/HVM: eliminate custom #MF/#XM handling

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 15:12,  wrote:
> On Thu, Mar 15, 2018 at 07:06:36AM -0600, Jan Beulich wrote:
>> @@ -8478,7 +8411,8 @@ x86_emulate(
>>  }
>>  
>>   complete_insn: /* Commit shadow register state. */
>> -put_fpu(&fic, false, state, ctxt, ops);
>> +put_fpu(fpu_type, false, state, ctxt, ops);
>> +fpu_type = X86EMUL_FPU_none;
>>  
>>  /* Zero the upper 32 bits of %rip if not in 64-bit mode. */
>>  if ( !mode_64bit() )
>> @@ -8502,13 +8436,22 @@ x86_emulate(
>>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>>  
>>   done:
>> -put_fpu(&fic, fic.insn_bytes > 0 && dst.type == OP_MEM, state, ctxt, 
>> ops);
>> +put_fpu(fpu_type, insn_bytes > 0 && dst.type == OP_MEM, state, ctxt, 
>> ops);
>>  put_stub(stub);
>>  return rc;
>>  #undef state
>>  
>>  #ifdef __XEN__
>>   emulation_stub_failure:
>> +generate_exception_if(stub_exn.info.fields.trapnr == EXC_MF, EXC_MF);
>> +if ( stub_exn.info.fields.trapnr == EXC_XM )
>> +{
>> +unsigned long cr4;
>> +
>> +if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )
> 
> Is the second expression in the above line missing parentheses:
> 
> if ( !ops->read_cr || !(ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY) )
> 
> Or should this be:
> 
> if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )

Oops, yes indeed, the latter. Thanks for the report.

Jan
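
A small standalone sketch of the precedence issue raised above, for readers of the archive: unary `!` binds tighter than `==`, so `!f() == OK` is parsed as `(!f()) == OK`. The function and parameter names below are illustrative only and are not taken from the Xen sources. The sketch takes the success code as a parameter; if the success code is 0 (which is believed, but not stated in this thread, to be the case for X86EMUL_OKAY), the two forms happen to agree for every return value, which would make the original line a readability bug rather than a behavioural one.

```c
#include <stdbool.h>

/*
 * Illustrative only: `rc` stands for the callback's return value and
 * `ok` for the success code. Neither name comes from the Xen sources.
 */

/* `!rc == ok` parses as `(!rc) == ok`; written out explicitly here. */
static bool check_as_written(int rc, int ok)
{
    return (!rc) == ok;
}

/* The corrected form: true exactly when the call did not succeed. */
static bool check_as_fixed(int rc, int ok)
{
    return rc != ok;
}
```

With `ok == 0` both helpers return the same result for any `rc`. With a hypothetical non-zero success code such as 1, they diverge: for `rc == 2` the as-written form yields false while the fixed form yields true.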


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 07:20:00 -0600
"Jan Beulich"  wrote:

>>>> On 22.03.18 at 14:05,  wrote:
>> On Thu, 22 Mar 2018 06:09:44 -0600
>> "Jan Beulich"  wrote:
>>
>>>>>> On 22.03.18 at 12:56,  wrote:
>>>> I really don't understand why some people have that fear of
>>>> emulated MMCONFIG -- it's really the same thing as any other MMIO
>>>> range QEMU already emulates via map_io_range_to_ioreq_server(). No
>>>> sensitive information exposed. It is related only to emulated PCI
>>>> conf space which QEMU already knows about and uses, providing
>>>> emulated PCI devices for it.
>>>
>>>You continue to ignore the routing requirement multiple ioreq
>>>servers impose.
>> 
>> If the emulated MMCONFIG approach is modified to become
>> fully compatible with multiple ioreq servers (whatever they are
>> used for), I assume there will be no objections to using emulated
>> MMCONFIG?
>> I just want to clarify this point -- why people think that
>> a completely emulated MMIO range, not related in any
>> way to the host's MMCONFIG, may compromise something.
>
>Compromise? All that was said so far - afair - was that this is the
>wrong way round design wise.

I assume it's all about emulating some real system for HVM; for other
goals PV/PVH are available. What proper, design-wise way to emulate
the MMIO-based MMCONFIG range Q35 provides do you have in mind?

Here is what I've heard so far in this thread:

1. Add a completely new dmop/hypercall so that QEMU can tell Xen where
the emulated MMCONFIG MMIO area is located and at the same time map it
for MMIO trapping to intercept accesses. The latter action is the same
as what map_io_range_to_ioreq_server() does, but let's ignore that for
now because there was an opinion that we need to stick to a distinct
hypercall.

2. Upon trapping accesses to this emulated range, Xen will pretend that
QEMU didn't just tell it about the MMCONFIG location and size, and will
instead convert the MMIO access into a PCI conf one and send the ioreq
to QEMU or some other DM.

3. If there is a PCIEXBAR relocation (OVMF currently does this for
MMCONFIG usage, but we must later teach it non-QEMU manners), QEMU must
immediately inform Xen about any changes in MMCONFIG location/status.

4. QEMU receives a PCI conf access while expecting an MMIO address, so
xen-hvm.c has to deal with it somehow, either obtaining the MMCONFIG
base and recreating the emulated MMIO access from BDF/reg, or doing the
dirty work of finding the PCIBus/PCIDevice target itself, as it cannot
use the emulated CF8/CFC ports due to the legacy PCI conf size
limitation.

Please confirm whether this is the preferred solution, or tell me if
something is missing.
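
To make the translation in steps 2 and 4 concrete for readers of the archive, here is a minimal sketch of the standard PCIe ECAM layout (bus in offset bits 27:20, device in 19:15, function in 14:12, register in 11:0). It is not code from Xen or QEMU: `struct pci_conf_addr`, `mmcfg_decode()` and `mmcfg_encode()` are hypothetical names, and the MMCONFIG base and size are plain parameters supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for a decoded PCI config address. */
struct pci_conf_addr {
    uint8_t  bus;   /* 0..255 */
    uint8_t  dev;   /* 0..31  */
    uint8_t  fn;    /* 0..7   */
    uint16_t reg;   /* 0..4095, i.e. extended config space */
};

/* Decode a guest physical address inside [base, base + size). */
static bool mmcfg_decode(uint64_t base, uint64_t size, uint64_t gpa,
                         struct pci_conf_addr *out)
{
    uint64_t off;

    if ( gpa < base || gpa - base >= size )
        return false;

    off = gpa - base;
    out->bus = (off >> 20) & 0xff;
    out->dev = (off >> 15) & 0x1f;
    out->fn  = (off >> 12) & 0x07;
    out->reg = off & 0xfff;

    return true;
}

/* Inverse mapping: rebuild the MMIO address from BDF/reg. */
static uint64_t mmcfg_encode(uint64_t base, const struct pci_conf_addr *a)
{
    return base + ((uint64_t)a->bus << 20) +
           ((uint64_t)(a->dev & 0x1f) << 15) +
           ((uint64_t)(a->fn & 0x07) << 12) +
           (a->reg & 0xfff);
}
```

A register offset such as 0x104 decodes fine here but lies beyond the 256 bytes reachable through the legacy CF8/CFC mechanism, which is the size limitation step 4 refers to.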

Re: [Xen-devel] [PATCH v3 1/7] x86/xpti: avoid copying L4 page table contents when possible

2018-03-22 Thread Jan Beulich
>>> On 21.03.18 at 13:51,  wrote:
> --- a/xen/arch/x86/flushtlb.c
> +++ b/xen/arch/x86/flushtlb.c
> @@ -158,6 +158,9 @@ unsigned int flush_area_local(const void *va, unsigned 
> int flags)
>  }
>  }
>  
> +if ( flags & FLUSH_ROOT_PGTBL )
> +get_cpu_info()->root_pgt_changed = true;
> +
>  local_irq_restore(irqfl);
>  
>  return flags;

Does this really need to sit inside the interrupts disabled section?

Thinking about it I even wonder whether the cache flush part needs
to be. Even for the INVLPG portion of the TLB flush part I can't
seem to see a need for IRQs to be off. I think it's really just the
pre_flush() / post_flush() pair which needs to be inside such a
section. I'll prepare a patch (for after 4.11). I think some of the
changes later in your series will actually further ease this.

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -499,10 +499,15 @@ void free_shared_domheap_page(struct page_info *page)
>  void make_cr3(struct vcpu *v, mfn_t mfn)
>  {
>  v->arch.cr3 = mfn_x(mfn) << PAGE_SHIFT;
> +if ( v == current && this_cpu(root_pgt) && is_pv_vcpu(v) &&
> + !is_pv_32bit_vcpu(v) )
> +get_cpu_info()->root_pgt_changed = true;
>  }

As this doesn't actually update CR3, setting the flag shouldn't
generally be necessary if the caller then invokes write_ptbase().
Isn't setting the flag here needed solely in the case of
_toggle_guest_pt() being up the call tree? In which case it would
perhaps better be set there (and in turn some or even all of the
conditional around it could be dropped)?

>  void write_ptbase(struct vcpu *v)
>  {
> +if ( this_cpu(root_pgt) && is_pv_vcpu(v) && !is_pv_32bit_vcpu(v) )
> +get_cpu_info()->root_pgt_changed = true;
>  write_cr3(v->arch.cr3);

When you come here from e.g. __sync_local_execstate(), you
don't really need to set the flag. Of course you'll come here again
before the next 64-bit PV vCPU will make it to restore_all_guest,
so by the time we make it there the flag will be set anyway.
However, if you already use such a subtlety, then there's also
no point excluding 32-bit vCPU-s here (nor in make_cr3()), as
those will never make it to restore_all_guest. Same then for
excluding HVM vCPU-s. And I then wonder whether (here or
more likely in a later patch) the root_pgt check couldn't go away
as well.

> @@ -3698,18 +3703,29 @@ long do_mmu_update(
>  break;
>  rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
> -/*
> - * No need to sync if all uses of the page can be 
> accounted
> - * to the page lock we hold, its pinned status, and uses 
> on
> - * this (v)CPU.
> - */
> -if ( !rc && !cpu_has_no_xpti &&
> - ((page->u.inuse.type_info & PGT_count_mask) >
> -  (1 + !!(page->u.inuse.type_info & PGT_pinned) +
> -   (pagetable_get_pfn(curr->arch.guest_table) == 
> mfn) 
> +
> -   (pagetable_get_pfn(curr->arch.guest_table_user) ==
> -mfn))) )
> -sync_guest = true;
> +if ( !rc && !cpu_has_no_xpti )
> +{
> +bool local_in_use = false;
> +
> +if ( (pagetable_get_pfn(curr->arch.guest_table) ==
> +  mfn) ||
> + (pagetable_get_pfn(curr->arch.guest_table_user) 
> ==
> +  mfn) )
> +{
> +local_in_use = true;
> +get_cpu_info()->root_pgt_changed = true;
> +}

The conditional causes root_pgt_changed to get set even in cases
where what CR3 points to doesn't actually change (if it's the user
page tables that get modified). I think you want to check
curr->arch.cr3 here, or only curr->arch.guest_table (as user mode
can't invoke hypercalls).

> +/*
> + * No need to sync if all uses of the page can be
> + * accounted to the page lock we hold, its pinned
> + * status, and uses on this (v)CPU.
> + */
> +if ( (page->u.inuse.type_info & PGT_count_mask) >
> + (1 + !!(page->u.inuse.type_info & PGT_pinned) +
> +  local_in_use) )

The boolean local_in_use evaluates to 1 here, when previously the
value could have been 1 or 2 (I agree that's highly theoretical, but
anyway). Of course this will be addressed implicitly if you check
(only) curr->arch.guest_table above and move the
curr->arch.guest_table_user check here.
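
To make the counting concern above concrete, here is a minimal sketch. The
helper names (local_uses, needs_sync) and the plain integer types are
assumptions made for illustration only; they are not Xen's actual structures
or the code proposed in the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins: in the patch the two values come from
 * pagetable_get_pfn(curr->arch.guest_table) and
 * pagetable_get_pfn(curr->arch.guest_table_user).
 */
static unsigned int local_uses(unsigned long guest_table_mfn,
                               unsigned long guest_table_user_mfn,
                               unsigned long mfn)
{
    /* Each comparison yields 0 or 1, so the sum can be 0, 1 or 2. */
    return (guest_table_mfn == mfn) + (guest_table_user_mfn == mfn);
}

/* A sync is needed only if some reference is not accounted for locally. */
static bool needs_sync(unsigned int type_count, bool pinned,
                       unsigned int local)
{
    return type_count > 1u + pinned + local;
}
```

With a bool in place of the sum, the case where both tables point at the
same MFN would be counted as one local use rather than two, which is the
(admittedly theoretical) difference pointed out above.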

Jan



Re: [Xen-devel] [PATCH v5 05/14] x86/HVM: eliminate custom #MF/#XM handling

2018-03-22 Thread Roger Pau Monné
On Thu, Mar 15, 2018 at 07:06:36AM -0600, Jan Beulich wrote:
> @@ -8478,7 +8411,8 @@ x86_emulate(
>  }
>  
>   complete_insn: /* Commit shadow register state. */
> -put_fpu(&fic, false, state, ctxt, ops);
> +put_fpu(fpu_type, false, state, ctxt, ops);
> +fpu_type = X86EMUL_FPU_none;
>  
>  /* Zero the upper 32 bits of %rip if not in 64-bit mode. */
>  if ( !mode_64bit() )
> @@ -8502,13 +8436,22 @@ x86_emulate(
>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>  
>   done:
> -put_fpu(&fic, fic.insn_bytes > 0 && dst.type == OP_MEM, state, ctxt, ops);
> +put_fpu(fpu_type, insn_bytes > 0 && dst.type == OP_MEM, state, ctxt, ops);
>  put_stub(stub);
>  return rc;
>  #undef state
>  
>  #ifdef __XEN__
>   emulation_stub_failure:
> +generate_exception_if(stub_exn.info.fields.trapnr == EXC_MF, EXC_MF);
> +if ( stub_exn.info.fields.trapnr == EXC_XM )
> +{
> +unsigned long cr4;
> +
> +if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )

Is the second expression in the above line missing parentheses:

if ( !ops->read_cr || !(ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY) )

Or should this be:

if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )

clang complains with:

In file included from x86_emulate.c:44:
./x86_emulate/x86_emulate.c:8665:31: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )
                      ^                            ~~
./x86_emulate/x86_emulate.c:8665:31: note: add parentheses after the '!' to evaluate the comparison first
if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )
                      ^
                       (                                          )
./x86_emulate/x86_emulate.c:8665:31: note: add parentheses around left hand side expression to silence this warning
if ( !ops->read_cr || !ops->read_cr(4, &cr4, ctxt) == X86EMUL_OKAY )
                      ^
                      (                           )
1 error generated.
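
For reference, a small sketch of why the two spellings are not equivalent in
general. The function names and the 'okay' parameter are hypothetical and
exist only for this illustration. As far as I know X86EMUL_OKAY is 0 in Xen,
in which case both forms happen to give the same answer, but please treat
that as an assumption rather than something verified here.

```c
#include <stdbool.h>

/*
 * What the line in the patch actually computes: '!' binds tighter than
 * '==', so the return code is negated first and then compared.
 */
static bool failed_as_written(int rc, int okay)
{
    return (!rc) == okay;
}

/* The unambiguous spelling suggested above. */
static bool failed_explicit(int rc, int okay)
{
    return rc != okay;
}
```

When 'okay' is 0 the two functions agree for every rc. With any other
success value they diverge (for example rc = 2, okay = 1), which is why the
explicit '!=' form is the safer one regardless of how the constant is
defined.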

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [for-4.11][PATCH v6 16/16] xen: Convert page_to_mfn and mfn_to_page to use typesafe MFN

2018-03-22 Thread Julien Grall



On 03/22/2018 12:24 PM, Tim Deegan wrote:

Hi,


Hi Tim,


At 04:47 + on 21 Mar (1521607657), Julien Grall wrote:

Most of the users of page_to_mfn and mfn_to_page are either overriding
the macros to make them work with mfn_t or use mfn_x/_mfn because the
rest of the function use mfn_t.

So make page_to_mfn and mfn_to_page return mfn_t by default. The __*
version are now dropped as this patch will convert all the remaining
non-typesafe callers.

Only reasonable clean-ups are done in this patch. The rest will use
_mfn/mfn_x for the time being.

Lastly, domain_page_to_mfn is also converted to use mfn_t given that
most of the callers are now switched to _mfn(domain_page_to_mfn(...)).

Signed-off-by: Julien Grall 
Acked-by: Razvan Cojocaru 
Reviewed-by: Paul Durrant 
Reviewed-by: Boris Ostrovsky 
Reviewed-by: Kevin Tian 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
Reviewed-by: George Dunlap 


Thought I'd already acked this for the shadow code, but clearly not.


You acked the first version. The patch was heavily reworked just afterwards 
to drop __mfn_to_page and __page_to_mfn instead. Hence I dropped the ack.



Sorry for the delay, and:

Acked-by: Tim Deegan 


Thank you!

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v3a 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Julien Grall

Hi Andre,

On 03/22/2018 11:56 AM, Andre Przywara wrote:

+/* The locking order forces us to drop and re-take the locks here. */
+if ( irq->hw )
+{
+spin_unlock(&irq->irq_lock);
+
+desc = irq_to_desc(irq->hwintid);
+spin_lock(&desc->lock);
+spin_lock(&irq->irq_lock);
+
+/* This h/w IRQ should still be assigned to the virtual IRQ. */
+ASSERT(irq->hw && desc->irq == irq->hwintid);
+
+have_desc_lock = true;
+}


I am a bit concerned about this dance in fold_lr_state(). This looks 
awfully complex, but I don't have a better solution here. I will have a 
think during the night.


However, this is not going to solve the race condition I mentioned 
between clearing _IRQ_INPROGRESS here and setting _IRQ_INPROGRESS in 
do_IRQ. This is because you don't know the order in which they are going 
to be executed.


I wanted to make sure you didn't intend to solve that one. Am I correct?

Cheers,

--
Julien Grall


Re: [Xen-devel] X86 Community Call - Wed Apr 11, 14:00 - 15:00 UTC - Call for Agenda Items

2018-03-22 Thread Lars Kurth
Removing the non-working Intel alias

@John: once this alias actually works, let me know. 
The start of the thread is at 
https://lists.xenproject.org/archives/html/xen-devel/2018-03/threads.html#02672

@All:
To summarize in terms of higher level discussions:
* Discuss PCI emulation and our future direction. Our current hybrid with QEMU 
is becoming increasingly problematic (leader: Paul)
* Update on PVH work (leader: Royger)

It would be good if leaders could do some preparation and send out a short 
description of anything that they think may help others follow the discussion.

We should probably also summarize quickly any developments on NVDIMM, depending 
on progress.

I would say: maybe use the first 15-30 minutes for more operational stuff. The 
second half for bigger ticket items.

On 22/03/2018, 14:49, "Julien Grall"  wrote:

>> -
>>
>> I think we need to discuss PCI emulation and our future direction. Our 
>> current hybrid with QEMU is becoming increasingly problematic.
> 
> +1

I think it would be worthwhile for Stefano and me to join this discussion. 
Ideally, we want to use a common solution between Arm and x86.

Not sure the time will suit Stefano, though.

It's at 7am Pacific, which is a little early for Stefano. I can't really move 
the call: it was quite hard to agree on a time slot.
But we could aim to schedule this discussion for, say, 7:30 or 7:45, which 
would make this easier for Stefano.

Regards
Lars 


[Xen-devel] [PATCH v12 03/12] x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0

2018-03-22 Thread Roger Pau Monne
So that MMCFG regions not present in the MCFG ACPI table can be added
at run time by the hardware domain.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Paul Durrant 
---
Changes since v7:
 - Add newline in hvm_physdev_op for non-fallthrough case.

Changes since v6:
 - Do not return EEXIST if the same exact region is already tracked by
   Xen.

Changes since v5:
 - Check for has_vpci before calling register_vpci_mmcfg_handler
   instead of checking for is_hvm_domain.

Changes since v4:
 - Change the hardware_domain check in hvm_physdev_op to a vpci check.
 - Only register the MMCFG area, but don't scan it.

Changes since v3:
 - New in this version.
---
 xen/arch/x86/hvm/hypercall.c |  5 +++++
 xen/arch/x86/hvm/io.c        | 16 +++++++++++-----
 xen/arch/x86/physdev.c       | 11 +++++++++++
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 5742dd1797..85eacd7d33 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -89,6 +89,11 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 if ( !has_pirq(curr->domain) )
 return -ENOSYS;
 break;
+
+case PHYSDEVOP_pci_mmcfg_reserved:
+if ( !has_vpci(curr->domain) )
+return -ENOSYS;
+break;
 }
 
 if ( !curr->hcall_compat )
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 04425c064b..556810c126 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -507,10 +507,9 @@ static const struct hvm_mmio_ops vpci_mmcfg_ops = {
 .write = vpci_mmcfg_write,
 };
 
-int __hwdom_init register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
- unsigned int start_bus,
- unsigned int end_bus,
- unsigned int seg)
+int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
+unsigned int start_bus, unsigned int end_bus,
+unsigned int seg)
 {
 struct hvm_mmcfg *mmcfg, *new = xmalloc(struct hvm_mmcfg);
 
@@ -535,9 +534,16 @@ int __hwdom_init register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
 if ( new->addr < mmcfg->addr + mmcfg->size &&
  mmcfg->addr < new->addr + new->size )
 {
+int ret = -EEXIST;
+
+if ( new->addr == mmcfg->addr &&
+ new->start_bus == mmcfg->start_bus &&
+ new->segment == mmcfg->segment &&
+ new->size == mmcfg->size )
+ret = 0;
 write_unlock(&d->arch.hvm_domain.mmcfg_lock);
 xfree(new);
-return -EEXIST;
+return ret;
 }
 
 if ( list_empty(&d->arch.hvm_domain.mmcfg_regions) )
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 380d36f6b9..984491c3dc 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -557,6 +557,17 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
 ret = pci_mmcfg_reserved(info.address, info.segment,
  info.start_bus, info.end_bus, info.flags);
+if ( !ret && has_vpci(currd) )
+{
+/*
+ * For HVM (PVH) domains try to add the newly found MMCFG to the
+ * domain.
+ */
+ret = register_vpci_mmcfg_handler(currd, info.address,
+  info.start_bus, info.end_bus,
+  info.segment);
+}
+
 break;
 }
 
-- 
2.16.2
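
As an illustration of the "do not return EEXIST if the exact same region is
already tracked" rule from the changelog above, here is a minimal standalone
sketch. The struct, the function name and the return value 1 for "no
overlap" are assumptions made for this example; the real code walks
d->arch.hvm_domain.mmcfg_regions under a lock and is not reproduced here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified description of one tracked MMCFG region. */
struct region {
    uint64_t addr;
    uint64_t size;
    unsigned int start_bus;
    unsigned int segment;
};

/*
 * Compare a candidate region against one that is already tracked:
 *   0        the very same region is already registered (not an error)
 *   -EEXIST  the regions overlap but are not identical
 *   1        no overlap; a caller would go on to insert the candidate
 */
static int check_new_region(const struct region *cur,
                            const struct region *new)
{
    if ( new->addr < cur->addr + cur->size &&
         cur->addr < new->addr + new->size )
    {
        if ( new->addr == cur->addr &&
             new->start_bus == cur->start_bus &&
             new->segment == cur->segment &&
             new->size == cur->size )
            return 0;

        return -EEXIST;
    }

    return 1;
}
```

Treating an exact duplicate as success keeps the hypercall idempotent, so a
hardware domain that reports the same area twice does not get an error.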



[Xen-devel] [PATCH v12 08/12] x86/pt: mask MSI vectors on unbind

2018-03-22 Thread Roger Pau Monne
When a MSI device with per-vector masking capabilities is detected or
added to Xen all the vectors are masked when initializing it. This
implies that the first time the interrupt is bound to a domain it's
masked.

This however only applies to the first time the interrupt is bound
because neither the unbind nor the pirq unmap will mask the vector
again. In order to fix this re-mask the interrupt when unbinding it
from a guest. This makes sure that pairs of bind/unbind will always
get the same masking state.

Note that no issues have been reported regarding this behavior because
QEMU always uses the newly introduced XEN_PT_GFLAGSSHIFT_UNMASKED when
binding interrupts, so it's always unmasked.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Cc: Jan Beulich 
---
Changes since v7:
 - New in this version.
---
 xen/drivers/passthrough/io.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 8f16e6c0a5..bab3aa349a 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -645,7 +645,22 @@ int pt_irq_destroy_bind(
 }
 break;
 case PT_IRQ_TYPE_MSI:
+{
+unsigned long flags;
+struct irq_desc *desc = domain_spin_lock_irq_desc(d, machine_gsi,
+  &flags);
+
+if ( !desc )
+return -EINVAL;
+/*
+ * Leave the MSI masked, so that the state when calling
+ * pt_irq_create_bind is consistent across bind/unbinds.
+ */
+guest_mask_msi_irq(desc, true);
+spin_unlock_irqrestore(&desc->lock, flags);
 break;
+}
+
 default:
 return -EOPNOTSUPP;
 }
-- 
2.16.2



[Xen-devel] [PATCH v12 10/12] vpci: add a priority parameter to the vPCI register initializer

2018-03-22 Thread Roger Pau Monne
This is needed for MSI-X, since MSI-X will need to be initialized
before parsing the BARs, so that the header BAR handlers are aware of
the MSI-X related holes and make sure they are not mapped in order for
the trap handlers to work properly.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Tim Deegan 
Cc: Wei Liu 
---
Changes since v4:
 - Add a middle priority and add the PCI header to it.

Changes since v3:
 - Add a numerical suffix to the section used to store the pointer to
   each initializer function, and sort them at link time.
---
 xen/arch/arm/xen.lds.S    | 4 ++--
 xen/arch/x86/xen.lds.S    | 4 ++--
 xen/drivers/vpci/header.c | 2 +-
 xen/drivers/vpci/msi.c    | 2 +-
 xen/include/xen/vpci.h    | 8 ++++++--
 5 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 49cae2af71..245a0e0e85 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -69,7 +69,7 @@ SECTIONS
 #if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
. = ALIGN(POINTER_ALIGN);
__start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
__end_vpci_array = .;
 #endif
   } :text
@@ -182,7 +182,7 @@ SECTIONS
 #if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
. = ALIGN(POINTER_ALIGN);
__start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
__end_vpci_array = .;
 #endif
   } :text
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 7bd6fb51c3..70afedd31d 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -139,7 +139,7 @@ SECTIONS
 #if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
. = ALIGN(POINTER_ALIGN);
__start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
__end_vpci_array = .;
 #endif
   } :text
@@ -246,7 +246,7 @@ SECTIONS
 #if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
. = ALIGN(POINTER_ALIGN);
__start_vpci_array = .;
-   *(.data.vpci)
+   *(SORT(.data.vpci.*))
__end_vpci_array = .;
 #endif
   } :text
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 25d8ec0507..9fa07992cc 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -532,7 +532,7 @@ static int init_bars(struct pci_dev *pdev)
 
 return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
 }
-REGISTER_VPCI_INIT(init_bars);
+REGISTER_VPCI_INIT(init_bars, VPCI_PRIORITY_MIDDLE);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index c3c69ec453..de4ddf562e 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -267,7 +267,7 @@ static int init_msi(struct pci_dev *pdev)
 
 return 0;
 }
-REGISTER_VPCI_INIT(init_msi);
+REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
 
 void vpci_dump_msi(void)
 {
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 116b93f519..7266c17679 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -15,9 +15,13 @@ typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
 
 typedef int vpci_register_init_t(struct pci_dev *dev);
 
-#define REGISTER_VPCI_INIT(x)   \
+#define VPCI_PRIORITY_HIGH  "1"
+#define VPCI_PRIORITY_MIDDLE"5"
+#define VPCI_PRIORITY_LOW   "9"
+
+#define REGISTER_VPCI_INIT(x, p)\
   static vpci_register_init_t *const x##_entry  \
-   __used_section(".data.vpci") = x
+   __used_section(".data.vpci." p) = x
 
 /* Add vPCI handlers to device. */
 int __must_check vpci_add_handlers(struct pci_dev *dev);
-- 
2.16.2
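
A rough sketch of why string priorities such as "1", "5" and "9" give the
desired ordering: each initializer lands in a section whose name ends in its
priority, and the linker script sorts those sections by name. The code below
only approximates that with strcmp() and qsort(); the struct, the helper
names and the "init_msix" entry are hypothetical and used purely for
illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one initializer and the section it is placed in. */
struct init_entry {
    const char *section; /* e.g. ".data.vpci.5" */
    const char *name;    /* e.g. "init_bars" */
};

/* Order entries by section name, approximating the linker's SORT(). */
static int by_section(const void *a, const void *b)
{
    const struct init_entry *x = a, *y = b;

    return strcmp(x->section, y->section);
}

static void sort_entries(struct init_entry *entries, size_t nr)
{
    qsort(entries, nr, sizeof(*entries), by_section);
}
```

Because the comparison is lexical, single-digit suffixes sort as expected;
a suffix like "10" would sort before "5", so the scheme relies on keeping
the priorities to one character.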



[Xen-devel] [PATCH v12 01/12] vpci: introduce basic handlers to trap accesses to the PCI config space

2018-03-22 Thread Roger Pau Monne
This functionality is going to reside in vpci.c (and the corresponding
vpci.h header), and should be arch-agnostic. The handlers introduced
in this patch set up the basic functionality required in order to trap
accesses to the PCI config space, and allow decoding the address and
finding the corresponding handler that should handle the access
(although no handlers are implemented).

Note that the traps to the PCI IO ports registers (0xcf8/0xcfc) are
set up inside of an x86 HVM file, since that's not shared with other
arches.

A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
whether a domain should use the newly introduced vPCI handlers; this
is only enabled for PVH Dom0 at the moment.

A very simple user-space test is also provided, so that the basic
functionality of the vPCI traps can be asserted. This has proven
quite helpful during development, since the logic to handle partial
accesses or accesses that span multiple registers is not
trivial.

The handlers for the registers are added to a linked list that's kept
sorted at all times. Both the read and write handlers support accesses
that span multiple emulated registers and contain gaps not
emulated.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
[IO parts]
Reviewed-by: Paul Durrant 
[ARM]
Acked-by: Julien Grall 
[Tools]
Acked-by: Wei Liu 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Julien Grall 
Cc: Paul Durrant 
---
Changes since v9:
 - Remove vpci/Kconfig and use drivers/Kconfig instead.
 - Remove depends on HAS_PCI.

Changes since v8:
 - Introduce HAS_VPCI Kconfig option.
 - Drop Jan and Wei's RB (keep Paul's since the HAS_VPCI addition
   doesn't change IO code).
 - Rebase on top of XSA-256.

Changes since v7:
 - Constify d in vpci_portio_read.
 - ASSERT the correctness of the address in the read/write handlers.
 - Add newlines between non-fallthrough case statements.

Changes since v6:
 - Align the vpci handlers in the linker script.
 - Switch add/remove register functions to take a vpci parameter
   instead of a pci_dev.
 - Expand comment of merge_result.
 - Return X86EMUL_UNHANDLEABLE if accessing cfc and cf8 is disabled.

Changes since v5:
 - Use a spinlock per pci device.
 - Use the recently introduced pci_sbdf_t type.
 - Fix test harness to use the right handler type and the newly
   introduced lock.
 - Move the position of the vpci sections in the linker scripts.
 - Constify domain and pci_dev in vpci_{read/write}.
 - Fix typos in comments.
 - Use _XEN_VPCI_H_ as header guard.

Changes since v4:
* User-space test harness:
 - Do not redirect the output of the test.
 - Add main.c and emul.h as dependencies of the Makefile target.
 - Use the same rule to modify the vpci and list headers.
 - Remove underscores from local macro variables.
 - Add _check suffix to the test harness multiread function.
 - Change the value written by every different size in the multiwrite
   test.
 - Use { } to initialize the r16 and r20 arrays (instead of { 0 }).
 - Perform some of the read checks with the local variable directly.
 - Expand some comments.
 - Implement a dummy rwlock.
* Hypervisor code:
 - Guard the linker script changes with CONFIG_HAS_PCI.
 - Rename vpci_access_check to vpci_access_allowed and make it return
   bool.
 - Make hvm_pci_decode_addr return the register as return value.
 - Use ~3 instead of 0xfffc to remove the register offset when
   checking accesses to IO ports.
 - s/head/prev in vpci_add_register.
 - Add parentheses around & in vpci_add_register.
 - Fix register removal.
 - Change the BUGs in vpci_{read/write}_hw helpers to
   ASSERT_UNREACHABLE.
 - Make merge_result static and change the computation of the mask to
   avoid using a uint64_t.
 - Modify vpci_read to only read from hardware the not-emulated gaps.
 - Remove the vpci_val union and use a uint32_t instead.
 - Change handler read type to return a uint32_t instead of modifying
   a variable passed by reference.
 - Constify the data opaque parameter of read handlers.
 - Change the size parameter of the vpci_{read/write} functions to
   unsigned int.
 - Place the array of initialization handlers in init.rodata or
   .rodata depending on whether late-hwdom is enabled.
 - Remove the pci_devs lock, assume the Dom0 is well behaved and won't
   remove the device while trying to access it.
 - Change the recursive spinlock into a rw lock for performance
   reasons.

Changes since v3:
* User-space test harness:
 - Fix spaces in container_of macro.
 - Implement dummy locking functions.
 - Remove 'current' macro, make current a pointer to the statically
   allocated vcpu.
 - Remove unneeded parentheses in the pci_conf_readX macros.
 - Fix the name of the write test macro.
 - Remove the dummy EXPORT_SYMBOL macro (this was needed by the RB
   code only).
 - Import the max macro.
 - Test all possible read/write size combinati
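
The commit message above says the register handlers are kept in a linked
list that is sorted at all times. A minimal sketch of such an insertion
follows. The struct layout, the function name and the choice to reject
overlapping registers with -EEXIST are assumptions made for illustration;
this is not the code from vpci.c.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical emulated register: config space offset and size in bytes. */
struct vreg {
    unsigned int offset;
    unsigned int size;
    struct vreg *next;
};

/*
 * Insert 'r' into a singly linked list sorted by offset. Returns 0 on
 * success, or -EEXIST if 'r' overlaps an already registered range.
 */
static int add_register(struct vreg **head, struct vreg *r)
{
    struct vreg **pp = head;

    /* Skip every entry that ends at or before the start of 'r'. */
    while ( *pp && (*pp)->offset + (*pp)->size <= r->offset )
        pp = &(*pp)->next;

    /* The first remaining entry must start at or after the end of 'r'. */
    if ( *pp && (*pp)->offset < r->offset + r->size )
        return -EEXIST;

    r->next = *pp;
    *pp = r;

    return 0;
}
```

Keeping the list sorted and free of overlaps means a read or write that
covers several registers can walk it once, front to back, and treat any
space between two consecutive entries as a gap that is not emulated.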

[Xen-devel] [PATCH v12 11/12] vpci/msix: add MSI-X handlers

2018-03-22 Thread Roger Pau Monne
Add handlers for accesses to the MSI-X message control field on the
PCI configuration space, and traps for accesses to the memory region
that contains the MSI-X table and PBA. These traps detect attempts from
the guest to configure MSI-X interrupts and properly set them up.

Note that accesses to the Table Offset, Table BIR, PBA Offset and PBA
BIR are not trapped by Xen at the moment.

Finally, turn the panic in the Dom0 PVH builder into a warning.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
[IO]
Reviewed-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Paul Durrant 
---
Changes since v10:
 - Do not continue to print msix entries if the MSIX struct has
   changed its address while processing softirqs.
 - Use unsigned long to store the frame numbers in modify_bars.
 - Use lu to print frame values in modify_bars.

Changes since v9:
 - Unlock/lock when calling process_pending_softirqs.
 - Change vpci_msix_arch_print to return int in order to signal
   failure to continue after having processed softirqs.
 - Use a power of 2 to do the modulo.
 - Use PFN_DOWN in order to calculate the end of the MSI-X memory
   areas for the rangeset.

Changes since v8:
 - Call process_pending_softirqs between printing MSI-X entries.
 - Free msix struct in vpci_add_handlers.
 - Print only MSI or MSI-X if they are enabled.
 - Fix comment in update_entry.

Changes since v7:
 - Switch vpci.h macros to inline functions.
 - Change vpci_msix_arch_print_entry into vpci_msix_arch_print and
   make it print all the entries.
 - Add a log message if rangeset_remove_range fails to remove the BAR
   MSI-related range.
 - Introduce a new update_entry to disable and enable a MSIX entry in
   order to either update or set it up. This removes open coding it in
   two different places.
 - Unify access checks in access_allowed.
 - Add newlines between switch cases.
 - Expand max_entries to 12 bits.

Changes since v6:
 - Reduce the output of the debug keys.
 - Fix comments and code to match in vpci_msix_control_write.
 - Optimize size of the MSIX structure.
 - Convert 'tables[]' to a uint32_t in order to reduce the size of
   vpci_msix. Introduce some macros to make it easier to get the MSIX
   tables related data.
 - Limit size of the bool fields to 1 bit.
 - Remove the 'nr' field of vpci_msix_entry. The position can be
   calculated from the base of the entries array.
 - Drop the 'vpci_' prefix from the functions in msix.c, they are all
   static.
 - Remove the val local variable in control_read.
 - Initialize new_masked and new_enabled at declaration.
 - Recalculate the msix control value before writing it.
 - Remove the seg and bus local variables and use pdev->seg and
   pdev->bus instead.
 - Initialize msix at declaration in msix_{write/read}.
 - Add the must_check attribute to
   vpci_msix_arch_{enable/disable}_entry.

Changes since v5:
 - Update lock usage.
 - Unbind/unmap PIRQs when MSIX is disabled.
 - Share the arch-specific MSIX code with the MSI functions.
 - Do not reference the MSIX memory areas from the PCI BARs fields,
   instead fetch the BIR and offset each time needed.
 - Add the '_entry' suffix to the MSIX arch functions.
 - Prefix the vMSIX macros with 'V'.
 - s/gdprintk/gprintk/ in msix.c
 - Make vpci_msix_access_check return bool, and change its name to
   vpci_msix_access_allowed.
 - Join the first two ifs in vpci_msix_{read/write} into a single one.
 - Allow Dom0 to write to the PBA area.
 - Add a note that reads from the PBA area will need to be translated
   if the PBA is not identity mapped.

Changes since v4:
 - Remove parentheses around offsetof.
 - Add "being" to MSI-X enabling comment.
 - Use INVALID_PIRQ.
 - Add a simple sanity check to vpci_msix_arch_enable in order to
   detect wrong MSI-X entries more quickly.
 - Constify vpci_msix_arch_print entry argument.
 - s/cpu/fixed/ in vpci_msix_arch_print.
 - Dump the MSI-X info together with the MSI info.
 - Fix vpci_msix_control_write to take into account changes to the
   address and data fields when switching the function mask bit.
 - Only disable/enable the entries if the address or data fields have
   been updated.
 - Use the BAR enable field to check if a BAR is mapped or not
   (instead of reading the command register for each device).
 - Fix error path in vpci_msix_read to set the return data to ~0.
 - Simplify mask usage in vpci_msix_write.
 - Cast data to uint64_t when shifting it 32 bits.
 - Fix writes to the table entry control register to take into account
   if the mask-all bit is set.
 - Add some comments to clarify the intended behavior of the code.
 - Align the PBA size to 64-bits.
 - Remove the error label in vpci_init_msix.
 - Try to compact the layout of the vpci_msix structure.
 - Remove the local table_bar and pba_bar variables from
   vpci_init_msix, they are used only once.

Change

[Xen-devel] [PATCH v12 04/12] pci: split code to size BARs from pci_add_device

2018-03-22 Thread Roger Pau Monne
So that it can be called from outside in order to get the size of regular PCI
BARs. This will be required in order to map the BARs from PCI devices into PVH
Dom0 p2m.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
Changes since v11:
 - Fix initialization of sbdf with gcc 4.3.

Changes since v7:
 - Do not return error from pci_size_mem_bar in order to keep previous
   behavior.

Changes since v6:
 - Remove the vf and addr local variables.
 - Change the way flags are declared.
 - Move the last bool parameter to the flags field.

Changes since v5:
 - Introduce a flags field for pci_size_mem_bar.
 - Use pci_sbdf_t.

Changes since v4:
 - Restore printing whether the BAR is from a vf.
 - Make the psize pointer parameter not optional.
 - s/u64/uint64_t.
 - Remove some unneeded parentheses.
 - Assert the return value is never 0.
 - Use the newly introduced pci_sbdf_t type.

Changes since v3:
 - Rename function to size BARs to pci_size_mem_bar.
 - Change the parameters passed to the function. Pass the position and
   whether the BAR is the last one, instead of the (base, max_bars,
   *index) tuple.
 - Make the function return the number of BARs consumed (1 for 32b, 2
   for 64b BARs).
 - Change the dprintk back to printk.
 - Do not log another error message in pci_add_device in case
   pci_size_mem_bar fails.
---
 xen/drivers/passthrough/pci.c | 94 +++
 xen/include/xen/pci.h |  5 +++
 2 files changed, 65 insertions(+), 34 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e65c7faa6f..c0846e8ebb 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -603,6 +603,56 @@ static int iommu_add_device(struct pci_dev *pdev);
 static int iommu_enable_device(struct pci_dev *pdev);
 static int iommu_remove_device(struct pci_dev *pdev);
 
+unsigned int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos,
+  uint64_t *paddr, uint64_t *psize,
+  unsigned int flags)
+{
+uint32_t hi = 0, bar = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev,
+   sbdf.func, pos);
+uint64_t size;
+
+ASSERT((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY);
+pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, ~0);
+if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
+ PCI_BASE_ADDRESS_MEM_TYPE_64 )
+{
+if ( flags & PCI_BAR_LAST )
+{
+printk(XENLOG_WARNING
+   "%sdevice %04x:%02x:%02x.%u with 64-bit %sBAR in last slot\n",
+   (flags & PCI_BAR_VF) ? "SR-IOV " : "", sbdf.seg, sbdf.bus,
+   sbdf.dev, sbdf.func, (flags & PCI_BAR_VF) ? "vf " : "");
+*psize = 0;
+return 1;
+}
+hi = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4);
+pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, ~0);
+}
+size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos) &
+   PCI_BASE_ADDRESS_MEM_MASK;
+if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
+ PCI_BASE_ADDRESS_MEM_TYPE_64 )
+{
+size |= (uint64_t)pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev,
+  sbdf.func, pos + 4) << 32;
+pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, hi);
+}
+else if ( size )
+size |= (uint64_t)~0 << 32;
+pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, bar);
+size = -size;
+
+if ( paddr )
+*paddr = (bar & PCI_BASE_ADDRESS_MEM_MASK) | ((uint64_t)hi << 32);
+*psize = size;
+
+if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
+ PCI_BASE_ADDRESS_MEM_TYPE_64 )
+return 2;
+
+return 1;
+}
+
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
const struct pci_dev_info *info, nodeid_t node)
 {
@@ -672,11 +722,13 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
 unsigned int i;
 
 BUILD_BUG_ON(ARRAY_SIZE(pdev->vf_rlen) != PCI_SRIOV_NUM_BARS);
-for ( i = 0; i < PCI_SRIOV_NUM_BARS; ++i )
+for ( i = 0; i < PCI_SRIOV_NUM_BARS; )
 {
 unsigned int idx = pos + PCI_SRIOV_BAR + i * 4;
 u32 bar = pci_conf_read32(seg, bus, slot, func, idx);
-u32 hi = 0;
+pci_sbdf_t sbdf = {
+.sbdf = PCI_SBDF3(seg, bus, devfn),
+};
 
 if ( (bar & PCI_BASE_ADDRESS_SPACE) ==
  PCI_BASE_ADDRESS_SPACE_IO )
@@ -687,38 +739,12 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
seg, bus, slot, func, i);
 con

[Xen-devel] [PATCH v12 00/12] vpci: PCI config space emulation

2018-03-22 Thread Roger Pau Monne
Hello,

The following series contains an implementation of handlers for the PCI
configuration space inside of Xen. This allows Xen to detect accesses
to the PCI configuration space and react accordingly.

Why is this needed? IMHO, there are two main reasons for doing all this
emulation inside of Xen. The first one is to avoid adding a bunch of
duplicated Xen PV specific code to each OS we want to support in PVH
mode. This just promotes Xen code duplication amongst OSes, which
leads to a higher maintenance burden.

The second reason is that this code (or its functionality, to be
more precise) already exists in QEMU (and pciback to a degree), and
it's code that we already support and maintain. By moving it into the
hypervisor itself every guest type can make use of it, and it should be
shared between them all. I know that the code in this series is not
yet suitable for DomU HVM guests in its current state, but it should
be in due time.

As usual, each patch contains a changeset summary between versions,
I'm not going to copy the list of changes here.

The branch containing the patches can be found at:

git://xenbits.xen.org/people/royger/xen.git vpci_v11

Note that this is only safe to use for the hardware domain (that's
trusted), any non-trusted domain will need a lot more handlers before it
can freely access the PCI configuration space.
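
For readers new to the idea, a minimal sketch of what "trapping accesses
and reacting accordingly" amounts to is shown here. It is not code from
the series: toy_handler, toy_find and toy_config_read are hypothetical
names, and the sketch assumes an access is only emulated when it falls
entirely inside one registered register, otherwise the hardware value is
passed through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read handler type: returns the emulated register value. */
typedef uint32_t (*toy_read_t)(unsigned int reg, void *data);

/* One emulated register: config space offset, width in bytes, handler. */
struct toy_handler {
    unsigned int offset, size;
    toy_read_t read;
    void *data;
};

/* Example handler: returns a fixed value stored behind the data pointer. */
static uint32_t toy_fixed_read(unsigned int reg, void *data)
{
    (void)reg;
    return *(uint32_t *)data;
}

/* Find the handler that fully covers [reg, reg + size), if any. */
static const struct toy_handler *toy_find(const struct toy_handler *h,
                                          size_t nr, unsigned int reg,
                                          unsigned int size)
{
    size_t i;

    for ( i = 0; i < nr; i++ )
        if ( reg >= h[i].offset && reg + size <= h[i].offset + h[i].size )
            return &h[i];

    return NULL;
}

/* Emulate the read if a handler is registered, else pass hw_value through. */
static uint32_t toy_config_read(const struct toy_handler *h, size_t nr,
                                unsigned int reg, unsigned int size,
                                uint32_t hw_value)
{
    const struct toy_handler *m;

    assert(size == 1 || size == 2 || size == 4);
    m = toy_find(h, nr, reg, size);

    return m ? m->read(reg, m->data) : hw_value;
}
```

A real implementation also has to cope with accesses that only partially
overlap a trapped register, which this sketch deliberately ignores.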

Roger Pau Monne (12):
  vpci: introduce basic handlers to trap accesses to the PCI config
space
  x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas
  x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0
  pci: split code to size BARs from pci_add_device
  pci: add support to size ROM BARs to pci_size_mem_bar
  xen: introduce rangeset_consume_ranges
  vpci: add header handlers
  x86/pt: mask MSI vectors on unbind
  vpci/msi: add MSI handlers
  vpci: add a priority parameter to the vPCI register initializer
  vpci/msix: add MSI-X handlers
  vpci: do not expose unneeded functions to the user-space test harness

 .gitignore|   3 +
 tools/libxl/libxl_x86.c   |   2 +-
 tools/tests/Makefile  |   1 +
 tools/tests/vpci/Makefile |  33 +++
 tools/tests/vpci/emul.h   | 134 +
 tools/tests/vpci/main.c   | 309 +
 xen/arch/arm/xen.lds.S|  14 +
 xen/arch/x86/Kconfig  |   1 +
 xen/arch/x86/domain.c |   6 +-
 xen/arch/x86/hvm/dom0_build.c |  23 +-
 xen/arch/x86/hvm/hvm.c|   7 +
 xen/arch/x86/hvm/hypercall.c  |   5 +
 xen/arch/x86/hvm/io.c | 293 
 xen/arch/x86/hvm/ioreq.c  |   4 +
 xen/arch/x86/hvm/vmsi.c   | 246 +
 xen/arch/x86/msi.c|   3 +
 xen/arch/x86/physdev.c|  11 +
 xen/arch/x86/setup.c  |   2 +-
 xen/arch/x86/x86_64/mmconfig.h|   4 -
 xen/arch/x86/xen.lds.S|  14 +
 xen/common/rangeset.c |  28 ++
 xen/drivers/Kconfig   |   3 +
 xen/drivers/Makefile  |   1 +
 xen/drivers/passthrough/io.c  |  15 +
 xen/drivers/passthrough/pci.c | 104 ---
 xen/drivers/vpci/Makefile |   1 +
 xen/drivers/vpci/header.c | 564 ++
 xen/drivers/vpci/msi.c| 349 +++
 xen/drivers/vpci/msix.c   | 458 +++
 xen/drivers/vpci/vpci.c   | 482 
 xen/include/asm-x86/domain.h  |   1 +
 xen/include/asm-x86/hvm/domain.h  |   7 +
 xen/include/asm-x86/hvm/io.h  |  20 ++
 xen/include/asm-x86/msi.h |   3 +
 xen/include/asm-x86/pci.h |   6 +
 xen/include/public/arch-x86/xen.h |   5 +-
 xen/include/xen/irq.h |   1 +
 xen/include/xen/pci.h |   9 +
 xen/include/xen/pci_regs.h|   8 +
 xen/include/xen/rangeset.h|  10 +
 xen/include/xen/sched.h   |   4 +
 xen/include/xen/vpci.h| 225 +++
 42 files changed, 3373 insertions(+), 46 deletions(-)
 create mode 100644 tools/tests/vpci/Makefile
 create mode 100644 tools/tests/vpci/emul.h
 create mode 100644 tools/tests/vpci/main.c
 create mode 100644 xen/drivers/vpci/Makefile
 create mode 100644 xen/drivers/vpci/header.c
 create mode 100644 xen/drivers/vpci/msi.c
 create mode 100644 xen/drivers/vpci/msix.c
 create mode 100644 xen/drivers/vpci/vpci.c
 create mode 100644 xen/include/xen/vpci.h

-- 
2.16.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v12 09/12] vpci/msi: add MSI handlers

2018-03-22 Thread Roger Pau Monne
Add handlers for the MSI control, address, data and mask fields in
order to detect accesses to them and setup the interrupts as requested
by the guest.

Note that the pending register is not trapped, and the guest can
freely read/write to it.
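
As a reading aid (not part of the patch), the sketch below shows where
the fields mentioned above sit inside the MSI capability, following the
standard PCI layout. toy_msi_regs and toy_msi_layout are hypothetical
names; the offsets assume the usual capability format, where the data
register moves up by 4 bytes when the device supports 64-bit addresses.

```c
#include <assert.h>
#include <stdbool.h>

/* Config space offsets of the MSI registers (0 means "not present"). */
struct toy_msi_regs {
    unsigned int control, address, data, mask, pending;
};

static struct toy_msi_regs toy_msi_layout(unsigned int pos, bool is64,
                                          bool masking)
{
    struct toy_msi_regs r;

    /* Capabilities live past the standard header and are dword aligned. */
    assert(pos >= 0x40 && !(pos & 3));

    r.control = pos + 0x2;
    r.address = pos + 0x4;
    /* The upper address dword (64-bit only) pushes the data register up. */
    r.data = pos + (is64 ? 0xc : 0x8);
    /* Mask and pending bits only exist with per-vector masking. */
    r.mask = masking ? r.data + 0x4 : 0;
    r.pending = masking ? r.data + 0x8 : 0;

    return r;
}
```

With this layout the pending register is the last dword of the
capability, which is the one left untrapped according to the note above.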

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
[IO]
Reviewed-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Paul Durrant 
---
Changes since v8:
 - Add a FIXME about the lack of testing and a comment regarding the
   lack of cleaning done in the init_msi error path.
 - Free msi struct when cleaning up if an init function failed.
 - Remove the 'error' label of init_msi, the caller will already
   perform the cleaning.

Changes since v7:
 - Don't store pci segment/bus on local variables.
 - Add an error label to init_msi.
 - Don't trap accesses to the PBA.
 - Fix msi_pending_bits_reg macro so it matches coding style.
 - Move the position of vectors in the vpci_msi struct.
 - Add a comment to clarify the expected state of vectors after
   pt_irq_create_bind and use XEN_DOMCTL_VMSI_X86_UNMASKED.

Changes since v6:
 - Use domain_spin_lock_irq_desc instead of open coding it.
 - Reduce the size of printed debug messages.
 - Constify domain in vpci_dump_msi.
 - Lock domlist_read_lock before iterating over the list of domains.
 - Make max_vectors and vectors uint8_t.
 - Drop the vpci_ prefix from the static functions in msi.c.
 - Turn the booleans in vpci_msi into bitfields.
 - Apply the mask bits to all vectors when enabling msi.
 - Remove the pos field.
 - Remove the usage of __msi_set_{enable/disable}.
 - Update the bindings when the message or data fields are updated.
 - Make vpci_msi_arch_disable return void, it wasn't returning any
   error.
 - Prevent the guest from writing to the pending bits field, it's read
   only as defined in the spec.
 - Add the must_check attribute to vpci_msi_arch_enable.

Changes since v5:
 - Update to new lock usage.
 - Change handlers to match the new type.
 - s/msi_flags/msi_gflags/, remove the local variables and use the new
   DOMCTL_VMSI_* defines.
 - Change the MSI arch function to take a vpci_msi instead of a
   vpci_arch_msi as parameter.
 - Fix the calculation of the guest vector for MSI injection to take
   into account the number of bits that can be modified.
 - Use INVALID_PIRQ everywhere.
 - Simplify exit path of vpci_msi_disable.
 - Remove the conditional when setting address64 and masking fields.
 - Add a process_pending_softirqs to the MSI dump loop.
 - Place the prototypes for the MSI arch-specific functions in
   xen/vpci.h.
 - Add parentheses around the INVALID_PIRQ definition.

Changes since v4:
 - Fix commit message.
 - Change the ASSERTs in vpci_msi_arch_mask into ifs.
 - Introduce INVALID_PIRQ.
 - Destroy the partially created bindings in case of failure in
   vpci_msi_arch_enable.
 - Just take the pcidevs lock once in vpci_msi_arch_disable.
 - Print an error message in case of failure of pt_irq_destroy_bind.
 - Make vpci_msi_arch_init return void.
 - Constify the arch parameter of vpci_msi_arch_print.
 - Use fixed instead of cpu for msi redirection.
 - Separate the header includes in vpci/msi.c between xen and asm.
 - Store the number of configured vectors even if MSI is not enabled
   and always return it in vpci_msi_control_read.
 - Fix/add comments in vpci_msi_control_write to clarify intended
   behavior.
 - Simplify usage of masks in vpci_msi_address_{upper_}write.
 - Add comment to vpci_msi_mask_{read/write}.
 - Don't use MASK_EXTR in vpci_msi_mask_write.
 - s/msi_offset/pos/ in vpci_init_msi.
 - Move control variable setup closer to its usage.
 - Use d%d in vpci_dump_msi.
 - Fix printing of bitfield mask in vpci_dump_msi.
 - Fix definition of MSI_ADDR_REDIRECTION_MASK.
 - Shuffle the layout of vpci_msi to minimize gaps.
 - Remove the error label in vpci_init_msi.

Changes since v3:
 - Propagate changes from previous versions: drop xen_ prefix, drop
   return value from handlers, use the new vpci_val fields.
 - Use MASK_EXTR.
 - Remove the usage of GENMASK.
 - Add GFLAGS_SHIFT_DEST_ID and use it in msi_flags.
 - Add "arch" to the MSI arch specific functions.
 - Move the dumping of vPCI MSI information to dump_msi (key 'M').
 - Remove the guest_vectors field.
 - Allow the guest to change the number of active vectors without
   having to disable and enable MSI.
 - Check the number of active vectors when parsing the disable
   mask.
 - Remove the debug messages from vpci_init_msi.
 - Move the arch-specific part of the dump handler to x86/hvm/vmsi.c.
 - Use trylock in the dump handler to get the vpci lock.

Changes since v2:
 - Add an arch-specific abstraction layer. Note that this is only implemented
   for x86 currently.
 - Add a wrapper to detect MSI enabling for vPCI.
---
NB: I've only been able to test this with devices using a single MSI
interrupt and n

Re: [Xen-devel] [PATCH v3a 39/39] ARM: VGIC: wire new VGIC(-v2) files into Xen build system

2018-03-22 Thread Julien Grall

Hi Andre,

On 03/22/2018 11:56 AM, Andre Przywara wrote:

Now that we have both the old VGIC prepared to cope with a sibling and
the code for the new VGIC in place, let's add a Kconfig option to enable
the new code and wire it into the Xen build system.
This will add a compile time option to use either the "old" or the "new"
VGIC.
At the moment this is restricted to a vGIC-v2. To make the build system
happy, we provide a temporary dummy implementation of
vgic_v3_setup_hw() to allow building for now.

Signed-off-by: Andre Przywara 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall


[Xen-devel] [PATCH v12 12/12] vpci: do not expose unneeded functions to the user-space test harness

2018-03-22 Thread Roger Pau Monne
Some functions in vpci.c (vpci_remove_device and vpci_add_handlers)
are not used by the user-space test harness, so guard them with
__XEN__ in order to avoid exposing them to the user-space test
harness.

Requested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
---
 tools/tests/vpci/Makefile |  8 ++--
 xen/drivers/vpci/vpci.c   | 10 ++
 xen/include/xen/vpci.h|  6 +-
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/tools/tests/vpci/Makefile b/tools/tests/vpci/Makefile
index e45fcb5cd9..5075bc2be2 100644
--- a/tools/tests/vpci/Makefile
+++ b/tools/tests/vpci/Makefile
@@ -24,12 +24,8 @@ distclean: clean
 install:
 
 vpci.c: $(XEN_ROOT)/xen/drivers/vpci/vpci.c
-   # Trick the compiler so it doesn't complain about missing symbols
-   sed -e '/#include/d' \
-   -e '1s;^;#include "emul.h"\
-vpci_register_init_t *const __start_vpci_array[1]\;\
-vpci_register_init_t *const __end_vpci_array[1]\;\
-;' <$< >$@
+   # Remove includes and add the test harness header
+   sed -e '/#include/d' -e '1s/^/#include "emul.h"/' <$< >$@
 
 list.h: $(XEN_ROOT)/xen/include/xen/list.h
 vpci.h: $(XEN_ROOT)/xen/include/xen/vpci.h
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 8ec9c916ea..2913b56500 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -20,10 +20,6 @@
 #include 
 #include 
 
-extern vpci_register_init_t *const __start_vpci_array[];
-extern vpci_register_init_t *const __end_vpci_array[];
-#define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
-
 /* Internal struct to store the emulated PCI registers. */
 struct vpci_register {
 vpci_read_t *read;
@@ -34,6 +30,11 @@ struct vpci_register {
 struct list_head node;
 };
 
+#ifdef __XEN__
+extern vpci_register_init_t *const __start_vpci_array[];
+extern vpci_register_init_t *const __end_vpci_array[];
+#define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
+
 void vpci_remove_device(struct pci_dev *pdev)
 {
 spin_lock(&pdev->vpci->lock);
@@ -80,6 +81,7 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
 
 return rc;
 }
+#endif /* __XEN__ */
 
 static int vpci_register_cmp(const struct vpci_register *r1,
  const struct vpci_register *r2)
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index fc47163ba6..cb39e0ebea 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -90,11 +90,9 @@ struct vpci {
 bool rom_enabled  : 1;
 /* FIXME: currently there's no support for SR-IOV. */
 } header;
-#endif
 
 /* MSI data. */
 struct vpci_msi {
-#ifdef __XEN__
   /* Address. */
 uint64_t address;
 /* Mask bitfield. */
@@ -113,12 +111,10 @@ struct vpci {
 uint8_t vectors : 5;
 /* Arch-specific data. */
 struct vpci_arch_msi arch;
-#endif
 } *msi;
 
 /* MSI-X data. */
 struct vpci_msix {
-#ifdef __XEN__
 struct pci_dev *pdev;
 /* List link. */
 struct list_head next;
@@ -141,8 +137,8 @@ struct vpci {
 bool updated : 1;
 struct vpci_arch_msix_entry arch;
 } entries[];
-#endif
 } *msix;
+#endif
 };
 
 struct vpci_vcpu {
-- 
2.16.2



[Xen-devel] [PATCH v12 05/12] pci: add support to size ROM BARs to pci_size_mem_bar

2018-03-22 Thread Roger Pau Monne
Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
Changes since v6:
 - Remove the rom local variable.

Changes since v5:
 - Use the flags field.
 - Introduce a mask local variable.
 - Simplify return.

Changes since v4:
 - New in this version.
---
 xen/drivers/passthrough/pci.c | 28 ++--
 xen/include/xen/pci.h |  1 +
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c0846e8ebb..1db69d5b99 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -610,11 +610,16 @@ unsigned int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned 
int pos,
 uint32_t hi = 0, bar = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev,
sbdf.func, pos);
 uint64_t size;
-
-ASSERT((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY);
+bool is64bits = !(flags & PCI_BAR_ROM) &&
+(bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64;
+uint32_t mask = (flags & PCI_BAR_ROM) ? (uint32_t)PCI_ROM_ADDRESS_MASK
+  : 
(uint32_t)PCI_BASE_ADDRESS_MEM_MASK;
+
+ASSERT(!((flags & PCI_BAR_VF) && (flags & PCI_BAR_ROM)));
+ASSERT((flags & PCI_BAR_ROM) ||
+   (bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY);
 pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, ~0);
-if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
- PCI_BASE_ADDRESS_MEM_TYPE_64 )
+if ( is64bits )
 {
 if ( flags & PCI_BAR_LAST )
 {
@@ -628,10 +633,9 @@ unsigned int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned 
int pos,
 hi = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4);
 pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, ~0);
 }
-size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos) &
-   PCI_BASE_ADDRESS_MEM_MASK;
-if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
- PCI_BASE_ADDRESS_MEM_TYPE_64 )
+size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func,
+   pos) & mask;
+if ( is64bits )
 {
 size |= (uint64_t)pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev,
   sbdf.func, pos + 4) << 32;
@@ -643,14 +647,10 @@ unsigned int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned 
int pos,
 size = -size;
 
 if ( paddr )
-*paddr = (bar & PCI_BASE_ADDRESS_MEM_MASK) | ((uint64_t)hi << 32);
+*paddr = (bar & mask) | ((uint64_t)hi << 32);
 *psize = size;
 
-if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
- PCI_BASE_ADDRESS_MEM_TYPE_64 )
-return 2;
-
-return 1;
+return is64bits ? 2 : 1;
 }
 
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 2f171a8dcc..4cfa774615 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -191,6 +191,7 @@ const char *parse_pci_seg(const char *, unsigned int *seg, 
unsigned int *bus,
 
 #define PCI_BAR_VF  (1u << 0)
 #define PCI_BAR_LAST(1u << 1)
+#define PCI_BAR_ROM (1u << 2)
 unsigned int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos,
   uint64_t *paddr, uint64_t *psize,
   unsigned int flags);
-- 
2.16.2
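
For readers following the arithmetic, here is a minimal standalone
sketch of the size computation done once all-ones has been written to
the BAR and the register read back. It is not part of the patch:
toy_bar_size and the TOY_* masks are hypothetical names, and the mask
values are assumed to match the usual PCI definitions (low 4 bits
reserved for memory BARs, low 11 bits for the ROM BAR).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror PCI_BASE_ADDRESS_MEM_MASK and PCI_ROM_ADDRESS_MASK. */
#define TOY_MEM_MASK 0xfffffff0u
#define TOY_ROM_MASK 0xfffff800u

/*
 * lo/hi are the values read back after writing ~0 to the BAR (hi is only
 * meaningful for 64-bit BARs). Returns the size in bytes, or 0 if the BAR
 * is not implemented.
 */
static uint64_t toy_bar_size(uint32_t lo, uint32_t hi, bool is64bits,
                             bool rom)
{
    uint64_t size;

    /* A ROM BAR is always 32 bits wide. */
    assert(!(is64bits && rom));

    size = lo & (rom ? TOY_ROM_MASK : TOY_MEM_MASK);
    if ( is64bits )
        size |= (uint64_t)hi << 32;
    else if ( size )
        size |= ~(uint64_t)0 << 32; /* extend so the negation works */

    /* Negating leaves only the lowest writable address bit set. */
    return -size;
}
```

The only difference between the ROM and the regular case is the mask,
which is why a single mask variable is enough to share the code path.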



[Xen-devel] [PATCH v12 02/12] x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas

2018-03-22 Thread Roger Pau Monne
Introduce a set of handlers for the accesses to the MMCFG areas. Those
areas are setup based on the contents of the hardware MMCFG tables,
and the list of handled MMCFG areas is stored inside of the hvm_domain
struct.

The read/writes are forwarded to the generic vpci handlers once the
address is decoded in order to obtain the device and register the
guest is trying to access.
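
To make the decoding step concrete, a minimal sketch follows. It is not
the code from this patch: toy_target and toy_mmcfg_decode are
hypothetical names, the bit layout is the standard ECAM one (bus in bits
27:20, device in 19:15, function in 14:12, register in 11:0), and the
sketch assumes that base is the address at which start_bus begins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_target {
    uint8_t bus, dev, func;
    uint16_t reg;
};

/*
 * Decode a guest physical address inside an MMCFG window. Returns false
 * if the address is outside the window or the access would cross into
 * the next function's 4K of configuration space.
 */
static bool toy_mmcfg_decode(uint64_t base, uint8_t start_bus,
                             uint8_t end_bus, uint64_t addr,
                             unsigned int len, struct toy_target *t)
{
    uint64_t size, off;
    unsigned int reg;

    assert(end_bus >= start_bus);
    size = ((uint64_t)(end_bus - start_bus) + 1) << 20;

    if ( addr < base || addr - base >= size )
        return false;

    off = addr - base;
    reg = off & 0xfff;
    if ( reg + len > 0x1000 )
        return false;

    t->reg = reg;
    t->func = (off >> 12) & 0x7;
    t->dev = (off >> 15) & 0x1f;
    t->bus = start_bus + (uint8_t)(off >> 20);

    return true;
}
```

Once the device and register are known, the access can be handed to the
same handlers used for the other configuration access methods.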

Signed-off-by: Roger Pau Monné 
Reviewed-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Paul Durrant 
---
Changes since v7:
 - Add check for end_bus >= start_bus to register_vpci_mmcfg_handler.
 - Protect destroy_vpci_mmcfg with the mmcfg_lock.

Changes since v6:
 - Move allocation of mmcfg outside of the locked region.
 - Do proper overlap checks when adding mmcfg regions.
 - Return _RETRY if the mcfg region cannot be found in the read/write
   handlers. This means the mcfg area has been removed between the
   accept and the read/write calls.

Changes since v5:
 - Switch to use pci_sbdf_t.
 - Switch to the new per vpci locks.
 - Move the mmcfg related external definitions to asm-x86/pci.h.

Changes since v4:
 - Change the attribute of pvh_setup_mmcfg to __hwdom_init.
 - Try to add as many MMCFG regions as possible, even if one fails to
   add.
 - Change some fields of the hvm_mmcfg struct: turn size into a
   unsigned int, segment into uint16_t and bus into uint8_t.
 - Convert some address parameters from unsigned long to paddr_t for
   consistency.
 - Make vpci_mmcfg_decode_addr return the decoded register in the
   return of the function.
 - Introduce a new macro to convert a MMCFG address into a BDF, and
   use it in vpci_mmcfg_decode_addr to clarify the logic.
 - In vpci_mmcfg_{read/write} unify the logic for 8B accesses and
   smaller ones.
 - Add the __hwdom_init attribute to register_vpci_mmcfg_handler.
 - Test that reg + size doesn't cross a device boundary.

Changes since v3:
 - Propagate changes from previous patches: drop xen_ prefix for vpci
   functions, pass slot and func instead of devfn and fix the error
   paths of the MMCFG handlers.
 - s/ecam/mmcfg/.
 - Move the destroy code to a separate function, so the hvm_mmcfg
   struct can be private to hvm/io.c.
 - Constify the return of vpci_mmcfg_find.
 - Use d instead of v->domain in vpci_mmcfg_accept.
 - Allow 8byte accesses to the mmcfg.

Changes since v1:
 - Added locking.
---
 xen/arch/x86/hvm/dom0_build.c|  21 +
 xen/arch/x86/hvm/hvm.c   |   4 +
 xen/arch/x86/hvm/io.c| 184 ++-
 xen/arch/x86/x86_64/mmconfig.h   |   4 -
 xen/include/asm-x86/hvm/domain.h |   4 +
 xen/include/asm-x86/hvm/io.h |   7 ++
 xen/include/asm-x86/pci.h|   6 ++
 7 files changed, 225 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 1c70416af4..259814d95d 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1055,6 +1056,24 @@ static int __init pvh_setup_acpi(struct domain *d, 
paddr_t start_info)
 return 0;
 }
 
+static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
+{
+unsigned int i;
+int rc;
+
+for ( i = 0; i < pci_mmcfg_config_num; i++ )
+{
+rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address,
+ pci_mmcfg_config[i].start_bus_number,
+ pci_mmcfg_config[i].end_bus_number,
+ pci_mmcfg_config[i].pci_segment);
+if ( rc )
+printk("Unable to setup MMCFG handler at %#lx for segment %u\n",
+   pci_mmcfg_config[i].address,
+   pci_mmcfg_config[i].pci_segment);
+}
+}
+
 int __init dom0_construct_pvh(struct domain *d, const module_t *image,
   unsigned long image_headroom,
   module_t *initrd,
@@ -1096,6 +1115,8 @@ int __init dom0_construct_pvh(struct domain *d, const 
module_t *image,
 return rc;
 }
 
+pvh_setup_mmcfg(d);
+
 panic("Building a PVHv2 Dom0 is not yet supported.");
 return 0;
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 26f6335854..346e11f2d6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -584,8 +584,10 @@ int hvm_domain_initialise(struct domain *d)
 spin_lock_init(&d->arch.hvm_domain.irq_lock);
 spin_lock_init(&d->arch.hvm_domain.uc_lock);
 spin_lock_init(&d->arch.hvm_domain.write_map.lock);
+rwlock_init(&d->arch.hvm_domain.mmcfg_lock);
 INIT_LIST_HEAD(&d->arch.hvm_domain.write_map.list);
 INIT_LIST_HEAD(&d->arch.hvm_domain.g2m_ioport_list);
+INIT_LIST_HEAD(&d->arch.hvm_domain.mmcfg_regions);
 
 rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL);
 if ( rc )
@@ -731,6 +733,8 @@ void hvm_domain_destroy(stru

[Xen-devel] [PATCH v12 07/12] vpci: add header handlers

2018-03-22 Thread Roger Pau Monne
Introduce a set of handlers that trap accesses to the PCI BARs and the
command register, in order to snoop BAR sizing and BAR relocation.

The command handler is used to detect changes to the memory space enable
bit (bit 1, mask 0x2: response to memory space accesses), and maps/unmaps
the BARs of the device into
the guest p2m. A rangeset is used in order to figure out which memory
to map/unmap. This makes it easier to keep track of the possible
overlaps with other BARs, and will also simplify MSI-X support, where
certain regions of a BAR might be used for the MSI-X table or PBA.

The BAR register handlers are used to detect attempts by the guest to
size or relocate the BARs.

Note that the long running BAR mapping and unmapping operations are
deferred to be performed by hvm_io_pending, so that they can be safely
preempted.
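
A minimal sketch of the command register snooping described above is
given here, as an illustration only. toy_cmd_write and the TOY_* names
are hypothetical, and unlike the real handler the sketch updates the
register immediately instead of deferring the write until the mapping
has been completed.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_COMMAND_MEMORY 0x2 /* memory space enable (mask 0x2) */

enum toy_action { TOY_NONE, TOY_MAP, TOY_UNMAP };

/*
 * Record a guest write to the command register and report whether the
 * BARs need to be mapped into or unmapped from the guest p2m.
 */
static enum toy_action toy_cmd_write(uint16_t *cmd, uint16_t val)
{
    uint16_t changed;

    assert(cmd);
    changed = *cmd ^ val;
    *cmd = val;

    /* Nothing to do unless memory decoding was toggled. */
    if ( !(changed & TOY_COMMAND_MEMORY) )
        return TOY_NONE;

    return (val & TOY_COMMAND_MEMORY) ? TOY_MAP : TOY_UNMAP;
}
```

Writes that leave memory decoding untouched (for example toggling bus
mastering) need no p2m changes at all.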

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
[IO]
Reviewed-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Paul Durrant 
---
Changes since v11:
 - Fix initialization of sbdf with gcc 4.3.

Changes since v10:
 - Fix indirect function call in map_range.
 - Use rom->addr instead of fetching it from the ROM BAR register in
   modify_decoding.
 - Remove ternary operator from modify_decoding.
 - Simplify apply_map to have a single return.
 - Constify pci_dev parameter of apply_map.
 - Remove references to maybe_defer_map.
 - Use pdev (const) or dev (non-const) consistently in modify_bars.
 - Invert part of the logic in rom_write to remove one indentation
   level.
 - Add comments in rom_write to clarify why rom->addr is updated in
   two different places.
 - Use lx to print frame numbers in modify_bars.
 - Add start/end local variables in the first modify_bars loop.

Changes since v9:
 - Expand comments to clarify the code.
 - Rename rom to rom_only in the vpci_cpu struct.
 - Change definition style of dummy vpci_cpu.
 - Replace incorrect usage of PFN_UP.
 - Use system_state in order to check if the mapping functions are
   being called from Dom0 builder context.
 - Split the maybe_defer_map into two functions and place the Dom0
   builder one in the init section.

Changes since v8:
 - Do not pretend to support ARM in the map_range function. Explain
   the required changes in the comment.
 - Introduce PCI_HEADER_{NORMAL/BRIDGE}_NR_BARS defines.
 - Rename 'rom' boolean variable to 'rom_only', which is more
   descriptive of its meaning.
 - Introduce vpci_remove_device which removes all handlers for a
   device.
 - Simplify error handling when modifying BARs mapping. Any error will
   cause the device to be unplugged (by calling vpci_remove_device).
 - Return an error code in modify_bars. Add comments describing why
   the error is sometimes ignored.

Changes since v7:
 - Order includes.
 - Add newline between switch cases.
 - Fix typo in comment (hopping).
 - Wrap ternary conditional in parentheses.
 - Remove CONFIG_HAS_PCI guard from sched.h vpci_vcpu usage.
 - Add comment regarding vpci_vcpu usage.
 - Move rom_enabled from BAR struct to header.
 - Do not protect vpci_vcpu with __XEN__ guards.

Changes since v6:
 - s/vpci_check_pending/vpci_process_pending/.
 - Improve error handling in vpci_process_pending.
 - Add a comment that explains how vpci_check_bar_overlap works.
 - Add error messages to vpci_modify_bars and vpci_modify_rom.
 - Introduce vpci_hw_read16/32, in order to passthrough reads to
   the underlying hw.
 - Print BAR number on error in vpci_bar_write.
 - Place the CONFIG_HAS_PCI guards inside the vpci.h header and
   provide an empty vpci_vcpu structure for the !CONFIG_HAS_PCI case.
 - Define CONFIG_HAS_PCI in the test harness emul.h header before
   including vpci.h
 - Add ARM TODOs and an ARM-specific bodge to vpci_map_range due to
   the lack of preemption in {un}map_mmio_regions.
 - Make vpci_maybe_defer_map void.
 - Set rom_enabled in vpci_init_bars.
 - Defer enabling/disabling the memory decoding (or the ROM enable
   bit) until the memory has been mapped/unmapped.
 - Remove vpci_ prefix from static functions.
 - Use the same code in order to map the general BARs and the ROM
   BARs.
 - Remove the seg/bus local variables and use pdev->{seg,bus} instead.
 - Convert the bools in the BAR related structs into bool bitfields.
 - Add the must_check attribute to vpci_process_pending.
 - Open code check_bar_overlap inside modify_bars, which was its only
   user.

Changes since v5:
 - Switch to the new handler type.
 - Use pci_sbdf_t to size the BARs.
 - Use a single return for vpci_modify_bar.
 - Do not return an error code from vpci_modify_bars, just log the
   failure.
 - Remove the 'sizing' parameter. Instead just let the guest write
   directly to the BAR, and read the value back. This simplifies the
   BAR register handlers, especially the read one.
 - Ignore ROM BAR writes with memory decoding enabled and ROM enabled.
 - Do not propagate failures to setu

[Xen-devel] [PATCH v12 06/12] xen: introduce rangeset_consume_ranges

2018-03-22 Thread Roger Pau Monne
This function allows iterating over a rangeset while removing the
processed regions.

This will be used in order to split processing of large memory areas
when mapping them into the guest p2m.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
Changes since v6:
 - Expand commit message.
 - Add a comment to describe the expected function behavior.
 - Fix indentation.

Changes since v5:
 - New in this version.
---
 xen/common/rangeset.c  | 28 
 xen/include/xen/rangeset.h | 10 ++
 2 files changed, 38 insertions(+)

diff --git a/xen/common/rangeset.c b/xen/common/rangeset.c
index ade34f6a50..bb68ce62e4 100644
--- a/xen/common/rangeset.c
+++ b/xen/common/rangeset.c
@@ -350,6 +350,34 @@ int rangeset_claim_range(struct rangeset *r, unsigned long 
size,
 return 0;
 }
 
+int rangeset_consume_ranges(struct rangeset *r,
+int (*cb)(unsigned long s, unsigned long e, void *,
+  unsigned long *c),
+void *ctxt)
+{
+int rc = 0;
+
+write_lock(&r->lock);
+while ( !rangeset_is_empty(r) )
+{
+unsigned long consumed = 0;
+struct range *x = first_range(r);
+
+rc = cb(x->s, x->e, ctxt, &consumed);
+
+ASSERT(consumed <= x->e - x->s + 1);
+x->s += consumed;
+if ( x->s > x->e )
+destroy_range(r, x);
+
+if ( rc )
+break;
+}
+write_unlock(&r->lock);
+
+return rc;
+}
+
 int rangeset_add_singleton(
 struct rangeset *r, unsigned long s)
 {
diff --git a/xen/include/xen/rangeset.h b/xen/include/xen/rangeset.h
index 1f83b1f44b..583b72bb0c 100644
--- a/xen/include/xen/rangeset.h
+++ b/xen/include/xen/rangeset.h
@@ -70,6 +70,16 @@ int rangeset_report_ranges(
 struct rangeset *r, unsigned long s, unsigned long e,
 int (*cb)(unsigned long s, unsigned long e, void *), void *ctxt);
 
+/*
+ * Note that the consume function can return an error value apart from
+ * -ERESTART, and that no cleanup is performed (ie: the user should call
+ * rangeset_destroy if needed).
+ */
+int rangeset_consume_ranges(struct rangeset *r,
+int (*cb)(unsigned long s, unsigned long e,
+  void *, unsigned long *c),
+void *ctxt);
+
 /* Add/remove/query a single number. */
 int __must_check rangeset_add_singleton(
 struct rangeset *r, unsigned long s);
-- 
2.16.2
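
To illustrate the intended calling pattern, here is a small standalone
model of the consume loop together with a callback that processes a
bounded amount of work per invocation. It is not the Xen code: the
toy_* names are hypothetical, an array with a moving start index stands
in for the rangeset, and TOY_ERESTART stands in for -ERESTART.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ERESTART (-1) /* stand-in for Xen's -ERESTART */

struct toy_range { unsigned long s, e; }; /* inclusive bounds */
struct toy_set { struct toy_range *ranges; size_t first, nr; };
struct toy_budget { unsigned long left, mapped; };

/* Callback: handle at most b->left units, ask to be restarted if short. */
static int toy_map_some(unsigned long s, unsigned long e, void *ctxt,
                        unsigned long *c)
{
    struct toy_budget *b = ctxt;
    unsigned long size = e - s + 1;
    unsigned long n = size < b->left ? size : b->left;

    b->left -= n;
    b->mapped += n;
    *c = n;

    return n < size ? TOY_ERESTART : 0;
}

/* Same shape as the loop in the patch: shrink or drop the first range. */
static int toy_consume(struct toy_set *set,
                       int (*cb)(unsigned long s, unsigned long e, void *,
                                 unsigned long *c),
                       void *ctxt)
{
    int rc = 0;

    while ( set->first < set->nr )
    {
        struct toy_range *x = &set->ranges[set->first];
        unsigned long consumed = 0;

        rc = cb(x->s, x->e, ctxt, &consumed);

        assert(consumed <= x->e - x->s + 1);
        x->s += consumed;
        if ( x->s > x->e )
            set->first++; /* range fully processed */

        if ( rc )
            break;
    }

    return rc;
}
```

Because the processed part is removed before returning, the caller can
simply invoke the function again later and it resumes where it stopped.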



Re: [Xen-devel] [PATCH v3a 03/39] ARM: GIC: Allow tweaking the active and pending state of an IRQ

2018-03-22 Thread Julien Grall

Hi Andre,

On 03/22/2018 11:56 AM, Andre Przywara wrote:

When playing around with hardware mapped, level triggered virtual IRQs,
there is the need to explicitly set the active or pending state of an
interrupt at some point.
To prepare the GIC for that, we introduce a set_active_state() and a
set_pending_state() function to let the VGIC manipulate the state of
an associated hardware IRQ.
This takes care of properly setting the _IRQ_INPROGRESS bit.

Signed-off-by: Andre Przywara 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v3 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Julien Grall

Hi Andre,

On 03/22/2018 11:04 AM, Andre Przywara wrote:

This is a "patch to the patch" mentioned above, to make it clear what
changed:
We now take the desc lock in vgic_v2_fold_lr_state() when we are dealing
with a hardware IRQ. This is a bit complicated, because we have to obey
the existing locking order, so do our infamous "drop-take-retake" dance.
Also I print a message about using the new VGIC and fix that last
remaining "u32" usage.

Please note that I had to initialise "desc" to NULL because my compiler
(GCC 5.3) is not smart enough to see that we only use it with irq->hw
set and it's safe. Please let me know if it's me not being smart enough
here instead ;-)


I would not be surprised if even recent compilers can't deal with that. 
It would require quite some work from the compiler to know that desc is 
only used when irq->hw is set.


I will comment on the code in 3a.

Cheers,

--
Julien Grall


Re: [Xen-devel] X86 Community Call - Wed Apr 11, 14:00 - 15:00 UTC - Call for Agenda Items

2018-03-22 Thread Julien Grall

Hi,

On 03/22/2018 11:55 AM, Roger Pau Monné wrote:

On Thu, Mar 22, 2018 at 10:27:35AM +, Paul Durrant wrote:

De-htmling...

-
From: Lars Kurth
Sent: 22 March 2018 10:22
To: xen-de...@lists.xensource.com
Cc: committ...@xenproject.org; Juergen Gross ; Janakarajan Natarajan ; Tamas K Lengyel 
; Wei Liu ; Andrew Cooper ; Daniel Kiper 
; Roger Pau Monné ; Christopher Clark ; Rich Persaud 
; Paul Durrant ; Jan Beulich' ; Brian Woods 
; intel-...@intel.com
Subject: X86 Community Call - Wed Apr 11, 14:00 - 15:00 UTC - Call for Agenda 
Items

Hi all,
please find attached
a) Meeting details (just a link with timezones) – the meeting invite will 
follow when we have an agenda
b) Bridge details – will be sent with the meeting invite
    I am thinking of using GotoMeeting, but want to try this with a Linux only 
user before I commit
c) Call for agenda items
A few suggestions were made, such as XPTI status (if applicable), PVH status
Also we have some left-overs from the last call: see 
https://lists.xenproject.org/archives/html/xen-devel/2018-03/threads.html#01571
Regards
Lars
== Meeting Details ==
Wed April 11, 15:00 - 16:00 UTC
International meeting times: 
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2018&month=4&day=11&hour=14&min=0&sec=0&p1=224&p2=24&p3=179&p4=136&p5=37&p6=33
== Agenda Proposal ==
We start with a round the table call as to who is on the call (name and company)
=== A) Coordination and Planning ===
Coordinating who does what, what needs attention, what is blocked, etc.
A1) Short-term
Any urgent issues related to the 4.11 release that need discussing
A2) Long-term, Larger series
Please call out any x86 related series, that need attention in the longer term. 
Provide
* Title of series
* Link to series (e.g. on https://lists.xenproject.org/archives/html/xen-devel, 
markmail, …)
* Describe any: Dependencies, Issues, etc. that are relevant
=== B) Design, architecture, feature updates related discussions ===
Please highlight any design/architecture discussions that you would like to 
cover. Please describe
* Design, point to any mail discussions
* Describe clearly what you are blocked on: highlight any issues
=== C) Demos, Sharing of Experiences, Sometimes discussion of specific issues/bugs/problems/... ===
Please highlight any of the above that you would like to cover. Please describe
* What the issue/experience/demo is that you would like to cover
=== D) AOB ===
-

I think we need to discuss PCI emulation and our future direction. Our current 
hybrid with QEMU is becoming increasingly problematic.


+1


I think it would be worthwhile for Stefano and me to join this discussion.
Ideally, we want to use a common solution between Arm and x86.


Not sure the time will suit Stefano, though.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 8/8] x86: avoid double CR3 reload when switching to guest user mode

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 14:20,  wrote:
> On Mon, Mar 19, 2018 at 07:41:42AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/pv/domain.c
>> +++ b/xen/arch/x86/pv/domain.c
>> @@ -219,10 +219,22 @@ int pv_domain_initialise(struct domain *
>>  return rc;
>>  }
>>  
>> -static void _toggle_guest_pt(struct vcpu *v)
>> +static void _toggle_guest_pt(struct vcpu *v, bool force_cr3)
>>  {
>> +ASSERT(!in_irq());
>> +
>>  v->arch.flags ^= TF_kernel_mode;
>>  update_cr3(v);
>> +
>> +/*
>> + * There's no need to load CR3 here when it is going to be loaded on the
>> + * way out to guest mode again anyway, and when the page tables we're
>> + * currently on are the kernel ones (whereas when switching to kernel
>> + * mode we need to be able to write a bounce frame onto the kernel stack).
>> + */
> 
> Not sure I follow the comment. If you're talking about
> create_bounce_frame, it wouldn't call this function in the first place,
> right?

Right. The comment is talking about what may happen after we
return from here.

>> +if ( !force_cr3 && !(v->arch.flags & TF_kernel_mode) )
> 
> Also, it takes a bit of mental power to see !(v->arch.flags &
> TF_kernel_mode) means the mode Xen is using. Can you maybe just use a
> variable at the beginning like
> 
>bool kernel_mode = v->arch.flags & TF_kernel_mode;
> 
> and then use it here?

Except for the (how I would say) clutter by the extra local variable
I don't see much of a difference.
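
For illustration only, the two spellings under discussion can be put side by side. This is a minimal sketch, not the Xen code: `struct vcpu_sketch`, the flag value and both function names are hypothetical stand-ins, and the flag is assumed to be tested after it has been toggled, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Xen flag; the bit value is illustrative. */
#define TF_kernel_mode (1u << 0)

/* Hypothetical stand-in for struct vcpu; 'flags' plays v->arch.flags. */
struct vcpu_sketch {
    unsigned int flags;
};

/* Form used in the patch: test the (already toggled) flag inline. */
static bool skip_cr3_load_inline(const struct vcpu_sketch *v, bool force_cr3)
{
    assert(v != NULL);
    return !force_cr3 && !(v->flags & TF_kernel_mode);
}

/* Form suggested in review: give the condition a name first. */
static bool skip_cr3_load_named(const struct vcpu_sketch *v, bool force_cr3)
{
    bool kernel_mode;

    assert(v != NULL);
    kernel_mode = v->flags & TF_kernel_mode;

    return !force_cr3 && !kernel_mode;
}
```

Both functions return the same result for every input; the only difference is whether the reader sees the raw flag test or a named local.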

Jan



Re: [Xen-devel] X86 Community Call - Wed Apr 11, 14:00 - 15:00 UTC - Call for Agenda Items

2018-03-22 Thread Razvan Cojocaru
On 03/22/2018 12:22 PM, Lars Kurth wrote:
> Hi all,
> 
> please find attached 
> a) Meeting details (just a link with timezones) – the meeting invite
> will follow when we have an agenda
> b) Bridge details – will be sent with the meeting invite
>    I am thinking of using GotoMeeting, but want to try this with a Linux
> only user before I commit
> c) Call for agenda items

Using GotoMeeting would be great (as a Linux user I recall being able to
hear / see GotoMeeting sessions, though I've not tried it recently).

It's definitely more convenient than a phone meeting.


Thanks,
Razvan


Re: [Xen-devel] [PATCH v4 8/8] x86: avoid double CR3 reload when switching to guest user mode

2018-03-22 Thread Wei Liu
On Mon, Mar 19, 2018 at 07:41:42AM -0600, Jan Beulich wrote:
> When XPTI is active, the CR3 load in restore_all_guest is sufficient
> when switching to user mode, improving in particular system call and
> page fault exit paths for the guest.
> 
> Signed-off-by: Jan Beulich 
> Tested-by: Juergen Gross 
> Reviewed-by: Juergen Gross 
> ---
> v2: Add ASSERT(!in_irq()).
> 
> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -219,10 +219,22 @@ int pv_domain_initialise(struct domain *
>  return rc;
>  }
>  
> -static void _toggle_guest_pt(struct vcpu *v)
> +static void _toggle_guest_pt(struct vcpu *v, bool force_cr3)
>  {
> +ASSERT(!in_irq());
> +
>  v->arch.flags ^= TF_kernel_mode;
>  update_cr3(v);
> +
> +/*
> + * There's no need to load CR3 here when it is going to be loaded on the
> + * way out to guest mode again anyway, and when the page tables we're
> + * currently on are the kernel ones (whereas when switching to kernel
> + * mode we need to be able to write a bounce frame onto the kernel stack).
> + */

Not sure I follow the comment. If you're talking about
create_bounce_frame, it wouldn't call this function in the first place,
right?

> +if ( !force_cr3 && !(v->arch.flags & TF_kernel_mode) )

Also, it takes a bit of mental power to see !(v->arch.flags &
TF_kernel_mode) means the mode Xen is using. Can you maybe just use a
variable at the beginning like

   bool kernel_mode = v->arch.flags & TF_kernel_mode;

and then use it here?

> +return;
> +
>  /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
>  asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
>  
> @@ -252,13 +264,13 @@ void toggle_guest_mode(struct vcpu *v)
>  }
>  asm volatile ( "swapgs" );
>  
> -_toggle_guest_pt(v);
> +_toggle_guest_pt(v, cpu_has_no_xpti);
>  }
>  
>  void toggle_guest_pt(struct vcpu *v)
>  {
>  if ( !is_pv_32bit_vcpu(v) )
> -_toggle_guest_pt(v);
> +_toggle_guest_pt(v, true);
>  }
>  
>  /*

Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 14:05,  wrote:
> On Thu, 22 Mar 2018 06:09:44 -0600
> "Jan Beulich"  wrote:
> 
> On 22.03.18 at 12:56,  wrote:  
>>> I really don't understand why some people have that fear of emulated
>>> MMCONFIG -- it's really the same thing as any other MMIO range QEMU
>>> already emulates via map_io_range_to_ioreq_server(). No sensitive
>>> information exposed. It is related only to emulated PCI conf space
>>> which QEMU already knows about and use, providing emulated PCI
>>> devices for it.  
>>
>>You continue to ignore the routing requirement multiple ioreq
>>servers impose.
> 
> If the emulated MMCONFIG approach is modified to become fully
> compatible with multiple ioreq servers (whatever they are used for), I
> assume there will be no objection to using emulated MMCONFIG?
> I just want to clarify this point -- why do people think that
> a completely emulated MMIO range, not related in any
> way to the host's MMCONFIG, may compromise something?

Compromise? All that was said so far - afair - was that this is the
wrong way round design wise.

Jan



Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 06:09:44 -0600
"Jan Beulich"  wrote:

 On 22.03.18 at 12:56,  wrote:  
>> I really don't understand why some people have that fear of emulated
>> MMCONFIG -- it's really the same thing as any other MMIO range QEMU
>> already emulates via map_io_range_to_ioreq_server(). No sensitive
>> information exposed. It is related only to emulated PCI conf space
>> which QEMU already knows about and use, providing emulated PCI
>> devices for it.  
>
>You continue to ignore the routing requirement multiple ioreq
>servers impose.

If the emulated MMCONFIG approach is modified to become fully
compatible with multiple ioreq servers (whatever they are used for), I
assume there will be no objection to using emulated MMCONFIG?
I just want to clarify this point -- why do people think that
a completely emulated MMIO range, not related in any
way to the host's MMCONFIG, may compromise something?


[Xen-devel] [xen-unstable-smoke test] 121056: trouble: blocked/broken/pass

2018-03-22 Thread osstest service owner
flight 121056 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/121056/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf  broken
 build-armhf                   4 host-install(4)              broken REGR. vs. 121043

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl           1 build-check(1)               blocked  n/a
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  6161d9f27fcb6c48021e6928bb240dfa39d9f1d3
baseline version:
 xen  8df3821c08d024684a6c83659d8d794b565067f9

Last test of basis   121043  2018-03-21 21:04:22 Z    0 days
Testing same since   121056  2018-03-22 10:01:22 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Doug Goldstein 
  Jan Beulich 
  Joe Jin 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  broken  
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  blocked 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

broken-job build-armhf broken
broken-step build-armhf host-install(4)

Not pushing.

(No revision log; it would be 318 lines long.)


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Roger Pau Monné
On Thu, Mar 22, 2018 at 10:29:22PM +1000, Alexey G wrote:
> On Thu, 22 Mar 2018 09:57:16 +
> Roger Pau Monné  wrote:
> [...]
> >> Yes, and it is still needed as we have two distinct (and not equal)
> >> interfaces to PCI conf space. Apart from 0..FFh range overlapping
> >> they can be considered very different interfaces. And whether it is
> >> a real system or emulated -- we can use either one of these two
> >> interfaces or both.  
> >
> >The legacy PCI config space accesses and the MCFG config space access
> >are just different methods of accessing the PCI configuration space,
> >but the data _must_ be exactly the same. I don't see how a device
> >would care about where the access to the config space originated.
> 
> If they were different methods of accessing the same thing, they
> could've been used interchangeably. When we've got a PCI conf ioreq
> which has offset>100h we know we cannot just pass it to emulated
> CF8/CFC but have to emulate this specifically.

This is already not the best approach to dispatch PCI config space
access in QEMU. I think the interface in QEMU should be:

pci_conf_space_{read/write}(sbdf, register, size , data)

And this would go directly into the device. But I assume this involves
a non-trivial amount of work to be implemented. Hence xen-hvm.c usage
of the IO port access replay.
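
To make the shape of that interface concrete, here is a minimal sketch under stated assumptions. It is not QEMU code: `struct pci_dev_sketch`, `make_sbdf()` and the return-code convention are hypothetical, and the single-device lookup stands in for whatever dispatch QEMU would actually need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device model: one 4 KiB extended config space per device. */
#define CFG_SPACE_SIZE 4096u

struct pci_dev_sketch {
    uint32_t sbdf;                 /* segment:bus:device.function */
    uint8_t  cfg[CFG_SPACE_SIZE];  /* config space, little-endian layout */
};

/* Pack segment/bus/device/function into a single 32-bit SBDF value. */
static uint32_t make_sbdf(uint16_t seg, uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return ((uint32_t)seg << 16) | ((uint32_t)bus << 8) |
           ((uint32_t)dev << 3) | fn;
}

/* Returns nonzero if the access is well-formed and targets this device. */
static int access_ok(const struct pci_dev_sketch *d, uint32_t sbdf,
                     uint32_t reg, unsigned int size)
{
    return d->sbdf == sbdf && (size == 1 || size == 2 || size == 4) &&
           reg <= CFG_SPACE_SIZE - size;
}

/* Read 'size' bytes at 'reg'; 0 on success, -1 on a rejected access. */
static int pci_conf_space_read(const struct pci_dev_sketch *d, uint32_t sbdf,
                               uint32_t reg, unsigned int size, uint32_t *data)
{
    uint32_t val = 0;
    unsigned int i;

    if ( !access_ok(d, sbdf, reg, size) )
        return -1;
    for ( i = 0; i < size; i++ )
        val |= (uint32_t)d->cfg[reg + i] << (8 * i);
    *data = val;
    return 0;
}

/* Write the low 'size' bytes of 'data' at 'reg'; 0 on success, -1 otherwise. */
static int pci_conf_space_write(struct pci_dev_sketch *d, uint32_t sbdf,
                                uint32_t reg, unsigned int size, uint32_t data)
{
    unsigned int i;

    if ( !access_ok(d, sbdf, reg, size) )
        return -1;
    for ( i = 0; i < size; i++ )
        d->cfg[reg + i] = (uint8_t)(data >> (8 * i));
    return 0;
}
```

The point of the sketch is that the caller never says whether the access arrived via IO ports or MCFG; register offsets above FFh are handled by the same entry points.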

> >OK, so you don't want to reconstruct the access, fine.
> >
> >Then just inject it using pcie_mmcfg_data_{read/write} or some similar
> >wrapper. My suggestion was just to try to use the easier way to get
> >this injected into QEMU.
> 
> QEMU knows its position, the problem is that xen-hvm.c (ioreq
> processor) is rather isolated from MMCONFIG emulation.
> 
> If you check the pcie_mmcfg_data_read/write MMCONFIG handlers in QEMU,
> you can see this:
> 
> static uint64_t pcie_mmcfg_data_read(void *opaque, <...>
> {
> PCIExpressHost *e = opaque;
> ...
> 
> We know this 'opaque' when we do MMIO-style MMCONFIG handling as
> pcie_mmcfg_data_read/write are actual handlers.
> 
> But xen-hvm.c needs to gain access to PCIExpressHost out of nowhere,
> which is possible but considered a hack by QEMU. We can also insert
> some code to MMCONFIG emulation which will store info we need to some
> global variables to be used across wildly different and unrelated
> modules. It will work, but anyone who sees it will have bad thoughts on
> their mind.

Since you need to notify Xen of the MCFG area address, why not just store
the MCFG address while doing this operation? You could do this with a
helper in xen-hvm.c, and keep the variable local to that file.
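
A rough sketch of what such a helper could look like follows. The names `xen_sketch_set_mcfg_base()` and `xen_sketch_mcfg_addr()` are hypothetical and do not exist in QEMU; the address layout is the standard PCIe ECAM one (bus in bits 27:20, device in 19:15, function in 14:12, register in 11:0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* File-local state: recorded when the MCFG address is reported to Xen. */
static uint64_t mcfg_base;
static bool mcfg_valid;

/* Hypothetical hook, called at the point where Xen is told the MCFG base. */
static void xen_sketch_set_mcfg_base(uint64_t base)
{
    mcfg_base = base;
    mcfg_valid = true;
}

/*
 * Rebuild the MMIO address of a config register from its BDF and offset,
 * using the ECAM layout. Returns false if no MCFG base has been recorded.
 */
static bool xen_sketch_mcfg_addr(uint8_t bus, uint8_t dev, uint8_t fn,
                                 uint16_t reg, uint64_t *addr)
{
    assert(dev < 32 && fn < 8 && reg < 4096);

    if ( !mcfg_valid )
        return false;

    *addr = mcfg_base + (((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
                         ((uint64_t)fn << 12) | reg);
    return true;
}
```

With a base of 0xE0000000, bus 1, device 2, function 3 and register 104h map to 0xE0113104, with no need to reach into PCIExpressHost.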

In any case, this is a QEMU implementation detail. IMO the IOREQ
interface is clear and should not be bent like this just because
'this is easier to implement in QEMU'.

Roger.


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 09:57:16 +
Roger Pau Monné  wrote:
[...]
>> Yes, and it is still needed as we have two distinct (and not equal)
>> interfaces to PCI conf space. Apart from 0..FFh range overlapping
>> they can be considered very different interfaces. And whether it is
>> a real system or emulated -- we can use either one of these two
>> interfaces or both.  
>
>The legacy PCI config space accesses and the MCFG config space access
>are just different methods of accessing the PCI configuration space,
>but the data _must_ be exactly the same. I don't see how a device
>would care about where the access to the config space originated.

If they were different methods of accessing the same thing, they
could've been used interchangeably. When we've got a PCI conf ioreq
which has offset>100h we know we cannot just pass it to emulated
CF8/CFC but have to emulate this specifically.
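
The limitation can be seen from the conventional CONFIG_ADDRESS encoding written to port CF8h: the register field is only 8 bits wide, so offsets of 100h and above have nowhere to go. A small sketch, with `cf8_encode()` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the value written to port CF8h for a legacy config access:
 * bit 31 = enable, bits 23:16 = bus, 15:11 = device, 10:8 = function,
 * bits 7:2 = dword-aligned register. Returns false when the register
 * offset does not fit in the 8-bit field (i.e. extended config space).
 */
static bool cf8_encode(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg,
                       uint32_t *out)
{
    assert(dev < 32 && fn < 8);

    if ( reg > 0xff )
        return false;

    *out = 0x80000000u | ((uint32_t)bus << 16) | ((uint32_t)dev << 11) |
           ((uint32_t)fn << 8) | (reg & 0xfcu);
    return true;
}
```

An access to register 44h of 01:02.3 encodes as 0x80011344, whereas register 104h is rejected and has to be handled by some other path.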

>> For QEMU zero changes are needed to support MMCONFIG MMIO accesses if
>> they come as MMIO ioreqs. It's just what its MMCONFIG emulation code
>> expects.  
>
>As I said many times in this thread, you seem to be focused around
>what's best for QEMU only, and this is wrong. The IOREQ interface is
>used by QEMU, but it's also used by other device emulators.
>
>I get the feeling that you assume that the correct solution is the one
>that involves less changes to Xen and QEMU. This is simply not true.
>
>> Anyway, for (kind of vague) users of the multiple ioreq servers
>> capability we can enable MMIO translation to PCI conf ioreqs. Note
>> that actually this is an extra step, not forwarding trapped MMCONFIG
>> MMIO accesses to the selected device model as is.
>>  
>> >Getting both IOREQ_TYPE_PCI_CONFIG and IOREQ_TYPE_COPY for PCI
>> >config space access is misleading.  
>> 
>> These are very different accesses, both in transport and
>> capabilities. 
>> >In both cases Xen would have to do the MCFG access decoding in order
>> >to figure out which IOREQ server will handle the request. At which
>> >point the only step that you avoid is the reconstruction of the
>> >memory access from the IOREQ_TYPE_PCI_CONFIG which is trivial.  
>> 
>> The "reconstruction of the memory access" you mentioned won't be easy
>> actually. The thing is, address_space_read/write is not all what we
>> need.
>> 
>> In order to translate PCI conf ioreqs back to emulated MMIO ops, we
>> need to be an involved party, mainly to know where MMCONFIG area is
>> located so we can construct the address within its range from BDF.
>> This piece of information is destroyed in the process of MMIO ioreq
>> translation to PCI conf type.  
>
>QEMU certainly knows the position of the MCFG area (because it's the
>one that tells Xen about it), so I don't understand your concerns
>above.
>> The code which parses PCI conf ioreqs in xen-hvm.c doesn't know
>> anything about the current emulated MMCONFIG state. The correct way
>> to have this info is to participate in its emulation. As we don't
>> participate, we have no other way than trying to gain backdoor
>> access to PCIHost fields via things like object_resolve_*(). This
>> solution is cumbersome and ugly but will work... and may break
>> anytime due to changes in QEMU.   
>
>OK, so you don't want to reconstruct the access, fine.
>
>Then just inject it using pcie_mmcfg_data_{read/write} or some similar
>wrapper. My suggestion was just to try to use the easier way to get
>this injected into QEMU.

QEMU knows its position, the problem is that xen-hvm.c (ioreq
processor) is rather isolated from MMCONFIG emulation.

If you check the pcie_mmcfg_data_read/write MMCONFIG handlers in QEMU,
you can see this:

static uint64_t pcie_mmcfg_data_read(void *opaque, <...>
{
PCIExpressHost *e = opaque;
...

We know this 'opaque' when we do MMIO-style MMCONFIG handling as
pcie_mmcfg_data_read/write are actual handlers.

But xen-hvm.c needs to gain access to PCIExpressHost out of nowhere,
which is possible but considered a hack by QEMU. We can also insert
some code to MMCONFIG emulation which will store info we need to some
global variables to be used across wildly different and unrelated
modules. It will work, but anyone who sees it will have bad thoughts on
their mind.

>> QEMU maintainers will grin while looking at all this I'm afraid --
>> trapped MMIO accesses which are translated to PCI conf accesses which
>> in turn translated back to emulated MMIO accesses upon receiving,
>> along with tedious attempts to gain access to MMCONFIG-related info
>> as we're not invited to the MMCONFIG emulation party.
>>
>> The more I think about it, the more I like the existing
>> map_io_range_to_ioreq_server() approach. :( It works without doing
>> anything, no hacks, no new interfaces, both MMCONFIG and CF8/CFC are
>> working as expected. There is a problem to make it compatible with
>> the specific multiple ioreq servers feature, but providing a new
>> dmop/hypercall (which you suggest is a must have thing to trap
>> MMCONFIG MMIO to give QEMU only the free

Re: [Xen-devel] [for-4.11][PATCH v6 16/16] xen: Convert page_to_mfn and mfn_to_page to use typesafe MFN

2018-03-22 Thread Tim Deegan
Hi,

At 04:47 + on 21 Mar (1521607657), Julien Grall wrote:
> Most of the users of page_to_mfn and mfn_to_page are either overriding
> the macros to make them work with mfn_t or use mfn_x/_mfn because the
> rest of the function use mfn_t.
> 
> So make page_to_mfn and mfn_to_page return mfn_t by default. The __*
> version are now dropped as this patch will convert all the remaining
> non-typesafe callers.
> 
> Only reasonable clean-ups are done in this patch. The rest will use
> _mfn/mfn_x for the time being.
> 
> Lastly, domain_page_to_mfn is also converted to use mfn_t given that
> most of the callers are now switched to _mfn(domain_page_to_mfn(...)).
> 
> Signed-off-by: Julien Grall 
> Acked-by: Razvan Cojocaru 
> Reviewed-by: Paul Durrant 
> Reviewed-by: Boris Ostrovsky 
> Reviewed-by: Kevin Tian 
> Reviewed-by: Wei Liu 
> Acked-by: Jan Beulich 
> Reviewed-by: George Dunlap 

Thought I'd already acked this for the shadow code, but clearly not.
Sorry for the delay, and:

Acked-by: Tim Deegan 



Re: [Xen-devel] [PATCH v4 7/8] x86: also NOP out xen_cr3 restores of XPTI

2018-03-22 Thread Wei Liu
On Mon, Mar 19, 2018 at 07:41:11AM -0600, Jan Beulich wrote:
> ... despite quite likely the gain being rather limited.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Wei Liu 


Re: [Xen-devel] [PATCH v4 5/8] x86/XPTI: reduce .text.entry

2018-03-22 Thread Wei Liu
On Mon, Mar 19, 2018 at 07:40:12AM -0600, Jan Beulich wrote:
> This exposes less code pieces and at the same time reduces the range
> covered from slightly above 3 pages to a little below 2 of them.
> 
> The code being moved is unchanged, except for the removal of trailing
> blanks, insertion of blanks between operands, and a pointless q suffix
> from "retq".
> 
> A few more small pieces could be moved, but it seems better to me to
> leave them where they are to not make it overly hard to follow code
> paths.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Wei Liu 


Re: [Xen-devel] [PATCH v4 6/8] x86: enable interrupts earlier with XPTI disabled

2018-03-22 Thread Wei Liu
On Mon, Mar 19, 2018 at 07:40:50AM -0600, Jan Beulich wrote:
> The STI instances were moved (or added in the INT80 case) to meet TLB
> flush requirements. When XPTI is disabled, they can be put back where
> they were (or omitted in the INT80 case).
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Wei Liu 


[Xen-devel] [xen-unstable test] 120988: regressions - FAIL

2018-03-22 Thread osstest service owner
flight 120988 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120988/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf                   6 xen-build                    fail REGR. vs. 120943

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit2   1 build-check(1)               blocked  n/a
 build-armhf-libvirt           1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)               blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl           1 build-check(1)               blocked  n/a
 test-armhf-armhf-examine      1 build-check(1)               blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl-vhd       1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl-rtds      1 build-check(1)               blocked  n/a
 test-armhf-armhf-libvirt      1 build-check(1)               blocked  n/a
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop                   fail like 120859
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop                   fail like 120943
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop                   fail like 120943
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop                   fail like 120943
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop                   fail like 120943
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop                   fail like 120943
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop                   fail like 120943
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start                  fail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start                  fail   never pass
 test-amd64-i386-libvirt      13 migrate-support-check        fail   never pass
 test-amd64-i386-xl-pvshim    12 guest-start                  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check    fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check        fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2     fail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-xsm      14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop                   fail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install              fail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install              fail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install              fail   never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install              fail   never pass

version targeted for testing:
 xen  7a1358bbe73e5f749c3d2f53478dc1f30720f949
baseline version:
 xen  0012ae8afb4a6e76f2847119f2c6850fbf41d9b7

Last test of basis   120943  2018-03-18 21:56:54 Z    3 days
Testing same since   120988  2018-03-20 10:55:25 Z    2 days    1 attempts


People who touched revisions under test:
  Amit Singh Tomar 
  Julien Grall 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  fail
 build-i386   pass

Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Jan Beulich
>>> On 22.03.18 at 12:56,  wrote:
> I really don't understand why some people have that fear of emulated
> MMCONFIG -- it's really the same thing as any other MMIO range QEMU
> already emulates via map_io_range_to_ioreq_server(). No sensitive
> information exposed. It is related only to emulated PCI conf space which
> QEMU already knows about and use, providing emulated PCI devices for it.

You continue to ignore the routing requirement multiple ioreq
servers impose.

Jan



[Xen-devel] [PATCH v18 08/11] tools/libxenforeignmemory: add support for resource mapping

2018-03-22 Thread Paul Durrant
A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
resources for direct priv-mapping.

This patch adds new functionality into libxenforeignmemory to make use
of a new privcmd ioctl [1] that uses the new memory op to make such
resources available via mmap(2).

[1] 
http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712

Signed-off-by: Paul Durrant 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
---
Cc: Ian Jackson 

v4:
 - Fixed errno and removed single-use label
 - The unmap call now returns a status
 - Use C99 initialization for ioctl struct

v2:
 - Bump minor version up to 3.
---
 tools/include/xen-sys/Linux/privcmd.h  | 11 +
 tools/libs/foreignmemory/Makefile  |  2 +-
 tools/libs/foreignmemory/core.c| 53 ++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |  5 ++
 tools/libs/foreignmemory/linux.c   | 45 ++
 tools/libs/foreignmemory/private.h | 31 +
 7 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h 
b/tools/include/xen-sys/Linux/privcmd.h
index 732ff7c15a..9531b728f9 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -86,6 +86,15 @@ typedef struct privcmd_dm_op {
const privcmd_dm_op_buf_t __user *ubufs;
 } privcmd_dm_op_t;
 
+typedef struct privcmd_mmap_resource {
+   domid_t dom;
+   __u32 type;
+   __u32 id;
+   __u32 idx;
+   __u64 num;
+   __u64 addr;
+} privcmd_mmap_resource_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -103,5 +112,7 @@ typedef struct privcmd_dm_op {
_IOC(_IOC_NONE, 'P', 5, sizeof(privcmd_dm_op_t))
 #define IOCTL_PRIVCMD_RESTRICT \
_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
+#define IOCTL_PRIVCMD_MMAP_RESOURCE\
+   _IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/foreignmemory/Makefile 
b/tools/libs/foreignmemory/Makefile
index cbe815fce8..ee5c3fd67e 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 1
-MINOR= 2
+MINOR= 3
 SHLIB_LDFLAGS += -Wl,--version-script=libxenforeignmemory.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index 7c8562ae74..63f12e2450 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 
+#include 
+
 #include "private.h"
 
 static int all_restrict_cb(Xentoolcore__Active_Handle *ah, domid_t domid) {
@@ -135,6 +137,57 @@ int xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
 return osdep_xenforeignmemory_restrict(fmem, domid);
 }
 
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+unsigned int id, unsigned long frame, unsigned long nr_frames,
+void **paddr, int prot, int flags)
+{
+xenforeignmemory_resource_handle *fres;
+int rc;
+
+/* Check flags only contains POSIX defined values */
+if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
+{
+errno = EINVAL;
+return NULL;
+}
+
+fres = calloc(1, sizeof(*fres));
+if ( !fres )
+{
+errno = ENOMEM;
+return NULL;
+}
+
+fres->domid = domid;
+fres->type = type;
+fres->id = id;
+fres->frame = frame;
+fres->nr_frames = nr_frames;
+fres->addr = *paddr;
+fres->prot = prot;
+fres->flags = flags;
+
+rc = osdep_xenforeignmemory_map_resource(fmem, fres);
+if ( rc )
+{
+free(fres);
+fres = NULL;
+} else
+*paddr = fres->addr;
+
+return fres;
+}
+
+int xenforeignmemory_unmap_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+int rc = osdep_xenforeignmemory_unmap_resource(fmem, fres);
+
+free(fres);
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h 
b/tools/libs/foreignmemory/include/xenforeignmemory.h
index f4814c390f..d594be8df0 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -138,6 +138,47 @@ int xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
 int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
   domid_t domid);
 
+typedef struct xenforeignmemory_resource_handle 
xenforeignmemory_resource_handle;
+
+/**
+ * This function maps a guest resource.
+ *
+ * @parm fmem handle to the open foreignmemory interf

[Xen-devel] [PATCH v18 11/11] tools/libxenctrl: use new xenforeignmemory API to seed grant table

2018-03-22 Thread Paul Durrant
A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
  actually unnecessary, as the grant table has already been seeded
  by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().

Signed-off-by: Paul Durrant 
Acked-by: Marek Marczykowski-Górecki 
Reviewed-by: Roger Pau Monné 
Acked-by: Wei Liu 
---
Cc: Ian Jackson 

v18:
 - Trivial re-base.

v13:
 - Re-base.

v10:
 - Use new id constant for grant table.

v4:
 - Minor cosmetic fix suggested by Roger.

v3:
 - Introduced xc_dom_set_gnttab_entry() to avoid duplicated code.
---
 tools/libxc/include/xc_dom.h|   8 +--
 tools/libxc/xc_dom_boot.c   | 114 +---
 tools/libxc/xc_sr_restore_x86_hvm.c |  10 ++--
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxl/libxl_dom.c |   1 -
 tools/python/xen/lowlevel/xc/xc.c   |   6 +-
 6 files changed, 92 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 491cad8114..cee2ac9901 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -332,12 +332,8 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn,
 int xc_dom_boot_image(struct xc_dom_image *dom);
 int xc_dom_compat_check(struct xc_dom_image *dom);
 int xc_dom_gnttab_init(struct xc_dom_image *dom);
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, uint32_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   uint32_t console_domid,
-   uint32_t xenstore_domid);
-int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
+int xc_dom_gnttab_seed(xc_interface *xch, uint32_t guest_domid,
+   bool is_hvm,
xen_pfn_t console_gmfn,
xen_pfn_t xenstore_gmfn,
uint32_t console_domid,
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index 2e5681dc5d..8307ebeaf6 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -256,11 +256,29 @@ static xen_pfn_t xc_dom_gnttab_setup(xc_interface *xch, uint32_t domid)
 return gmfn;
 }
 
-int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   uint32_t console_domid,
-   uint32_t xenstore_domid)
+static void xc_dom_set_gnttab_entry(xc_interface *xch,
+grant_entry_v1_t *gnttab,
+unsigned int idx,
+uint32_t guest_domid,
+uint32_t backend_domid,
+xen_pfn_t backend_gmfn)
+{
+if ( guest_domid == backend_domid || backend_gmfn == -1)
+return;
+
+xc_dom_printf(xch, "%s: [%u] -> 0x%"PRI_xen_pfn,
+  __FUNCTION__, idx, backend_gmfn);
+
+gnttab[idx].flags = GTF_permit_access;
+gnttab[idx].domid = backend_domid;
+gnttab[idx].frame = backend_gmfn;
+}
+
+static int compat_gnttab_seed(xc_interface *xch, uint32_t domid,
+  xen_pfn_t console_gmfn,
+  xen_pfn_t xenstore_gmfn,
+  uint32_t console_domid,
+  uint32_t xenstore_domid)
 {
 
 xen_pfn_t gnttab_gmfn;
@@ -284,18 +302,10 @@ int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
 return -1;
 }
 
-if ( domid != console_domid  && console_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
-gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gmfn;
-}
-if ( domid != xenstore_domid && xenstore_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
-gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gmfn;
-}
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_CONSOLE,
+domid, console_domid, console_gmfn);
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_XENSTORE,
+domid, xenstore_domid, xenstore_gmfn);
 
 if ( munmap(gnttab, PAGE_SIZE) == -1 )
 {
@@ -313,11 +323,11 @@ int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
 return 0;
 }
 
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, uint32_t domid,
-   xen_pfn_t console_gpfn,
-

[Xen-devel] [PATCH v18 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2018-03-22 Thread Paul Durrant
This patch allows grant table frames to be mapped using the
XENMEM_acquire_resource memory op.

NOTE: This patch expands the on-stack mfn_list array in acquire_resource()
  but it is still small enough to remain on-stack.
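
The backwards iteration used by acquire_grant_table() in this patch can be sketched in isolation as follows. This is a toy model under stated assumptions: the demo_* names, the fake frame numbers and the grow counter are invented for the example, and the reading that walking from the highest index down lets the table grow once (rather than once per frame) is an interpretation of the "Iterate backwards in case table needs to grow" comment, not something the patch states.

```c
/* Toy grant table: only tracks how many frames exist. */
struct demo_table {
    unsigned int nr_frames;   /* frames currently allocated */
    unsigned int max_frames;  /* hard upper limit */
    unsigned int grow_calls;  /* how many times the table was grown */
};

/* Look up one frame, growing the table on demand up to max_frames. */
static int demo_get_frame(struct demo_table *t, unsigned long idx,
                          unsigned long *mfn)
{
    if (idx >= t->nr_frames && idx < t->max_frames) {
        t->nr_frames = (unsigned int)idx + 1;
        t->grow_calls++;
    }

    if (idx >= t->nr_frames)
        return -1;

    *mfn = 0x1000 + idx;  /* made-up frame number for the demo */
    return 0;
}

/* Fill mfn_list[] for frames [frame, frame + nr), highest index first. */
static int demo_acquire(struct demo_table *t, unsigned long frame,
                        unsigned int nr, unsigned long mfn_list[])
{
    unsigned int i = nr;

    while (i-- != 0) {
        int rc = demo_get_frame(t, frame + i, &mfn_list[i]);

        if (rc)
            return rc;
    }

    return 0;
}
```

In this model, a request for four frames against a one-frame table grows the table a single time, and a request reaching past max_frames fails on the very first lookup, before anything is changed.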

Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 

v18:
 - Non-trivial re-base of grant table code.
 - Dropped Jan's R-b because of the grant table changes.

v13:
 - Re-work the internals to avoid using the XENMAPIDX_grant_table_status
   hack.

v12:
 - Dropped limit checks as requested by Jan.

v10:
 - Addressed comments from Jan.

v8:
 - The functionality was originally incorporated into the earlier patch
   "x86/mm: add HYPERVISOR_memory_op to acquire guest resources".
---
 xen/common/grant_table.c  | 71 +++
 xen/common/memory.c   | 45 ++-
 xen/include/public/memory.h   |  9 --
 xen/include/xen/grant_table.h |  4 +++
 4 files changed, 113 insertions(+), 16 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 18201912e4..c8c3661b19 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3863,6 +3863,35 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 }
 #endif
 
+/* caller must hold read or write lock */
+static int gnttab_get_status_frame_mfn(struct domain *d,
+   unsigned long idx, mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+
+if ( idx >= nr_status_frames(gt) )
+return -EINVAL;
+
+*mfn = _mfn(virt_to_mfn(gt->status[idx]));
+return 0;
+}
+
+/* caller must hold write lock */
+static int gnttab_get_shared_frame_mfn(struct domain *d,
+   unsigned long idx, mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+
+if ( (idx >= nr_grant_frames(gt)) && (idx < gt->max_grant_frames) )
+gnttab_grow_table(d, idx + 1);
+
+if ( idx >= nr_grant_frames(gt) )
+return -EINVAL;
+
+*mfn = _mfn(virt_to_mfn(gt->shared_raw[idx]));
+return 0;
+}
+
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
  mfn_t *mfn)
 {
@@ -3880,21 +3909,11 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
 {
 idx &= ~XENMAPIDX_grant_table_status;
 status = true;
-if ( idx < nr_status_frames(gt) )
-*mfn = _mfn(virt_to_mfn(gt->status[idx]));
-else
-rc = -EINVAL;
-}
-else
-{
-if ( (idx >= nr_grant_frames(gt)) && (idx < gt->max_grant_frames) )
-gnttab_grow_table(d, idx + 1);
 
-if ( idx < nr_grant_frames(gt) )
-*mfn = _mfn(virt_to_mfn(gt->shared_raw[idx]));
-else
-rc = -EINVAL;
+rc = gnttab_get_status_frame_mfn(d, idx, mfn);
 }
+else
+rc = gnttab_get_shared_frame_mfn(d, idx, mfn);
 
 if ( !rc && paging_mode_translate(d) &&
  !gfn_eq(gnttab_get_frame_gfn(gt, status, idx), INVALID_GFN) )
@@ -3909,6 +3928,32 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
 return rc;
 }
 
+int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+grant_write_lock(gt);
+rc = gnttab_get_shared_frame_mfn(d, idx, mfn);
+grant_write_unlock(gt);
+
+return rc;
+}
+
+int gnttab_get_status_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+grant_read_lock(gt);
+rc = gnttab_get_status_frame_mfn(d, idx, mfn);
+grant_read_unlock(gt);
+
+return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
 int first = 1;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index c09ef179e8..bc570167bb 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -967,6 +968,43 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
 return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+static int acquire_grant_table(struct domain *d, unsigned int id,
+   unsigned long frame,
+   unsigned int nr_frames,
+   xen_pfn_t mfn_list[])
+{
+unsigned int i = nr_frames;
+
+/* Iterate backwards in case table needs to grow */
+while ( i-- != 0 )
+{
+mfn_t mfn = INVALID_MFN;
+int rc;
+
+switch ( id )
+{
+case XENMEM_resource_grant_table_id_shared:
+rc = gnttab_get_shared_frame(d, frame + i, &mfn);
+break;
+
+case XENMEM_resource_grant_table_id_status

[Xen-devel] [PATCH v3a 39/39] ARM: VGIC: wire new VGIC(-v2) files into Xen build system

2018-03-22 Thread Andre Przywara
Now that we have both the old VGIC prepared to cope with a sibling and
the code for the new VGIC in place, let's add a Kconfig option to enable
the new code and wire it into the Xen build system.
This will add a compile-time option to use either the "old" or the "new"
VGIC.
At the moment this is restricted to a vGIC-v2. To make the build system
happy, we provide a temporary dummy implementation of
vgic_v3_setup_hw() to allow building for now.

Signed-off-by: Andre Przywara 
---
Changelog v3 ... v3a:
- print panic when trying to run on GICv3 hardware

Changelog v2 ... v3:
- fix indentation of Kconfig entry
- select NEEDS_LIST_SORT
- drop unconditional list_sort.o inclusion

Changelog v1 ... v2:
- add Kconfig help text
- use separate Makefile in vgic/ directory
- protect compilation without GICV3 support
- always include list_sort() in build

 xen/arch/arm/Kconfig   | 18 +-
 xen/arch/arm/Makefile  |  5 -
 xen/arch/arm/vgic/Makefile |  5 +
 xen/arch/arm/vgic/vgic.c   | 11 +++
 4 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/vgic/Makefile

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 2782ee6589..8174c0c635 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -48,7 +48,23 @@ config HAS_GICV3
 config HAS_ITS
 bool
 prompt "GICv3 ITS MSI controller support" if EXPERT = "y"
-depends on HAS_GICV3
+depends on HAS_GICV3 && !NEW_VGIC
+
+config NEW_VGIC
+   bool
+   prompt "Use new VGIC implementation"
+   select NEEDS_LIST_SORT
+   ---help---
+
+   This is an alternative implementation of the ARM GIC interrupt
+   controller emulation, based on the Linux/KVM VGIC. It has a better
+   design and fixes many shortcomings of the existing GIC emulation in
+   Xen. It will eventually replace the existing/old VGIC.
+   However at the moment it lacks support for Dom0 using the ITS for
+   using MSIs.
+   Say Y if you want to help testing this new code or if you experience
+   problems with the standard emulation.
+   At the moment this implementation is not security supported.
 
 config SBSA_VUART_CONSOLE
bool "Emulated SBSA UART console support"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 41d7366527..a9533b107e 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -16,7 +16,6 @@ obj-y += domain_build.o
 obj-y += domctl.o
 obj-$(EARLY_PRINTK) += early_printk.o
 obj-y += gic.o
-obj-y += gic-vgic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
 obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
@@ -47,10 +46,14 @@ obj-y += sysctl.o
 obj-y += time.o
 obj-y += traps.o
 obj-y += vcpreg.o
+subdir-$(CONFIG_NEW_VGIC) += vgic
+ifneq ($(CONFIG_NEW_VGIC),y)
+obj-y += gic-vgic.o
 obj-y += vgic.o
 obj-y += vgic-v2.o
 obj-$(CONFIG_HAS_GICV3) += vgic-v3.o
 obj-$(CONFIG_HAS_ITS) += vgic-v3-its.o
+endif
 obj-y += vm_event.o
 obj-y += vtimer.o
 obj-$(CONFIG_SBSA_VUART_CONSOLE) += vpl011.o
diff --git a/xen/arch/arm/vgic/Makefile b/xen/arch/arm/vgic/Makefile
new file mode 100644
index 00..806826948e
--- /dev/null
+++ b/xen/arch/arm/vgic/Makefile
@@ -0,0 +1,5 @@
+obj-y += vgic.o
+obj-y += vgic-v2.o
+obj-y += vgic-mmio.o
+obj-y += vgic-mmio-v2.o
+obj-y += vgic-init.o
diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index f9a5088285..ac18cab6f3 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -981,6 +981,17 @@ unsigned int vgic_max_vcpus(const struct domain *d)
 return min_t(unsigned int, MAX_VIRT_CPUS, vgic_vcpu_limit);
 }
 
+#ifdef CONFIG_HAS_GICV3
+/* Dummy implementation to allow building without actual vGICv3 support. */
+void vgic_v3_setup_hw(paddr_t dbase,
+  unsigned int nr_rdist_regions,
+  const struct rdist_region *regions,
+  unsigned int intid_bits)
+{
+panic("New VGIC implementation does not yet support GICv3.");
+}
+#endif
+
 /*
  * Local variables:
  * mode: C
-- 
2.14.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3a 03/39] ARM: GIC: Allow tweaking the active and pending state of an IRQ

2018-03-22 Thread Andre Przywara
When playing around with hardware mapped, level triggered virtual IRQs,
there is the need to explicitly set the active or pending state of an
interrupt at some point.
To prepare the GIC for that, we introduce a set_active_state() and a
set_pending_state() function to let the VGIC manipulate the state of
an associated hardware IRQ.
This takes care of properly setting the _IRQ_INPROGRESS bit.
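
As a rough illustration of what the two new hooks boil down to on GICv2, the sketch below computes which distributor register and which bit get written for a given IRQ, following the gicv2_poke_irq() arithmetic visible in the diff context. It is a standalone model, not Xen code: the demo_* names are made up, the register offsets are the architectural GICD_ISACTIVER/GICD_ICACTIVER offsets as I understand them, and the boolean field stands in for the _IRQ_INPROGRESS status bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Distributor register bank offsets (assumed architectural values). */
#define DEMO_GICD_ISACTIVER 0x300u  /* set-active bank */
#define DEMO_GICD_ICACTIVER 0x380u  /* clear-active bank */

struct demo_poke {
    uint32_t reg_offset;  /* byte offset of the 32-bit register to write */
    uint32_t mask;        /* single bit selecting the IRQ in that register */
};

struct demo_irq {
    uint32_t irq;
    bool inprogress;      /* stand-in for the _IRQ_INPROGRESS bit */
};

/* One bit per IRQ, 32 IRQs per 4-byte register. */
static struct demo_poke demo_poke_irq(uint32_t irq, uint32_t bank)
{
    struct demo_poke p;

    p.reg_offset = bank + (irq / 32) * 4;
    p.mask = 1U << (irq % 32);
    return p;
}

/* Force the active state and keep the software flag in sync with it. */
static struct demo_poke demo_set_active_state(struct demo_irq *d, bool active)
{
    d->inprogress = active;
    return demo_poke_irq(d->irq,
                         active ? DEMO_GICD_ISACTIVER : DEMO_GICD_ICACTIVER);
}
```

For IRQ 42, for example, the model selects the second register of the bank (offset + 4) and bit 10.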

Signed-off-by: Andre Przywara 
---
Changelog v3 ... v3a:
- always set/clear _IRQ_INPROGRESS bit (not only for guest IRQs)
- add comments

Changelog v2 ... v3:
- extend comments to note preliminary nature of vgic_get_lpi()

Changelog v1 ... v2:
- reorder header file inclusion

 xen/arch/arm/gic-v2.c | 41 +
 xen/arch/arm/gic-v3.c | 37 +
 xen/include/asm-arm/gic.h | 24 
 3 files changed, 102 insertions(+)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index aa0fc6c1a1..7374686235 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -243,6 +243,45 @@ static void gicv2_poke_irq(struct irq_desc *irqd, uint32_t offset)
 writel_gicd(1U << (irqd->irq % 32), offset + (irqd->irq / 32) * 4);
 }
 
+/*
+ * This is forcing the active state of an interrupt, somewhat circumventing
+ * the normal interrupt flow and the GIC state machine. So use with care
+ * and only if you know what you are doing. For this reason we also have to
+ * tinker with the _IRQ_INPROGRESS bit here, since the normal IRQ handler
+ * will not be involved.
+ */
+static void gicv2_set_active_state(struct irq_desc *irqd, bool active)
+{
+ASSERT(spin_is_locked(&irqd->lock));
+
+if ( active )
+{
+set_bit(_IRQ_INPROGRESS, &irqd->status);
+gicv2_poke_irq(irqd, GICD_ISACTIVER);
+}
+else
+{
+clear_bit(_IRQ_INPROGRESS, &irqd->status);
+gicv2_poke_irq(irqd, GICD_ICACTIVER);
+}
+}
+
+static void gicv2_set_pending_state(struct irq_desc *irqd, bool pending)
+{
+ASSERT(spin_is_locked(&irqd->lock));
+
+if ( pending )
+{
+/* The _IRQ_INPROGRESS bit will be set when the interrupt fires. */
+gicv2_poke_irq(irqd, GICD_ISPENDR);
+}
+else
+{
+/* The _IRQ_INPROGRESS remains unchanged. */
+gicv2_poke_irq(irqd, GICD_ICPENDR);
+}
+}
+
 static void gicv2_set_irq_type(struct irq_desc *desc, unsigned int type)
 {
 uint32_t cfg, actual, edgebit;
@@ -1278,6 +1317,8 @@ const static struct gic_hw_operations gicv2_ops = {
 .eoi_irq = gicv2_eoi_irq,
 .deactivate_irq  = gicv2_dir_irq,
 .read_irq= gicv2_read_irq,
+.set_active_state= gicv2_set_active_state,
+.set_pending_state   = gicv2_set_pending_state,
 .set_irq_type= gicv2_set_irq_type,
 .set_irq_priority= gicv2_set_irq_priority,
 .send_SGI= gicv2_send_SGI,
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index cb41844af2..a5105ac9e7 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -477,6 +477,41 @@ static unsigned int gicv3_read_irq(void)
 return irq;
 }
 
+/*
+ * This is forcing the active state of an interrupt, somewhat circumventing
+ * the normal interrupt flow and the GIC state machine. So use with care
+ * and only if you know what you are doing. For this reason we also have to
+ * tinker with the _IRQ_INPROGRESS bit here, since the normal IRQ handler
+ * will not be involved.
+ */
+static void gicv3_set_active_state(struct irq_desc *irqd, bool active)
+{
+ASSERT(spin_is_locked(&irqd->lock));
+
+if ( active )
+{
+set_bit(_IRQ_INPROGRESS, &irqd->status);
+gicv3_poke_irq(irqd, GICD_ISACTIVER, false);
+}
+else
+{
+clear_bit(_IRQ_INPROGRESS, &irqd->status);
+gicv3_poke_irq(irqd, GICD_ICACTIVER, false);
+}
+}
+
+static void gicv3_set_pending_state(struct irq_desc *irqd, bool pending)
+{
+ASSERT(spin_is_locked(&irqd->lock));
+
+if ( pending )
+/* The _IRQ_INPROGRESS bit will be set when the interrupt fires. */
+gicv3_poke_irq(irqd, GICD_ISPENDR, false);
+else
+/* The _IRQ_INPROGRESS bit will remain unchanged. */
+gicv3_poke_irq(irqd, GICD_ICPENDR, false);
+}
+
 static inline uint64_t gicv3_mpidr_to_affinity(int cpu)
 {
  uint64_t mpidr = cpu_logical_map(cpu);
@@ -1769,6 +1804,8 @@ static const struct gic_hw_operations gicv3_ops = {
 .eoi_irq = gicv3_eoi_irq,
 .deactivate_irq  = gicv3_dir_irq,
 .read_irq= gicv3_read_irq,
+.set_active_state= gicv3_set_active_state,
+.set_pending_state   = gicv3_set_pending_state,
 .set_irq_type= gicv3_set_irq_type,
 .set_irq_priority= gicv3_set_irq_priority,
 .send_SGI= gicv3_send_sgi,
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 3079387e06..2aca243ac3 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic

[Xen-devel] [PATCH v3a 14/39] ARM: new VGIC: Add GICv2 world switch backend

2018-03-22 Thread Andre Przywara
Processing maintenance interrupts and accessing the list registers
are dependent on the host's GIC version.
Introduce vgic-v2.c to contain GICv2 specific functions.
Implement the GICv2 specific code for syncing the emulation state
into the VGIC registers.
This also adds the hook to let Xen set up the host GIC addresses.

This is based on Linux commit 140b086dd197, written by Marc Zyngier.
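
The folding rules listed in the comment above vgic_v2_fold_lr_state() in this patch (active bit copied as is; pending bit copied for edge IRQs, resampled from the line level for level IRQs) can be written down as a tiny state update. This is a deliberately simplified sketch taken literally from that comment: the demo_* types are hypothetical, and the real function additionally deals with locking, hardware-mapped IRQs and the list register encoding, none of which is modelled here.

```c
#include <stdbool.h>

/* State bits as read back from one list register (simplified). */
struct demo_lr {
    bool active;
    bool pending;
};

/* Software view of one virtual IRQ (simplified). */
struct demo_virq {
    bool level_triggered;  /* false means edge-triggered */
    bool line_level;       /* current input level, level IRQs only */
    bool active;
    bool pending;
};

/* Transfer the list register state back into the software IRQ state. */
static void demo_fold_lr(struct demo_virq *irq, const struct demo_lr *lr)
{
    /* The active bit is transferred as is. */
    irq->active = lr->active;

    if (irq->level_triggered)
        irq->pending = irq->line_level;  /* resample the line */
    else
        irq->pending = lr->pending;      /* edge: transferred as is */
}
```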

Signed-off-by: Andre Przywara 
---
Changelog v3 ... v3a:
- take hardware IRQ lock in vgic_v2_fold_lr_state()
- fix last remaining u32 usage
- print message when using new VGIC
- add TODO about racy _IRQ_INPROGRESS setting

Changelog v2 ... v3:
- remove no longer needed asm/io.h header
- replace 0/1 with false/true for bool's
- clear _IRQ_INPROGRESS bit when retiring hardware mapped IRQ
- fix indentation and w/s issues

Changelog v1 ... v2:
- remove v2 specific underflow function (now generic)
- re-add Linux code to properly handle acked level IRQs

 xen/arch/arm/vgic/vgic-v2.c | 259 
 xen/arch/arm/vgic/vgic.c|   6 +
 xen/arch/arm/vgic/vgic.h|   9 ++
 3 files changed, 274 insertions(+)
 create mode 100644 xen/arch/arm/vgic/vgic-v2.c

diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
new file mode 100644
index 00..1773503cfb
--- /dev/null
+++ b/xen/arch/arm/vgic/vgic-v2.c
@@ -0,0 +1,259 @@
+/*
+ * Copyright (C) 2015, 2016 ARM Ltd.
+ * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vgic.h"
+
+static struct {
+bool enabled;
+paddr_t dbase;  /* Distributor interface address */
+paddr_t cbase;  /* CPU interface address & size */
+paddr_t csize;
+paddr_t vbase;  /* Virtual CPU interface address */
+
+/* Offset to add to get an 8kB contiguous region if GIC is aliased */
+uint32_t aliased_offset;
+} gic_v2_hw_data;
+
+void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
+  paddr_t vbase, uint32_t aliased_offset)
+{
+gic_v2_hw_data.enabled = true;
+gic_v2_hw_data.dbase = dbase;
+gic_v2_hw_data.cbase = cbase;
+gic_v2_hw_data.csize = csize;
+gic_v2_hw_data.vbase = vbase;
+gic_v2_hw_data.aliased_offset = aliased_offset;
+
+printk("Using the new VGIC implementation.\n");
+}
+
+/*
+ * transfer the content of the LRs back into the corresponding ap_list:
+ * - active bit is transferred as is
+ * - pending bit is
+ *   - transferred as is in case of edge sensitive IRQs
+ *   - set to the line-level (resample time) for level sensitive IRQs
+ */
+void vgic_v2_fold_lr_state(struct vcpu *vcpu)
+{
+struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic;
+unsigned int used_lrs = vcpu->arch.vgic.used_lrs;
+unsigned long flags;
+unsigned int lr;
+
+if ( !used_lrs )/* No LRs used, so nothing to sync back here. */
+return;
+
+gic_hw_ops->update_hcr_status(GICH_HCR_UIE, false);
+
+for ( lr = 0; lr < used_lrs; lr++ )
+{
+struct gic_lr lr_val;
+uint32_t intid;
+struct vgic_irq *irq;
+struct irq_desc *desc = NULL;
+bool have_desc_lock = false;
+
+gic_hw_ops->read_lr(lr, &lr_val);
+
+/*
+ * TODO: Possible optimization to avoid reading LRs:
+ * Read the ELRSR to find out which of our LRs have been cleared
+ * by the guest. We just need to know the IRQ number for those, which
+ * we could save in an array when populating the LRs.
+ * This trades one MMIO access (ELRSR) for possibly more than one (LRs),
+ * but requires some more code to save the IRQ number and to handle
+ * those finished IRQs according to the algorithm below.
+ * We need some numbers to justify this: chances are that we don't
+ * have many LRs in use most of the time, so we might not save much.
+ */
+gic_hw_ops->clear_lr(lr);
+
+intid = lr_val.virq;
+irq = vgic_get_irq(vcpu->domain, vcpu, intid);
+
+local_irq_save(flags);
+spin_lock(&irq->irq_lock);
+
+/* The locking order forces us to drop and re-take the locks here. */
+if ( irq->hw )
+{
+spin_unlock(&irq->irq_lock);
+
+desc = irq_to_desc(irq->hwintid);
+spin_lock(&desc->lock);
+spin_lock(&ir

[Xen-devel] [PATCH v3a 00/39] (0/3) Fixups for the new VGIC(-v2) implementation

2018-03-22 Thread Andre Przywara
Hi,

this is just an update of the three patches which didn't get any review
tags so far.
The fixes for the new versions of 03/39 and 39/39 are pretty
straightforward, but 14/39 is more of a beast. I sent a diff to the original
patch [1] separately to give an idea of the changes.

I added the R-b: and A-b: tags along with the NIT fixes to my tree and
will later push a branch with those tags and these fixes here in a somewhat
final version.
Look out for the vgic-new/v3a branch appearing at
http://www.linux-arm.org/git?p=xen-ap.git

Cheers,
Andre

[1] https://lists.xen.org/archives/html/xen-devel/2018-03/msg02680.html

---
Changelog v3 ... v3a: (copied from the patches' changelog)
03/39:
- always set/clear _IRQ_INPROGRESS bit (not only for guest IRQs)
- add comments
14/39:
- take hardware IRQ lock in vgic_v2_fold_lr_state()
- fix last remaining u32 usage
- print message when using new VGIC
- add TODO about racy _IRQ_INPROGRESS setting
39/39:
- print panic when trying to run on GICv3 hardware


Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring

2018-03-22 Thread Alexey G
On Thu, 22 Mar 2018 10:06:09 +
Paul Durrant  wrote:

>> -Original Message-
>> From: Alexey G [mailto:x19...@gmail.com]
>> Sent: 22 March 2018 09:55
>> To: Jan Beulich 
>> Cc: Andrew Cooper ; Anthony Perard
>> ; Ian Jackson ;
>> Paul Durrant ; Roger Pau Monne
>> ; Wei Liu ; Stefano
>> Stabellini ; xen-devel@lists.xenproject.org
>> Subject: Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate
>> MMCONFIG area in the MMIO hole + minor code refactoring
>> 
>> On Thu, 22 Mar 2018 03:04:16 -0600
>> "Jan Beulich"  wrote:
>>   
>>  On 22.03.18 at 01:31,  wrote:  
>> >> On Wed, 21 Mar 2018 17:06:28 +
>> >> Paul Durrant  wrote:
>> >> [...]  
>>  Well, this might work actually. Although the overall scenario
>>  will be overcomplicated a bit for _PCI_CONFIG ioreqs. Here is
>>  how it will look:
>> 
>>  QEMU receives PCIEXBAR update -> calls the new dmop to tell
>>  Xen  
>> new  
>>  MMCONFIG address/size -> Xen (re)maps MMIO trapping area ->  
>> someone  
>>  is
>>  accessing this area -> Xen intercepts this MMIO access
>> 
>>  But here's what happens next:
>> 
>>  Xen translates MMIO access into PCI_CONFIG and sends it to DM ->
>>  DM receives _PCI_CONFIG ioreq -> DM translates BDF/addr info
>>  back to the offset in emulated MMCONFIG range -> DM calls
>>  address_space_read/write to trigger MMIO emulation
>>   
>> >>>
>> >>>That would only be true of a dm that cannot handle PCI config
>> >>>ioreqs directly.  
>> >>
>> >> It's just a bit problematic for xen-hvm.c (Xen ioreq processor in
>> >> QEMU).
>> >>
>> >> It receives these PCI conf ioreqs out of any context. To
>> >> workaround this, existing code issues I/O to emulated CF8h/CFCh
>> >> ports in order to allow QEMU to find their target. But we can't
>> >> use the same method for MMCONFIG accesses -- this works for basic
>> >> PCI conf space only.  
>> >
>> >I think you want to view this the other way around: No physical
>> >device would ever get to see MMCFG accesses (or CF8/CFC port
>> >ones). This same layering is what we should have in the
>> >virtualized case.  
>> 
>> We have a purely virtual layout of the PCI bus, along with a virtual,
>> emulated MMCONFIG that is completely unrelated to the host's -- so what's
>> exposed? This emulated MMCONFIG is simply a supplement to the virtual PCI
>> bus, and its layout corresponds to the virtual PCI bus the guest/QEMU see.
>> 
>> It's QEMU who controls chipset-specific PCIEXBAR emulation and knows
>> about MMCONFIG position and size.  
>
>...and I think that is the wrong solution for Xen. We only use QEMU as
>an emulator for peripheral devices; we should not be using it for this
>kind of emulation... that should be brought into the hypervisor.
>
>> QEMU informs Xen about where it is,  
>
>No. Xen should not care where QEMU wants to put it because the MMIO
>emulations should not even reach QEMU.

QEMU does a lot of MMIO emulation, so what's so special about the emulated
MMCONFIG? It has absolutely nothing to do with the host's MMCONFIG, neither
in address/size nor in internal layout. None of the host's
MMCONFIG-related facilities are touched in any way. It is a purely virtual
thing.

I really don't understand why some people have that fear of an emulated
MMCONFIG -- it's really the same thing as any other MMIO range QEMU
already emulates via map_io_range_to_ioreq_server(). No sensitive
information is exposed. It is related only to the emulated PCI conf space, which
QEMU already knows about and uses, providing emulated PCI devices for it.
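
For reference, the BDF/register to MMCONFIG offset translation being argued about in this thread is the standard PCIe ECAM layout: 8 bits of bus, 5 bits of device, 3 bits of function and a 12-bit register offset. A minimal sketch of both directions, with hypothetical demo_* names that belong to neither Xen nor QEMU:

```c
#include <stdint.h>

/* One PCI config space address. */
struct demo_bdf {
    uint8_t bus;   /* 0..255 */
    uint8_t dev;   /* 0..31 */
    uint8_t fn;    /* 0..7 */
    uint16_t reg;  /* 0..4095, byte offset into the function's config space */
};

/* BDF + register -> byte offset from the start of the MMCONFIG window. */
static uint32_t demo_ecam_offset(struct demo_bdf a)
{
    return ((uint32_t)a.bus << 20) |
           ((uint32_t)(a.dev & 0x1f) << 15) |
           ((uint32_t)(a.fn & 0x7) << 12) |
           (uint32_t)(a.reg & 0xfff);
}

/* Byte offset inside the MMCONFIG window -> BDF + register. */
static struct demo_bdf demo_ecam_decode(uint32_t off)
{
    struct demo_bdf a;

    a.bus = (uint8_t)((off >> 20) & 0xff);
    a.dev = (uint8_t)((off >> 15) & 0x1f);
    a.fn  = (uint8_t)((off >> 12) & 0x7);
    a.reg = (uint16_t)(off & 0xfff);
    return a;
}
```

Since the two functions are exact inverses of each other, the same access can be described either as an MMCONFIG offset or as a BDF/register pair.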

>   Paul
>
>> in order to receive events about R/W accesses to this emulated area
>> -- so, why he should receive these events in a form of PCI conf
>> BDF/reg and not simply as MMCONFIG offset directly if it is
>> basically the same thing?  



[Xen-devel] [PATCH v18 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2018-03-22 Thread Paul Durrant
Certain memory resources associated with a guest are not necessarily
present in the guest P2M.

This patch adds the boilerplate for a new memory op to allow such a resource
to be priv-mapped directly, by either a PV or HVM tools domain.

NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
  I have no means to test it on an ARM platform and so cannot verify
  that it functions correctly.

Signed-off-by: Paul Durrant 
Acked-by: Daniel De Graaf 
---
Cc: Jan Beulich 
Cc: George Dunlap 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Julien Grall 

v18:
 - Allow the resource page owner to be specified by a returned flag.
 - Drop Jan's R-b due to change.

v14:
 - Addressed more comments from Jan.

v13:
 - Use xen_pfn_t for mfn_list.
 - Addressed further comments from Jan and Julien.

v12:
 - Addressed more comments from Jan.
 - Removed #ifdef CONFIG_X86 from common code and instead introduced a
   stub set_foreign_p2m_entry() in asm-arm/p2m.h returning -EOPNOTSUPP.
 - Restricted mechanism for querying implementation limit on nr_frames
   and simplified compat code.

v11:
 - Addressed more comments from Jan.

v9:
 - Addressed more comments from Jan.

v8:
 - Move the code into common as requested by Jan.
 - Make the gmfn_list handle a 64-bit type to avoid limiting the MFN
   range for a 32-bit tools domain.
 - Add missing pad.
 - Add compat code.
 - Make this patch deal with purely boilerplate.
 - Drop George's A-b and Wei's R-b because the changes are non-trivial,
   and update Cc list now the boilerplate is common.

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
---
 tools/flask/policy/modules/xen.if   |   4 +-
 xen/arch/x86/mm/p2m.c   |   3 +-
 xen/common/compat/memory.c  | 100 
 xen/common/memory.c |  93 +
 xen/include/asm-arm/p2m.h   |  10 
 xen/include/asm-x86/p2m.h   |   3 ++
 xen/include/public/memory.h |  55 +++-
 xen/include/xlat.lst|   1 +
 xen/include/xsm/dummy.h |   6 +++
 xen/include/xsm/xsm.h   |   6 +++
 xen/xsm/dummy.c |   1 +
 xen/xsm/flask/hooks.c   |   6 +++
 xen/xsm/flask/policy/access_vectors |   2 +
 13 files changed, 286 insertions(+), 4 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 459880bb01..7aefd0061e 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -52,7 +52,8 @@ define(`create_domain_common', `
settime setdomainhandle getvcpucontext set_misc_info };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-   psr_cmt_op psr_alloc soft_reset set_gnttab_limits };
+   psr_cmt_op psr_alloc soft_reset set_gnttab_limits
+   resource_map };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
@@ -152,6 +153,7 @@ define(`device_model', `
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
allow $1 $2_target:hvm { getparam setparam hvmctl dm };
+   allow $1 $2_target:domain2 resource_map;
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 48e50fb5d8..55693eba59 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1132,8 +1132,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
- mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
 return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 35bb259808..13fd64ddf5 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -71,6 +71,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 struct xen_remove_from_physmap *xrfp;
 struct xen_vnuma_topology_info *vnuma;
 struct xen_mem_access_op *mao;
+struct xen_mem_acquire_resource *mar;
 } nat;
 union {
 struct compat_memory_reservation rsrv;
@@ -79,6 +80,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 struct compat_add_t

[Xen-devel] [PATCH v18 03/11] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page

2018-03-22 Thread Paul Durrant
This patch adjusts the ioreq server code to use type-safe gfn_t values
where possible. No functional change.
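
To show what "type-safe" buys here, the sketch below wraps a frame number in a single-member struct so that it can no longer be mixed up with a plain integer, and then re-creates the shape of the bitmask allocator touched by this patch. It is a simplified stand-in and not Xen's definitions: the demo_* helpers merely mirror the roles of _gfn(), gfn_x(), gfn_eq() and INVALID_GFN, and the allocator operates on a caller-supplied mask instead of the domain structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* A distinct type: passing a bare unsigned long is now a compile error. */
typedef struct {
    unsigned long gfn;
} demo_gfn_t;

static inline demo_gfn_t demo_mk_gfn(unsigned long n)
{
    demo_gfn_t g = { n };
    return g;
}

static inline unsigned long demo_gfn_x(demo_gfn_t g)
{
    return g.gfn;
}

static inline bool demo_gfn_eq(demo_gfn_t a, demo_gfn_t b)
{
    return a.gfn == b.gfn;
}

#define DEMO_INVALID_GFN demo_mk_gfn(~0UL)

/* Hand out base + i for the lowest set bit i in *mask and clear that bit. */
static demo_gfn_t demo_alloc_gfn(unsigned long base, unsigned long *mask)
{
    size_t i;

    for (i = 0; i < sizeof(*mask) * 8; i++) {
        if (*mask & (1UL << i)) {
            *mask &= ~(1UL << i);
            return demo_mk_gfn(base + i);
        }
    }

    return DEMO_INVALID_GFN;
}
```

Callers compare against the invalid value with demo_gfn_eq() and unwrap with demo_gfn_x() only at the boundary to code that still takes raw integers, which is the same pattern the diff applies around prepare_ring_for_helper().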

Signed-off-by: Paul Durrant 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 

v18:
 - Trivial re-base.
---
 xen/arch/x86/hvm/ioreq.c | 46 
 xen/include/asm-x86/hvm/domain.h |  2 +-
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index bd141db0d5..d5f0e24b98 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,7 +210,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
 struct domain *d = s->target;
 unsigned int i;
@@ -220,20 +220,19 @@ static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-return d->arch.hvm_domain.ioreq_gfn.base + i;
+return _gfn(d->arch.hvm_domain.ioreq_gfn.base + i);
 }
 
-return gfn_x(INVALID_GFN);
+return INVALID_GFN;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
-   unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
 {
 struct domain *d = s->target;
-unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
+unsigned int i = gfn_x(gfn) - d->arch.hvm_domain.ioreq_gfn.base;
 
 ASSERT(!IS_DEFAULT(s));
-ASSERT(gfn != gfn_x(INVALID_GFN));
+ASSERT(!gfn_eq(gfn, INVALID_GFN));
 
 set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
@@ -242,7 +241,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
 destroy_ring_for_helper(&iorp->va, iorp->page);
@@ -251,7 +250,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 if ( !IS_DEFAULT(s) )
 hvm_free_ioreq_gfn(s, iorp->gfn);
 
-iorp->gfn = gfn_x(INVALID_GFN);
+iorp->gfn = INVALID_GFN;
 }
 
 static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
@@ -264,16 +263,17 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return -EINVAL;
 
 if ( IS_DEFAULT(s) )
-iorp->gfn = buf ?
-d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
-d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+iorp->gfn = _gfn(buf ?
+ d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+ d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN]);
 else
 iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return -ENOMEM;
 
-rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+rc = prepare_ring_for_helper(d, gfn_x(iorp->gfn), &iorp->page,
+ &iorp->va);
 
 if ( rc )
 hvm_unmap_ioreq_gfn(s, buf);
@@ -309,10 +309,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server 
*s, bool buf)
 struct domain *d = s->target;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
-if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
+if ( guest_physmap_remove_page(d, iorp->gfn,
_mfn(page_to_mfn(iorp->page)), 0) )
 domain_crash(d);
 clear_page(iorp->va);
@@ -324,15 +324,15 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return 0;
 
 clear_page(iorp->va);
 
-rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
+rc = guest_physmap_add_page(d, iorp->gfn,
 _mfn(page_to_mfn(iorp->page)), 0);
 if ( rc == 0 )
-paging_mark_pfn_dirty(d, _pfn(iorp->gfn));
+paging_mark_pfn_dirty(d, _pfn(gfn_x(iorp->gfn)));
 
 return rc;
 }
@@ -595,8 +595,8 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 INIT_LIST_HEAD(&s->ioreq_vcpu_list);
 spin_lock_init(&s->bufioreq_lock);
 
-s->ioreq.gfn = gfn_x(INVALID_GFN);
-s->bufioreq.gfn = gfn_x(INVALID_GFN);
+s->ioreq.gfn = INVALID_GFN;
+s->bufioreq.gfn = INVALID_GFN;
 
 rc = hvm_ioreq_server_alloc_rangesets(s, id);
 if ( rc

[Xen-devel] [PATCH v18 09/11] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint

2018-03-22 Thread Paul Durrant
By using a static inline stub in private.h for OSes where this functionality
is not implemented, the various duplicate stubs in the OS-specific source
modules can be avoided.
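
The pattern can be sketched roughly as follows. All names here (struct fmem_handle, EXAMPLE_OS_HAS_RESTRICT, example_osdep_restrict, example_check) are hypothetical and chosen for illustration; only the stub body follows the patch, which also sets errno to the positive EOPNOTSUPP rather than the negated value used by the removed per-OS copies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fmem_handle;   /* hypothetical opaque handle type */

#ifdef EXAMPLE_OS_HAS_RESTRICT   /* hypothetical feature macro */
/* Real implementation lives in the OS-specific source file. */
int example_osdep_restrict(struct fmem_handle *fmem, unsigned int domid);
#else
/* One shared stub in the private header replaces a copy per OS file. */
static inline int example_osdep_restrict(struct fmem_handle *fmem,
                                         unsigned int domid)
{
    (void)fmem;
    (void)domid;
    errno = EOPNOTSUPP;   /* errno values are positive */
    return -1;
}

/* Usage check for the stub path: the call fails with EOPNOTSUPP. */
static inline void example_check(void)
{
    errno = 0;
    assert(example_osdep_restrict(NULL, 0) == -1);
    assert(errno == EOPNOTSUPP);
}
#endif
```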

Signed-off-by: Paul Durrant 
Reviewed-by: Roger Pau Monné 
Acked-by: Wei Liu 
---
Cc: Ian Jackson 

v4:
 - Removed extraneous freebsd code.

v3:
 - Patch added in response to review comments.
---
 tools/libs/foreignmemory/freebsd.c |  7 ---
 tools/libs/foreignmemory/minios.c  |  7 ---
 tools/libs/foreignmemory/netbsd.c  |  7 ---
 tools/libs/foreignmemory/private.h | 12 +---
 tools/libs/foreignmemory/solaris.c |  7 ---
 5 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/tools/libs/foreignmemory/freebsd.c 
b/tools/libs/foreignmemory/freebsd.c
index dec447485a..6e6bc4b11f 100644
--- a/tools/libs/foreignmemory/freebsd.c
+++ b/tools/libs/foreignmemory/freebsd.c
@@ -95,13 +95,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/minios.c 
b/tools/libs/foreignmemory/minios.c
index 75f340122e..43341ca301 100644
--- a/tools/libs/foreignmemory/minios.c
+++ b/tools/libs/foreignmemory/minios.c
@@ -58,13 +58,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/netbsd.c 
b/tools/libs/foreignmemory/netbsd.c
index 9bf95ef4f0..54a418ebd6 100644
--- a/tools/libs/foreignmemory/netbsd.c
+++ b/tools/libs/foreignmemory/netbsd.c
@@ -100,13 +100,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h 
b/tools/libs/foreignmemory/private.h
index b191000b49..b06ce12583 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -35,9 +35,6 @@ void *osdep_xenforeignmemory_map(xenforeignmemory_handle 
*fmem,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
  void *addr, size_t num);
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid);
-
 #if defined(__NetBSD__) || defined(__sun__)
 /* Strictly compat for those two only only */
 void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
@@ -57,6 +54,13 @@ struct xenforeignmemory_resource_handle {
 };
 
 #ifndef __linux__
+static inline int osdep_xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
+  domid_t domid)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
 static inline int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
 {
@@ -70,6 +74,8 @@ static inline int osdep_xenforeignmemory_unmap_resource(
 return 0;
 }
 #else
+int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+domid_t domid);
 int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 int osdep_xenforeignmemory_unmap_resource(
diff --git a/tools/libs/foreignmemory/solaris.c 
b/tools/libs/foreignmemory/solaris.c
index a33decb4ae..ee8aae4fbd 100644
--- a/tools/libs/foreignmemory/solaris.c
+++ b/tools/libs/foreignmemory/solaris.c
@@ -97,13 +97,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v18 07/11] x86/mm: add an extra command to HYPERVISOR_mmu_update...

2018-03-22 Thread Paul Durrant
...to allow the calling domain to prevent translation of a specified l1e
value.

Despite what the comment in public/xen.h might imply, specifying a
command value of MMU_NORMAL_PT_UPDATE will not simply update an l1e with
the specified value. Instead, mod_l1_entry() tests whether foreign_dom
has PG_translate set in its paging mode and, if it does, assumes that the
pfn value in the l1e is a gfn rather than an mfn.

To allow a PV tools domain to map mfn values from a previously issued
HYPERVISOR_memory_op:XENMEM_acquire_resource, there needs to be a way
to tell HYPERVISOR_mmu_update that the specific l1e value does not
require translation regardless of the paging mode of foreign_dom. This
patch therefore defines a new command value, MMU_PT_UPDATE_NO_TRANSLATE,
which has the same semantics as MMU_NORMAL_PT_UPDATE except that the
paging mode of foreign_dom is ignored and the l1e value is used verbatim.
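
The resulting decision can be summarised in a small sketch. The command values are the ones defined in the xen.h hunk of this patch; the two helper functions are hypothetical names introduced here for illustration and do not exist in mm.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Command values as defined in xen/include/public/xen.h by this patch. */
#define MMU_NORMAL_PT_UPDATE       0
#define MMU_MACHPHYS_UPDATE        1
#define MMU_PT_UPDATE_PRESERVE_AD  2
#define MMU_PT_UPDATE_NO_TRANSLATE 3

/*
 * Hypothetical helper: is the frame in the new l1e treated as a gfn that
 * must be translated?  'pg_dom_translated' stands in for
 * paging_mode_translate(pg_dom).
 */
static bool l1e_needs_translation(unsigned int cmd, bool pg_dom_translated)
{
    /* Only the page-table update commands reach the l1e path. */
    assert(cmd != MMU_MACHPHYS_UPDATE);
    return cmd != MMU_PT_UPDATE_NO_TRANSLATE && pg_dom_translated;
}

/* Hypothetical helper: are the existing A/D bits ORed into the new value? */
static bool l1e_preserves_ad(unsigned int cmd)
{
    return cmd == MMU_PT_UPDATE_PRESERVE_AD;
}
```

With this shape, MMU_PT_UPDATE_NO_TRANSLATE behaves like MMU_NORMAL_PT_UPDATE for a non-translated foreign_dom and differs only when foreign_dom is translated.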

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 

v13:
 - Re-base.

v8:
 - New in this version, replacing "allow a privileged PV domain to map
   guest mfns".
---
 xen/arch/x86/mm.c| 13 -
 xen/include/public/xen.h | 12 +---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 846cc61935..8e3be1f263 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1901,9 +1901,10 @@ void page_unlock(struct page_info *page)
 
 /* Update the L1 entry at pl1e to new value nl1e. */
 static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
-unsigned long gl1mfn, int preserve_ad,
+unsigned long gl1mfn, unsigned int cmd,
 struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
+bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
 l1_pgentry_t ol1e;
 struct domain *pt_dom = pt_vcpu->domain;
 int rc = 0;
@@ -1925,7 +1926,8 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t 
nl1e,
 }
 
 /* Translate foreign guest address. */
-if ( paging_mode_translate(pg_dom) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_dom) )
 {
 p2m_type_t p2mt;
 p2m_query_t q = l1e_get_flags(nl1e) & _PAGE_RW ?
@@ -3617,6 +3619,7 @@ long do_mmu_update(
  */
 case MMU_NORMAL_PT_UPDATE:
 case MMU_PT_UPDATE_PRESERVE_AD:
+case MMU_PT_UPDATE_NO_TRANSLATE:
 {
 p2m_type_t p2mt;
 
@@ -3676,8 +3679,7 @@ long do_mmu_update(
 {
 case PGT_l1_page_table:
 rc = mod_l1_entry(va, l1e_from_intpte(req.val), mfn,
-  cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
-  pg_owner);
+  cmd, v, pg_owner);
 break;
 
 case PGT_l2_page_table:
@@ -3988,7 +3990,8 @@ static int __do_update_va_mapping(
 goto out;
 }
 
-rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), 0, v, pg_owner);
+rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), MMU_NORMAL_PT_UPDATE, v,
+  pg_owner);
 
 page_unlock(gl1pg);
 put_page(gl1pg);
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 308109f176..fb1df8f293 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -268,6 +268,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed
  * with those in @val.
  *
+ * ptr[1:0] == MMU_PT_UPDATE_NO_TRANSLATE:
+ * As MMU_NORMAL_PT_UPDATE above, but @val is not translated though FD
+ * page tables.
+ *
  * @val is usually the machine frame number along with some attributes.
  * The attributes by default follow the architecture defined bits. Meaning that
  * if this is a X86_64 machine and four page table layout is used, the layout
@@ -334,9 +338,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  *
  * PAT (bit 7 on) --> PWT (bit 3 on) and clear bit 7.
  */
-#define MMU_NORMAL_PT_UPDATE  0 /* checked '*ptr = val'. ptr is MA.  */
-#define MMU_MACHPHYS_UPDATE   1 /* ptr = MA of frame to modify entry for */
-#define MMU_PT_UPDATE_PRESERVE_AD 2 /* atomically: *ptr = val | (*ptr&(A|D)) */
+#define MMU_NORMAL_PT_UPDATE   0 /* checked '*ptr = val'. ptr is MA.  
*/
+#define MMU_MACHPHYS_UPDATE1 /* ptr = MA of frame to modify entry for 
*/
+#define MMU_PT_UPDATE_PRESERVE_AD  2 /* atomically: *ptr = val | (*ptr&(A|D)) 
*/
+#define MMU_PT_UPDATE_NO_TRANSLATE 3 /* checked '*ptr = val'. ptr is MA.  
*/
+ /* val never translated. 
*/
 
 /*
  * MMU EXTENDED OPERATIONS
-- 
2.11.0



[Xen-devel] [PATCH v18 02/11] x86/hvm/ioreq: simplify code and use consistent naming

2018-03-22 Thread Paul Durrant
This patch re-works much of the ioreq server initialization and teardown
code:

- The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
  to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
  separately by outer functions.
- Several functions now test the validity of the hvm_ioreq_page gfn value
  to determine whether they need to act. This means they can be safely called
  for the bufioreq page even when it is not used.
- hvm_add/remove_ioreq_gfn() simply return in the case of the default
  IOREQ server so callers no longer need to test before calling.
- hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
  to mirror the existing hvm_ioreq_server_unmap_pages().

All of this significantly shortens the code.
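
The shape of the reworked map/unmap pair can be sketched in isolation as below. This is a stand-alone model under stated assumptions, not the hypervisor code: struct gfn_pool and struct ioreq_page are hypothetical stand-ins for the domain's ioreq_gfn state and struct hvm_ioreq_page, and the ring mapping itself is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define INVALID_GFN_RAW (~0UL)   /* stand-in for gfn_x(INVALID_GFN) */

struct gfn_pool {                /* hypothetical model of ioreq_gfn */
    unsigned long base;
    unsigned long mask;          /* a set bit marks a free gfn */
};

struct ioreq_page {              /* hypothetical model of hvm_ioreq_page */
    unsigned long gfn;
};

/* Return the first free gfn, or the invalid sentinel when none is left. */
static unsigned long pool_alloc(struct gfn_pool *p)
{
    unsigned int i;

    for ( i = 0; i < sizeof(p->mask) * CHAR_BIT; i++ )
        if ( p->mask & (1UL << i) )
        {
            p->mask &= ~(1UL << i);
            return p->base + i;
        }

    return INVALID_GFN_RAW;
}

static void pool_free(struct gfn_pool *p, unsigned long gfn)
{
    assert(gfn != INVALID_GFN_RAW);
    p->mask |= 1UL << (gfn - p->base);
}

/* Map calls through to the allocator instead of relying on the caller. */
static int page_map(struct gfn_pool *p, struct ioreq_page *iorp)
{
    iorp->gfn = pool_alloc(p);
    return iorp->gfn == INVALID_GFN_RAW ? -ENOMEM : 0;
}

/* Unmap is a no-op for a page that was never mapped. */
static void page_unmap(struct gfn_pool *p, struct ioreq_page *iorp)
{
    if ( iorp->gfn == INVALID_GFN_RAW )
        return;

    pool_free(p, iorp->gfn);
    iorp->gfn = INVALID_GFN_RAW;
}
```

Since page_unmap() checks the sentinel itself, teardown paths can call it unconditionally for both the ioreq and bufioreq pages.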

Signed-off-by: Paul Durrant 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 

v18:
 - Trivial re-base.

v3:
 - Re-based on top of 's->is_default' to 'IS_DEFAULT(s)' changes.
 - Minor updates in response to review comments from Roger.
---
 xen/arch/x86/hvm/ioreq.c | 182 ++-
 1 file changed, 69 insertions(+), 113 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index d8d4e96a80..bd141db0d5 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,63 +210,75 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
+static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
+struct domain *d = s->target;
 unsigned int i;
-int rc;
 
-rc = -ENOMEM;
+ASSERT(!IS_DEFAULT(s));
+
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-{
-*gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
-rc = 0;
-break;
-}
+return d->arch.hvm_domain.ioreq_gfn.base + i;
 }
 
-return rc;
+return gfn_x(INVALID_GFN);
 }
 
-static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
+   unsigned long gfn)
 {
+struct domain *d = s->target;
 unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-if ( gfn != gfn_x(INVALID_GFN) )
-set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
+ASSERT(!IS_DEFAULT(s));
+ASSERT(gfn != gfn_x(INVALID_GFN));
+
+set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return;
+
 destroy_ring_for_helper(&iorp->va, iorp->page);
+iorp->page = NULL;
+
+if ( !IS_DEFAULT(s) )
+hvm_free_ioreq_gfn(s, iorp->gfn);
+
+iorp->gfn = gfn_x(INVALID_GFN);
 }
 
-static int hvm_map_ioreq_page(
-struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
+static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct domain *d = s->target;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-struct page_info *page;
-void *va;
 int rc;
 
-if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
-return rc;
-
-if ( (iorp->va != NULL) || d->is_dying )
-{
-destroy_ring_for_helper(&va, page);
+if ( d->is_dying )
 return -EINVAL;
-}
 
-iorp->va = va;
-iorp->page = page;
-iorp->gfn = gfn;
+if ( IS_DEFAULT(s) )
+iorp->gfn = buf ?
+d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+else
+iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-return 0;
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return -ENOMEM;
+
+rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+
+if ( rc )
+hvm_unmap_ioreq_gfn(s, buf);
+
+return rc;
 }
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
@@ -279,8 +291,7 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 
 FOR_EACH_IOREQ_SERVER(d, id, s)
 {
-if ( (s->ioreq.va && s->ioreq.page == page) ||
- (s->bufioreq.va && s->bufioreq.page == page) )
+if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
 {
 found = true;
 break;
@@ -292,20 +303,30 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 return found;
 }
 
-static void hvm_remove_ioreq_gfn(
-struct domain *d, struct hvm_ioreq_page *iorp)
+static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+
 {
+struct domain *d = s->target;
+struct 

[Xen-devel] [PATCH v18 01/11] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list

2018-03-22 Thread Paul Durrant
A subsequent patch will remove the current implicit limitation on creation
of ioreq servers which is due to the allocation of gfns for the ioreq
structures and buffered ioreq ring.

It will therefore be necessary to introduce an explicit limit and, since
this limit should be small, it simplifies the code to maintain an array of
that size rather than using a list.

Also, by reserving an array slot for the default server and populating
array slots early in create, the need to pass an 'is_default' boolean
to sub-functions can be avoided.
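
As a rough stand-alone sketch of the array approach: the values of MAX_NR_IOREQ_SERVERS and DEFAULT_IOSERVID below are assumptions made for illustration, and struct server, struct server_table and the table_*() helpers are hypothetical names modelled on set_ioreq_server(), get_ioreq_server() and the backwards-iterating FOR_EACH_IOREQ_SERVER() macro in the diff.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NR_IOREQ_SERVERS 8   /* assumed value, for illustration only */
#define DEFAULT_IOSERVID     0   /* assumed slot reserved for the default */

struct server { int unused; };   /* hypothetical stand-in for the server */

struct server_table {
    struct server *server[MAX_NR_IOREQ_SERVERS];
};

/* Populate or clear a slot; a populated slot must be cleared first. */
static void table_set(struct server_table *t, unsigned int id,
                      struct server *s)
{
    assert(id < MAX_NR_IOREQ_SERVERS);
    assert(!s || !t->server[id]);
    t->server[id] = s;
}

/* Bounds-checked lookup: out-of-range ids simply yield NULL. */
static struct server *table_get(const struct server_table *t,
                                unsigned int id)
{
    if ( id >= MAX_NR_IOREQ_SERVERS )
        return NULL;

    return t->server[id];
}

/*
 * Walk the slots from the highest id down and return the first populated
 * one, or MAX_NR_IOREQ_SERVERS if the table is empty.
 */
static unsigned int table_first_populated(const struct server_table *t)
{
    unsigned int id;

    for ( id = MAX_NR_IOREQ_SERVERS; id != 0; )
        if ( t->server[--id] )
            return id;

    return MAX_NR_IOREQ_SERVERS;
}
```

Whether a server is the default one then follows from which slot it occupies, so no separate flag has to be threaded through the call chain.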

Some function return values are changed by this patch: Specifically, in
the case where the id of the default ioreq server is passed in, -EOPNOTSUPP
is now returned rather than -ENOENT.

Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 

v18:
 - non-trivial re-base.
 - small modification to FOR_EACH... macro to iterate backwards, to
   maintain a previous undocumented but useful semantic that secondary
   emulators are selected in favour of qemu.
 - dropped R-b's because of change.

v10:
 - modified FOR_EACH... macro as suggested by Jan.
 - check for NULL in IS_DEFAULT macro as suggested by Jan.

v9:
 - modified FOR_EACH... macro as requested by Andrew.

v8:
 - Addressed various comments from Jan.

v7:
 - Fixed assertion failure found in testing.

v6:
 - Updated according to comments made by Roger on v4 that I'd missed.

v5:
 - Switched GET/SET_IOREQ_SERVER() macros to get/set_ioreq_server()
   functions to avoid possible double-evaluation issues.

v4:
 - Introduced more helper macros and relocated them to the top of the
   code.

v3:
 - New patch (replacing "move is_default into struct hvm_ioreq_server") in
   response to review comments.
---
 xen/arch/x86/hvm/ioreq.c | 539 +++
 xen/include/asm-x86/hvm/domain.h |  11 +-
 2 files changed, 265 insertions(+), 285 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 44d029499d..d8d4e96a80 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -33,6 +33,37 @@
 
 #include 
 
+static void set_ioreq_server(struct domain *d, unsigned int id,
+ struct hvm_ioreq_server *s)
+{
+ASSERT(id < MAX_NR_IOREQ_SERVERS);
+ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
+
+d->arch.hvm_domain.ioreq_server.server[id] = s;
+}
+
+#define GET_IOREQ_SERVER(d, id) \
+(d)->arch.hvm_domain.ioreq_server.server[id]
+
+static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
+ unsigned int id)
+{
+if ( id >= MAX_NR_IOREQ_SERVERS )
+return NULL;
+
+return GET_IOREQ_SERVER(d, id);
+}
+
+#define IS_DEFAULT(s) \
+((s) && (s) == GET_IOREQ_SERVER((s)->target, DEFAULT_IOSERVID))
+
+/* Iterate over all possible ioreq servers */
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
+if ( !(s = GET_IOREQ_SERVER(d, --(id))) ) \
+continue; \
+else
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
 shared_iopage_t *p = s->ioreq.va;
@@ -47,10 +78,9 @@ bool hvm_io_pending(struct vcpu *v)
 {
 struct domain *d = v->domain;
 struct hvm_ioreq_server *s;
+unsigned int id;
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -127,10 +157,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
 struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
 struct hvm_ioreq_server *s;
 enum hvm_io_completion io_completion;
+unsigned int id;
 
-  list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -243,13 +272,12 @@ static int hvm_map_ioreq_page(
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
 const struct hvm_ioreq_server *s;
+unsigned int id;
 bool found = false;
 
 spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 if ( (s->ioreq.va && s->ioreq.page == page) ||
  (s->bufioreq.va && s->bufioreq.page == page) )
@@ -302,7 +330,7 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
- bool is_default, struct vcpu *v)
+ struct vcpu *v)
 {
 struct hvm_ioreq_vcpu *sv;
 int rc;
@@ -316,7 +344,8 @@ static int hvm_ioreq_server_add_vcpu(struct 
hvm_ioreq_server *s,
 spin_lock(&

[Xen-devel] [PATCH v18 00/11] x86: guest resource mapping

2018-03-22 Thread Paul Durrant
This series introduces support for direct mapping of guest resources.
The resources are:
 - IOREQ server pages
 - Grant tables

v18:
 - Re-base
 - Use the now-reference-counted emulating domain to host ioreq pages

v17:
 - Make sure ioreq page free-ing is done at domain destruction

v16:
 - Fix default ioreq server code and verified with qemu trad

v15:
 - Correct page ownership of ioreq pages

v14:
 - Responded to more comments from Jan.

v13:
 - Responded to more comments from Jan and Julien.
 - Build-tested using ARM cross-compilation.

v12:
 - Responded to more comments from Jan.

v11:
 - Responded to more comments from Jan.

v10:
 - Responded to comments from Jan.

v9:
 - Change to patch #1 only.

v8:
 - Re-ordered series and dropped two patches that have already been
   committed.

v7:
 - Fixed assertion failure hit during domain destroy.

v6:
 - Responded to missed comments from Roger.

v5:
 - Responded to review comments from Wei.

v4:
 - Responded to further review comments from Roger.

v3:
 - Dropped original patch #1 since it is covered by Juergen's patch.
 - Added new xenforeignmemorycleanup patch (#4).
 - Replaced the patch introducing the ioreq server 'is_default' flag with
   one that changes the ioreq server list into an array (#8).

Paul Durrant (11):
  x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  x86/hvm/ioreq: simplify code and use consistent naming
  x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  x86/hvm/ioreq: defer mapping gfns until they are actually requested
  x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  x86/hvm/ioreq: add a new mappable resource type...
  x86/mm: add an extra command to HYPERVISOR_mmu_update...
  tools/libxenforeignmemory: add support for resource mapping
  tools/libxenforeignmemory: reduce xenforeignmemory_restrict code
footprint
  common: add a new mappable resource type: XENMEM_resource_grant_table
  tools/libxenctrl: use new xenforeignmemory API to seed grant table

 tools/flask/policy/modules/xen.if  |   4 +-
 tools/include/xen-sys/Linux/privcmd.h  |  11 +
 tools/libs/devicemodel/core.c  |   8 +
 tools/libs/devicemodel/include/xendevicemodel.h|   6 +-
 tools/libs/foreignmemory/Makefile  |   2 +-
 tools/libs/foreignmemory/core.c|  53 ++
 tools/libs/foreignmemory/freebsd.c |   7 -
 .../libs/foreignmemory/include/xenforeignmemory.h  |  41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |   5 +
 tools/libs/foreignmemory/linux.c   |  45 ++
 tools/libs/foreignmemory/minios.c  |   7 -
 tools/libs/foreignmemory/netbsd.c  |   7 -
 tools/libs/foreignmemory/private.h |  43 +-
 tools/libs/foreignmemory/solaris.c |   7 -
 tools/libxc/include/xc_dom.h   |   8 +-
 tools/libxc/xc_dom_boot.c  | 114 ++-
 tools/libxc/xc_sr_restore_x86_hvm.c|  10 +-
 tools/libxc/xc_sr_restore_x86_pv.c |   2 +-
 tools/libxl/libxl_dom.c|   1 -
 tools/python/xen/lowlevel/xc/xc.c  |   6 +-
 xen/arch/x86/hvm/dm.c  |   9 +-
 xen/arch/x86/hvm/ioreq.c   | 884 -
 xen/arch/x86/mm.c  |  60 +-
 xen/arch/x86/mm/p2m.c  |   3 +-
 xen/common/compat/memory.c | 100 +++
 xen/common/grant_table.c   |  71 +-
 xen/common/memory.c| 137 
 xen/include/asm-arm/mm.h   |   8 +
 xen/include/asm-arm/p2m.h  |  10 +
 xen/include/asm-x86/hvm/domain.h   |  15 +-
 xen/include/asm-x86/hvm/ioreq.h|   2 +
 xen/include/asm-x86/mm.h   |   5 +
 xen/include/asm-x86/p2m.h  |   3 +
 xen/include/public/hvm/dm_op.h |  36 +-
 xen/include/public/memory.h|  69 +-
 xen/include/public/xen.h   |  12 +-
 xen/include/xen/grant_table.h  |   4 +
 xen/include/xlat.lst   |   1 +
 xen/include/xsm/dummy.h|   6 +
 xen/include/xsm/xsm.h  |   6 +
 xen/xsm/dummy.c|   1 +
 xen/xsm/flask/hooks.c  |   6 +
 xen/xsm/flask/policy/access_vectors|   2 +
 43 files changed, 1320 insertions(+), 517 deletions(-)
---
Cc: Daniel De Graaf 
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: "Marek Marczykowski-Górecki" 
Cc: Paul Durrant 
Cc: George Dunlap 
Cc: Julien Grall 
Cc: Chao Gao 

-- 
2.11.0

