Re: [Xen-devel] [PATCH v3 4/7] xen/9pfs: connect to the backend

2017-03-15 Thread Juergen Gross
On 15/03/17 19:44, Stefano Stabellini wrote:
> On Wed, 15 Mar 2017, Juergen Gross wrote:
>> On 14/03/17 22:22, Stefano Stabellini wrote:
>>> Hi Juergen,
>>>
>>> thank you for the review!
>>>
>>> On Tue, 14 Mar 2017, Juergen Gross wrote:
>>>> On 14/03/17 00:50, Stefano Stabellini wrote:
>>>>> Implement functions to handle the xenbus handshake. Upon connection,
>>>>> allocate the rings according to the protocol specification.
>>>>>
>>>>> Initialize a work_struct and a wait_queue. The work_struct will be used
>>>>> to schedule work upon receiving an event channel notification from the
>>>>> backend. The wait_queue will be used to wait when the ring is full and
>>>>> we need to send a new request.
>>>>>
>>>>> Signed-off-by: Stefano Stabellini 
>>>>> CC: boris.ostrov...@oracle.com
>>>>> CC: jgr...@suse.com
>>>>> CC: Eric Van Hensbergen 
>>>>> CC: Ron Minnich 
>>>>> CC: Latchesar Ionkov 
>>>>> CC: v9fs-develo...@lists.sourceforge.net
>>>>> ---
>>>>
>>>> Did you think about using request_threaded_irq() instead of a workqueue?
>>>> For an example see e.g. drivers/scsi/xen-scsifront.c
>>>
>>> I like workqueues :-)  It might come down to personal preferences, but I
>>> think workqueues are more flexible and a better fit for this use case.
>>> Not only is it easy to schedule work in a workqueue from the interrupt
>>> handler, but also they can be used for sleeping in the request function
>>> if there is not enough room on the ring. Besides, they can easily be
>>> configured to share a single thread or to have multiple independent
>>> threads.
>>
>> I'm fine with the workqueues as long as you have decided to use them
>> considering the alternatives. :-)
>>
>>>> Can't you use xenbus_read_unsigned() instead of xenbus_read()?
>>>
>>> I can use xenbus_read_unsigned in the other cases below, but not here,
>>> because versions is in the form: "1,3,4"
>>
>> Is this documented somewhere?
>>
>> Hmm, are any of the Xenstore entries documented? Shouldn't this be done
>> in xen_9pfs.h ?
>  
> They are documented in docs/misc/9pfs.markdown, under "Xenstore". Given
> that it's all written there, especially the semantics, I didn't repeat
> it in xen_9pfs.h

Looking at it from the Linux kernel perspective this documentation is
not really highly visible. For me it is okay, but there have been
multiple examples in the past where documentation in the Xen repository
wasn't regarded as being sufficient.

I recommend moving the documentation regarding the interface into the
header file like for the other pv interfaces.


Juergen
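
As an aside on the "versions" node discussed above: because it holds a
comma-separated list such as "1,3,4", a helper that reads a single
unsigned integer cannot parse it. Below is a minimal user-space sketch
of checking such a list for a supported version. The helper name
version_list_contains() is hypothetical and is not part of xenbus or of
the patch; it only illustrates the parsing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns true if `wanted` appears in a
 * comma-separated list of unsigned decimal versions such as "1,3,4".
 * Returns false if it is not found, or if a malformed entry is reached
 * before a match.
 */
static bool version_list_contains(const char *list, unsigned long wanted)
{
    const char *p = list;

    assert(list != NULL);
    while (*p != '\0') {
        char *end;
        unsigned long v;

        if (*p < '0' || *p > '9')
            return false;            /* every entry must start with a digit */

        errno = 0;
        v = strtoul(p, &end, 10);
        if (errno != 0)
            return false;            /* entry out of range */
        if (v == wanted)
            return true;

        if (*end == ',')
            p = end + 1;             /* step over the separator */
        else if (*end == '\0')
            p = end;                 /* end of list */
        else
            return false;            /* unexpected character */
    }
    return false;
}
```

With the example string from the thread, version_list_contains("1,3,4", 3)
is true and version_list_contains("1,3,4", 2) is false.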


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 4/6] VT-d: introduce update_irte to update irte safely

2017-03-15 Thread Chao Gao
On Wed, Mar 15, 2017 at 10:48:25AM -0600, Jan Beulich wrote:
 On 15.03.17 at 06:11,  wrote:
>> +/*
>> + * The following method to update IRTE is safe on condition that
>> + * only the high qword or the low qword is to be updated.
>> + * If entire IRTE is to be updated, callers should make sure the
>> + * IRTE is not in use.
>> + */
>> +entry->lo = new_ire->lo;
>> +entry->hi = new_ire->hi;
>
>How is this any better than structure assignment? Furthermore

Indeed, it is not better. When using structure assignment, the assembly code is
48 8b 06       mov    (%rsi),%rax
48 8b 56 08    mov    0x8(%rsi),%rdx
48 89 07       mov    %rax,(%rdi)
48 89 57 08    mov    %rdx,0x8(%rdi)
Using the code above, the assembly code is
48 8b 06       mov    (%rsi),%rax
48 89 07       mov    %rax,(%rdi)
48 8b 46 08    mov    0x8(%rsi),%rax
48 89 47 08    mov    %rax,0x8(%rdi)

I thought structure assignment might use memcpy for a large structure,
so I made this change. I will change it back. That said, this patch is
trying to make the update safer when cmpxchg16() is supported.

>the comment here partially contradicts the commit message. I

Yes.

>guess callers need to be given a way (another function parameter?)
>to signal the function whether the unsafe variant is okay to use.

This means we need to add the new parameter to the iommu ops, because only
the IOAPIC/MSI code knows whether the entry it wants to change is masked.
Is there any other reasonable and correct solution? How about...

>You should then add a suitable BUG_ON() in the else path here.

just add a BUG_ON() like this:
BUG_ON( (entry->hi != new_ire->hi) && (entry->lo != new_ire->lo) );
Adding this BUG_ON() means update_irte() can't be used for initializing
or clearing an IRTE, neither of which is a bug.

Thanks
Chao
>
>Jan
>
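
To make the trade-off above concrete, here is a minimal sketch of the
"at most one qword changes" rule, combined with the extra parameter Jan
suggested for callers that know the entry is not in use. All names
(struct ex_irte, ex_update_irte, entry_unused) are hypothetical, and the
plain stores stand in for whatever hardware-visible writes the real code
needs; this is not the patch's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a 128-bit interrupt remapping table entry. */
struct ex_irte {
    uint64_t lo;
    uint64_t hi;
};

/*
 * True if going from *cur to *next changes at most one 64-bit half.
 * This is the negation of the BUG_ON() condition quoted above.
 */
static bool ex_single_qword_update(const struct ex_irte *cur,
                                   const struct ex_irte *next)
{
    return cur->lo == next->lo || cur->hi == next->hi;
}

/*
 * Update *entry to *next. If both halves change, the update is only
 * allowed when the caller states that the entry is not in use
 * (for example while initializing or clearing it).
 * Returns false, leaving *entry untouched, when the update is refused.
 */
static bool ex_update_irte(struct ex_irte *entry, const struct ex_irte *next,
                           bool entry_unused)
{
    if (!entry_unused && !ex_single_qword_update(entry, next))
        return false;

    entry->lo = next->lo;
    entry->hi = next->hi;
    return true;
}
```

With such a flag, initializing or clearing an entry stays legal, while an
unexpected two-half update of a live entry is still caught.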


Re: [Xen-devel] [Xen-users] "Hello Xen Project" Book.

2017-03-15 Thread Juergen Gross
On 15/03/17 19:05, Mohsen wrote:
> Thank you so much Lars.
> I used LibreOffice and I will test HTML format and inform you.

You are aware of the MediaWiki export function of LibreOffice?


Juergen


Re: [Xen-devel] [PATCH v10 1/6] VT-d: Introduce new fields in msi_desc to track binding with guest interrupt

2017-03-15 Thread Chao Gao
On Wed, Mar 15, 2017 at 10:41:13AM -0600, Jan Beulich wrote:
 On 15.03.17 at 06:11,  wrote:
>> --- a/xen/drivers/passthrough/vtd/intremap.c
>> +++ b/xen/drivers/passthrough/vtd/intremap.c
>> @@ -552,11 +552,12 @@ static int msi_msg_to_remap_entry(
>>  struct msi_desc *msi_desc, struct msi_msg *msg)
>>  {
>>  struct iremap_entry *iremap_entry = NULL, *iremap_entries;
>> -struct iremap_entry new_ire;
>> +struct iremap_entry new_ire = {{0}};
>
>Any reason this isn't simple "{ }"?
>

I also think '{}' can work. But here is the compiler output:
intremap.c: In function ‘msi_msg_to_remap_entry’:
intremap.c:587:12: error: missing braces around initializer [-Werror=missing-braces]
 struct iremap_entry new_ire = {0};
^
intremap.c:587:12: error: (near initialization for ‘new_ire.’) [-Werror=missing-braces]
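
For readers who have not met this warning: it appears when the first
member of the structure is itself an aggregate (here a union), so a
fully braced initializer needs one brace level per nesting level. The
sketch below uses a hypothetical, much simplified layout (struct
ex_entry is not the real struct iremap_entry) to show the doubly braced
form. An empty "{ }" initializer is a GNU extension in C versions before
C23, which is why it is not used here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the only member is an anonymous union (C11). */
struct ex_entry {
    union {
        struct { uint64_t lo, hi; };
        struct { uint64_t low_bits; uint64_t dst; } remap;
    };
};

static struct ex_entry ex_make_zeroed_entry(void)
{
    /*
     * Outer braces: struct ex_entry. Inner braces: its first member,
     * the anonymous union. The 0 initializes lo; hi is zeroed implicitly.
     */
    struct ex_entry e = {{0}};

    return e;
}
```

Both views of the union read back as zero after this initialization.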

>> @@ -595,33 +596,35 @@ static int msi_msg_to_remap_entry(
>>  GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
>>   iremap_entries, iremap_entry);
>>  
>> -memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
>> -
>> -/* Set interrupt remapping table entry */
>> -new_ire.remap.fpd = 0;
>> -new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
>> -new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
>> -new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
>> -/* Hardware require RH = 1 for LPR delivery mode */
>> -new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
>> -new_ire.remap.avail = 0;
>> -new_ire.remap.res_1 = 0;
>> -new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
>> -MSI_DATA_VECTOR_MASK;
>> -new_ire.remap.res_2 = 0;
>> -if ( x2apic_enabled )
>> -new_ire.remap.dst = msg->dest32;
>> +if ( !pi_desc )
>> +{
>> +new_ire.remap.dm = msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT;
>> +new_ire.remap.tm = msg->data >> MSI_DATA_TRIGGER_SHIFT;
>> +new_ire.remap.dlm = msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT;
>> +/* Hardware require RH = 1 for LPR delivery mode */
>
>As you're touching this anyway, please make it "requires" and
>"lowest priority" respectively.

OK.

>
>> @@ -968,59 +927,14 @@ int pi_update_irte(const struct vcpu *v, const struct pirq *pirq,
>>  rc = -ENODEV;
>>  goto unlock_out;
>>  }
>> -
>> -pci_dev = msi_desc->dev;
>> -if ( !pci_dev )
>> -{
>> -rc = -ENODEV;
>> -goto unlock_out;
>> -}
>> -
>> -remap_index = msi_desc->remap_index;
>> +msi_desc->pi_desc = pi_desc;
>> +msi_desc->gvec = gvec;
>
>Am I overlooking something - I can't seem to find any place where these
>two fields (or at least the former) get cleared again? This may be correct,
>but if it is the reason wants recording in the commit message.

IIUC, the current logic is to free the whole msi_desc on device deassignment.
But it is better to clear these two fields explicitly. I will call
pi_update_irte() with @vcpu=NULL on device deassignment, just like Patch [6/6].
Do you think the newly added code should be squashed into this one?
Shall I also squash Patch [6/6] into this one? I think it is meant to fix
another flaw.

>
>> --- a/xen/include/asm-x86/msi.h
>> +++ b/xen/include/asm-x86/msi.h
>> @@ -118,6 +118,8 @@ struct msi_desc {
>>  struct msi_msg msg; /* Last set MSI message */
>>  
>>  int remap_index;/* index in interrupt remapping table */
>> +const void *pi_desc; /* PDA, indicates msi is delivered via VT-d PI */
>
>Why "void"? Please let's play type safe wherever we can.

Ok.

Thanks
Chao
>
>Jan
>
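
A side note on the hunk above that drops the "& 0x1" masks: if the
destination is an unsigned bit-field, C already reduces the assigned
value modulo 2^width, so the explicit mask is redundant. The sketch
below assumes such bit-fields and uses shift values that follow the
usual x86 MSI layout; the EX_* names and struct ex_remap_bits are
hypothetical and are not taken from the Xen headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (illustrative, not copied from Xen). */
#define EX_ADDR_DESTMODE_SHIFT       2
#define EX_DATA_DELIVERY_MODE_SHIFT  8
#define EX_DATA_TRIGGER_SHIFT        15

struct ex_remap_bits {
    unsigned int dm  : 1;   /* destination mode */
    unsigned int tm  : 1;   /* trigger mode */
    unsigned int dlm : 3;   /* delivery mode */
};

static struct ex_remap_bits ex_decode_msi(uint32_t address_lo, uint32_t data)
{
    struct ex_remap_bits r;

    /* Each assignment keeps only the low bits that fit the field. */
    r.dm  = address_lo >> EX_ADDR_DESTMODE_SHIFT;
    r.tm  = data >> EX_DATA_TRIGGER_SHIFT;
    r.dlm = data >> EX_DATA_DELIVERY_MODE_SHIFT;
    return r;
}
```

Note that the field width matters: a 3-bit delivery mode keeps three
bits, whereas the removed "& 0x1" would have kept only one.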


[Xen-devel] [linux-4.1 test] 106689: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106689 linux-4.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106689/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start fail REGR. vs. 104301

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds  6 xen-boot   fail pass in 106669

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-xsm 15 guest-start/debian.repeat fail in 106669 like 104272
 test-armhf-armhf-xl-rtds 11 guest-start fail in 106669 like 104301
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 104301
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 104301
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 104301
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 104301
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 104301
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 104301

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64                     5 xen-build                 fail   never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check     fail   never pass
 test-amd64-amd64-xl-pvh-amd    11 guest-start               fail   never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt        12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt       12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-xsm                 5 xen-build                 fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale    13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check     fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm        12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm        13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl            12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2    13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl            13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt       12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-raw   12 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd        11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd        12 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-multivcpu  13 saverestore-support-check fail   never pass

version targeted for testing:
 linux                d9e0350d2575a20ee7783427da9bd6b6107eb983
baseline version:
 linux                f40b3cc69de8c97bbcdb74e3cffda06ffcad2cd7

Last test of basis   104301  2017-01-19 13:16:22 Z   55 days
Testing same since   106644  2017-03-13 22:17:49 Z    2 days    4 attempts


People who touched revisions under test:
  "Eric W. Biederman" 
  Adrian Hunter 
  Akinobu Mita 
  Al Viro 
  Alan Stern 
  Aleksander Morgado 
  Alex Deucher 
  Ander Conselvan de Oliveira 
  Andrew Morton 

[Xen-devel] [linux-next test] 106687: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106687 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106687/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 106570
 test-amd64-amd64-qemuu-nested-amd 17 capture-logs/l1(17) fail REGR. vs. 106570
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail REGR. vs. 106570
 test-amd64-amd64-xl-qemuu-winxpsp3  9 windows-install fail REGR. vs. 106570
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-install fail REGR. vs. 106570
 test-amd64-i386-xl-qemut-win7-amd64  9 windows-install fail REGR. vs. 106570
 test-amd64-i386-xl-qemut-winxpsp3  9 windows-install fail REGR. vs. 106570
 test-amd64-amd64-xl-qemut-win7-amd64  9 windows-install fail REGR. vs. 106570
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install fail REGR. vs. 106570
 test-amd64-amd64-xl-qemuu-win7-amd64  9 windows-install fail REGR. vs. 106570

Tests which are failing intermittently (not blocking):
 test-amd64-i386-freebsd10-i386 16 guest-localmigrate/x10 fail in 106587 pass in 106687
 test-amd64-i386-xl-qemuu-ovmf-amd64 15 guest-localmigrate/x10 fail in 106587 pass in 106687
 test-amd64-i386-xl-qemuu-winxpsp3 9 windows-install fail in 106587 pass in 106687
 test-amd64-amd64-xl-pvh-amd   9 debian-install fail pass in 106587
 test-amd64-amd64-xl-multivcpu 14 guest-saverestore fail pass in 106587
 test-amd64-amd64-xl-xsm   9 debian-install fail pass in 106587
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail pass in 106587
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 9 windows-install fail pass in 106587
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail pass in 106587

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-qemuu-nested-intel 17 capture-logs/l1(17) fail blocked in 106570
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail like 106570
 test-armhf-armhf-xl-arndale  11 guest-start  fail  like 106570
 test-armhf-armhf-xl-rtds 11 guest-start  fail  like 106570
 test-armhf-armhf-xl-xsm  11 guest-start  fail  like 106570
 test-armhf-armhf-xl  11 guest-start  fail  like 106570
 test-armhf-armhf-libvirt 11 guest-start  fail  like 106570
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail like 106570
 test-armhf-armhf-xl-credit2  11 guest-start  fail  like 106570
 test-armhf-armhf-libvirt-xsm 11 guest-start  fail  like 106570
 test-armhf-armhf-xl-cubietruck 11 guest-start fail like 106570
 test-armhf-armhf-xl-vhd   9 debian-di-install  fail  like 106570
 test-armhf-armhf-libvirt-raw  9 debian-di-install  fail  like 106570

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail in 106587 never pass
 test-amd64-amd64-libvirt       12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check     fail   never pass
 build-arm64                     5 xen-build                 fail   never pass
 build-arm64-xsm                 5 xen-build                 fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check     fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-amd64-i386-libvirt        12 migrate-support-check     fail   never pass

version targeted for testing:
 linux                5be4921c9958ec02a67506bd6f7a52fce663c201
baseline version:
 linux                ea6200e84182989a3cce9687cf79a23ac44ec4db

Last test of basis  (not found)
Failing since             0  1970-01-01 00:00:00 Z 17241 days
Testing same since   106587  2017-03-10 09:20:19 Z    5 days    2 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm 

Re: [Xen-devel] [PATCH V4 7/8] COLO-Proxy: Use socket to get checkpoint event.

2017-03-15 Thread Zhang Chen



On 03/15/2017 06:56 PM, Wei Liu wrote:

On Wed, Mar 15, 2017 at 10:02:46AM +0800, Zhang Chen wrote:


On 03/14/2017 07:24 PM, Wei Liu wrote:

On Mon, Mar 06, 2017 at 10:59:25AM +0800, Zhang Chen wrote:

We use kernel colo proxy's way to get the checkpoint event
from qemu colo-compare.
Qemu colo-compare needs to add an API to support this (I will add this in qemu).
Qemu side patch:
   https://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg07265.html

Signed-off-by: Zhang Chen 

Acked-by: Wei Liu 

But see below.


@@ -289,8 +393,19 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
* event.
*/
   if (cps->is_userspace_proxy) {
-usleep(timeout_us);
-return 0;
+ret = colo_userspace_proxy_recv(cps, recvbuff, timeout_us);
+if (ret <= 0) {
+ret = 0;
+goto out1;
+}
+
+if (!strcmp(recvbuff, "DO_CHECKPOINT")) {
+ret = 1;
+} else {
+LOGD(ERROR, ao->domid, "receive qemu colo-compare checkpoint error");
+ret = 0;
+}
+goto out1;
   }
   size = colo_proxy_recv(cps, &buff, timeout_us);
@@ -318,4 +433,7 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
   out:
   free(buff);
   return ret;
+
+out1:

Perhaps try to come up with a better name than out1? Subsequent patch is
welcome.


How about changing 'out1' to 'out_userspace_proxy'?
If OK, I will send a patch for it.


How about the following patch instead? Compile test only.

In general I would like code to stick with coding style.

--->8---
From 0a87defaad529c02babe24055d5782b74d3a38e3 Mon Sep 17 00:00:00 2001
From: Wei Liu 
Date: Wed, 15 Mar 2017 10:50:19 +
Subject: [PATCH] libxl/colo: unified exit path for colo_proxy_checkpoint

Slightly refactor the code to have only one exit path for the said
function.

Signed-off-by: Wei Liu 


Acked-by: Zhang Chen

Thanks
Zhang Chen



Cc: zhangchen.f...@cn.fujitsu.com
---
  tools/libxl/libxl_colo_proxy.c | 15 +++
  1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
index c3d55104ea..5475f7ea32 100644
--- a/tools/libxl/libxl_colo_proxy.c
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -375,7 +375,7 @@ typedef struct colo_msg {
  int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
unsigned int timeout_us)
  {
-uint8_t *buff;
+uint8_t *buff = NULL;
  int64_t size;
  struct nlmsghdr *h;
  struct colo_msg *m;
@@ -396,7 +396,7 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
  ret = colo_userspace_proxy_recv(cps, recvbuff, timeout_us);
  if (ret <= 0) {
  ret = 0;
-goto out1;
+goto out;
  }
  
  if (!strcmp(recvbuff, "DO_CHECKPOINT")) {

@@ -405,14 +405,16 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
  LOGD(ERROR, ao->domid, "receive qemu colo-compare checkpoint error");
  ret = 0;
  }
-goto out1;
+goto out;
  }
  
  size = colo_proxy_recv(cps, &buff, timeout_us);
  
  /* timeout, return no checkpoint message. */

-if (size <= 0)
-return 0;
+if (size <= 0) {
+ret = 0;
+goto out;
+}
  
  h = (struct nlmsghdr *) buff;
  
@@ -433,7 +435,4 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,

  out:
  free(buff);
  return ret;
-
-out1:
-return ret;
  }


--
Thanks
Zhang Chen
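
The single-exit pattern used in the patch above can be shown in
isolation. The key detail is initializing the buffer pointer to NULL, so
that every path, including those that never allocate, can jump to the
same label and call free() safely. The function below is a hypothetical
stand-alone sketch (ex_is_checkpoint and its parameters are invented for
illustration); it is not the libxl code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if msg is the checkpoint request, 0 otherwise.
 * `direct` selects a path that does not allocate, mirroring the
 * userspace-proxy branch in the patch.
 */
static int ex_is_checkpoint(const char *msg, int direct)
{
    char *buff = NULL;      /* NULL so the free() at "out" is always safe */
    int ret = 0;

    if (msg == NULL)
        goto out;

    if (direct) {
        /* No allocation here, but we still leave through "out". */
        ret = (strcmp(msg, "DO_CHECKPOINT") == 0);
        goto out;
    }

    buff = malloc(strlen(msg) + 1);
    if (buff == NULL)
        goto out;
    strcpy(buff, msg);
    ret = (strcmp(buff, "DO_CHECKPOINT") == 0);

out:
    free(buff);             /* free(NULL) is a no-op */
    return ret;
}
```

A second label such as out1 is only needed when some path must skip the
cleanup, which the NULL initialization makes unnecessary.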





[Xen-devel] [qemu-upstream-unstable test] 106696: tolerable FAIL - PUSHED

2017-03-15 Thread osstest service owner
flight 106696 qemu-upstream-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106696/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 105955
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 105955
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105955
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail like 105955

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm                 5 xen-build                 fail   never pass
 build-arm64-pvops               5 kernel-build              fail   never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check     fail   never pass
 test-amd64-amd64-xl-pvh-amd    11 guest-start               fail   never pass
 test-amd64-amd64-xl-pvh-intel  11 guest-start               fail   never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt        12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt       12 migrate-support-check     fail   never pass
 build-arm64                     5 xen-build                 fail   never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale    13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-multivcpu  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2    13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check     fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-rtds       12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-rtds       13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl            12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl            13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt       12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm        12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm        13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd        11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd        12 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check     fail   never pass

version targeted for testing:
 qemuu                acde9f32bb971f021557c15197f6cb677b1a3ab5
baseline version:
 qemuu                57e8fbb2f702001a18bd81e9fe31b26d94247ac9

Last test of basis   105955  2017-02-21 19:13:25 Z   22 days
Testing same since   106696  2017-03-15 15:12:56 Z    0 days    1 attempts


People who touched revisions under test:
  Christopher Covington 
  Peter Maydell 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 

Re: [Xen-devel] [PATCH 01/18] xen/arm: Introduce a helper to get default HCR_EL2 flags

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Julien Grall wrote:
> On 15/03/17 07:19, Wei Chen wrote:
> > Hi Stefano,
> 
> Hello,
> 
> > On 2017/3/15 8:24, Stefano Stabellini wrote:
> > > On Mon, 13 Mar 2017, Wei Chen wrote:
> > > > We want to add HCR_EL2 register to Xen context switch. And each copy
> > > > of HCR_EL2 in vcpu structure will be initialized with the same set
> > > > of trap flags as the HCR_EL2 register. We introduce a helper here to
> > > > represent these flags to be reused easily.
> > > > 
> > > > Signed-off-by: Wei Chen 
> > > > ---
> > > >  xen/arch/arm/traps.c| 11 ---
> > > >  xen/include/asm-arm/processor.h |  2 ++
> > > >  2 files changed, 10 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> > > > index 614501f..d343c66 100644
> > > > --- a/xen/arch/arm/traps.c
> > > > +++ b/xen/arch/arm/traps.c
> > > > @@ -115,6 +115,13 @@ static void __init parse_vwfi(const char *s)
> > > >  }
> > > >  custom_param("vwfi", parse_vwfi);
> > > > 
> > > > +register_t get_default_hcr_flags(void)
> > > > +{
> > > > +return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
> > > > + (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> > > > + HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> > > > +}
> > > 
> > > I haven't finished reading this series yet, but I would make this a
> > > static inline function if possible
> > > 
> > 
> > I had considered using static inline before. But that would require
> > moving
> > 
> > static enum {
> > TRAP,
> > NATIVE,
> > } vwfi;
> > 
> > to the header file at the same time. And get_default_hcr_flags would
> > not be used frequently, so I thought it was not worth changing
> > loosely related code just to make this function static inline.
> 
> Looking at the spec, HCR_EL2 controls the behavior of the VM. We only
> need to ensure it is set before going to EL1/EL0. Note that the reset
> values of some registers are architecturally UNKNOWN, but I don't think
> we care here.
> 
> So I would suggest dropping the setting in init_traps and relying only on
> the one done when returning to the guest. This function could then be
> moved to domain.c
> 
> Any opinions?

Fine by me
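
For illustration only, one way to keep a flags helper "static inline"
without exporting the file-local vwfi variable is to pass the mode in as
a parameter. This is a sketch of that idea and not what the thread
settled on. The EX_HCR_* bit values below are placeholders chosen for
the example and are not the architectural HCR_EL2 bit positions; the
names ex_vwfi_mode and ex_default_hcr_flags are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum ex_vwfi_mode { EX_VWFI_TRAP, EX_VWFI_NATIVE };

/* Placeholder bit values, for illustration only. */
#define EX_HCR_VM   (UINT64_C(1) << 0)
#define EX_HCR_TWI  (UINT64_C(1) << 1)
#define EX_HCR_TWE  (UINT64_C(1) << 2)
#define EX_HCR_TSC  (UINT64_C(1) << 3)

/*
 * Compose the default flag set. WFI/WFE trapping is included unless
 * the guest is allowed to execute those instructions natively.
 */
static inline uint64_t ex_default_hcr_flags(enum ex_vwfi_mode mode)
{
    return EX_HCR_VM | EX_HCR_TSC |
           (mode != EX_VWFI_NATIVE ? (EX_HCR_TWI | EX_HCR_TWE) : 0);
}
```

The caller that owns the vwfi setting passes it in, so the enum value is
the only thing the header needs to know about.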


Re: [Xen-devel] [Qemu-devel] [PATCH v2 6/9] xen/9pfs: receive requests from the frontend

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:57 -0700
> Stefano Stabellini  wrote:
> 
> > Upon receiving an event channel notification from the frontend, schedule
> > the bottom half. From the bottom half, read one request from the ring,
> > create a pdu and call pdu_submit to handle it.
> > 
> > For now, only handle one request per ring at a time.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> 
> Oops, one more remark I forgot in my previous mail. See below.
> 
> >  hw/9pfs/xen-9p-backend.c | 47 +++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > index 0e4a133..741dd31 100644
> > --- a/hw/9pfs/xen-9p-backend.c
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -94,12 +94,59 @@ static int xen_9pfs_init(struct XenDevice *xendev)
> >  return 0;
> >  }
> >  
> > +static int xen_9pfs_receive(struct Xen9pfsRing *ring)
> > +{
> > +struct xen_9pfs_header h;
> > +RING_IDX cons, prod, masked_prod, masked_cons;
> > +V9fsPDU *pdu;
> > +
> > +if (ring->inprogress) {
> > +return 0;
> > +}
> > +
> > +cons = ring->intf->out_cons;
> > +prod = ring->intf->out_prod;
> > +xen_rmb();
> > +
> > +if (xen_9pfs_queued(prod, cons, XEN_9PFS_RING_SIZE) < sizeof(h)) {
> > +return 0;
> > +}
> > +ring->inprogress = true;
> > +
> > +masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
> > +masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
> > +
> > +xen_9pfs_read_packet(ring->ring.out, masked_prod, &masked_cons,
> > + XEN_9PFS_RING_SIZE, (uint8_t*) &h, sizeof(h));
> > +
> > +pdu = pdu_alloc(&ring->priv->state);
> > +pdu->size = h.size;
> 
> 9P uses little-endian, so this should be:
> 
> pdu->size = le32_to_cpu(h.size);
> 
> > +pdu->id = h.id;
> > +pdu->tag = h.tag;
> 
> and:
> 
> pdu->tag = le16_to_cpu(h.tag);
> 
> > +ring->out_size = h.size;
> > +ring->out_cons = cons + h.size;
> 
> Same here.

OK, thanks. Xen doesn't support big endian today, but they don't hurt.


> > +
> > +qemu_co_queue_init(&pdu->complete);
> > +pdu_submit(pdu);
> > +
> > +return 0;
> > +}
> > +
> >  static void xen_9pfs_bh(void *opaque)
> >  {
> > +struct Xen9pfsRing *ring = opaque;
> > +xen_9pfs_receive(ring);
> >  }
> >  
> >  static void xen_9pfs_evtchn_event(void *opaque)
> >  {
> > +struct Xen9pfsRing *ring = opaque;
> > +evtchn_port_t port;
> > +
> > +port = xenevtchn_pending(ring->evtchndev);
> > +xenevtchn_unmask(ring->evtchndev, port);
> > +
> > +qemu_bh_schedule(ring->bh);
> >  }
> >  
> >  static int xen_9pfs_free(struct XenDevice *xendev)
> 
> 
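
The receive path above relies on two small pieces of index arithmetic:
masking an index down to an offset in the ring, and computing how many
bytes are queued. Below is a sketch of both, assuming free-running
32-bit producer/consumer indices and a power-of-two ring size. The ex_*
names are hypothetical; this is not the actual xen_9pfs_mask() or
xen_9pfs_queued() implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ex_ring_idx;

/* Reduce a free-running index to an offset inside the ring buffer. */
static ex_ring_idx ex_ring_mask(ex_ring_idx idx, ex_ring_idx ring_size)
{
    /* ring_size must be a non-zero power of two */
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return idx & (ring_size - 1);
}

/* Bytes produced but not yet consumed; unsigned math handles wrap-around. */
static ex_ring_idx ex_ring_queued(ex_ring_idx prod, ex_ring_idx cons)
{
    return prod - cons;
}

/* True if at least header_len bytes are waiting to be read. */
static int ex_ring_has_header(ex_ring_idx prod, ex_ring_idx cons,
                              ex_ring_idx header_len)
{
    return ex_ring_queued(prod, cons) >= header_len;
}
```

The last helper corresponds to the early "return 0" in the patch when
fewer than sizeof(h) bytes are available.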


Re: [Xen-devel] [PATCH v2 8/9] xen/9pfs: send responses back to the frontend

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:59 -0700
> Stefano Stabellini  wrote:
> 
> > Once a request is completed, xen_9pfs_push_and_notify gets called. In
> > xen_9pfs_push_and_notify, update the indexes (data has already been
> > copied to the sg by the common code) and send a notification to the
> > frontend.
> > 
> > Schedule the bottom-half to check if we already have any other requests
> > pending.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> >  hw/9pfs/xen-9p-backend.c | 16 
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > index d72a749..fed369f 100644
> > --- a/hw/9pfs/xen-9p-backend.c
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -166,6 +166,22 @@ static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU *pdu,
> >  
> >  static void xen_9pfs_push_and_notify(V9fsPDU *pdu)
> >  {
> > +RING_IDX prod;
> > +struct Xen9pfsDev *priv = container_of(pdu->s, struct Xen9pfsDev, state);
> > +struct Xen9pfsRing *ring = &priv->rings[pdu->tag % priv->num_rings];
> > +
> 
> Coding style for structured types.

Yep



> > +ring->intf->out_cons = ring->out_cons;
> > +xen_wmb();
> > +
> > +prod = ring->intf->in_prod;
> > +xen_rmb();
> > +ring->intf->in_prod = prod + pdu->size;
> > +xen_wmb();
> > +
> > +ring->inprogress = false;
> > +xenevtchn_notify(ring->evtchndev, ring->local_port);
> > +
> > +qemu_bh_schedule(ring->bh);
> >  }
> >  
> >  static const struct V9fsTransport xen_9p_transport = {
> 
> 


Re: [Xen-devel] [Qemu-devel] [PATCH v2 7/9] xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:58 -0700
> Stefano Stabellini  wrote:
> 
> > Implement xen_9pfs_init_in/out_iov_from_pdu and
> > xen_9pfs_pdu_vmarshal/vunmarshal by creating new sg pointing to the
> > data on the ring.
> > 
> > This is safe as we only handle one request per ring at any given time.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> >  hw/9pfs/xen-9p-backend.c | 91 
> > ++--
> >  1 file changed, 89 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > index 741dd31..d72a749 100644
> > --- a/hw/9pfs/xen-9p-backend.c
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -48,12 +48,77 @@ typedef struct Xen9pfsDev {
> >  struct Xen9pfsRing *rings;
> >  } Xen9pfsDev;
> >  
> > +static void xen_9pfs_in_sg(struct Xen9pfsRing *ring,
> 
> Coding style for structured types.

OK


> > +   struct iovec *in_sg,
> > +   int *num,
> > +   uint32_t idx,
> > +   uint32_t size)
> > +{
> > +RING_IDX cons, prod, masked_prod, masked_cons;
> > +
> > +cons = ring->intf->in_cons;
> > +prod = ring->intf->in_prod;
> > +xen_rmb();
> > +masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
> > +masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
> > +
> > +if (masked_prod < masked_cons) {
> > +in_sg[0].iov_base = ring->ring.in + masked_prod;
> > +in_sg[0].iov_len = masked_cons - masked_prod;
> > +*num = 1;
> > +} else {
> > +in_sg[0].iov_base = ring->ring.in + masked_prod;
> > +in_sg[0].iov_len = XEN_9PFS_RING_SIZE - masked_prod;
> > +in_sg[1].iov_base = ring->ring.in;
> > +in_sg[1].iov_len = masked_cons;
> > +*num = 2;
> > +}
> > +}
> > +
> > +static void xen_9pfs_out_sg(struct Xen9pfsRing *ring,
> 
> Coding style for structured types.

OK

> > +struct iovec *out_sg,
> > +int *num,
> > +uint32_t idx)
> > +{
> > +RING_IDX cons, prod, masked_prod, masked_cons;
> > +
> > +cons = ring->intf->out_cons;
> > +prod = ring->intf->out_prod;
> > +xen_rmb();
> > +masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
> > +masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
> > +
> > +if (masked_cons < masked_prod) {
> > +out_sg[0].iov_base = ring->ring.out + masked_cons;
> > +out_sg[0].iov_len = ring->out_size;
> > +*num = 1;
> > +} else {
> > +if (ring->out_size > (XEN_9PFS_RING_SIZE - masked_cons)) {
> > +out_sg[0].iov_base = ring->ring.out + masked_cons;
> > +out_sg[0].iov_len = XEN_9PFS_RING_SIZE - masked_cons;
> > +out_sg[1].iov_base = ring->ring.out;
> > +out_sg[1].iov_len = ring->out_size - (XEN_9PFS_RING_SIZE - masked_cons);
> > +*num = 2;
> > +} else {
> > +out_sg[0].iov_base = ring->ring.out + masked_cons;
> > +out_sg[0].iov_len = ring->out_size;
> > +*num = 1;
> > +}
> > +}
> > +}
> > +
> >  static ssize_t xen_9pfs_pdu_vmarshal(V9fsPDU *pdu,
> >   size_t offset,
> >   const char *fmt,
> >   va_list ap)
> >  {
> > -return 0;
> > +struct Xen9pfsDev *xen_9pfs = container_of(pdu->s, struct Xen9pfsDev, state);
> 
> Coding style for structured types.

OK 


> > +struct iovec in_sg[2];
> > +int num;
> > +
> > +xen_9pfs_in_sg(&xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings],
> > +   in_sg, &num, pdu->idx, ROUND_UP(offset + 128, 512));
> > +return v9fs_iov_vmarshal(in_sg, num, offset, 0, fmt, ap);
> >  }
> >  
> >  static ssize_t xen_9pfs_pdu_vunmarshal(V9fsPDU *pdu,
> > @@ -61,13 +126,27 @@ static ssize_t xen_9pfs_pdu_vunmarshal(V9fsPDU *pdu,
> > const char *fmt,
> > va_list ap)
> >  {
> > -return 0;
> > +struct Xen9pfsDev *xen_9pfs = container_of(pdu->s, struct Xen9pfsDev, state);
> 
> Coding style for structured types.

OK


> > +struct iovec out_sg[2];
> > +int num;
> > +
> > +xen_9pfs_out_sg(&xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings],
> > +out_sg, &num, pdu->idx);
> > +return v9fs_iov_vunmarshal(out_sg, num, offset, 0, fmt, ap);
> >  }
> >  
> >  static void xen_9pfs_init_out_iov_from_pdu(V9fsPDU *pdu,
> > struct iovec **piov,
> > unsigned int *pniov)
> >  {
> > +struct Xen9pfsDev 
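
The wrap-around handling in xen_9pfs_in_sg quoted above can be tried out on its own. The following is a minimal sketch under stated assumptions: RING_SIZE, ring_mask() and ring_free_sg() are hypothetical names, the ring size is a small power of two chosen for readability, and struct iovec comes from POSIX <sys/uio.h>. It mirrors the two branches of the quoted function: one segment when the masked producer is below the masked consumer, otherwise two segments split at the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define RING_SIZE 16u   /* hypothetical; must be a power of two */

static uint32_t ring_mask(uint32_t idx)
{
    return idx & (RING_SIZE - 1);
}

/*
 * Describe the free space of a ring (from the producer up to the
 * consumer) with at most two iovecs. prod and cons are free-running
 * indexes. Returns the number of iovecs filled in.
 */
static int ring_free_sg(unsigned char *base, uint32_t prod, uint32_t cons,
                        struct iovec sg[2])
{
    uint32_t mp = ring_mask(prod);
    uint32_t mc = ring_mask(cons);

    assert(base);

    if (mp < mc) {
        /* Free space is contiguous: [mp, mc). */
        sg[0].iov_base = base + mp;
        sg[0].iov_len = mc - mp;
        return 1;
    }

    /* Free space wraps: [mp, RING_SIZE) followed by [0, mc). */
    sg[0].iov_base = base + mp;
    sg[0].iov_len = RING_SIZE - mp;
    sg[1].iov_base = base;
    sg[1].iov_len = mc;
    return 2;
}
```

With a 16-byte ring, prod = 20 and cons = 10 give a single 6-byte segment, while prod = 12 and cons = 5 give a 4-byte segment at the end of the buffer plus a 5-byte segment at the start.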

Re: [Xen-devel] [Qemu-devel] [PATCH v2 6/9] xen/9pfs: receive requests from the frontend

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:57 -0700
> Stefano Stabellini  wrote:
> 
> > Upon receiving an event channel notification from the frontend, schedule
> > the bottom half. From the bottom half, read one request from the ring,
> > create a pdu and call pdu_submit to handle it.
> > 
> > For now, only handle one request per ring at a time.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> >  hw/9pfs/xen-9p-backend.c | 47 
> > +++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > index 0e4a133..741dd31 100644
> > --- a/hw/9pfs/xen-9p-backend.c
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -94,12 +94,59 @@ static int xen_9pfs_init(struct XenDevice *xendev)
> >  return 0;
> >  }
> >  
> > +static int xen_9pfs_receive(struct Xen9pfsRing *ring)
> 
> Coding style for structured types.

I'll fix


> > +{
> > +struct xen_9pfs_header h;
> > +RING_IDX cons, prod, masked_prod, masked_cons;
> > +V9fsPDU *pdu;
> > +
> > +if (ring->inprogress) {
> > +return 0;
> > +}
> > +
> > +cons = ring->intf->out_cons;
> > +prod = ring->intf->out_prod;
> > +xen_rmb();
> > +
> > +if (xen_9pfs_queued(prod, cons, XEN_9PFS_RING_SIZE) < sizeof(h)) {
> > +return 0;
> > +}
> > +ring->inprogress = true;
> > +
> > +masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
> > +masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
> > +
> > +xen_9pfs_read_packet(ring->ring.out, masked_prod, &masked_cons,
> > + XEN_9PFS_RING_SIZE, (uint8_t*) &h, sizeof(h));
> > +
> > +pdu = pdu_alloc(&ring->priv->state);
> 
> pdu_alloc() can theoretically return NULL. Maybe add a comment to
> indicate this won't happen because "for now [we] only handle one
> request per ring at a time".

OK


> > +pdu->size = h.size;
> > +pdu->id = h.id;
> > +pdu->tag = h.tag;
> > +ring->out_size = h.size;
> > +ring->out_cons = cons + h.size;
> > +
> > +qemu_co_queue_init(&pdu->complete);
> > +pdu_submit(pdu);
> > +
> > +return 0;
> > +}
> > +
> >  static void xen_9pfs_bh(void *opaque)
> >  {
> > +struct Xen9pfsRing *ring = opaque;
> 
> Coding style for structured types.

OK


> > +xen_9pfs_receive(ring);
> >  }
> >  
> >  static void xen_9pfs_evtchn_event(void *opaque)
> >  {
> > +struct Xen9pfsRing *ring = opaque;
> 
> Coding style for structured types.

OK


> > +evtchn_port_t port;
> > +
> > +port = xenevtchn_pending(ring->evtchndev);
> > +xenevtchn_unmask(ring->evtchndev, port);
> > +
> > +qemu_bh_schedule(ring->bh);
> >  }
> >  
> >  static int xen_9pfs_free(struct XenDevice *xendev)
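
The receive path quoted above first checks how many bytes are queued and then copies the request header out of the ring. A minimal sketch of those two steps follows. It assumes free-running 32-bit indexes and a power-of-two ring size; ring_queued() and ring_read() are hypothetical helpers written for this illustration and are not the xen_9pfs_queued/xen_9pfs_read_packet implementations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 16u   /* hypothetical; must be a power of two */

static uint32_t ring_mask(uint32_t idx)
{
    return idx & (RING_SIZE - 1);
}

/* Bytes waiting between two free-running indexes (assumes prod is
 * never more than RING_SIZE ahead of cons). */
static uint32_t ring_queued(uint32_t prod, uint32_t cons)
{
    return prod - cons;   /* unsigned arithmetic survives index overflow */
}

/*
 * Copy len bytes starting at *cons out of the ring, following the wrap
 * at the end of the buffer, and advance *cons past them.
 */
static void ring_read(const unsigned char *ring, uint32_t *cons,
                      void *dst, uint32_t len)
{
    uint32_t mc = ring_mask(*cons);
    uint32_t first = RING_SIZE - mc;   /* bytes before the wrap point */

    assert(len <= RING_SIZE);
    if (first > len) {
        first = len;
    }
    memcpy(dst, ring + mc, first);
    memcpy((unsigned char *)dst + first, ring, len - first);
    *cons += len;
}
```

A caller would compare ring_queued() against the header size before calling ring_read(), which is the same guard the patch applies before setting inprogress.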



Re: [Xen-devel] [PATCH v2 5/9] xen/9pfs: connect to the frontend

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:56 -0700
> Stefano Stabellini  wrote:
> 
> > Write the limits of the backend to xenstore. Connect to the frontend.
> > Upon connection, allocate the rings according to the protocol
> > specification.
> > 
> > Initialize a QEMUBH to schedule work upon receiving an event channel
> > notification from the frontend.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> >  hw/9pfs/xen-9p-backend.c | 159 
> > ++-
> >  1 file changed, 158 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > index 35032d3..0e4a133 100644
> > --- a/hw/9pfs/xen-9p-backend.c
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -17,8 +17,35 @@
> >  #include "qemu/config-file.h"
> >  #include "fsdev/qemu-fsdev.h"
> >  
> > +#define VERSIONS "1"
> > +#define MAX_RINGS 8
> > +#define MAX_RING_ORDER 8
> > +
> > +struct Xen9pfsRing {
> > +struct Xen9pfsDev *priv;
> > +
> > +int ref;
> > +xenevtchn_handle   *evtchndev;
> > +int evtchn;
> > +int local_port;
> > +struct xen_9pfs_data_intf *intf;
> > +unsigned char *data;
> > +struct xen_9pfs_data ring;
> > +
> > +QEMUBH *bh;
> > +
> > +/* local copies, so that we can read/write PDU data directly from
> > + * the ring */
> > +RING_IDX out_cons, out_size, in_cons;
> > +bool inprogress;
> > +};
> > +
> 
> QEMU Coding style requires a typedef.

OK


> FWIW, maybe you could also convert
> other structured types like XenDevice or XenDevOps.

I'll do in a separate series


> >  typedef struct Xen9pfsDev {
> >  struct XenDevice xendev;  /* must be first */
> > +V9fsState state;
> > +
> > +int num_rings;
> > +struct Xen9pfsRing *rings;
> >  } Xen9pfsDev;
> >  
> >  static ssize_t xen_9pfs_pdu_vmarshal(V9fsPDU *pdu,
> > @@ -67,22 +94,152 @@ static int xen_9pfs_init(struct XenDevice *xendev)
> >  return 0;
> >  }
> >  
> > +static void xen_9pfs_bh(void *opaque)
> > +{
> > +}
> > +
> > +static void xen_9pfs_evtchn_event(void *opaque)
> > +{
> > +}
> > +
> >  static int xen_9pfs_free(struct XenDevice *xendev)
> >  {
> > -return -1;
> > +int i;
> > +struct Xen9pfsDev *xen_9pdev = container_of(xendev, struct Xen9pfsDev, xendev);
> > +
> 
> Coding style.

OK


> > +for (i = 0; i < xen_9pdev->num_rings; i++) {
> > +if (xen_9pdev->rings[i].data != NULL) {
> > +xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
> > +xen_9pdev->rings[i].data,
> > +(1 << XEN_9PFS_RING_ORDER));
> > +}
> > +if (xen_9pdev->rings[i].intf != NULL) {
> > +xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
> > +xen_9pdev->rings[i].intf,
> > +1);
> > +}
> > +if (xen_9pdev->rings[i].evtchndev > 0) {
> > +qemu_set_fd_handler(xenevtchn_fd(xen_9pdev->rings[i].evtchndev),
> > +NULL, NULL, NULL);
> > +xenevtchn_unbind(xen_9pdev->rings[i].evtchndev, xen_9pdev->rings[i].local_port);
> > +}
> > +if (xen_9pdev->rings[i].bh != NULL) {
> > +qemu_bh_delete(xen_9pdev->rings[i].bh);
> > +}
> > +}
> > +g_free(xen_9pdev->rings);
> > +return 0;
> >  }
> >  
> >  static int xen_9pfs_connect(struct XenDevice *xendev)
> >  {
> > +int i;
> > +struct Xen9pfsDev *xen_9pdev = container_of(xendev, struct Xen9pfsDev, xendev);
> 
> Coding style.

OK


> > +V9fsState *s = &xen_9pdev->state;
> > +QemuOpts *fsdev;
> > +char *security_model, *path;
> > +
> > +if (xenstore_read_fe_int(&xen_9pdev->xendev, "num-rings",
> > + &xen_9pdev->num_rings) == -1 ||
> > +xen_9pdev->num_rings > MAX_RINGS) {
> > +return -1;
> > +}
> > +
> > +xen_9pdev->rings = g_malloc0(xen_9pdev->num_rings * sizeof(struct Xen9pfsRing));
> > +for (i = 0; i < xen_9pdev->num_rings; i++) {
> > +char str[16];
> > +
> > +xen_9pdev->rings[i].priv = xen_9pdev;
> > +xen_9pdev->rings[i].evtchn = -1;
> > +xen_9pdev->rings[i].local_port = -1;
> > +
> > +sprintf(str, "ring-ref%u", i);
> > +if (xenstore_read_fe_int(&xen_9pdev->xendev, str,
> > + &xen_9pdev->rings[i].ref) == -1) {
> > +goto out;
> > +}
> > +sprintf(str, "event-channel-%u", i);
> > +if (xenstore_read_fe_int(&xen_9pdev->xendev, str,
> > + &xen_9pdev->rings[i].evtchn) == -1) {
> > +goto out;
> > +}
> > +
> > +xen_9pdev->rings[i].intf =  xengnttab_map_grant_ref(
> > +xen_9pdev->xendev.gnttabdev,
> > +
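
The connect path quoted above builds one xenstore key per ring ("ring-ref%u", "event-channel-%u") in a 16-byte buffer with sprintf. A bounded variant can be sketched as follows. This is only an illustration: ring_key() is a hypothetical helper and is not part of the patch, and MAX_RINGS is copied from the quoted diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RINGS 8   /* same limit as in the quoted patch */

/*
 * Format "<prefix><index>" into buf.
 * Returns 0 on success, -1 if the index is out of range or the
 * result (including the terminating NUL) does not fit in len bytes.
 */
static int ring_key(char *buf, size_t len, const char *prefix, unsigned int i)
{
    int n;

    assert(buf && prefix);

    if (i >= MAX_RINGS) {
        return -1;
    }
    n = snprintf(buf, len, "%s%u", prefix, i);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

"event-channel-7" needs exactly 16 bytes including the terminator, so the buffer in the patch is just large enough for single-digit ring numbers; checking the snprintf result makes that limit explicit.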

[Xen-devel] [qemu-mainline test] 106682: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106682 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106682/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail REGR. vs. 106574

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail like 106574
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 106574
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 106574
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail like 106574
 test-amd64-amd64-xl-rtds 9 debian-install fail like 106574
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail like 106574

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 build-arm64 5 xen-build fail never pass
 build-arm64-xsm 5 xen-build fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-pvops 5 kernel-build fail never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 saverestore-support-check fail never pass
 test-armhf-armhf-xl 12 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail never pass

version targeted for testing:
 qemuu d84f714eafedd8bb9d4aaec8b76417bef8e3535e
baseline version:
 qemuu dd4d2578215cd380f40a38028a9904e15b135ef3

Last test of basis   106574  2017-03-09 19:12:40 Z  6 days
Failing since        106635  2017-03-13 11:48:20 Z  2 days  5 attempts
Testing same since   106682  2017-03-15 08:02:07 Z  0 days  1 attempts


People who touched revisions under test:
  Alex Bennée 
  Alexander Boettcher 
  Andrew Baumann 
  Andrew Jones 
  Christian Borntraeger 
  Christopher Covington 
  David Gibson 
  Dr. David Alan Gilbert 

Re: [Xen-devel] [raisin] Using cirros for tests???

2017-03-15 Thread Gémes Géza

On 2017-03-13 at 22:31, Stefano Stabellini wrote:

On Thu, 9 Mar 2017, Gémes Géza wrote:

Hi,

I've spent the last couple of days trying to make the raisin tests run on
different distributions. I have tried Ubuntu 16.04, Ubuntu 14.04, CentOS 7 and
CentOS 6.8 so far. The tests fail for different reasons on these
distributions:

1. busybox-pv passes on Ubuntu 14.04 and 16.04. It fails on CentOS 7 (there
is no busybox in the default repositories, and enabling EPEL might be too
intrusive). It also fails on CentOS 6 (I haven't tracked that down yet).

2. busybox-hvm fails on all the distros I tried. On Ubuntu and CentOS 7 (all of
them have grub 2.0.2) grub fails to find the filesystem with stage2 and
stops at boot at grub-rescue> with no partitions recognized. In addition, on
Ubuntu 16.04 the lopartsetup script fails to set up the partition correctly,
which could be fixed quite easily.

My idea is that instead of trying to fix the tests (and to continue to do so
for upcoming distro releases) we could start using cirros images for the
tests.

I'd start transforming the existing tests to use cirros if you agree with the
proposal.

We need both pv and hvm tests, because they test different hypervisor
functionalities. But it would be fine to replace our hand crafted VM
filesystem based on busybox with Cirros. In other words, it is fine by
me, but we need both cirros-pv and cirros-hvm tests.


Hi Stefano,

I've started creating the cirros tests. So far I have three PV tests:
separate-kernel, pygrub and pvgrub2. On Ubuntu 14.04 with Xen installed
by raisin, the pygrub test fails because pygrub cannot find a Python
library, which I'm going to investigate later on. The other two tests pass.
I'll prepare patches soon.


Cheers,

Geza




[Xen-devel] [RFC PATCH 4/9] xen/arm: p2m: Update IOMMU mapping whenever possible if page table is not shared

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Update the IOMMU mapping if the IOMMU doesn't share the page table with the CPU.
The best place to do so on ARM is p2m_set_entry(). Use mfn as an indicator
of the required action: if mfn is valid, call iommu_map_pages();
otherwise, call iommu_unmap_pages().

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/arch/arm/p2m.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 1fc6ca3..84d3a09 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -979,7 +979,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 if ( p2m_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
 p2m_free_entry(p2m, orig_pte, level);
 
-if ( need_iommu(p2m->domain) && (p2m_valid(orig_pte) || p2m_valid(*entry)) )
+if ( need_iommu(p2m->domain) && iommu_use_hap_pt(d) &&
+ (p2m_valid(orig_pte) || p2m_valid(*entry)) )
 rc = iommu_iotlb_flush(p2m->domain, gfn_x(sgfn), 1UL << page_order);
 else
 rc = 0;
@@ -997,6 +998,9 @@ int p2m_set_entry(struct p2m_domain *p2m,
   p2m_type_t t,
   p2m_access_t a)
 {
+unsigned long orig_nr = nr;
+gfn_t orig_sgfn = sgfn;
+mfn_t orig_smfn = smfn;
 int rc = 0;
 
 while ( nr )
@@ -1029,6 +1033,40 @@ int p2m_set_entry(struct p2m_domain *p2m,
 nr -= (1 << order);
 }
 
+if ( likely(!rc) )
+{
+/*
+ * It's time to update IOMMU mapping if the latter doesn't
+ * share page table with the CPU. Pass the whole memory block to let
+ * the IOMMU code decide what to do with it.
+ * The reason to update IOMMU mapping outside "while loop" is that
+ * the IOMMU might not support the pages/superpages the CPU can deal
+ * with (4K, 2M, 1G) and as result this will lead to non-optimal
+ * mapping.
+ * Also we assume that the IOMMU mapping should be updated only
+ * if CPU mapping passed successfully.
+ */
+if ( need_iommu(p2m->domain) && !iommu_use_hap_pt(p2m->domain) )
+{
+if ( !mfn_eq(orig_smfn, INVALID_MFN) )
+{
+unsigned int flags = p2m_get_iommu_flags(t);
+
+rc = iommu_map_pages(p2m->domain,
+ gfn_x(orig_sgfn),
+ mfn_x(orig_smfn),
+ orig_nr,
+ flags);
+}
+else
+{
+rc = iommu_unmap_pages(p2m->domain,
+   gfn_x(orig_sgfn),
+   orig_nr);
+}
+}
+}
+
 return rc;
 }
 
-- 
2.7.4
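
The decision made at the end of p2m_set_entry() in the patch above can be summarised as a small pure function. This is a hedged sketch, not Xen code: enum iommu_action, iommu_action_for() and INVALID_MFN_SKETCH are hypothetical names, and the booleans stand in for need_iommu() and iommu_use_hap_pt().

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_MFN_SKETCH UINT64_MAX   /* stand-in for INVALID_MFN */

enum iommu_action { IOMMU_NONE, IOMMU_MAP, IOMMU_UNMAP };

/*
 * Decide what to do with the IOMMU after the CPU page tables have
 * been updated for a whole gfn range.
 *   cpu_rc     - result of the CPU-side update (0 on success)
 *   need_iommu - the domain uses an IOMMU at all
 *   shares_pt  - the IOMMU walks the CPU page table directly
 *   mfn        - first mfn of the range, or INVALID_MFN_SKETCH
 */
static enum iommu_action iommu_action_for(int cpu_rc, int need_iommu,
                                          int shares_pt, uint64_t mfn)
{
    if (cpu_rc != 0) {
        return IOMMU_NONE;      /* only follow a successful CPU update */
    }
    if (!need_iommu || shares_pt) {
        return IOMMU_NONE;      /* nothing separate to keep in sync */
    }
    return mfn != INVALID_MFN_SKETCH ? IOMMU_MAP : IOMMU_UNMAP;
}
```

A valid mfn therefore leads to a map of the whole range and an invalid one to an unmap, which matches the commit message's use of mfn as the indicator.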




[Xen-devel] [RFC PATCH 2/9] iommu: Add ability to map/unmap the number of pages

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Extend the IOMMU code with new APIs and platform callbacks.
The new map_pages/unmap_pages APIs do almost the same thing
as the existing map_page/unmap_page ones, except that the new ones
can handle multiple pages at once. The same applies to the new
platform callbacks.

Currently, this patch requires modifying neither the
existing IOMMU drivers nor the P2M code.
However, the patch might be rewritten to replace the existing
single-page code with the multi-page one, followed by modifications
of all related parts.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/drivers/passthrough/iommu.c | 50 -
 xen/include/xen/iommu.h | 16 ++---
 2 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 5e81813..115698f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -249,22 +249,37 @@ void iommu_domain_destroy(struct domain *d)
 arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-   unsigned int flags)
+int iommu_map_pages(struct domain *d, unsigned long gfn, unsigned long mfn,
+unsigned long page_count, unsigned int flags)
 {
 const struct domain_iommu *hd = dom_iommu(d);
-int rc;
+int rc = 0;
+unsigned long i;
 
 if ( !iommu_enabled || !hd->platform_ops )
 return 0;
 
-rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+if ( hd->platform_ops->map_pages )
+rc = hd->platform_ops->map_pages(d, gfn, mfn, page_count, flags);
+else
+{
+for ( i = 0; i < page_count; i++ )
+{
+rc = hd->platform_ops->map_page(d, gfn + i, mfn + i, flags);
+if ( unlikely(rc) )
+{
+/* TODO Do we need to unmap if map failed? */
+break;
+}
+}
+}
+
 if ( unlikely(rc) )
 {
 if ( !d->is_shutting_down && printk_ratelimit() )
 printk(XENLOG_ERR
-   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
-   d->domain_id, gfn, mfn, rc);
+   "d%d: IOMMU mapping gfn %#lx to mfn %#lx page count %lu failed: %d\n",
+   d->domain_id, gfn, mfn, page_count, rc);
 
 if ( !is_hardware_domain(d) )
 domain_crash(d);
@@ -273,21 +288,34 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
 return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long gfn)
+int iommu_unmap_pages(struct domain *d, unsigned long gfn,
+  unsigned long page_count)
 {
 const struct domain_iommu *hd = dom_iommu(d);
-int rc;
+int ret, rc = 0;
+unsigned long i;
 
 if ( !iommu_enabled || !hd->platform_ops )
 return 0;
 
-rc = hd->platform_ops->unmap_page(d, gfn);
+if ( hd->platform_ops->unmap_pages )
+rc = hd->platform_ops->unmap_pages(d, gfn, page_count);
+else
+{
+for ( i = 0; i < page_count; i++ )
+{
+ret = hd->platform_ops->unmap_page(d, gfn + i);
+if ( likely(!rc) )
+rc = ret;
+}
+}
+
 if ( unlikely(rc) )
 {
 if ( !d->is_shutting_down && printk_ratelimit() )
 printk(XENLOG_ERR
-   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
-   d->domain_id, gfn, rc);
+   "d%d: IOMMU unmapping gfn %#lx page count %lu failed: %d\n",
+   d->domain_id, gfn, page_count, rc);
 
 if ( !is_hardware_domain(d) )
 domain_crash(d);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 5803e3f..0446ed3 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -76,9 +76,14 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long gfn,
-unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check iommu_map_pages(struct domain *d, unsigned long gfn,
+ unsigned long mfn, unsigned long page_count,
+ unsigned int flags);
+int __must_check iommu_unmap_pages(struct domain *d, unsigned long gfn,
+   unsigned long page_count);
+
+#define iommu_map_page(d,gfn,mfn,flags) (iommu_map_pages(d,gfn,mfn,1,flags))
+#define iommu_unmap_page(d,gfn) (iommu_unmap_pages(d,gfn,1))
 
 enum iommu_feature
 {
@@ -170,7 +175,12 @@ struct iommu_ops {
 void (*teardown)(struct domain *d);
 int __must_check (*map_page)(struct domain *d, unsigned long gfn,
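
The fallback introduced by iommu_map_pages() above (use the multi-page callback when the driver provides one, otherwise loop over the single-page callback and stop at the first error) can be sketched outside Xen. Everything here is an assumption made for illustration: struct ops_sketch, map_pages_sketch() and the stub callback are hypothetical and drop the domain argument for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of the platform callbacks. */
struct ops_sketch {
    int (*map_page)(unsigned long gfn, unsigned long mfn, unsigned int flags);
    /* Optional: may be NULL when the driver only maps single pages. */
    int (*map_pages)(unsigned long gfn, unsigned long mfn,
                     unsigned long count, unsigned int flags);
};

static unsigned long single_calls;        /* how often the stub ran */
static unsigned long fail_gfn = ~0UL;     /* gfn at which the stub fails */

/* Stub single-page callback used to exercise the fallback path. */
static int stub_map_page(unsigned long gfn, unsigned long mfn,
                         unsigned int flags)
{
    (void)mfn;
    (void)flags;
    single_calls++;
    return gfn == fail_gfn ? -1 : 0;
}

static int map_pages_sketch(const struct ops_sketch *ops, unsigned long gfn,
                            unsigned long mfn, unsigned long count,
                            unsigned int flags)
{
    unsigned long i;
    int rc = 0;

    assert(ops && ops->map_page);

    if (ops->map_pages) {
        return ops->map_pages(gfn, mfn, count, flags);
    }
    for (i = 0; i < count; i++) {
        rc = ops->map_page(gfn + i, mfn + i, flags);
        if (rc) {
            break;   /* pages mapped so far are left in place */
        }
    }
    return rc;
}
```

Like the patch, the sketch leaves already-mapped pages in place on failure, which is exactly the open question raised by the TODO in the diff.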
  

[Xen-devel] [RFC PATCH 1/9] xen/device-tree: Add dt_count_phandle_with_args helper

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Port the Linux helper of_count_phandle_with_args for counting
the number of phandles in a property.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/common/device_tree.c  |  7 +++
 xen/include/xen/device_tree.h | 19 +++
 2 files changed, 26 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 7b009ea..60b0095 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1663,6 +1663,13 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
 index, out_args);
 }
 
+int dt_count_phandle_with_args(const struct dt_device_node *np,
+   const char *list_name,
+   const char *cells_name)
+{
+return __dt_parse_phandle_with_args(np, list_name, cells_name, 0, -1, NULL);
+}
+
 /**
  * unflatten_dt_node - Alloc and populate a device_node from the flat tree
  * @fdt: The parent device tree blob
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 0aecbe0..738f1b6 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -764,6 +764,25 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
const char *cells_name, int index,
struct dt_phandle_args *out_args);
 
+/**
+ * dt_count_phandle_with_args() - Find the number of phandles references in a property
+ * @np: pointer to a device tree node containing a list
+ * @list_name: property name that contains a list
+ * @cells_name: property name that specifies phandles' arguments count
+ *
+ * Returns the number of phandle + argument tuples within a property. It
+ * is a typical pattern to encode a list of phandle and variable
+ * arguments into a single property. The number of arguments is encoded
+ * by a property in the phandle-target node. For example, a gpios
+ * property would contain a list of GPIO specifies consisting of a
+ * phandle and 1 or more arguments. The number of arguments are
+ * determined by the #gpio-cells property in the node pointed to by the
+ * phandle.
+ */
+int dt_count_phandle_with_args(const struct dt_device_node *np,
+   const char *list_name,
+   const char *cells_name);
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
 printk(XENLOG_DEBUG fmt, ## args)
-- 
2.7.4
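
The counting described in the kernel-doc comment above can be modelled on a flat array of cells. This is a simplified sketch and not the Xen or Linux implementation: count_phandle_tuples() and demo_cells() are hypothetical, the argument count per provider comes from a lookup function instead of a "#...-cells" property, and a phandle of 0 is treated as an empty slot without arguments that still counts as an entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup: argument cells expected by the node that
 * `phandle` points to, or -1 if the phandle is unknown. */
typedef int (*cells_fn)(uint32_t phandle);

/*
 * Count "phandle + arguments" tuples in a property given as an array
 * of 32-bit cells. Returns the number of tuples, or -1 if the list is
 * malformed (unknown phandle or truncated arguments).
 */
static int count_phandle_tuples(const uint32_t *cells, size_t ncells,
                                cells_fn cells_of)
{
    size_t pos = 0;
    int count = 0;

    assert(cells_of);

    while (pos < ncells) {
        uint32_t phandle = cells[pos++];

        if (phandle != 0) {
            int nargs = cells_of(phandle);

            if (nargs < 0 || (size_t)nargs > ncells - pos) {
                return -1;
            }
            pos += (size_t)nargs;   /* skip this tuple's arguments */
        }
        count++;
    }
    return count;
}

/* Demo provider table: phandle 1 takes two cells, phandle 2 takes one. */
static int demo_cells(uint32_t phandle)
{
    switch (phandle) {
    case 1: return 2;
    case 2: return 1;
    default: return -1;
    }
}
```

For the cell list { 1, 10, 0, 2, 5, 0, 1, 11, 1 } the walk sees phandle 1 with two arguments, phandle 2 with one argument, an empty slot, and phandle 1 again, so four entries in total.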




[Xen-devel] [RFC PATCH 0/9] "Non-shared" IOMMU support on ARM

2017-03-15 Thread Oleksandr Tyshchenko
Hi, all.

The purpose of this patch series is to create a base for porting
any "Non-shared" IOMMU to Xen on ARM. By "Non-shared" IOMMU I mean
an IOMMU that can't share the page table with the CPU.
Primarily, we are interested in IPMMU-VMSA, and I hope that it will be the
first candidate.
It is a VMSA-compatible IOMMU that is integrated in the newest Renesas R-Car
Gen3 SoCs (ARM).
This IOMMU can't share the page table with the CPU because it uses a
stage-1 page table, unlike the CPU, which uses stage-2. That is why I call it
a "Non-shared" IOMMU.

I have already raised a discussion [2] about some generic problems I faced
while porting the IPMMU-VMSA Linux driver to Xen. On this basis I made the
patches you can see in this request. Only the first patch is not related to
the IOMMU, but I decided to ship it with the current request since it is a
generic change which we will need shortly.

The reason for this patch series being an RFC is that I still have some doubts
about the generic code I touched.
I hope that I haven't broken anything for x86, but confirmation is needed.

I didn't include the IPMMU-VMSA driver in this request. Although it is still
in progress, I want to say that the passthrough use-cases (which is what all
of this is primarily needed for) work for me with some limitations.
I tested on a Salvator-X board (R-Car H3) with the following devices that have
DMA IPs connected to IPMMU uTLBs (using the current master branch):
1. AUDMA is assigned to dom0 (protected by IOMMU)
2. SDHC0 is assigned to dom1 (passthrough to domain)
3. SDHC3 is assigned to dom2 (passthrough to domain)

While porting the IPMMU-VMSA driver to Xen I tried to stay as close as
possible to Linux [1], but it was a little bit difficult. It would be really
nice to have some feedback and to hear your opinion of this driver.
I am also interested in whether I took the right direction or whether there
are other ideas for doing the same.

You can find the IPMMU-VMSA driver here:
repo: https://github.com/otyshchenko1/xen.git branch: ipmmu_ml
or
https://github.com/otyshchenko1/xen/commits/ipmmu_ml
It sits on top of the "Unshared" IOMMU patch series and consists of 6 patches.

Thank you.

[1] Question about porting IPMMU-VMSA Linux driver to XEN
https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00992.html
[2] Unshared IOMMU issues 
https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg01781.html

Oleksandr Tyshchenko (9):
  xen/device-tree: Add dt_count_phandle_with_args helper
  iommu: Add ability to map/unmap the number of pages
  xen/arm: p2m: Add helper to convert p2m type to IOMMU flags
  xen/arm: p2m: Update IOMMU mapping whenever possible if page table is
not shared
  iommu/arm: Re-define iommu_use_hap_pt(d) as iommu_hap_pt_share
  iommu: Pass additional use_iommu argument to iommu_domain_init()
  iommu/arm: Add alloc_page_table platform callback
  iommu: Split iommu_hwdom_init() into arch specific parts
  xen: Add use_iommu flag to createdomain domctl

 tools/libxl/libxl_create.c  |  5 ++
 xen/arch/arm/domain.c   |  4 +-
 xen/arch/arm/p2m.c  | 40 +++-
 xen/arch/x86/domain.c   |  4 +-
 xen/common/device_tree.c|  7 +++
 xen/common/domctl.c |  5 +-
 xen/drivers/passthrough/arm/iommu.c | 12 -
 xen/drivers/passthrough/arm/smmu.c  |  3 ++
 xen/drivers/passthrough/iommu.c | 91 -
 xen/drivers/passthrough/x86/iommu.c | 36 +++
 xen/include/asm-arm/iommu.h |  7 ++-
 xen/include/asm-arm/p2m.h   | 34 ++
 xen/include/public/domctl.h |  3 ++
 xen/include/xen/device_tree.h   | 19 
 xen/include/xen/iommu.h | 20 ++--
 xen/include/xen/sched.h |  3 ++
 16 files changed, 239 insertions(+), 54 deletions(-)

-- 
2.7.4




[Xen-devel] [RFC PATCH 7/9] iommu/arm: Add alloc_page_table platform callback

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

The alloc_page_table callback is mandatory
for IOMMUs that don't share the page table with the CPU on ARM.
An unshared IOMMU has to perform all required actions in this callback
so that it is ready to handle IOMMU mapping updates as soon as the
callback completes.

arch_iommu_populate_page_table() seems an appropriate place
to call the newly created callback.
Since we only get here for unshared IOMMUs, always
return an error if the callback isn't implemented.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/drivers/passthrough/arm/iommu.c | 5 +++--
 xen/include/xen/iommu.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 95b1abb..f132032 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -70,6 +70,7 @@ void arch_iommu_domain_destroy(struct domain *d)
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-/* The IOMMU shares the p2m with the CPU */
-return -ENOSYS;
+const struct iommu_ops *ops = iommu_get_ops();
+
+return ops->alloc_page_table ? ops->alloc_page_table(d) : -ENOSYS;
 }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index ab68ae2..3150d7b 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -181,6 +181,7 @@ struct iommu_ops {
 int __must_check (*unmap_page)(struct domain *d, unsigned long gfn);
 int __must_check (*unmap_pages)(struct domain *d, unsigned long gfn,
 unsigned long page_count);
+int (*alloc_page_table)(struct domain *d);
 void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
 void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
-- 
2.7.4
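
The dispatch added to arch_iommu_populate_page_table() above is the usual "optional callback" pattern. A minimal stand-alone sketch follows, with hypothetical types (struct domain_sketch, struct iommu_ops_sketch) standing in for the Xen ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct domain_sketch { int id; };        /* stand-in for struct domain */

struct iommu_ops_sketch {
    /* Optional: only drivers with their own page table implement it. */
    int (*alloc_page_table)(struct domain_sketch *d);
};

/* Call the callback when present, otherwise report "not implemented". */
static int populate_page_table(const struct iommu_ops_sketch *ops,
                               struct domain_sketch *d)
{
    assert(ops);
    return ops->alloc_page_table ? ops->alloc_page_table(d) : -ENOSYS;
}

/* Demo callback that pretends the allocation succeeded. */
static int demo_alloc(struct domain_sketch *d)
{
    (void)d;
    return 0;
}
```

Returning -ENOSYS for a missing callback keeps the previous behaviour for drivers that share the page table with the CPU, since they never provide the callback.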




[Xen-devel] [RFC PATCH 6/9] iommu: Pass additional use_iommu argument to iommu_domain_init()

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

The presence of this flag lets us know that the guest
has devices which will most likely be used for passthrough.
In that case we have to call iommu_construct(), which is
what the real assign_device call usually does.

As iommu_domain_init() is for now always called with the use_iommu flag
forced to false, no functional change is intended.

Basically, this patch is needed for unshared IOMMUs on ARM only,
since the unshared IOMMUs on x86 are OK if iommu_construct() is called
later. But, in order to be more generic and to allow possible future
optimization, make this change applicable to both platforms.
So, the aim of the patch is to make ARM happy without breaking x86.
Confirmation from x86 guys is needed.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/arch/arm/domain.c   |  2 +-
 xen/arch/x86/domain.c   |  2 +-
 xen/drivers/passthrough/iommu.c | 11 +--
 xen/include/xen/iommu.h |  2 +-
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bb327da..bab62ee 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -550,7 +550,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 ASSERT(config != NULL);
 
 /* p2m_init relies on some value initialized by the IOMMU subsystem */
-if ( (rc = iommu_domain_init(d)) != 0 )
+if ( (rc = iommu_domain_init(d, false)) != 0 )
 goto fail;
 
 if ( (rc = p2m_init(d)) != 0 )
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 479aee6..8ef4160 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -646,7 +646,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 if ( (rc = init_domain_irq_mapping(d)) != 0 )
 goto fail;
 
-if ( (rc = iommu_domain_init(d)) != 0 )
+if ( (rc = iommu_domain_init(d, false)) != 0 )
 goto fail;
 }
 spin_lock_init(&d->arch.e820_lock);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 115698f..6c17c59 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -129,7 +129,7 @@ static void __init parse_iommu_param(char *s)
 } while ( ss );
 }
 
-int iommu_domain_init(struct domain *d)
+int iommu_domain_init(struct domain *d, bool_t use_iommu)
 {
 struct domain_iommu *hd = dom_iommu(d);
 int ret = 0;
@@ -142,7 +142,14 @@ int iommu_domain_init(struct domain *d)
 return 0;
 
 hd->platform_ops = iommu_get_ops();
-return hd->platform_ops->init(d);
+ret = hd->platform_ops->init(d);
+if ( ret )
+return ret;
+
+if ( use_iommu && !is_hardware_domain(d) )
+ret = iommu_construct(d);
+
+return ret;
 }
 
 static void __hwdom_init check_hwdom_reqs(struct domain *d)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 0446ed3..ab68ae2 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -56,7 +56,7 @@ int iommu_setup(void);
 int iommu_add_device(struct pci_dev *pdev);
 int iommu_enable_device(struct pci_dev *pdev);
 int iommu_remove_device(struct pci_dev *pdev);
-int iommu_domain_init(struct domain *d);
+int iommu_domain_init(struct domain *d, bool_t use_iommu);
 void iommu_hwdom_init(struct domain *d);
 void iommu_domain_destroy(struct domain *d);
 int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn);
-- 
2.7.4
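
As a rough sketch of the control flow this patch gives
iommu_domain_init(), the following standalone C models "run the platform
init hook, then construct early only for non-hardware domains that asked
for it". All names here (struct domain_sketch, platform_init_sketch,
construct_sketch) are hypothetical; the early "IOMMU not enabled" return
of the real function is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-domain state touched here. */
struct domain_sketch {
    bool is_hardware_domain;
    int  init_result;   /* value the platform init hook would return */
    bool constructed;   /* set when the early construct step has run */
};

/* Stand-in for hd->platform_ops->init(d). */
static int platform_init_sketch(struct domain_sketch *d)
{
    return d->init_result;
}

/* Stand-in for iommu_construct(d). */
static int construct_sketch(struct domain_sketch *d)
{
    d->constructed = true;
    return 0;
}

static int domain_init_sketch(struct domain_sketch *d, bool use_iommu)
{
    int ret;

    assert(d != NULL);
    ret = platform_init_sketch(d);
    if (ret)
        return ret;     /* init failed: never construct */

    if (use_iommu && !d->is_hardware_domain)
        ret = construct_sketch(d);

    return ret;
}
```

With use_iommu passed as false, as both callers do in this patch, the
construct step is never reached, which is why no functional change is
expected yet.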




[Xen-devel] [RFC PATCH 8/9] iommu: Split iommu_hwdom_init() into arch specific parts

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

The logic on ARM is changed a bit.
When the IOMMU doesn't share page tables with the CPU and the
need_iommu flag is set, just call arch_iommu_populate_page_table()
to allow the unshared IOMMU to allocate resources.

No functional change for the x86 part.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/drivers/passthrough/arm/iommu.c |  7 +++
 xen/drivers/passthrough/iommu.c | 30 +-
 xen/drivers/passthrough/x86/iommu.c | 36 
 xen/include/xen/iommu.h |  1 +
 4 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index f132032..2198723 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const struct iommu_ops *iommu_ops;
 
@@ -59,6 +60,12 @@ void __hwdom_init arch_iommu_check_autotranslated_hwdom(struct domain *d)
 return;
 }
 
+void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
+{
+if ( need_iommu(d) && !iommu_use_hap_pt(d) )
+arch_iommu_populate_page_table(d);
+}
+
 int arch_iommu_domain_init(struct domain *d)
 {
 return iommu_dt_domain_init(d);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 6c17c59..cfe3bd1 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -177,36 +177,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
 
 register_keyhandler('o', _dump_p2m_table, "dump iommu p2m table", 0);
 d->need_iommu = !!iommu_dom0_strict;
-if ( need_iommu(d) && !iommu_use_hap_pt(d) )
-{
-struct page_info *page;
-unsigned int i = 0;
-int rc = 0;
 
-page_list_for_each ( page, &d->page_list )
-{
-unsigned long mfn = page_to_mfn(page);
-unsigned long gfn = mfn_to_gmfn(d, mfn);
-unsigned int mapping = IOMMUF_readable;
-int ret;
-
-if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
- ((page->u.inuse.type_info & PGT_type_mask)
-  == PGT_writable_page) )
-mapping |= IOMMUF_writable;
-
-ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
-if ( !rc )
-rc = ret;
-
-if ( !(i++ & 0xf) )
-process_pending_softirqs();
-}
-
-if ( rc )
-printk(XENLOG_WARNING "d%d: IOMMU mapping failed: %d\n",
-   d->domain_id, rc);
-}
+arch_iommu_hwdom_init(d);
 
 return hd->platform_ops->hwdom_init(d);
 }
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 69cd6c5..b353449 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -118,6 +118,42 @@ void __hwdom_init arch_iommu_check_autotranslated_hwdom(struct domain *d)
 panic("Presently, iommu must be enabled for PVH hardware domain\n");
 }
 
+void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
+{
+const struct domain_iommu *hd = dom_iommu(d);
+
+if ( need_iommu(d) && !iommu_use_hap_pt(d) )
+{
+struct page_info *page;
+unsigned int i = 0;
+int rc = 0;
+
+page_list_for_each ( page, &d->page_list )
+{
+unsigned long mfn = page_to_mfn(page);
+unsigned long gfn = mfn_to_gmfn(d, mfn);
+unsigned int mapping = IOMMUF_readable;
+int ret;
+
+if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
+ ((page->u.inuse.type_info & PGT_type_mask)
+  == PGT_writable_page) )
+mapping |= IOMMUF_writable;
+
+ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+if ( !rc )
+rc = ret;
+
+if ( !(i++ & 0xf) )
+process_pending_softirqs();
+}
+
+if ( rc )
+printk(XENLOG_WARNING "d%d: IOMMU mapping failed: %d\n",
+   d->domain_id, rc);
+}
+}
+
 int arch_iommu_domain_init(struct domain *d)
 {
 struct domain_iommu *hd = dom_iommu(d);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3150d7b..43cbb80 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -65,6 +65,7 @@ void arch_iommu_domain_destroy(struct domain *d);
 int arch_iommu_domain_init(struct domain *d);
 int arch_iommu_populate_page_table(struct domain *d);
 void arch_iommu_check_autotranslated_hwdom(struct domain *d);
+void arch_iommu_hwdom_init(struct domain *d);
 
 int iommu_construct(struct domain *d);
 
-- 
2.7.4
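
The x86 loop moved by this patch does two things per page: it picks the
mapping permissions and it keeps going after a failure while remembering
only the first error. A minimal standalone sketch of that pattern
follows. struct page_sketch and the F_* constants are hypothetical
stand-ins for page_info, the PGT_* checks and IOMMUF_readable /
IOMMUF_writable; the periodic softirq processing is left out.

```c
#include <assert.h>
#include <stddef.h>

#define F_READABLE 1u   /* hypothetical stand-in for IOMMUF_readable */
#define F_WRITABLE 2u   /* hypothetical stand-in for IOMMUF_writable */

struct page_sketch {
    unsigned int type_count;    /* stand-in for type_info & PGT_count_mask */
    int writable_type;          /* nonzero if the type is "writable page"  */
    int map_result;             /* what the map_page hook would return     */
    unsigned int mapped_flags;  /* flags the loop decided to map with      */
};

/* Readable always; writable if the page is untyped or typed writable. */
static unsigned int mapping_flags(const struct page_sketch *pg)
{
    unsigned int mapping = F_READABLE;

    if (pg->type_count == 0 || pg->writable_type)
        mapping |= F_WRITABLE;
    return mapping;
}

/* Map every page; return 0 or the first error seen, without stopping. */
static int map_all_pages(struct page_sketch *pages, size_t n)
{
    int rc = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int ret;

        pages[i].mapped_flags = mapping_flags(&pages[i]);
        ret = pages[i].map_result;  /* stand-in for the map_page call */
        if (!rc)
            rc = ret;               /* later errors do not overwrite it */
    }
    return rc;
}
```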




[Xen-devel] [RFC PATCH 9/9] xen: Add use_iommu flag to createdomain domctl

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

This flag is intended to let Xen know that the guest has devices
which will most likely be used for passthrough.
The primary aim of this knowledge is to help the IOMMUs that don't
share page tables with the CPU be ready before P2M code starts
updating IOMMU mapping.
So, if this flag is set the unshared IOMMUs will populate their
page tables at the domain creation time and thereby will be able
to handle IOMMU mapping updates from *the very beginning*.

Signed-off-by: Oleksandr Tyshchenko 
---
 tools/libxl/libxl_create.c  | 5 +
 xen/arch/arm/domain.c   | 4 +++-
 xen/arch/x86/domain.c   | 4 +++-
 xen/common/domctl.c | 5 -
 xen/include/public/domctl.h | 3 +++
 xen/include/xen/sched.h | 3 +++
 6 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e741b9a..4393fa2 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -546,6 +546,11 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
 flags |= XEN_DOMCTL_CDF_hap;
 }
 
+/* TODO Are these assumptions enough to make decision about using IOMMU? */
+if ((d_config->num_dtdevs && d_config->dtdevs) ||
+(d_config->num_pcidevs && d_config->pcidevs))
+flags |= XEN_DOMCTL_CDF_use_iommu;
+
 /* Ultimately, handle is an array of 16 uint8_t, same as uuid */
 libxl_uuid_copy(ctx, (libxl_uuid *)handle, >uuid);
 
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bab62ee..940bb98 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -539,6 +539,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
struct xen_arch_domainconfig *config)
 {
 int rc, count = 0;
+bool_t use_iommu;
 
 BUILD_BUG_ON(GUEST_MAX_VCPUS < MAX_VIRT_CPUS);
 d->arch.relmem = RELMEM_not_started;
@@ -550,7 +551,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 ASSERT(config != NULL);
 
 /* p2m_init relies on some value initialized by the IOMMU subsystem */
-if ( (rc = iommu_domain_init(d, false)) != 0 )
+use_iommu = !!(domcr_flags & DOMCRF_use_iommu);
+if ( (rc = iommu_domain_init(d, use_iommu)) != 0 )
 goto fail;
 
 if ( (rc = p2m_init(d)) != 0 )
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8ef4160..7d634ff 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -525,6 +525,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 {
 bool paging_initialised = false;
 int rc = -ENOMEM;
+bool_t use_iommu;
 
 if ( config == NULL && !is_idle_domain(d) )
 return -EINVAL;
@@ -646,7 +647,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 if ( (rc = init_domain_irq_mapping(d)) != 0 )
 goto fail;
 
-if ( (rc = iommu_domain_init(d, false)) != 0 )
+use_iommu = !!(domcr_flags & DOMCRF_use_iommu);
+if ( (rc = iommu_domain_init(d, use_iommu)) != 0 )
 goto fail;
 }
 spin_lock_init(&d->arch.e820_lock);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 93e3029..56c4d38 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -505,7 +505,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
| XEN_DOMCTL_CDF_hap
| XEN_DOMCTL_CDF_s3_integrity
| XEN_DOMCTL_CDF_oos_off
-   | XEN_DOMCTL_CDF_xs_domain)) )
+   | XEN_DOMCTL_CDF_xs_domain
+   | XEN_DOMCTL_CDF_use_iommu)) )
 break;
 
 dom = op->domain;
@@ -549,6 +550,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 domcr_flags |= DOMCRF_oos_off;
 if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_xs_domain )
 domcr_flags |= DOMCRF_xs_domain;
+if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_use_iommu )
+domcr_flags |= DOMCRF_use_iommu;
 
 d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref,
   >u.createdomain.config);
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 85cbb7c..a37a566 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -66,6 +66,9 @@ struct xen_domctl_createdomain {
  /* Is this a xenstore domain? */
 #define _XEN_DOMCTL_CDF_xs_domain 5
 #define XEN_DOMCTL_CDF_xs_domain  (1U<<_XEN_DOMCTL_CDF_xs_domain)
+ /* Should IOMMU page tables be populated at the domain creation time? */
+#define _XEN_DOMCTL_CDF_use_iommu 6
+#define XEN_DOMCTL_CDF_use_iommu  (1U<<_XEN_DOMCTL_CDF_use_iommu)
 uint32_t flags;
 struct xen_arch_domainconfig config;
 };
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 0929c0b..80e6fdc 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -561,6 
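
The sched.h hunk above is cut off in this archive, so the internal
DOMCRF_use_iommu value is not visible here. As a sketch of the
public-to-internal flag translation the domctl hunk performs, the
standalone C below uses the public bit positions from the domctl.h hunk
(5 and 6) and clearly hypothetical internal bits (CRF_*); only two of the
accepted flags are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Public createdomain flag bits, as in the domctl.h hunk. */
#define CDF_XS_DOMAIN (1U << 5)
#define CDF_USE_IOMMU (1U << 6)

/* Internal flags: bit positions are hypothetical, chosen for the sketch. */
#define CRF_XS_DOMAIN (1U << 0)
#define CRF_USE_IOMMU (1U << 1)

/*
 * Reject unknown public bits, then translate each known public flag
 * into its internal counterpart. *crf is written only on success.
 */
static int translate_create_flags(uint32_t cdf, unsigned int *crf)
{
    unsigned int out = 0;

    assert(crf != NULL);
    if (cdf & ~(CDF_XS_DOMAIN | CDF_USE_IOMMU))
        return -EINVAL;

    if (cdf & CDF_XS_DOMAIN)
        out |= CRF_XS_DOMAIN;
    if (cdf & CDF_USE_IOMMU)
        out |= CRF_USE_IOMMU;

    *crf = out;
    return 0;
}
```

Adding the new bit to the accepted mask matters as much as the
translation itself: without it, a toolstack setting the flag would have
the whole createdomain request rejected.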

[Xen-devel] [RFC PATCH 3/9] xen/arm: p2m: Add helper to convert p2m type to IOMMU flags

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

The helper has the same purpose as the existing x86 one.
It is used to choose the IOMMU mapping attributes according to
the memory type.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/include/asm-arm/p2m.h | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 0899523..4a93ba8 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include  /* for vm_event_response_t */
 #include 
 #include 
@@ -354,6 +355,39 @@ static inline gfn_t gfn_next_boundary(gfn_t gfn, unsigned int order)
 return gfn_add(gfn, 1UL << order);
 }
 
+/*
+ * p2m type to IOMMU flags
+ */
+static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
+{
+unsigned int flags;
+
+switch( p2mt )
+{
+case p2m_ram_rw:
+case p2m_iommu_map_rw:
+case p2m_map_foreign:
+case p2m_grant_map_rw:
+case p2m_mmio_direct_dev:
+case p2m_mmio_direct_nc:
+case p2m_mmio_direct_c:
+flags = IOMMUF_readable | IOMMUF_writable;
+break;
+case p2m_ram_ro:
+case p2m_iommu_map_ro:
+case p2m_grant_map_ro:
+flags = IOMMUF_readable;
+break;
+default:
+flags = 0;
+break;
+}
+
+/* TODO Do we need to handle access permissions here? */
+
+return flags;
+}
+
 #endif /* _XEN_P2M_H */
 
 /*
-- 
2.7.4




[Xen-devel] [RFC PATCH 5/9] iommu/arm: Re-define iommu_use_hap_pt(d) as iommu_hap_pt_share

2017-03-15 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Not every IOMMU integrated into ARM SoCs can share page tables
with the CPU, and as a result iommu_use_hap_pt(d) is not always true.
Reuse x86's iommu_hap_pt_share flag to indicate whether the IOMMU
page table is shared or not.

Now all IOMMU drivers on ARM are able to change this flag
according to their capabilities, as the x86 variants do.
Therefore set the iommu_hap_pt_share flag for the SMMU because it always
shares the page table with the CPU.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/drivers/passthrough/arm/smmu.c | 3 +++
 xen/include/asm-arm/iommu.h| 7 +--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 1082fcf..b2bb41f 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2833,6 +2833,9 @@ static __init int arm_smmu_dt_init(struct dt_device_node *dev,
 
platform_features &= smmu->features;
 
+   /* Always share P2M table between the CPU and the SMMU */
+   iommu_hap_pt_share = true;
+
return 0;
 }
 
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 57d9b1e..10a6f23 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -20,8 +20,11 @@ struct arch_iommu
 void *priv;
 };
 
-/* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (1)
+/*
+ * The ARM domain always has a P2M table, but not every integrated into
+ * ARM SoCs IOMMU can use it as page table.
+ */
+#define iommu_use_hap_pt(d) (iommu_hap_pt_share)
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
-- 
2.7.4




[Xen-devel] [PATCH] x86/time: Don't use virtual TSC if host and guest frequencies are equal

2017-03-15 Thread Boris Ostrovsky
Commit 82713ec8d2 ("x86: use native RDTSC(P) execution when guest and
host frequencies are the same") left out the optimization for PV guests
when host and guest run at the same frequency.

In such a case we should be able to avoid using virtual TSC regardless
of whether we are running before or after a migration (i.e. regardless
of the incarnation value).

Signed-off-by: Boris Ostrovsky 
Suggested-by: Jan Beulich 
---
 xen/arch/x86/time.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index faa638b..1a13f2f 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2051,17 +2051,11 @@ void tsc_set_info(struct domain *d,
 d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
 d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
 set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
-/*
- * In default mode use native TSC if the host has safe TSC and:
- *  HVM/PVH: host and guest frequencies are the same (either
- *   "naturally" or via TSC scaling)
- *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
- */
+
 if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
- (has_hvm_container_domain(d) ?
-  (d->arch.tsc_khz == cpu_khz ||
-   hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
-  incarnation == 0) )
+ (d->arch.tsc_khz == cpu_khz || incarnation == 0 ||
+  (has_hvm_container_domain(d) &&
+   hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
 {
 case TSC_MODE_NEVER_EMULATE:
 d->arch.vtsc = 0;
-- 
1.8.3.1
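
To make the change in the condition easier to compare, here is a
standalone sketch of the two predicates as plain boolean functions. The
function and parameter names are hypothetical; tsc_mode ==
TSC_MODE_DEFAULT is taken as a precondition, and can_scale stands in for
a non-zero hvm_get_tsc_scaling_ratio() result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition removed by the patch: PV relies on incarnation only. */
static bool native_tsc_before(bool host_tsc_safe, bool is_hvm,
                              uint32_t guest_khz, uint32_t host_khz,
                              bool can_scale, unsigned int incarnation)
{
    assert(host_khz != 0);
    return host_tsc_safe &&
           (is_hvm ? (guest_khz == host_khz || can_scale)
                   : incarnation == 0);
}

/* Condition added by the patch: equal frequencies suffice for any guest. */
static bool native_tsc_after(bool host_tsc_safe, bool is_hvm,
                             uint32_t guest_khz, uint32_t host_khz,
                             bool can_scale, unsigned int incarnation)
{
    assert(host_khz != 0);
    return host_tsc_safe &&
           (guest_khz == host_khz || incarnation == 0 ||
            (is_hvm && can_scale));
}
```

For a PV guest that has already migrated (incarnation != 0) onto a host
running at the same frequency, the first predicate is false and the
second is true, which is the case the commit message describes.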




[Xen-devel] [PATCH v4 0/7] Xen transport for 9pfs frontend driver

2017-03-15 Thread Stefano Stabellini
Hi all,

This patch series implements a new transport for 9pfs, aimed at Xen
systems.

The transport is based on a traditional pair of Xen frontend and backend
drivers. This patch series implements the frontend, which typically runs in
a regular unprivileged guest.

I also sent a series that implements the backend in userspace in QEMU,
which typically runs in Dom0 (but could also run in another guest).

The frontend complies with the Xen transport for 9pfs specification
version 1, available here:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=docs/misc/9pfs.markdown;hb=HEAD


Changes in v4:
- code style improvements
- use xenbus_read_unsigned when possible
- do not leak "versions"
- introduce BUILD_BUG_ON
- introduce rwlock to protect the xen_9pfs_devs list
- add reviewed-by

Changes in v3:
- add full copyright header to trans_xen.c
- rename ring->ring to ring->data
- handle gnttab_grant_foreign_access errors
- remove ring->bytes
- wrap long lines
- add reviewed-by

Changes in v2:
- use XEN_PAGE_SHIFT instead of PAGE_SHIFT
- remove unnecessary initializations
- fix error paths
- fix memory allocations for 64K kernels
- simplify p9_xen_create and p9_xen_close
- use virt_XXX barriers
- set status = REQ_STATUS_ERROR inside the p9_xen_response loop
- add in-code comments


Stefano Stabellini (7):
  xen: import new ring macros in ring.h
  xen: introduce the header file for the Xen 9pfs transport protocol
  xen/9pfs: introduce Xen 9pfs transport driver
  xen/9pfs: connect to the backend
  xen/9pfs: send requests to the backend
  xen/9pfs: receive responses
  xen/9pfs: build 9pfs Xen transport driver

 include/xen/interface/io/9pfs.h |  40 
 include/xen/interface/io/ring.h | 131 ++
 net/9p/Kconfig  |   8 +
 net/9p/Makefile |   4 +
 net/9p/trans_xen.c  | 513 
 5 files changed, 696 insertions(+)
 create mode 100644 include/xen/interface/io/9pfs.h
 create mode 100644 net/9p/trans_xen.c


Cheers,

Stefano



[Xen-devel] [PATCH v4 5/7] xen/9pfs: send requests to the backend

2017-03-15 Thread Stefano Stabellini
Implement struct p9_trans_module create and close functions by looking
at the available Xen 9pfs frontend-backend connections. We don't expect
many frontend-backend connections, thus walking a list is OK.

Send requests to the backend by copying each request to one of the
available rings (each frontend-backend connection comes with multiple
rings). Handle the ring and notifications following the 9pfs
specification. If there are not enough free bytes on the ring for the
request, wait on the wait_queue: the backend will send a notification
after consuming more requests.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Boris Ostrovsky 
CC: gr...@kaod.org
CC: jgr...@suse.com
CC: Eric Van Hensbergen 
CC: Ron Minnich 
CC: Latchesar Ionkov 
CC: v9fs-develo...@lists.sourceforge.net
---
 net/9p/trans_xen.c | 89 --
 1 file changed, 87 insertions(+), 2 deletions(-)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index ada2b0c..2b18da0 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -72,22 +72,107 @@ struct xen_9pfs_front_priv {
 static LIST_HEAD(xen_9pfs_devs);
 static DEFINE_RWLOCK(xen_9pfs_lock); 
 
+/* We don't currently allow canceling of requests */
 static int p9_xen_cancel(struct p9_client *client, struct p9_req_t *req)
 {
-   return 0;
+   return 1;
 }
 
 static int p9_xen_create(struct p9_client *client, const char *addr, char *args)
 {
-   return 0;
+   struct xen_9pfs_front_priv *priv;
+
+   read_lock(&xen_9pfs_lock);
+   list_for_each_entry(priv, &xen_9pfs_devs, list) {
+   if (!strcmp(priv->tag, addr)) {
+   priv->client = client;
+   read_unlock(&xen_9pfs_lock);
+   return 0;
+   }
+   }
+   read_unlock(&xen_9pfs_lock);
+   return -EINVAL;
 }
 
 static void p9_xen_close(struct p9_client *client)
 {
+   struct xen_9pfs_front_priv *priv;
+
+   read_lock(&xen_9pfs_lock);
+   list_for_each_entry(priv, &xen_9pfs_devs, list) {
+   if (priv->client == client) {
+   priv->client = NULL;
+   read_unlock(&xen_9pfs_lock);
+   return;
+   }
+   }
+   read_unlock(&xen_9pfs_lock);
+   return;
+}
+
+static int p9_xen_write_todo(struct xen_9pfs_dataring *ring, RING_IDX size)
+{
+   RING_IDX cons, prod;
+
+   cons = ring->intf->out_cons;
+   prod = ring->intf->out_prod;
+   virt_mb();
+
+   if (XEN_9PFS_RING_SIZE - xen_9pfs_queued(prod, cons, XEN_9PFS_RING_SIZE) >= size)
+   return 1;
+   else
+   return 0;
 }
 
 static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 {
+   struct xen_9pfs_front_priv *priv = NULL;
+   RING_IDX cons, prod, masked_cons, masked_prod;
+   unsigned long flags;
+   uint32_t size = p9_req->tc->size;
+   struct xen_9pfs_dataring *ring;
+   int num;
+
+   read_lock(&xen_9pfs_lock);
+   list_for_each_entry(priv, &xen_9pfs_devs, list) {
+   if (priv->client == client)
+   break;
+   }
+   read_unlock(&xen_9pfs_lock);
+   if (priv == NULL || priv->client != client)
+   return -EINVAL;
+
+   num = p9_req->tc->tag % priv->num_rings;
+   ring = &priv->rings[num];
+
+again:
+   while (wait_event_interruptible(ring->wq,
+   p9_xen_write_todo(ring, size) > 0) != 0);
+
+   spin_lock_irqsave(&ring->lock, flags);
+   cons = ring->intf->out_cons;
+   prod = ring->intf->out_prod;
+   virt_mb();
+
+   if (XEN_9PFS_RING_SIZE - xen_9pfs_queued(prod, cons, XEN_9PFS_RING_SIZE) < size) {
+   spin_unlock_irqrestore(&ring->lock, flags);
+   goto again;
+   }
+
+   masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
+   masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
+
+   xen_9pfs_write_packet(ring->data.out,
+   &masked_prod, masked_cons,
+   XEN_9PFS_RING_SIZE, p9_req->tc->sdata, size);
+
+   p9_req->status = REQ_STATUS_SENT;
+   virt_wmb(); /* write ring before updating pointer */
+   prod += size;
+   ring->intf->out_prod = prod;
+   spin_unlock_irqrestore(&ring->lock, flags);
+   notify_remote_via_irq(ring->irq);
+
return 0;
 }
 
-- 
1.9.1
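
The free-space check in p9_xen_write_todo() and p9_xen_request() can be
illustrated with a small standalone sketch. It assumes free-running
32-bit producer/consumer indices (the patch advances prod with
"prod += size" and masks only when addressing the buffer), so the number
of queued bytes is simply prod - cons in unsigned arithmetic. The names
ring_queued, ring_has_room and pick_ring are hypothetical; the real
xen_9pfs_queued comes from the ring.h macros and may be written
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ring_idx_t;

/* Bytes written by the producer and not yet consumed (wrap-safe). */
static ring_idx_t ring_queued(ring_idx_t prod, ring_idx_t cons)
{
    return prod - cons;     /* unsigned subtraction handles wrap-around */
}

/* True if a request of 'size' bytes fits in the remaining free space. */
static bool ring_has_room(ring_idx_t prod, ring_idx_t cons,
                          ring_idx_t ring_size, ring_idx_t size)
{
    /* ring_size must be a non-zero power of two */
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return ring_size - ring_queued(prod, cons) >= size;
}

/* Spread requests over the rings by tag, as the patch does. */
static int pick_ring(uint16_t tag, int num_rings)
{
    assert(num_rings > 0);
    return tag % num_rings;
}
```

The patch performs this check twice: once without the lock as the
wait_event_interruptible() condition, and again under the ring spinlock,
jumping back to the wait if another sender consumed the space in between.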




[Xen-devel] [PATCH v4 1/7] xen: import new ring macros in ring.h

2017-03-15 Thread Stefano Stabellini
Sync the ring.h file with upstream Xen, to introduce the new ring macros.
They will be used by the Xen transport for 9pfs.

Signed-off-by: Stefano Stabellini 
CC: konrad.w...@oracle.com
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
CC: gr...@kaod.org

---
NB: The new macros have not been committed to Xen yet. Do not apply this
patch until they have been.
---
---
 include/xen/interface/io/ring.h | 131 
 1 file changed, 131 insertions(+)

diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
index 21f4fbd..500e68d 100644
--- a/include/xen/interface/io/ring.h
+++ b/include/xen/interface/io/ring.h
@@ -283,4 +283,135 @@ struct __name##_back_ring { \
 (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
 } while (0)
 
+
+/*
+ * DEFINE_XEN_FLEX_RING_AND_INTF defines two monodirectional rings and
+ * functions to check if there is data on the ring, and to read and
+ * write to them.
+ *
+ * DEFINE_XEN_FLEX_RING is similar to DEFINE_XEN_FLEX_RING_AND_INTF, but
+ * does not define the indexes page. As different protocols can have
+ * extensions to the basic format, this macro allow them to define their
+ * own struct.
+ *
+ * XEN_FLEX_RING_SIZE
+ *   Convenience macro to calculate the size of one of the two rings
+ *   from the overall order.
+ *
+ * $NAME_mask
+ *   Function to apply the size mask to an index, to reduce the index
+ *   within the range [0-size].
+ *
+ * $NAME_read_packet
+ *   Function to read data from the ring. The amount of data to read is
+ *   specified by the "size" argument.
+ *
+ * $NAME_write_packet
+ *   Function to write data to the ring. The amount of data to write is
+ *   specified by the "size" argument.
+ *
+ * $NAME_get_ring_ptr
+ *   Convenience function that returns a pointer to read/write to the
+ *   ring at the right location.
+ *
+ * $NAME_data_intf
+ *   Indexes page, shared between frontend and backend. It also
+ *   contains the array of grant refs.
+ *
+ * $NAME_queued
+ *   Function to calculate how many bytes are currently on the ring,
+ *   ready to be read. It can also be used to calculate how much free
+ *   space is currently on the ring (ring_size - $NAME_queued()).
+ */
+#define XEN_FLEX_RING_SIZE(order) \
+(1UL << (order + XEN_PAGE_SHIFT - 1))
+
+#define DEFINE_XEN_FLEX_RING_AND_INTF(name)   \
+struct name##_data_intf { \
+RING_IDX in_cons, in_prod;\
+  \
+uint8_t pad1[56]; \
+  \
+RING_IDX out_cons, out_prod;  \
+  \
+uint8_t pad2[56]; \
+  \
+RING_IDX ring_order;  \
+grant_ref_t ref[];\
+};\
+DEFINE_XEN_FLEX_RING(name);
+
+#define DEFINE_XEN_FLEX_RING(name)\
+static inline RING_IDX name##_mask(RING_IDX idx, RING_IDX ring_size)  \
+{ \
+return (idx & (ring_size - 1));   \
+} \
+  \
+static inline RING_IDX name##_mask_order(RING_IDX idx, RING_IDX ring_order)   \
+{ \
+return (idx & (XEN_FLEX_RING_SIZE(ring_order) - 1));  \
+} \
+  \
+static inline unsigned char* name##_get_ring_ptr(unsigned char *buf,  \
+ RING_IDX idx,\
+ RING_IDX ring_order) \
+{ \
+return buf + name##_mask_order(idx, ring_order);  \
+} \
+  \
+static inline void 
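
The patch text is cut off at this point in the archive. As a small
standalone illustration of the two pieces that are visible, the sketch
below reproduces the ring-size computation and the index mask. The
SKETCH_* and sketch_mask names are hypothetical; SKETCH_PAGE_SHIFT is set
to 12 on the assumption of 4 KiB Xen pages.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12    /* assumes 4 KiB Xen pages */

/*
 * The overall order covers both directions, so each of the two rings
 * gets half of it: 2^(order + page_shift - 1) bytes.
 */
#define SKETCH_FLEX_RING_SIZE(order) \
    (1UL << ((order) + SKETCH_PAGE_SHIFT - 1))

/* Reduce a free-running index to an offset inside the ring buffer. */
static inline uint32_t sketch_mask(uint32_t idx, uint32_t ring_size)
{
    /* only valid for power-of-two ring sizes */
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return idx & (ring_size - 1);
}
```

With a power-of-two ring size the mask yields offsets from 0 to
ring_size - 1, so an index equal to ring_size wraps back to offset 0.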

[Xen-devel] [PATCH v4 3/7] xen/9pfs: introduce Xen 9pfs transport driver

2017-03-15 Thread Stefano Stabellini
Introduce the Xen 9pfs transport driver: add struct xenbus_driver to
register as a xenbus driver and add struct p9_trans_module to register
as v9fs driver.

All functions are empty stubs for now.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Boris Ostrovsky 
CC: gr...@kaod.org
CC: jgr...@suse.com
CC: Eric Van Hensbergen 
CC: Ron Minnich 
CC: Latchesar Ionkov 
CC: v9fs-develo...@lists.sourceforge.net
---
 net/9p/trans_xen.c | 125 +
 1 file changed, 125 insertions(+)
 create mode 100644 net/9p/trans_xen.c

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
new file mode 100644
index 000..b1333b2
--- /dev/null
+++ b/net/9p/trans_xen.c
@@ -0,0 +1,125 @@
+/*
+ * linux/fs/9p/trans_xen
+ *
+ * Xen transport layer.
+ *
+ * Copyright (C) 2017 by Stefano Stabellini 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+static int p9_xen_cancel(struct p9_client *client, struct p9_req_t *req)
+{
+   return 0;
+}
+
+static int p9_xen_create(struct p9_client *client, const char *addr, char *args)
+{
+   return 0;
+}
+
+static void p9_xen_close(struct p9_client *client)
+{
+}
+
+static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
+{
+   return 0;
+}
+
+static struct p9_trans_module p9_xen_trans = {
+   .name = "xen",
+   .maxsize = 1 << (XEN_9PFS_RING_ORDER + XEN_PAGE_SHIFT),
+   .def = 1,
+   .create = p9_xen_create,
+   .close = p9_xen_close,
+   .request = p9_xen_request,
+   .cancel = p9_xen_cancel,
+   .owner = THIS_MODULE,
+};
+
+static const struct xenbus_device_id xen_9pfs_front_ids[] = {
+   { "9pfs" },
+   { "" }
+};
+
+static int xen_9pfs_front_remove(struct xenbus_device *dev)
+{
+   return 0;
+}
+
+static int xen_9pfs_front_probe(struct xenbus_device *dev,
+   const struct xenbus_device_id *id)
+{
+   return 0;
+}
+
+static int xen_9pfs_front_resume(struct xenbus_device *dev)
+{
+   return 0;
+}
+
+static void xen_9pfs_front_changed(struct xenbus_device *dev,
+   enum xenbus_state backend_state)
+{
+}
+
+static struct xenbus_driver xen_9pfs_front_driver = {
+   .ids = xen_9pfs_front_ids,
+   .probe = xen_9pfs_front_probe,
+   .remove = xen_9pfs_front_remove,
+   .resume = xen_9pfs_front_resume,
+   .otherend_changed = xen_9pfs_front_changed,
+};
+
+int p9_trans_xen_init(void)
+{
+   if (!xen_domain())
+   return -ENODEV;
+
+   pr_info("Initialising Xen transport for 9pfs\n");
+
+   v9fs_register_trans(&p9_xen_trans);
+   return xenbus_register_frontend(&xen_9pfs_front_driver);
+}
+module_init(p9_trans_xen_init);
+
+void p9_trans_xen_exit(void)
+{
+   v9fs_unregister_trans(&p9_xen_trans);
+   return xenbus_unregister_driver(&xen_9pfs_front_driver);
+}
+module_exit(p9_trans_xen_exit);
-- 
1.9.1




[Xen-devel] [PATCH v4 4/7] xen/9pfs: connect to the backend

2017-03-15 Thread Stefano Stabellini
Implement functions to handle the xenbus handshake. Upon connection,
allocate the rings according to the protocol specification.

Initialize a work_struct and a wait_queue. The work_struct will be used
to schedule work upon receiving an event channel notification from the
backend. The wait_queue will be used to wait when the ring is full and
we need to send a new request.

Signed-off-by: Stefano Stabellini 
CC: gr...@kaod.org
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
CC: Eric Van Hensbergen 
CC: Ron Minnich 
CC: Latchesar Ionkov 
CC: v9fs-develo...@lists.sourceforge.net
---
 net/9p/trans_xen.c | 248 +
 1 file changed, 248 insertions(+)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index b1333b2..ada2b0c 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -37,10 +37,41 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
 
+#define XEN_9PFS_NUM_RINGS 2
+
+/* One per ring, more than one per 9pfs share */
+struct xen_9pfs_dataring {
+   struct xen_9pfs_front_priv *priv;
+
+   struct xen_9pfs_data_intf *intf;
+   grant_ref_t ref;
+   int evtchn;
+   int irq;
+   spinlock_t lock;
+
+   struct xen_9pfs_data data;
+   wait_queue_head_t wq;
+   struct work_struct work;
+};
+
+/* One per 9pfs share */
+struct xen_9pfs_front_priv {
+   struct list_head list;
+   struct xenbus_device *dev;
+   char *tag;
+   struct p9_client *client;
+
+   int num_rings;
+   struct xen_9pfs_dataring *rings;
+};
+static LIST_HEAD(xen_9pfs_devs);
+static DEFINE_RWLOCK(xen_9pfs_lock); 
+
 static int p9_xen_cancel(struct p9_client *client, struct p9_req_t *req)
 {
return 0;
@@ -60,6 +91,21 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
return 0;
 }
 
+static void p9_xen_response(struct work_struct *work)
+{
+}
+
+static irqreturn_t xen_9pfs_front_event_handler(int irq, void *r)
+{
+   struct xen_9pfs_dataring *ring = r;
+   BUG_ON(!ring || !ring->priv->client);
+
+   wake_up_interruptible(&ring->wq);
+   schedule_work(&ring->work);
+
+   return IRQ_HANDLED;
+}
+
 static struct p9_trans_module p9_xen_trans = {
.name = "xen",
.maxsize = 1 << (XEN_9PFS_RING_ORDER + XEN_PAGE_SHIFT),
@@ -76,25 +122,227 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
{ "" }
 };
 
+static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
+{
+	int i, j;
+
+	write_lock(&xen_9pfs_lock);
+	list_del(&priv->list);
+	write_unlock(&xen_9pfs_lock);
+
+	for (i = 0; i < priv->num_rings; i++) {
+		if (priv->rings[i].intf == NULL)
+			break;
+		if (priv->rings[i].irq > 0)
+			unbind_from_irqhandler(priv->rings[i].irq, priv->dev);
+		if (priv->rings[i].data.in != NULL) {
+			for (j = 0; j < (1 << XEN_9PFS_RING_ORDER); j++)
+				gnttab_end_foreign_access(priv->rings[i].intf->ref[j], 0, 0);
+			free_pages((unsigned long)priv->rings[i].data.in,
+				   XEN_9PFS_RING_ORDER - (PAGE_SHIFT - XEN_PAGE_SHIFT));
+		}
+		gnttab_end_foreign_access(priv->rings[i].ref, 0, 0);
+		free_page((unsigned long)priv->rings[i].intf);
+	}
+	kfree(priv->rings);
+	kfree(priv);
+}
+
 static int xen_9pfs_front_remove(struct xenbus_device *dev)
 {
+   struct xen_9pfs_front_priv *priv = dev_get_drvdata(&dev->dev);
+
+   dev_set_drvdata(&dev->dev, NULL);
+   xen_9pfs_front_free(priv);
+   return 0;
+}
+
+static int xen_9pfs_front_alloc_dataring(struct xenbus_device *dev,
+   struct xen_9pfs_dataring *ring)
+{
+	int i = 0;
+	int ret = -ENOMEM;
+	void *bytes = NULL;
+
+	init_waitqueue_head(&ring->wq);
+	spin_lock_init(&ring->lock);
+	INIT_WORK(&ring->work, p9_xen_response);
+
+	ring->intf = (struct xen_9pfs_data_intf *) get_zeroed_page(GFP_KERNEL);
+	if (!ring->intf)
+		return ret;
+	ret = ring->ref = gnttab_grant_foreign_access(dev->otherend_id,
+			virt_to_gfn(ring->intf), 0);
+	if (ret < 0)
+		goto out;
+	bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+			XEN_9PFS_RING_ORDER - (PAGE_SHIFT - XEN_PAGE_SHIFT));
+	if (bytes == NULL)
+		goto out;
+	for (; i < (1 << XEN_9PFS_RING_ORDER); i++) {
+		ret = ring->intf->ref[i] = gnttab_grant_foreign_access(
+				dev->otherend_id, virt_to_gfn(bytes) + i, 0);
+		if (ret < 0)
+			goto out;
+	}
+	ring->data.in = bytes;
+	ring->data.out = bytes + XEN_9PFS_RING_SIZE;
+
+	ret = xenbus_alloc_evtchn(dev, &ring->evtchn);

[Xen-devel] [PATCH v4 7/7] xen/9pfs: build 9pfs Xen transport driver

2017-03-15 Thread Stefano Stabellini
This patch adds a Kconfig option and Makefile support for building the
9pfs Xen driver.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Juergen Gross 
CC: gr...@kaod.org
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
CC: Eric Van Hensbergen 
CC: Ron Minnich 
CC: Latchesar Ionkov 
CC: v9fs-develo...@lists.sourceforge.net
---
 net/9p/Kconfig  | 8 
 net/9p/Makefile | 4 
 2 files changed, 12 insertions(+)

diff --git a/net/9p/Kconfig b/net/9p/Kconfig
index a75174a..5c5649b 100644
--- a/net/9p/Kconfig
+++ b/net/9p/Kconfig
@@ -22,6 +22,14 @@ config NET_9P_VIRTIO
  This builds support for a transports between
  guest partitions and a host partition.
 
+config NET_9P_XEN
+   depends on XEN
+   tristate "9P Xen Transport"
+   help
+ This builds support for a transport between
+ two Xen domains.
+
+
 config NET_9P_RDMA
depends on INET && INFINIBAND && INFINIBAND_ADDR_TRANS
tristate "9P RDMA Transport (Experimental)"
diff --git a/net/9p/Makefile b/net/9p/Makefile
index a0874cc..697ea7c 100644
--- a/net/9p/Makefile
+++ b/net/9p/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_NET_9P) := 9pnet.o
+obj-$(CONFIG_NET_9P_XEN) += 9pnet_xen.o
 obj-$(CONFIG_NET_9P_VIRTIO) += 9pnet_virtio.o
 obj-$(CONFIG_NET_9P_RDMA) += 9pnet_rdma.o
 
@@ -14,5 +15,8 @@ obj-$(CONFIG_NET_9P_RDMA) += 9pnet_rdma.o
 9pnet_virtio-objs := \
trans_virtio.o \
 
+9pnet_xen-objs := \
+   trans_xen.o \
+
 9pnet_rdma-objs := \
trans_rdma.o \
-- 
1.9.1




[Xen-devel] [PATCH v4 6/7] xen/9pfs: receive responses

2017-03-15 Thread Stefano Stabellini
Upon receiving a notification from the backend, schedule the
p9_xen_response work_struct. p9_xen_response checks whether any responses
are available; if so, it reads them one by one, calling p9_client_cb to
pass them up to the 9p layer (p9_client_cb completes the request). Handle
the ring following the Xen 9pfs specification.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Boris Ostrovsky 
CC: gr...@kaod.org
CC: jgr...@suse.com
CC: Eric Van Hensbergen 
CC: Ron Minnich 
CC: Latchesar Ionkov 
CC: v9fs-develo...@lists.sourceforge.net
---
 net/9p/trans_xen.c | 55 ++
 1 file changed, 55 insertions(+)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 2b18da0..2db80c7 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -178,6 +178,61 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 
 static void p9_xen_response(struct work_struct *work)
 {
+	struct xen_9pfs_front_priv *priv;
+	struct xen_9pfs_dataring *ring;
+	RING_IDX cons, prod, masked_cons, masked_prod;
+	struct xen_9pfs_header h;
+	struct p9_req_t *req;
+	int status;
+
+	ring = container_of(work, struct xen_9pfs_dataring, work);
+	priv = ring->priv;
+
+	while (1) {
+		cons = ring->intf->in_cons;
+		prod = ring->intf->in_prod;
+		virt_rmb();
+
+		if (xen_9pfs_queued(prod, cons, XEN_9PFS_RING_SIZE) < sizeof(h)) {
+			notify_remote_via_irq(ring->irq);
+			return;
+		}
+
+		masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
+		masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
+
+		/* First, read just the header */
+		xen_9pfs_read_packet(ring->data.in,
+				     masked_prod, &masked_cons,
+				     XEN_9PFS_RING_SIZE, &h, sizeof(h));
+
+		req = p9_tag_lookup(priv->client, h.tag);
+		if (!req || req->status != REQ_STATUS_SENT) {
+			dev_warn(&priv->dev->dev, "Wrong req tag=%x\n", h.tag);
+			cons += h.size;
+			virt_mb();
+			ring->intf->in_cons = cons;
+			continue;
+		}
+
+		memcpy(req->rc, &h, sizeof(h));
+		req->rc->offset = 0;
+
+		masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
+		/* Then, read the whole packet (including the header) */
+		xen_9pfs_read_packet(ring->data.in,
+				     masked_prod, &masked_cons,
+				     XEN_9PFS_RING_SIZE, req->rc->sdata, h.size);
+
+		virt_mb();
+		cons += h.size;
+		ring->intf->in_cons = cons;
+
+		status = (req->status != REQ_STATUS_ERROR) ?
+			REQ_STATUS_RCVD : REQ_STATUS_ERROR;
+
+		p9_client_cb(priv->client, req, status);
+	}
 }
 
 static irqreturn_t xen_9pfs_front_event_handler(int irq, void *r)
-- 
1.9.1
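
For readers following the consumer loop in p9_xen_response, here is a
minimal userspace sketch of the same idea: free-running producer/consumer
indexes, a mask to wrap them into a power-of-two ring, and a two-step read
(header first, then the whole packet). All demo_* names and the 16-byte
ring size are made up for illustration; they are not the xen_9pfs_*
helpers generated by ring.h. The sketch is single-threaded, so it leaves
out the virt_rmb()/virt_mb() barriers the patch needs, it assumes a
little-endian host, and it adds a sanity check on the size field that the
patch itself does not have.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RING_SIZE 16u /* hypothetical demo size; must be a power of two */

/* Same field layout as struct xen_9pfs_header in the series. */
struct demo_header {
    uint32_t size;
    uint8_t id;
    uint16_t tag;
} __attribute__((packed));

/* Wrap a free-running index into the ring. */
static uint32_t demo_mask(uint32_t idx)
{
    return idx & (DEMO_RING_SIZE - 1);
}

/* Bytes produced but not yet consumed (unsigned arithmetic survives overflow). */
static uint32_t demo_queued(uint32_t prod, uint32_t cons)
{
    return prod - cons;
}

/* Copy len bytes starting at cons, splitting the copy at the wrap point. */
static void demo_read(const uint8_t *ring, uint32_t cons, void *dst, size_t len)
{
    uint32_t off = demo_mask(cons);
    size_t first = DEMO_RING_SIZE - off;

    if (first > len)
        first = len;
    memcpy(dst, ring + off, first);
    memcpy((uint8_t *)dst + first, ring, len - first);
}

/*
 * Returns 1 and advances *cons if a full packet was copied to out,
 * 0 if not even a header is queued yet, -1 if the size field is bogus.
 */
static int demo_consume_one(const uint8_t *ring, uint32_t prod, uint32_t *cons,
                            uint8_t *out, size_t out_len, struct demo_header *h)
{
    if (demo_queued(prod, *cons) < sizeof(*h))
        return 0;

    /* First, read just the header. */
    demo_read(ring, *cons, h, sizeof(*h));
    if (h->size < sizeof(*h) || h->size > out_len ||
        h->size > demo_queued(prod, *cons))
        return -1;

    /* Then, read the whole packet (including the header). */
    demo_read(ring, *cons, out, h->size);
    *cons += h->size;
    return 1;
}
```

With a 9-byte packet written at index 12 of the 16-byte ring, the header
read straddles the wrap point and demo_consume_one() still returns the
packet in one piece, advancing the consumer index to 21.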




[Xen-devel] [PATCH v4 2/7] xen: introduce the header file for the Xen 9pfs transport protocol

2017-03-15 Thread Stefano Stabellini
It uses the new ring.h macros to declare rings and interfaces.

Signed-off-by: Stefano Stabellini 
CC: konrad.w...@oracle.com
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
CC: gr...@kaod.org
---
 include/xen/interface/io/9pfs.h | 40 
 1 file changed, 40 insertions(+)
 create mode 100644 include/xen/interface/io/9pfs.h

diff --git a/include/xen/interface/io/9pfs.h b/include/xen/interface/io/9pfs.h
new file mode 100644
index 000..276eda4
--- /dev/null
+++ b/include/xen/interface/io/9pfs.h
@@ -0,0 +1,40 @@
+/*
+ * 9pfs.h -- Xen 9PFS transport
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2017 Stefano Stabellini 
+ */
+
+#ifndef __XEN_PUBLIC_IO_9PFS_H__
+#define __XEN_PUBLIC_IO_9PFS_H__
+
+#include "xen/interface/io/ring.h"
+
+struct xen_9pfs_header {
+   uint32_t size;
+   uint8_t id;
+   uint16_t tag;
+} __attribute__((packed));
+
+#define XEN_9PFS_RING_ORDER 6
+#define XEN_9PFS_RING_SIZE  XEN_FLEX_RING_SIZE(XEN_9PFS_RING_ORDER)
+DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
+
+#endif
-- 
1.9.1
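
To make the constants in this header a bit more concrete, here is a small
sketch of the size arithmetic that the connect patch in this series builds
on top of XEN_9PFS_RING_ORDER. The DEMO_* names are hypothetical. Two
things are assumptions rather than something this mail shows: that Xen
pages are 4 KiB (XEN_PAGE_SHIFT == 12), and that XEN_FLEX_RING_SIZE(order)
gives each direction half of the granted area, which is at least
consistent with "data.out = bytes + XEN_9PFS_RING_SIZE" in the connect
patch.

```c
#define DEMO_XEN_PAGE_SHIFT 12u /* assumption: 4 KiB Xen pages */
#define DEMO_RING_ORDER     6u  /* XEN_9PFS_RING_ORDER from the patch */

/* Assumed shape of XEN_FLEX_RING_SIZE: half of the data area per direction. */
#define DEMO_FLEX_RING_SIZE(order) (1UL << ((order) + DEMO_XEN_PAGE_SHIFT - 1))

/* Number of grant references, one per Xen page: 1 << XEN_9PFS_RING_ORDER. */
static unsigned long demo_grant_count(unsigned int ring_order)
{
    return 1UL << ring_order;
}

/* Total bytes in the data area: 1 << (ring order + Xen page shift). */
static unsigned long demo_total_bytes(unsigned int ring_order)
{
    return 1UL << (ring_order + DEMO_XEN_PAGE_SHIFT);
}

/*
 * Order passed to the page allocator when kernel pages are larger than
 * Xen pages: XEN_9PFS_RING_ORDER - (PAGE_SHIFT - XEN_PAGE_SHIFT).
 */
static unsigned int demo_alloc_order(unsigned int ring_order,
                                     unsigned int kernel_page_shift)
{
    return ring_order - (kernel_page_shift - DEMO_XEN_PAGE_SHIFT);
}
```

Under those assumptions, order 6 works out to 64 grant references, a
256 KiB data area, and 128 KiB per direction. A kernel with 64 KiB pages
would ask the allocator for order 2 instead of order 6.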




Re: [Xen-devel] [PATCH v2 2/9] xen: import ring.h from xen

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Wed, 15 Mar 2017 11:36:12 -0700 (PDT)
> Stefano Stabellini  wrote:
> 
> > On Wed, 15 Mar 2017, Greg Kurz wrote:
> > > On Mon, 13 Mar 2017 16:55:53 -0700
> > > Stefano Stabellini  wrote:
> > >   
> > > > Do not use the ring.h header installed on the system. Instead, import
> > > > the header into the QEMU codebase. This avoids problems when QEMU is
> > > > built against a Xen version too old to provide all the ring macros.
> > > >   
> > > 
> > > What kind of problems ?  
> > 
> > The FLEX macros are only available in Xen 4.9+ (still unreleased).
> > However, aside from these macros, the Xen 9pfs frontends and backends
> > could work fine on any Xen releases. In fact, the Xen public/io headers
> > are only provided as reference.
> > 
> 
> Ok.
> 
> > 
> > > > Signed-off-by: Stefano Stabellini 
> > > > CC: anthony.per...@citrix.com
> > > > CC: jgr...@suse.com
> > > > ---
> > > > NB: The new macros have not been committed to Xen yet. Do not apply this
> > > > patch until they do.  
> > > 
> > > Why ? Does this break compat with older Xen ?  
> > 
> > No, it does not break compatibility. But I think it is a good idea to
> > commit the header to QEMU only after the corresponding Xen header has
> > been accepted. I don't want the two to diverge.
> > 
> 
> Fair enough, but this will put the entire patchset on hold then. When is
> Xen 4.9 supposed to be released?

In a few months, but I don't think we need to wait until the release,
just for the reviewed-by on the new macros, that should come soon.


> > > > ---
> > > > ---
> > > >  hw/block/xen_blkif.h |   2 +-
> > > >  hw/usb/xen-usb.c |   2 +-
> > > >  include/hw/xen/io/ring.h | 455 +++
> > > >  3 files changed, 457 insertions(+), 2 deletions(-)
> > > >  create mode 100644 include/hw/xen/io/ring.h
> > > > 
> > > > diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h
> > > > index 3300b6f..3e6e1ea 100644
> > > > --- a/hw/block/xen_blkif.h
> > > > +++ b/hw/block/xen_blkif.h
> > > > @@ -1,7 +1,7 @@
> > > >  #ifndef XEN_BLKIF_H
> > > >  #define XEN_BLKIF_H
> > > >  
> > > > -#include 
> > > > +#include "hw/xen/io/ring.h"
> > > >  #include 
> > > >  #include 
> > > >  
> > > > diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
> > > > index 8e676e6..370b3d9 100644
> > > > --- a/hw/usb/xen-usb.c
> > > > +++ b/hw/usb/xen-usb.c
> > > > @@ -33,7 +33,7 @@
> > > >  #include "qapi/qmp/qint.h"
> > > >  #include "qapi/qmp/qstring.h"
> > > >  
> > > > -#include 
> > > > +#include "hw/xen/io/ring.h"
> > > >  #include 
> > > >  
> > > >  /*
> > > > diff --git a/include/hw/xen/io/ring.h b/include/hw/xen/io/ring.h
> > > > new file mode 100644
> > > > index 000..cf01fc3
> > > > --- /dev/null
> > > > +++ b/include/hw/xen/io/ring.h
> > > > @@ -0,0 +1,455 @@
> > > > +/**
> > > > + * ring.h
> > > > + * 
> > > > + * Shared producer-consumer ring macros.
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > > > + * of this software and associated documentation files (the "Software"), to
> > > > + * deal in the Software without restriction, including without limitation the
> > > > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > > > + * sell copies of the Software, and to permit persons to whom the Software is
> > > > + * furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice shall be included in
> > > > + * all copies or substantial portions of the Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > > > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > > > + * DEALINGS IN THE SOFTWARE.
> > > > + *
> > > > + * Tim Deegan and Andrew Warfield November 2004.
> > > > + */
> > > > +
> > > > +#ifndef __XEN_PUBLIC_IO_RING_H__
> > > > +#define __XEN_PUBLIC_IO_RING_H__
> > > > +
> > > > +#if __XEN_INTERFACE_VERSION__ < 0x00030208
> > > > +#define xen_mb()  mb()
> > > > +#define xen_rmb() rmb()
> > > > +#define xen_wmb() wmb()
> > > > +#endif
> > > > +
> > > > +typedef unsigned int RING_IDX;
> > > > +
> > > > +/* Round a 32-bit unsigned constant down to the nearest power of two. */
> > > > +#define __RD2(_x)  (((_x) & 0x0002) ? 0x2  : 

Re: [Xen-devel] [PATCH v2 3/9] xen: introduce the header file for the Xen 9pfs transport protocol

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:54 -0700
> Stefano Stabellini  wrote:
> 
> > It uses the new ring.h macros to declare rings and interfaces.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > ---
> >  hw/9pfs/xen_9pfs.h | 20 
> >  1 file changed, 20 insertions(+)
> >  create mode 100644 hw/9pfs/xen_9pfs.h
> > 
> 
> This header file has only one user: please move its content to
> hw/9pfs/xen-9p-backend.c (except the 9P header structure, see
> below).

OK, I can do that. There is going to be a very similar header in the Xen
tree under xen/include/public/io
(http://marc.info/?l=xen-devel&m=148952997709142), but from QEMU's point
of view, it makes sense to fold it into xen-9p-backend.c.


> > diff --git a/hw/9pfs/xen_9pfs.h b/hw/9pfs/xen_9pfs.h
> > new file mode 100644
> > index 000..c4e8901
> > --- /dev/null
> > +++ b/hw/9pfs/xen_9pfs.h
> > @@ -0,0 +1,20 @@
> > +#ifndef XEN_9PFS_H
> > +#define XEN_9PFS_H
> > +
> > +#include "hw/xen/io/ring.h"
> > +#include 
> > +
> > +struct xen_9pfs_header {
> > +   uint32_t size;
> > +   uint8_t id;
> > +   uint16_t tag;
> > +
> > +   /* uint8_t sdata[]; */
> 
> This doesn't seem useful.

I'll remove it.


> > +} __attribute__((packed));
> > +
> 
> A few remarks:
> - this is a 9P message header actually, which is also used with virtio,
> - QEMU coding style requires a typedef in CamelCase,
> - the 9P protocol explicitly uses little-endian ordering. Since we
>   don't have endian-specific types, it makes sense to indicate that
>   when naming the fields.
> - we have a QEMU_PACKED macro which seems to be used a lot more than
>   the gcc syntax
> 
> Please define a new type in hw/9pfs/9p.h and use it in both transports.
> Something like:
> 
> typedef struct {
> uint32_t size_le;
> uint8_t id;
> uint16_t tag_le;
> } QEMU_PACKED P9MsgHeader;

OK


> > +#define PAGE_SHIFT XC_PAGE_SHIFT
> 
> I don't see any user for this in hw/9pfs/xen-9p-backend.c

PAGE_SHIFT is used by the macros below, but the original Xen headers
don't have the PAGE_SHIFT definition, so, for the sake of keeping the
two in sync, I didn't add it there.


> > +#define XEN_9PFS_RING_ORDER 6
> > +#define XEN_9PFS_RING_SIZE  XEN_FLEX_RING_SIZE(XEN_9PFS_RING_ORDER)
> > +DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
> > +
> > +#endif
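
As an illustration of the two layout points raised above (the header is
packed, and 9P fields are little-endian on the wire), here is a small
standalone sketch. DemoP9MsgHeader and the demo_* helpers are hypothetical
names; this is not the type that ended up in hw/9pfs/9p.h, and it uses the
plain gcc attribute because QEMU_PACKED is only available inside the QEMU
tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the suggested P9MsgHeader. */
typedef struct {
    uint32_t size_le;
    uint8_t id;
    uint16_t tag_le;
} __attribute__((packed)) DemoP9MsgHeader;

/* Packed means no padding between id and tag: 4 + 1 + 2 bytes. */
_Static_assert(sizeof(DemoP9MsgHeader) == 7, "header must be 7 bytes");
_Static_assert(offsetof(DemoP9MsgHeader, id) == 4, "id follows size");
_Static_assert(offsetof(DemoP9MsgHeader, tag_le) == 5, "tag follows id");

/* Byte-wise little-endian loads, independent of host endianness. */
static uint32_t demo_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint16_t demo_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the 7 header bytes of a 9P message into host-order values. */
static void demo_parse_header(const uint8_t *buf, uint32_t *size,
                              uint8_t *id, uint16_t *tag)
{
    *size = demo_le32(buf + offsetof(DemoP9MsgHeader, size_le));
    *id = buf[offsetof(DemoP9MsgHeader, id)];
    *tag = demo_le16(buf + offsetof(DemoP9MsgHeader, tag_le));
}
```

Decoding byte by byte like this sidesteps both the unaligned 16-bit tag
and any host-endianness question, at the cost of a few extra shifts.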



Re: [Xen-devel] [PATCH v2 2/9] xen: import ring.h from xen

2017-03-15 Thread Greg Kurz
On Wed, 15 Mar 2017 11:36:12 -0700 (PDT)
Stefano Stabellini  wrote:

> On Wed, 15 Mar 2017, Greg Kurz wrote:
> > On Mon, 13 Mar 2017 16:55:53 -0700
> > Stefano Stabellini  wrote:
> >   
> > > Do not use the ring.h header installed on the system. Instead, import
> > > the header into the QEMU codebase. This avoids problems when QEMU is
> > > built against a Xen version too old to provide all the ring macros.
> > >   
> > 
> > What kind of problems ?  
> 
> The FLEX macros are only available in Xen 4.9+ (still unreleased).
> However, aside from these macros, the Xen 9pfs frontends and backends
> could work fine on any Xen releases. In fact, the Xen public/io headers
> are only provided as reference.
> 

Ok.

> 
> > > Signed-off-by: Stefano Stabellini 
> > > CC: anthony.per...@citrix.com
> > > CC: jgr...@suse.com
> > > ---
> > > NB: The new macros have not been committed to Xen yet. Do not apply this
> > > patch until they do.  
> > 
> > Why ? Does this break compat with older Xen ?  
> 
> No, it does not break compatibility. But I think it is a good idea to
> commit the header to QEMU only after the corresponding Xen header has
> been accepted. I don't want the two to diverge.
> 

Fair enough, but this will put the entire patchset on hold then. When is
Xen 4.9 supposed to be released?

> 
> > > ---
> > > ---
> > >  hw/block/xen_blkif.h |   2 +-
> > >  hw/usb/xen-usb.c |   2 +-
> > >  include/hw/xen/io/ring.h | 455 +++
> > >  3 files changed, 457 insertions(+), 2 deletions(-)
> > >  create mode 100644 include/hw/xen/io/ring.h
> > > 
> > > diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h
> > > index 3300b6f..3e6e1ea 100644
> > > --- a/hw/block/xen_blkif.h
> > > +++ b/hw/block/xen_blkif.h
> > > @@ -1,7 +1,7 @@
> > >  #ifndef XEN_BLKIF_H
> > >  #define XEN_BLKIF_H
> > >  
> > > -#include 
> > > +#include "hw/xen/io/ring.h"
> > >  #include 
> > >  #include 
> > >  
> > > diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
> > > index 8e676e6..370b3d9 100644
> > > --- a/hw/usb/xen-usb.c
> > > +++ b/hw/usb/xen-usb.c
> > > @@ -33,7 +33,7 @@
> > >  #include "qapi/qmp/qint.h"
> > >  #include "qapi/qmp/qstring.h"
> > >  
> > > -#include 
> > > +#include "hw/xen/io/ring.h"
> > >  #include 
> > >  
> > >  /*
> > > diff --git a/include/hw/xen/io/ring.h b/include/hw/xen/io/ring.h
> > > new file mode 100644
> > > index 000..cf01fc3
> > > --- /dev/null
> > > +++ b/include/hw/xen/io/ring.h
> > > @@ -0,0 +1,455 @@
> > > +/**
> > > + * ring.h
> > > + * 
> > > + * Shared producer-consumer ring macros.
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > > + * of this software and associated documentation files (the "Software"), to
> > > + * deal in the Software without restriction, including without limitation the
> > > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > > + * sell copies of the Software, and to permit persons to whom the Software is
> > > + * furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice shall be included in
> > > + * all copies or substantial portions of the Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > > + * DEALINGS IN THE SOFTWARE.
> > > + *
> > > + * Tim Deegan and Andrew Warfield November 2004.
> > > + */
> > > +
> > > +#ifndef __XEN_PUBLIC_IO_RING_H__
> > > +#define __XEN_PUBLIC_IO_RING_H__
> > > +
> > > +#if __XEN_INTERFACE_VERSION__ < 0x00030208
> > > +#define xen_mb()  mb()
> > > +#define xen_rmb() rmb()
> > > +#define xen_wmb() wmb()
> > > +#endif
> > > +
> > > +typedef unsigned int RING_IDX;
> > > +
> > > +/* Round a 32-bit unsigned constant down to the nearest power of two. */
> > > +#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2                  : ((_x) & 0x1))
> > > +#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2    : __RD2(_x))
> > > +#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4    : __RD4(_x))
> > > +#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8    : __RD8(_x))
> > > +#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))
> > > +
> > > +/*
> > > + * Calculate size of a shared ring, given 

Re: [Xen-devel] [PATCH v3 4/7] xen/9pfs: connect to the backend

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Juergen Gross wrote:
> On 14/03/17 22:22, Stefano Stabellini wrote:
> > Hi Juergen,
> > 
> > thank you for the review!
> > 
> > On Tue, 14 Mar 2017, Juergen Gross wrote:
> >> On 14/03/17 00:50, Stefano Stabellini wrote:
> >>> Implement functions to handle the xenbus handshake. Upon connection,
> >>> allocate the rings according to the protocol specification.
> >>>
> >>> Initialize a work_struct and a wait_queue. The work_struct will be used
> >>> to schedule work upon receiving an event channel notification from the
> >>> backend. The wait_queue will be used to wait when the ring is full and
> >>> we need to send a new request.
> >>>
> >>> Signed-off-by: Stefano Stabellini 
> >>> CC: boris.ostrov...@oracle.com
> >>> CC: jgr...@suse.com
> >>> CC: Eric Van Hensbergen 
> >>> CC: Ron Minnich 
> >>> CC: Latchesar Ionkov 
> >>> CC: v9fs-develo...@lists.sourceforge.net
> >>> ---
> 
> >> Did you think about using request_threaded_irq() instead of a workqueue?
> >> For an example see e.g. drivers/scsi/xen-scsifront.c
> > 
> > I like workqueues :-)  It might come down to personal preferences, but I
> > think workqueues are more flexible and a better fit for this use case.
> > Not only it is easy to schedule work in a workqueue from the interrupt
> > handler, but also they can be used for sleeping in the request function
> > if there is not enough room on the ring. Besides, they can easily be
> > configured to share a single thread or to have multiple independent
> > threads.
> 
> I'm fine with the workqueues as long as you have decided to use them
> considering the alternatives. :-)
> 
> >> Can't you use xenbus_read_unsigned() instead of xenbus_read()?
> > 
> > I can use xenbus_read_unsigned in the other cases below, but not here,
> > because versions is in the form: "1,3,4"
> 
> Is this documented somewhere?
> 
> Hmm, are any of the Xenstore entries documented? Shouldn't this be done
> in xen_9pfs.h ?
 
They are documented in docs/misc/9pfs.markdown, under "Xenstore". Given
that it's all written there, especially the semantics, I didn't repeat
it in xen_9pfs.h
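
To illustrate why a single-number helper does not fit a value such as
"1,3,4", here is a small userspace sketch that walks a comma-separated
version list and checks whether a given version is present.
demo_version_listed() is a hypothetical helper written for this note; it
is not the parsing code in the driver and it does not use the xenbus API.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Return true if `wanted` appears in a comma-separated list of
 * non-negative decimal numbers such as "1,3,4". Parsing stops with
 * false at the first malformed entry.
 */
static bool demo_version_listed(const char *versions, unsigned long wanted)
{
    const char *p = versions;

    while (*p != '\0') {
        char *end;
        unsigned long v;

        /* strtoul() would accept signs and whitespace; reject them here. */
        if (!isdigit((unsigned char)*p))
            return false;

        errno = 0;
        v = strtoul(p, &end, 10);
        if (errno != 0 || (*end != ',' && *end != '\0'))
            return false;

        if (v == wanted)
            return true;

        /* Skip the separator, if any, and continue with the next entry. */
        p = (*end == ',') ? end + 1 : end;
    }
    return false;
}
```

Comparing whole numbers rather than searching for a substring matters
here: with the list "13", a search for the character '1' would wrongly
report that version 1 is supported.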



Re: [Xen-devel] [PATCH v2 1/9] configure: change CONFIG_XEN_BACKEND to be a target property

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Paolo Bonzini wrote:
> On 14/03/2017 21:23, Stefano Stabellini wrote:
> > On Tue, 14 Mar 2017, Stefano Stabellini wrote:
> >>> Then you add to Makefile:
> >>>
> >>>  CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y)
> >>>  CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y)
> >>> +CONFIG_XEN := $(CONFIG_XEN_BACKEND)
> >>>  CONFIG_ALL=y
> >>>  -include config-all-devices.mak
> >>>  -include config-all-disas.mak
> >>>
> >>> The Makefile change ensures that they are built before descending in the
> >>> target-specific directories.
> >>
> >> But I don't understand this. Please correct me if I am wrong, but this
> >> change looks like it would end up setting CONFIG_XEN every time that
> >> CONFIG_XEN_BACKEND is set. Without the configure change at the top, it
> >> would end up setting CONFIG_XEN whenever the host supports Xen, even for
> >> non-x86 and non-ARM targets. What am I missing?
> 
> This CONFIG_XEN assignment applies to the toplevel only, i.e. to files
> that are built once.  Targets will still take CONFIG_XEN from
> config-target.mak, and it will not be set for non-x86/non-ARM targets.
> This CONFIG_XEN assignment applies to files that are compiled once.
> 
> The issue you reported here:
> 
> >   LINKaarch64-softmmu/qemu-system-aarch64
> > ../hw/9pfs/xen-9p-backend.o: In function `xen_9pfs_alloc':
> > /local/qemu/hw/9pfs/xen-9p-backend.c:387: undefined reference to 
> > `xenstore_write_be_str'
> > /local/qemu/hw/9pfs/xen-9p-backend.c:388: undefined reference to 
> > `xenstore_write_be_int'
> 
> is because you need this in patch 9:
> 
> -common-obj-$(CONFIG_XEN_BACKEND) += xen-9p-backend.o
> +common-obj-$(CONFIG_XEN) += xen-9p-backend.o
> 

/me shakes his head in shame.
Thank you for the explanation!



Re: [Xen-devel] [PATCH v2 2/9] xen: import ring.h from xen

2017-03-15 Thread Stefano Stabellini
On Wed, 15 Mar 2017, Greg Kurz wrote:
> On Mon, 13 Mar 2017 16:55:53 -0700
> Stefano Stabellini  wrote:
> 
> > Do not use the ring.h header installed on the system. Instead, import
> > the header into the QEMU codebase. This avoids problems when QEMU is
> > built against a Xen version too old to provide all the ring macros.
> > 
> 
> What kind of problems ?

The FLEX macros are only available in Xen 4.9+ (still unreleased).
However, aside from these macros, the Xen 9pfs frontends and backends
could work fine on any Xen releases. In fact, the Xen public/io headers
are only provided as reference.


> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > ---
> > NB: The new macros have not been committed to Xen yet. Do not apply this
> > patch until they do.
> 
> Why ? Does this break compat with older Xen ?

No, it does not break compatibility. But I think it is a good idea to
commit the header to QEMU only after the corresponding Xen header has
been accepted. I don't want the two to diverge.


> > ---
> > ---
> >  hw/block/xen_blkif.h |   2 +-
> >  hw/usb/xen-usb.c |   2 +-
> >  include/hw/xen/io/ring.h | 455 +++
> >  3 files changed, 457 insertions(+), 2 deletions(-)
> >  create mode 100644 include/hw/xen/io/ring.h
> > 
> > diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h
> > index 3300b6f..3e6e1ea 100644
> > --- a/hw/block/xen_blkif.h
> > +++ b/hw/block/xen_blkif.h
> > @@ -1,7 +1,7 @@
> >  #ifndef XEN_BLKIF_H
> >  #define XEN_BLKIF_H
> >  
> > -#include 
> > +#include "hw/xen/io/ring.h"
> >  #include 
> >  #include 
> >  
> > diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
> > index 8e676e6..370b3d9 100644
> > --- a/hw/usb/xen-usb.c
> > +++ b/hw/usb/xen-usb.c
> > @@ -33,7 +33,7 @@
> >  #include "qapi/qmp/qint.h"
> >  #include "qapi/qmp/qstring.h"
> >  
> > -#include 
> > +#include "hw/xen/io/ring.h"
> >  #include 
> >  
> >  /*
> > diff --git a/include/hw/xen/io/ring.h b/include/hw/xen/io/ring.h
> > new file mode 100644
> > index 000..cf01fc3
> > --- /dev/null
> > +++ b/include/hw/xen/io/ring.h
> > @@ -0,0 +1,455 @@
> > +/**
> > + * ring.h
> > + * 
> > + * Shared producer-consumer ring macros.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + *
> > + * Tim Deegan and Andrew Warfield November 2004.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_IO_RING_H__
> > +#define __XEN_PUBLIC_IO_RING_H__
> > +
> > +#if __XEN_INTERFACE_VERSION__ < 0x00030208
> > +#define xen_mb()  mb()
> > +#define xen_rmb() rmb()
> > +#define xen_wmb() wmb()
> > +#endif
> > +
> > +typedef unsigned int RING_IDX;
> > +
> > +/* Round a 32-bit unsigned constant down to the nearest power of two. */
> > +#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2                  : ((_x) & 0x1))
> > +#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2    : __RD2(_x))
> > +#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4    : __RD4(_x))
> > +#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8    : __RD8(_x))
> > +#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))
> > +
> > +/*
> > + * Calculate size of a shared ring, given the total available space for the
> > + * ring and indexes (_sz), and the name tag of the request/response structure.
> > + * A ring contains as many entries as will fit, rounded down to the nearest
> > + * power of two (so we can mask with (size-1) to loop around).
> > + */
> > +#define __CONST_RING_SIZE(_s, _sz) \
> > +(__RD32(((_sz) - offsetof(struct _s##_sring, ring)) / \
> > +   sizeof(((struct _s##_sring *)0)->ring[0])))
> > +/*
> > + * The same for passing in an actual pointer instead 
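
(The quoted header is cut off at this point in the archive.) As a side
note on the __RD* macros quoted above, the following standalone sketch
shows how the chain rounds a 32-bit value down to the nearest power of
two by halving the search window at each step. The DEMO_RD* names are
renamed copies made for this note so they do not clash with the real
header; the masks are written out as full 32-bit constants.

```c
/* Each step checks the upper half of its window and recurses into it,
 * otherwise it falls through to the lower half. */
#define DEMO_RD2(x)  (((x) & 0x00000002) ? 0x2 : ((x) & 0x1))
#define DEMO_RD4(x)  (((x) & 0x0000000c) ? DEMO_RD2((x) >> 2) << 2 : DEMO_RD2(x))
#define DEMO_RD8(x)  (((x) & 0x000000f0) ? DEMO_RD4((x) >> 4) << 4 : DEMO_RD4(x))
#define DEMO_RD16(x) (((x) & 0x0000ff00) ? DEMO_RD8((x) >> 8) << 8 : DEMO_RD8(x))
#define DEMO_RD32(x) (((x) & 0xffff0000) ? DEMO_RD16((x) >> 16) << 16 : DEMO_RD16(x))

/* Function wrapper so the result can be used outside constant expressions. */
static unsigned int demo_round_down_pow2(unsigned int x)
{
    return DEMO_RD32(x);
}
```

For example, 1000 rounds down to 512 and 63 rounds down to 32, while
exact powers of two and zero are returned unchanged. Because the result
is a power of two, a ring with that many entries can wrap its indexes
with a simple "& (size - 1)" mask.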

[Xen-devel] [ovmf test] 106676: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106676 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106676/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963

version targeted for testing:
 ovmf f4fc7d53046e5ce7a4ae1cf586c289fdbebcbb4a
baseline version:
 ovmf e0307a7dad02aa8c0cd8b3b0b9edce8ddb3fef2e

Last test of basis   105963  2017-02-21 21:43:31 Z   21 days
Failing since        105980  2017-02-22 10:03:53 Z   21 days   58 attempts
Testing same since   106676  2017-03-15 02:46:17 Z    0 days    1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ard Biesheuvel 
  Bi, Dandan 
  Brijesh Singh 
  Chao Zhang 
  Chen A Chen 
  Dandan Bi 
  edk2-devel On Behalf Of rthomaiy <[mailto:edk2-devel-boun...@lists.01.org]>
  Feng Tian 
  Fu Siyuan 
  Hao Wu 
  Hegde Nagaraj P 
  Hess Chen 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leo Duran 
  Marvin Haeuser 
  Marvin Häuser 
  Michael Zimmermann 
  Paolo Bonzini 
  Qin Long 
  Richard Thomaiyar 
  Ruiyu Ni 
  Star Zeng 
  Wu Jiaxin 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 4321 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 106674: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106674 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106674/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-libvirt-xsm 11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 59254
 test-armhf-armhf-libvirt 11 guest-start   fail REGR. vs. 59254
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail REGR. vs. 59254
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 59254
 test-armhf-armhf-xl-arndale  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail REGR. vs. 59254
 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  9 debian-install fail REGR. vs. 59254
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-vhd   9 debian-di-install   fail baseline untested
 test-armhf-armhf-libvirt-raw  9 debian-di-install   fail baseline untested
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 59254

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 build-arm64-xsm   5 xen-build fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 build-arm64   5 xen-build fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass

version targeted for testing:
 linux 352526f45387cb96671f13b003bdd5b249e509bd
baseline version:
 linux 45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis    59254  2015-07-09 04:20:48 Z  615 days
Failing since         59348  2015-07-10 04:24:05 Z  614 days  337 attempts
Testing same since   106674  2017-03-15 01:00:01 Z    0 days    1 attempts


8069 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-arm64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops

Re: [Xen-devel] [Xen-users] "Hello Xen Project" Book.

2017-03-15 Thread Mohsen
Thank you so much, Lars.
I used LibreOffice; I will test the HTML format and let you know.
The structure that you listed is good, and I hope Xen experts and developers
like you can dedicate some hours at the weekends to updating this book and
adding more topics to it. I bet it is a good project for helping beginners and
introducing Xen to people. This book can become a Bible for Xen if friends
work on it.


On Wed, 3/15/17, Lars Kurth  wrote:

 Subject: Re: [Xen-users] "Hello Xen Project" Book.
 To: "Mohsen" 
 Cc: "xen-de...@lists.xenproject.org" , 
"xen-us...@lists.xenproject.org" 
 Date: Wednesday, March 15, 2017, 4:04 PM
 
 Hi Mohsen,
 
 > On 15 Mar 2017, at 09:50, Mohsen wrote:
 > 
 > Dear Xen Project community members,
 > 
 > I have written a Xen book recently (pdf attached) which is aimed at
 > teaching Xen newbies. I would like to make the book available to the
 > Xen Project under a CC-BY-SA-3.0 license. Ideally, I would like to
 > publish the content on the Xen Project wiki in an editable form, such
 > that others can contribute and build on it and it stays up-to-date. I
 > also noticed that the Xen Wiki has the
 > https://www.mediawiki.org/wiki/Extension:Collection extension, which
 > should make it possible to create a PDF, ODF or DocBook from the pages
 > for those who want a manual rather than wiki pages.
 
 Thank you for doing this. As far as I can tell
 the fact that you published the book under CC-BY-SA-3.0
 would make it possible to move the content to the wiki. 
 
 > I had a conversation with Lars to check whether this is possible and
 > he believes it is. He suggests that first we upload the book as pdf to
 > the wiki and as a second step, agree an information architecture and
 > then convert the book to mark-down. There are a number of conversion
 > tools which should get us there some of the way, with a bit of cleanup
 > and beautification needed after the initial import. I can make the
 > source available in a format that makes conversion to markdown easier.
 
 We do need to find a way to convert the content into markdown format
 though, which may be quite a bit of work.
 
 I have done this before for html pages, converting them into docman
 markdown. I have not checked whether there are online or command line
 tools which do that for mediawiki markdown. In any case, the conversion
 is fundamentally doable, although it will be somewhat tedious to do
 this. If anyone has more experience, please share and advise what the
 best way forward is.
 
 The main problem that I faced when doing something similar were tables,
 figures and other more advanced formatting. Much of this may get lost
 or "corrupted" in some way and will have to be re-introduced post
 conversion.
 
 @Mohsen: as far as I recall,
 you used Word or LibreOffice to create the book? Is that
 correct? If so, it should be possible to save it in html,
 which would ensure that figures and so on are saved in some
 sensible way. We would probably need to find a temporary
 location where to store this. And we can start experimenting
 a little and maybe provide a quick guide on how to do
 this.
 
 As for the information architecture, I was thinking about a structure
 such as ...
 
 https://wiki.xenproject.org/wiki/
 
 https://wiki.xenproject.org/wiki//title_and_credits
 
 https://wiki.xenproject.org/wiki//
 
 https://wiki.xenproject.org/wiki///
 
 ... a separate article for each article in
 the book, such as "Virtualization and Security".
 As a first step, we would probably keep the original chapter
 structure. 
 
 This would then look something like ...
 https://wiki.xenproject.org/wiki/HelloXenProject
 https://wiki.xenproject.org/wiki/HelloXenProject/0/Title
 https://wiki.xenproject.org/wiki/HelloXenProject/0/Credits
 https://wiki.xenproject.org/wiki/HelloXenProject/0/Licence
 https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro
 https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro/History
 https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro/TypesOfVirtualization
 
 We may need some other
 extensible numbering scheme, which would make it easy to
 create PDF's with https://www.mediawiki.org/wiki/Extension:Collection
 - again, this is something I don't have experience
 with.
 
 > What do people think? Is this a good idea? Would anyone be willing to
 > help? I am not very familiar with Markdown and would need someone else
 > to help with the wikification of the book. Lars already volunteered to
 > help.
 
 I will definitely help, but this would be an activity, which could
 easily be distributed. So help from others would be very highly
 appreciated.
 
 Best Regards
 Lars
 
 
 ___
 Xen-users mailing list
 xen-us...@lists.xen.org
 https://lists.xen.org/xen-users


[Xen-devel] [xen-unstable test] 106671: regressions - FAIL

2017-03-15 Thread osstest service owner
flight 106671 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106671/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail REGR. vs. 106652

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 106642
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 106652
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 106652
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 106652
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 106652
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 106652
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 106652
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 106652

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 build-arm64-xsm   5 xen-build fail   never pass
 build-arm64   5 xen-build fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass

version targeted for testing:
 xen  48321fa86ddefe2fddf728dc972b01bb7c7c8559
baseline version:
 xen  bd8ad2a52aba4911ada897c72f8795172a09a193

Last test of basis   106652  2017-03-14 08:24:37 Z    1 days
Testing same since   106671  2017-03-14 20:44:11 Z    0 days    1 attempts


People who touched revisions under test:
  Jan Beulich 
  Kevin Tian 
  Sergey Dyasli 
  Wei Liu 
  Zhang Chen 

jobs:
 build-amd64-xsm  pass
 

Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Roger Pau Monné
On Wed, Mar 15, 2017 at 11:54:07AM -0500, Venu Busireddy wrote:
> On Wed, Mar 15, 2017 at 04:38:39PM +0000, Roger Pau Monné wrote:
> > On Wed, Mar 15, 2017 at 10:11:35AM -0500, Venu Busireddy wrote:
> > > On Wed, Mar 15, 2017 at 12:56:50PM +0000, Roger Pau Monné wrote:
> > > > On Wed, Mar 15, 2017 at 08:42:04AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > On Wed, Mar 15, 2017 at 12:07:28PM +0000, Roger Pau Monné wrote:
> > > > > > On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk 
> > > > > > wrote:
> > > > > > > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > > > > > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk 
> > > > > > > > wrote:
> > > > > > > > > On Thu, Mar 09, 2017 at 01:26:45PM +0000, Julien Grall wrote:
> > > > > > > > > > Hi Konrad,
> > > > > > > > > > 
> > > > > > > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad 
> > > > > > > > > > > > Rzeszutek Wilk wrote:
> > > > > > > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +0000, Julien 
> > > > > > > > > > > > > Grall wrote:
> > > > > > > > > > > > > .. this as for SR-IOV devices you need the drivers to 
> > > > > > > > > > > > > kick the hardware
> > > > > > > > > > > > > to generate the new bus addresses. And those (along 
> > > > > > > > > > > > > with the BAR regions) are
> > > > > > > > > > > > > not visible in ACPI (they are constructed 
> > > > > > > > > > > > > dynamically).
> > > > > > > > > > > > 
> > > > > > > > > > > > There's already code in Xen [0] to find out the size of 
> > > > > > > > > > > > the BARs of SR-IOV
> > > > > > > > > > > > devices, but I'm not sure what's the intended usage of 
> > > > > > > > > > > > that, does it need to
> > > > > > > > > > > > happen _after_ the driver in Dom0 has done whatever 
> > > > > > > > > > > > magic for this to work?
> > > > > > > > > > > 
> > > > > > > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add 
> > > > > > > > > > > hypercall when
> > > > > > > > > > > the device driver in dom0 has finished "creating" the VF. 
> > > > > > > > > > > See drivers/xen/pci.c
> > > > > > > > > > 
> > > > > > > > > > We are thinking to not use PHYSDEVOP_pci_device_add 
> > > > > > > > > > hypercall for ARM and do
> > > > > > > > > > the PCI scanning in Xen.
> > > > > > > > > > 
> > > > > > > > > > If I understand correctly what you said, only the PCI 
> > > > > > > > > > driver will be able to
> > > > > > > > > > kick SR-IOV device and Xen would not be able to detect the 
> > > > > > > > > > device until it
> > > > > > > > > > has been fully configured. So it would mean that we have to 
> > > > > > > > > > keep
> > > > > > > > > > PHYSDEVOP_pci_device_add around to know when Xen can use 
> > > > > > > > > > the device.
> > > > > > > > > > 
> > > > > > > > > > Am I correct?
> > > > > > > > > 
> > > > > > > > > Yes. Unless the PCI drivers come up with some other way to 
> > > > > > > > > tell the
> > > > > > > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > > > > > > 
> > > > > > > > > Or the underlying bus on ARM can send some 'new device' 
> > > > > > > > > information?
> > > > > > > > 
> > > > > > > > Hm, is this something standard between all the SR-IOV 
> > > > > > > > implementations, or each
> > > > > > > > vendors have their own sauce?
> > > > > > > 
> > > > > > > Gosh, all of them have their own sauce. The only thing that is 
> > > > > > > the same
> > > > > > > is that suddenly behind the PF device there are PCI devices that 
> > > > > > > are responding
> > > > > > > to 0xcfc requests. Magic!
> > > > > > 
> > > > > > I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to 
> > > > > > wait for the
> > > > > > device driver in Dom0 in order to get the information of the VF 
> > > > > > devices, what
> > > > > > Xen cares about is the position of the BARs (so that they can be 
> > > > > > mapped into
> > > > > > Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap 
> > > > > > accesses to
> > > > > > it.
> > > > > > 
> > > > > > AFAICT both of this can be obtained without any driver-specific 
> > > > > > code, since
> > > > > > it's all contained in the PCI SR-IOV spec (but maybe I'm missing 
> > > > > > something).
> > > > > 
> > > > > CC-ing Venu,
> > > > > 
> > > > > Roger, could you point out which of the chapters has this?
> > > > 
> > > > This would be chapter 2 ("Initialization and Resource Allocation"), and 
> > > > then
> > > > there's a "IMPLEMENTATION NOTE" that shows how the PF/VF are matched to
> > > > function numbers in page 45 (I have the following copy, which is the 
> > > > latest
> > > > revision: "Single Root I/O Virtualization and Sharing Specification 
> > > > Revision
> > > > 1.1" from January 20 2010 [0]).
> > > > 
> > > > The document is quite complex, but it is a 

Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 17:46,  wrote:
> On 03/15/2017 06:30 PM, Jan Beulich wrote:
> On 15.03.17 at 17:04,  wrote:
>>> ---
>>> Changes since V1:
>>>  - Added Andrew Cooper's credit, as he's kept the patch current
> >>>    throughout non-trivial code changes since the initial patch.
>>>  - Significantly more patch testing (with XenServer).
>>>  - Restricted lock scope.
>> 
>> Not by much, as it seems. In particular you continue to take the
>> lock even for instructions not accessing memory at all.
> 
> I'll take a closer look.
> 
>> Also, by "reworked" I did assume you mean converted to at least the
>> cmpxchg based model.
> 
> I haven't been able to follow the latest emulator changes closely, could
> you please clarify what you mean by "the cmpxchg model"? Thanks.

This is unrelated to any recent changes. The idea is to make the
->cmpxchg() hook actually behave like what its name says. It's
being used for LOCKed insn writeback already, and it could
therefore simply force a retry of the full instruction if the compare
part of it fails. It may need to be given another parameter, to
allow the hook function to tell LOCKed from "normal" uses.

Jan




Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Venu Busireddy
On Wed, Mar 15, 2017 at 04:38:39PM +0000, Roger Pau Monné wrote:
> On Wed, Mar 15, 2017 at 10:11:35AM -0500, Venu Busireddy wrote:
> > On Wed, Mar 15, 2017 at 12:56:50PM +0000, Roger Pau Monné wrote:
> > > On Wed, Mar 15, 2017 at 08:42:04AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Mar 15, 2017 at 12:07:28PM +0000, Roger Pau Monné wrote:
> > > > > On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > > > > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk 
> > > > > > > wrote:
> > > > > > > > On Thu, Mar 09, 2017 at 01:26:45PM +0000, Julien Grall wrote:
> > > > > > > > > Hi Konrad,
> > > > > > > > > 
> > > > > > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad 
> > > > > > > > > > > Rzeszutek Wilk wrote:
> > > > > > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +0000, Julien Grall 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > .. this as for SR-IOV devices you need the drivers to 
> > > > > > > > > > > > kick the hardware
> > > > > > > > > > > > to generate the new bus addresses. And those (along 
> > > > > > > > > > > > with the BAR regions) are
> > > > > > > > > > > > not visible in ACPI (they are constructed dynamically).
> > > > > > > > > > > 
> > > > > > > > > > > There's already code in Xen [0] to find out the size of 
> > > > > > > > > > > the BARs of SR-IOV
> > > > > > > > > > > devices, but I'm not sure what's the intended usage of 
> > > > > > > > > > > that, does it need to
> > > > > > > > > > > happen _after_ the driver in Dom0 has done whatever magic 
> > > > > > > > > > > for this to work?
> > > > > > > > > > 
> > > > > > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add 
> > > > > > > > > > hypercall when
> > > > > > > > > > the device driver in dom0 has finished "creating" the VF. 
> > > > > > > > > > See drivers/xen/pci.c
> > > > > > > > > 
> > > > > > > > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall 
> > > > > > > > > for ARM and do
> > > > > > > > > the PCI scanning in Xen.
> > > > > > > > > 
> > > > > > > > > If I understand correctly what you said, only the PCI driver 
> > > > > > > > > will be able to
> > > > > > > > > kick SR-IOV device and Xen would not be able to detect the 
> > > > > > > > > device until it
> > > > > > > > > has been fully configured. So it would mean that we have to 
> > > > > > > > > keep
> > > > > > > > > PHYSDEVOP_pci_device_add around to know when Xen can use the 
> > > > > > > > > device.
> > > > > > > > > 
> > > > > > > > > Am I correct?
> > > > > > > > 
> > > > > > > > Yes. Unless the PCI drivers come up with some other way to tell 
> > > > > > > > the
> > > > > > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > > > > > 
> > > > > > > > Or the underlying bus on ARM can send some 'new device' 
> > > > > > > > information?
> > > > > > > 
> > > > > > > Hm, is this something standard between all the SR-IOV 
> > > > > > > implementations, or each
> > > > > > > vendors have their own sauce?
> > > > > > 
> > > > > > Gosh, all of them have their own sauce. The only thing that is the 
> > > > > > same
> > > > > > is that suddenly behind the PF device there are PCI devices that are 
> > > > > > responding
> > > > > > to 0xcfc requests. Magic!
> > > > > 
> > > > > I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to 
> > > > > wait for the
> > > > > device driver in Dom0 in order to get the information of the VF 
> > > > > devices, what
> > > > > Xen cares about is the position of the BARs (so that they can be 
> > > > > mapped into
> > > > > Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap 
> > > > > accesses to
> > > > > it.
> > > > > 
> > > > > AFAICT both of this can be obtained without any driver-specific code, 
> > > > > since
> > > > > it's all contained in the PCI SR-IOV spec (but maybe I'm missing 
> > > > > something).
> > > > 
> > > > CC-ing Venu,
> > > > 
> > > > Roger, could you point out which of the chapters has this?
> > > 
> > > This would be chapter 2 ("Initialization and Resource Allocation"), and 
> > > then
> > > there's a "IMPLEMENTATION NOTE" that shows how the PF/VF are matched to
> > > function numbers in page 45 (I have the following copy, which is the 
> > > latest
> > > revision: "Single Root I/O Virtualization and Sharing Specification 
> > > Revision
> > > 1.1" from January 20 2010 [0]).
> > > 
> > > The document is quite complex, but it is a standard that all SR-IOV 
> > > devices
> > > should follow so AFAICT Xen should be able to get all the information 
> > > that it
> > > needs from the PCI config space in order to detect the PF/VF BARs and the 
> > > BDF
> > > device addresses.
> > > 
> > > Roger.
> > > 
> > > [0] 

Re: [Xen-devel] [PATCH v3 00/21] x86/xen: untangle PV and PVHVM guest support code

2017-03-15 Thread Boris Ostrovsky
On 03/15/2017 05:42 AM, Juergen Gross wrote:
> On 14/03/17 18:35, Vitaly Kuznetsov wrote:
>> Changes since v2:
>> - Rebase to 4.11.0-rc1+
>> - XEN_HAVE_PVMMU moved to config XEN_PV [Juergen Gross]
>> - .pin_vcpu kept for x86_hyper_xen_hvm to support PVH Dom0 in future
>>[Juergen Gross]
>> - 'extern' qualifiers dropped from newly introduced function prototypes
>>   in headers [Juergen Gross]
>> - A couple of #includes added to address build issues with different
>>   configs [kbuild test robot].
>> - Juergen reviewed-bys added (hope they stand with the above mentioned
>>   changes).
> They do.
>
> In case nobody objects I'll take this series for 4.12.

I haven't tried different combinations but the default set (all yes)
passed my tests.

In the light of 5-level page tables patches the most interesting, I
think, would be !CONFIG_XEN_PV.

-boris


>
>
> Juergen
>
>> The series can also be pulled from https://github.com/vittyvk/linux.git
>>  (xen_pv_hvm_split_v3 branch).
>>
>> Some patches are known to produce checkpatch.pl WARNINGS and a couple of
>> ERRORs, I fixed a few (mostly in _hvm* code I split) and I refrained from
>> fixing the rest to make it easier to review. I think that we may leave PV
>> code as it is as sooner or later it will go away.
>>
>> Original description:
>>
>> I have a long-standing idea to separate PV and PVHVM code in kernel and 
>> introduce Kconfig options to make it possible to enable the required
>> parts only breaking the current 'all or nothing' approach.
>>
>> Motivation:
>> - Xen related x86 code in kernel is rather big and it is unclear which
>>   parts of it are required for PV, for HVM or for both. With PVH coming
>>   into the picture it becomes even more tangled. It makes it hard to
>>   understand/audit the code.
>>
>> - In some cases we may want to avoid bloating kernel by supporting Xen
>>   guests we don't need. In particular, 90% of the code in arch/x86/xen/ is
>>   required to support PV guests and one may require PVHVM support only.
>>
>> - PV guests are supposed to go away one day and such code separation would
>>   help us to get ready.
>>
>> This series adds XEN_PV Kconfig option and makes it possible to build PV-only
>> and PVHVM-only kernels. It also makes it possible to disable Dom0 support.
>>
>> Some patches are rather big but this is mostly just moving code around, no
>> functional changes intended. I smoke tested it with PV-only and PVHVM-only
>> builds, booted and did save/restore test. I also tried the newly introduced
>> PVHv2 guest, it even worked!
>>
>> Vitaly Kuznetsov (21):
>>   x86/xen: separate PV and HVM hypervisors
>>   x86/xen: globalize have_vcpu_info_placement
>>   x86/xen: add CONFIG_XEN_PV to Kconfig
>>   x86/xen: split off enlighten_pvh.c
>>   x86/xen: split off enlighten_hvm.c
>>   x86/xen: split off enlighten_pv.c
>>   x86/xen: split xen_smp_intr_init()/xen_smp_intr_free()
>>   x86/xen: split xen_smp_prepare_boot_cpu()
>>   x86/xen: split xen_cpu_die()
>>   x86/xen: split off smp_hvm.c
>>   x86/xen: split off smp_pv.c
>>   x86/xen: split off mmu_hvm.c
>>   x86/xen: split off mmu_pv.c
>>   x86/xen: split suspend.c for PV and PVHVM guests
>>   x86/xen: put setup.c, pmu.c and apic.c under CONFIG_XEN_PV
>>   x86/xen: define startup_xen for XEN PV only
>>   x86/xen: create stubs for HVM-only builds in page.h
>>   xen/balloon: decorate PV-only parts with #ifdef CONFIG_XEN_PV
>>   xen: create xen_create/destroy_contiguous_region() stubs for PVHVM
>> only builds
>>   x86/xen: enable PVHVM-only builds
>>   x86/xen: rename some PV-only functions in smp_pv.c
>>
>>  arch/x86/include/asm/hypervisor.h |3 +-
>>  arch/x86/include/asm/xen/page.h   |   25 +
>>  arch/x86/kernel/cpu/hypervisor.c  |7 +-
>>  arch/x86/kernel/process_64.c  |2 +-
>>  arch/x86/xen/Kconfig  |   33 +-
>>  arch/x86/xen/Makefile |   16 +-
>>  arch/x86/xen/enlighten.c  | 1925 +
>>  arch/x86/xen/enlighten_hvm.c  |  213 +++
>>  arch/x86/xen/enlighten_pv.c   | 1522 
>>  arch/x86/xen/enlighten_pvh.c  |  115 ++
>>  arch/x86/xen/mmu.c| 2776 +
>>  arch/x86/xen/mmu_hvm.c|   79 ++
>>  arch/x86/xen/mmu_pv.c | 2635 +++
>>  arch/x86/xen/pmu.h|5 +
>>  arch/x86/xen/smp.c|  517 +--
>>  arch/x86/xen/smp.h|   16 +
>>  arch/x86/xen/smp_hvm.c|   58 +
>>  arch/x86/xen/smp_pv.c |  500 +++
>>  arch/x86/xen/suspend.c|   54 -
>>  arch/x86/xen/suspend_hvm.c|   22 +
>>  arch/x86/xen/suspend_pv.c |   46 +
>>  arch/x86/xen/xen-head.S   |4 +
>>  arch/x86/xen/xen-ops.h|   23 +
>>  drivers/xen/balloon.c |   30 +-
>>  include/xen/xen-ops.h |   14 +
>>  25 files changed, 5461 insertions(+), 5179 deletions(-)
>>  create mode 100644 

Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 17:41,  wrote:
> On 03/15/2017 12:02 PM, Jan Beulich wrote:
> On 15.03.17 at 16:26,  wrote:
>>> On 15/03/17 15:23, Olaf Hering wrote:
 On Wed, Mar 15, Olaf Hering wrote:

> On Wed, Mar 15, Andrew Cooper wrote:
> >> As a crazy idea, does this help?
>> tsc->incarnation = 0
 This does indeed help. One system shows now the results below, which
 means the performance goes down during migration (to localhost) and goes
 back to normal after migration.

 What impact has such change to ->incarnation?
>>> So what this means is that, after migrate, Xen sees that the advertised
>>> TSC value doesn't match the current hardware's TSC, so it enables rdtsc
>>> interception so the values reported back can be emulated at the old
>>> frequency.
>>>
>>> There is no easy solution to this problem.
>> Especially for localhost migration it ought to be possible to deal
>> with this. What I'm wondering is why the frequency comparison
>> in default mode handling in tsc_set_info() is HVM-only:
>>
>> if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
>>  (has_hvm_container_domain(d) ?
>>   (d->arch.tsc_khz == cpu_khz ||
>>hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
>>   incarnation == 0) )
>> {
>>
>> Boris, you've made it be this way in 82713ec8d2 ("x86: use native
>> RDTSC(P) execution when guest and host frequencies are the
>> same").
> 
> Don't know why.
> 
> In fact, I looked at the history of this patch and earlier versions had
> 
> @@ -1889,10 +1890,14 @@ void tsc_set_info(struct domain *d,
>  d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>  d->arch.tsc_khz = gtsc_khz ? gtsc_khz : cpu_khz;
>  set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
> -/* use native TSC if initial host has safe TSC, has not migrated
> - * yet and tsc_khz == cpu_khz */
> -if ( host_tsc_is_safe() && incarnation == 0 &&
> -d->arch.tsc_khz == cpu_khz )
> +/*
> + * Use native TSC if initial host has safe TSC and either has not
> + * migrated yet or tsc_khz == cpu_khz (either "naturally" or via
> + * TSC scaling)
> + */
> +if ( host_tsc_is_safe() &&
> + (incarnation == 0 || d->arch.tsc_khz == cpu_khz ||
> +  cpu_has_tsc_ratio) )
>  d->arch.vtsc = 0;

Which, as pointed out back then, was not PV-correct.

> However, the original code was even stricter.

True.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 4/6] VT-d: introduce update_irte to update irte safely

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 06:11,  wrote:
> +static void update_irte(struct iremap_entry *entry,
> +const struct iremap_entry *new_ire)
> +{
> +if ( cpu_has_cx16 )
> +{
> +__uint128_t ret;
> +struct iremap_entry old_ire;
> +
> +old_ire = *entry;
> +ret = cmpxchg16b(entry, &old_ire, new_ire);
> +
> +/*
> + * In the above, we use cmpxchg16 to atomically update the 128-bit
> + * IRTE, and the hardware cannot update the IRTE behind us, so
> + * the return value of cmpxchg16 should be the same as old_ire.
> + * This ASSERT validate it.
> + */
> +ASSERT(ret == old_ire.val);
> +}
> +else
> +{
> +/*
> + * The following method to update IRTE is safe on condition that
> + * only the high qword or the low qword is to be updated.
> + * If entire IRTE is to be updated, callers should make sure the
> + * IRTE is not in use.
> + */
> +entry->lo = new_ire->lo;
> +entry->hi = new_ire->hi;

How is this any better than structure assignment? Furthermore
the comment here partially contradicts the commit message. I
guess callers need to be given a way (another function parameter?)
to signal the function whether the unsafe variant is okay to use.
You should then add a suitable BUG_ON() in the else path here.

Jan
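
The suggestion above (let callers signal whether the non-atomic update
is acceptable, and trap in the fallback path otherwise) could look
roughly like the sketch below. It is only an illustration under stated
assumptions: struct irte_sketch, update_irte_sketch(), atomic_required
and have_cx16 are hypothetical names, the cmpxchg16b step is replaced by
a plain assignment, and the function returns false where hypervisor code
would presumably use BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 128-bit IRTE: two 64-bit halves. */
struct irte_sketch {
    uint64_t lo;
    uint64_t hi;
};

/*
 * Returns true if the update was performed.
 * 'atomic_required' is the hypothetical extra parameter: callers set it
 * when the entry may be in use and both halves change.
 * 'have_cx16' models cpu_has_cx16.
 */
static bool update_irte_sketch(struct irte_sketch *entry,
                               const struct irte_sketch *new_ire,
                               bool atomic_required, bool have_cx16)
{
    if ( have_cx16 )
    {
        /* Real code would use a 16-byte compare-and-exchange here. */
        *entry = *new_ire;
        return true;
    }

    /* Stand-in for BUG_ON(atomic_required): refuse instead of crashing. */
    if ( atomic_required )
        return false;

    /* Non-atomic fallback: a plain structure assignment. */
    *entry = *new_ire;
    return true;
}
```

With such a parameter the comment in the else branch would turn into a
checked precondition rather than a convention callers have to remember.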



Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation

2017-03-15 Thread Razvan Cojocaru
On 03/15/2017 06:30 PM, Jan Beulich wrote:
>>>> On 15.03.17 at 17:04,  wrote:
>> ---
>> Changes since V1:
>>  - Added Andrew Cooper's credit, as he's kept the patch current
>>    throughout non-trivial code changes since the initial patch.
>>  - Significantly more patch testing (with XenServer).
>>  - Restricted lock scope.
> 
> Not by much, as it seems. In particular you continue to take the
> lock even for instructions not accessing memory at all.

I'll take a closer look.

> Also, by "reworked" I did assume you mean converted to at least the
> cmpxchg based model.

I haven't been able to follow the latest emulator changes closely, could
you please clarify what you mean by "the cmpxchg model"? Thanks.

>> --- a/tools/tests/x86_emulator/test_x86_emulator.c
>> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
>> @@ -283,6 +283,14 @@ static int read_msr(
>>  return X86EMUL_UNHANDLEABLE;
>>  }
>>  
>> +static void smp_lock(bool locked)
>> +{
>> +}
>> +
>> +static void smp_unlock(bool locked)
>> +{
>> +}
> 
> I don't think the hooks should be a requirement, and hence these
> shouldn't be needed.

I'll make them optional.

>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -529,6 +529,8 @@ int arch_domain_create(struct domain *d, unsigned int 
>> domcr_flags,
>>  if ( config == NULL && !is_idle_domain(d) )
>>  return -EINVAL;
>>  
>> +percpu_rwlock_resource_init(&d->arch.emulate_lock, 
>> emulate_locked_rwlock);
> 
> This should move into the same file as where the lock and the hook
> functions live, so that the variable can be static. I'm not sure ...
> 
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -24,6 +24,8 @@
>>  #include 
>>  #include 
>>  
>> +DEFINE_PERCPU_RWLOCK_GLOBAL(emulate_locked_rwlock);
> 
> ... this is the right file, though, considering the wide (including PV)
> use of it.

I'll hunt for a better place for it.

>> @@ -3065,6 +3065,8 @@ x86_emulate(
>>  d = state.desc;
>>  #define state (&state)
>>  
>> +ops->smp_lock(lock_prefix);
> 
> There's a "goto complete_insn" upwards from here, which therefore
> bypasses the acquire, but goes through the release path. Also this is
> still too early to take the lock.

True, it appears that there have been changes in staging since the last
test. I'll need to follow the locking patch carefully.

>> @@ -7925,8 +7932,11 @@ x86_emulate(
>>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>>  
>>   done:
>> +ops->smp_unlock(lock_prefix);
> 
> And this, imo, is too late (except for covering error exits coming
> here). I don't think you can avoid having a local tracking variable.

Fair enough. Will use one.

>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -448,6 +448,14 @@ struct x86_emulate_ops
>>  /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */
>>  int (*vmfunc)(
>>  struct x86_emulate_ctxt *ctxt);
>> +
>> +/* smp_lock: Take a write lock if locked, read lock otherwise. */
>> +void (*smp_lock)(
>> +bool locked);
>> +
>> +/* smp_unlock: Write unlock if locked, read unlock otherwise. */
>> +void (*smp_unlock)(
>> +bool locked);
>>  };
> 
> All hooks should take a ctxt pointer.

I'll try to adapt them.


Thanks,
Razvan


Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Boris Ostrovsky
On 03/15/2017 12:02 PM, Jan Beulich wrote:
>>>> On 15.03.17 at 16:26,  wrote:
>> On 15/03/17 15:23, Olaf Hering wrote:
>>> On Wed, Mar 15, Olaf Hering wrote:
>>>
>>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>>> As a crazy idea, does this help?
>>>>> tsc->incarnation = 0
>>> This does indeed help. One system shows now the results below, which
>>> means the performance goes down during migration (to localhost) and goes
>>> back to normal after migration.
>>>
>>> What impact has such change to ->incarnation?
>> So what this means is that, after migrate, Xen sees that the advertised
>> TSC value doesn't match the current hardware's TSC, so enables rdtsc
>> interception so the values reported back can be emulated at the old
>> frequency.
>>
>> There is no easy solution to this problem.
> Especially for localhost migration it ought to be possible to deal
> with this. What I'm wondering is why the frequency comparison
> in default mode handling in tsc_set_info() is HVM-only:
>
> if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
>  (has_hvm_container_domain(d) ?
>   (d->arch.tsc_khz == cpu_khz ||
>hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
>   incarnation == 0) )
> {
>
> Boris, you've made it be this way in 82713ec8d2 ("x86: use native
> RDTSC(P) execution when guest and host frequencies are the
> same").

Don't know why.

In fact, I looked at the history of this patch and earlier versions had

@@ -1889,10 +1890,14 @@ void tsc_set_info(struct domain *d,
 d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
 d->arch.tsc_khz = gtsc_khz ? gtsc_khz : cpu_khz;
 set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
-/* use native TSC if initial host has safe TSC, has not migrated
- * yet and tsc_khz == cpu_khz */
-if ( host_tsc_is_safe() && incarnation == 0 &&
-d->arch.tsc_khz == cpu_khz )
+/*
+ * Use native TSC if initial host has safe TSC and either has not
+ * migrated yet or tsc_khz == cpu_khz (either "naturally" or via
+ * TSC scaling)
+ */
+if ( host_tsc_is_safe() &&
+ (incarnation == 0 || d->arch.tsc_khz == cpu_khz ||
+  cpu_has_tsc_ratio) )
 d->arch.vtsc = 0;
 else 
 d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);


However, the original code was even stricter.


-boris





Re: [Xen-devel] [PATCH v10 1/6] VT-d: Introduce new fields in msi_desc to track binding with guest interrupt

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 06:11,  wrote:
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -552,11 +552,12 @@ static int msi_msg_to_remap_entry(
>  struct msi_desc *msi_desc, struct msi_msg *msg)
>  {
>  struct iremap_entry *iremap_entry = NULL, *iremap_entries;
> -struct iremap_entry new_ire;
> +struct iremap_entry new_ire = {{0}};

Any reason this isn't simple "{ }"?

> @@ -595,33 +596,35 @@ static int msi_msg_to_remap_entry(
>  GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
>   iremap_entries, iremap_entry);
>  
> -memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
> -
> -/* Set interrupt remapping table entry */
> -new_ire.remap.fpd = 0;
> -new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
> -new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> -new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
> -/* Hardware require RH = 1 for LPR delivery mode */
> -new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> -new_ire.remap.avail = 0;
> -new_ire.remap.res_1 = 0;
> -new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> -MSI_DATA_VECTOR_MASK;
> -new_ire.remap.res_2 = 0;
> -if ( x2apic_enabled )
> -new_ire.remap.dst = msg->dest32;
> +if ( !pi_desc )
> +{
> +new_ire.remap.dm = msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT;
> +new_ire.remap.tm = msg->data >> MSI_DATA_TRIGGER_SHIFT;
> +new_ire.remap.dlm = msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT;
> +/* Hardware require RH = 1 for LPR delivery mode */

As you're touching this anyway, please make it "requires" and
"lowest priority" respectively.

> @@ -968,59 +927,14 @@ int pi_update_irte(const struct vcpu *v, const struct 
> pirq *pirq,
>  rc = -ENODEV;
>  goto unlock_out;
>  }
> -
> -pci_dev = msi_desc->dev;
> -if ( !pci_dev )
> -{
> -rc = -ENODEV;
> -goto unlock_out;
> -}
> -
> -remap_index = msi_desc->remap_index;
> +msi_desc->pi_desc = pi_desc;
> +msi_desc->gvec = gvec;

Am I overlooking something - I can't seem to find any place where these
two fields (or at least the former) get cleared again? This may be correct,
but if it is the reason wants recording in the commit message.

> --- a/xen/include/asm-x86/msi.h
> +++ b/xen/include/asm-x86/msi.h
> @@ -118,6 +118,8 @@ struct msi_desc {
>   struct msi_msg msg; /* Last set MSI message */
>  
>   int remap_index;/* index in interrupt remapping table */
> + const void *pi_desc;/* PDA, indicates msi is delivered via 
> VT-d PI */

Why "void"? Please let's play type safe wherever we can.

Jan



Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Roger Pau Monné
On Wed, Mar 15, 2017 at 10:11:35AM -0500, Venu Busireddy wrote:
> On Wed, Mar 15, 2017 at 12:56:50PM +, Roger Pau Monné wrote:
> > On Wed, Mar 15, 2017 at 08:42:04AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Mar 15, 2017 at 12:07:28PM +, Roger Pau Monné wrote:
> > > > On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > > > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk 
> > > > > > wrote:
> > > > > > > On Thu, Mar 09, 2017 at 01:26:45PM +, Julien Grall wrote:
> > > > > > > > Hi Konrad,
> > > > > > > > 
> > > > > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné 
> > > > > > > > > wrote:
> > > > > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad Rzeszutek 
> > > > > > > > > > Wilk wrote:
> > > > > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +, Julien Grall 
> > > > > > > > > > > wrote:
> > > > > > > > > > > .. this as for SR-IOV devices you need the drivers to 
> > > > > > > > > > > kick the hardware
> > > > > > > > > > > to generate the new bus addresses. And those (along with 
> > > > > > > > > > > the BAR regions) are
> > > > > > > > > > > not visible in ACPI (they are constructed dynamically).
> > > > > > > > > > 
> > > > > > > > > > There's already code in Xen [0] to find out the size of the 
> > > > > > > > > > BARs of SR-IOV
> > > > > > > > > > devices, but I'm not sure what's the intended usage of 
> > > > > > > > > > that, does it need to
> > > > > > > > > > happen _after_ the driver in Dom0 has done whatever magic 
> > > > > > > > > > for this to work?
> > > > > > > > > 
> > > > > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add 
> > > > > > > > > hypercall when
> > > > > > > > > the device driver in dom0 has finished "creating" the VF. See 
> > > > > > > > > drivers/xen/pci.c
> > > > > > > > 
> > > > > > > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall 
> > > > > > > > for ARM and do
> > > > > > > > the PCI scanning in Xen.
> > > > > > > > 
> > > > > > > > If I understand correctly what you said, only the PCI driver 
> > > > > > > > will be able to
> > > > > > > > kick SR-IOV device and Xen would not be able to detect the 
> > > > > > > > device until it
> > > > > > > > has been fully configured. So it would mean that we have to keep
> > > > > > > > PHYSDEVOP_pci_device_add around to know when Xen can use the 
> > > > > > > > device.
> > > > > > > > 
> > > > > > > > Am I correct?
> > > > > > > 
> > > > > > > Yes. Unless the PCI drivers come up with some other way to tell 
> > > > > > > the
> > > > > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > > > > 
> > > > > > > Or the underlying bus on ARM can send some 'new device' 
> > > > > > > information?
> > > > > > 
> > > > > > Hm, is this something standard between all the SR-IOV 
> > > > > > implementations, or each
> > > > > > vendors have their own sauce?
> > > > > 
> > > > > Gosh, all of them have their own sauce. The only thing that is the 
> > > > > same
> > > > > is that suddenly behind the PF device there are PCI devices that are 
> > > > > responding
> > > > > to 0xcfc requests. Magic!
> > > > 
> > > > I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to wait 
> > > > for the
> > > > device driver in Dom0 in order to get the information of the VF 
> > > > devices, what
> > > > Xen cares about is the position of the BARs (so that they can be mapped 
> > > > into
> > > > Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap 
> > > > accesses to
> > > > it.
> > > > 
> > > > AFAICT both of this can be obtained without any driver-specific code, 
> > > > since
> > > > it's all contained in the PCI SR-IOV spec (but maybe I'm missing 
> > > > something).
> > > 
> > > CC-ing Venu,
> > > 
> > > Roger, could you point out which of the chapters has this?
> > 
> > This would be chapter 2 ("Initialization and Resource Allocation"), and then
> > there's a "IMPLEMENTATION NOTE" that shows how the PF/VF are matched to
> > function numbers in page 45 (I have the following copy, which is the latest
> > revision: "Single Root I/O Virtualization and Sharing Specification Revision
> > 1.1" from January 20 2010 [0]).
> > 
> > The document is quite complex, but it is a standard that all SR-IOV devices
> > should follow so AFAICT Xen should be able to get all the information that 
> > it
> > needs from the PCI config space in order to detect the PF/VF BARs and the 
> > BDF
> > device addresses.
> > 
> > Roger.
> > 
> > [0] https://members.pcisig.com/wg/PCI-SIG/document/download/8238
> 
> I do not have access to this document, so I have to rely on Rev 1.0
> document, but I don't think this aspect of the spec changed much.
> 
> In any case, I am afraid I am not seeing the overall picture, but I
> would like to comment on the last part of this discussion. 
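
As an illustration of the claim quoted above that the PF/VF addresses
can be derived from PCI config space alone: to my understanding of the
SR-IOV specification, the routing ID (bus/device/function) of VF number
n, counted from 1, is the PF's routing ID plus First VF Offset plus
(n - 1) times VF Stride, both values being read from the PF's SR-IOV
capability. The sketch below only does that arithmetic; make_rid() and
vf_rid() are hypothetical helper names, and real code would read the two
fields from config space after NumVFs has been programmed, since they
may depend on it.

```c
#include <assert.h>
#include <stdint.h>

/* Routing ID layout: bus[15:8], device[7:3], function[2:0]. */
static uint16_t make_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/*
 * Routing ID of VF number 'n' (counted from 1), given the PF's routing ID
 * and the First VF Offset / VF Stride values taken from the PF's SR-IOV
 * capability. The result wraps at 16 bits.
 */
static uint16_t vf_rid(uint16_t pf_rid, uint16_t first_vf_offset,
                       uint16_t vf_stride, unsigned int n)
{
    return (uint16_t)(pf_rid + first_vf_offset + (n - 1) * vf_stride);
}
```

For a PF at 01:00.0 with an offset of 128 and a stride of 2, the first
two VFs would land on 01:10.0 and 01:10.2 (device and function in hex).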

Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 17:04,  wrote:
> ---
> Changes since V1:
>  - Added Andrew Cooper's credit, as he's kept the patch current
>    throughout non-trivial code changes since the initial patch.
>  - Significantly more patch testing (with XenServer).
>  - Restricted lock scope.

Not by much, as it seems. In particular you continue to take the
lock even for instructions not accessing memory at all.

Also, by "reworked" I did assume you mean converted to at least the
cmpxchg based model.

> --- a/tools/tests/x86_emulator/test_x86_emulator.c
> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
> @@ -283,6 +283,14 @@ static int read_msr(
>  return X86EMUL_UNHANDLEABLE;
>  }
>  
> +static void smp_lock(bool locked)
> +{
> +}
> +
> +static void smp_unlock(bool locked)
> +{
> +}

I don't think the hooks should be a requirement, and hence these
shouldn't be needed.

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -529,6 +529,8 @@ int arch_domain_create(struct domain *d, unsigned int 
> domcr_flags,
>  if ( config == NULL && !is_idle_domain(d) )
>  return -EINVAL;
>  
> +percpu_rwlock_resource_init(&d->arch.emulate_lock, 
> emulate_locked_rwlock);

This should move into the same file as where the lock and the hook
functions live, so that the variable can be static. I'm not sure ...

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -24,6 +24,8 @@
>  #include 
>  #include 
>  
> +DEFINE_PERCPU_RWLOCK_GLOBAL(emulate_locked_rwlock);

... this is the right file, though, considering the wide (including PV)
use of it.

> @@ -1731,6 +1755,8 @@ static const struct x86_emulate_ops 
> hvm_emulate_ops_no_write = {
>  .put_fpu   = hvmemul_put_fpu,
>  .invlpg= hvmemul_invlpg,
>  .vmfunc= hvmemul_vmfunc,
> +.smp_lock  = emulate_smp_lock,
> +.smp_unlock= emulate_smp_unlock,
>  };

No need for the hooks here.

> @@ -5485,6 +5487,8 @@ static const struct x86_emulate_ops mmio_ro_emulate_ops 
> = {
>  .write  = mmio_ro_emulated_write,
>  .validate   = pv_emul_is_mem_write,
>  .cpuid  = pv_emul_cpuid,
> +.smp_lock   = emulate_smp_lock,
> +.smp_unlock = emulate_smp_unlock,
>  };

Nor here.

> @@ -5524,6 +5528,8 @@ static const struct x86_emulate_ops mmcfg_intercept_ops 
> = {
>  .write  = mmcfg_intercept_write,
>  .validate   = pv_emul_is_mem_write,
>  .cpuid  = pv_emul_cpuid,
> +.smp_lock   = emulate_smp_lock,
> +.smp_unlock = emulate_smp_unlock,
>  };

Not sure about this one, but generally I'd expect no LOCKed accesses
to MMCFG space.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -2957,6 +2957,8 @@ static const struct x86_emulate_ops priv_op_ops = {
>  .write_msr   = priv_op_write_msr,
>  .cpuid   = pv_emul_cpuid,
>  .wbinvd  = priv_op_wbinvd,
> +.smp_lock= emulate_smp_lock,
> +.smp_unlock  = emulate_smp_unlock,
>  };

No need for the hooks again.

> @@ -3065,6 +3065,8 @@ x86_emulate(
>  d = state.desc;
>  #define state (&state)
>  
> +ops->smp_lock(lock_prefix);

There's a "goto complete_insn" upwards from here, which therefore
bypasses the acquire, but goes through the release path. Also this is
still too early to take the lock.

> @@ -7925,8 +7932,11 @@ x86_emulate(
>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>  
>   done:
> +ops->smp_unlock(lock_prefix);

And this, imo, is too late (except for covering error exits coming
here). I don't think you can avoid having a local tracking variable.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -448,6 +448,14 @@ struct x86_emulate_ops
>  /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */
>  int (*vmfunc)(
>  struct x86_emulate_ctxt *ctxt);
> +
> +/* smp_lock: Take a write lock if locked, read lock otherwise. */
> +void (*smp_lock)(
> +bool locked);
> +
> +/* smp_unlock: Write unlock if locked, read unlock otherwise. */
> +void (*smp_unlock)(
> +bool locked);
>  };

All hooks should take a ctxt pointer.

Jan
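
Three of the points raised above (hooks being optional, hooks taking a
context pointer, and a local variable tracking whether the lock was
actually taken) can be combined in a small sketch. This is not the
emulator's code: struct emul_ctxt_sketch, struct emul_ops_sketch,
emulate_sketch() and the counting hooks are hypothetical stand-ins, and
the early_exit / accesses_memory flags merely model the "goto" that
bypasses the acquire and the case of instructions not touching memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct x86_emulate_ctxt. */
struct emul_ctxt_sketch {
    int lock_depth;   /* outstanding acquisitions (for checking only) */
    int acquired;     /* total acquisitions so far (for checking only) */
};

struct emul_ops_sketch {
    /* Both hooks are optional (may be NULL) and take the context pointer. */
    void (*smp_lock)(struct emul_ctxt_sketch *ctxt, bool locked);
    void (*smp_unlock)(struct emul_ctxt_sketch *ctxt, bool locked);
};

static void count_lock(struct emul_ctxt_sketch *ctxt, bool locked)
{
    (void)locked;
    ctxt->lock_depth++;
    ctxt->acquired++;
}

static void count_unlock(struct emul_ctxt_sketch *ctxt, bool locked)
{
    (void)locked;
    ctxt->lock_depth--;
}

static int emulate_sketch(struct emul_ctxt_sketch *ctxt,
                          const struct emul_ops_sketch *ops,
                          bool lock_prefix, bool accesses_memory,
                          bool early_exit)
{
    bool lock_taken = false;   /* the local tracking variable */

    /* Models a jump to the common exit from before the acquire. */
    if ( early_exit )
        goto done;

    /* Only take the lock for memory accesses, and only if a hook exists. */
    if ( accesses_memory && ops->smp_lock )
    {
        ops->smp_lock(ctxt, lock_prefix);
        lock_taken = true;
    }

    /* ... the memory access itself would be emulated here ... */

 done:
    /* Release only what was actually acquired. */
    if ( lock_taken && ops->smp_unlock )
        ops->smp_unlock(ctxt, lock_prefix);

    return 0;
}
```

Because the release is keyed on lock_taken rather than on lock_prefix,
an exit path that never reached the acquire cannot unbalance the lock.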


Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 16:53,  wrote:
> On 15/03/17 16:43, Boris Ostrovsky wrote:
>> On 03/15/2017 11:26 AM, Andrew Cooper wrote:
>>> On 15/03/17 15:23, Olaf Hering wrote:
>>>> On Wed, Mar 15, Olaf Hering wrote:
>>>>
>>>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>>>> As a crazy idea, does this help?
>>>>>> tsc->incarnation = 0
>>>> This does indeed help. One system shows now the results below, which
>>>> means the performance goes down during migration (to localhost) and goes
>>>> back to normal after migration.
>>>>
>>>> What impact has such change to ->incarnation?
>>> So what this means is that, after migrate, Xen sees that the advertised
>>> TSC value doesn't match the current hardware's TSC, so enables rdtsc
>>> interception so the values reported back can be emulated at the old
>>> frequency.
>>>
>>> There is no easy solution to this problem.
>> 
>> Would
>> 
>> tsc_mode="never"
>> 
>> help?
> 
> Only for frequency matched hosts.
> 
> Hmm, especially for pv guests this should be solvable: after a migration
> the guest kernel could resync the tsc frequency and there wouldn't be
> further tsc emulation needed. This would just require a new hypercall
> for obtaining the current tsc frequency. This hypercall would:
> 
> - switch off tsc emulation for the calling vcpu (if allowed by tsc_mode)
> - return the real tsc frequency (and offset?)
> 
> As the guest kernel is aware of the migration it could issue the new
> hypercall on each vcpu and everyone is happy again.

Why a new hypercall? The guest has all the information available.
Not sure where in pv-ops code this is, but in our XenoLinux forward
port init_cpu_khz() takes care of this also after migration. In fact I
don't really understand why PV is being penalized here at all by
default, and iirc PV migration didn't have any issue with TSC freq
changing during migration prior to the introduction of vTSC.

Jan
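
One reading of "the guest has all the information available" is that a
PV guest can recompute the TSC frequency from the time-scaling
parameters the hypervisor already publishes to it. The sketch below
assumes those are the tsc_to_system_mul and tsc_shift values of the
guest's vcpu_time_info; tsc_khz_from_scale() is a hypothetical name, and
this is not claimed to be what init_cpu_khz() in the XenoLinux port does
line for line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Derive the TSC frequency in kHz from the published scaling parameters,
 * where
 *     ns = ((tsc_delta << tsc_shift) * tsc_to_system_mul) >> 32
 * and a negative tsc_shift means a right shift of tsc_delta.
 * tsc_to_system_mul is assumed to be non-zero.
 */
static uint64_t tsc_khz_from_scale(uint32_t tsc_to_system_mul, int8_t tsc_shift)
{
    uint64_t khz = (1000000ULL << 32) / tsc_to_system_mul;

    if ( tsc_shift < 0 )
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;

    return khz;
}
```

A guest that re-reads these values after resume would see the new
host's frequency without any additional hypercall.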



[Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation

2017-03-15 Thread Razvan Cojocaru
LOCK-prefixed instructions are currently allowed to run in parallel
in x86_emulate(), which can lead the guest into an undefined state.
This patch fixes the issue.

Signed-off-by: Razvan Cojocaru 
Signed-off-by: Andrew Cooper 

---
Changes since V1:
 - Added Andrew Cooper's credit, as he's kept the patch current
   throughout non-trivial code changes since the initial patch.
 - Significantly more patch testing (with XenServer).
 - Restricted lock scope.
 - Logic fixes.
---
 tools/tests/x86_emulator/test_x86_emulator.c | 10 ++
 xen/arch/x86/domain.c|  2 ++
 xen/arch/x86/hvm/emulate.c   | 26 ++
 xen/arch/x86/mm.c|  6 ++
 xen/arch/x86/mm/shadow/common.c  |  2 ++
 xen/arch/x86/traps.c |  2 ++
 xen/arch/x86/x86_emulate/x86_emulate.c   | 14 --
 xen/arch/x86/x86_emulate/x86_emulate.h   |  8 
 xen/include/asm-x86/domain.h |  4 
 xen/include/asm-x86/hvm/emulate.h|  3 +++
 10 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/tools/tests/x86_emulator/test_x86_emulator.c 
b/tools/tests/x86_emulator/test_x86_emulator.c
index 04332bb..86b79a1 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -283,6 +283,14 @@ static int read_msr(
 return X86EMUL_UNHANDLEABLE;
 }
 
+static void smp_lock(bool locked)
+{
+}
+
+static void smp_unlock(bool locked)
+{
+}
+
 static struct x86_emulate_ops emulops = {
 .read   = read,
 .insn_fetch = fetch,
@@ -293,6 +301,8 @@ static struct x86_emulate_ops emulops = {
 .read_cr= emul_test_read_cr,
 .read_msr   = read_msr,
 .get_fpu= emul_test_get_fpu,
+.smp_lock   = smp_lock,
+.smp_unlock = smp_unlock,
 };
 
 int main(int argc, char **argv)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 479aee6..55010f4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -529,6 +529,8 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags,
 if ( config == NULL && !is_idle_domain(d) )
 return -EINVAL;
 
+percpu_rwlock_resource_init(&d->arch.emulate_lock, emulate_locked_rwlock);
+
 d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
 
 INIT_LIST_HEAD(&d->arch.pdev_list);
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f36d7c9..d5bfbf1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+DEFINE_PERCPU_RWLOCK_GLOBAL(emulate_locked_rwlock);
+
 static void hvmtrace_io_assist(const ioreq_t *p)
 {
 unsigned int size, event;
@@ -1682,6 +1684,26 @@ static int hvmemul_vmfunc(
 return rc;
 }
 
+void emulate_smp_lock(bool locked)
+{
+struct domain *d = current->domain;
+
+if ( locked )
+percpu_write_lock(emulate_locked_rwlock, &d->arch.emulate_lock);
+else
+percpu_read_lock(emulate_locked_rwlock, &d->arch.emulate_lock);
+}
+
+void emulate_smp_unlock(bool locked)
+{
+struct domain *d = current->domain;
+
+if ( locked )
+percpu_write_unlock(emulate_locked_rwlock, &d->arch.emulate_lock);
+else
+percpu_read_unlock(emulate_locked_rwlock, &d->arch.emulate_lock);
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
 .read  = hvmemul_read,
 .insn_fetch= hvmemul_insn_fetch,
@@ -1706,6 +1728,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
 .put_fpu   = hvmemul_put_fpu,
 .invlpg= hvmemul_invlpg,
 .vmfunc= hvmemul_vmfunc,
+.smp_lock  = emulate_smp_lock,
+.smp_unlock= emulate_smp_unlock,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1731,6 +1755,8 @@ static const struct x86_emulate_ops 
hvm_emulate_ops_no_write = {
 .put_fpu   = hvmemul_put_fpu,
 .invlpg= hvmemul_invlpg,
 .vmfunc= hvmemul_vmfunc,
+.smp_lock  = emulate_smp_lock,
+.smp_unlock= emulate_smp_unlock,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7bc951d..2fb3325 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5369,6 +5369,8 @@ static const struct x86_emulate_ops ptwr_emulate_ops = {
 .cmpxchg= ptwr_emulated_cmpxchg,
 .validate   = pv_emul_is_mem_write,
 .cpuid  = pv_emul_cpuid,
+.smp_lock   = emulate_smp_lock,
+.smp_unlock = emulate_smp_unlock,
 };
 
 /* Write page fault handler: check if guest is trying to modify a PTE. */
@@ -5485,6 +5487,8 @@ static const struct x86_emulate_ops mmio_ro_emulate_ops = 
{
 .write  = mmio_ro_emulated_write,
 .validate   = pv_emul_is_mem_write,
 .cpuid  = pv_emul_cpuid,
+.smp_lock   = emulate_smp_lock,
+.smp_unlock = emulate_smp_unlock,
 };
 
 int 

Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 16:26,  wrote:
> On 15/03/17 15:23, Olaf Hering wrote:
>> On Wed, Mar 15, Olaf Hering wrote:
>>
>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>> As a crazy idea, does this help?
>>>> tsc->incarnation = 0
>> This does indeed help. One system shows now the results below, which
>> means the performance goes down during migration (to localhost) and goes
>> back to normal after migration.
>>
>> What impact has such change to ->incarnation?
> 
> So what this means is that, after migrate, Xen sees that the advertised
> TSC value doesn't match the current hardware's TSC, so enables rdtsc
> interception so the values reported back can be emulated at the old
> frequency.
> 
> There is no easy solution to this problem.

Especially for localhost migration it ought to be possible to deal
with this. What I'm wondering is why the frequency comparison
in default mode handling in tsc_set_info() is HVM-only:

if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
 (has_hvm_container_domain(d) ?
  (d->arch.tsc_khz == cpu_khz ||
   hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
  incarnation == 0) )
{

Boris, you've made it be this way in 82713ec8d2 ("x86: use native
RDTSC(P) execution when guest and host frequencies are the
same").

Jan
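
To make the asymmetry in the quoted condition easier to see, it can be
restated as a standalone predicate. This is a sketch only: struct
tsc_inputs and native_tsc_allowed() are hypothetical names, and each
field simply stands for the expression named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inputs of the quoted condition, gathered into a hypothetical struct. */
struct tsc_inputs {
    bool     default_mode;   /* tsc_mode == TSC_MODE_DEFAULT */
    bool     host_tsc_safe;  /* host_tsc_is_safe() */
    bool     hvm;            /* has_hvm_container_domain(d) */
    bool     can_scale;      /* hvm_get_tsc_scaling_ratio(...) is non-zero */
    uint32_t guest_khz;      /* d->arch.tsc_khz */
    uint32_t host_khz;       /* cpu_khz */
    uint32_t incarnation;    /* incarnation passed to tsc_set_info() */
};

static bool native_tsc_allowed(const struct tsc_inputs *in)
{
    if ( !in->default_mode || !in->host_tsc_safe )
        return false;

    /* HVM: equal frequencies or available TSC scaling are sufficient. */
    if ( in->hvm )
        return in->guest_khz == in->host_khz || in->can_scale;

    /* PV: frequencies are never compared, only the incarnation count. */
    return in->incarnation == 0;
}
```

For a PV guest migrated to localhost the frequencies match, yet the
predicate is false as soon as incarnation is non-zero, which fits the
slowdown reported in this thread.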



[Xen-devel] [PATCH v5] xen: don't save/restore the physmap on VM save/restore

2017-03-15 Thread Igor Druzhinin
Saving/restoring the physmap to/from xenstore was introduced to
QEMU mainly in order to cover up the VRAM region restore issue.
The sequence of restore operations implies that we should know
the effective guest VRAM address *before* we have the VRAM region
restored (which happens later). Unfortunately, in a Xen environment
VRAM memory does actually belong to a guest - not QEMU itself -
which means the position of this region is unknown beforehand and
can't be mapped into QEMU address space immediately.

Previously, recreating xenstore keys, holding the physmap, by the
toolstack helped to get this information in place at the right
moment ready to be consumed by QEMU to map the region properly.

The extraneous complexity of having those keys transferred by the
toolstack and unnecessary redundancy prompted us to propose a
solution which doesn't require any extra data in xenstore. The idea
is to defer the VRAM region mapping until the point we actually know
the effective address and are able to map it. To that end, we initially
just skip the mapping request for the framebuffer if we are unable to
map it now. Then, after the memory region restore phase, we perform
the mapping again, this time successfully, and update the VRAM region
metadata accordingly.
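
The deferral described above can be pictured with a toy model. The
names here (map_vram_sketch, vga_init_sketch, vga_post_load_sketch,
region_restored) are hypothetical and do not correspond to QEMU
functions; the sketch only shows the pattern of tolerating a NULL
mapping at init time and retrying once the restore phase has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static char fake_vram[16];      /* stand-in for guest VRAM */
static bool region_restored;    /* becomes true after the restore phase */

/* Hypothetical mapper: returns NULL until the region's address is known. */
static void *map_vram_sketch(void)
{
    return region_restored ? fake_vram : NULL;
}

struct vga_sketch {
    void *vram_ptr;
};

/* Init phase: tolerate a NULL mapping instead of treating it as fatal. */
static void vga_init_sketch(struct vga_sketch *s)
{
    s->vram_ptr = map_vram_sketch();
}

/* Post-load phase: retry only if the init-time attempt failed. */
static void vga_post_load_sketch(struct vga_sketch *s)
{
    if ( s->vram_ptr == NULL )
    {
        s->vram_ptr = map_vram_sketch();
        assert(s->vram_ptr != NULL);
    }
}
```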

Signed-off-by: Igor Druzhinin 
---
v5:
* Add an assertion and debug printf

v4:
* Use VGA post_load handler for vram_ptr update

v3:
* Modify qemu_ram_ptr_length similarly with qemu_map_ram_ptr
* Add a comment explaining qemu_map_ram_ptr and qemu_ram_ptr_length
  semantic change for Xen
* Dropped some redundant changes

v2:
* Fix some building and coding style issues
---
 exec.c   |  16 +
 hw/display/vga.c |  11 ++
 xen-hvm.c| 104 ++-
 3 files changed, 46 insertions(+), 85 deletions(-)

diff --git a/exec.c b/exec.c
index aabb035..a1ac8cd 100644
--- a/exec.c
+++ b/exec.c
@@ -2008,6 +2008,14 @@ void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t 
addr)
 }
 
 block->host = xen_map_cache(block->offset, block->max_length, 1);
+if (block->host == NULL) {
+/* In case we cannot establish the mapping right away we might
+ * still be able to do it later e.g. on a later stage of restore.
+ * We don't touch the block and return NULL here to indicate
+ * that intention.
+ */
+return NULL;
+}
 }
 return ramblock_ptr(block, addr);
 }
@@ -2041,6 +2049,14 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, 
ram_addr_t addr,
 }
 
 block->host = xen_map_cache(block->offset, block->max_length, 1);
+if (block->host == NULL) {
+/* In case we cannot establish the mapping right away we might
+ * still be able to do it later e.g. on a later stage of restore.
+ * We don't touch the block and return NULL here to indicate
+ * that intention.
+ */
+return NULL;
+}
 }
 
 return ramblock_ptr(block, addr);
diff --git a/hw/display/vga.c b/hw/display/vga.c
index 69c3e1d..7d85fd8 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -2035,6 +2035,12 @@ static int vga_common_post_load(void *opaque, int 
version_id)
 {
 VGACommonState *s = opaque;
 
+if (xen_enabled() && !s->vram_ptr) {
+/* update VRAM region pointer in case we've failed
+ * the last time during init phase */
+s->vram_ptr = memory_region_get_ram_ptr(&s->vram);
+assert(s->vram_ptr);
+}
 /* force refresh */
 s->graphic_mode = -1;
 vbe_update_vgaregs(s);
@@ -2165,6 +2171,11 @@ void vga_common_init(VGACommonState *s, Object *obj, 
bool global_vmstate)
 vmstate_register_ram(&s->vram, global_vmstate ? NULL : DEVICE(obj));
 xen_register_framebuffer(&s->vram);
 s->vram_ptr = memory_region_get_ram_ptr(&s->vram);
+/* VRAM pointer might still be NULL here if we are restoring on Xen.
+   We try to get it again later at post-load phase. */
+#ifdef DEBUG_VGA_MEM
+printf("vga: vram ptr: %p\n", s->vram_ptr);
+#endif
 s->get_bpp = vga_get_bpp;
 s->get_offsets = vga_get_offsets;
 s->get_resolution = vga_get_resolution;
diff --git a/xen-hvm.c b/xen-hvm.c
index 5043beb..8bedd9b 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -317,7 +317,6 @@ static int xen_add_to_physmap(XenIOState *state,
 XenPhysmap *physmap = NULL;
 hwaddr pfn, start_gpfn;
 hwaddr phys_offset = memory_region_get_ram_addr(mr);
-char path[80], value[17];
 const char *mr_name;
 
 if (get_physmapping(state, start_addr, size)) {
@@ -340,6 +339,22 @@ go_physmap:
 DPRINTF("mapping vram to %"HWADDR_PRIx" - %"HWADDR_PRIx"\n",
 start_addr, start_addr + size);
 
+mr_name = memory_region_name(mr);
+
+physmap = g_malloc(sizeof(XenPhysmap));
+
+physmap->start_addr = start_addr;
+physmap->size = size;
+physmap->name = mr_name;
+

[Xen-devel] [PATCH v2] xen: credit2: remove undefined declaration of __dump_execstate()

2017-03-15 Thread Dario Faggioli
Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Jan Beulich 
---
Changes from v1:
* improved subject line.
---
 xen/common/sched_credit2.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index af457c1..bb1c657 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2437,8 +2437,6 @@ csched2_runtime(const struct scheduler *ops, int cpu,
 return time;
 }
 
-void __dump_execstate(void *unused);
-
 /*
  * Find a candidate.
  */


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Alan Robinson
On Wed, Mar 15, 2017 at 09:53:02AM -0600, Jan Beulich wrote:
> >>> On 15.03.17 at 16:43,  wrote:
...
> > We have seen something similar being caused by the superpage
> > attribute getting dropped when the domU was migrated. The new
> > copy of the domU only has 4K pages instead of 2MB pages eventually
> > this would seem to fit your memory access sysbench results. 
> 
> That would be an explanation for HVM, but here we consider PV only.
> And whether a HVM guest can be re-created using large pages
> mainly depends on how fragmented the destination host memory is.
>

Yes indeed - sorry for the HVM noise..

Alan

-- 
Alan Robinson
Principal Developer, Enterprise Platform Services, Germany

Fujitsu
Mies-van-der-Rohe-Str. 8, 80807 Muenchen, Deutschland
Tel.: +49 (89) 62060 3927
Mob.: +49 (172) 8512843
E-Mail: alan.robin...@ts.fujitsu.com
Web: http://www.fujitsu.com/de/



Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Juergen Gross
On 15/03/17 16:43, Boris Ostrovsky wrote:
> On 03/15/2017 11:26 AM, Andrew Cooper wrote:
>> On 15/03/17 15:23, Olaf Hering wrote:
>>> On Wed, Mar 15, Olaf Hering wrote:
>>>
>>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>>> As a crazy idea, doest this help?
>>>>> tsc->incarnation = 0
>>> This does indeed help. One system shows now the results below, which
>>> means the performance goes down during migration (to localhost) and goes
>>> back to normal after migration.
>>>
>>> What impact has such change to ->incarnation?
>> So what this means is that, after migrate, Xen sees that the advertised
>> TSC value doesn't match the current hardwares TSC, so enables rdtsc
>> interception so the values reported back can be emulated at the old
>> frequency.
>>
>> There is no easy solution to this problem.
> 
> Would
> 
> tsc_mode="never"
> 
> help?

Only for frequency matched hosts.

Hmm, especially for pv guests this should be solvable: after a migration
the guest kernel could resync the tsc frequency and there wouldn't be
further tsc emulation needed. This would just require a new hypercall
for obtaining the current tsc frequency. This hypercall would:

- switch off tsc emulation for the calling vcpu (if allowed by tsc_mode)
- return the real tsc frequency (and offset?)

As the guest kernel is aware of the migration it could issue the new
hypercall on each vcpu and everyone is happy again.
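
The two steps of the proposed hypercall can be sketched as a small model.
This is only an illustration: no such hypercall exists in Xen, and every
name below (tsc_resync_sketch, the enum, the structs) is hypothetical. It
assumes the call is made once per vcpu and that emulation may only be
switched off when the configured tsc_mode does not force it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: none of this is an existing Xen interface. */
enum tsc_mode_sketch {
    TSC_SKETCH_DEFAULT,        /* emulate only when needed, e.g. after migration */
    TSC_SKETCH_ALWAYS_EMULATE  /* configuration insists on emulation */
};

struct vcpu_tsc_sketch {
    bool emulated;             /* is rdtsc currently intercepted for this vcpu? */
};

struct tsc_resync_result {
    uint32_t khz;              /* frequency the guest should use from now on */
    bool emulation_off;        /* true if rdtsc is no longer intercepted */
};

/*
 * Model of the proposed call: switch off emulation for the calling vcpu if
 * the mode allows it, then report the frequency that applies afterwards.
 */
struct tsc_resync_result
tsc_resync_sketch(struct vcpu_tsc_sketch *v, enum tsc_mode_sketch mode,
                  uint32_t host_khz, uint32_t emulated_khz)
{
    struct tsc_resync_result res;

    assert(v != NULL);

    if (mode != TSC_SKETCH_ALWAYS_EMULATE)
        v->emulated = false;

    res.emulation_off = !v->emulated;
    /* A vcpu that stays emulated keeps seeing the old frequency. */
    res.khz = v->emulated ? emulated_khz : host_khz;
    return res;
}
```

A guest kernel would then call this on each vcpu after resume and recompute
its clock scaling from the returned frequency.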

Thoughts?


Juergen




Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 16:43,  wrote:
> On Wed, Mar 15, 2017 at 12:20:44PM +0100, Olaf Hering wrote:
>> After reports about degraded performance after a PV domU was migrated
>> from one dom0 to another it turned out that this issue happens with
>> every version of Xen and every version of domU kernel.
>> 
>> The used benchmark is 'sysbench memory'. I hacked it up to show how long
>> the actual work takes, and that loop takes longer to execute after the
>> domU is migrated. In my testing the loop (memory_execute_event) takes
>> about 1200ns, after migration it takes about 1500ns. It just writes 0 to
>> an array of memory. In total sysbench reports 6500 MiB/sec, after
>> migration its just 3350 MiB/sec.
> 
> We have seen something similar being caused by the superpage
> attribute getting dropped when the domU was migrated. The new
> copy of the domU only has 4K pages instead of 2MB pages eventually
> this would seem to fit your memory access sysbench results. 

That would be an explanation for HVM, but here we consider PV only.
And whether a HVM guest can be re-created using large pages
mainly depends on how fragmented the destination host memory is.

Jan




Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Andrew Cooper
On 15/03/17 15:43, Alan Robinson wrote:
> Hi Olaf,
>
> On Wed, Mar 15, 2017 at 12:20:44PM +0100, Olaf Hering wrote:
>> After reports about degraded performance after a PV domU was migrated
>> from one dom0 to another it turned out that this issue happens with
>> every version of Xen and every version of domU kernel.
>>
>> The used benchmark is 'sysbench memory'. I hacked it up to show how long
>> the actual work takes, and that loop takes longer to execute after the
>> domU is migrated. In my testing the loop (memory_execute_event) takes
>> about 1200ns, after migration it takes about 1500ns. It just writes 0 to
>> an array of memory. In total sysbench reports 6500 MiB/sec, after
>> migration its just 3350 MiB/sec.
> We have seen something similar being caused by the superpage
> attribute getting dropped when the domU was migrated. The new
> copy of the domU only has 4K pages instead of 2MB pages eventually
> this would seem to fit your memory access sysbench results.

PV guests with superpages can't be migrated at all.  There is no way to
guarantee that there is an available superpage on the destination, and
the absence of one can't be handled on behalf of the guest.

Before my migration v2 work, this was pot luck and you might resume with
the 2M mapping simply missing.  After my migration v2 work, this
situation is detected and causes an abort on the send side.

~Andrew



Re: [Xen-devel] [PATCH v2 2/3] x86emul: correct handling of FPU insns faulting on memory write

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 14:48,  wrote:
> On 03/15/2017 09:31 AM, Jan Beulich wrote:
> On 15.03.17 at 14:24,  wrote:
>>> On 03/15/2017 06:28 AM, Jan Beulich wrote:
>>>> @@ -3716,9 +3720,9 @@ x86_emulate(
>>>>  break;
>>>>  
>>>>  case 0x9b:  /* wait/fwait */
>>>> -fic.insn_bytes = 1;
>>>>  host_and_vcpu_must_have(fpu);
>>>>  get_fpu(X86EMUL_FPU_wait, &fic);
>>>> +fic.insn_bytes = 1;
>>>>  asm volatile ( "fwait" ::: "memory" );
>>>>  check_fpu_exn();
>>>>  break;
>>> Why is this needed?
>> This isn't strictly needed, but desirable, due to the conditional being
>> added in
>>
>> @@ -7916,7 +7920,7 @@ x86_emulate(
>>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>>  
>>   done:
>> -put_fpu(&fic, ctxt, ops);
>> +put_fpu(&fic, fic.insn_bytes > 0 && dst.type == OP_MEM, ctxt, ops);
>>  put_stub(stub);
>>  return rc;
>>  #undef state
>>
>> (both host_and_vcpu_must_have() and get_fpu() may end up
>> branching to "done"). Everywhere else the field is already being
>> set after such basic checks.
> 
> Ah, OK.
> 
> But fic is a local variable that is not initialized (is it?) so
> insn_bytes may be non-zero anyway?

We have this at the top of x86_emulate():

struct fpu_insn_ctxt fic = { .type = X86EMUL_FPU_none, .exn_raised = -1 };

(introduced by patch 1).
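
For reference, a minimal standalone sketch of why such an initializer is
enough: in C, members not named in a designated initializer are initialized
to zero. The struct and function below are hypothetical, reduced stand-ins
and not the emulator's real definitions; the early exit models a check that
branches to "done" before insn_bytes has been set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the emulator's FPU context. */
enum fpu_type_sketch { FPU_SKETCH_NONE = 7, FPU_SKETCH_WAIT };

struct fpu_ctxt_sketch {
    unsigned char insn_bytes;   /* deliberately not named in the initializer */
    enum fpu_type_sketch type;
    int exn_raised;
};

/* Returns the value of insn_bytes that code at a "done" label would see. */
unsigned int insn_bytes_at_done(bool early_exit)
{
    /* Members without an explicit initializer start out as zero. */
    struct fpu_ctxt_sketch fic = { .type = FPU_SKETCH_NONE, .exn_raised = -1 };

    if (early_exit)
        goto done;          /* models a failed check branching to "done" */

    fic.insn_bytes = 1;     /* set only after the checks have passed */

 done:
    assert(fic.exn_raised == -1);
    return fic.insn_bytes;
}
```

With the assignment placed after the checks, an early exit leaves insn_bytes
at zero, so a condition of the form "insn_bytes > 0" stays false on that path.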

Jan




Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Alan Robinson
Hi Olaf,

On Wed, Mar 15, 2017 at 12:20:44PM +0100, Olaf Hering wrote:
> After reports about degraded performance after a PV domU was migrated
> from one dom0 to another it turned out that this issue happens with
> every version of Xen and every version of domU kernel.
> 
> The used benchmark is 'sysbench memory'. I hacked it up to show how long
> the actual work takes, and that loop takes longer to execute after the
> domU is migrated. In my testing the loop (memory_execute_event) takes
> about 1200ns, after migration it takes about 1500ns. It just writes 0 to
> an array of memory. In total sysbench reports 6500 MiB/sec, after
> migration its just 3350 MiB/sec.

We have seen something similar being caused by the superpage
attribute getting dropped when the domU was migrated. The new
copy of the domU only has 4K pages instead of 2MB pages; eventually
this would seem to fit your memory access sysbench results.

Alan




Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Boris Ostrovsky
On 03/15/2017 11:26 AM, Andrew Cooper wrote:
> On 15/03/17 15:23, Olaf Hering wrote:
>> On Wed, Mar 15, Olaf Hering wrote:
>>
>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>> As a crazy idea, doest this help?
>>>> tsc->incarnation = 0
>> This does indeed help. One system shows now the results below, which
>> means the performance goes down during migration (to localhost) and goes
>> back to normal after migration.
>>
>> What impact has such change to ->incarnation?
> So what this means is that, after migrate, Xen sees that the advertised
> TSC value doesn't match the current hardwares TSC, so enables rdtsc
> interception so the values reported back can be emulated at the old
> frequency.
>
> There is no easy solution to this problem.

Would

tsc_mode="never"

help?

-boris



Re: [Xen-devel] Future x86 emulator direction

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 14:08,  wrote:
> With that said, should I submit a new version of the original LOCK patch
> to have in the meantime (until the fix suggested by Andrew is
> implemented, and presumably to be reverted once it lands), or is it not
> worth xen-devel's extra time?

I think it would be worthwhile, but there's no promise that it'll be
taken.

Jan




Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Juergen Gross
On 15/03/17 16:26, Andrew Cooper wrote:
> On 15/03/17 15:23, Olaf Hering wrote:
>> On Wed, Mar 15, Olaf Hering wrote:
>>
>>> On Wed, Mar 15, Andrew Cooper wrote:
>>>> As a crazy idea, doest this help?
>>>> tsc->incarnation = 0
>> This does indeed help. One system shows now the results below, which
>> means the performance goes down during migration (to localhost) and goes
>> back to normal after migration.
>>
>> What impact has such change to ->incarnation?
> 
> So what this means is that, after migrate, Xen sees that the advertised
> TSC value doesn't match the current hardwares TSC, so enables rdtsc
> interception so the values reported back can be emulated at the old
> frequency.

Why does Xen detect a different frequency after a localhost migration?


Juergen



Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Andrew Cooper
On 15/03/17 15:23, Olaf Hering wrote:
> On Wed, Mar 15, Olaf Hering wrote:
>
>> On Wed, Mar 15, Andrew Cooper wrote:
>>> As a crazy idea, doest this help?
>>> tsc->incarnation = 0
> This does indeed help. One system shows now the results below, which
> means the performance goes down during migration (to localhost) and goes
> back to normal after migration.
>
> What impact has such change to ->incarnation?

So what this means is that, after migration, Xen sees that the advertised
TSC value doesn't match the current hardware's TSC, so it enables rdtsc
interception so that the values reported back can be emulated at the old
frequency.
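
As a rough illustration of what "emulated at the old frequency" means
arithmetically, the sketch below rescales a cycle count taken at the new
host's rate into the rate the guest was originally told about. It is not
the arithmetic Xen itself uses; the function name and the choice of kHz
units are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a cycle delta counted at host_khz into the equivalent delta at
 * guest_khz. Splitting into quotient and remainder keeps the intermediate
 * product of the remainder term below 2^64 for any 32-bit frequencies; the
 * quotient term can still overflow for extremely large deltas, which this
 * sketch does not handle.
 */
uint64_t scale_tsc_delta(uint64_t host_delta, uint32_t host_khz,
                         uint32_t guest_khz)
{
    uint64_t whole, rest;

    assert(host_khz != 0);

    whole = host_delta / host_khz;
    rest = host_delta % host_khz;

    return whole * guest_khz + rest * guest_khz / host_khz;
}
```

For example, 1500000 cycles counted on a 3000000 kHz host correspond to
1200000 cycles for a guest that still believes it runs at 2400000 kHz. Every
rdtsc has to trap for this conversion to be applied, which is where the
slowdown comes from.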

There is no easy solution to this problem.

~Andrew



Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Olaf Hering
On Wed, Mar 15, Olaf Hering wrote:

> On Wed, Mar 15, Andrew Cooper wrote:
> > As a crazy idea, doest this help?
> > tsc->incarnation = 0

This does indeed help. One system now shows the results below, which
means the performance goes down during migration (to localhost) and goes
back to normal after migration.

What impact does such a change to ->incarnation have?

Olaf

./sysbench --memory-block-size=8k --memory-total-size=1024T --verbosity=5 --threads=4 --report-interval=3 --time=1234 memory run
...
[ 3s ]  6322.61 MiB/sec 2421800: 96212(006fb 0088a 00c7e6) 96387(0074f 0083d 
00506f) 926b7(006d4 0088f 00e5c9) 92706(00488 008c4 0458a6) 
[ 6s ]  6319.72 MiB/sec 2418340: 95dab(004a9 00864 01e42f) 961cc(006dc 0087a 
00e824) 921b3(0049b 00898 00cbd5) 924e5(00705 007ea 00f51e) 
[ 9s ]  6319.37 MiB/sec 2418534: 95f1d(00748 009ab 00e5e4) 9607f(006aa 008ef 
00b358) 92237(00497 008a6 00b332) 9253a(0071a 008a8 00e605) 
[ 12s ]  6320.20 MiB/sec 2419101: 95c70(006fb 008cb 08cbb8) 9611b(00722 00855 
00dfc1) 92383(006ce 008d6 00fbc6) 92745(006de 0073c 00fbde) 
[ 15s ]  6320.12 MiB/sec 2419118: 95fc3(00779 00867 00e36b) 96199(00703 0082f 
00e356) 9229e(004ab 0089b 1b5c72) 9242f(00714 0076b 04807b) 
[ 18s ]  6320.38 MiB/sec 2419150: 95ede(00717 0086b 008fbf) 9618b(006de 00855 
00c119) 922e8(0049c 00846 1b1f1a) 92548(00725 007bd 0052f5) 
[ 21s ]  6321.28 MiB/sec 2419539: 95f23(006fc 008bc 00cce3) 9624c(00701 008ae 
00bec0) 921ee(00495 008b7 1b6358) 92686(0071e 008c6 00bf2e) 
[ 24s ]  6319.37 MiB/sec 2418534: 96061(006e2 00904 00c7c8) 96131(006de 00812 
00de5d) 9213e(006d9 00872 1b29a0) 9243e(00712 007cd 00df76) 
[ 27s ]  6320.15 MiB/sec 2418970: 95e0c(006e1 00869 00df81) 9621a(0056f 0088e 
00dfc1) 921f4(004ad 008ae 1b309c) 9261e(006e6 00895 00ee5d) 
[ 30s ]  6322.09 MiB/sec 2419663: 96014(0045f 0084a 00ea91) 96142(006fb 00868 
00eb32) 922f1(006c8 008cc 1b2ce4) 926db(006f8 007b9 00e6f7) 
[ 33s ]  6321.72 MiB/sec 2419470: 95f3a(00506 008ed 00c75c) 96223(006b2 008b5 
00b685) 92412(00497 00909 1b1e69) 92527(00727 0084e 00c72c) 
[ 36s ]  6321.09 MiB/sec 2419262: 95ffa(006f5 00878 00e7fc) 960e5(006de 00868 
00e87d) 921ea(00763 008eb 1b28e2) 926da(00461 007ca 00ed17) 
[ 39s ]  6321.06 MiB/sec 2419519: 95df8(006e8 0095f 00fee4) 96175(006da 00847 
00ea77) 92334(004a5 00915 00fecb) 926f5(006ac 008c8 0050e0) 
[ 42s ]  6319.72 MiB/sec 2418867: 95f05(004c7 00885 00b39d) 9620a(006fd 0086b 
00b70c) 9208b(0048f 0089f 1b24c4) 925fa(006f2 007eb 00b6d6) 
[ 45s ]  6323.08 MiB/sec 2420218: 95fbe(0075a 00864 00eede) 96298(00749 0086b 
004397) 923c9(0049d 008f7 1b1c50) 9267f(0070f 00753 04003e) 
[ 48s ]  6317.78 MiB/sec 2417897: 95dde(0076c 00862 00e858) 9612c(006bd 008a7 
00e859) 92031(0048d 00836 00e916) 92570(006cc 0078b 00e8d7) 
[ 51s ]  6321.18 MiB/sec 2419442: 95f84(0070d 00829 00e37a) 9616a(006bd 0081f 
00ea08) 9224b(006e4 008ac 00e9d6) 9268d(00724 007bc 00e461) 
[ 54s ]  6321.25 MiB/sec 2419366: 95ed3(004b1 0089e 004103) 961c7(0055e 008aa 
00b832) 922c2(0048f 00857 1b1880) 92682(00706 0081f 00deeb) 
[ 57s ]  6322.12 MiB/sec 2419798: 95f73(00766 00825 00e8e5) 9620a(006d5 008a1 
00e841) 92359(00497 00890 1b16b2) 9265a(006dc 0085d 00e8d7) 
[ 60s ]  6320.37 MiB/sec 2419005: 95fc0(006d4 0082b 002927) 96180(006df 0088a 
005721) 92224(0052b 00960 00be51) 92528(00706 008da 00bd32) 
[ 63s ]  6318.10 MiB/sec 2418347: 95e00(006ec 008a0 00c124) 96307(006c7 00801 
0043ac) 91feb(00709 00851 00c226) 92433(006e6 00802 00bbbf) 
[ 66s ]  6320.55 MiB/sec 2419144: 95ed1(006f8 00825 00c152) 9636e(00774 00832 
00bb27) 92107(004a3 008d7 00c21b) 9258d(00714 007e1 006673) 
[ 69s ]  6319.12 MiB/sec 2418691: 95f9a(006ea 00864 00c31c) 962ba(006f8 00891 
00b9d4) 92022(006a5 008ed 00c3a0) 9244b(00680 00789 00bc68) 
[ 72s ]  1522.05 MiB/sec 584258: 23c48(0077e 0172d 0c9a40) 23c92(0068d 04250 
0c9b64) 2391e(004a4 01561 11369a) 2398f(00733 01412 0c6867) 
[ 75s ]  1284.93 MiB/sec 493314: 1e1db(00d70 0105a 00c227) 1e1e6(00db9 0130f 
00db37) 1e18e(00df0 01677 00f62f) 1e1b3(00da8 00fd6 1a1b4f) 
[ 78s ]  1301.39 MiB/sec 499734: 1e79f(00cd3 01034 420524) 1e867(00cbc 01071 
00871e) 1e7ec(0097f 01732 00f5fc) 1e825(00d10 012bf 00cf0f) 
[ 81s ]  1300.86 MiB/sec 499525: 1e7ea(00d6a 0121b 01a401) 1e7e7(00c65 0108d 
00f67b) 1e799(009a2 0186d 00adcc) 1e7dc(00e20 01157 015b74) 
[ 84s ]  1290.08 MiB/sec 495391: 1e3d8(00e13 01133 007bb1) 1e3e8(00d0a 01395 
00d77c) 1e399(00828 0187c 00ee9a) 1e3c7(00d15 0115c 00ae06) 
[ 87s ]  1297.33 MiB/sec 498175: 1e6ae(00d13 010e4 00f34e) 1e686(00be0 00ebe 
00c62f) 1e64b(00cbe 015ee 01118b) 1e681(00d96 0118d 01ea53) 
[ 90s ]  4939.91 MiB/sec 1890347: 750c7(00635 00844 3898ce) 751ee(00657 008b1 
300e9fe) 724c2(0066d 009a8 2f3c63) 72a6a(0065c 00785 2dc2723) 
[ 93s ]  6232.37 MiB/sec 2385348: 936ff(0045f 0087b 2d56e3) 946ff(0067d 00889 
21db77) 90e65(00486 008ce 1a6b04) 8f82a(006cb 008af 298f81) 
[ 96s ]  6320.31 MiB/sec 2419131: 95fda(006e3 0086c 00c323) 96229(006c8 00874 
01132a) 920b0(00771 00890 1b88ae) 925b6(00723 0078f 00c2af) 
[ 99s ]  6320.29 MiB/sec 

Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Venu Busireddy
On Wed, Mar 15, 2017 at 12:56:50PM +, Roger Pau Monné wrote:
> On Wed, Mar 15, 2017 at 08:42:04AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Mar 15, 2017 at 12:07:28PM +, Roger Pau Monné wrote:
> > > On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > On Thu, Mar 09, 2017 at 01:26:45PM +, Julien Grall wrote:
> > > > > > > Hi Konrad,
> > > > > > > 
> > > > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné wrote:
> > > > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +, Julien Grall wrote:
> > > > > > > > > > .. this as for SR-IOV devices you need the drivers to kick the hardware
> > > > > > > > > > to generate the new bus addresses. And those (along with the BAR regions) are
> > > > > > > > > > not visible in ACPI (they are constructed dynamically).
> > > > > > > > > 
> > > > > > > > > There's already code in Xen [0] to find out the size of the BARs of SR-IOV
> > > > > > > > > devices, but I'm not sure what's the intended usage of that, does it need to
> > > > > > > > > happen _after_ the driver in Dom0 has done whatever magic for this to work?
> > > > > > > > 
> > > > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add hypercall when
> > > > > > > > the device driver in dom0 has finished "creating" the VF. See drivers/xen/pci.c
> > > > > > > 
> > > > > > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall for ARM and do
> > > > > > > the PCI scanning in Xen.
> > > > > > > 
> > > > > > > If I understand correctly what you said, only the PCI driver will be able to
> > > > > > > kick SR-IOV device and Xen would not be able to detect the device until it
> > > > > > > has been fully configured. So it would mean that we have to keep
> > > > > > > PHYSDEVOP_pci_device_add around to know when Xen can use the device.
> > > > > > > 
> > > > > > > Am I correct?
> > > > > > 
> > > > > > Yes. Unless the PCI drivers come up with some other way to tell the
> > > > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > > > 
> > > > > > Or the underlying bus on ARM can send some 'new device' information?
> > > > > 
> > > > > Hm, is this something standard between all the SR-IOV implementations, or each
> > > > > vendors have their own sauce?
> > > > 
> > > > Gosh, all of them have their own sauce. The only thing that is the same
> > > > is that suddenly behind the PF device there are PCI devices that are responding
> > > > to 0xcfc requests. Magic!
> > > 
> > > I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to wait for the
> > > device driver in Dom0 in order to get the information of the VF devices, what
> > > Xen cares about is the position of the BARs (so that they can be mapped into
> > > Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap accesses to
> > > it.
> > > 
> > > AFAICT both of this can be obtained without any driver-specific code, since
> > > it's all contained in the PCI SR-IOV spec (but maybe I'm missing something).
> > 
> > CC-ing Venu,
> > 
> > Roger, could you point out which of the chapters has this?
> 
> This would be chapter 2 ("Initialization and Resource Allocation"), and then
> there's a "IMPLEMENTATION NOTE" that shows how the PF/VF are matched to
> function numbers in page 45 (I have the following copy, which is the latest
> revision: "Single Root I/O Virtualization and Sharing Specification Revision
> 1.1" from January 20 2010 [0]).
> 
> The document is quite complex, but it is a standard that all SR-IOV devices
> should follow so AFAICT Xen should be able to get all the information that it
> needs from the PCI config space in order to detect the PF/VF BARs and the BDF
> device addresses.
> 
> Roger.
> 
> [0] https://members.pcisig.com/wg/PCI-SIG/document/download/8238
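
To make the PF/VF-to-function-number matching discussed above concrete, here
is a small sketch of the routing ID arithmetic, assuming the usual reading
of the SR-IOV capability: VF n (counting from 1) sits at the PF's routing ID
plus First VF Offset plus (n - 1) times VF Stride. The function and struct
names are made up for the example, and the offset and stride values in the
usage note are invented. As far as I understand the spec, First VF Offset
and VF Stride are only meaningful once NumVFs has been programmed, so
something still has to write to the PF before these values can be trusted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: bus/device/function split of a routing ID. */
struct bdf_sketch {
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
};

/*
 * Routing ID of VF number vf_number (1-based), derived from the PF routing
 * ID and the First VF Offset / VF Stride fields of the SR-IOV capability.
 */
uint16_t vf_routing_id(uint16_t pf_rid, uint16_t first_vf_offset,
                       uint16_t vf_stride, uint16_t vf_number)
{
    assert(vf_number >= 1);

    return (uint16_t)(pf_rid + first_vf_offset + (vf_number - 1) * vf_stride);
}

/* Split a 16-bit routing ID into bus (8 bits), device (5), function (3). */
struct bdf_sketch rid_to_bdf(uint16_t rid)
{
    struct bdf_sketch r;

    r.bus = (uint8_t)(rid >> 8);
    r.dev = (uint8_t)((rid >> 3) & 0x1f);
    r.fn = (uint8_t)(rid & 0x7);
    return r;
}
```

With a PF at 03:00.0 (routing ID 0x0300), an offset of 0x80 and a stride of
2, the first VF lands at 03:10.0 and the second at 03:10.2.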

I do not have access to this document, so I have to rely on the Rev 1.0
document, but I don't think this aspect of the spec changed much.

In any case, I am afraid I am not seeing the overall picture, but I
would like to comment on the last part of this discussion. Indeed, the
configuration space (including the SR-IOV extended capability) contains
all the information, but only the information necessary for the OS to
"enumerate" the device (PF as well as VFs). The bus and device number
(SBDF) assignment, and programming of the BARs, are all done during that
enumeration. In this discussion, 

[Xen-devel] [libvirt test] 106678: tolerable FAIL - PUSHED

2017-03-15 Thread osstest service owner
flight 106678 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106678/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check    fail  like 106650
 test-armhf-armhf-libvirt     13 saverestore-support-check    fail  like 106650
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check    fail  like 106650

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm    1 build-check(1)           blocked  n/a
 build-arm64-libvirt             1 build-check(1)           blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)           blocked  n/a
 test-arm64-arm64-libvirt        1 build-check(1)           blocked  n/a
 build-arm64-pvops               5 kernel-build             fail     never pass
 test-amd64-i386-libvirt        12 migrate-support-check    fail     never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check    fail     never pass
 test-amd64-amd64-libvirt       12 migrate-support-check    fail     never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check    fail     never pass
 build-arm64-xsm                 5 xen-build                fail     never pass
 build-arm64                     5 xen-build                fail     never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check    fail     never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check    fail     never pass
 test-armhf-armhf-libvirt       12 migrate-support-check    fail     never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check    fail     never pass

version targeted for testing:
 libvirt  32b675456e75de3487b7f14faaa2f40f0d59e570
baseline version:
 libvirt  065564c8401d7db24057e7eaabfba7037aee4d96

Last test of basis   106650  2017-03-14 04:20:21 Z    1 days
Testing same since   106678  2017-03-15 04:23:17 Z    0 days    1 attempts


People who touched revisions under test:
  Alexander Vasilenko 
  Roman Bogorodskiy 

jobs:
 build-amd64-xsm                                      pass
 build-arm64-xsm                                      fail
 build-armhf-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-arm64                                          fail
 build-armhf                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-arm64-libvirt                                  blocked
 build-armhf-libvirt                                  pass
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-arm64-pvops                                    fail
 build-armhf-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-amd64-libvirt-xsm                         pass
 test-arm64-arm64-libvirt-xsm                         blocked
 test-armhf-armhf-libvirt-xsm                         pass
 test-amd64-i386-libvirt-xsm                          pass
 test-amd64-amd64-libvirt                             pass
 test-arm64-arm64-libvirt                             blocked
 test-armhf-armhf-libvirt                             pass
 test-amd64-i386-libvirt                              pass
 test-amd64-amd64-libvirt-pair                        pass
 test-amd64-i386-libvirt-pair                         pass
 test-arm64-arm64-libvirt-qcow2                       blocked
 test-armhf-armhf-libvirt-raw                         pass
 test-amd64-amd64-libvirt-vhd                         pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master

Re: [Xen-devel] [PATCH v16 4/9] x86: add multiboot2 protocol support for EFI platforms

2017-03-15 Thread Daniel Kiper
On Wed, Mar 15, 2017 at 09:42:53AM -0500, Doug Goldstein wrote:
> On 3/15/17 9:38 AM, Daniel Kiper wrote:
> > On Wed, Mar 15, 2017 at 09:27:27AM -0500, Doug Goldstein wrote:
> >> On 3/15/17 6:35 AM, Daniel Kiper wrote:
> >>> On Thu, Mar 09, 2017 at 02:02:49PM -0600, Doug Goldstein wrote:
> >>>
> >>> [...]
> >>>
> >>>> Still missing 'xl info'.
> >>>
> >>> Got Intel NUC5i3MYHE (internally it is NUC5i3MYBE board) into my hands.
> >>> I have put 8 GiB RAM and 500 GB SATA 3 into it. Updated BIOS/EFI to 0041
> >>> version (it is the latest one). Installed latest Debian testing (Debian
> >>> GNU/Linux 9 (stretch)), built GRUB2 and Xen, with and without relocation
> >>> patches, on it. Everything works (I left machine working last night).
> >>> Guest boots without any issue. Please take look at attached logs.
> >>>
> >>> Doug, could you tell me how exactly did you test your machine? I need OS
> >>> type, version, C version (GCC, clang, anything else), bintuils version,
> >>> etc. "xl dmesg", "xl info" and "dmesg" full outputs are welcome too.
> >>>
> >>> Daniel
> >>
> >> I thought I already responded to Konrad saying that latest staging +
> >> relocation patches also comes up.
> >
> > I do not remember it. Maybe I have missed that.
> >
> >> My guess is that it is related to the IOMMU "fix" that Andrew and Jan
> >> did by #if 0'ing out some of ebmalloc. But I'm not sure. I haven't had
> >
> > I reenabled free_ebmalloc_unused_mem() during QEMU tests last week.
> > It has not changed anything. I will do the same on my NUC.
> >
> >> time to look at any of this stuff lately. I went to ELC and then
> >> vacation and then managed to hurt myself on vacation so I've been away
> >> from my computer a bit.
> >>
> >> All my branches are available in https://github.com/cardoe/xen and I've
> >> been on Ubuntu 16.04.
> >
> > I will try this too. Thanks for update.
> >
> > Daniel
>
> Where's the branch you are using on that NUC? Because you can't be using
> plain staging on there because the firmware version you reported dead
> locks when the EFI GetTime() is called. Its a known issue by Intel. So
> you must be patching that out of your Xen?

I rebased the relocation patches (v16) on top of commit
9dc1e0cd81ee469d638d1962a92d9b4bd2972bfa
(x86/pagewalk: Consistently use guest_walk_*() helpers for translation).
No more, no less. Everything works. Am I missing something?

Daniel



[Xen-devel] [linux-4.1 bisection] complete test-amd64-amd64-xl-pvh-intel

2017-03-15 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-amd64-xl-pvh-intel
testid guest-start

Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  xen git://xenbits.xen.org/xen.git
  Bug introduced:  9dc1e0cd81ee469d638d1962a92d9b4bd2972bfa
  Bug not present: bab2bd8e222de9e596699ac080ea985af828c4c4
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106695/


  (Revision log too long, omitted.)


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-4.1/test-amd64-amd64-xl-pvh-intel.guest-start.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-4.1/test-amd64-amd64-xl-pvh-intel.guest-start --summary-out=tmp/106695.bisection-summary --basis-template=104301 --blessings=real,real-bisect linux-4.1 test-amd64-amd64-xl-pvh-intel guest-start
Searching for failure / basis pass:
 106669 fail [host=elbling0] / 104301 [host=godello0] 104272 [host=godello0] 
103995 [host=elbling1] 103991 [host=elbling1] 103988 [host=elbling1] 103978 
[host=elbling1] 101737 [host=godello1] 101715 [host=godello1] 101687 
[host=godello1] 101672 [host=godello1] 101659 [host=godello1] 101649 
[host=godello1] 101401 [host=godello1] 101004 [host=godello1] 101001 
[host=godello1] 100753 [host=godello1] 100594 [host=godello1] 100587 
[host=godello1] 100383 [host=godello1] 100371 [host=godello1] 99879 
[host=godello1] 99873 [host=godello1] 99847 [host=godello1] 96211 
[host=elbling1] 96183 [host=elbling1] 96160 [host=elbling1] 95848 
[host=elbling1] 95818 [host=elbling1] 95591 [host=elbling1] 95517 
[host=elbling1] 95455 [host=elbling1] 95408 [host=elbling1] 94729 ok.
Failure / basis pass flights: 106669 / 94729
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Latest d9e0350d2575a20ee7783427da9bd6b6107eb983 c530a75c1e6a472b0eb9558310b518f0dfcd8860 8b4834ee1202852ed83a9fc61268c65fb6961ea7 57e8fbb2f702001a18bd81e9fe31b26d94247ac9 9dc1e0cd81ee469d638d1962a92d9b4bd2972bfa
Basis pass e429f243df2823451c92227317e5fce5f310b674 c530a75c1e6a472b0eb9558310b518f0dfcd8860 df553c056104e3dd8a2bd2e72539a57c4c085bae 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
Generating revisions with ./adhoc-revtuple-generator  
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#e429f243df2823451c92227317e5fce5f310b674-d9e0350d2575a20ee7783427da9bd6b6107eb983
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/qemu-xen-traditional.git#df553c056104e3dd8a2bd2e72539a57c4c085bae-8b4834ee1202852ed83a9fc61268c65fb6961ea7
 
git://xenbits.xen.org/qemu-xen.git#62b3d206425c245ed0a020390a64640d40d97471-57e8fbb2f702001a18bd81e9fe31b26d94247ac9
 
git://xenbits.xen.org/xen.git#bab2bd8e222de9e596699ac080ea985af828c4c4-9dc1e0cd81ee469d638d1962a92d9b4bd2972bfa
adhoc-revtuple-generator: tree discontiguous: linux-stable
From git://cache:9419/git://xenbits.xen.org/qemu-xen
   57e8fbb..acde9f3  staging-> origin/staging
adhoc-revtuple-generator: tree discontiguous: qemu-xen
From git://cache:9419/git://xenbits.xen.org/xen
   15e90cd..bfd9a20  smoke  -> origin/smoke
adhoc-revtuple-generator: tree discontiguous: xen
Loaded 1007 nodes in revision graph
Searching for test results:
 94729 pass e429f243df2823451c92227317e5fce5f310b674 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
df553c056104e3dd8a2bd2e72539a57c4c085bae 
62b3d206425c245ed0a020390a64640d40d97471 
bab2bd8e222de9e596699ac080ea985af828c4c4
 95408 [host=elbling1]
 95455 [host=elbling1]
 95517 [host=elbling1]
 95591 [host=elbling1]
 95848 [host=elbling1]
 95818 [host=elbling1]
 96211 [host=elbling1]
 96160 [host=elbling1]
 96183 [host=elbling1]
 97279 [host=godello1]
 97434 [host=godello1]
 97394 [host=godello1]
 97496 [host=godello1]
 97558 [host=godello1]
 97613 [host=godello1]
 97644 [host=godello1]
 97692 [host=godello1]
 97730 [host=godello1]
 99604 []
 99664 [host=godello1]
 99701 [host=godello1]
 99714 [host=godello1]
 99741 [host=godello1]
 99801 [host=godello1]
 99847 [host=godello1]
 99873 [host=godello1]
 99879 [host=godello1]
 100371 [host=godello1]
 100383 [host=godello1]
 100587 [host=godello1]
 100594 [host=godello1]

[Xen-devel] [xen-unstable-smoke test] 106693: tolerable trouble: broken/fail/pass - PUSHED

2017-03-15 Thread osstest service owner
flight 106693 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106693/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)            blocked  n/a
 build-arm64               5 xen-build                 fail     never pass
 build-arm64-pvops         5 kernel-build              fail     never pass
 test-amd64-amd64-libvirt 12 migrate-support-check     fail     never pass
 test-armhf-armhf-xl      12 migrate-support-check     fail     never pass
 test-armhf-armhf-xl      13 saverestore-support-check fail     never pass

version targeted for testing:
 xen  bfd9a2095f1882e8c074b2d911bcb07d12cf6cf5
baseline version:
 xen  15e90cd01f68ff8b23b426e7f91155b81d73db13

Last test of basis   106690  2017-03-15 10:22:16 Z    0 days
Testing same since   106693  2017-03-15 13:00:59 Z    0 days    1 attempts


People who touched revisions under test:
  Juergen Gross 
  Olaf Hering 
  Roger Pau Monné 
  Tim Deegan 
  Wei Liu 
  Zhang Chen 

jobs:
 build-amd64                                  pass
 build-arm64                                  fail
 build-armhf                                  pass
 build-amd64-libvirt                          pass
 build-arm64-pvops                            fail
 test-armhf-armhf-xl                          pass
 test-arm64-arm64-xl-xsm                      broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386     pass
 test-amd64-amd64-libvirt                     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=bfd9a2095f1882e8c074b2d911bcb07d12cf6cf5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
bfd9a2095f1882e8c074b2d911bcb07d12cf6cf5
+ branch=xen-unstable-smoke
+ revision=bfd9a2095f1882e8c074b2d911bcb07d12cf6cf5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.8-testing
+ '[' xbfd9a2095f1882e8c074b2d911bcb07d12cf6cf5 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : 

Re: [Xen-devel] PV performance degraded after live migration

2017-03-15 Thread Olaf Hering
On Wed, Mar 15, Andrew Cooper wrote:

> As a crazy idea, does this help?

> tsc->incarnation = 0

Had to move to another testhost and this seems to help. Will do more testing
once the original testsystems are accessible again. Thanks!

Olaf


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v16 4/9] x86: add multiboot2 protocol support for EFI platforms

2017-03-15 Thread Doug Goldstein
On 3/15/17 9:38 AM, Daniel Kiper wrote:
> On Wed, Mar 15, 2017 at 09:27:27AM -0500, Doug Goldstein wrote:
>> On 3/15/17 6:35 AM, Daniel Kiper wrote:
>>> On Thu, Mar 09, 2017 at 02:02:49PM -0600, Doug Goldstein wrote:
>>>
>>> [...]
>>>
>>>> Still missing 'xl info'.
>>>
>>> Got Intel NUC5i3MYHE (internally it is NUC5i3MYBE board) into my hands.
>>> I have put 8 GiB RAM and 500 GB SATA 3 into it. Updated BIOS/EFI to 0041
>>> version (it is the latest one). Installed latest Debian testing (Debian
>>> GNU/Linux 9 (stretch)), built GRUB2 and Xen, with and without relocation
>>> patches, on it. Everything works (I left machine working last night).
>>> Guest boots without any issue. Please take look at attached logs.
>>>
>>> Doug, could you tell me how exactly did you test your machine? I need OS
>>> type, version, C version (GCC, clang, anything else), binutils version,
>>> etc. "xl dmesg", "xl info" and "dmesg" full outputs are welcome too.
>>>
>>> Daniel
>>>
>>
>> I thought I already responded to Konrad saying that latest staging +
>> relocation patches also comes up.
> 
> I do not remember it. Maybe I have missed that.
> 
>> My guess is that it is related to the IOMMU "fix" that Andrew and Jan
>> did by #if 0'ing out some of ebmalloc. But I'm not sure. I haven't had
> 
> I reenabled free_ebmalloc_unused_mem() during QEMU tests last week.
> It has not changed anything. I will do the same on my NUC.
> 
>> time to look at any of this stuff lately. I went to ELC and then
>> vacation and then managed to hurt myself on vacation so I've been away
>> from my computer a bit.
>>
>> All my branches are available in https://github.com/cardoe/xen and I've
>> been on Ubuntu 16.04.
> 
> I will try this too. Thanks for update.
> 
> Daniel
> 

Where's the branch you are using on that NUC? You can't be using plain
staging on there, because the firmware version you reported deadlocks
when the EFI GetTime() is called. It's a known issue by Intel. So
you must be patching that out of your Xen?

-- 
Doug Goldstein





Re: [Xen-devel] [PATCH v16 4/9] x86: add multiboot2 protocol support for EFI platforms

2017-03-15 Thread Daniel Kiper
On Wed, Mar 15, 2017 at 09:27:27AM -0500, Doug Goldstein wrote:
> On 3/15/17 6:35 AM, Daniel Kiper wrote:
> > On Thu, Mar 09, 2017 at 02:02:49PM -0600, Doug Goldstein wrote:
> >
> > [...]
> >
> >> Still missing 'xl info'.
> >
> > Got Intel NUC5i3MYHE (internally it is NUC5i3MYBE board) into my hands.
> > I have put 8 GiB RAM and 500 GB SATA 3 into it. Updated BIOS/EFI to 0041
> > version (it is the latest one). Installed latest Debian testing (Debian
> > GNU/Linux 9 (stretch)), built GRUB2 and Xen, with and without relocation
> > patches, on it. Everything works (I left machine working last night).
> > Guest boots without any issue. Please take look at attached logs.
> >
> > Doug, could you tell me how exactly did you test your machine? I need OS
> > type, version, C version (GCC, clang, anything else), binutils version,
> > etc. "xl dmesg", "xl info" and "dmesg" full outputs are welcome too.
> >
> > Daniel
> >
>
> I thought I already responded to Konrad saying that latest staging +
> relocation patches also comes up.

I do not remember it. Maybe I have missed that.

> My guess is that it is related to the IOMMU "fix" that Andrew and Jan
> did by #if 0'ing out some of ebmalloc. But I'm not sure. I haven't had

I reenabled free_ebmalloc_unused_mem() during QEMU tests last week.
It has not changed anything. I will do the same on my NUC.

> time to look at any of this stuff lately. I went to ELC and then
> vacation and then managed to hurt myself on vacation so I've been away
> from my computer a bit.
>
> All my branches are available in https://github.com/cardoe/xen and I've
> been on Ubuntu 16.04.

I will try this too. Thanks for update.

Daniel



Re: [Xen-devel] Future x86 emulator direction

2017-03-15 Thread Razvan Cojocaru
On 03/15/2017 02:42 PM, Jan Beulich wrote:
 On 15.03.17 at 13:08,  wrote:
>> On 15/03/17 07:49, Jan Beulich wrote:
>> On 14.03.17 at 22:07,  wrote:
 On 12/14/2016 09:37 AM, Razvan Cojocaru wrote:
> On 12/14/2016 09:14 AM, Jan Beulich wrote:
> On 13.12.16 at 23:02,  wrote:
>>> On 13/12/2016 21:55, Razvan Cojocaru wrote:
 On a somewhat related note, it's important to also figure out how best
 to avoid emulation races such as the LOCK CMPXCHG issue we've discussed
 in the past. Maybe that's also worth taking into consideration at this
 early stage.
>>> Funny you should ask that.
>>>
>>> The only possible way to do this safely is to have the emulator map the
>>> target frame(s) and execute a locked stub instruction with a memory
>>> operand pointing at the mapping.  We have no other way of interacting
>>> with the cache coherency fabric.
>> Well, that approach is necessary only if one path (vCPU) can write
>> to a page, while another one needs emulation. If pages are globally
>> write-protected, an approach following the model from Razvan's
>> earlier patch (which I have no idea what has become of) would
>> seem to suffice.
> As previously stated, you've raised performance concerns which seemed to
> require a different direction, namely the one Andrew is now suggesting,
> which indeed, aside from being somewhat faster is also safer for all
> cases (including the one you've mentioned, where one path can write
> normally and the other does so via emulation).
>
> The old patch itself is still alive in the XenServer patch queue, albeit
> quite unlikely to be trivial to apply to the current Xen 4.9-unstable
> code in its current form:
>
>
 https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-x86-emulate-syncrhonise-LOCKed-instruction-emulation.patch
> Again, if you decide that this patch is preferable, I can try to rework
> it for the current version of Xen.
 Sorry to revive this old thread, but I'm still not sure what the
 upstream solution for this very real problem should be. Should I bring
 back the old patch that synchronizes LOCKed CMPXCHGs (perhaps with
 Andrew's kind help, as he's stated that they keep an up-to-date patch
 that works against staging)? Or are you considering implementing a stub
 as part of the work being done on the emulator?
>>> Both are options imo. The stub approach likely would be the long term
>>> better solution, but carries with it quite a bit of emulator rework, since
>>> we'd have to completely change the way memory writes get carried
>>> out: As we'd need to act on the actual (guest) memory location, we'd
>>> have to do a page walk (or possibly two for an access crossing a page
>>> boundary) before running the stub, presumably completely replacing
>>> the ->write() hook. Compared with this making the ->cmpxchg() hook
>>> work as originally intended seems to be the more straightforward
>>> solution.
>>
>> We already need to change how reads and writes happen.  As it currently
>> stands, accesses which cross a page boundary are not handled correctly,
>> and will complete a partial read/write on the first page before finding
>> that the 2nd page takes a pagefault.  (The root of the problem is that
>> hvm_copy() has dual use; originally as a memcpy(), and later to
>> implement an individual instruction's access.)
> 
> Well, even for the memcpy() purpose of the function it would be
> better if no partial writes happened (at least when size is
> meaningfully smaller than a page).
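
The "no partial write" behaviour discussed above can be modelled in a few lines. This is only a sketch under stated assumptions: the `sk_` names, the two-page guest and the `present` flags are hypothetical and are not Xen's hvm_copy() or sh_emulate_map_dest(). It shows the validate-everything-first, then-commit ordering, so a fault on the second page leaves the first page untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE 4096u
#define SK_NR_PAGES  2u

/* Hypothetical toy guest: two pages, each either mapped or not. */
struct sk_guest {
    bool present[SK_NR_PAGES];
    uint8_t mem[SK_NR_PAGES * SK_PAGE_SIZE];
};

/*
 * Write len bytes at addr. Returns false (modelling a page fault) without
 * modifying any byte if any page touched by the access is not present.
 */
static bool sk_write(struct sk_guest *g, size_t addr,
                     const void *src, size_t len)
{
    size_t first, last, pg;

    assert(g != NULL && src != NULL);

    if ( len == 0 )
        return true;
    if ( addr >= sizeof(g->mem) || len > sizeof(g->mem) - addr )
        return false;

    first = addr / SK_PAGE_SIZE;
    last = (addr + len - 1) / SK_PAGE_SIZE;

    /* Phase 1: check every page the access spans before touching memory. */
    for ( pg = first; pg <= last; pg++ )
        if ( !g->present[pg] )
            return false;

    /* Phase 2: all pages are usable, so the copy cannot stop half way. */
    memcpy(&g->mem[addr], src, len);
    return true;
}
```

A real implementation would also have to deal with the read side and with MMIO, which this model ignores.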
> 
>> The HVM side of the code needs to be altered to work in the same way
>> that sh_x86_emulate_{write,cmpxchg}() currently uses
>> sh_emulate_map_dest(), except that the read side needs including as
>> well.  This is important for handling MMIO where reads may have side effects.
> 
> Except that commonly MMIO crossing page boundaries is considered
> undefined. Are you convinced CPUs never do partial writes when an
> access spans pages?
> 
>> Once that is complete, the cmpxchg hook at least should have proper
>> atomic properties.
>>
>> The next question is how to go about making all other LOCKed
>> instructions have atomic properties.  One suggestion was to try and
>> implement all LOCKed instructions in terms of cmpxchg, but I suspect
>> that will come with an unreasonably high overhead for introspection when
>> all vcpus are hitting the same spinlock.
> 
> Hmm, that's a valid concern.
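
For readers unfamiliar with the "implement LOCKed instructions in terms of cmpxchg" suggestion, a minimal sketch follows. It assumes C11 atomics stand in for the emulator's cmpxchg hook, and `lock_add32_via_cmpxchg` is a hypothetical name, not an emulator function. The retry loop is also where the overhead mentioned above comes from: under contention every failed compare-and-exchange means another round trip.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emulate "lock add" on a 32-bit location using nothing but
 * compare-and-exchange. Returns the value seen before the addition.
 */
static uint32_t lock_add32_via_cmpxchg(_Atomic uint32_t *mem, uint32_t val)
{
    uint32_t old;

    assert(mem != NULL);
    old = atomic_load(mem);

    /*
     * On failure atomic_compare_exchange_weak() stores the current contents
     * of *mem into old, so each retry recomputes old + val from fresh data.
     */
    while ( !atomic_compare_exchange_weak(mem, &old, old + val) )
        ;

    return old;
}
```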

If you'd like I could try to keep count of all the emulated LOCK
instructions that happen during a typical introspection session -
however I suspect there won't be that many, since the application will
not mark all of the guest's pages and hence not require emulation for
all LOCKed instructions. I remember that when I originally found the
issue 

Re: [Xen-devel] [PATCH v16 4/9] x86: add multiboot2 protocol support for EFI platforms

2017-03-15 Thread Doug Goldstein
On 3/15/17 6:35 AM, Daniel Kiper wrote:
> On Thu, Mar 09, 2017 at 02:02:49PM -0600, Doug Goldstein wrote:
> 
> [...]
> 
>> Still missing 'xl info'.
> 
> Got Intel NUC5i3MYHE (internally it is NUC5i3MYBE board) into my hands.
> I have put 8 GiB RAM and 500 GB SATA 3 into it. Updated BIOS/EFI to 0041
> version (it is the latest one). Installed latest Debian testing (Debian
> GNU/Linux 9 (stretch)), built GRUB2 and Xen, with and without relocation
> patches, on it. Everything works (I left machine working last night).
> Guest boots without any issue. Please take look at attached logs.
> 
> Doug, could you tell me how exactly did you test your machine? I need OS
> type, version, C version (GCC, clang, anything else), binutils version,
> etc. "xl dmesg", "xl info" and "dmesg" full outputs are welcome too.
> 
> Daniel
> 

I thought I already responded to Konrad saying that latest staging +
relocation patches also comes up.

My guess is that it is related to the IOMMU "fix" that Andrew and Jan
did by #if 0'ing out some of ebmalloc. But I'm not sure. I haven't had
time to look at any of this stuff lately. I went to ELC and then
vacation and then managed to hurt myself on vacation so I've been away
from my computer a bit.

All my branches are available in https://github.com/cardoe/xen and I've
been on Ubuntu 16.04.

-- 
Doug Goldstein





Re: [Xen-devel] [PATCH V2] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-03-15 Thread Razvan Cojocaru
On 03/13/2017 07:17 PM, Tamas K Lengyel wrote:
> On Mon, Mar 13, 2017 at 6:29 AM, Razvan Cojocaru
>  wrote:
>> On 03/13/2017 02:19 PM, Jan Beulich wrote:
>> On 13.03.17 at 13:01,  wrote:
 On 03/10/2017 09:01 PM, Tamas K Lengyel wrote:
> On Fri, Mar 10, 2017 at 4:21 AM, Andrew Cooper
>  wrote:
>> On 10/03/17 07:34, Jan Beulich wrote:
>> On 09.03.17 at 18:29,  wrote:
 On Thu, Mar 9, 2017 at 9:56 AM, Jan Beulich  wrote:
> However - is this interface supposed to be usable by a guest on 
> itself?
> Arguably the same question would apply to some of the other sub-ops
> too, but anyway.
 AFAIK the only op the guest would use on itself is
 HVMOP_altp2m_vcpu_enable_notify.
>>> Which then means we should move all of them out of here, into a
>>> suitable domctl. That will in turn reduce the scope of the bogus
>>> interface versioning, which Andrew did point out, quite a bit.
>>
>> The original usecase for altp2m was for an entirely in-guest agent,
>> which is why they got in as hvmops to start with.  I don't think it is
>> wise to break that.
>>
>> I think there needs to be slightly finer grain control, identifying
>> whether a domain may use altp2m, and whether it may configure altp2m
>> permissions itself.
>>
>> The nature of altp2m means that using EPTP switching/etc necessarily can
>> only happen from inside guest context, but whether you trust the domain
>> to make adjustments to the permissions itself depends on your usecase
>> and threat model.
>>
>
> So I'm actively using EPT switching and gfn remapping from a
> privileged monitor domain (not with VMFUNC). My entire usecase for
> altp2m is purely external without any in-guest agents. In fact, I have
> to deploy a custom XSM policy to blacklist altp2mhvm_op being issued
> from the guest.
>
> The reason I mentioned HVMOP_altp2m_vcpu_enable_notify as being the
> only one I believe that is only accessible from within the guest is
> this distinction in arch/x86/hvm/hvm.c:
>
> d = ( a.cmd != HVMOP_altp2m_vcpu_enable_notify ) ?
> rcu_lock_domain_by_any_id(a.domain) : rcu_lock_current_domain();
>
> For the other ops I'm not sure if they were really required to be
> accessible from within the guest or not. I'm not even sure using them
> would work from the guest with the above check in place. However, if
> they do work from the guest then I have no idea how it was supposed to
> work for security purposes as any application in that guest could just
> issue a hypercall to manipulate it or even turn it off.

 Thanks to all for the replies! What I'm taking away from this is:

 1. The hypercall continuation model proposed by Tamas is fine for HVMOPs.

 2. But we're not sure if these should be DOMCTLs or HVMOPs (except for
 HVMOP_altp2m_vcpu_enable_notify).

 3. If we keep them as HVMOPs, the code for handling the set_mem_access()
 part needs to be duplicated, both for the hypercall continuation / HVMOP
 hypercall structure part, and for the compat part (since the _multi()
 function sends arrays / handles to the hypervisor).

 So an agreement on point 2 is required before being able to proceed.
>>>
>>> I think as long as there's no need for the guest to use an operation
>>> on itself, it should not be a hvmop. After all, if you make it a domctl
>>> now and later find a need for it to be called by the guest, we can
>>> always replace the domctl by a hvmop. If, however, you start out
>>> with a hvmop, we'll be bound to be supporting it virtually forever.
>>
>> Since we're on this point, IMHO the xc_altp2m_ prefixed versions of
>> set_mem_access() and set_mem_access_multi() shouldn't exist at all.
>> Plain xc_set_mem_access() and xc_set_mem_access_multi() (as DOMCTLs)
>> should be enough, as long as we also add the view_id as an
>> extra-parameter, where view ID 0 is (already) the default EPT view.
>>
>> As it stands now, xc_set_mem_access() can do less than
>> xc_altp2m_set_mem_access() in that its view ID is always 0, but more
>> than xc_altp2m_set_mem_access() in that it is able to set more than one
>> page with a single hypercall, while the underlying hypervisor code is
>> the same.
> 
> Yeap, I remember suggesting that the two set_mem_access interfaces
> should be merged when altp2m was being contributed. Unfortunately we
> were not yet maintainers to make that suggestion a requirement so it
> was let in without that change..
> 
>>
>> Maybe I'm missing something design-wise (obviously if these really do
>> need to be HVMOPs a separate libxc function is required). Maybe the
>> altp2m maintainers have a different view of the matter.
>>
> 
> I think altp2m is 

Re: [Xen-devel] [PATCH v7 1/3] x86/mm: Adapt MODULES_END based on Fixmap section size

2017-03-15 Thread Boris Ostrovsky
On 03/14/2017 01:05 PM, Thomas Garnier wrote:
> This patch aligns MODULES_END to the beginning of the Fixmap section.
> It optimizes the space available for both sections. The address is
> pre-computed based on the number of pages required by the Fixmap
> section.
>
> It will allow GDT remapping in the Fixmap section. The current
> MODULES_END static address does not provide enough space for the kernel
> to support a large number of processors.
>
> Signed-off-by: Thomas Garnier 

For Xen bits (and to some extent bare-metal):

Reviewed-and-Tested-by: Boris Ostrovsky 
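
The arithmetic described in the quoted commit message can be illustrated as below. This is a sketch only: `sk_modules_end`, `SK_PAGE_SHIFT` and the example addresses used to try it are hypothetical and are not the kernel's actual macros or layout. It just shows that placing the end of the modules area at the start of the fixmap means subtracting the fixmap's size in pages from its top address.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Given the top address of a fixmap-like region and the number of pages it
 * needs, return the address where the region starts. The modules area would
 * end exactly there.
 */
static uint64_t sk_modules_end(uint64_t fixmap_top, unsigned int fixmap_pages)
{
    uint64_t size = (uint64_t)fixmap_pages << SK_PAGE_SHIFT;

    assert(fixmap_top >= size); /* the region must fit below its top */
    return fixmap_top - size;
}
```

With this shape, growing the fixmap (for example to hold more per-CPU entries) lowers the computed end address automatically instead of relying on a static constant.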





Re: [Xen-devel] [PATCH] x86emul: parallelize SIMD test code building

2017-03-15 Thread Andrew Cooper
On 15/03/17 13:37, Jan Beulich wrote:
> In anticipation of further flavors (AVX, AVX-512) going to be added
> (which would make the current situation even worse), facilitate
> reduction of build time (and hence latency to availability of test
> results) via use of make's -j option.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [Xen-devel] [PATCH v2 2/3] x86emul: correct handling of FPU insns faulting on memory write

2017-03-15 Thread Boris Ostrovsky
On 03/15/2017 09:31 AM, Jan Beulich wrote:
 On 15.03.17 at 14:24,  wrote:
>> On 03/15/2017 06:28 AM, Jan Beulich wrote:
>>> @@ -3716,9 +3720,9 @@ x86_emulate(
>>>  break;
>>>  
>>>  case 0x9b:  /* wait/fwait */
>>> -fic.insn_bytes = 1;
>>>  host_and_vcpu_must_have(fpu);
>>>  get_fpu(X86EMUL_FPU_wait, &fic);
>>> +fic.insn_bytes = 1;
>>>  asm volatile ( "fwait" ::: "memory" );
>>>  check_fpu_exn();
>>>  break;
>> Why is this needed?
> This isn't strictly needed, but desirable, due to the conditional being
> added in
>
> @@ -7916,7 +7920,7 @@ x86_emulate(
>  ctxt->regs->eflags &= ~X86_EFLAGS_RF;
>  
>   done:
> -put_fpu(&fic, ctxt, ops);
> +put_fpu(&fic, fic.insn_bytes > 0 && dst.type == OP_MEM, ctxt, ops);
>  put_stub(stub);
>  return rc;
>  #undef state
>
> (both host_and_vcpu_must_have() and get_fpu() may end up
> branching to "done"). Everywhere else the field is already being
> set after such basic checks.

Ah, OK.

But fic is a local variable that is not initialized (is it?) so
insn_bytes may be non-zero anyway?

-boris




[Xen-devel] [PATCH] x86emul: parallelize SIMD test code building

2017-03-15 Thread Jan Beulich
In anticipation of further flavors (AVX, AVX-512) going to be added
(which would make the current situation even worse), facilitate
reduction of build time (and hence latency to availability of test
results) via use of make's -j option.

Signed-off-by: Jan Beulich 

--- a/.gitignore
+++ b/.gitignore
@@ -222,10 +222,12 @@
 tools/security/secpol_tool
 tools/security/xen/*
 tools/security/xensec_tool
+tools/tests/x86_emulator/*.bin
+tools/tests/x86_emulator/*.tmp
 tools/tests/x86_emulator/asm
-tools/tests/x86_emulator/blowfish.bin
+tools/tests/x86_emulator/avx*.h
 tools/tests/x86_emulator/blowfish.h
-tools/tests/x86_emulator/simd.h
+tools/tests/x86_emulator/sse*.h
 tools/tests/x86_emulator/test_x86_emulator
 tools/tests/x86_emulator/x86_emulate
 tools/tests/xen-access/xen-access
--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -11,7 +11,8 @@ all: $(TARGET)
 run: $(TARGET)
./$(TARGET)
 
-TESTCASES := blowfish simd
+SIMD := sse sse2 sse4
+TESTCASES := blowfish $(SIMD) $(addsuffix -avx,$(filter sse%,$(SIMD)))
 
 blowfish-cflags := ""
 blowfish-cflags-x86_32 := "-mno-accumulate-outgoing-args -Dstatic="
@@ -34,19 +35,28 @@ sse2avx-sse  := -ffixed-xmm0 -Wa,-msse2a
 sse2avx-sse2 := $(sse2avx-sse)
 sse2avx-sse4 := -Wa,-msse2avx
 
-simd-cflags := $(foreach flavor,sse sse2 sse4, \
- $(foreach vec,$($(flavor)-vecs), \
-   $(foreach int,$($(flavor)-ints), \
- "-D$(flavor)_$(vec)i$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
- "-D$(flavor)_$(vec)u$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)" \
- "-D$(flavor)_avx_$(vec)i$(int) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
- "-D$(flavor)_avx_$(vec)u$(int) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
-   $(foreach flt,$($(flavor)-flts), \
- "-D$(flavor)_$(vec)f$(flt) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)" \
- "-D$(flavor)_avx_$(vec)f$(flt) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
- $(foreach flt,$($(flavor)-flts), \
-   "-D$(flavor)_f$(flt) -m$(flavor) -mfpmath=sse -O2 -DFLOAT_SIZE=$(flt)" \
-   "-D$(flavor)_avx_f$(flt) -m$(flavor) -mfpmath=sse $(sse2avx-$(flavor)) -O2 -DFLOAT_SIZE=$(flt)"))
+define simd-defs
+$(1)-cflags := \
+   $(foreach vec,$($(1)-vecs), \
+ $(foreach int,$($(1)-ints), \
+   "-D_$(vec)i$(int) -m$(1) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
+   "-D_$(vec)u$(int) -m$(1) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
+ $(foreach flt,$($(1)-flts), \
+   "-D_$(vec)f$(flt) -m$(1) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
+   $(foreach flt,$($(1)-flts), \
+ "-D_f$(flt) -m$(1) -mfpmath=sse -O2 -DFLOAT_SIZE=$(flt)")
+$(1)-avx-cflags := \
+   $(foreach vec,$($(1)-vecs), \
+ $(foreach int,$($(1)-ints), \
+   "-D_$(vec)i$(int) -m$(1) $(sse2avx-$(1)) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
+   "-D_$(vec)u$(int) -m$(1) $(sse2avx-$(1)) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
+ $(foreach flt,$($(1)-flts), \
+   "-D_$(vec)f$(flt) -m$(1) $(sse2avx-$(1)) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
+   $(foreach flt,$($(1)-flts), \
+ "-D_f$(flt) -m$(1) -mfpmath=sse $(sse2avx-$(1)) -O2 -DFLOAT_SIZE=$(flt)")
+endef
+
+$(foreach flavor,$(SIMD),$(eval $(call simd-defs,$(flavor))))
 
 $(addsuffix .h,$(TESTCASES)): %.h: %.c testcase.mk Makefile
rm -f $@.new $*.bin
@@ -54,7 +64,7 @@ $(addsuffix .h,$(TESTCASES)): %.h: %.c t
for cflags in $($*-cflags) $($*-cflags-$(arch)); do \
$(MAKE) -f testcase.mk TESTCASE=$* XEN_TARGET_ARCH=$(arch) $*-cflags="$$cflags" all; \
flavor=$$(echo $${cflags} | sed -e 's, .*,,' -e 'y,-=,__,') ; \
-   (echo "static const unsigned int $*_$(arch)$${flavor}[] = {"; \
+   (echo "static const unsigned int $(subst -,_,$*)_$(arch)$${flavor}[] = {"; \
 od -v -t x $*.bin | sed -e 's/^[0-9]* /0x/' -e 's/ /, 0x/g' -e 's/$$/,/'; \
 echo "};") >>$@.new; \
rm -f $*.bin; \
@@ -62,6 +72,9 @@ $(addsuffix .h,$(TESTCASES)): %.h: %.c t
)
mv $@.new $@
 
+$(addsuffix .c,$(SIMD)) $(addsuffix -avx.c,$(filter sse%,$(SIMD))):
+   ln -sf simd.c $@
+
 $(TARGET): x86_emulate.o test_x86_emulator.o
$(HOSTCC) -o $@ $^
 
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -5,7 +5,12 @@
 
 #include "x86_emulate.h"
 #include "blowfish.h"
-#include "simd.h"
+#include "sse.h"
+#include "sse2.h"
+#include "sse4.h"
+#include "sse-avx.h"
+#include "sse2-avx.h"
+#include "sse4-avx.h"
 
 #define verbose false /* Switch to true for 

Re: [Xen-devel] [PATCH v2 2/3] x86emul: correct handling of FPU insns faulting on memory write

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 14:24,  wrote:
> On 03/15/2017 06:28 AM, Jan Beulich wrote:
>> @@ -3716,9 +3720,9 @@ x86_emulate(
>>  break;
>>  
>>  case 0x9b:  /* wait/fwait */
>> -fic.insn_bytes = 1;
>>  host_and_vcpu_must_have(fpu);
>>  get_fpu(X86EMUL_FPU_wait, &fic);
>> +fic.insn_bytes = 1;
>>  asm volatile ( "fwait" ::: "memory" );
>>  check_fpu_exn();
>>  break;
> 
> Why is this needed?

This isn't strictly needed, but desirable, due to the conditional being
added in

@@ -7916,7 +7920,7 @@ x86_emulate(
 ctxt->regs->eflags &= ~X86_EFLAGS_RF;
 
  done:
-put_fpu(&fic, ctxt, ops);
+put_fpu(&fic, fic.insn_bytes > 0 && dst.type == OP_MEM, ctxt, ops);
 put_stub(stub);
 return rc;
 #undef state

(both host_and_vcpu_must_have() and get_fpu() may end up
branching to "done"). Everywhere else the field is already being
set after such basic checks.

Jan
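
The ordering argument above can be seen in a stripped-down model. This is a sketch, not the emulator: `emulate_sketch`, `struct fpu_ctxt_sketch` and the zero initialisation of the context are assumptions made for illustration. The point is that when the length field is only assigned after the early checks, a bail-out to "done" cannot make the exit path believe an FPU instruction was actually executed.

```c
#include <assert.h> /* for assert() when exercising the sketch */
#include <stdbool.h>

/* Hypothetical stand-in for the emulator's FPU instruction context. */
struct fpu_ctxt_sketch {
    unsigned int insn_bytes;
};

/*
 * Returns the value the exit path would compute for
 * "insn_bytes > 0 && destination is memory".
 */
static bool emulate_sketch(bool fpu_available, bool dst_is_mem)
{
    struct fpu_ctxt_sketch fic = { 0 }; /* assumption: starts out zeroed */
    bool late_failure_possible;

    /* Models host_and_vcpu_must_have() / get_fpu() branching to "done". */
    if ( !fpu_available )
        goto done;

    /* Set only once the basic checks have passed. */
    fic.insn_bytes = 1;

    /* ... the instruction itself would run here ... */

 done:
    late_failure_possible = fic.insn_bytes > 0 && dst_is_mem;
    return late_failure_possible;
}
```

Had `fic.insn_bytes = 1` been placed before the availability check, the first case (no FPU, memory destination) would wrongly yield true.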




Re: [Xen-devel] [PATCH v2 2/3] x86emul: correct handling of FPU insns faulting on memory write

2017-03-15 Thread Boris Ostrovsky
On 03/15/2017 06:28 AM, Jan Beulich wrote:
> When an FPU instruction with a memory destination fails during the
> memory write, it should not affect FPU register state. Due to the way
> we emulate FPU (and SIMD) instructions, we can only guarantee this by
> - backing out changes to the FPU register state in such a case or
> - doing a descriptor read and/or page walk up front, perhaps with the
>   stubs accessing the actual memory location then.
> The latter would require a significant change in how the emulator does
> its guest memory accessing, so for now the former variant is being
> chosen.
>
> Signed-off-by: Jan Beulich 
> Reviewed-by: Andrew Cooper 
> Reviewed-by: Kevin Tian 

Reviewed-by: Boris Ostrovsky 

with one question:

>  
> @@ -3716,9 +3720,9 @@ x86_emulate(
>  break;
>  
>  case 0x9b:  /* wait/fwait */
> -fic.insn_bytes = 1;
>  host_and_vcpu_must_have(fpu);
>  get_fpu(X86EMUL_FPU_wait, &fic);
> +fic.insn_bytes = 1;
>  asm volatile ( "fwait" ::: "memory" );
>  check_fpu_exn();
>  break;
>


Why is this needed?

-boris
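
The "backing out changes to the FPU register state" variant chosen in the quoted commit message can be pictured with a toy model. Everything here is hypothetical (`sk_fpu_state`, `sk_emulate_store`, the `write_ok` flag standing in for the guest memory write succeeding); it is not how the patch is implemented, only the general snapshot-and-restore idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced FPU state. */
struct sk_fpu_state {
    double st0;
    unsigned int status;
};

/*
 * Model of a store-and-pop style instruction: the register side effects
 * happen first, then the memory write. If the write fails, the saved
 * register state is restored so the failure is invisible to the guest.
 */
static bool sk_emulate_store(struct sk_fpu_state *fpu, double *dst,
                             bool write_ok)
{
    struct sk_fpu_state saved;
    double value;

    assert(fpu != NULL && dst != NULL);

    saved = *fpu;        /* snapshot before the instruction runs */
    value = fpu->st0;

    fpu->st0 = 0.0;      /* register side effects of the instruction */
    fpu->status |= 1u;

    if ( !write_ok )
    {
        *fpu = saved;    /* back out: registers look untouched */
        return false;
    }

    *dst = value;
    return true;
}
```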



Re: [Xen-devel] preparations for 4.8.1

2017-03-15 Thread Roger Pau Monné
On Tue, Mar 14, 2017 at 05:47:08AM -0600, Jan Beulich wrote:
> All,
> 
> with the goal of releasing in about 3 weeks time, please point out
> backport candidates you find missing from the respective staging
> branch, but which you consider relevant.

It's maybe a little bit premature but I would like to request the backport of
the clang 4.0 fixes to 4.8.1 (note than one has not even passed the push gate
yet):

Hypervisor:
9e4d116faff4545a7f21c2b01008e94d68e6db58 build/clang: fix XSM dummy policy when 
using clang 4.0
4036e7c592905c2292cdeba8269e969959427237 x86: drop unneeded __packed attributes

Tools:
bfd9a2095f1882e8c074b2d911bcb07d12cf6cf5 tools/kdd: don't use a pointer to an unaligned field.

Thanks, Roger.



Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Roger Pau Monné
On Wed, Mar 15, 2017 at 08:42:04AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 15, 2017 at 12:07:28PM +, Roger Pau Monné wrote:
> > On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > On Thu, Mar 09, 2017 at 01:26:45PM +, Julien Grall wrote:
> > > > > > Hi Konrad,
> > > > > > 
> > > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné wrote:
> > > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad Rzeszutek Wilk 
> > > > > > > > wrote:
> > > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +, Julien Grall wrote:
> > > > > > > > > .. this as for SR-IOV devices you need the drivers to kick 
> > > > > > > > > the hardware
> > > > > > > > > to generate the new bus addresses. And those (along with the 
> > > > > > > > > BAR regions) are
> > > > > > > > > not visible in ACPI (they are constructued dynamically).
> > > > > > > > 
> > > > > > > > There's already code in Xen [0] to find out the size of the 
> > > > > > > > BARs of SR-IOV
> > > > > > > > devices, but I'm not sure what's the intended usage of that, 
> > > > > > > > does it need to
> > > > > > > > happen _after_ the driver in Dom0 has done whatever magic for 
> > > > > > > > this to work?
> > > > > > > 
> > > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add hypercall 
> > > > > > > when
> > > > > > > the device driver in dom0 has finished "creating" the VF. See 
> > > > > > > drivers/xen/pci.c
> > > > > > 
> > > > > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall for 
> > > > > > ARM and do
> > > > > > the PCI scanning in Xen.
> > > > > > 
> > > > > > If I understand correctly what you said, only the PCI driver will 
> > > > > > be able to
> > > > > > kick SR-IOV device and Xen would not be able to detect the device 
> > > > > > until it
> > > > > > has been fully configured. So it would mean that we have to keep
> > > > > > PHYSDEVOP_pci_device_add around to know when Xen can use the device.
> > > > > > 
> > > > > > Am I correct?
> > > > > 
> > > > > Yes. Unless the PCI drivers come up with some other way to tell the
> > > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > > 
> > > > > Or the underlaying bus on ARM can send some 'new device' information?
> > > > 
> > > > Hm, is this something standard between all the SR-IOV implementations, 
> > > > or each
> > > > vendors have their own sauce?
> > > 
> > > Gosh, all of them have their own sauce. The only thing that is the same
> > > is that suddenly behind the PF device there are PCI devies that are 
> > > responding
> > > to 0xcfc requests. MAgic!
> > 
> > I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to wait for 
> > the
> > device driver in Dom0 in order to get the information of the VF devices, 
> > what
> > Xen cares about is the position of the BARs (so that they can be mapped into
> > Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap 
> > accesses to
> > it.
> > 
> > AFAICT both of this can be obtained without any driver-specific code, since
> > it's all contained in the PCI SR-IOV spec (but maybe I'm missing something).
> 
> CC-ing Venu,
> 
> Roger, could you point out which of the chapters has this?

This would be chapter 2 ("Initialization and Resource Allocation"), and then
there's an "IMPLEMENTATION NOTE" that shows how the PF/VF are matched to
function numbers on page 45 (I have the following copy, which is the latest
revision: "Single Root I/O Virtualization and Sharing Specification Revision
1.1" from January 20 2010 [0]).

The document is quite complex, but it is a standard that all SR-IOV devices
should follow so AFAICT Xen should be able to get all the information that it
needs from the PCI config space in order to detect the PF/VF BARs and the BDF
device addresses.

Roger.

[0] https://members.pcisig.com/wg/PCI-SIG/document/download/8238
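
As a rough illustration of why the VF addresses can be derived without
driver-specific code: the SR-IOV capability exposes a First VF Offset and a VF
Stride, and the routing ID of VF n is the PF routing ID plus the offset plus
(n - 1) times the stride, in 16-bit arithmetic. The sketch below assumes those
two values have already been read from config space; the helper names are
hypothetical and are not Xen functions.

```c
#include <stdint.h>

/*
 * Sketch only (hypothetical helper, not Xen code).
 * Routing ID of VF number n (1-based) behind a PF:
 *   VF RID = PF RID + First VF Offset + (n - 1) * VF Stride
 * Routing IDs are 16 bits wide, so the sum is truncated to 16 bits.
 */
static inline uint16_t sriov_vf_rid(uint16_t pf_rid, uint16_t first_vf_offset,
                                    uint16_t vf_stride, unsigned int n)
{
    return (uint16_t)(pf_rid + first_vf_offset + (n - 1) * vf_stride);
}

/* Bus number: the upper 8 bits of a routing ID. */
static inline unsigned int rid_bus(uint16_t rid)
{
    return rid >> 8;
}

/* Combined device/function byte: the lower 8 bits of a routing ID. */
static inline unsigned int rid_devfn(uint16_t rid)
{
    return rid & 0xff;
}
```

For example, a PF at routing ID 0x0300 with First VF Offset 0x80 and VF Stride
2 would place its first VF at 0x0380 and its second at 0x0382. The offset and
stride can depend on how the SR-IOV capability is configured, so they would
have to be read after that configuration is in place.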



Re: [Xen-devel] Backport / cherrypick request for xen 4.8.1 patches fixing direct kernel boot regression

2017-03-15 Thread Sander Eikelenboom
On 15/03/17 13:08, Wei Liu wrote:
> On Wed, Mar 15, 2017 at 01:01:52PM +0100, Sander Eikelenboom wrote:
>> Hi Ian / Wei,
>>
>> Qemu-xen commits:
>> 021746c131cdfeab9d82ff918795a9f18d20d7ae PCMachineState: introduce 
>> acpi_build_enabled field
>> 804ba7c10bbc66bb8a8aa73ecc60f620da7423d5 xen: Fix xenpv machine 
>> initialisation
>> (the second one is a fix for the first one)
>>
>> Fixed a regression with direct kernel boot, which was introduced as feature 
>> in 4.7 and regressed in the 4.8.0 release.
>> At present they don't seem to be in the staging-4.8 branch of the qemu-xen 
>> tree, so are these
>> two patches eligible to be included / backported for the 4.8.1 release ?
>>
> 
> They are already released with 4.8 qemu-xen, don't they?

Terribly sorry for the noise.
Quite a brainfart, somehow i was convinced it wasn't and didn't check
properly.

--
Sander

> 
>> --
>> Sander




Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-15 Thread Konrad Rzeszutek Wilk
On Wed, Mar 15, 2017 at 12:07:28PM +, Roger Pau Monné wrote:
> On Fri, Mar 10, 2017 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
> > On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> > > On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Thu, Mar 09, 2017 at 01:26:45PM +, Julien Grall wrote:
> > > > > Hi Konrad,
> > > > > 
> > > > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné wrote:
> > > > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad Rzeszutek Wilk 
> > > > > > > wrote:
> > > > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +, Julien Grall wrote:
> > > > > > > > .. this as for SR-IOV devices you need the drivers to kick the 
> > > > > > > > hardware
> > > > > > > > to generate the new bus addresses. And those (along with the 
> > > > > > > > BAR regions) are
> > > > > > > > not visible in ACPI (they are constructued dynamically).
> > > > > > > 
> > > > > > > There's already code in Xen [0] to find out the size of the BARs 
> > > > > > > of SR-IOV
> > > > > > > devices, but I'm not sure what's the intended usage of that, does 
> > > > > > > it need to
> > > > > > > happen _after_ the driver in Dom0 has done whatever magic for 
> > > > > > > this to work?
> > > > > > 
> > > > > > Yes. This is called via the PHYSDEVOP_pci_device_add hypercall when
> > > > > > the device driver in dom0 has finished "creating" the VF. See 
> > > > > > drivers/xen/pci.c
> > > > > 
> > > > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall for ARM 
> > > > > and do
> > > > > the PCI scanning in Xen.
> > > > > 
> > > > > If I understand correctly what you said, only the PCI driver will be 
> > > > > able to
> > > > > kick SR-IOV device and Xen would not be able to detect the device 
> > > > > until it
> > > > > has been fully configured. So it would mean that we have to keep
> > > > > PHYSDEVOP_pci_device_add around to know when Xen can use the device.
> > > > > 
> > > > > Am I correct?
> > > > 
> > > > Yes. Unless the PCI drivers come up with some other way to tell the
> > > > OS that oh, hey, there is this new PCI device with this BDF.
> > > > 
> > > > Or the underlaying bus on ARM can send some 'new device' information?
> > > 
> > > Hm, is this something standard between all the SR-IOV implementations, or 
> > > each
> > > vendors have their own sauce?
> > 
> > Gosh, all of them have their own sauce. The only thing that is the same
> > is that suddenly behind the PF device there are PCI devies that are 
> > responding
> > to 0xcfc requests. MAgic!
> 
> I'm reading the PCI SR-IOV 1.1 spec, and I think we don't need to wait for the
> device driver in Dom0 in order to get the information of the VF devices, what
> Xen cares about is the position of the BARs (so that they can be mapped into
> Dom0 at boot), and the PCI SBDF of each PF/VF, so that Xen can trap accesses 
> to
> it.
> 
> AFAICT both of this can be obtained without any driver-specific code, since
> it's all contained in the PCI SR-IOV spec (but maybe I'm missing something).

CC-ing Venu,

Roger, could you point out which of the chapters has this?



Re: [Xen-devel] Future x86 emulator direction

2017-03-15 Thread Jan Beulich
>>> On 15.03.17 at 13:08,  wrote:
> On 15/03/17 07:49, Jan Beulich wrote:
> On 14.03.17 at 22:07,  wrote:
>>> On 12/14/2016 09:37 AM, Razvan Cojocaru wrote:
 On 12/14/2016 09:14 AM, Jan Beulich wrote:
 On 13.12.16 at 23:02,  wrote:
>> On 13/12/2016 21:55, Razvan Cojocaru wrote:
>>> On a somewhat related note, it's important to also figure out how best
>>> to avoid emulation races such as the LOCK CMPXCHG issue we've discussed
>>> in the past. Maybe that's also worth taking into consideration at this
>>> early stage.
>> Funny you should ask that.
>>
>> The only possible way to do this safely is to have the emulator map the
>> target frame(s) and execute a locked stub instruction with a memory
>> operand pointing at the mapping.  We have no other way of interacting
>> with the cache coherency fabric.
> Well, that approach is necessary only if one path (vCPU) can write
> to a page, while another one needs emulation. If pages are globally
> write-protected, an approach following the model from Razvan's
> earlier patch (which I have no idea what has become of) would
> seem to suffice.
 As previously stated, you've raised performance concerns which seemed to
 require a different direction, namely the one Andrew is now suggesting,
 which indeed, aside from being somewhat faster is also safer for all
 cases (including the one you've mentioned, where one path can write
 normally and the other does so via emulation).

 The old patch itself is still alive in the XenServer patch queue, albeit
 quite unlikely to be trivial to apply to the current Xen 4.9-unstable
 code in its current form:


>>> https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-x86-emulate-syncrhonise-LOCKed-instruction-emulation.patch
 Again, if you decide that this patch is preferable, I can try to rework
 it for the current version of Xen.
>>> Sorry to revive this old thread, but I'm still not sure what the
>>> upstream solution for this very real problem should be. Should I bring
>>> back the old patch that synchronizes LOCKed CMPXCHGs (perhaps with
>>> Andrew's kind help, as he's stated that they keep an up-to-date patch
>>> that works against staging)? Or are you considering implementing a stub
>>> as part of the work being done on the emulator?
>> Both are options imo. The stub approach likely would be the long term
>> better solution, but carries with it quite a bit of emulator rework, since
>> we'd have to completely change the way memory writes get carried
>> out: As we'd need to act on the actual (guest) memory location, we'd
>> have to do a page walk (or possibly two for an access crossing a page
>> boundary) before running the stub, presumably completely replacing
>> the ->write() hook. Compared with this making the ->cmpxchg() hook
>> work as originally intended seems to be the more straightforward
>> solution.
> 
> We already need to change how reads and writes happen.  As it currently
> stands, accesses which cross a page boundary are not handled correctly,
> and will complete a partial read/write on the first page before finding
> that the 2nd page takes a pagefault.  (The root of the problem is that
> hvm_copy() has dual use; originally as a memcpy(), and later to
> implement an individual instructions access.)

Well, even for the memcpy() purpose of the function it would be
better if no partial writes happened (at least when size is
meaningfully smaller than a page).

> The HVM side of the code needs to be altered to work in the same way
> that sh_x86_emulate_{write,cmpxchg}() currently uses
> sh_emulate_map_dest(), except that the read side needs including as
> well.  This important for handling MMIO where reads may have side effects.

Except that commonly MMIO crossing page boundaries is considered
undefined. Are you convinced CPUs never do partial writes when an
access spans pages?

> Once that is complete, the cmpxchg hook at least should have proper
> atomic properties.
> 
> The next question is how to go about making all other LOCKed
> instructions have atomic properties.  One suggestion was to try and
> implement all LOCKed instructions in terms of cmpxchg, but I suspect
> that will come with an unreasonably high overhead for introspection when
> all vcpus are hitting the same spinlock.

Hmm, that's a valid concern.

Jan
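
For readers following the thread, this is roughly what "implementing a LOCKed
instruction in terms of cmpxchg" amounts to. It is a minimal sketch using C11
atomics as a stand-in for the emulator's ->cmpxchg() hook, not emulator code,
and the function name is made up for the illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Sketch only: emulate an atomic "lock add" with nothing but
 * compare-and-exchange. Returns the value seen before the addition.
 */
static uint32_t locked_add_via_cmpxchg(_Atomic uint32_t *ptr, uint32_t val)
{
    uint32_t old = atomic_load(ptr);

    /*
     * On failure the call stores the current memory value into 'old',
     * so each retry recomputes old + val from fresh data.
     */
    while ( !atomic_compare_exchange_weak(ptr, &old, old + val) )
        ;

    return old;
}
```

Every failed exchange costs another round trip through the loop, which is
where the overhead would come from when many vcpus hit the same location.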



Re: [Xen-devel] "Hello Xen Project" Book.

2017-03-15 Thread Lars Kurth
Hi Mohsen,

> On 15 Mar 2017, at 09:50, Mohsen  wrote:
> 
> Dear Xen Project community members,
> 
> I have written a Xen book recently (pdf attached) which is aimed at teaching 
> Xen newbies. I would like to make the book available to the Xen Project under 
> a CC-BY-SA-3.0 license. Ideally, I would like to publish the content on the 
> Xen Project wiki in an editable form, such that others can contribute and 
> build on it and it stays up-to-date. I also noticed that the Xen Wiki has the 
> https://www.mediawiki.org/wiki/Extension:Collection extension, which should 
> make it possible to create a PDF, ODF or DocBook from the pages for those who 
> want a manual rather than wiki pages.

Thank you for doing this. As far as I can tell the fact that you published the 
book under CC-BY-SA-3.0 would make it possible to move the content to the wiki. 

> I had a conversation with Lars to check whether this is possible and he 
> believes it is. He suggests that first we upload the book as pdf to the wiki 
> and as a second step, agree an information architecture and then convert the 
> book to mark-down. There are a number of conversion tools which should get us 
> there some of the way, with a bit of cleanup and beautification needed after 
> the initial import. I can make the source available in a format that makes 
> conversion to markdown easier.

We do need to find a way to convert the content into markdown format though, 
which may be quite a bit of work.

I have done this before for html pages, converting them into docman markdown. I 
have not checked whether there are online or command line tools which do that 
for mediawiki markdown. In any case, the conversion is fundamentally doable, 
although it will be somewhat tedious to do this. If anyone has more experience, 
please share and advise what the best way forward is.

The main problem that I faced when doing something similar was tables, figures 
and other more advanced formatting. Much of this may get lost or "corrupted" in 
some way and will have to be re-introduced post conversion.  

@Mohsen: as far as I recall, you used Word or LibreOffice to create the book? 
Is that correct? If so, it should be possible to save it in html, which would 
ensure that figures and so on are saved in some sensible way. We would probably 
need to find a temporary location where to store this. And we can start 
experimenting a little and maybe provide a quick guide on how to do this.

As for the information architecture, I was thinking about a structure such as 
...

https://wiki.xenproject.org/wiki/ 
https://wiki.xenproject.org/wiki//title_and_credits 
https://wiki.xenproject.org/wiki// 
https://wiki.xenproject.org/wiki/// 
... a separate article for each article in the book, such as "Virtualization 
and Security". As a first step, we would probably keep the original chapter 
structure. 

This would then look something like ...
https://wiki.xenproject.org/wiki/HelloXenProject
https://wiki.xenproject.org/wiki/HelloXenProject/0/Title
https://wiki.xenproject.org/wiki/HelloXenProject/0/Credits
https://wiki.xenproject.org/wiki/HelloXenProject/0/Licence
https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro
https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro/History
https://wiki.xenproject.org/wiki/HelloXenProject/1-Intro/TypesOfVirtualization

We may need some other extensible numbering scheme, which would make it easy to 
create PDFs with https://www.mediawiki.org/wiki/Extension:Collection - again, 
this is something I don't have experience with.

> What do people think? Is this a good idea? Would anyone be willing to help? I 
> am not very familiar with Markdown and would need someone else to help with 
> the wikification of the book. Lars already volunteered to help.

I will definitely help, but this would be an activity, which could easily be 
distributed. So help from others would be very highly appreciated.

Best Regards
Lars




[Xen-devel] [xen-unstable-smoke test] 106690: tolerable trouble: broken/fail/pass - PUSHED

2017-03-15 Thread osstest service owner
flight 106690 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106690/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt                  1 build-check(1)             blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1)             blocked  n/a
 test-arm64-arm64-xl-xsm                   1 build-check(1)             blocked  n/a
 build-arm64                               5 xen-build                  fail     never pass
 build-arm64-pvops                         5 kernel-build               fail     never pass
 test-armhf-armhf-xl                      12 migrate-support-check      fail     never pass
 test-armhf-armhf-xl                      13 saverestore-support-check  fail     never pass

version targeted for testing:
 xen  15e90cd01f68ff8b23b426e7f91155b81d73db13
baseline version:
 xen  48321fa86ddefe2fddf728dc972b01bb7c7c8559

Last test of basis   106664  2017-03-14 14:01:45 Z    0 days
Failing since10  2017-03-14 16:01:04 Z    0 days   10 attempts
Testing same since   106690  2017-03-15 10:22:16 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel De Graaf 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Razvan Cojocaru 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-amd64                                pass
 build-arm64                                fail
 build-armhf                                pass
 build-amd64-libvirt                        pass
 build-arm64-pvops                          fail
 test-armhf-armhf-xl                        pass
 test-arm64-arm64-xl-xsm                    broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386   broken
 test-amd64-amd64-libvirt                   broken



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=15e90cd01f68ff8b23b426e7f91155b81d73db13
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 15e90cd01f68ff8b23b426e7f91155b81d73db13
+ branch=xen-unstable-smoke
+ revision=15e90cd01f68ff8b23b426e7f91155b81d73db13
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.8-testing
+ '[' x15e90cd01f68ff8b23b426e7f91155b81d73db13 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git

[Xen-devel] [PATCH v10 6/6] passthrough/io: Fall back to remapping interrupt when we can't use VT-d PI

2017-03-15 Thread Chao Gao
The current logic for using VT-d PI is: when the guest configures the pirq's
destination to be a single vcpu, the corresponding IRTE is updated to posted
format. If the destination of the pirq is multiple vcpus, we just use
interrupt remapping. Obviously, we should fall back to interrupt remapping
when the guest misconfigures the destination of the pirq or gives it multiple
destination vcpus.

Signed-off-by: Chao Gao 
---
v10:
- Newly added

 xen/drivers/passthrough/io.c   | 9 +
 xen/drivers/passthrough/vtd/intremap.c | 2 +-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 37d9af6..be06b10 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -457,14 +457,7 @@ int pt_irq_create_bind(
 
 /* Use interrupt posting if it is supported. */
 if ( iommu_intpost )
-{
-if ( vcpu )
-pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec);
-else
-dprintk(XENLOG_G_INFO,
-"%pv: deliver interrupt in remapping mode,gvec:%02x\n",
-vcpu, pirq_dpci->gmsi.gvec);
-}
+pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec);
 
 break;
 }
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index 342b45f..331c7d5 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -946,7 +946,7 @@ int pi_update_irte(const struct vcpu *v, const struct pirq *pirq,
 {
 struct irq_desc *desc;
 struct msi_desc *msi_desc;
-const struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+const struct pi_desc *pi_desc = v ? &v->arch.hvm_vmx.pi_desc : NULL;
 int rc;
 
 desc = pirq_spin_lock_irq_desc(pirq, NULL);
-- 
1.8.3.1




[Xen-devel] [PATCH v10 5/6] passthrough/io: don't migrate pirq when it is delivered through VT-d PI

2017-03-15 Thread Chao Gao
When a vCPU is migrated to another pCPU, pt irqs bound to this vCPU also need
to be migrated. When VT-d PI is enabled, the interrupt vector is recorded in a
main-memory-resident data structure and a notification, whose destination is
decided by NDST, is generated. NDST is properly adjusted during vCPU
migration, so a pirq injected directly into the guest needn't be migrated.

This patch adds an indicator, @via_pi, to show whether the pt irq is delivered
through VT-d PI.

Signed-off-by: Chao Gao 
---
v10:
- Newly added.

 xen/arch/x86/hvm/hvm.c   |  3 +++
 xen/drivers/passthrough/io.c | 15 +++
 xen/include/xen/hvm/irq.h|  1 +
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ccfae4f..abbdab9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -445,6 +445,9 @@ static int hvm_migrate_pirq(struct domain *d, struct hvm_pirq_dpci *pirq_dpci,
 struct vcpu *v = arg;
 
 if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
+ (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) &&
+ /* Needn't migrate pirq if this pirq is delivered to guest directly.*/
+ (pirq_dpci->gmsi.via_pi != 1) &&
  (pirq_dpci->gmsi.dest_vcpu_id == v->vcpu_id) )
 {
 struct irq_desc *desc =
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 080183e..37d9af6 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -365,6 +365,7 @@ int pt_irq_create_bind(
 {
 uint8_t dest, dest_mode, delivery_mode;
 int dest_vcpu_id;
+const struct vcpu *vcpu = NULL;
 
 if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
 {
@@ -441,6 +442,15 @@ int pt_irq_create_bind(
 
 dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
 pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
+
+pirq_dpci->gmsi.via_pi = 0;
+if ( iommu_intpost )
+{
+vcpu = pi_find_dest_vcpu(d, dest, dest_mode, delivery_mode,
+ pirq_dpci->gmsi.gvec);
+if ( vcpu )
+pirq_dpci->gmsi.via_pi = 1;
+}
 spin_unlock(&d->event_lock);
 if ( dest_vcpu_id >= 0 )
 hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
@@ -448,11 +458,8 @@ int pt_irq_create_bind(
 /* Use interrupt posting if it is supported. */
 if ( iommu_intpost )
 {
-const struct vcpu *vcpu = pi_find_dest_vcpu(d, dest, dest_mode,
-  delivery_mode, pirq_dpci->gmsi.gvec);
-
 if ( vcpu )
-pi_update_irte( vcpu, info, pirq_dpci->gmsi.gvec );
+pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec);
 else
 dprintk(XENLOG_G_INFO,
 "%pv: deliver interrupt in remapping mode,gvec:%02x\n",
diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
index d3f8623..ba74a31 100644
--- a/xen/include/xen/hvm/irq.h
+++ b/xen/include/xen/hvm/irq.h
@@ -63,6 +63,7 @@ struct hvm_gmsi_info {
 uint32_t gvec;
 uint32_t gflags;
 int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
+bool via_pi; /* directly deliver to guest via VT-d PI? */
 };
 
 struct hvm_girq_dpci_mapping {
-- 
1.8.3.1
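
The migration check changed in hvm_migrate_pirq() above can be modelled in
isolation as a simple predicate. This is a sketch under stated assumptions:
the structures are cut-down stand-ins for the Xen ones, and the flag values
are hypothetical placeholders, not the real HVM_IRQ_DPCI_* constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DPCI_MACH_MSI   (1u << 0)
#define DPCI_GUEST_MSI  (1u << 1)

/* Cut-down model of the guest MSI info touched by the patch. */
struct gmsi_model {
    int dest_vcpu_id;   /* -1: multi-destination, otherwise a vcpu id */
    bool via_pi;        /* delivered directly to the guest via VT-d PI? */
};

struct pirq_model {
    uint32_t flags;
    struct gmsi_model gmsi;
};

/*
 * A pirq follows the vCPU only if it is a machine MSI bound to a guest
 * MSI, is not delivered through VT-d PI, and targets this vCPU.
 */
static bool pirq_needs_migration(const struct pirq_model *p, int vcpu_id)
{
    return (p->flags & DPCI_MACH_MSI) &&
           (p->flags & DPCI_GUEST_MSI) &&
           !p->gmsi.via_pi &&
           p->gmsi.dest_vcpu_id == vcpu_id;
}
```

With via_pi set, the predicate is false regardless of the destination vCPU,
which matches the intent that posted interrupts rely on the NDST update rather
than on moving the pirq.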




[Xen-devel] [PATCH v10 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list

2017-03-15 Thread Chao Gao
The current VT-d PI related code may operate incorrectly in the
following scenarios:
1. When the IRTE is in posted mode, we don't need to set the irq
affinity for it, since the destination of these interrupts is a
vCPU and the vCPU affinity is set during vCPU scheduling. Patch
[1/6] handles this. To make the logic clearer, it also extracts
the common part of pi_update_irte() and msi_msg_to_rte_entry(),
the two functions that update the IRTE.

2. [2/6] is a cleanup patch.

3. When a pCPU is unplugged, there might still be vCPUs on its
blocking list. Since the pCPU is offline, those vCPUs might not be
woken up again. [3/6] addresses this.

4. The IRTE is updated through structure assignment or memcpy(), which is
unsafe. To resolve this, patch [4/6] uses cmpxchg16b() if supported, or
two 64-bit write operations otherwise, to update the IRTE.

5. When VT-d PI is enabled, a pirq that uses VT-d PI needn't be migrated
during vCPU migration. Patch [5/6] solves this by introducing a new flag to
indicate that the pt-irq is delivered through VT-d PI.

6. We didn't change the IRTE to remapping format when the pt-irq is configured
to have multiple destination vCPUs. Patch [6/6] resolves this problem.

Chao Gao (4):
  VT-d: Introduce new fields in msi_desc to track binding with guest
interrupt
  VT-d: introduce update_irte to update irte safely
  passthrough/io: don't migrate pirq when it is delivered through VT-d PI
  passthrough/io: Fall back to remapping interrupt when we can't use
VT-d PI

Feng Wu (2):
  VT-d: Some cleanups
  VMX: Fixup PI descriptor when cpu is offline

 xen/arch/x86/hvm/hvm.c |   3 +
 xen/arch/x86/hvm/vmx/vmcs.c|   1 +
 xen/arch/x86/hvm/vmx/vmx.c |  70 
 xen/arch/x86/msi.c |   1 +
 xen/drivers/passthrough/io.c   |  22 ++--
 xen/drivers/passthrough/vtd/intremap.c | 196 -
 xen/include/asm-x86/hvm/vmx/vmx.h  |   1 +
 xen/include/asm-x86/msi.h  |   2 +
 xen/include/xen/hvm/irq.h  |   1 +
 9 files changed, 161 insertions(+), 136 deletions(-)

-- 
1.8.3.1




[Xen-devel] [PATCH v10 3/6] VMX: Fixup PI descriptor when cpu is offline

2017-03-15 Thread Chao Gao
From: Feng Wu 

When a cpu goes offline, we need to move all the vcpus on its blocking
list to another online cpu. This patch handles that.

Signed-off-by: Feng Wu 
Signed-off-by: Chao Gao 
Reviewed-by: Jan Beulich 
Acked-by: Kevin Tian 
---
v7: 
- Pass unsigned int to vmx_pi_desc_fixup()

v6: 
- Carefully suppress 'SN' to avoid missing notification event
during moving the vcpu to the new list

v5: 
- Add some comments to explain why it doesn't cause deadlock
for the ABBA deadlock scenario. 

v4: 
- Remove the pointless check since we are in machine stop
context and no other cpus go down in parallel.

 xen/arch/x86/hvm/vmx/vmcs.c   |  1 +
 xen/arch/x86/hvm/vmx/vmx.c| 70 +++
 xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
 3 files changed, 72 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 0c1b711..b7f6a5e 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -591,6 +591,7 @@ void vmx_cpu_dead(unsigned int cpu)
     vmx_free_vmcs(per_cpu(vmxon_region, cpu));
     per_cpu(vmxon_region, cpu) = 0;
     nvmx_cpu_dead(cpu);
+    vmx_pi_desc_fixup(cpu);
 }
 
 int vmx_cpu_up(void)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 894d7d4..dee0463 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -199,6 +199,76 @@ static void vmx_pi_do_resume(struct vcpu *v)
 vmx_pi_unblock_vcpu(v);
 }
 
+void vmx_pi_desc_fixup(unsigned int cpu)
+{
+    unsigned int new_cpu, dest;
+    unsigned long flags;
+    struct arch_vmx_struct *vmx, *tmp;
+    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
+
+    if ( !iommu_intpost )
+        return;
+
+    /*
+     * We are in the context of CPU_DEAD or CPU_UP_CANCELED notification,
+     * and it is impossible for a second CPU go down in parallel. So we
+     * can safely acquire the old cpu's lock and then acquire the new_cpu's
+     * lock after that.
+     */
+    spin_lock_irqsave(old_lock, flags);
+
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
+    {
+        /*
+         * Suppress notification or we may miss an interrupt when the
+         * target cpu is dying.
+         */
+        pi_set_sn(&vmx->pi_desc);
+
+        /*
+         * Check whether a notification is pending before doing the
+         * movement, if that is the case we need to wake up it directly
+         * other than moving it to the new cpu's list.
+         */
+        if ( pi_test_on(&vmx->pi_desc) )
+        {
+            list_del(&vmx->pi_blocking.list);
+            vmx->pi_blocking.lock = NULL;
+            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
+        }
+        else
+        {
+            /*
+             * We need to find an online cpu as the NDST of the PI descriptor, it
+             * doesn't matter whether it is within the cpupool of the domain or
+             * not. As long as it is online, the vCPU will be woken up once the
+             * notification event arrives.
+             */
+            new_cpu = cpumask_any(&cpu_online_map);
+            new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
+
+            spin_lock(new_lock);
+
+            ASSERT(vmx->pi_blocking.lock == old_lock);
+
+            dest = cpu_physical_id(new_cpu);
+            write_atomic(&vmx->pi_desc.ndst,
+                         x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+
+            list_move(&vmx->pi_blocking.list,
+                      &per_cpu(vmx_pi_blocking, new_cpu).list);
+            vmx->pi_blocking.lock = new_lock;
+
+            spin_unlock(new_lock);
+        }
+
+        pi_clear_sn(&vmx->pi_desc);
+    }
+
+    spin_unlock_irqrestore(old_lock, flags);
+}
+
 /*
  * To handle posted interrupts correctly, we need to set the following
  * state:
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 2b781ab..5ead57c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -597,6 +597,7 @@ void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
 void vmx_pi_per_cpu_init(unsigned int cpu);
+void vmx_pi_desc_fixup(unsigned int cpu);
 
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
-- 
1.8.3.1
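
The NDST value written in vmx_pi_desc_fixup() depends on the APIC mode: in
x2APIC mode the destination APIC ID is used as is, while in xAPIC mode it is
placed in bits 15:8 of the field. A standalone sketch of that encoding
follows. The macro definitions are assumed to mirror Xen's MASK_INSR() and
PI_xAPIC_NDST_MASK and should be checked against the tree; pi_ndst_value() is
a hypothetical helper that exists only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match Xen's definitions; verify against the source tree. */
#define MASK_INSR(v, m)     (((v) * ((m) & -(m))) & (m))
#define PI_xAPIC_NDST_MASK  0xFF00u

/*
 * Hypothetical helper: compute the NDST field for a destination APIC ID.
 * (m & -m) isolates the lowest set bit of the mask, so multiplying by it
 * shifts the value into place before the mask truncates it.
 */
static uint32_t pi_ndst_value(uint32_t apic_id, bool x2apic)
{
    return x2apic ? apic_id : MASK_INSR(apic_id, PI_xAPIC_NDST_MASK);
}
```

For APIC ID 3 this gives 0x300 in xAPIC mode and 3 in x2APIC mode.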



