[Xen-devel] [linux-linus test] 116119: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116119 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116119/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qcow2   19 guest-start/debian.repeat fail REGR. vs. 115643
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 115643

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail like 115643
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail like 115643
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail like 115643
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 115643
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115643
 test-armhf-armhf-libvirt 14 saverestore-support-check fail like 115643
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115643
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 115643
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 115643
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 115643
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 115643
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail like 115643
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux b39545684a90ef3374abc0969d64c7bc540d128d
baseline version:
 linux e4880bc5dfb1f02b152e62a894b5c6f3e995b3cf

Last test of basis   115643  2017-11-07 12:06:20 Z    5 days
Failing since        115658  2017-11-08 02:33:06 Z    5 days    8 attempts
Testing same since   116103  2017-11-12 07:16:53 Z    1 days    2 attempts


People who touched revisions under test:
  Andrew Morton 
  Andrey Konovalov 
  Arnd Bergmann 
  Arvind Yadav 
  Benjamin Tissoires 
  Bjorn Andersson 
  Bjorn Helgaas 
  Bjørn Mork 
  Borislav Petkov 
  Chris Redpath 

Re: [Xen-devel] Reply: Re: Help:Can xen restore several snapshots more faster at same time?

2017-11-13 Thread Wei Liu
Please avoid top-posting.

On Mon, Nov 13, 2017 at 08:25:16AM +, Chenjia (C) wrote:
> 1.   Is there some way to improve the xenstored process performance?
> 

The latest versions of Cxenstored and Oxenstored have improved
transaction handling. I'm not sure which version you're using.

> 2.   We also tried the virsh tool to restore several Xen guests at the same
> time; we found that virsh can restore 40+ snapshots (1G per snapshot) at the
> same time, and the performance is good when the snapshots are all in a
> ramdisk. But we can’t keep all the snapshots in a ramdisk all the time
> because they are too big. Is there some way to reduce the space these virsh
> snapshots take? (These snapshots are the same: same OS, same config, but
> they need different IP addresses.)

For libvirt questions you need to ask on the libvirt list.

> 
> Would you please offer us some feedback? Thanks.
> By the way, can we talk in Chinese if possible?

Sorry, but communication on the mailing list needs to be in English so other
people can join the discussion and/or provide suggestions.

> 
> From: HUANG SHENGQIANG
> Sent: 6 November 2017 18:32
> To: Chenjia (C)
> Subject: FW: Re: [Xen-devel] Help:Can xen restore several snapshots more faster at 
> same time?
> 
> --
> HUANG SHENGQIANG HUANG SHENGQIANG
> M: 
> +86-18201587800 (preferred) / +1-6046180423 (overseas business travel)
> E: huang.shengqi...@huawei.com
> Products & Solutions-Switch & Enterprise Gateway Solution Architecture & 
> Design Dept
> From:Wei Liu
> To:HUANG SHENGQIANG,
> Cc:xen-de...@lists.xenproject.org,Wangjianbing,Wei Liu,
> Date:2017-11-06 18:28:58
> Subject:Re: [Xen-devel] Help:Can xen restore several snapshots more faster at 
> same time?
> 
> On Mon, Nov 06, 2017 at 04:38:51AM +, HUANG SHENGQIANG wrote:
> > Dear XEN expert,
> >
> > I find a blocker issue in my project. Would you please offer us some 
> > feedback?
> >
> > The description from my development team:
> > we need to restore as many xen snapshots as possible at the same time, but we 
> > found the ‘xl restore’ command works linearly: if we want to restore a new xen 
> > snapshot, we need to wait for the previous snapshot to finish its work. We 
> > tried to debug the xl source and found the following information:
> > [cid:image001.png@01D356F6.B8EE87E0]
> >
> 
> Please don't send pictures.
> 
> > When a snapshot is being restored, we can see that another process is blocked.  
> > We tried deleting the acquire_lock from the source code; then we see all 
> > the snapshots being restored at the same time, but restore is still very 
> > slow: one snapshot needs about 25 seconds to finish restoring (our 
> > environment is cpu E52620, 256G memory, SSD hard disk; the snapshot is 
> > Win7 OS with 2G memory).
> >
> 
> There is a lock in xl as you can see in the stack trace.
> 
> > So, does xen have a way to restore faster when several snapshots are 
> > restored at the same time? Why can KVM restore several snapshots quickly at 
> > the same time? (We tried the same experiment in KVM, and we found we can 
> > restore about 50+ snapshots in 20s.)
> >
> 
> Part of the toolstack needs to be reworked. You can start by removing
> the lock in xl to see what breaks.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH RFC v3 6/6] KVM guest: introduce smart idle poll algorithm

2017-11-13 Thread Quan Xu
From: Yang Zhang 

Use smart idle polling to reduce useless polling when the system is idle.

Signed-off-by: Quan Xu 
Signed-off-by: Yang Zhang 
Cc: Paolo Bonzini 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: k...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/kernel/kvm.c |   47 +++
 1 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 2a6e402..8bb6d55 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -365,11 +366,57 @@ static void kvm_guest_cpu_init(void)
kvm_register_steal_time();
 }
 
+static unsigned int grow_poll_ns(unsigned int old, unsigned int grow,
+unsigned int max)
+{
+   unsigned int val;
+
+   /* set base poll time to 1ns */
+   if (old == 0 && grow)
+   return 1;
+
+   val = old * grow;
+   if (val > max)
+   val = max;
+
+   return val;
+}
+
+static unsigned int shrink_poll_ns(unsigned int old, unsigned int shrink)
+{
+   if (shrink == 0)
+   return 0;
+
+   return old / shrink;
+}
+
+static void kvm_idle_update_poll_duration(ktime_t idle)
+{
+   unsigned long poll_duration = this_cpu_read(poll_duration_ns);
+
+   /* so far poll duration is based on nohz */
+   if (idle == -1ULL)
+   return;
+
+   if (poll_duration && idle > paravirt_poll_threshold_ns)
+   poll_duration = shrink_poll_ns(poll_duration,
+  paravirt_poll_shrink);
+   else if (poll_duration < paravirt_poll_threshold_ns &&
+idle < paravirt_poll_threshold_ns)
+   poll_duration = grow_poll_ns(poll_duration, paravirt_poll_grow,
+paravirt_poll_threshold_ns);
+
+   this_cpu_write(poll_duration_ns, poll_duration);
+}
+
 static void kvm_idle_poll(void)
 {
unsigned long poll_duration = this_cpu_read(poll_duration_ns);
+   ktime_t idle = tick_nohz_get_last_idle_length();
ktime_t start, cur, stop;
 
+   kvm_idle_update_poll_duration(idle);
+
start = cur = ktime_get();
stop = ktime_add_ns(ktime_get(), poll_duration);
 
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Wanpeng Li
2017-11-13 18:53 GMT+08:00 Juergen Gross :
> On 13/11/17 11:06, Quan Xu wrote:
>> From: Quan Xu 
>>
>> So far, pv_idle_ops.poll is the only op for pv_idle. .poll is called
>> in the idle path and will poll for a while before we enter the real idle
>> state.
>>
>> In virtualization, the idle path includes several heavy operations,
>> including timer access (LAPIC timer or TSC deadline timer), which
>> hurt performance, especially for latency-intensive workloads like
>> message-passing tasks. The cost mainly comes from the vmexit, which is a
>> hardware context switch between the virtual machine and the hypervisor.
>> Our solution is to poll for a while and not enter the real idle path if
>> we can get a schedule event during polling.
>>
>> Polling may waste CPU, so we adopt a smart polling mechanism to
>> reduce useless polling.
>>
>> Signed-off-by: Yang Zhang 
>> Signed-off-by: Quan Xu 
>> Cc: Juergen Gross 
>> Cc: Alok Kataria 
>> Cc: Rusty Russell 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: "H. Peter Anvin" 
>> Cc: x...@kernel.org
>> Cc: virtualizat...@lists.linux-foundation.org
>> Cc: linux-ker...@vger.kernel.org
>> Cc: xen-de...@lists.xenproject.org
>
> Hmm, is the idle entry path really so critical to performance that a new
> pvops function is necessary? Wouldn't a function pointer, maybe guarded
> by a static key, be enough? A further advantage would be that this would
> work on other architectures, too.

There is a "Adaptive halt-polling" which are merged to upstream more
than two years ago avoids to thread the critical path and has already
been ported to other architectures. https://lkml.org/lkml/2015/9/3/615
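
For reference, a rough sketch of the idea behind that host-side adaptive
halt-polling; the exact policy and constants in virt/kvm/kvm_main.c differ,
and the helper below is purely illustrative, not the kernel's code:

/*
 * Rough illustration only: the host polls for up to poll_ns before
 * scheduling the vCPU out, and adapts poll_ns based on how long the
 * vCPU actually ended up blocked (block_ns).
 */
static unsigned long adapt_halt_poll_ns(unsigned long poll_ns,
					unsigned long block_ns,
					unsigned long grow, unsigned long shrink,
					unsigned long max_ns)
{
	if (block_ns > max_ns) {
		/* Polling would not have helped: shrink (or disable) the window. */
		poll_ns = shrink ? poll_ns / shrink : 0;
	} else if (poll_ns < block_ns) {
		/* A slightly longer poll would have avoided the halt: grow. */
		poll_ns = poll_ns ? poll_ns * grow : 10000;
		if (poll_ns > max_ns)
			poll_ns = max_ns;
	}
	return poll_ns;
}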

Regards,
Wanpeng Li

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [qemu-upstream-unstable test] 116118: FAIL

2017-11-13 Thread Julien Grall

Hi,

On 11/13/2017 11:53 AM, Wei Liu wrote:

On Mon, Nov 13, 2017 at 11:52:12AM +, Julien Grall wrote:

Hi,

On 11/13/2017 06:44 AM, osstest service owner wrote:

flight 116118 qemu-upstream-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116118/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
   build-armhf-pvops broken in 115713
   build-armhf-pvops 4 host-install(4) broken in 115713 REGR. vs. 114457


Looking at the test result, build-armhf-pvops seems to have passed nicely
the past few weeks but one time.

So I am not sure why we are blocking here. Mostly the  one.


The host-install failure is probably not related to a change in the code. It is
trying to install a host to do the test; in this case, to build the kernel.


They have been failing with that for quite a while now. It looks like a 
force push would be justified here. Any opinions?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -Original Message-
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 11:54
> To: Paul Durrant 
> Cc: net...@vger.kernel.org; Wei Liu ; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On 11/13/2017 10:33 AM, Paul Durrant wrote:
> >> -Original Message-
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> To: net...@vger.kernel.org
> >> Cc: Joao Martins ; Wei Liu
> >> ; Paul Durrant ; xen-
> >> de...@lists.xenproject.org
> >> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> >> configurable
> >>
> >> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> >> handling and as a result decreased max grant copy ops from 4352 to 64.
> >> Before this commit it would drain the rx_queue (while there are
> >> enough slots in the ring to put packets) then copy to all pages and write
> >> responses on the ring. With the refactor we do almost the same albeit
> >> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> >>
> >> For big packets, the value of 64 means copying 3 packets best case
> scenario
> >> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> >> plus head cross the 4k grant boundary) which could be the case when
> >> packets go from local backend process.
> >>
> >> Instead of making it static to 64 grant copies, let's allow the user to
> >> select its value (while keeping the current as default) by introducing
> >> the `copy_batch_size` module parameter. This allows users to select
> >> the higher batches (i.e. for better throughput with big packets) as it
> >> was prior to the above mentioned commit.
> >>
> >> Signed-off-by: Joao Martins 
> >> ---
> >>  drivers/net/xen-netback/common.h|  6 --
> >>  drivers/net/xen-netback/interface.c | 25
> -
> >>  drivers/net/xen-netback/netback.c   |  5 +
> >>  drivers/net/xen-netback/rx.c|  5 -
> >>  4 files changed, 37 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> >> netback/common.h
> >> index a46a1e94505d..a5fe36e098a7 100644
> >> --- a/drivers/net/xen-netback/common.h
> >> +++ b/drivers/net/xen-netback/common.h
> >> @@ -129,8 +129,9 @@ struct xenvif_stats {
> >>  #define COPY_BATCH_SIZE 64
> >>
> >>  struct xenvif_copy_state {
> >> -  struct gnttab_copy op[COPY_BATCH_SIZE];
> >> -  RING_IDX idx[COPY_BATCH_SIZE];
> >> +  struct gnttab_copy *op;
> >> +  RING_IDX *idx;
> >> +  unsigned int size;
> >
> > Could you name this batch_size, or something like that to make it clear
> what it means?
> >
> Yeap, will change it.
> 
> >>unsigned int num;
> >>struct sk_buff_head *completed;
> >>  };
> >> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
> >>  extern unsigned int rx_stall_timeout_msecs;
> >>  extern unsigned int xenvif_max_queues;
> >>  extern unsigned int xenvif_hash_cache_size;
> >> +extern unsigned int xenvif_copy_batch_size;
> >>
> >>  #ifdef CONFIG_DEBUG_FS
> >>  extern struct dentry *xen_netback_dbg_root;
> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> >> netback/interface.c
> >> index d6dff347f896..a558868a883f 100644
> >> --- a/drivers/net/xen-netback/interface.c
> >> +++ b/drivers/net/xen-netback/interface.c
> >> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> >> domid_t domid,
> >>
> >>  int xenvif_init_queue(struct xenvif_queue *queue)
> >>  {
> >> +  int size = xenvif_copy_batch_size;
> >
> > unsigned int
> >>>   int err, i;
> >> +  void *addr;
> >> +
> >> +  addr = vzalloc(size * sizeof(struct gnttab_copy));
> >
> > Does the memory need to be zeroed?
> >
> It doesn't need to be, but given that xenvif_queue is zeroed (which includes this
> region) I thought I would leave it the same way.

Ok.

> 
> >> +  if (!addr)
> >> +  goto err;
> >> +  queue->rx_copy.op = addr;
> >> +
> >> +  addr = vzalloc(size * sizeof(RING_IDX));
> >
> > Likewise.
> >
> >> +  if (!addr)
> >> +  goto err;
> >> +  queue->rx_copy.idx = addr;
> >> +  queue->rx_copy.size = size;
> >>
> >>queue->credit_bytes = queue->remaining_credit = ~0UL;
> >>queue->credit_usec  = 0UL;
> >> @@ -544,7 +557,7 @@ int xenvif_init_queue(struct xenvif_queue
> *queue)
> >> queue->mmap_pages);
> >>if (err) {
> >>netdev_err(queue->vif->dev, "Could not reserve
> >> mmap_pages\n");
> >> -  return -ENOMEM;
> >> +  goto err;
> >>}
> >>
> >>for (i = 0; i < MAX_PENDING_REQS; i++) {
> >> @@ -556,6 +569,13 @@ int xenvif_init_queue(struct xenvif_queue
> *queue)
> >>}
> >>
> >>return 0;
> >> +
> >> +err:
> >> +  if (queue->rx_copy.op)
> >> +  

[Xen-devel] Reply: Re: Help:Can xen restore several snapshots more faster at same time?

2017-11-13 Thread Chenjia (C)
Dear XEN expert:
   For the last question, we tried the following steps:

1)  We removed the lock acquire_lock in xl_cmdimpl.c; then we found the bottleneck 
is the IO.

2)  We moved all 40 Win7 snapshots to the ramdisk (40G in total, 1G per 
snapshot); then we found that when we restore all 40 snapshots at the same time, the 
new bottleneck is the xenstored process: the xenstored process works with 1 
thread, and the CPU which runs that thread is always at 100%.

3)  We started the xenstored process with the ‘-I’ argument; the xenstored process 
is still very busy.
So, we have 2 questions:

1.   Is there some way to improve the xenstored process performance?

2.   We also tried the virsh tool to restore several Xen guests at the same time; we 
found that virsh can restore 40+ snapshots (1G per snapshot) at the same time, and the 
performance is good when the snapshots are all in a ramdisk. But we can’t keep all the 
snapshots in a ramdisk all the time because they are too big. Is there some way to 
reduce the space these virsh snapshots take? (These snapshots are the same: same OS, 
same config, but they need different IP addresses.)

Would you please offer us some feedback? Thanks.
By the way, can we talk in Chinese if possible?

From: HUANG SHENGQIANG
Sent: 6 November 2017 18:32
To: Chenjia (C)
Subject: FW: Re: [Xen-devel] Help:Can xen restore several snapshots more faster at 
same time?

--
HUANG SHENGQIANG HUANG SHENGQIANG
M: 
+86-18201587800 (preferred) / +1-6046180423 (overseas business travel)
E: huang.shengqi...@huawei.com
Products & Solutions-Switch & Enterprise Gateway Solution Architecture & Design 
Dept
From:Wei Liu
To:HUANG SHENGQIANG,
Cc:xen-de...@lists.xenproject.org,Wangjianbing,Wei Liu,
Date:2017-11-06 18:28:58
Subject:Re: [Xen-devel] Help:Can xen restore several snapshots more faster at 
same time?

On Mon, Nov 06, 2017 at 04:38:51AM +, HUANG SHENGQIANG wrote:
> Dear XEN expert,
>
> I find a blocker issue in my project. Would you please offer us some feedback?
>
> The description from my development team:
> we need to restore as many xen snapshots as possible at the same time, but we found 
> the ‘xl restore’ command works linearly: if we want to restore a new xen snapshot, 
> we need to wait for the previous snapshot to finish its work. We tried to debug 
> the xl source and found the following information:
> [cid:image001.png@01D356F6.B8EE87E0]
>

Please don't send pictures.

> When a snapshot is being restored, we can see that another process is blocked.  We 
> tried deleting the acquire_lock from the source code; then we see all the 
> snapshots being restored at the same time, but restore is still very slow: one 
> snapshot needs about 25 seconds to finish restoring (our environment is cpu 
> E52620, 256G memory, SSD hard disk; the snapshot is Win7 OS with 2G memory).
>

There is a lock in xl as you can see in the stack trace.

> So, does xen have a way to restore faster when several snapshots are 
> restored at the same time? Why can KVM restore several snapshots quickly at the same 
> time? (We tried the same experiment in KVM, and we found we can restore about 50+ 
> snapshots in 20s.)
>

Part of the toolstack needs to be reworked. You can start by removing
the lock in xl to see what breaks.
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().

2017-11-13 Thread Jan Beulich
>>> On 10.11.17 at 15:02,  wrote:
> On 11/10/2017 5:57 PM, Jan Beulich wrote:
> On 10.11.17 at 08:18,  wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned 
>>> long e, unsigned int nf)
>>>*/
>>>   if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) 
>>> != 0)) )
>>>   continue;
>>> +if ( locking )
+spin_lock(&map_pgdir_lock);
>>> +
>>> +/* L2E may be cleared on another CPU. */
>>> +if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
>> I think you also need a PSE check here, or else the l2e_to_l1e() below
>> may be illegal.
> 
> Hmm, interesting point, and thanks! :-)
> I did not check the PSE, because modify_xen_mappings() will not do the 
> re-consolidation, and
> concurrent invokes of this routine will not change this flag. But now I 
> believe this presumption
> shall not be made, because the paging structures may be modified by 
> other routines, like
> map_pages_to_xen() on other CPUs.
> 
> So yes, I think a _PAGE_PSE check is necessary here. And I suggest we 
> also check the _PAGE_PRESENT
> flag as well, for the re-consolidation part in my first patch for 
> map_pages_to_xen(). Do you agree?

Oh, yes, definitely. I should have noticed this myself.
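
For illustration, a minimal sketch of the combined re-check being discussed,
taken under the lock; this is not the actual patch, and the eventual code in
modify_xen_mappings() may differ in detail:

    if ( locking )
        spin_lock(&map_pgdir_lock);

    /*
     * Re-check under the lock: the L2E may have been cleared, or
     * re-consolidated into a superpage, by a concurrent
     * map_pages_to_xen() on another CPU.
     */
    if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
         (l2e_get_flags(*pl2e) & _PAGE_PSE) )
    {
        if ( locking )
            spin_unlock(&map_pgdir_lock);
        continue;
    }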

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 13 November 2017 10:50
> To: Paul Durrant 
> Cc: Wei Liu ; xen-de...@lists.xenproject.org; 'Joao
> Martins' ; net...@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch
> size configurable
> 
> >>> On 13.11.17 at 11:33,  wrote:
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -96,6 +96,11 @@ unsigned int xenvif_hash_cache_size =
> >> XENVIF_HASH_CACHE_SIZE_DEFAULT;
> >>  module_param_named(hash_cache_size, xenvif_hash_cache_size, uint,
> >> 0644);
> 
> Isn't the "owner-write" permission here ...
> 
> >> --- a/drivers/net/xen-netback/rx.c
> >> +++ b/drivers/net/xen-netback/rx.c
> >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> >> xenvif_queue *queue,
> >>   struct xen_netif_rx_request *req,
> >>   unsigned int offset, void *data, size_t len)
> >>  {
> >> +  unsigned int batch_size;
> >>struct gnttab_copy *op;
> >>struct page *page;
> >>struct xen_page_foreign *foreign;
> >>
> >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> >> +  batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> >
> > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> identical?
> > Why do you need this statement (and hence stack variable)?
> 
> ... the answer to your question?

Yes, I guess it could be... but since there's no re-alloc code for the arrays I 
wonder whether the intention was to make this dynamic or not.

  Paul

> 
> Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH RFC v3 0/6] x86/idle: add halt poll support

2017-11-13 Thread Quan Xu
From: Yang Zhang 

Some latency-intensive workloads have seen an obvious performance
drop when running inside a VM. The main reason is that the overhead
is amplified when running inside a VM. The biggest cost I have seen is
inside the idle path.

This series introduces a new mechanism to poll for a while before
entering the idle state. If a reschedule is needed during the poll, we
don't need to go through the heavy overhead path.
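
The core of the mechanism, condensed from the kvm_idle_poll() hunk introduced
in patch 2/6 of this series:

/* Spin for at most poll_duration_ns before entering the real idle path,
 * bailing out early if a reschedule becomes pending. */
static void kvm_idle_poll(void)
{
	unsigned long poll_duration = this_cpu_read(poll_duration_ns);
	ktime_t cur = ktime_get();
	ktime_t stop = ktime_add_ns(cur, poll_duration);

	do {
		if (need_resched())
			break;		/* got a schedule event while polling */
		cur = ktime_get();
	} while (ktime_before(cur, stop));
}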

Here is the data we get when running benchmark contextswitch to measure
the latency(lower is better):

   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
 3402.9 ns/ctxsw -- 199.8 %CPU

   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
  halt_poll_threshold=1  -- 1151.4 ns/ctxsw -- 200.1 %CPU
  halt_poll_threshold=2  -- 1149.7 ns/ctxsw -- 199.9 %CPU
  halt_poll_threshold=3  -- 1151.0 ns/ctxsw -- 199.9 %CPU
  halt_poll_threshold=4  -- 1155.4 ns/ctxsw -- 199.3 %CPU
  halt_poll_threshold=5  -- 1161.0 ns/ctxsw -- 200.0 %CPU
  halt_poll_threshold=10 -- 1163.8 ns/ctxsw -- 200.4 %CPU
  halt_poll_threshold=30 -- 1159.4 ns/ctxsw -- 201.9 %CPU
  halt_poll_threshold=50 -- 1163.5 ns/ctxsw -- 205.5 %CPU

   3. w/ kvm dynamic poll:
  halt_poll_ns=1  -- 3470.5 ns/ctxsw -- 199.6 %CPU
  halt_poll_ns=2  -- 3273.0 ns/ctxsw -- 199.7 %CPU
  halt_poll_ns=3  -- 3628.7 ns/ctxsw -- 199.4 %CPU
  halt_poll_ns=4  -- 2280.6 ns/ctxsw -- 199.5 %CPU
  halt_poll_ns=5  -- 3200.3 ns/ctxsw -- 199.7 %CPU
  halt_poll_ns=10 -- 2186.6 ns/ctxsw -- 199.6 %CPU
  halt_poll_ns=30 -- 3178.7 ns/ctxsw -- 199.6 %CPU
  halt_poll_ns=50 -- 3505.4 ns/ctxsw -- 199.7 %CPU

   4. w/patch and w/ kvm dynamic poll:

  halt_poll_ns=1 & halt_poll_threshold=1  -- 1155.5 ns/ctxsw -- 199.8 %CPU
  halt_poll_ns=1 & halt_poll_threshold=2  -- 1165.6 ns/ctxsw -- 199.8 %CPU
  halt_poll_ns=1 & halt_poll_threshold=3  -- 1161.1 ns/ctxsw -- 200.0 %CPU

  halt_poll_ns=2 & halt_poll_threshold=1  -- 1158.1 ns/ctxsw -- 199.8 %CPU
  halt_poll_ns=2 & halt_poll_threshold=2  -- 1161.0 ns/ctxsw -- 199.7 %CPU
  halt_poll_ns=2 & halt_poll_threshold=3  -- 1163.7 ns/ctxsw -- 199.9 %CPU

  halt_poll_ns=3 & halt_poll_threshold=1  -- 1158.7 ns/ctxsw -- 199.7 %CPU
  halt_poll_ns=3 & halt_poll_threshold=2  -- 1153.8 ns/ctxsw -- 199.8 %CPU
  halt_poll_ns=3 & halt_poll_threshold=3  -- 1155.1 ns/ctxsw -- 199.8 %CPU

   5. idle=poll
  3957.57 ns/ctxsw --  999.4%CPU

Here is the data we get when running benchmark netperf:

   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
  29031.6 bit/s -- 76.1 %CPU

   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
  halt_poll_threshold=1  -- 29021.7 bit/s -- 105.1 %CPU
  halt_poll_threshold=2  -- 33463.5 bit/s -- 128.2 %CPU
  halt_poll_threshold=3  -- 34436.4 bit/s -- 127.8 %CPU
  halt_poll_threshold=4  -- 35563.3 bit/s -- 129.6 %CPU
  halt_poll_threshold=5  -- 35787.7 bit/s -- 129.4 %CPU
  halt_poll_threshold=10 -- 35477.7 bit/s -- 130.0 %CPU
  halt_poll_threshold=30 -- 35730.0 bit/s -- 132.4 %CPU
  halt_poll_threshold=50 -- 34978.4 bit/s -- 134.2 %CPU

   3. w/ kvm dynamic poll:
  halt_poll_ns=1  -- 28849.8 bit/s -- 75.2  %CPU
  halt_poll_ns=2  -- 29004.8 bit/s -- 76.1  %CPU
  halt_poll_ns=3  -- 35662.0 bit/s -- 199.7 %CPU
  halt_poll_ns=4  -- 35874.8 bit/s -- 187.5 %CPU
  halt_poll_ns=5  -- 35603.1 bit/s -- 199.8 %CPU
  halt_poll_ns=10 -- 35588.8 bit/s -- 200.0 %CPU
  halt_poll_ns=30 -- 35912.4 bit/s -- 200.0 %CPU
  halt_poll_ns=50 -- 35735.6 bit/s -- 200.0 %CPU

   4. w/patch and w/ kvm dynamic poll:

  halt_poll_ns=1 & halt_poll_threshold=1  -- 29427.9 bit/s -- 107.8 %CPU
  halt_poll_ns=1 & halt_poll_threshold=2  -- 33048.4 bit/s -- 128.1 %CPU
  halt_poll_ns=1 & halt_poll_threshold=3  -- 35129.8 bit/s -- 129.1 %CPU

  halt_poll_ns=2 & halt_poll_threshold=1  -- 31091.3 bit/s -- 130.3 %CPU
  halt_poll_ns=2 & halt_poll_threshold=2  -- 33587.9 bit/s -- 128.9 %CPU
  halt_poll_ns=2 & halt_poll_threshold=3  -- 35532.9 bit/s -- 129.1 %CPU

  halt_poll_ns=3 & halt_poll_threshold=1  -- 35633.1 bit/s -- 199.4 %CPU
  halt_poll_ns=3 & halt_poll_threshold=2  -- 42225.3 bit/s -- 198.7 %CPU
  halt_poll_ns=3 & halt_poll_threshold=3  -- 42210.7 bit/s -- 200.3 %CPU

   5. idle=poll
  37081.7 bit/s -- 998.1 %CPU

---
V2 -> V3:
- move poll update into arch/. in v3, poll update is based on duration of the
  last idle loop which is from tick_nohz_idle_enter to tick_nohz_idle_exit,
  and try our best not to interfere with scheduler/idle code. (This seems
  not to follow Peter's v2 comment, however we had a 

[Xen-devel] [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path

2017-11-13 Thread Quan Xu
From: Yang Zhang 

Implement a generic idle poll which resembles the functionality
found in arch/. Provide a weak arch_cpu_idle_poll() function which
can be overridden by the architecture code if needed.

Interrupts may arrive in idle loops without causing a reschedule.
In a KVM guest, this costs several VM-exit/VM-entry cycles: a VM-entry
for the interrupt and a VM-exit immediately afterwards, which is more
expensive than on bare metal. Add a generic idle poll before entering
the real idle path. When a reschedule event is pending, we can bypass
the real idle path.

Signed-off-by: Quan Xu 
Signed-off-by: Yang Zhang 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Kyle Huey 
Cc: Len Brown 
Cc: Andy Lutomirski 
Cc: Tom Lendacky 
Cc: Tobias Klauser 
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/kernel/process.c |7 +++
 kernel/sched/idle.c   |2 ++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c676853..f7db8b5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -333,6 +333,13 @@ void arch_cpu_idle(void)
x86_idle();
 }
 
+#ifdef CONFIG_PARAVIRT
+void arch_cpu_idle_poll(void)
+{
+   paravirt_idle_poll();
+}
+#endif
+
 /*
  * We use this if we don't have any better idle routine..
  */
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 257f4f0..df7c422 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -74,6 +74,7 @@ static noinline int __cpuidle cpu_idle_poll(void)
 }
 
 /* Weak implementations for optional arch specific functions */
+void __weak arch_cpu_idle_poll(void) { }
 void __weak arch_cpu_idle_prepare(void) { }
 void __weak arch_cpu_idle_enter(void) { }
 void __weak arch_cpu_idle_exit(void) { }
@@ -219,6 +220,7 @@ static void do_idle(void)
 */
 
__current_set_polling();
+   arch_cpu_idle_poll();
quiet_vmstat();
tick_nohz_idle_enter();
 
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Quan Xu
From: Quan Xu 

So far, pv_idle_ops.poll is the only op for pv_idle. .poll is called
in the idle path and will poll for a while before we enter the real idle
state.

In virtualization, the idle path includes several heavy operations,
including timer access (LAPIC timer or TSC deadline timer), which
hurt performance, especially for latency-intensive workloads like
message-passing tasks. The cost mainly comes from the vmexit, which is a
hardware context switch between the virtual machine and the hypervisor.
Our solution is to poll for a while and not enter the real idle path if
we can get a schedule event during polling.

Polling may waste CPU, so we adopt a smart polling mechanism to
reduce useless polling.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/include/asm/paravirt.h   |5 +
 arch/x86/include/asm/paravirt_types.h |6 ++
 arch/x86/kernel/paravirt.c|6 ++
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index fd81228..3c83727 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -198,6 +198,11 @@ static inline unsigned long long paravirt_read_pmc(int 
counter)
 
 #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
 
+static inline void paravirt_idle_poll(void)
+{
+   PVOP_VCALL0(pv_idle_ops.poll);
+}
+
 static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned 
entries)
 {
PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
diff --git a/arch/x86/include/asm/paravirt_types.h 
b/arch/x86/include/asm/paravirt_types.h
index 10cc3b9..95c0e3e 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -313,6 +313,10 @@ struct pv_lock_ops {
struct paravirt_callee_save vcpu_is_preempted;
 } __no_randomize_layout;
 
+struct pv_idle_ops {
+   void (*poll)(void);
+} __no_randomize_layout;
+
 /* This contains all the paravirt structures: we get a convenient
  * number for each function using the offset which we use to indicate
  * what to patch. */
@@ -323,6 +327,7 @@ struct paravirt_patch_template {
struct pv_irq_ops pv_irq_ops;
struct pv_mmu_ops pv_mmu_ops;
struct pv_lock_ops pv_lock_ops;
+   struct pv_idle_ops pv_idle_ops;
 } __no_randomize_layout;
 
 extern struct pv_info pv_info;
@@ -332,6 +337,7 @@ struct paravirt_patch_template {
 extern struct pv_irq_ops pv_irq_ops;
 extern struct pv_mmu_ops pv_mmu_ops;
 extern struct pv_lock_ops pv_lock_ops;
+extern struct pv_idle_ops pv_idle_ops;
 
 #define PARAVIRT_PATCH(x)  \
(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 19a3e8f..67cab22 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -128,6 +128,7 @@ unsigned paravirt_patch_jmp(void *insnbuf, const void 
*target,
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
.pv_lock_ops = pv_lock_ops,
 #endif
+   .pv_idle_ops = pv_idle_ops,
};
	return *((void **)&tmpl + type);
 }
@@ -312,6 +313,10 @@ struct pv_time_ops pv_time_ops = {
.steal_clock = native_steal_clock,
 };
 
+struct pv_idle_ops pv_idle_ops = {
+   .poll = paravirt_nop,
+};
+
 __visible struct pv_irq_ops pv_irq_ops = {
.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
@@ -463,3 +468,4 @@ struct pv_mmu_ops pv_mmu_ops __ro_after_init = {
 EXPORT_SYMBOL(pv_mmu_ops);
 EXPORT_SYMBOL_GPL(pv_info);
 EXPORT_SYMBOL(pv_irq_ops);
+EXPORT_SYMBOL(pv_idle_ops);
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH RFC v3 2/6] KVM guest: register kvm_idle_poll for pv_idle_ops

2017-11-13 Thread Quan Xu
From: Quan Xu 

Although smart idle polling has nothing to do with paravirt, it
cannot bring any benefit to native, so we only enable it when Linux
runs as a KVM guest (it could also be extended to other hypervisors
like Xen, Hyper-V and VMware).

Introduce the per-CPU variable poll_duration_ns to control the max
poll time.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Paolo Bonzini 
Cc: "Radim Krčmář" 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: k...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/kernel/kvm.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594..2a6e402 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -75,6 +75,7 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
+static DEFINE_PER_CPU(unsigned long, poll_duration_ns);
 static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
 static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
@@ -364,6 +365,29 @@ static void kvm_guest_cpu_init(void)
kvm_register_steal_time();
 }
 
+static void kvm_idle_poll(void)
+{
+   unsigned long poll_duration = this_cpu_read(poll_duration_ns);
+   ktime_t start, cur, stop;
+
+   start = cur = ktime_get();
+   stop = ktime_add_ns(ktime_get(), poll_duration);
+
+   do {
+   if (need_resched())
+   break;
+   cur = ktime_get();
+   } while (ktime_before(cur, stop));
+}
+
+static void kvm_guest_idle_init(void)
+{
+   if (!kvm_para_available())
+   return;
+
+   pv_idle_ops.poll = kvm_idle_poll;
+}
+
 static void kvm_pv_disable_apf(void)
 {
if (!__this_cpu_read(apf_reason.enabled))
@@ -499,6 +523,8 @@ void __init kvm_guest_init(void)
kvm_guest_cpu_init();
 #endif
 
+   kvm_guest_idle_init();
+
/*
 * Hard lockup detection is enabled by default. Disable it, as guests
 * can get false positives too easily, for example if the host is
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Xen 4.10 RC4

2017-11-13 Thread Julien Grall

Hi all,

Xen 4.10 RC4 is tagged. You can check that out from xen.git:

  git://xenbits.xen.org/xen.git 4.10.0-rc4

For your convenience there is also a tarball at:
https://downloads.xenproject.org/release/xen/4.10.0-rc4/xen-4.10.0-rc4.tar.gz

And the signature is at:
https://downloads.xenproject.org/release/xen/4.10.0-rc4/xen-4.10.0-rc4.tar.gz.sig

Please send bug reports and test reports to
xen-de...@lists.xenproject.org. When sending bug reports, please CC
relevant maintainers and me (julien.gr...@linaro.org).

Thanks,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [distros-debian-sid test] 72441: tolerable FAIL

2017-11-13 Thread Platform Team regression test user
flight 72441 distros-debian-sid real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72441/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-i386-sid-netboot-pvgrub 10 debian-di-install   fail like 72429
 test-armhf-armhf-armhf-sid-netboot-pygrub 10 debian-di-install fail like 72429
 test-amd64-i386-amd64-sid-netboot-pygrub 10 debian-di-install  fail like 72429
 test-amd64-amd64-amd64-sid-netboot-pvgrub 10 debian-di-install fail like 72429
 test-amd64-amd64-i386-sid-netboot-pygrub 10 debian-di-install  fail like 72429

baseline version:
 flight   72429

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-sid-netboot-pvgrubfail
 test-amd64-i386-i386-sid-netboot-pvgrub  fail
 test-amd64-i386-amd64-sid-netboot-pygrub fail
 test-armhf-armhf-armhf-sid-netboot-pygrubfail
 test-amd64-amd64-i386-sid-netboot-pygrub fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Jan Beulich
>>> On 13.11.17 at 11:33,  wrote:
>> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
>> Sent: 10 November 2017 19:35
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -96,6 +96,11 @@ unsigned int xenvif_hash_cache_size =
>> XENVIF_HASH_CACHE_SIZE_DEFAULT;
>>  module_param_named(hash_cache_size, xenvif_hash_cache_size, uint,
>> 0644);

Isn't the "owner-write" permission here ...

>> --- a/drivers/net/xen-netback/rx.c
>> +++ b/drivers/net/xen-netback/rx.c
>> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
>> xenvif_queue *queue,
>> struct xen_netif_rx_request *req,
>> unsigned int offset, void *data, size_t len)
>>  {
>> +unsigned int batch_size;
>>  struct gnttab_copy *op;
>>  struct page *page;
>>  struct xen_page_foreign *foreign;
>> 
>> -if (queue->rx_copy.num == COPY_BATCH_SIZE)
>> +batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> 
> Surely queue->rx_copy.size and xenvif_copy_batch_size are always identical? 
> Why do you need this statement (and hence stack variable)?

... the answer to your question?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [qemu-upstream-unstable test] 116118: FAIL

2017-11-13 Thread Anthony PERARD
On Mon, Nov 13, 2017 at 06:44:49AM +, osstest service owner wrote:
> flight 116118 qemu-upstream-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/116118/
> 
> Failures and problems with tests :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  build-armhf-pvops broken in 115713
>  build-armhf-pvops 4 host-install(4) broken in 115713 REGR. vs. 114457

This test passes in this flight. Why is the fact that it was broken in an
older flight blocking here?

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] docs: update hvmlite.markdown

2017-11-13 Thread Jan Beulich
>>> On 12.11.17 at 12:03,  wrote:
> --- a/docs/misc/hvmlite.markdown
> +++ b/docs/misc/hvmlite.markdown
> @@ -1,6 +1,3 @@
> -**NOTE**: this document will be merged into `pvh.markdown` once PVH is 
> replaced
> -with the HVMlite implementation.
> -

This being stale, wouldn't it then be better to rename the doc to
pvh.markdown at the same time? Either way
Acked-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -Original Message-
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 10 November 2017 19:35
> To: net...@vger.kernel.org
> Cc: Joao Martins ; Wei Liu
> ; Paul Durrant ; xen-
> de...@lists.xenproject.org
> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> handling and as a result decreased max grant copy ops from 4352 to 64.
> Before this commit it would drain the rx_queue (while there are
> enough slots in the ring to put packets) then copy to all pages and write
> responses on the ring. With the refactor we do almost the same albeit
> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> 
> For big packets, the value of 64 means copying 3 packets best case scenario
> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> plus head cross the 4k grant boundary) which could be the case when
> packets go from local backend process.
> 
> Instead of making it static to 64 grant copies, let's allow the user to
> select its value (while keeping the current as default) by introducing
> the `copy_batch_size` module parameter. This allows users to select
> the higher batches (i.e. for better throughput with big packets) as it
> was prior to the above mentioned commit.
> 
> Signed-off-by: Joao Martins 
> ---
>  drivers/net/xen-netback/common.h|  6 --
>  drivers/net/xen-netback/interface.c | 25 -
>  drivers/net/xen-netback/netback.c   |  5 +
>  drivers/net/xen-netback/rx.c|  5 -
>  4 files changed, 37 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index a46a1e94505d..a5fe36e098a7 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -129,8 +129,9 @@ struct xenvif_stats {
>  #define COPY_BATCH_SIZE 64
> 
>  struct xenvif_copy_state {
> - struct gnttab_copy op[COPY_BATCH_SIZE];
> - RING_IDX idx[COPY_BATCH_SIZE];
> + struct gnttab_copy *op;
> + RING_IDX *idx;
> + unsigned int size;

Could you name this batch_size, or something like that to make it clear what it 
means?

>   unsigned int num;
>   struct sk_buff_head *completed;
>  };
> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
>  extern unsigned int rx_stall_timeout_msecs;
>  extern unsigned int xenvif_max_queues;
>  extern unsigned int xenvif_hash_cache_size;
> +extern unsigned int xenvif_copy_batch_size;
> 
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *xen_netback_dbg_root;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index d6dff347f896..a558868a883f 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> domid_t domid,
> 
>  int xenvif_init_queue(struct xenvif_queue *queue)
>  {
> + int size = xenvif_copy_batch_size;

unsigned int

>   int err, i;
> + void *addr;
> +
> + addr = vzalloc(size * sizeof(struct gnttab_copy));

Does the memory need to be zeroed?

> + if (!addr)
> + goto err;
> + queue->rx_copy.op = addr;
> +
> + addr = vzalloc(size * sizeof(RING_IDX));

Likewise.

> + if (!addr)
> + goto err;
> + queue->rx_copy.idx = addr;
> + queue->rx_copy.size = size;
> 
>   queue->credit_bytes = queue->remaining_credit = ~0UL;
>   queue->credit_usec  = 0UL;
> @@ -544,7 +557,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>queue->mmap_pages);
>   if (err) {
>   netdev_err(queue->vif->dev, "Could not reserve
> mmap_pages\n");
> - return -ENOMEM;
> + goto err;
>   }
> 
>   for (i = 0; i < MAX_PENDING_REQS; i++) {
> @@ -556,6 +569,13 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>   }
> 
>   return 0;
> +
> +err:
> + if (queue->rx_copy.op)
> + vfree(queue->rx_copy.op);

vfree is safe to be called with NULL.

> + if (queue->rx_copy.idx)
> + vfree(queue->rx_copy.idx);
> + return -ENOMEM;
>  }
> 
>  void xenvif_carrier_on(struct xenvif *vif)
> @@ -788,6 +808,9 @@ void xenvif_disconnect_ctrl(struct xenvif *vif)
>   */
>  void xenvif_deinit_queue(struct xenvif_queue *queue)
>  {
> + vfree(queue->rx_copy.op);
> + vfree(queue->rx_copy.idx);
> + queue->rx_copy.size = 0;
>   gnttab_free_pages(MAX_PENDING_REQS, queue->mmap_pages);
>  }
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> index a27daa23c9dc..3a5e1d7ac2f4 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -96,6 +96,11 @@ unsigned int xenvif_hash_cache_size =
> 
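(The quoted diff is truncated here. For reference, a minimal sketch of how
such a module parameter is typically declared; the parameter and variable
names come from the patch, everything else is illustrative:)

/* Illustrative only -- not the exact hunk from the patch. */
unsigned int xenvif_copy_batch_size = COPY_BATCH_SIZE;
module_param_named(copy_batch_size, xenvif_copy_batch_size, uint, 0644);
MODULE_PARM_DESC(copy_batch_size,
		 "Maximum number of grant copy operations per RX batch");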

[Xen-devel] [PATCH RFC v3 5/6] tick: get duration of the last idle loop

2017-11-13 Thread Quan Xu
From: Quan Xu 

The last idle loop runs from tick_nohz_idle_enter() to tick_nohz_idle_exit().

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Frederic Weisbecker 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: linux-ker...@vger.kernel.org
---
 include/linux/tick.h |2 ++
 kernel/time/tick-sched.c |   11 +++
 kernel/time/tick-sched.h |3 +++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index cf413b3..77ae46d 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -118,6 +118,7 @@ enum tick_dep_bits {
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
+extern ktime_t tick_nohz_get_last_idle_length(void);
 extern unsigned long tick_nohz_get_idle_calls(void);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
 extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
@@ -127,6 +128,7 @@ enum tick_dep_bits {
 static inline void tick_nohz_idle_enter(void) { }
 static inline void tick_nohz_idle_exit(void) { }
 
+static ktime_t tick_nohz_get_last_idle_length(void) { return -1; }
 static inline ktime_t tick_nohz_get_sleep_length(void)
 {
return NSEC_PER_SEC / HZ;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c..65c9cc0 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -548,6 +548,7 @@ static void tick_nohz_update_jiffies(ktime_t now)
else
ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, 
delta);
ts->idle_entrytime = now;
+   ts->idle_length = delta;
}
 
if (last_update_time)
@@ -998,6 +999,16 @@ void tick_nohz_irq_exit(void)
 }
 
 /**
+ * tick_nohz_get_last_idle_length - return the length of the last idle loop
+ */
+ktime_t tick_nohz_get_last_idle_length(void)
+{
+   struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
+
+   return ts->idle_length;
+}
+
+/**
  * tick_nohz_get_sleep_length - return the length of the current sleep
  *
  * Called from power state control code with interrupts disabled
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index 954b43d..2630cf9 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -39,6 +39,8 @@ enum tick_nohz_mode {
  * @idle_sleeptime:Sum of the time slept in idle with sched tick stopped
  * @iowait_sleeptime:  Sum of the time slept in idle with sched tick stopped, 
with IO outstanding
  * @sleep_length:  Duration of the current idle sleep
+ * @idle_length:   Duration of the last idle loop is from
+ * tick_nohz_idle_enter to tick_nohz_idle_exit.
  * @do_timer_lst:  CPU was the last one doing do_timer before going idle
  */
 struct tick_sched {
@@ -59,6 +61,7 @@ struct tick_sched {
ktime_t idle_sleeptime;
ktime_t iowait_sleeptime;
ktime_t sleep_length;
+   ktime_t idle_length;
unsigned long   last_jiffies;
u64 next_timer;
ktime_t idle_expires;
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Juergen Gross
On 13/11/17 11:06, Quan Xu wrote:
> From: Quan Xu 
> 
> So far, pv_idle_ops.poll is the only op for pv_idle. .poll is called
> in the idle path and will poll for a while before we enter the real idle
> state.
> 
> In virtualization, the idle path includes several heavy operations,
> including timer access (LAPIC timer or TSC deadline timer), which
> hurt performance, especially for latency-intensive workloads like
> message-passing tasks. The cost mainly comes from the vmexit, which is a
> hardware context switch between the virtual machine and the hypervisor.
> Our solution is to poll for a while and not enter the real idle path if
> we can get a schedule event during polling.
> 
> Polling may waste CPU, so we adopt a smart polling mechanism to
> reduce useless polling.
> 
> Signed-off-by: Yang Zhang 
> Signed-off-by: Quan Xu 
> Cc: Juergen Gross 
> Cc: Alok Kataria 
> Cc: Rusty Russell 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> Cc: linux-ker...@vger.kernel.org
> Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a new
pvops function is necessary? Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this would
work on other architectures, too.
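
For illustration, a minimal sketch of what that could look like: a plain
function pointer guarded by a static key instead of a new pvops member
(the names below are hypothetical, not from this series):

#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE(idle_poll_enabled);
static void (*idle_poll_fn)(void);

/* A guest (e.g. KVM) registers its poll routine at boot. */
void register_idle_poll(void (*fn)(void))
{
	idle_poll_fn = fn;
	static_branch_enable(&idle_poll_enabled);
}

/* Called from the generic idle path; effectively a NOP on bare metal. */
static void arch_cpu_idle_poll(void)
{
	if (static_branch_unlikely(&idle_poll_enabled) && idle_poll_fn)
		idle_poll_fn();
}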


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().

2017-11-13 Thread Jan Beulich
>>> On 13.11.17 at 11:34,  wrote:
> Our debug showed the concerned page->count_info was already(and 
> unexpectedly)
> cleared in free_xenheap_pages(), and the call trace should be like this:
> 
> free_xenheap_pages()
>  ^
>  |
> free_xen_pagetable()
>  ^
>  |
> map_pages_to_xen()
>  ^
>  |
> update_xen_mappings()
>  ^
>  |
> get_page_from_l1e()
>  ^
>  |
> mod_l1_entry()
>  ^
>  |
> do_mmu_update()

This ...

> Is above description convincing enough? :-)

... is indeed enough for me to suggest to Julien that we take both
patches (once they're ready). But it's his decision in the end.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().

2017-11-13 Thread Julien Grall

Hi,

On 11/13/2017 11:06 AM, Jan Beulich wrote:

On 13.11.17 at 11:34,  wrote:

Our debug showed the concerned page->count_info was already(and
unexpectedly)
cleared in free_xenheap_pages(), and the call trace should be like this:

free_xenheap_pages()
  ^
  |
free_xen_pagetable()
  ^
  |
map_pages_to_xen()
  ^
  |
update_xen_mappings()
  ^
  |
get_page_from_l1e()
  ^
  |
mod_l1_entry()
  ^
  |
do_mmu_update()


This ...


Is above description convincing enough? :-)


... is indeed enough for me to suggest to Julien that we take both
patches (once they're ready). But it's his decision in the end.


I will wait for the series to be ready before giving my release-ack.

Cheers,



Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel





Re: [Xen-devel] [qemu-upstream-unstable test] 116118: FAIL

2017-11-13 Thread Julien Grall

Hi,

On 11/13/2017 06:44 AM, osstest service owner wrote:

flight 116118 qemu-upstream-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116118/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
  build-armhf-pvopsbroken  in 115713
  build-armhf-pvops  4 host-install(4) broken in 115713 REGR. vs. 114457


Looking at the test result, build-armhf-pvops seems to have passed 
nicely the past few weeks but one time.


So I am not sure why we are blocking here. Mostly the  one.

Cheers,

[1] 
http://logs.test-lab.xenproject.org/osstest/results/history/build-armhf-pvops/qemu-upstream-unstable




Tests which are failing intermittently (not blocking):
  test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat fail in 115713 pass in 
116118
  test-armhf-armhf-libvirt  6 xen-install  fail in 116105 pass in 116118
  test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail in 116105 pass 
in 116118
  test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail pass in 115713
  test-armhf-armhf-xl-rtds  6 xen-installfail pass in 116105
  test-amd64-i386-xl-xsm   20 guest-start/debian.repeat  fail pass in 116105
  test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat  fail pass in 116105

Regressions which are regarded as allowable (not blocking):
  test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stopfail REGR. vs. 114457
  test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop   fail REGR. vs. 114457

Tests which did not succeed, but are not blocking:
  test-armhf-armhf-xl-multivcpu  1 build-check(1)  blocked in 115713 n/a
  test-armhf-armhf-libvirt  1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl   1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-vhd   1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-credit2   1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-cubietruck  1 build-check(1) blocked in 115713 n/a
  test-armhf-armhf-xl-rtds  1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-arndale   1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-xsm   1 build-check(1)   blocked in 115713 n/a
  test-armhf-armhf-xl-rtds13 migrate-support-check fail in 116105 never pass
  test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 116105 never 
pass
  test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 114457
  test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 114457
  test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 114457
  test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
  test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
  test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
  test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
  test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
  test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
  test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
  test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
  test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
  test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
  test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
  test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
  test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
  test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
  test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
  test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
  test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
  test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
  test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
  test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
  test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
  test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
  test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
  test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
  test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
  

Re: [Xen-devel] [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure

2017-11-13 Thread Ross Lagerwall

On 11/10/2017 05:10 PM, Julien Grall wrote:

Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
Implement for libxenevtchn" added a call to register allowing to
restrict the event channel.

However, the call to deregister the handler was not performed if open
failed or when closing the event channel. This will result in corrupting
the list of handlers and potentially crashing the application later on.

Fix it by calling xentoolcore_deregister_active_handle on failure and
closure.


Thanks for fixing this.



Signed-off-by: Julien Grall 

---

This patch is fixing a bug introduced after the code freeze by
"xentoolcore_restrict_all: Implement for libxenevtchn".

The call to xentoolcore_deregister_active_handle is done at the same
place as for the grants. But I am not convinced this is thread safe, as
there is a potential race between closing the event channel and
restricting the handler. Do we care about that?


Both xentoolcore__deregister_active_handle() and 
xentoolcore_restrict_all() hold the same lock when mutating the list so 
there shouldn't be a problem with the list itself.


However, I think it should call xentoolcore__deregister_active_handle() 
_before_ calling osdep_evtchn_close() to avoid trying to restrict a 
closed fd or some other fd that happens to have the same number.


I think all the other libs need to be fixed as well, unless there was a 
reason it was done this way.
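
As an illustration of that ordering, the close path would then look roughly
like this (a sketch only; the xce field names below are assumed for
illustration, while the function names are the ones discussed in this thread):

int xenevtchn_close(xenevtchn_handle *xce)
{
    int rc;

    if ( !xce )
        return 0;

    /*
     * Deregister first, so the restrict handler can never be invoked on a
     * closed fd, or on an unrelated fd that happens to reuse the number.
     */
    xentoolcore__deregister_active_handle(&xce->tc_ah);

    rc = osdep_evtchn_close(xce);

    xtl_logger_destroy(xce->logger_tofree);
    free(xce);
    return rc;
}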


--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] VMX: sync CPU state upon vCPU destruction

2017-11-13 Thread Jan Beulich
>>> On 10.11.17 at 15:46,  wrote:
> On 10/11/17 10:30, Jan Beulich wrote:
> On 10.11.17 at 09:41,  wrote:
>>>2. Drop v->is_running check inside vmx_ctxt_switch_from() making
>>>vmx_vmcs_reload() unconditional.
>> 
>> This is an option, indeed (and I don't think it would have a
>> meaningful performance impact, as vmx_vmcs_reload() does
>> nothing if the right VMCS is already in place). Iirc I had added the
>> conditional back then merely to introduce as little of a behavioral
>> change as was (appeared to be at that time) necessary. What I'm
>> not certain about, however, is the final state we'll end up in then.
>> Coming back to your flow scheme (altered to represent the
>> suggested new flow):
>> 
> 
> I was thinking of this approach for a while and I couldn't find anything
> dangerous that could potentially be done by vmcs_reload(), since it looks
> like it already has all the necessary checks inside.
> 
>> pCPU1   pCPU2
>> =   =
>> current == vCPU1
>> context_switch(next == idle)
>> !! __context_switch() is skipped
>> vcpu_migrate(vCPU1)
>> RCU callbacks
>> vmx_vcpu_destroy()
>> vmx_vcpu_disable_pml()
>> current_vmcs = 0
>> 
>> schedule(next == vCPU1)
>> vCPU1->is_running = 1;
>> context_switch(next == vCPU1)
>> flush_tlb_mask(&dirty_mask);
>> 
>> <--- IPI
>> 
>> __sync_local_execstate()
>> __context_switch(prev == vCPU1)
>> vmx_ctxt_switch_from(vCPU1)
>> vmx_vmcs_reload()
>> ...
>> 
>> We'd now leave the being destroyed vCPU's VMCS active in pCPU1
>> (at least I can't see where it would be deactivated again).
> 
> This would be VMCS of the migrated vCPU - not the destroyed one.

Oh, right. Nevertheless I favor the other approach (or some of the
possible variants of it). In particular, I'm increasingly in favor of
moving the sync up the call stack, at least into
complete_domain_destroy(), or even rcu_do_batch(), as mentioned
before. The former would be more clearly of no concern performance
wise (to please Dario), while the latter would be the obvious and
clean equivalent of the old tasklet commit I've pointed out last week.
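
For illustration, the complete_domain_destroy() variant could look roughly
like the sketch below; sync_vcpu_execstate() and the RCU callback shape
already exist, but the placement here is just a sketch of the idea, not the
agreed-upon fix:

static void complete_domain_destroy(struct rcu_head *head)
{
    struct domain *d = container_of(head, struct domain, rcu);
    struct vcpu *v;

    /*
     * Make sure no pCPU still holds lazily-loaded state (e.g. an active
     * VMCS) of a vCPU that is about to be torn down.
     */
    for_each_vcpu ( d, v )
        sync_vcpu_execstate(v);

    /* ... existing vCPU and domain teardown continues here ... */
}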

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Joao Martins
On 11/13/2017 10:33 AM, Paul Durrant wrote:
>> -Original Message-
>> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
>> Sent: 10 November 2017 19:35
>> To: net...@vger.kernel.org
>> Cc: Joao Martins ; Wei Liu
>> ; Paul Durrant ; xen-
>> de...@lists.xenproject.org
>> Subject: [PATCH net-next v1] xen-netback: make copy batch size
>> configurable
>>
>> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
>> handling and as a result decreased max grant copy ops from 4352 to 64.
>> Before this commit it would drain the rx_queue (while there are
>> enough slots in the ring to put packets) then copy to all pages and write
>> responses on the ring. With the refactor we do almost the same albeit
>> the last two steps are done every COPY_BATCH_SIZE (64) copies.
>>
>> For big packets, the value of 64 means copying 3 packets best case scenario
>> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
>> plus head cross the 4k grant boundary) which could be the case when
>> packets go from local backend process.
>>
>> Instead of making it static to 64 grant copies, lets allow the user to
>> select its value (while keeping the current as default) by introducing
>> the `copy_batch_size` module parameter. This allows users to select
>> the higher batches (i.e. for better throughput with big packets) as it
>> was prior to the above mentioned commit.
>>
>> Signed-off-by: Joao Martins 
>> ---
>>  drivers/net/xen-netback/common.h|  6 --
>>  drivers/net/xen-netback/interface.c | 25 -
>>  drivers/net/xen-netback/netback.c   |  5 +
>>  drivers/net/xen-netback/rx.c|  5 -
>>  4 files changed, 37 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
>> netback/common.h
>> index a46a1e94505d..a5fe36e098a7 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -129,8 +129,9 @@ struct xenvif_stats {
>>  #define COPY_BATCH_SIZE 64
>>
>>  struct xenvif_copy_state {
>> -struct gnttab_copy op[COPY_BATCH_SIZE];
>> -RING_IDX idx[COPY_BATCH_SIZE];
>> +struct gnttab_copy *op;
>> +RING_IDX *idx;
>> +unsigned int size;
> 
> Could you name this batch_size, or something like that to make it clear what 
> it means?
>
Yeap, will change it.

>>  unsigned int num;
>>  struct sk_buff_head *completed;
>>  };
>> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
>>  extern unsigned int rx_stall_timeout_msecs;
>>  extern unsigned int xenvif_max_queues;
>>  extern unsigned int xenvif_hash_cache_size;
>> +extern unsigned int xenvif_copy_batch_size;
>>
>>  #ifdef CONFIG_DEBUG_FS
>>  extern struct dentry *xen_netback_dbg_root;
>> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
>> netback/interface.c
>> index d6dff347f896..a558868a883f 100644
>> --- a/drivers/net/xen-netback/interface.c
>> +++ b/drivers/net/xen-netback/interface.c
>> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
>> domid_t domid,
>>
>>  int xenvif_init_queue(struct xenvif_queue *queue)
>>  {
>> +int size = xenvif_copy_batch_size;
> 
> unsigned int
>>> int err, i;
>> +void *addr;
>> +
>> +addr = vzalloc(size * sizeof(struct gnttab_copy));
> 
> Does the memory need to be zeroed?
>
It doesn't need to be, but given that xenvif_queue was zeroed (which included
this region), I thought I would leave it the same way.

>> +if (!addr)
>> +goto err;
>> +queue->rx_copy.op = addr;
>> +
>> +addr = vzalloc(size * sizeof(RING_IDX));
> 
> Likewise.
> 
>> +if (!addr)
>> +goto err;
>> +queue->rx_copy.idx = addr;
>> +queue->rx_copy.size = size;
>>
>>  queue->credit_bytes = queue->remaining_credit = ~0UL;
>>  queue->credit_usec  = 0UL;
>> @@ -544,7 +557,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>>   queue->mmap_pages);
>>  if (err) {
>>  netdev_err(queue->vif->dev, "Could not reserve
>> mmap_pages\n");
>> -return -ENOMEM;
>> +goto err;
>>  }
>>
>>  for (i = 0; i < MAX_PENDING_REQS; i++) {
>> @@ -556,6 +569,13 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>>  }
>>
>>  return 0;
>> +
>> +err:
>> +if (queue->rx_copy.op)
>> +vfree(queue->rx_copy.op);
> 
> vfree is safe to be called with NULL.
> 
Oh, almost forgot - thanks.
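
Pulling these comments together, the allocation and error path could end up
looking roughly like the sketch below (the batch_size naming follows Paul's
suggestion; this is illustrative only, not the posted patch):

static int xenvif_alloc_copy_buffers(struct xenvif_queue *queue,
                                     unsigned int batch_size)
{
        /* Zeroing is not strictly required; vzalloc keeps prior behaviour. */
        queue->rx_copy.op  = vzalloc(batch_size * sizeof(struct gnttab_copy));
        queue->rx_copy.idx = vzalloc(batch_size * sizeof(RING_IDX));
        if (!queue->rx_copy.op || !queue->rx_copy.idx) {
                /* vfree(NULL) is a safe no-op, so no need to test first. */
                vfree(queue->rx_copy.op);
                vfree(queue->rx_copy.idx);
                return -ENOMEM;
        }
        queue->rx_copy.batch_size = batch_size;
        return 0;
}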

>> +if (queue->rx_copy.idx)
>> +vfree(queue->rx_copy.idx);
>> +return -ENOMEM;
>>  }
>>
>>  void xenvif_carrier_on(struct xenvif *vif)
>> @@ -788,6 +808,9 @@ void xenvif_disconnect_ctrl(struct xenvif *vif)
>>   */
>>  void xenvif_deinit_queue(struct xenvif_queue *queue)
>>  {
>> +vfree(queue->rx_copy.op);
>> +vfree(queue->rx_copy.idx);
>> +queue->rx_copy.size = 0;
>>  gnttab_free_pages(MAX_PENDING_REQS, 

Re: [Xen-devel] [qemu-upstream-unstable test] 116118: FAIL

2017-11-13 Thread Wei Liu
On Mon, Nov 13, 2017 at 11:52:12AM +, Julien Grall wrote:
> Hi,
> 
> On 11/13/2017 06:44 AM, osstest service owner wrote:
> > flight 116118 qemu-upstream-unstable real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/116118/
> > 
> > Failures and problems with tests :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >   build-armhf-pvopsbroken  in 
> > 115713
> >   build-armhf-pvops  4 host-install(4) broken in 115713 REGR. vs. 
> > 114457
> 
> Looking at the test result, build-armhf-pvops seems to have passed nicely
> the past few weeks but one time.
> 
> So I am not sure why we are blocking here. Mostly the  one.

The host-install failure is probably not related to a change in code. It is
trying to install a host on which to run the test, in this case to build the
kernel.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().

2017-11-13 Thread Jan Beulich
>>> On 10.11.17 at 15:05,  wrote:
> On 11/10/2017 5:49 PM, Jan Beulich wrote:
>> I'm not certain this is important enough a fix to consider for 4.10,
>> and you seem to think it's good enough if this gets applied only
>> after the tree would be branched, as you didn't Cc Julien. Please
>> indicate if you actually simply weren't aware, and you indeed
>> there's an important aspect to this that I'm overlooking.
> 
> Well, at first I have not expected this to be accepted for 4.10. But 
> since we have
> met this issue in practice, when running a graphic application which 
> consumes
> memory intensively in dom0, I think it also makes sense if we can fix it 
> in xen's
> release as early as possible. Do you think this is a reasonable 
> requirement? :-)

You'd need to provide further details for us to understand the
scenario. It obviously depends on whether you have other
patches to Xen which actually trigger this. If the problem can
be triggered from outside of a vanilla upstream Xen, then yes,
I think I would favor the fixes being included.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] docs: update hvmlite.markdown

2017-11-13 Thread Wei Liu
On Mon, Nov 13, 2017 at 03:26:13AM -0700, Jan Beulich wrote:
> >>> On 12.11.17 at 12:03,  wrote:
> > --- a/docs/misc/hvmlite.markdown
> > +++ b/docs/misc/hvmlite.markdown
> > @@ -1,6 +1,3 @@
> > -**NOTE**: this document will be merged into `pvh.markdown` once PVH is 
> > replaced
> > -with the HVMlite implementation.
> > -
> 
> This being stale, wouldn't it then be better to rename the doc to
> pvh.markdown at the same time? Either way

Will do.

> Acked-by: Jan Beulich 
> 

Thanks.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().

2017-11-13 Thread Yu Zhang



On 11/13/2017 5:31 PM, Jan Beulich wrote:

On 10.11.17 at 15:05,  wrote:

On 11/10/2017 5:49 PM, Jan Beulich wrote:

I'm not certain this is important enough a fix to consider for 4.10,
and you seem to think it's good enough if this gets applied only
after the tree would be branched, as you didn't Cc Julien. Please
indicate if you actually simply weren't aware, and you indeed
there's an important aspect to this that I'm overlooking.

Well, at first I have not expected this to be accepted for 4.10. But
since we have
met this issue in practice, when running a graphic application which
consumes
memory intensively in dom0, I think it also makes sense if we can fix it
in xen's
release as early as possible. Do you think this is a reasonable
requirement? :-)

You'd need to provide further details for us to understand the
scenario. It obviously depends on whether you have other
patches to Xen which actually trigger this. If the problem can
be triggered from outside of a vanilla upstream Xen, then yes,
I think I would favor the fixes being included.


Thank, Jan. Let me try to give an explaination of the scenario. :-)

We saw an ASSERT failure in ASSERT((page->count_info & PGC_count_mask) != 0)
in is_iomem_page() <- put_page_from_l1e() <- alloc_l1_table(), when we run a
graphic application (a memory eater, but closed source) in dom0. This
failure only happens when dom0 is configured with 2 vCPUs.

Our debugging showed that the concerned page->count_info was already (and
unexpectedly) cleared in free_xenheap_pages(), and the call trace should be
like this:

free_xenheap_pages()
    ^
    |
free_xen_pagetable()
    ^
    |
map_pages_to_xen()
    ^
    |
update_xen_mappings()
    ^
    |
get_page_from_l1e()
    ^
    |
mod_l1_entry()
    ^
    |
do_mmu_update()

We then realized that it happens when dom0 tries to update its page table:
when the cache attributes are about to be changed for the referenced page
frame, the corresponding mappings for the Xen VA space will be updated by
map_pages_to_xen() as well.

However, since map_pages_to_xen() has the aforementioned race, when
MMU_NORMAL_PT_UPDATE is triggered concurrently on different CPUs it may
mistakenly free a superpage referenced by pl2e. That is why our ASSERT
failure only happens when dom0 has more than one vCPU configured.

As to the code base, we were running XenGT code, which carries only a few
non-upstreamed patches in Xen - I believe most of them are libxl related,
and none of them is MMU related. So I believe this issue could be triggered
by a PV guest against a vanilla upstream Xen.

Is the above description convincing enough? :-)

Yu


Jan





___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline baseline-only test] 72442: regressions - FAIL

2017-11-13 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 72442 qemu-mainline real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72442/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-xsm   6 xen-build fail REGR. vs. 72437
 build-armhf   6 xen-build fail REGR. vs. 72437
 test-amd64-amd64-xl-qcow219 guest-start/debian.repeat fail REGR. vs. 72437
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail REGR. vs. 72437
 test-amd64-amd64-qemuu-nested-intel 10 debian-hvm-install fail REGR. vs. 72437
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
72437

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-midway1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail like 72437
 test-amd64-i386-xl-qemuu-win10-i386 17 guest-stop  fail like 72437
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 17 guest-stop fail never pass

version targeted for testing:
 qemuu4ffa88c99c54d2a30f79e3dbecec50b023eff1c8
baseline version:
 qemuub0fbe46ad82982b289a44ee2495b59b0bad8a842

Last test of basis72437  2017-11-10 06:18:32 Z3 days
Testing same since72442  2017-11-13 18:45:05 Z0 days1 attempts


People who touched revisions under test:
  Daniel P. Berrange 
  David Gibson 
  Greg Kurz 
  Longpeng 
  Michael Davidsaver 
  Peter Maydell 
  Thomas Huth 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  fail
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  fail
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  blocked 
 test-amd64-i386-xl   pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm   

[Xen-devel] [linux-next test] 116135: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116135 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116135/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 116119
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 116119
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 116119
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 116119
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 116119
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 116119
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 116119
 test-amd64-i386-libvirt-xsm   7 xen-boot fail REGR. vs. 116119
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 116119
 test-amd64-i386-libvirt-qcow2  7 xen-bootfail REGR. vs. 116119
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 116119
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 116119
 build-amd64-pvops 6 kernel-build fail REGR. vs. 116119

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win10-i386  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win10-i386  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-rumprun-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvhv2-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvhv2-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-examine  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 116119
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 116119
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 116119
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 116119
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 116119
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 

[Xen-devel] [linux-linus test] 116136: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116136 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116136/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2  10 debian-install   fail REGR. vs. 115643
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 115643

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115643
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115643
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115643
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115643
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115643
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 115643
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 115643
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 115643
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115643
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeatfail  like 115643
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux516fb7f2e73dcc303fb97fc3593209fcacf2d982
baseline version:
 linuxe4880bc5dfb1f02b152e62a894b5c6f3e995b3cf

Last test of basis   115643  2017-11-07 12:06:20 Z6 days
Failing since115658  2017-11-08 02:33:06 Z6 days9 attempts
Testing same since   116136  2017-11-13 09:48:27 Z0 days1 attempts


People who touched revisions under test:
  Alexander Shishkin 
  Andrei Vagin 
  Andrew Morton 
  Andrey Konovalov 
  Arnaldo Carvalho de Melo 
  Arnaldo Carvalho de Melo 
  Arnd Bergmann 
  Arvind Yadav 
  Benjamin Tissoires 
  Bjorn Andersson 
  Bjorn Helgaas 
  Bjørn Mork 
  Borislav Petkov 
  Chris Redpath 
  Chris Wilson 
  Colin Ian King 
  Cong Wang 

[Xen-devel] [xen-unstable test] 116132: tolerable FAIL - PUSHED

2017-11-13 Thread osstest service owner
flight 116132 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116132/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 116108 
pass in 116132
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail in 116108 pass 
in 116132
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail 
pass in 116108

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail in 116108 like 
115654
 test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat fail in 116108 like 
115665
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail in 116108 like 115665
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail like 115654
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeatfail  like 115654
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115665
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115665
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115665
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 115665
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115665
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 115665
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115665
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 115665
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 115665
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass

version targeted for testing:
 xen  3b2966e72c414592cd2c86c21a0d4664cf627b9c
baseline version:
 xen  92f0d4392e73727819c5a83fcce447515efaf2f5

Last test of basis   115665  2017-11-08 08:36:43 Z5 days
Testing same since   115730  2017-11-10 14:32:56 Z3 days4 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monne 

[Xen-devel] [PATCH v3 for-4.10 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().

2017-11-13 Thread Yu Zhang
In modify_xen_mappings(), an L1/L2 page table shall be freed
if all entries of this page table are empty. The corresponding
L2/L3 PTE needs to be cleared in such a scenario.

However, concurrent paging structure modifications on different
CPUs may cause the L2/L3 PTEs to already be cleared, or to be set
to reference a superpage.

Therefore the logic to enumerate the L1/L2 page table and to
reset the corresponding L2/L3 PTE needs to be protected with a
spinlock. And the _PAGE_PRESENT and _PAGE_PSE flags need to be
checked after the lock is obtained.

Signed-off-by: Yu Zhang 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 

Changes in v3: 
According to comments from Jan Beulich:
  - indent the label by one space;
  - also check the _PAGE_PSE for L2E/L3E.
Others:
  - commit message changes.
---
 xen/arch/x86/mm.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1697be9..64ccd70 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5111,6 +5111,27 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
  */
 if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 
0)) )
 continue;
+if ( locking )
+spin_lock(&map_pgdir_lock);
+
+/*
+ * L2E may be already cleared, or set to a superpage, by
+ * concurrent paging structure modifications on other CPUs.
+ */
+if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+goto check_l3;
+}
+
+if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+continue;
+}
+
 pl1e = l2e_to_l1e(*pl2e);
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
 if ( l1e_get_intpte(pl1e[i]) != 0 )
@@ -5119,11 +5140,16 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 {
 /* Empty: zap the L2E and free the L1 page. */
 l2e_write_atomic(pl2e, l2e_empty());
+if ( locking )
+spin_unlock(&map_pgdir_lock);
 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
 free_xen_pagetable(pl1e);
 }
+else if ( locking )
+spin_unlock(&map_pgdir_lock);
 }
 
+ check_l3:
 /*
  * If we are not destroying mappings, or not done with the L3E,
  * skip the empty check.
@@ -5131,6 +5157,21 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 if ( (nf & _PAGE_PRESENT) ||
  ((v != e) && (l2_table_offset(v) + l1_table_offset(v) != 0)) )
 continue;
+if ( locking )
+spin_lock(&map_pgdir_lock);
+
+/*
+ * L3E may be already cleared, or set to a superpage, by
+ * concurrent paging structure modifications on other CPUs.
+ */
+if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
+  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+continue;
+}
+
 pl2e = l3e_to_l2e(*pl3e);
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
 if ( l2e_get_intpte(pl2e[i]) != 0 )
@@ -5139,9 +5180,13 @@ int modify_xen_mappings(unsigned long s, unsigned long 
e, unsigned int nf)
 {
 /* Empty: zap the L3E and free the L2 page. */
 l3e_write_atomic(pl3e, l3e_empty());
+if ( locking )
+spin_unlock(&map_pgdir_lock);
 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
 free_xen_pagetable(pl2e);
 }
+else if ( locking )
+spin_unlock(&map_pgdir_lock);
 }
 
 flush_area(NULL, FLUSH_TLB_GLOBAL);
-- 
2.5.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 for-4.10 1/2] x86/mm: fix potential race conditions in map_pages_to_xen().

2017-11-13 Thread Yu Zhang
From: Min He 

In map_pages_to_xen(), an L2 page table entry may be reset to point to
a superpage, and its corresponding L1 page table needs to be freed in
such a scenario, when these L1 page table entries map consecutive
page frames and have the same mapping flags.

However, the variable `pl1e` is not protected by the lock before the L1
page table is enumerated. A race condition may happen if this code path
is invoked simultaneously on different CPUs.

For example, `pl1e` on CPU0 may hold an obsolete value, pointing
to a page which has just been freed on CPU1. Besides, before this page
is reused, it will still be holding the old PTEs, referencing consecutive
page frames. Consequently, `free_xen_pagetable(l2e_to_l1e(ol2e))` will
be triggered on CPU0, resulting in the unexpected freeing of a normal page.

This patch fixes the above problem by protecting `pl1e` with the lock.

Also, there are other potential race conditions. For instance, the L2/L3
entry may be modified concurrently on different CPUs, by routines such as
map_pages_to_xen(), modify_xen_mappings() etc. To fix this, this patch
checks the _PAGE_PRESENT and _PAGE_PSE flags, after the spinlock is
obtained, for the corresponding L2/L3 entry.

Signed-off-by: Min He 
Signed-off-by: Yi Zhang 
Signed-off-by: Yu Zhang 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 

Changes in v3:
According to comments from Jan Beulich:
  - use local variable instead of dereference pointer to pte to check flag.
  - also check the _PAGE_PRESENT for L2E/L3E.
Others:
  - Commit message changes.

Changes in v2:
According to comments from Jan Beulich:
  - check PSE of pl2e and pl3e, and skip the re-consolidation if set.
  - commit message changes, e.g. add "From :" tag etc.
  - code style changes.
  - introduce a seperate patch to resolve the similar issue in
modify_xen_mappings().
---
 xen/arch/x86/mm.c | 36 ++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a20fdca..1697be9 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4844,9 +4844,29 @@ int map_pages_to_xen(
 {
 unsigned long base_mfn;
 
-pl1e = l2e_to_l1e(*pl2e);
 if ( locking )
spin_lock(&map_pgdir_lock);
+
+ol2e = *pl2e;
+/*
+ * L2E may be already cleared, or set to a superpage, by
+ * concurrent paging structure modifications on other CPUs.
+ */
+if ( !(l2e_get_flags(ol2e) & _PAGE_PRESENT) )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+continue;
+}
+
+if ( l2e_get_flags(ol2e) & _PAGE_PSE )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+goto check_l3;
+}
+
+pl1e = l2e_to_l1e(ol2e);
 base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
 if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||
@@ -4854,7 +4874,6 @@ int map_pages_to_xen(
 break;
 if ( i == L1_PAGETABLE_ENTRIES )
 {
-ol2e = *pl2e;
 l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
 l1f_to_lNf(flags)));
 if ( locking )
@@ -4880,7 +4899,20 @@ int map_pages_to_xen(
 
 if ( locking )
spin_lock(&map_pgdir_lock);
+
 ol3e = *pl3e;
+/*
+ * L3E may be already cleared, or set to a superpage, by
+ * concurrent paging structure modifications on other CPUs.
+ */
+if ( !(l3e_get_flags(ol3e) & _PAGE_PRESENT) ||
+(l3e_get_flags(ol3e) & _PAGE_PSE) )
+{
+if ( locking )
+spin_unlock(&map_pgdir_lock);
+continue;
+}
+
 pl2e = l3e_to_l2e(ol3e);
 base_mfn = l2e_get_pfn(*pl2e) & ~(L2_PAGETABLE_ENTRIES *
   L1_PAGETABLE_ENTRIES - 1);
-- 
2.5.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Quan Xu



On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu 

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
 1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
    29031.6 bit/s -- 76.1 %CPU

 2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
    35787.7 bit/s -- 129.4 %CPU

 3. w/ kvm dynamic poll:
    35735.6 bit/s -- 200.0 %CPU

 4. w/patch and w/ kvm dynamic poll:
    42225.3 bit/s -- 198.7 %CPU

 5. idle=poll
    37081.7 bit/s -- 998.1 %CPU



 w/ this patch, we will improve performance by 23%.. even we could improve
 performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. also the
 cost of CPU is much lower than 'idle=poll' case..


Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this would
work on other architectures, too.


I assume this feature will be ported to other archs.. a new pvops makes code
clean and easy to maintain. also I tried to add it into existed pvops, 
but it

doesn't match.



Quan
Alibaba Cloud


Juergen




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Juergen Gross
On 14/11/17 08:02, Quan Xu wrote:
> 
> 
> On 2017/11/13 18:53, Juergen Gross wrote:
>> On 13/11/17 11:06, Quan Xu wrote:
>>> From: Quan Xu 
>>>
>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
>>> in idle path which will poll for a while before we enter the real idle
>>> state.
>>>
>>> In virtualization, idle path includes several heavy operations
>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>> hurt performance especially for latency intensive workload like message
>>> passing task. The cost is mainly from the vmexit which is a hardware
>>> context switch between virtual machine and hypervisor. Our solution is
>>> to poll for a while and do not enter real idle path if we can get the
>>> schedule event during polling.
>>>
>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to
>>> reduce the useless poll.
>>>
>>> Signed-off-by: Yang Zhang 
>>> Signed-off-by: Quan Xu 
>>> Cc: Juergen Gross 
>>> Cc: Alok Kataria 
>>> Cc: Rusty Russell 
>>> Cc: Thomas Gleixner 
>>> Cc: Ingo Molnar 
>>> Cc: "H. Peter Anvin" 
>>> Cc: x...@kernel.org
>>> Cc: virtualizat...@lists.linux-foundation.org
>>> Cc: linux-ker...@vger.kernel.org
>>> Cc: xen-de...@lists.xenproject.org
>> Hmm, is the idle entry path really so critical to performance that a new
>> pvops function is necessary?
> Juergen, Here is the data we get when running benchmark netperf:
>  1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>     29031.6 bit/s -- 76.1 %CPU
> 
>  2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>     35787.7 bit/s -- 129.4 %CPU
> 
>  3. w/ kvm dynamic poll:
>     35735.6 bit/s -- 200.0 %CPU
> 
>  4. w/patch and w/ kvm dynamic poll:
>     42225.3 bit/s -- 198.7 %CPU
> 
>  5. idle=poll
>     37081.7 bit/s -- 998.1 %CPU
> 
> 
> 
>  w/ this patch, we will improve performance by 23%.. even we could improve
>  performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. also the
>  cost of CPU is much lower than 'idle=poll' case..

I don't question the general idea. I just think pvops isn't the best way
to implement it.

>> Wouldn't a function pointer, maybe guarded
>> by a static key, be enough? A further advantage would be that this would
>> work on other architectures, too.
> 
> I assume this feature will be ported to other archs.. a new pvops makes
> code
> clean and easy to maintain. also I tried to add it into existed pvops,
> but it
> doesn't match.

You are aware that pvops is x86 only?

I really don't see the big difference in maintainability compared to the
static key / function pointer variant:

void (*guest_idle_poll_func)(void);
struct static_key guest_idle_poll_key __read_mostly;

static inline void guest_idle_poll(void)
{
        if (static_key_false(&guest_idle_poll_key))
                guest_idle_poll_func();
}

And KVM would just need to set guest_idle_poll_func and enable the
static key. Works on non-x86 architectures, too.
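
The guest-side wiring would then be small; something along these lines (the
kvm_* names below are made up for illustration):

static void kvm_guest_idle_poll(void)
{
        /* poll for a bounded amount of time before really halting */
}

static void __init kvm_setup_idle_poll(void)
{
        guest_idle_poll_func = kvm_guest_idle_poll;
        static_key_slow_inc(&guest_idle_poll_key);
}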


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-upstream-unstable baseline-only test] 72443: regressions - FAIL

2017-11-13 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 72443 qemu-upstream-unstable real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72443/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qcow219 guest-start/debian.repeat fail REGR. vs. 72236
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat fail REGR. vs. 72236
 test-amd64-amd64-qemuu-nested-intel 13 xen-install/l1 fail REGR. vs. 72236

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail baseline 
untested
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   like 72236
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   like 72236
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   like 72236
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 72236
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail like 72236
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 qemuub79708a8ed1b3d18bee67baeaf33b3fa529493e2
baseline version:
 qemuu5cd7ce5dde3f228b3b669ed9ca432f588947bd40

Last test of basis72236  2017-10-14 17:20:30 Z   30 days
Testing same since72443  2017-11-13 23:16:21 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Michael Tokarev 
  Roger Pau Monne 
  Roger Pau Monné 
  Stefano Stabellini 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvops   

Re: [Xen-devel] [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-13 Thread Wanpeng Li
2017-11-14 15:02 GMT+08:00 Quan Xu :
>
>
> On 2017/11/13 18:53, Juergen Gross wrote:
>>
>> On 13/11/17 11:06, Quan Xu wrote:
>>>
>>> From: Quan Xu 
>>>
>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
>>> in idle path which will poll for a while before we enter the real idle
>>> state.
>>>
>>> In virtualization, idle path includes several heavy operations
>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>> hurt performance especially for latency intensive workload like message
>>> passing task. The cost is mainly from the vmexit which is a hardware
>>> context switch between virtual machine and hypervisor. Our solution is
>>> to poll for a while and do not enter real idle path if we can get the
>>> schedule event during polling.
>>>
>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to
>>> reduce the useless poll.
>>>
>>> Signed-off-by: Yang Zhang 
>>> Signed-off-by: Quan Xu 
>>> Cc: Juergen Gross 
>>> Cc: Alok Kataria 
>>> Cc: Rusty Russell 
>>> Cc: Thomas Gleixner 
>>> Cc: Ingo Molnar 
>>> Cc: "H. Peter Anvin" 
>>> Cc: x...@kernel.org
>>> Cc: virtualizat...@lists.linux-foundation.org
>>> Cc: linux-ker...@vger.kernel.org
>>> Cc: xen-de...@lists.xenproject.org
>>
>> Hmm, is the idle entry path really so critical to performance that a new
>> pvops function is necessary?
>
> Juergen, Here is the data we get when running benchmark netperf:
>  1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
> 29031.6 bit/s -- 76.1 %CPU
>
>  2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
> 35787.7 bit/s -- 129.4 %CPU
>
>  3. w/ kvm dynamic poll:
> 35735.6 bit/s -- 200.0 %CPU

Actually we can reduce the CPU utilization by sleeping for a period of
time, as is already done in the poll logic of the IO subsystem; then we
can improve the algorithm in KVM instead of introducing another,
duplicate one in the KVM guest.
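
For reference, that kind of self-tuning poll window could look roughly like
the sketch below (all names and constants here are made up for illustration):

static unsigned int guest_poll_ns;

static void adaptive_idle_poll(void)
{
        u64 start = ktime_get_ns();

        while (ktime_get_ns() - start < guest_poll_ns) {
                if (need_resched()) {
                        /* Polling paid off: grow the window, with a cap. */
                        guest_poll_ns = guest_poll_ns ? guest_poll_ns * 2
                                                      : 10000;
                        if (guest_poll_ns > 500000)
                                guest_poll_ns = 500000;
                        return;
                }
                cpu_relax();
        }
        /* The whole window was wasted: shrink it before really idling. */
        guest_poll_ns /= 2;
}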

Regards,
Wanpeng Li

>
>  4. w/patch and w/ kvm dynamic poll:
> 42225.3 bit/s -- 198.7 %CPU
>
>  5. idle=poll
> 37081.7 bit/s -- 998.1 %CPU
>
>
>
>  w/ this patch, we will improve performance by 23%.. even we could improve
>  performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. also the
>  cost of CPU is much lower than 'idle=poll' case..
>
>> Wouldn't a function pointer, maybe guarded
>> by a static key, be enough? A further advantage would be that this would
>> work on other architectures, too.
>
>
> I assume this feature will be ported to other archs.. a new pvops makes code
> clean and easy to maintain. also I tried to add it into existed pvops, but
> it
> doesn't match.
>
>
>
> Quan
> Alibaba Cloud
>>
>>
>> Juergen
>>
>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] Config.mk: Update QEMU changeset

2017-11-13 Thread Anthony PERARD
New commits:
- xen/pt: allow QEMU to request MSI unmasking at bind time
To fix a passthrough bug.
- ui/gtk: Fix deprecation of vte_terminal_copy_clipboard
A build fix.

Signed-off-by: Anthony PERARD 
---
Should already be released-acked.
---
 Config.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Config.mk b/Config.mk
index 664f97e726..072a3bc62c 100644
--- a/Config.mk
+++ b/Config.mk
@@ -273,7 +273,7 @@ SEABIOS_UPSTREAM_URL ?= git://xenbits.xen.org/seabios.git
 MINIOS_UPSTREAM_URL ?= git://xenbits.xen.org/mini-os.git
 endif
 OVMF_UPSTREAM_REVISION ?= 947f3737abf65fda63f3ffd97fddfa6986986868
-QEMU_UPSTREAM_REVISION ?= qemu-xen-4.10.0-rc1
+QEMU_UPSTREAM_REVISION ?= b79708a8ed1b3d18bee67baeaf33b3fa529493e2
 MINIOS_UPSTREAM_REVISION ?= 0b4b7897e08b967a09bed2028a79fabff82342dd
 # Mon Oct 16 16:36:41 2017 +0100
 # Update Xen header files again
-- 
Anthony PERARD


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c

2017-11-13 Thread Juergen Gross
On 11/11/17 00:57, Stefano Stabellini wrote:
> On Tue, 7 Nov 2017, Juergen Gross wrote:
>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>> loop.
>>>
>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>> loops.
>>>
>>> Reported-by: Dan Carpenter 
>>> Signed-off-by: Stefano Stabellini 
>>> CC: boris.ostrov...@oracle.com
>>> CC: jgr...@suse.com
>>> ---
>>>  drivers/xen/pvcalls-front.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index 0c1ec68..047dce7 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>>  * is set to NULL -- we only need to wait for the existing
>>>  * waiters to return.
>>>  */
>>> -   while (!mutex_trylock(&map->active.in_mutex) ||
>>> -  !mutex_trylock(&map->active.out_mutex))
>>> +   while (!mutex_trylock(&map->active.in_mutex))
>>> +   cpu_relax();
>>> +   while (!mutex_trylock(&map->active.out_mutex))
>>> cpu_relax();
>>
>> Any reason you don't just use mutex_lock()?
> 
> Hi Juergen, sorry for the late reply.
> 
> Yes, you are right. Given the patch, it would be just the same to use
> mutex_lock.
> 
> This is where I realized that actually we have a problem: no matter if
> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> be the last to take the in/out_mutex. Other waiters could be still
> outstanding.
> 
> We solved the same problem using a refcount in pvcalls_front_remove. In
> this case, I was thinking of reusing the mutex internal counter for
> efficiency, instead of adding one more refcount.
> 
> For using the mutex as a refcount, there is really no need to call
> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> directly:
> 
> 
>   while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>  atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>   cpu_relax();
> 
> Cheers,
> 
> Stefano
> 
> 
> ---
> 
> xen/pvcalls: fix potential endless loop in pvcalls-front.c
> 
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next time
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
> 
> Actually, we don't want to use mutex_trylock at all: we don't need to
> take the mutex, we only need to wait until the last mutex waiter/holder
> releases it.
> 
> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> refcount instead.
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Stefano Stabellini 
> CC: boris.ostrov...@oracle.com
> CC: jgr...@suse.com
> 
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..9f33cb8 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
>* is set to NULL -- we only need to wait for the existing
>* waiters to return.
>*/
> - while (!mutex_trylock(&map->active.in_mutex) ||
> -!mutex_trylock(&map->active.out_mutex))
> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> +atomic_long_read(&map->active.out_mutex.owner) != 0UL)

I don't like this.

Can't you use a kref here? Even if it looks like more overhead, it is
much cleaner. There will be no questions regarding possible races,
while an approach like yours will always smell racy (can't someone
take the mutex just after the above test?).

In no case should you make use of the mutex internals.


Juergen
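[Aside: a minimal sketch of the kref-based alternative asked for above.
It is illustrative only; struct map_users and all of its field and
function names are invented and are not part of the pvcalls series.]

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/completion.h>

/* Invented helper: one reference per in-flight user of the map. */
struct map_users {
        struct kref users;
        struct completion all_gone;
};

static void map_users_release(struct kref *kref)
{
        struct map_users *m = container_of(kref, struct map_users, users);

        complete(&m->all_gone);
}

static void map_users_init(struct map_users *m)
{
        kref_init(&m->users);           /* initial reference held by the map */
        init_completion(&m->all_gone);
}

/* sendmsg/recvmsg paths: take a reference on entry, drop it on exit. */
static inline void map_users_enter(struct map_users *m)
{
        kref_get(&m->users);
}

static inline void map_users_exit(struct map_users *m)
{
        kref_put(&m->users, map_users_release);
}

/* release path: drop the initial reference and wait for the last user. */
static void map_users_wait(struct map_users *m)
{
        map_users_exit(m);
        wait_for_completion(&m->all_gone);
}

Unlike peeking at the mutex internals, this cannot miss a late user: in
the scenario quoted above the wait happens only after the map has been
made unreachable (sk_send_head set to NULL), so release() simply waits
for the references that already exist.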

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 116137: tolerable all pass - PUSHED

2017-11-13 Thread osstest service owner
flight 116137 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116137/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  a08ee2c06dd06ef07a0c4f9c5f84b0d86694be7f
baseline version:
 xen  3b2966e72c414592cd2c86c21a0d4664cf627b9c

Last test of basis   115704  2017-11-09 18:22:55 Z3 days
Testing same since   116137  2017-11-13 11:02:47 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To osst...@xenbits.xen.org:/home/xen/git/xen.git
   3b2966e..a08ee2c  a08ee2c06dd06ef07a0c4f9c5f84b0d86694be7f -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-3.18 test] 116121: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116121 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116121/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 115495

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qcow219 guest-start/debian.repeat  fail pass in 116106

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail in 116106 like 115495
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeatfail like 115495
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115495
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115495
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115495
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115495
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 115495
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115495
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115495
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux943dc0b3ef9f0168494d6dca305cd0cf53a0b3d4
baseline version:
 linux4f823316dac3de3463dfbea2be3812102a76e246

Last test of basis   115495  2017-11-02 19:35:18 Z   10 days
Testing same since   115673  2017-11-08 09:43:38 Z5 days8 attempts


People who touched revisions under test:
  Alexander Boyko 
  Andrew Morton 
  Andy Shevchenko 
  Arnd Bergmann 
  Ashish Samant 
  Boris Ostrovsky 
  Borislav Petkov 
  Catalin Marinas 

Re: [Xen-devel] [PATCH 01/16] Introduce skeleton SUPPORT.md

2017-11-13 Thread George Dunlap
On 11/13/2017 03:41 PM, George Dunlap wrote:
> Add a machine-readable file to describe what features are in what
> state of being 'supported', as well as information about how long this
> release will be supported, and so on.
> 
> The document should be formatted using "semantic newlines" [1], to make
> changes easier.
> 
> Begin with the basic framework.
> 
> Signed-off-by: Ian Jackson 
> Signed-off-by: George Dunlap 

Sending this series out slightly unfinished, as I've gotten diverted
with some security issues.

I think patches 1-14 should be mostly ready.  Patches 15 and 16 both
need some work; if anyone could pick them up I'd appreciate it.

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 13/16] SUPPORT.md: Add secondary memory management features

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
---
 SUPPORT.md | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 0f7426593e..3e352198ce 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -187,6 +187,37 @@ Export hypervisor coverage data suitable for analysis by 
gcov or lcov.
 
 Status: Supported
 
+### Memory Sharing
+
+Status, x86 HVM: Tech Preview
+Status, ARM: Tech Preview
+
+Allow sharing of identical pages between guests
+
+### Memory Paging
+
+Status, x86 HVM: Experimental
+
+Allow pages belonging to guests to be paged to disk
+
+### Transcendent Memory
+
+Status: Experimental
+
+Transcendent Memory (tmem) allows the creation of hypervisor memory pools
+which guests can use to store memory pages
+rather than caching them in their own memory or swapping them to disk.
+Having these in the hypervisor
+can allow more efficient aggregate use of memory across VMs.
+
+### Alternative p2m
+
+Status, x86 HVM: Tech Preview
+Status, ARM: Tech Preview
+
+Allows external monitoring of hypervisor memory
+by maintaining multiple physical to machine (p2m) memory mappings.
+
 ## Resource Management
 
 ### CPU Pools
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 12/16] SUPPORT.md: Add Security-related features

2017-11-13 Thread George Dunlap
With the exception of driver domains, which depend on PCI passthrough,
and will be introduced later.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
CC: Rich Persaud 
---
 SUPPORT.md | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 722a29fec5..0f7426593e 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -421,6 +421,40 @@ there is currently no xl support.
 
 Status: Supported
 
+## Security
+
+### Device Model Stub Domains
+
+Status: Supported
+
+### KCONFIG Expert
+
+Status: Experimental
+
+### Live Patching
+
+Status, x86: Supported
+Status, ARM: Experimental
+
+Compile time disabled for ARM
+
+### Virtual Machine Introspection
+
+Status, x86: Supported, not security supported
+
+### XSM & FLASK
+
+Status: Experimental
+
+Compile time disabled
+
+### FLASK default policy
+
+Status: Experimental
+
+The default policy includes FLASK labels and roles for a "typical" Xen-based 
system
+with dom0, driver domains, stub domains, domUs, and so on.
+
 ## Virtual Hardware, Hypervisor
 
 ### x86/Nested PV
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 14/16] SUPPORT.md: Add statement on PCI passthrough

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Rich Persaud 
CC: Marek Marczykowski-Górecki 
CC: Christopher Clark 
CC: James McKenzie 
---
 SUPPORT.md | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index 3e352198ce..a8388f3dc5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -454,9 +454,23 @@ there is currently no xl support.
 
 ## Security
 
+### Driver Domains
+
+Status: Supported, with caveats
+
+"Driver domains" means allowing non-Domain 0 domains 
+with access to physical devices to act as back-ends.
+
+See the appropriate "Device Passthrough" section
+for more information about security support.
+
 ### Device Model Stub Domains
 
-Status: Supported
+Status: Supported, with caveats
+
+Vulnerabilities of a device model stub domain 
+to a hostile driver domain (either compromised or untrusted)
+are excluded from security support.
 
 ### KCONFIG Expert
 
@@ -522,6 +536,23 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
 This feature is not security supported: see 
http://xenbits.xen.org/xsa/advisory-163.html
 
+### x86/PCI Device Passthrough
+
+Status: Supported, with caveats
+
+Only systems using IOMMUs will be supported.
+
+Not compatible with migration, altp2m, introspection, memory sharing, or 
memory paging.
+
+Because of hardware limitations
+(affecting any operating system or hypervisor),
+it is generally not safe to use this feature 
+to expose a physical device to completely untrusted guests.
+However, this feature can still confer significant security benefit 
+when used to remove drivers and backends from domain 0
+(i.e., Driver Domains).
+See docs/PCI-IOMMU-bugs.txt for more information.
+
 ### ARM/Non-PCI device passthrough
 
 Status: Supported
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 16/16] SUPPORT.md: Add limits RFC

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Could someone take this one over as well?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 61 +
 1 file changed, 61 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index e72f9f3892..d11e05fc2a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -64,6 +64,53 @@ for the definitions of the support status levels etc.
 
 Extension to the GICv3 interrupt controller to support MSI.
 
+## Limits/Host
+
+### CPUs
+
+Limit, x86: 4095
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+Note that for x86, a very large number of CPUs may not work or boot,
+but we will still provide security support.
+
+### x86/RAM
+
+Limit, x86: 123TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 5TiB
+
+## Limits/Guest
+
+### Virtual CPUs
+
+Limit, x86 PV: 8192
+Limit-security, x86 PV: 32
+Limit, x86 HVM: 128
+Limit-security, x86 HVM: 32
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+### Virtual RAM
+
+Limit-security, x86 PV: 2047GiB
+Limit-security, x86 HVM: 1.5TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 1TiB
+
+Note that there are no theoretical limits to PV or HVM guest sizes
+other than those determined by the processor architecture.
+
+### Event Channel 2-level ABI
+
+Limit, 32-bit: 1024
+Limit, 64-bit: 4096
+
+### Event Channel FIFO ABI
+
+Limit: 131072
+
 ## Guest Type
 
 ### x86/PV
@@ -685,6 +732,20 @@ If support differs based on implementation
 (for instance, x86 / ARM, Linux / QEMU / FreeBSD),
 one line for each set of implementations will be listed.
 
+### Limit-security
+
+For size limits.
+This figure shows the largest configuration which will receive
+security support.
+It is generally determined by the maximum amount that is regularly tested.
+This limit will only be listed explicitly
+if it is different from the theoretical limit.
+
+### Limit
+
+This figure shows a theoretical size limit.
+This does not mean that such a large configuration will actually work.
+
 ## Definition of Status labels
 
 Each Status value corresponds to levels of security support,
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 15/16] SUPPORT.md: Add statement on migration RFC

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Would someone be willing to take over this one?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index a8388f3dc5..e72f9f3892 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -294,6 +294,36 @@ This includes exposing event channels to HVM guests.
 
 ## High Availability and Fault Tolerance
 
+### Live Migration, Save & Restore
+
+Status, x86: Supported, with caveats
+
+A number of features don't work with live migration / save / restore.  These 
include:
+ * PCI passthrough
+ * vNUMA
+ * Nested HVM
+
+XXX Need to check the following:
+ 
+ * Guest serial console
+ * Crash kernels
+ * Transcendent Memory
+ * Alternative p2m
+ * vMCE
+ * vPMU
+ * Intel Platform QoS
+ * Remus
+ * COLO
+ * PV protocols: Keyboard, PVUSB, PVSCSI, PVTPM, 9pfs, pvcalls?
+ * FLASK?
+ * CPU / memory hotplug?
+
+Additionally, if an HVM guest was booted with memory != maxmem,
+and the balloon driver hadn't hit the target before migration,
+the size of the guest on the far side might be unexpected.
+
+See docs/features/migration.pandoc for more details
+
 ### Remus Fault Tolerance
 
 Status: Experimental
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 11/16] SUPPORT.md: Add 'easy' HA / FT features

2017-11-13 Thread George Dunlap
Migration being one of the key 'non-easy' ones to be added later.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 16 
 1 file changed, 16 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index bd83c81557..722a29fec5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -261,6 +261,22 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## High Availability and Fault Tolerance
+
+### Remus Fault Tolerance
+
+Status: Experimental
+
+### COLO Manager
+
+Status: Experimental
+
+### x86/vMCE
+
+Status: Supported
+
+Forward Machine Check Exceptions to appropriate guests
+
 ## Virtual driver support, guest side
 
 ### Blkfront
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] tools/xenstored: Check number of strings passed to do_control()

2017-11-13 Thread Julien Grall

Hi,

Apologies for the late answer, I missed the e-mail in my inbox.

On 10/27/2017 05:37 PM, Ian Jackson wrote:

Pawel Wieczorkiewicz writes ("[PATCH] tools/xenstored: Check number of strings 
passed to do_control()"):

It is possible to send a zero-string message body to xenstore's
XS_CONTROL handling function. Then the number of strings is used
for an array allocation. This leads to a crash in strcmp() in a
CONTROL sub-command invocation loop.
The output of xs_count_string() should be verified and all 0 or
negative values should be rejected with an EINVAL. At least the
sub-command name must be specified.

The xenstore crash can only be triggered from within dom0 (there
is a check in do_control() rejecting all non-dom0 requests with
an EACCES).


Acked-by: Ian Jackson 

(Added the for-4.10 tag to the Subject.)


Release-acked-by: Julien Grall 

Cheers,

--
Julien Grall
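[Aside: a rough sketch of the check described in the patch above,
assuming the handler counts the NUL-separated strings in the request
with xs_count_strings() (called xs_count_string() in the description);
variable names here may not match the actual commit.]

        /* Sketch only: reject empty XS_CONTROL payloads before indexing. */
        num = xs_count_strings(in->buffer, in->used);
        if (num < 1)
                return EINVAL;  /* at least the sub-command name is required */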

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Joao Martins
On Mon, Nov 13, 2017 at 11:58:03AM +, Paul Durrant wrote:
> On Mon, Nov 13, 2017 at 11:54:00AM +, Joao Martins wrote:
> > On 11/13/2017 10:33 AM, Paul Durrant wrote:
> > > On 11/10/2017 19:35 PM, Joao Martins wrote:

[snip]

> > >> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
> > >> index b1cf7c6f407a..793a85f61f9d 100644
> > >> --- a/drivers/net/xen-netback/rx.c
> > >> +++ b/drivers/net/xen-netback/rx.c
> > >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> > >> xenvif_queue *queue,
> > >> struct xen_netif_rx_request *req,
> > >> unsigned int offset, void *data, size_t 
> > >> len)
> > >>  {
> > >> +unsigned int batch_size;
> > >>  struct gnttab_copy *op;
> > >>  struct page *page;
> > >>  struct xen_page_foreign *foreign;
> > >>
> > >> -if (queue->rx_copy.num == COPY_BATCH_SIZE)
> > >> +batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> > >
> > > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> > > identical? Why do you need this statement (and hence stack variable)?
> > >
> > This statement was to allow the value to be changed dynamically; it would
> > affect all newly created guests, or running guests if the value happened
> > to be smaller than the one initially allocated. But I suppose I should make
> > the behaviour more consistent with the other params we have right now
> > and just look at the initially allocated one, `queue->rx_copy.batch_size`?
> 
> Yes, that would certainly be consistent but I can see value in
> allowing it to be dynamically tuned, so perhaps adding some re-allocation
> code to allow the batch to be grown as well as shrunk might be nice.

In the shrink case we potentially risk losing data, so we need to gate the
reallocation on `rx_copy.num` not exceeding the newly requested
batch size. In the worst case, guestrx_thread simply uses the initially
allocated value.

Anyhow, something like the below scissored diff (on top of your comments):

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a165a4123396..8e4eaf3a507d 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -359,6 +359,7 @@ irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data);
 
 void xenvif_rx_action(struct xenvif_queue *queue);
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
+int xenvif_rx_copy_realloc(struct xenvif_queue *queue, unsigned int size);
 
 void xenvif_carrier_on(struct xenvif *vif);
 
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 1892bf9327e4..14613b5fcccb 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -516,20 +516,13 @@ struct xenvif *xenvif_alloc(struct device *parent, 
domid_t domid,
 
 int xenvif_init_queue(struct xenvif_queue *queue)
 {
-   unsigned int size = xenvif_copy_batch_size;
int err, i;
-   void *addr;
-
-   addr = vzalloc(size * sizeof(struct gnttab_copy));
-   if (!addr)
-   goto err;
-   queue->rx_copy.op = addr;
 
-   addr = vzalloc(size * sizeof(RING_IDX));
-   if (!addr)
+   err = xenvif_rx_copy_realloc(queue, xenvif_copy_batch_size);
+   if (err) {
+   netdev_err(queue->vif->dev, "Could not alloc rx_copy\n");
goto err;
-   queue->rx_copy.idx = addr;
-   queue->rx_copy.batch_size = size;
+   }
 
queue->credit_bytes = queue->remaining_credit = ~0UL;
queue->credit_usec  = 0UL;
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index be3946cdaaf6..f54bfe72188c 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -130,6 +130,51 @@ static void xenvif_rx_queue_drop_expired(struct 
xenvif_queue *queue)
}
 }
 
+int xenvif_rx_copy_realloc(struct xenvif_queue *queue, unsigned int size)
+{
+   void *op = NULL, *idx = NULL;
+
+   /* No reallocation if new size doesn't fit ongoing requests */
+   if (!size || queue->rx_copy.num > size)
+   return -EINVAL;
+
+   op = vzalloc(size * sizeof(struct gnttab_copy));
+   if (!op)
+   goto err;
+
+   idx = vzalloc(size * sizeof(RING_IDX));
+   if (!idx)
+   goto err;
+
+   /* Ongoing requests need copying */
+   if (queue->rx_copy.num) {
+   unsigned int tmp;
+
+   tmp = queue->rx_copy.num * sizeof(struct gnttab_copy);
+   memcpy(op, queue->rx_copy.op, tmp);
+
+   tmp = queue->rx_copy.num * sizeof(RING_IDX);
+   memcpy(idx, queue->rx_copy.idx, tmp);
+   }
+
+   if (queue->rx_copy.op || queue->rx_copy.idx) {
+   vfree(queue->rx_copy.op);
+   vfree(queue->rx_copy.idx);
+   }
+
+   queue->rx_copy.op = op;
+   queue->rx_copy.idx = idx;
+   queue->rx_copy.batch_size = size;
+   

[Xen-devel] [linux-4.1 test] 116124: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116124 linux-4.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116124/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl  broken  in 115738
 test-amd64-amd64-xl-pvhv2-amd broken in 115738
 test-armhf-armhf-xl-multivcpu broken in 115738
 test-amd64-amd64-xl-qemut-win7-amd64  broken in 115738
 test-armhf-armhf-libvirt-xsm broken  in 115738
 test-amd64-amd64-xl-qemuu-win7-amd64  broken in 115738
 test-armhf-armhf-libvirt broken  in 115738
 test-armhf-armhf-xl-cubietruckbroken in 115738
 test-amd64-amd64-xl-qemut-ws16-amd64  broken in 115738
 test-armhf-armhf-xl-vhd  broken  in 115738
 test-armhf-armhf-xl-credit2  broken  in 115738
 test-armhf-armhf-libvirt-raw broken  in 115738
 test-amd64-amd64-xl-qcow2broken  in 115738
 build-i386-rumprun8 xen-build  fail in 115738 REGR. vs. 114665
 build-i386-xsm6 xen-build  fail in 115738 REGR. vs. 114665
 build-i386-pvops  6 kernel-build   fail in 115738 REGR. vs. 114665

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-vhd  4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-xl-multivcpu 4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-examine  5 host-install   broken in 115738 pass in 116124
 test-armhf-armhf-libvirt 4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-libvirt-raw 4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-libvirt-xsm 4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-xl-credit2  4 host-install(4) broken in 115738 pass in 116124
 test-amd64-amd64-xl-pvhv2-amd 4 host-install(4) broken in 115738 pass in 116124
 test-amd64-amd64-xl-qemut-ws16-amd64 4 host-install(4) broken in 115738 pass 
in 116124
 test-amd64-amd64-xl-qemut-win7-amd64 4 host-install(4) broken in 115738 pass 
in 116124
 test-amd64-amd64-xl-qemuu-win7-amd64 4 host-install(4) broken in 115738 pass 
in 116124
 test-amd64-amd64-xl-qcow24 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-xl  4 host-install(4) broken in 115738 pass in 116124
 test-armhf-armhf-xl-cubietruck 4 host-install(4) broken in 115738 pass in 
116124
 test-amd64-amd64-libvirt 10 debian-install   fail in 115738 pass in 116124
 test-amd64-amd64-xl-credit2  10 debian-install   fail in 115738 pass in 116124
 test-amd64-amd64-xl-multivcpu 10 debian-install  fail in 115738 pass in 116124
 test-amd64-amd64-qemuu-nested-intel 10 debian-hvm-install fail in 115738 pass 
in 116124
 test-amd64-amd64-qemuu-nested-amd 10 debian-hvm-install fail in 115738 pass in 
116124
 test-amd64-amd64-i386-pvgrub 10 debian-di-install fail in 115738 pass in 116124
 test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat fail in 116104 pass in 
116124
 test-armhf-armhf-xl-arndale   7 xen-boot fail in 116104 pass in 116124
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat  fail pass in 115693
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat  fail pass in 115738
 test-amd64-i386-libvirt-qcow2 18 guest-start.2 fail pass in 116104

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qcow2  1 build-check(1)  blocked in 115738 n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1) blocked in 115738 n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked in 
115738 n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked in 
115738 n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 1 build-check(1) blocked in 
115738 n/a
 test-amd64-i386-xl-qemut-win10-i386  1 build-check(1)blocked in 115738 n/a
 test-amd64-i386-xl-qemuu-win10-i386  1 build-check(1)blocked in 115738 n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-examine   1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-xl-qemut-ws16-amd64  1 build-check(1)blocked in 115738 n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked in 115738 n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)blocked in 115738 n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)blocked in 115738 n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked in 115738 n/a
 test-amd64-i386-xl1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked in 115738 n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 

Re: [Xen-devel] [PATCH 1/4 v3 for-4.10] libxl: Fix the bug introduced in commit "libxl: use correct type modifier for vuart_gfn"

2017-11-13 Thread Julien Grall

Hi Wei,

Sorry I missed that e-mail.

On 10/31/2017 10:07 AM, Wei Liu wrote:

Change the tag to for-4.10.

Julien, this is needed to fix vuart emulation.


To confirm, only patch #1 is candidate for Xen 4.10, right? The rest 
will be queued for Xen 4.11?


For patch #1:

Release-acked-by: Julien Grall 

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] common/spinlock: Improve the output from check_lock() if it trips

2017-11-13 Thread Julien Grall

Hi Jan,

On 11/06/2017 11:09 AM, Jan Beulich wrote:

On 31.10.17 at 11:49,  wrote:

--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -44,7 +44,13 @@ static void check_lock(struct lock_debug *debug)
  if ( unlikely(debug->irq_safe != irq_safe) )
  {
   int seen = cmpxchg(&debug->irq_safe, -1, irq_safe);
-BUG_ON(seen == !irq_safe);
+
+if ( seen == !irq_safe )
+{
+printk("CHECKLOCK FAILURE: prev irqsafe: %d, curr irqsafe %d\n",
+   seen, irq_safe);
+BUG();


This really should use XENLOG_ERR imo, so that the message won't
be lost if warnings are rate limited.


The patch was already merged. I guess a follow-up could be done for Xen 
4.10.


Cheers,

--
Julien Grall
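[Aside: the follow-up Jan asks for would simply give the message an
explicit log level so that rate limiting cannot drop it, e.g. something
along the lines of:]

    printk(XENLOG_ERR "CHECKLOCK FAILURE: prev irqsafe: %d, curr irqsafe %d\n",
           seen, irq_safe);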

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86/hvm: do not register hpet mmio during s3 cycle

2017-11-13 Thread Julien Grall

Hi Jan,

On 11/09/2017 02:45 PM, Jan Beulich wrote:

On 09.11.17 at 15:42,  wrote:

Hi,

On 09/11/17 08:55, Jan Beulich wrote:

On 08.11.17 at 20:46,  wrote:

Do it once at domain creation (hpet_init).

Sleep -> Resume cycles will end up crashing an HVM guest with hpet as
the sequence during resume takes the path:
-> hvm_s3_suspend
-> hpet_reset
  -> hpet_deinit
  -> hpet_init
-> register_mmio_handler
  -> hvm_next_io_handler

register_mmio_handler will use a new io handler each time, until
eventually it reaches NR_IO_HANDLERS, then hvm_next_io_handler calls
domain_crash.

Signed-off-by: Eric Chanudet 

---
v2:
* make hpet_reinit static inline (one call site in this file)


Perhaps my prior reply was ambiguous: By "inlining" I meant
literally inlining it (i.e. dropping the standalone function
altogether). Static functions outside of header files should not
normally be marked "inline" explicitly - it should be the compiler
to make that decision.

As doing the adjustment is relatively simple, I wouldn't mind
doing so while committing, saving another round trip. With
that adjustment (or at the very least with the "inline" dropped)
Reviewed-by: Jan Beulich 


What would be the risk of getting this patch into Xen 4.10?


Close to none, I would say. Of course, if there really was
something wrong with the code restructuring to fix the bug,
basically all HVM guests would be hosed HPET-wise.


On that basis:

Release-acked-by: Julien Grall 

Cheers,

--
Julien Grall
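[Aside: based on the description quoted above, the shape of the fix
being release-acked is roughly the sketch below; the actual patch may
structure hpet_reinit() differently.]

static void hpet_reinit(struct domain *d);  /* state reset only, body omitted */

/* Register the MMIO handler exactly once, at domain creation. */
void hpet_init(struct domain *d)
{
    hpet_reinit(d);
    register_mmio_handler(d, &hpet_mmio_ops);
}

/* S3 suspend/resume path: reset state without registering a new handler. */
void hpet_reset(struct domain *d)
{
    hpet_deinit(d);
    hpet_reinit(d);
}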

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Joao Martins
On Mon, Nov 13, 2017 at 04:39:09PM +, Paul Durrant wrote:
> > -Original Message-
> > From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> > Sent: 13 November 2017 16:34
> > To: Paul Durrant 
> > Cc: net...@vger.kernel.org; Wei Liu ; xen-
> > de...@lists.xenproject.org
> > Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> > configurable
> > 
> > On Mon, Nov 13, 2017 at 11:58:03AM +, Paul Durrant wrote:
> > > On Mon, Nov 13, 2017 at 11:54:00AM +, Joao Martins wrote:
> > > > On 11/13/2017 10:33 AM, Paul Durrant wrote:
> > > > > On 11/10/2017 19:35 PM, Joao Martins wrote:
> > 
> > [snip]
> > 
> > > > >> diff --git a/drivers/net/xen-netback/rx.c 
> > > > >> b/drivers/net/xen-netback/rx.c
> > > > >> index b1cf7c6f407a..793a85f61f9d 100644
> > > > >> --- a/drivers/net/xen-netback/rx.c
> > > > >> +++ b/drivers/net/xen-netback/rx.c
> > > > >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct 
> > > > >> xenvif_queue *queue,
> > > > >> struct xen_netif_rx_request *req,
> > > > >> unsigned int offset, void *data, size_t 
> > > > >> len)
> > > > >>  {
> > > > >> +unsigned int batch_size;
> > > > >>  struct gnttab_copy *op;
> > > > >>  struct page *page;
> > > > >>  struct xen_page_foreign *foreign;
> > > > >>
> > > > >> -if (queue->rx_copy.num == COPY_BATCH_SIZE)
> > > > >> +batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> > > > >
> > > > > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> > > > > identical? Why do you need this statement (and hence stack variable)?
> > > > >
> > > > This statement was to allow the value to be changed dynamically; it would
> > > > affect all newly created guests, or running guests if the value happened
> > > > to be smaller than the one initially allocated. But I suppose I should make
> > > > the behaviour more consistent with the other params we have right now
> > > > and just look at the initially allocated one, `queue->rx_copy.batch_size`?
> > >
> > > Yes, that would certainly be consistent but I can see value in
> > > allowing it to be dynamically tuned, so perhaps adding some re-allocation
> > > code to allow the batch to be grown as well as shrunk might be nice.
> > 
> > In the shrink case we potentially risk losing data, so we need to gate the
> > reallocation on `rx_copy.num` not exceeding the newly requested
> > batch size. In the worst case, guestrx_thread simply uses the initially
> > allocated value.
> 
> Can't you just re-alloc immediately after the flush (when num is
> guaranteed to be zero)?

/facepalm

Yes, after the flush should make things much simpler.

Joao
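[Aside: a hypothetical sketch of the "re-alloc immediately after the
flush" idea agreed on above. xenvif_copy_batch_size,
xenvif_rx_copy_realloc() and the rx_copy fields come from the diffs in
this thread; the wrapper name is invented, the flush helper is assumed
to be xenvif_rx_copy_flush(), and this is not the eventual patch.]

static void xenvif_rx_copy_flush_and_resize(struct xenvif_queue *queue)
{
        /* Flush first: afterwards rx_copy.num is guaranteed to be zero,
         * so reallocating cannot drop any pending grant-copy operations. */
        xenvif_rx_copy_flush(queue);

        if (queue->rx_copy.batch_size != xenvif_copy_batch_size)
                xenvif_rx_copy_realloc(queue, xenvif_copy_batch_size);
}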

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c

2017-11-13 Thread Stefano Stabellini
On Fri, 10 Nov 2017, Boris Ostrovsky wrote:
> On 11/10/2017 06:57 PM, Stefano Stabellini wrote:
> > On Tue, 7 Nov 2017, Juergen Gross wrote:
> > > On 06/11/17 23:17, Stefano Stabellini wrote:
> > > > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > > > take in_mutex on the first try, but you can't take out_mutex. Next times
> > > > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > > > loop.
> > > > 
> > > > Solve the problem by moving the two mutex_trylock calls to two separate
> > > > loops.
> > > > 
> > > > Reported-by: Dan Carpenter 
> > > > Signed-off-by: Stefano Stabellini 
> > > > CC: boris.ostrov...@oracle.com
> > > > CC: jgr...@suse.com
> > > > ---
> > > >   drivers/xen/pvcalls-front.c | 5 +++--
> > > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > > > index 0c1ec68..047dce7 100644
> > > > --- a/drivers/xen/pvcalls-front.c
> > > > +++ b/drivers/xen/pvcalls-front.c
> > > > @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> > > >  * is set to NULL -- we only need to wait for the
> > > > existing
> > > >  * waiters to return.
> > > >  */
> > > > -   while (!mutex_trylock(&map->active.in_mutex) ||
> > > > -  !mutex_trylock(&map->active.out_mutex))
> > > > +   while (!mutex_trylock(&map->active.in_mutex))
> > > > +   cpu_relax();
> > > > +   while (!mutex_trylock(&map->active.out_mutex))
> > > > cpu_relax();
> > > 
> > > Any reason you don't just use mutex_lock()?
> > 
> > Hi Juergen, sorry for the late reply.
> > 
> > Yes, you are right. Given the patch, it would be just the same to use
> > mutex_lock.
> > 
> > This is where I realized that actually we have a problem: no matter if
> > we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> > be the last to take the in/out_mutex. Other waiters could be still
> > outstanding.
> > 
> > We solved the same problem using a refcount in pvcalls_front_remove. In
> > this case, I was thinking of reusing the mutex internal counter for
> > efficiency, instead of adding one more refcount.
> > 
> > For using the mutex as a refcount, there is really no need to call
> > mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> > directly:
> > 
> 
> 
> I think you want to say
> 
>   while(mutex_locked(&map->active.in_mutex.owner) ||
> mutex_locked(&map->active.out_mutex.owner))
>   cpu_relax();
> 
> since owner being NULL is not a documented value of a free mutex.
> 

You are right, and the function name is "mutex_is_locked", so it would
be:


while(mutex_is_locked(&map->active.in_mutex.owner) ||
  mutex_is_locked(&map->active.out_mutex.owner))
cpu_relax();


> > 
> > while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> >atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> > 
> > Cheers,
> > 
> > Stefano
> > 
> > 
> > ---
> > 
> > xen/pvcalls: fix potential endless loop in pvcalls-front.c
> > 
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next time
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> > 
> > Actually, we don't want to use mutex_trylock at all: we don't need to
> > take the mutex, we only need to wait until the last mutex waiter/holder
> > releases it.
> > 
> > Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> > refcount instead.
> > 
> > Reported-by: Dan Carpenter 
> > Signed-off-by: Stefano Stabellini 
> > CC: boris.ostrov...@oracle.com
> > CC: jgr...@suse.com
> > 
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..9f33cb8 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> >  * is set to NULL -- we only need to wait for the existing
> >  * waiters to return.
> >  */
> > -   while (!mutex_trylock(&map->active.in_mutex) ||
> > -  !mutex_trylock(&map->active.out_mutex))
> > +   while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > +  atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> > pvcalls_front_free_map(bedata, map);
> > 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 116143: tolerable all pass - PUSHED

2017-11-13 Thread osstest service owner
flight 116143 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116143/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  20ed7c8177da2847d65bb3373c6f1263671322d4
baseline version:
 xen  a08ee2c06dd06ef07a0c4f9c5f84b0d86694be7f

Last test of basis   116137  2017-11-13 11:02:47 Z0 days
Testing same since   116143  2017-11-13 16:05:04 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To osst...@xenbits.xen.org:/home/xen/git/xen.git
   a08ee2c..20ed7c8  20ed7c8177da2847d65bb3373c6f1263671322d4 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline test] 116126: tolerable FAIL - PUSHED

2017-11-13 Thread osstest service owner
flight 116126 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116126/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeatfail  like 115657
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115707
 test-amd64-amd64-xl-qcow219 guest-start/debian.repeatfail  like 115707
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 115707
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115707
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115707
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115707
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115707
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuu4ffa88c99c54d2a30f79e3dbecec50b023eff1c8
baseline version:
 qemuub0fbe46ad82982b289a44ee2495b59b0bad8a842

Last test of basis   115707  2017-11-09 19:51:54 Z3 days
Failing since115732  2017-11-10 16:18:32 Z3 days4 attempts
Testing same since   115747  2017-11-11 07:29:17 Z2 days3 attempts


People who touched revisions under test:
  Daniel P. Berrange 
  David Gibson 
  Greg Kurz 
  Longpeng 
  Michael Davidsaver 
  Peter Maydell 
  Thomas Huth 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvops

Re: [Xen-devel] [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c

2017-11-13 Thread Stefano Stabellini
On Mon, 13 Nov 2017, Juergen Gross wrote:
> On 11/11/17 00:57, Stefano Stabellini wrote:
> > On Tue, 7 Nov 2017, Juergen Gross wrote:
> >> On 06/11/17 23:17, Stefano Stabellini wrote:
> >>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> >>> take in_mutex on the first try, but you can't take out_mutex. Next times
> >>> you call mutex_trylock() in_mutex is going to fail. It's an endless
> >>> loop.
> >>>
> >>> Solve the problem by moving the two mutex_trylock calls to two separate
> >>> loops.
> >>>
> >>> Reported-by: Dan Carpenter 
> >>> Signed-off-by: Stefano Stabellini 
> >>> CC: boris.ostrov...@oracle.com
> >>> CC: jgr...@suse.com
> >>> ---
> >>>  drivers/xen/pvcalls-front.c | 5 +++--
> >>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> >>> index 0c1ec68..047dce7 100644
> >>> --- a/drivers/xen/pvcalls-front.c
> >>> +++ b/drivers/xen/pvcalls-front.c
> >>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> >>>* is set to NULL -- we only need to wait for the existing
> >>>* waiters to return.
> >>>*/
> >>> - while (!mutex_trylock(&map->active.in_mutex) ||
> >>> -!mutex_trylock(&map->active.out_mutex))
> >>> + while (!mutex_trylock(&map->active.in_mutex))
> >>> + cpu_relax();
> >>> + while (!mutex_trylock(&map->active.out_mutex))
> >>>   cpu_relax();
> >>
> >> Any reason you don't just use mutex_lock()?
> > 
> > Hi Juergen, sorry for the late reply.
> > 
> > Yes, you are right. Given the patch, it would be just the same to use
> > mutex_lock.
> > 
> > This is where I realized that actually we have a problem: no matter if
> > we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> > be the last to take the in/out_mutex. Other waiters could be still
> > outstanding.
> > 
> > We solved the same problem using a refcount in pvcalls_front_remove. In
> > this case, I was thinking of reusing the mutex internal counter for
> > efficiency, instead of adding one more refcount.
> > 
> > For using the mutex as a refcount, there is really no need to call
> > mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> > directly:
> > 
> > 
> > while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> >atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> > 
> > Cheers,
> > 
> > Stefano
> > 
> > 
> > ---
> > 
> > xen/pvcalls: fix potential endless loop in pvcalls-front.c
> > 
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next time
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> > 
> > Actually, we don't want to use mutex_trylock at all: we don't need to
> > take the mutex, we only need to wait until the last mutex waiter/holder
> > releases it.
> > 
> > Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> > refcount instead.
> > 
> > Reported-by: Dan Carpenter 
> > Signed-off-by: Stefano Stabellini 
> > CC: boris.ostrov...@oracle.com
> > CC: jgr...@suse.com
> > 
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..9f33cb8 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> >  * is set to NULL -- we only need to wait for the existing
> >  * waiters to return.
> >  */
> > -   while (!mutex_trylock(&map->active.in_mutex) ||
> > -  !mutex_trylock(&map->active.out_mutex))
> > +   while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > +  atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> 
> I don't like this.
> 
> Can't you use a kref here? Even if it looks like more overhead, it is
> much cleaner. There will be no questions regarding possible races,
> while an approach like yours will always smell racy (can't someone
> take the mutex just after the above test?).
> 
> In no case should you make use of the mutex internals.

Boris' suggestion solves that problem well. Would you be OK with the
proposed

while(mutex_is_locked(&map->active.in_mutex.owner) ||
  mutex_is_locked(&map->active.out_mutex.owner))
cpu_relax();

?
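[Aside: for reference, mutex_is_locked() takes the mutex itself rather
than its .owner field, so the check being converged on would presumably
end up reading:]

        while (mutex_is_locked(&map->active.in_mutex) ||
               mutex_is_locked(&map->active.out_mutex))
                cpu_relax();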

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-13 Thread Oleksandr Tyshchenko
On Thu, Nov 9, 2017 at 7:18 PM, Andrii Anisov  wrote:
> Dear Oleksandr,
Dear Andrii

>
>
> Please consider my `Reviewed-by: Andrii Anisov ` for
> all patches.
>
> That is what you missed after extracting this stuff from github.
Thanks. I will add.

>
>
> On 09.11.17 19:09, Oleksandr Tyshchenko wrote:
>>
>> From: Oleksandr Tyshchenko 
>
>
> --
>
> *Andrii Anisov*
>
>



-- 
Regards,

Oleksandr Tyshchenko

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2 v2] xen: Add support for initializing 16550 UART using ACPI

2017-11-13 Thread Julien Grall

Hi Bhupinder,

On 11/09/2017 10:19 AM, Bhupinder Thakur wrote:

Currently, Xen supports only DT based initialization of 16550 UART.
This patch adds support for initializing 16550 UART using ACPI SPCR table.

This patch also makes the uart initialization code common between DT and
ACPI based initialization.


Can you please have one patch to refactor the code and one to add ACPI 
support? This will be easier to review.




Signed-off-by: Bhupinder Thakur 
---
TBD:
There was one review comment from Julien about how the uart->io_size is being
calculated. Currently, I am calulating the io_size based on address of the last
UART register.

pci_uart_config also calculates the uart->io_size like this:

uart->io_size = max(8U << param->reg_shift,
  param->uart_offset);

I am not sure whether we can use similar logic for calculating uart->io_size.

Changes since v1:
- Reused common code between DT and ACPI based initializations

CC: Andrew Cooper 
CC: George Dunlap 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Konrad Rzeszutek Wilk 
CC: Stefano Stabellini 
CC: Tim Deegan 
CC: Wei Liu 
CC: Julien Grall 

  xen/drivers/char/ns16550.c  | 132 
  xen/include/xen/8250-uart.h |   1 +
  2 files changed, 121 insertions(+), 12 deletions(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index e0f8199..cf42fce 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -1463,18 +1463,13 @@ void __init ns16550_init(int index, struct 
ns16550_defaults *defaults)
  }
  
  #ifdef CONFIG_HAS_DEVICE_TREE

-static int __init ns16550_uart_dt_init(struct dt_device_node *dev,
-   const void *data)
+static int ns16550_init_dt(struct ns16550 *uart,
+   const struct dt_device_node *dev)
  {
-struct ns16550 *uart;
-int res;
+int res = 0;
  u32 reg_shift, reg_width;
  u64 io_size;
  
-uart = &ns16550_com[0];

-
-ns16550_init_common(uart);
-
  uart->baud  = BAUD_AUTO;
  uart->data_bits = 8;
  uart->parity= UART_PARITY_NONE;
@@ -1510,18 +1505,103 @@ static int __init ns16550_uart_dt_init(struct 
dt_device_node *dev,
  
  uart->dw_usr_bsy = dt_device_is_compatible(dev, "snps,dw-apb-uart");
  
+return res;

+}
+#else
+static int ns16550_init_dt(struct ns16550 *uart,
+   const struct dt_device_node *dev)
+{
+return -EINVAL;
+}
+#endif
+
+#ifdef CONFIG_ACPI
+#include 
+static int ns16550_init_acpi(struct ns16550 *uart,
+ const void *data)
+{
+struct acpi_table_spcr *spcr = NULL;
+int status = 0;
+
+status = acpi_get_table(ACPI_SIG_SPCR, 0,
+(struct acpi_table_header **)&spcr);
+
+if ( ACPI_FAILURE(status) )
+{
+printk("ns16550: Failed to get SPCR table\n");
+return -EINVAL;
+}
+
+uart->baud  = BAUD_AUTO;
+uart->data_bits = 8;
+uart->parity= spcr->parity;
+uart->stop_bits = spcr->stop_bits;
+uart->io_base = spcr->serial_port.address;
+uart->irq = spcr->interrupt;
+uart->reg_width = spcr->serial_port.bit_width / 8;
+uart->reg_shift = 0;
+uart->io_size = UART_MAX_REG << uart->reg_shift;
+
+irq_set_type(spcr->interrupt, spcr->interrupt_type);
+
+return 0;
+}
+#else
+static int ns16550_init_acpi(struct ns16550 *uart,
+ const void *data)
+{
+return -EINVAL;
+}
+#endif
+
+static int ns16550_uart_init(struct ns16550 **puart,
+ const void *data, bool acpi)
+{
+struct ns16550 *uart = &ns16550_com[0];
+
+*puart = uart;
+
+ns16550_init_common(uart);
+
+return ( acpi ) ? ns16550_init_acpi(uart, data)
+: ns16550_init_dt(uart, data);
+}


This function does not look very useful beyond getting &ns16550_com[0].
I do agree that it is nice to have common code, but I think you
went too far here.


There is no need for 3 separate functions plus 2 functions for each
firmware.


I think duplicating the code of ns16550_uart_init for ACPI and DT is
fine. You could then create a function that is a merge of vuart_init and
register_init.


This would also limit the number of #ifdef within this code.


+
+static void ns16550_vuart_init(struct ns16550 *uart)
+{
+#ifdef CONFIG_ARM
  uart->vuart.base_addr = uart->io_base;
  uart->vuart.size = uart->io_size;
-uart->vuart.data_off = UART_THR << uart->reg_shift;
-uart->vuart.status_off = UART_LSR << uart->reg_shift;
-uart->vuart.status = UART_LSR_THRE|UART_LSR_TEMT;
+uart->vuart.data_off = UART_THR << uart->reg_shift;
+uart->vuart.status_off = UART_LSR << uart->reg_shift;
+uart->vuart.status = UART_LSR_THRE | UART_LSR_TEMT;

[Xen-devel] [seabios test] 116129: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116129 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116129/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop   fail REGR. vs. 115539

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115539
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115539
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 115539
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 seabios  63451fca13c75870e1703eb3e20584d91179aebc
baseline version:
 seabios  0ca6d6277dfafc671a5b3718cbeb5c78e2a888ea

Last test of basis   115539  2017-11-03 20:48:58 Z9 days
Testing same since   115733  2017-11-10 17:19:59 Z3 days5 attempts


People who touched revisions under test:
  Kevin O'Connor 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64 fail
 test-amd64-i386-xl-qemuu-ws16-amd64  fail
 test-amd64-amd64-xl-qemuu-win10-i386 fail
 test-amd64-i386-xl-qemuu-win10-i386  fail
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 63451fca13c75870e1703eb3e20584d91179aebc
Author: Kevin O'Connor 
Date:   Fri Nov 10 11:49:19 2017 -0500

docs: Note v1.11.0 release

Signed-off-by: Kevin O'Connor 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-13 Thread Oleksandr Tyshchenko
On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
 wrote:
> Hi,
Hi Andre

>
> thanks very much for your work on this!
Thank you for your comments.

>
> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> Hi, all.
>>
>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
>> use-cases in virtualized system powered by Xen hypervisor. Rationale behind 
>> this activity is that CPU virtualization is done by hypervisor and the guest 
>> OS doesn't actually know anything about physical CPUs because it is running 
>> on virtual CPUs. It is quite clear that a decision about frequency change 
>> should be taken by hypervisor as only it has information about actual CPU 
>> load.
>
> Can you please sketch your usage scenario or workloads here? I can think
> of quite different scenarios (oversubscribed server vs. partitioning
> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
> in the design are quite different between those.
We have embedded use-cases in mind. For example, a system with
several domains, where one domain runs the most critical SW and the
other domain(s) are, let's say, for entertainment purposes.
I think CPUFreq is useful wherever power consumption is a concern.

>
> In general I doubt that a hypervisor scheduling vCPUs is in a good
> position to make a decision on the proper frequency physical CPUs should
> run with. From all I know it's already hard for an OS kernel to make
> that call. So I would actually expect that guests provide some input,
> for instance by signalling OPP change request up to the hypervisor. This
> could then decide to act on it - or not.
Each running guest sees only part of the picture, but the hypervisor has
the whole picture: it knows all about the CPUs, measures CPU load and is
able to choose the required CPU frequency to run at. I am wondering, does
Xen need additional input from guests to make a decision?
BTW, currently a guest domain on ARM doesn't even know how many physical
CPUs the system has or what the OPPs are. When creating a guest domain,
Xen inserts only dummy CPU nodes. CPU info, such as clocks, OPPs,
thermal, etc., is not passed to the guest.

>
>> Although these required components (CPUFreq core, governors, etc) already 
>> exist in Xen, it is worth to mention that they are ACPI specific. So, a part 
>> of the current patch series makes them more generic in order to make 
>> possible a CPUFreq usage on architectures without ACPI support in.
>
> Have you looked at how this is used on x86 these days? Can you briefly
> describe how this works and it's used there?
Xen supports the CPUFreq feature on x86 [1]. How widely it is used at
the moment is another question. There are two possible modes: Domain0
based CPUFreq and Hypervisor based CPUFreq [2]. As I understand it, the
second option is more popular.
Two different implementations of "Hypervisor based CPUFreq" are present:
the ACPI Processor P-States Driver and the AMD Architectural P-state
Driver. You can find both of them in the xen/arch/x86/acpi/cpufreq/ dir.

[1] 
https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
[2] 
https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq

>
>> But, the main question we have to answer is about frequency changing 
>> interface in virtualized system. The frequency changing interface and all 
>> dependent components which needed CPUFreq to be functional on ARM are not 
>> present in Xen these days. The list of required components is quite big and 
>> may change across different ARM SoC vendors. As an example, the following 
>> components are involved in DVFS on Renesas Salvator-X board which has R-Car 
>> Gen3 SoC installed: generic clock, regulator and thermal frameworks, 
>> Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>
>> We were considering a few possible approaches of hypervisor based CPUFreqs 
>> on ARM and came to conclusion to base this solution on popular at the 
>> moment, already upstreamed to Linux, ARM System Control and Power 
>> Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM 
>> System Control and Management Interface (SCMI) protocol [2] since it is 
>> widely spread in Linux, there are good examples how to use it, the range of 
>> capabilities it has is enough for implementing hypervisor based CPUFreq and, 
>> what is more, upstream Linux support for SCMI is missed so far, but SCMI 
>> could be used as well.
>>
>> Briefly speaking, the SCPI protocol is used between the System Control 
>> Processor(SCP) and the Application Processors(AP). The mailbox feature 
>> provides a mechanism for inter-processor communication between SCP and AP. 
>> The main purpose of SCP is to offload different PM related tasks from AP and 
>> one of the services that SCP provides is Dynamic 

[Xen-devel] [libvirt test] 116130: regressions - FAIL

2017-11-13 Thread osstest service owner
flight 116130 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116130/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 115476
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 115476
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 115476

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  1d358fa5c440eba267ae3db31bc5b02769fbaba7
baseline version:
 libvirt  1bf893406637e852daeaafec6617d3ee3716de25

Last test of basis   115476  2017-11-02 04:22:37 Z   11 days
Failing since115509  2017-11-03 04:20:26 Z   10 days   10 attempts
Testing same since   116130  2017-11-13 04:24:05 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Christian Ehrhardt 
  Daniel Veillard 
  Dawid Zamirski 
  Jim Fehlig 
  Jiri Denemark 
  John Ferlan 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Peter Krempa 
  Pino Toscano 
  Viktor Mihajlovski 
  Wim ten Have 
  xinhua.Cao 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair blocked 
 test-amd64-i386-libvirt-qcow2blocked 
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1461 lines long.)

___
Xen-devel mailing list

[Xen-devel] [PATCH 02/16] SUPPORT.md: Add core functionality

2017-11-13 Thread George Dunlap
Core memory management and scheduling.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Nathan Studer 
---
 SUPPORT.md | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index d7f2ae45e4..064a2f43e9 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,65 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Memory Management
+
+### Memory Ballooning
+
+Status: Supported
+
+## Resource Management
+
+### CPU Pools
+
+Status: Supported
+
+Groups physical cpus into distinct groups called "cpupools",
+with each pool having the capability
+of using different schedulers and scheduling properties.
+
+### Credit Scheduler
+
+Status: Supported
+
+A weighted proportional fair share virtual CPU scheduler.
+This is the default scheduler.
+
+### Credit2 Scheduler
+
+Status: Supported
+
+A general purpose scheduler for Xen,
+designed with particular focus on fairness, responsiveness, and scalability
+
+### RTDS based Scheduler
+
+Status: Experimental
+
+A soft real-time CPU scheduler 
+built to provide guaranteed CPU capacity to guest VMs on SMP hosts
+
+### ARINC653 Scheduler
+
+Status: Supported
+
+A periodically repeating fixed timeslice scheduler.
+Currently only single-vcpu domains are supported.
+
+### Null Scheduler
+
+Status: Experimental
+
+A very simple, very static scheduling policy 
+that always schedules the same vCPU(s) on the same pCPU(s). 
+It is designed for maximum determinism and minimum overhead
+on embedded platforms.
+
+### NUMA scheduler affinity
+
+Status, x86: Supported
+
+Enables NUMA aware scheduling in Xen
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-13 Thread George Dunlap
Mostly PV protocols.

Signed-off-by: George Dunlap 
---
The xl side of this seems a bit incomplete: There are a number of
things supported but not mentioned (like networking, ), and a number
of things not in xl (PV SCSI).  Couldn't find evidence of pvcall or pv
keyboard support.  Also we seem to be missing "PV channels" from this
list entirely.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 160 +
 1 file changed, 160 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index a8c56d13dd..20c58377a5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -130,6 +130,22 @@ Output of information in machine-parseable JSON format
 
 Status: Supported
 
+### Qemu based disk backend (qdisk) for xl
+
+Status: Supported
+
+### PV USB support for xl
+
+Status: Supported
+
+### PV 9pfs support for xl
+
+Status: Tech Preview
+
+### QEMU backend hotplugging for xl
+
+Status: Supported
+
 ## Toolstack/3rd party
 
 ### libvirt driver for xl
@@ -216,6 +232,150 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## Virtual driver support, guest side
+
+### Blkfront
+
+Status, Linux: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, Windows: Supported
+
+Guest-side driver capable of speaking the Xen PV block protocol
+
+### Netfront
+
+Status, Linux: Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, OpenBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV networking protocol
+
+### PV Framebuffer (frontend)
+
+Status, Linux (xen-fbfront): Supported
+
+Guest-side driver capable of speaking the Xen PV Framebuffer protocol
+
+### PV Console (frontend)
+
+Status, Linux (hvc_xen): Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV console protocol
+
+### PV keyboard (frontend)
+
+Status, Linux (xen-kbdfront): Supported
+Status, Windows: Supported
+
+Guest-side driver capable of speaking the Xen PV keyboard protocol
+
+[XXX 'Supported' here depends on the version we ship in 4.10 having some fixes]
+
+### PV USB (frontend)
+
+Status, Linux: Supported
+
+### PV SCSI protocol (frontend)
+
+Status, Linux: Supported, with caveats
+
+NB that while the PV SCSI backend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (frontend)
+
+Status, Linux (xen-tpmfront): Tech Preview
+
+Guest-side driver capable of speaking the Xen PV TPM protocol
+
+### PV 9pfs frontend
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of speaking the Xen 9pfs protocol
+
+### PVCalls (frontend)
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of making pv system calls
+
+Note that there is currently no xl support for pvcalls.
+
+## Virtual device support, host side
+
+### Blkback
+
+Status, Linux (blkback): Supported
+Status, FreeBSD (blkback): Supported, Security support external
+Status, NetBSD (xbdback): Supported, Security support external
+Status, QEMU (xen_disk): Supported
+Status, Blktap2: Deprecated
+
+Host-side implementations of the Xen PV block protocol
+
+### Netback
+
+Status, Linux (netback): Supported
+Status, FreeBSD (netback): Supported, Security support external
+Status, NetBSD (xennetback): Supported, Security support external
+
+Host-side implementations of Xen PV network protocol
+
+### PV Framebuffer (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV framebuffer protocol
+
+### PV Console (xenconsoled)
+
+Status: Supported
+
+Host-side implementation of the Xen PV console protocol
+
+### PV keyboard (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV keyboard protocol
+
+### PV USB (backend)
+
+Status, Linux: Experimental
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV USB protocol
+
+### PV SCSI protocol (backend)
+
+Status, Linux: Supported, with caveats
+
+NB that while the PV SCSI backend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (backend)
+
+Status: Tech 

[Xen-devel] [PATCH 05/16] SUPPORT.md: Toolstack core

2017-11-13 Thread George Dunlap
For now only include xl-specific features, or interaction with the
system.  Feature support matrix will be added when features are
mentioned.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 7c01d8cf9a..c884fac7f5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -98,6 +98,44 @@ Requires hardware virtualisation support (Intel VMX / AMD 
SVM)
 
 ARM only has one guest type at the moment
 
+## Toolstack
+
+### xl
+
+Status: Supported
+
+### Direct-boot kernel image format
+
+Supported, x86: bzImage
+Supported, ARM32: zImage
+Supported, ARM64: Image
+
+Format which the toolstack accept for direct-boot kernels
+
+### systemd support for xl
+
+Status: Supported
+
+### JSON output support for xl
+
+Status: Experimental
+
+Output of information in machine-parseable JSON format
+
+### Open vSwitch integration for xl
+
+Status, Linux: Supported
+
+### Virtual cpu hotplug
+
+Status: Supported
+
+## Toolstack/3rd party
+
+### libvirt driver for xl
+
+Status: Supported, Security support external
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 09/16] SUPPORT.md: Add ARM-specific virtual hardware

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Do we need to add anything more here?

And do we need to include ARM ACPI for guests?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index b95ee0ebe7..8235336c41 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -412,6 +412,16 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
 This feature is not security supported: see 
http://xenbits.xen.org/xsa/advisory-163.html
 
+### ARM/Non-PCI device passthrough
+
+Status: Supported
+
+### ARM: 16K and 64K page granularity in guests
+
+Status: Supported, with caveats
+
+No support for QEMU backends in a 16K or 64K domain.
+
 ## Virtual Hardware, QEMU
 
 These are devices available in HVM mode using a qemu devicemodel (the default).
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 03/16] SUPPORT.md: Add some x86 features

2017-11-13 Thread George Dunlap
Including host architecture support and guest types.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
---
 SUPPORT.md | 53 +
 1 file changed, 53 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 064a2f43e9..6b09f98331 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,59 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Host Architecture
+
+### x86-64
+
+Status: Supported
+
+## Host hardware support
+
+### Physical CPU Hotplug
+
+Status, x86: Supported
+
+### Physical Memory Hotplug
+
+Status, x86: Supported
+
+### Host ACPI (via Domain 0)
+
+Status, x86 PV: Supported
+Status, x86 PVH: Tech preview
+
+### x86/Intel Platform QoS Technologies
+
+Status: Tech Preview
+
+## Guest Type
+
+### x86/PV
+
+Status: Supported
+
+Traditional Xen PV guest
+
+No hardware requirements
+
+### x86/HVM
+
+Status: Supported
+
+Fully virtualised guest using hardware virtualisation extensions
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
+### x86/PVH guest
+
+Status: Supported
+
+PVH is a next-generation paravirtualized mode 
+designed to take advantage of hardware virtualization support when possible.
+During development this was sometimes called HVMLite or PVHv2.
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 06/16] SUPPORT.md: Add scalability features

2017-11-13 Thread George Dunlap
Superpage support and PVHVM.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 21 +
 1 file changed, 21 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index c884fac7f5..a8c56d13dd 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -195,6 +195,27 @@ on embedded platforms.
 
 Enables NUMA aware scheduling in Xen
 
+## Scalability
+
+### 1GB/2MB super page support
+
+Status, x86 HVM/PVH: Supported
+Status, ARM: Supported
+
+NB that this refers to the ability of guests
+to have higher-level page table entries point directly to memory,
+improving TLB performance.
+This is independent of the ARM "page granularity" feature (see below).
+
+### x86/PVHVM
+
+Status: Supported
+
+This is a useful label for a set of hypervisor features
+which add paravirtualized functionality to HVM guests 
+for improved performance and scalability.
+This includes exposing event channels to HVM guests.
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 10/16] SUPPORT.md: Add Debugging, analysis, crash post-portem

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 29 +
 1 file changed, 29 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 8235336c41..bd83c81557 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -152,6 +152,35 @@ Output of information in machine-parseable JSON format
 
 Status: Supported, Security support external
 
+## Debugging, analysis, and crash post-mortem
+
+### gdbsx
+
+Status, x86: Supported
+
+Debugger to debug ELF guests
+
+### Soft-reset for PV guests
+
+Status: Supported
+
+Soft-reset allows a new kernel to start 'from scratch' with a fresh VM state, 
+but with all the memory from the previous state of the VM intact.
+This is primarily designed to allow "crash kernels", 
+which can do core dumps of memory to help with debugging in the event of a 
crash.
+
+### xentrace
+
+Status, x86: Supported
+
+Tool to capture Xen trace buffer data
+
+### gcov
+
+Status: Supported, Not security supported
+
+Export hypervisor coverage data suitable for analysis by gcov or lcov.
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 01/16] Introduce skeleton SUPPORT.md

2017-11-13 Thread George Dunlap
Add a machine-readable file to describe what features are in what
state of being 'supported', as well as information about how long this
release will be supported, and so on.

The document should be formatted using "semantic newlines" [1], to make
changes easier.

Begin with the basic framework.

Signed-off-by: Ian Jackson 
Signed-off-by: George Dunlap 

[1] http://rhodesmill.org/brandon/2012/one-sentence-per-line/
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Tamas K Lengyel 
CC: Roger Pau Monne 
CC: Stefano Stabellini 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Konrad Wilk 
CC: Julien Grall 
---
 SUPPORT.md | 196 +
 1 file changed, 196 insertions(+)
 create mode 100644 SUPPORT.md

diff --git a/SUPPORT.md b/SUPPORT.md
new file mode 100644
index 00..d7f2ae45e4
--- /dev/null
+++ b/SUPPORT.md
@@ -0,0 +1,196 @@
+# Support statement for this release
+
+This document describes the support status 
+and in particular the security support status of the Xen branch
+within which you find it.
+
+See the bottom of the file 
+for the definitions of the support status levels etc.
+
+# Release Support
+
+Xen-Version: 4.10-unstable
+Initial-Release: n/a
+Supported-Until: TBD
+Security-Support-Until: Unreleased - not yet security-supported
+
+# Feature Support
+
+# Format and definitions
+
+This file contains prose, and machine-readable fragments.
+The data in a machine-readable fragment relate to
+the section and subsection in which it is found.
+
+The file is in markdown format.
+The machine-readable fragments are markdown literals
+containing RFC-822-like (deb822-like) data.
+
+## Keys found in the Feature Support subsections
+
+### Status
+
+This gives the overall status of the feature,
+including security support status, functional completeness, etc.
+Refer to the detailed definitions below.
+
+If support differs based on implementation
+(for instance, x86 / ARM, Linux / QEMU / FreeBSD),
+one line for each set of implementations will be listed.
+
+## Definition of Status labels
+
+Each Status value corresponds to levels of security support,
+testing, stability, etc., as follows:
+
+### Experimental
+
+Functional completeness: No
+Functional stability: Here be dragons
+Interface stability: Not stable
+Security supported: No
+
+### Tech Preview
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: Provisionally stable
+Security supported: No
+
+ Supported
+
+Functional completeness: Yes
+Functional stability: Normal
+Interface stability: Yes
+Security supported: Yes
+
+ Deprecated
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: No (as in, may disappear the next release)
+Security supported: Yes
+
+All of these may appear in modified form.  
+There are several interfaces, for instance,
+which are officially declared as not stable;
+in such a case this feature may be described as "Stable / Interface not 
stable".
+
+## Definition of the status label interpretation tags
+
+### Functionally complete
+
+Does it behave like a fully functional feature?
+Does it work on all expected platforms,
+or does it only work for a very specific sub-case?
+Does it have a sensible UI,
+or do you have to have a deep understanding of the internals
+to get it to work properly?
+
+### Functional stability
+
+What is the risk of it exhibiting bugs?
+
+General answers to the above:
+
+ * **Here be dragons**
+
+   Pretty likely to still crash / fail to work.
+   Not recommended unless you like life on the bleeding edge.
+
+ * **Quirky**
+
+   Mostly works but may have odd behavior here and there.
+   Recommended for playing around or for non-production use cases.
+
+ * **Normal**
+
+   Ready for production use
+
+### Interface stability
+
+If I build a system based on the current interfaces,
+will they still work when I upgrade to the next version?
+
+ * **Not stable**
+
+   Interface is still in the early stages and
+   still fairly likely to be broken in future updates.
+
+ * **Provisionally stable**
+
+   We're not yet promising backwards compatibility,
+   but we think this is probably the final form of the interface.
+   It may still require some tweaks.
+
+ * **Stable**
+
+   We will try very hard to avoid breaking backwards  compatibility,
+   and to fix any regressions that are reported.
+
+### Security supported
+
+Will XSAs be issued if security-related bugs are discovered
+in the functionality?
+
+If "no",

[Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-13 Thread George Dunlap
x86-specific virtual hardware provided by the hypervisor, toolstack,
or QEMU.

Signed-off-by: George Dunlap 
---
Added emulated QEMU support, to replace docs/misc/qemu-xen-security.

Need to figure out what to do with the "backing storage image format"
section of that document.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
---
 SUPPORT.md | 106 +
 1 file changed, 106 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 20c58377a5..b95ee0ebe7 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -376,6 +376,112 @@ there is currently no xl support.
 
 Status: Supported
 
+## Virtual Hardware, Hypervisor
+
+### x86/Nested PV
+
+Status, x86 HVM: Tech Preview
+
+This means running a Xen hypervisor inside an HVM domain,
+with support for PV L2 guests only
+(i.e., hardware virtualization extensions not provided
+to the guest).
+
+This works, but has performance limitations
+because the L1 dom0 can only access emulated L1 devices.
+
+### x86/Nested HVM
+
+Status, x86 HVM: Experimental
+
+This means running a Xen hypervisor inside an HVM domain,
+with support for running both PV and HVM L2 guests
+(i.e., hardware virtualization extensions provided
+to the guest).
+
+### x86/Advanced Vector eXtension
+
+Status: Supported
+
+### vPMU
+
+Status, x86: Supported, Not security supported
+
+Virtual Performance Management Unit for HVM guests
+
+Disabled by default (enable with hypervisor command line option).
+This feature is not security supported: see 
http://xenbits.xen.org/xsa/advisory-163.html
+
+## Virtual Hardware, QEMU
+
+These are devices available in HVM mode using a qemu devicemodel (the default).
+Note that other devices are available but not security supported.
+
+### x86/Emulated platform devices (QEMU):
+
+Status, piix3: Supported
+
+### x86/Emulated network (QEMU):
+
+Status, e1000: Supported
+Status, rtl8139: Supported
+Status, virtio-net: Supported
+
+### x86/Emulated storage (QEMU):
+
+Status, piix3 ide: Supported
+Status, ahci: Supported
+
+### x86/Emulated graphics (QEMU):
+
+Status, cirrus-vga: Supported
+Status, stdvga: Supported
+
+### x86/Emulated audio (QEMU):
+
+Status, sb16: Supported
+Status, es1370: Supported
+Status, ac97: Supported
+
+### x86/Emulated input (QEMU):
+
+Status, usbmouse: Supported
+Status, usbtablet: Supported
+Status, ps/2 keyboard: Supported
+Status, ps/2 mouse: Supported
+
+### x86/Emulated serial card (QEMU):
+
+Status, UART 16550A: Supported
+
+### x86/Host USB passthrough (QEMU):
+
+Status: Supported, not security supported 
+
+## Virtual Firmware
+
+### x86/HVM iPXE
+
+Status: Supported, with caveats
+
+Booting a guest via PXE.
+PXE inherently places full trust of the guest in the network,
+and so should only be used
+when the guest network is under the same administrative control
+as the guest itself.
+
+### x86/HVM BIOS
+
+Status: Supported
+
+Booting a guest via guest BIOS firmware
+
+### x86/HVM EFI
+
+Status: Supported
+
+Booting a guest via guest EFI firmware
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 04/16] SUPPORT.md: Add core ARM features

2017-11-13 Thread George Dunlap
Hardware support and guest type.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 29 +
 1 file changed, 29 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 6b09f98331..7c01d8cf9a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -22,6 +22,14 @@ for the definitions of the support status levels etc.
 
 Status: Supported
 
+### ARM v7 + Virtualization Extensions
+
+Status: Supported
+
+### ARM v8
+
+Status: Supported
+
 ## Host hardware support
 
 ### Physical CPU Hotplug
@@ -36,11 +44,26 @@ for the definitions of the support status levels etc.
 
 Status, x86 PV: Supported
 Status, x86 PVH: Tech preview
+Status, ARM: Experimental
 
 ### x86/Intel Platform QoS Technologies
 
 Status: Tech Preview
 
+### ARM/SMMUv1
+
+Status: Supported
+
+### ARM/SMMUv2
+
+Status: Supported
+
+### ARM/GICv3 ITS
+
+Status: Experimental
+
+Extension to the GICv3 interrupt controller to support MSI.
+
 ## Guest Type
 
 ### x86/PV
@@ -69,6 +92,12 @@ During development this was sometimes called HVMLite or 
PVHv2.
 
 Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
+### ARM guest
+
+Status: Supported
+
+ARM only has one guest type at the moment
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 00/31] CPUFreq on ARM

2017-11-13 Thread Andre Przywara
Hi,

thanks very much for your work on this!

On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Hi, all.
> 
> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
> Motivation of hypervisor based CPUFreq is to enable one of the main PM 
> use-cases in virtualized system powered by Xen hypervisor. Rationale behind 
> this activity is that CPU virtualization is done by hypervisor and the guest 
> OS doesn't actually know anything about physical CPUs because it is running 
> on virtual CPUs. It is quite clear that a decision about frequency change 
> should be taken by hypervisor as only it has information about actual CPU 
> load.

Can you please sketch your usage scenario or workloads here? I can think
of quite different scenarios (oversubscribed server vs. partitioning
RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
in the design are quite different between those.

In general I doubt that a hypervisor scheduling vCPUs is in a good
position to make a decision on the proper frequency physical CPUs should
run with. From all I know it's already hard for an OS kernel to make
that call. So I would actually expect that guests provide some input,
for instance by signalling OPP change request up to the hypervisor. This
could then decide to act on it - or not.

> Although these required components (CPUFreq core, governors, etc) already 
> exist in Xen, it is worth to mention that they are ACPI specific. So, a part 
> of the current patch series makes them more generic in order to make possible 
> a CPUFreq usage on architectures without ACPI support in.

Have you looked at how this is used on x86 these days? Can you briefly
describe how this works and it's used there?

> But, the main question we have to answer is about frequency changing 
> interface in virtualized system. The frequency changing interface and all 
> dependent components which needed CPUFreq to be functional on ARM are not 
> present in Xen these days. The list of required components is quite big and 
> may change across different ARM SoC vendors. As an example, the following 
> components are involved in DVFS on Renesas Salvator-X board which has R-Car 
> Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s 
> CPG, PMIC, AVS, THS drivers, i2c support, etc.
> 
> We were considering a few possible approaches of hypervisor based CPUFreqs on 
> ARM and came to conclusion to base this solution on popular at the moment, 
> already upstreamed to Linux, ARM System Control and Power Interface(SCPI) 
> protocol [1]. We chose SCPI protocol instead of newer ARM System Control and 
> Management Interface (SCMI) protocol [2] since it is widely spread in Linux, 
> there are good examples how to use it, the range of capabilities it has is 
> enough for implementing hypervisor based CPUFreq and, what is more, upstream 
> Linux support for SCMI is missed so far, but SCMI could be used as well.
> 
> Briefly speaking, the SCPI protocol is used between the System Control 
> Processor(SCP) and the Application Processors(AP). The mailbox feature 
> provides a mechanism for inter-processor communication between SCP and AP. 
> The main purpose of SCP is to offload different PM related tasks from AP and 
> one of the services that SCP provides is Dynamic voltage and frequency 
> scaling (DVFS), it is what we actually need for CPUFreq. I will describe this 
> approach in details down the text.
> 
> Let me explain a bit more what these possible approaches are:
> 
> 1. “Xen+hwdom” solution.
> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend 
> driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom 
> (possibly dom0) in order to scale physical CPUs. This solution hasn’t been 
> accepted by Xen community yet and seems it is not going to be accepted 
> without taking into the account still unanswered major questions and proving 
> that “all-in-Xen” solution, which Xen community considered as more 
> architecturally cleaner option, would be unworkable in practice.
> The other reasons why we decided not to stick to this approach are complex 
> communication interface between Xen and hwdom: event channel, hypercalls, 
> syscalls, passing CPU info via DT, etc and possible synchronization issues 
> with a proposed solution.
> Although it is worth to mention that the beauty of this approach was that 
> there wouldn’t be a need to port a lot of things to Xen. All frequency 
> changing interface and all dependent components which needed CPUFreq to be 
> functional were already in place.

Stefano, Julien and I were thinking about this: Wouldn't it be possible
to come up with some hardware domain, solely dealing with CPUFreq
changes? This could run a Linux kernel, but no or very little userland.
All its vCPUs would be pinned to pCPUs and would normally not be
scheduled by Xen. If Xen wants to change the 

Re: [Xen-devel] [PATCH] Config.mk: Update QEMU changeset

2017-11-13 Thread Wei Liu
On Mon, Nov 13, 2017 at 12:27:32PM +, Anthony PERARD wrote:
> New commits:
> - xen/pt: allow QEMU to request MSI unmasking at bind time
> To fix a passthrough bug.
> - ui/gtk: Fix deprecation of vte_terminal_copy_clipboard
> A build fix.
> 
> Signed-off-by: Anthony PERARD 
> ---
> Should already be released-acked.

Applied.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH] x86emul: keep compiler from using {x, y, z}mm registers itself

2017-11-13 Thread Julien Grall

Hi,

On 11/06/2017 03:04 PM, George Dunlap wrote:

On 11/06/2017 11:59 AM, Jan Beulich wrote:

On 16.10.17 at 14:42,  wrote:

On 16.10.17 at 14:37,  wrote:

On 16/10/17 13:32, Jan Beulich wrote:

Since the emulator acts on the live hardware registers, we need to
prevent the compiler from using them e.g. for inlined memcpy() /
memset() (as gcc7 does). We can't, however, set this from the command
line, as otherwise the 64-bit build would face issues with functions
returning floating point values and being declared in standard headers.

As the pragma isn't available prior to gcc6, we need to invoke it
conditionally. Luckily up to gcc6 we haven't seen generated code access
SIMD registers beyond what our asm()s do.

Reported-by: George Dunlap 
Signed-off-by: Jan Beulich 
---
While this doesn't affect core functionality, I think it would still be
nice for it to be allowed in for 4.10.


Agreed.

Has this been tested with Clang?


Sorry, no - still haven't got around to set up a suitable Clang
locally.


  It stands a good chance of being
compatible, but we may need an && !defined(__clang__) included.


Should non-gcc silently ignore "#pragma GCC ..." it doesn't
recognize, or not define __GNUC__ in the first place if it isn't
sufficiently compatible? I.e. if anything I'd expect we need
"#elif defined(__clang__)" to achieve the same for Clang by
some different pragma (if such exists).
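
(For what it's worth, a minimal self-contained sketch of how such a
conditional pragma could look on x86 is below; the exact option strings
and the clang branch are assumptions of mine, not the actual patch.)

/* Hypothetical sketch, not the real patch: keep the compiler's own code
 * generation away from SIMD registers only where the pragma is known to
 * exist (gcc 6 onwards, x86 target options). */
#include <string.h>

#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 6)
# pragma GCC target("no-sse", "no-mmx")
#elif defined(__clang__)
/* Assumption: no equivalent pragma is applied for clang here. */
#endif

int main(void)
{
    /* With the pragma in effect, gcc 6+ will not use xmm/ymm/zmm registers
     * for this inlined memset() in the affected translation unit. */
    char buf[64];
    memset(buf, 0, sizeof(buf));
    return buf[0];
}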


Not having received any reply so far, I'm wondering whether
being able to build the test harness with clang is more
important than for it to work correctly when built with gcc. I
can't predict when I would get around to set up a suitable
clang on my dev systems.


I agree with the argument you make above.  On the unlikely chance
there's a problem Travis should catch it, and someone who actually has a
clang setup can help sort it out.


I am not entirely sure whether this counts as an ack or not?

I was waiting for an Acked-by/Reviewed-by before considering the Release-Acked-by.

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash

2017-11-13 Thread Hao, Xudong

> -Original Message-
> From: Zhang, Haozhong
> Sent: Thursday, November 9, 2017 9:45 AM
> To: Jan Beulich ; Hao, Xudong 
> Cc: Julien Grall ; George Dunlap
> ; Lars Kurth ; xen-
> de...@lists.xen.org
> Subject: Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash
> 
> On 11/07/17 01:37 -0700, Jan Beulich wrote:
> > >>> On 07.11.17 at 09:23,  wrote:
> > >> From: Jan Beulich [mailto:jbeul...@suse.com]
> > >> Sent: Tuesday, November 7, 2017 4:09 PM
> > >> >>> On 07.11.17 at 02:37,  wrote:
> > >> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> > >> >> Sent: Monday, November 6, 2017 5:17 PM
> > >> >> >>> On 03.11.17 at 09:29,  wrote:
> > >> >> > We figured out the problem, some corner scripts triggered the
> > >> >> > error injection at the same page (pfn 0x180020) twice, i.e.
> > >> >> > "./xen-mceinj -t 0" run over one time, which resulted in Dom0 crash.
> > >> >>
> > >> >> But isn't this a valid scenario, which shouldn't result in a kernel 
> > >> >> crash?
> > >> > What if
> > >> >> two successive #MCs occurred for the same page?
> > >> >> I.e. ...
> > >> >>
> > >> >
> > >> > Yes, it's another valid scenario, the expect result is kernel crash.
> > >>
> > >> Kernel _crash_ or rather kernel _panic_? Of course without any
> > >> kernel messages we can't tell one from the other, but to me this makes a
> difference nevertheless.
> > >>
> > > Exactly, Dom0 crash.
> >
> > I don't believe a crash is the expected outcome here.
> >
> 
> This test case injects two errors to the same dom0 page. During the first
> injection, offline_page() is called to set PGC_broken flag of that page. 
> During the
> second injection, offline_page() detects the same broken page is touched 
> again,
> and then tries to shutdown the page owner, i.e. dom0 in this case:
> 
> /*
>  * NB. When broken page belong to guest, usually hypervisor will
>  * notify the guest to handle the broken page. However, hypervisor
>  * need to prevent malicious guest access the broken page again.
>  * Under such case, hypervisor shutdown guest, preventing recursive mce.
>  */
> if ( (pg->count_info & PGC_broken) && (owner = page_get_owner(pg)) )
> {
> *status = PG_OFFLINE_AGAIN;
> domain_shutdown(owner, SHUTDOWN_crash);
> return 0;
> }
> 
> So I think Dom0 crash and the following machine reboot are the expected
> behaviors here.
> 
> But, it looks a (unexpected) page fault happens during the reboot.
> Xudong, can you check whether a normal reboot on that machine triggers a
> page fault?
> 

Yes, a normal reboot of Dom0 triggered a Xen page fault on the Intel Skylake 
4-socket platform, but no page fault on the Skylake 2-socket system or 
Broadwell platforms.

Haozhong, will you fix this page fault issue?


Thanks,
-Xudong


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Bringing up OSS test framework on moonshot(aarch64) systems

2017-11-13 Thread Bhupinder Thakur

Hi Ian,


On Wednesday 08 November 2017 05:09 PM, Ian Jackson wrote:

Bhupinder Thakur writes ("Bringing up OSS test framework on moonshot(aarch64) 
systems"):

While going through [1], I have some queries/doubts on the configuration.
H
NetNameservers 10.80.248.2 10.80.16.28 10.80.16.67
HostProp_DhcpWatchMethod leases dhcp3 dhcp.uk.xensource.com:5556
TftpPath /usr/groups/netboot/

DebianNonfreeFirmware firmware-bnx2
DebianSuite squeeze
DebianMirrorHost debian.uk.xensource.com
DebianPreseed= <<'END'
d-i clock-setup/ntp-server string ntp.uk.xensource.com
END

1. In this configuration, where would the DNS server be running? Does
it expect that a DNS server is already configured in the network and
it has mapping of name <--> IP address for all test hosts? Or do we
need to setup it up on the OSS controller?

The information about the nameservers, the tftp server, and the ntp
server, is supposed to refer to infrastructure that already exists.
I think your test hosts should be in the DNS, yes.  It may be possible
to get it to work without doing that but I wouldn't recommend it.

osstest does not need a dedicated network.  Specifically, it can
share its broadcast domain, and its dhcp and tftp servers (and web
proxies, Debian mirrors, ntp servers, and so on), with other uses.

When running osstest in production ("Executive") mode the individual
test boxes must be configured to be available to osstest only if they
are not being used for something else, of course.

See INSTALL.production.


2. What is the DhcpWatchMethod option used for?

See under DHCP in INSTALL.production, and please let me know if that's
not clear.
What I understand is that this option is used to listen for any
changes in the DHCP lease file. But I am not clear why osstest needs to
listen for changes to the lease file. Is it because it needs to know the
IP addresses allocated to the guest VMs, so that whenever a guest VM is
allocated an IP address, osstest would find out?


3. How are the debian related options mentioned above used? Does OSS
fetches the installers/preseed files from DEbianMirrorHost and place
them in the required tftp folders?

mg-debian-installer-update downloads d-i installation information and
puts it in the tftp area.

But the tftp area is also updated at runtime, obviously, in order to
control the booting of each host.  And the mirror host is accessed
separately, too.


I may have more doubts as I try to set things up.

I'm happy to answer more questions, of course :-).


[1] https://blog.xenproject.org/2013/09/30/osstest-standalone-mode-step-by-step/

That blog post may be rather out of date, I'm afraid.  But the in-tree
documentation is somewhat better since then.


I am trying to bring up OSS test framework on a couple of moonshot
systems which are accessible to me remotely.

I'm not familiar with the referent of "moonshot" in this context.  IME
"moonshot" is a project name chosen multiple times, for different
projects, by people who want to give an impression that the project is
ambitious.

Regards,
Ian.


Regards,
Bhupinder

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -Original Message-
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 16:34
> To: Paul Durrant 
> Cc: net...@vger.kernel.org; Wei Liu ; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On Mon, Nov 13, 2017 at 11:58:03AM +, Paul Durrant wrote:
> > On Mon, Nov 13, 2017 at 11:54:00AM +, Joao Martins wrote:
> > > On 11/13/2017 10:33 AM, Paul Durrant wrote:
> > > > On 11/10/2017 19:35 PM, Joao Martins wrote:
> 
> [snip]
> 
> > > >> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-
> netback/rx.c
> > > >> index b1cf7c6f407a..793a85f61f9d 100644
> > > >> --- a/drivers/net/xen-netback/rx.c
> > > >> +++ b/drivers/net/xen-netback/rx.c
> > > >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> > > >> xenvif_queue *queue,
> > > >>   struct xen_netif_rx_request *req,
> > > >>   unsigned int offset, void *data, size_t 
> > > >> len)
> > > >>  {
> > > >> +  unsigned int batch_size;
> > > >>struct gnttab_copy *op;
> > > >>struct page *page;
> > > >>struct xen_page_foreign *foreign;
> > > >>
> > > >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> > > >> +  batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> > > >
> > > > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> > > > identical? Why do you need this statement (and hence stack variable)?
> > > >
> > > This statement was to allow to be changed dynamically and would
> > > affect all newly created guests or running guests if value happened
> > > to be smaller than initially allocated. But I suppose I should make
> > > behaviour more consistent with the other params we have right now
> > > and just look at initially allocated one `queue->rx_copy.batch_size` ?
> >
> > Yes, that would certainly be consistent but I can see value in
> > allowing it to be dynamically tuned, so perhaps adding some re-allocation
> > code to allow the batch to be grown as well as shrunk might be nice.
> 
> The shrink one we potentially risk losing data, so we need to gate the
> reallocation whenever `rx_copy.num` is less than the new requested
> batch. Worst case means guestrx_thread simply uses the initial
> allocated value.

Can't you just re-alloc immediately after the flush (when num is guaranteed to 
be zero)?
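
A rough, self-contained sketch of that idea (with simplified stand-in types
of mine, not the real xen-netback structures) could look like this:

/* Hypothetical sketch: resize the copy batch only right after a flush,
 * when rx_copy.num is guaranteed to be zero, so queued ops cannot be lost. */
#include <stdlib.h>

struct copy_op { unsigned long src, dst, len; }; /* stand-in for gnttab_copy */

struct rx_copy_state {
    struct copy_op *ops;   /* queued grant-copy operations */
    unsigned int num;      /* ops currently queued */
    unsigned int size;     /* allocated batch size */
};

static void rx_copy_flush(struct rx_copy_state *rx)
{
    /* ... the queued grant-copy hypercall would be issued here ... */
    rx->num = 0;
}

static void rx_copy_maybe_resize(struct rx_copy_state *rx, unsigned int new_size)
{
    struct copy_op *ops;

    if (new_size == rx->size || rx->num != 0)
        return;                     /* only safe while the batch is empty */

    ops = realloc(rx->ops, new_size * sizeof(*ops));
    if (!ops)
        return;                     /* keep the old batch on allocation failure */

    rx->ops = ops;
    rx->size = new_size;
}

int main(void)
{
    struct rx_copy_state rx = { malloc(64 * sizeof(struct copy_op)), 0, 64 };

    rx_copy_flush(&rx);             /* num is now zero ... */
    rx_copy_maybe_resize(&rx, 128); /* ... so growing or shrinking is safe */

    free(rx.ops);
    return 0;
}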

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-upstream-unstable test] 116133: tolerable FAIL - PUSHED

2017-11-13 Thread osstest service owner
flight 116133 qemu-upstream-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/116133/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt  6 xen-install  fail in 116105 pass in 116133
 test-armhf-armhf-xl-rtds  6 xen-install  fail in 116118 pass in 116133
 test-amd64-i386-xl-xsm 20 guest-start/debian.repeat fail in 116118 pass in 
116133
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail in 116118 pass 
in 116133
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat  fail pass in 116105
 test-amd64-amd64-xl-qcow220 guest-start.2  fail pass in 116118
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat  fail pass in 116118
 test-armhf-armhf-xl-xsm   6 xen-installfail pass in 116118

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stopfail REGR. vs. 114457
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop   fail REGR. vs. 114457

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail in 116118 never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail in 116118 never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 114457
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 114457
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 114457
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuub79708a8ed1b3d18bee67baeaf33b3fa529493e2
baseline version:
 qemuu5cd7ce5dde3f228b3b669ed9ca432f588947bd40

Last test of basis   114457  2017-10-13 09:14:56 Z   31 days
Testing same since   115703  2017-11-09 16:18:53 Z4 days7 attempts


People who touched revisions under test:
  Anthony PERARD 
  Michael Tokarev 
  Roger Pau Monne 
  Roger Pau Monné 
  Stefano Stabellini 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64