[Xen-devel] [linux-4.1 test] 95848: tolerable FAIL - PUSHED

2016-06-17 Thread osstest service owner
flight 95848 linux-4.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95848/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl   3 host-install(3)  broken in 95666 pass in 95848
 test-amd64-i386-xl3 host-install(3)  broken in 95666 pass in 95848
 test-amd64-i386-freebsd10-i386 3 host-install(3) broken in 95666 pass in 95848
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken in 
95666 pass in 95848
 test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken in 95666 pass in 
95848
 test-amd64-i386-xl-raw3 host-install(3)  broken in 95666 pass in 95848
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken in 
95666 pass in 95848
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 3 host-install(3) broken 
in 95666 pass in 95848
 test-amd64-i386-freebsd10-amd64 10 guest-start fail in 95666 pass in 95848
 test-armhf-armhf-xl-arndale 15 guest-start/debian.repeat fail in 95666 pass in 
95848
 test-armhf-armhf-xl-xsm  11 guest-startfail in 95666 pass in 95848
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail in 95666 pass in 95848
 test-armhf-armhf-libvirt-xsm 4 host-ping-check-native fail in 95666 pass in 
95848
 test-armhf-armhf-xl-rtds  6 xen-boot   fail in 95666 pass in 95848
 test-armhf-armhf-xl  15 guest-start/debian.repeat   fail pass in 95666

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-credit2  16 guest-start.2fail blocked in 94729
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail in 95666 like 
94729
 build-amd64-rumpuserxen   6 xen-buildfail   like 94729
 build-i386-rumpuserxen6 xen-buildfail   like 94729
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeatfail  like 94729
 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeatfail like 94729
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 94729
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 94729
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 94729
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 94729
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail   like 94729
 test-armhf-armhf-xl-vhd   9 debian-di-installfail   like 94729

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 

Re: [Xen-devel] Issues with PCI-Passtrough (VT-d) in HVM with Xen 4.6

2016-06-17 Thread Andrey Grodzovsky
On Wed, Jun 15, 2016 at 9:14 AM, Jan Beulich  wrote:

> >>> On 15.06.16 at 12:45,  wrote:
> > In reply to -
> > http://lists.xen.org/archives/html/xen-devel/2016-06/msg00622.html
> >
> > HI, I am working with Jurgen on the issue, as per Jan's request I tried
> to
> > write explicitly only latency timer to be written -
> > bool force_write = false;
> > if ((dev_data->permissive || xen_pcibk_permissive) &&
> >   offset == PCI_CACHE_LINE_SIZE && size == 4)
> >  force_write = true;
> > ...
> >
> > if ((force_write || !handled) && !err) {...}
> >
> > But then it exposed another issue, the command register field seems not
> to
> > be restored also
> > because I think the bits which are to be restored are not
> > in PCI_COMMAND_GUEST mask.
>
> Please be more precise: Which bits in particular are not getting
> set back to the needed values (I would guess the memory
> and/or I/O decode ones, which we specifically must not allow
> the guest to control, but I'd like to be certain)?
>

Indeed, 0x103 which would be  MEMORY,IO and SERR.

>
> If my guess is correct, then I think rather than adding some
> hackish workaround to pciback you'd better see whether there's
> a way to cause a pci_disable_device() through your driver before
> the reset, and a pci_enable_device() after. Or actually, I think
> you can do this via plain config space writes from your driver: Try
> writing the Command Register with the two decode bits clear
> right before restoring the intended value (see the logic close to
> the top of command_write()).
>
> Jan
>
> So i tried the first option, as you suggested by writing
to /sys/bus/pci/devices/\:00\:00.0/enable
'0' == disable before reset and '1' == enable after from the test app, but
no writes get through to pic-back on DOM0 , only the reads.
But this basically still tries to write the MEMORY and IO bits to command
register down the call stack in pci_enable_resources

which would again be blocked in xen-pciback xen_pcibk_config_write or am I
missing something ?
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 02/19] xen: sched: make the 'tickled' perf counter clearer

2016-06-17 Thread Meng Xu
On Fri, Jun 17, 2016 at 7:11 PM, Dario Faggioli
 wrote:
>
> In fact, what we have right now, i.e., tickle_idlers_none
> and tickle_idlers_some, is not good enough for describing
> what really happens in the various tickling functions of
> the various scheduler.
>
> Switch to a more descriptive set of counters, such as:
>  - tickled_no_cpu: for when we don't tickle anyone
>  - tickled_idle_cpu: for when we tickle one or more
>  idler
>  - tickled_busy_cpu: for when we tickle one or more
>  non-idler
>
> While there, fix style of an "out:" label in sched_rt.c.
>
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Meng Xu 
> Cc: Anshul Makkar 
> Cc: David Vrabel 
> ---
>  xen/common/sched_credit.c|   10 +++---
>  xen/common/sched_credit2.c   |   12 +---
>  xen/common/sched_rt.c|8 +---
>  xen/include/xen/perfc_defn.h |5 +++--
>  4 files changed, 20 insertions(+), 15 deletions(-)


In terms of sched_rt.c and perfc_defn.h,

Reviewed-by: Meng Xu 

Thanks,

Meng


Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [libvirt test] 95846: tolerable FAIL - PUSHED

2016-06-17 Thread osstest service owner
flight 95846 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95846/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  e33cd67a9b51e8fcb4e084f40f500057b30b2786
baseline version:
 libvirt  2c51fa6ec4d34ccdd63f0935fb7530b681b126ac

Last test of basis95814  2016-06-16 04:24:25 Z1 days
Testing same since95846  2016-06-17 04:21:18 Z0 days1 attempts


People who touched revisions under test:
  Chen Hanxiao 
  Cole Robinson 
  Jim Fehlig 
  John Ferlan 
  Jovanka Gulicoska 
  Ján Tomko 
  Laine Stump 
  Martin Kletzander 
  sannyshao 
  yuelongguang 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=e33cd67a9b51e8fcb4e084f40f500057b30b2786
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo 

[Xen-devel] [linux-3.18 test] 95844: tolerable FAIL - PUSHED

2016-06-17 Thread osstest service owner
flight 95844 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95844/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail in 95521 pass in 95844
 test-amd64-i386-freebsd10-amd64 10 guest-start fail in 95521 pass in 95844
 test-armhf-armhf-xl-arndale   6 xen-boot   fail in 95809 pass in 95844
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail pass in 95521
 test-armhf-armhf-xl-rtds  6 xen-bootfail pass in 95809

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen6 xen-buildfail   like 94728
 build-amd64-rumpuserxen   6 xen-buildfail   like 94728
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 94728
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 94728
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 94728

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail in 95521 never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail in 95521 never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 linuxb5076139991c6b12c62346d9880eec1d4227d99f
baseline version:
 linux3b6aa07b936b09d38c1bfcee1e06845b968df475

Last test of basis94728  2016-05-23 21:45:21 Z   25 days
Testing same since95406  2016-06-08 00:46:40 Z9 days8 attempts


People who touched revisions under test:
  "Kirill A. Shutemov" 
  Adrian Hunter 
  Alan Stern 
  Andreas Noever 
  Andreas Werner 
  Andrew Jeffery 
  Andrew Morton 
  Arnd Bergmann 
  Artem Bityutskiy 
  Axel Lin 
  Bjorn Helgaas 
  Brian Norris 
  Catalin Marinas 
  Catalin Vasile 
  

Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 01:21 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Jun 17, 2016 at 01:18:47PM -0400, Daniel De Graaf wrote:

On 06/17/2016 01:14 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Jun 17, 2016 at 01:04:01PM -0400, Daniel De Graaf wrote:

On 06/17/2016 11:50 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:

Since FLASK is the only implementation of XSM hooks in Xen, using an
iterated initcall dispatch for setup is overly complex.  Change this to


As such, should the Kconfig file enable the FLASK by default?
Or make the XSM entry have the configuration for FLASK?

Or perhaps make the FLASK be the visible one and select XSM?


XSM has previously been the configuration option to enable.  If XSM is
enabled (by choice), FLASK will then be enabled by default.


Ah, OK. I need to check, but could you disable FLASK and still have XSM enabled?


Yes, but it won't do anything except slow down the hypervisor a bit.
It's the same behavior as enabling flask and passing "flask=disabled" on
the hypervisor command line.


Right. Would it make sense to squash XSM and FLASK together in the Kconfig?


When making the Kconifg changes, I originally tried making a choice option for
the XSM provider (like for scheduler), with FLASK and dummy being the two
choices.  I discarded that because it didn't add anything useful over the
binary option.

However, thinking about it, if it doesn't make sense to disable FLASK then
hiding the option would simplify the configuration. It removes the grouping of
flask-only options, but since there's only one that isn't all that important.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 18/19] xen: credit2: implement SMT support independent runq arrangement

2016-06-17 Thread Dario Faggioli
In fact, right now, we recommend keepeing runqueues
arranged per-core, so that it is the inter-runqueue load
balancing code that automatically spreads the work in an
SMT friendly way. This means that any other runq
arrangement one may want to use falls short of SMT
scheduling optimizations.

This commit implements SMT awareness --similar to the
one we have in Credit1-- for any possible runq
arrangement. This turned out to be pretty easy to do,
as the logic can live entirely in runq_tickle()
(although, in order to avoid for_each_cpu loops in
that function, we use a new cpumask which indeed needs
to be updated in other places).

In addition to disentangling SMT awareness from load
balancing, this also allows us to support the
sched_smt_power_savings parametar in Credit2 as well.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |  141 +++-
 1 file changed, 126 insertions(+), 15 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 93943fa..a8b3a85 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -351,7 +351,8 @@ struct csched2_runqueue_data {
 unsigned int max_weight;
 
 cpumask_t idle,/* Currently idle */
-tickled;   /* Another cpu in the queue is already targeted for 
this one */
+smt_idle,  /* Fully idle cores (as in all the siblings are 
idle) */
+tickled;   /* Have been asked to go through schedule */
 int load;  /* Instantaneous load: Length of queue  + num 
non-idle threads */
 s_time_t load_last_update;  /* Last time average was updated */
 s_time_t avgload;   /* Decaying queue load */
@@ -412,6 +413,73 @@ struct csched2_dom {
 };
 
 /*
+ * Hyperthreading (SMT) support.
+ *
+ * We use a special per-runq mask (smt_idle) and update it according to the
+ * following logic:
+ *  - when _all_ the SMT sibling in a core are idle, all their corresponding
+ *bits are set in the smt_idle mask;
+ *  - when even _just_one_ of the SMT siblings in a core is not idle, all the
+ *bits correspondings to it and to all its siblings are clear in the
+ *smt_idle mask.
+ *
+ * Once we have such a mask, it is easy to implement a policy that, either:
+ *  - uses fully idle cores first: it is enough to try to schedule the vcpus
+ *on pcpus from smt_idle mask first. This is what happens if
+ *sched_smt_power_savings was not set at boot (default), and it maximizes
+ *true parallelism, and hence performance;
+ *  - uses already busy cores first: it is enough to try to schedule the vcpus
+ *on pcpus that are idle, but are not in smt_idle. This is what happens if
+ *sched_smt_power_savings is set at boot, and it allows as more cores as
+ *possible to stay in low power states, minimizing power consumption.
+ *
+ * This logic is entirely implemented in runq_tickle(), and that is enough.
+ * In fact, in this scheduler, placement of a vcpu on one of the pcpus of a
+ * runq, _always_ happens by means of tickling:
+ *  - when a vcpu wakes up, it calls csched2_vcpu_wake(), which calls
+ *runq_tickle();
+ *  - when a migration is initiated in schedule.c, we call csched2_cpu_pick(),
+ *csched2_vcpu_migrate() (which calls migrate()) and csched2_vcpu_wake().
+ *csched2_cpu_pick() looks for the least loaded runq and return just any
+ *of its processors. Then, csched2_vcpu_migrate() just moves the vcpu to
+ *the chosen runq, and it is again runq_tickle(), called by
+ *csched2_vcpu_wake() that actually decides what pcpu to use within the
+ *chosen runq;
+ *  - when a migration is initiated in sched_credit2.c, by calling  migrate()
+ *directly, that again temporarily use a random pcpu from the new runq,
+ *and then calls runq_tickle(), by itself.
+ */
+
+/*
+ * If all the siblings of cpu (including cpu itself) are in idlers,
+ * set all their bits in mask.
+ *
+ * In order to properly take into account tickling, idlers needs to be
+ * set qeual to something like:
+ *
+ *  rqd->idle & (~rqd->tickled)
+ *
+ * This is because cpus that have been tickled will very likely pick up some
+ * work as soon as the manage to schedule, and hence we should really consider
+ * them as busy.
+ */
+static inline
+void smt_idle_mask_set(unsigned int cpu, cpumask_t *idlers, cpumask_t *mask)
+{
+if ( cpumask_subset( per_cpu(cpu_sibling_mask, cpu), idlers) )
+cpumask_or(mask, mask, per_cpu(cpu_sibling_mask, cpu));
+}
+
+/*
+ * Clear the bits of all the siblings of cpu from mask.
+ */
+static inline
+void smt_idle_mask_clear(unsigned int cpu, cpumask_t *mask)
+{
+cpumask_andnot(mask, mask, per_cpu(cpu_sibling_mask, cpu));
+}
+
+/*
  * When a hard affinity change occurs, we may not be able to check some
  * 

[Xen-devel] [PATCH 19/19] xen: credit2: use cpumask_first instead of cpumask_any when choosing cpu

2016-06-17 Thread Dario Faggioli
because it is cheaper, and there is no much point in
randomizing which cpu gets selected anyway, as such
choice will be overridden shortly after, in runq_tickle().

If we really feel the need (e.g., we prove it worth with
benchmarking), we can record the last cpu which was used
by csched2_cpu_pick() and migrate() in a per-runq variable,
and then use cpumask_cycle()... but this really does not
look necessary.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a8b3a85..afd432e 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1545,7 +1545,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 {
 cpumask_and(cpumask_scratch, vc->cpu_hard_affinity,
 >migrate_rqd->active);
-new_cpu = cpumask_any(cpumask_scratch);
+new_cpu = cpumask_first(cpumask_scratch);
 if ( new_cpu < nr_cpu_ids )
 goto out_up;
 }
@@ -1604,7 +1604,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 
 cpumask_and(cpumask_scratch, vc->cpu_hard_affinity,
 >rqd[min_rqi].active);
-new_cpu = cpumask_any(cpumask_scratch);
+new_cpu = cpumask_first(cpumask_scratch);
 BUG_ON(new_cpu >= nr_cpu_ids);
 
  out_up:
@@ -1718,7 +1718,7 @@ static void migrate(const struct scheduler *ops,
 
 cpumask_and(cpumask_scratch, svc->vcpu->cpu_hard_affinity,
 >active);
-svc->vcpu->processor = cpumask_any(cpumask_scratch);
+svc->vcpu->processor = cpumask_first(cpumask_scratch);
 ASSERT(svc->vcpu->processor < nr_cpu_ids);
 
 __runq_assign(svc, trqd);


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 16/19] tools: tracing: deal with new Credit2 events

2016-06-17 Thread Dario Faggioli
more specifically, with: TICKLE_NEW, RUNQ_MAX_WEIGHT,
MIGRATE, LOAD_CHECK, LOAD_BALANCE and PICKED_CPU, and
in both both xenalyze and formats (for xentrace_format).

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anshul Makkar 
---
 tools/xentrace/formats|6 +++
 tools/xentrace/xenalyze.c |   78 -
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index 2e58d03..caafb5f 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -55,6 +55,12 @@
 0x0002220a  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_assign[ 
dom:vcpu = 0x%(1)08x, rq_id = %(2)d ]
 0x0002220b  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_vcpu_load [ 
dom:vcpu = 0x%(3)08x, vcpuload = 0x%(2)08x%(1)08x, wshift = %(4)d ]
 0x0002220c  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_runq_load [ 
rq_load[16]:rq_id[8]:wshift[8] = 0x%(5)08x, rq_avgload = 0x%(2)08x%(1)08x, 
b_avgload = 0x%(4)08x%(3)08x ]
+0x0002220d  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:tickle_new [ 
dom:vcpu = 0x%(1)08x, processor = %(2)d credit = %(3)d ]
+0x0002220e  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_max_weight [ 
rq_id[16]:max_weight[16] = 0x%(1)08x ]
+0x0002220f  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:migrrate   [ 
dom:vcpu = 0x%(1)08x, rq_id[16]:trq_id[16] = 0x%(2)08x ]
+0x00022210  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:load_check [ 
lrq_id[16]:orq_id[16] = 0x%(1)08x, delta = %(2)d ]
+0x00022211  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:load_balance   [ 
l_bavgload = 0x%(2)08x%(1)08x, o_bavgload = 0x%(4)08x%(3)08x, 
lrq_id[16]:orq_id[16] = 0x%(5)08x ]
+0x00022212  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:pick_cpu   [ 
b_avgload = 0x%(2)08x%(1)08x, dom:vcpu = 0x%(3)08x, rq_id[16]:new_cpu[16] = 
%(4)d ]
 
 0x00022801  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:tickle[ cpu = 
%(1)d ]
 0x00022802  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:runq_pick [ dom:vcpu 
= 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index f2f97bd..d223de6 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7725,7 +7725,6 @@ void sched_process(struct pcpu_info *p)
 /* CREDIT 2 (TRC_CSCHED2_xxx) */
 case TRC_SCHED_CLASS_EVT(CSCHED2, 1): /* TICK  */
 case TRC_SCHED_CLASS_EVT(CSCHED2, 4): /* CREDIT_ADD*/
-case TRC_SCHED_CLASS_EVT(CSCHED2, 9): /* UPDATE_LOAD   */
 break;
 case TRC_SCHED_CLASS_EVT(CSCHED2, 2): /* RUNQ_POS  */
 if(opt.dump_all) {
@@ -7788,11 +7787,15 @@ void sched_process(struct pcpu_info *p)
 if(opt.dump_all)
 printf(" %s csched2:sched_tasklet\n", ri->dump_header);
 break;
+case TRC_SCHED_CLASS_EVT(CSCHED2, 9):  /* UPDATE_LOAD  */
+if(opt.dump_all)
+printf(" %s csched2:update_load\n", ri->dump_header);
+break;
 case TRC_SCHED_CLASS_EVT(CSCHED2, 10): /* RUNQ_ASSIGN  */
 if(opt.dump_all) {
 struct {
 unsigned int vcpuid:16, domid:16;
-unsigned int rqi;
+unsigned int rqi:16;
 } *r = (typeof(r))ri->d;
 
 printf(" %s csched2:runq_assign d%uv%u on rq# %u\n",
@@ -7834,6 +7837,77 @@ void sched_process(struct pcpu_info *p)
avgload, r->rq_avgload, b_avgload, r->b_avgload);
 }
 break;
+case TRC_SCHED_CLASS_EVT(CSCHED2, 13): /* TICKLE_NEW   */
+if (opt.dump_all) {
+struct {
+unsigned vcpuid:16, domid:16;
+unsigned processor, credit;
+} *r = (typeof(r))ri->d;
+
+printf(" %s csched2:runq_tickle_new d%uv%u, "
+   "processor = %u, credit = %u\n",
+   ri->dump_header, r->domid, r->vcpuid,
+   r->processor, r->credit);
+}
+break;
+case TRC_SCHED_CLASS_EVT(CSCHED2, 14): /* RUNQ_MAX_WEIGHT  */
+if (opt.dump_all) {
+struct {
+unsigned rqi:16, max_weight:16;
+} *r = (typeof(r))ri->d;
+
+printf(" %s csched2:update_max_weight rq# %u, max_weight = 
%u\n",
+   ri->dump_header, r->rqi, r->max_weight);
+}
+break;
+case TRC_SCHED_CLASS_EVT(CSCHED2, 15): /* MIGRATE  */
+if (opt.dump_all) {
+struct {
+unsigned vcpuid:16, domid:16;
+unsigned rqi:16, trqi:16;
+} *r = 

[Xen-devel] [PATCH 15/19] xen: credit2: only marshall trace point arguments if tracing enabled

2016-06-17 Thread Dario Faggioli
Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |  114 +++-
 1 file changed, 59 insertions(+), 55 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index e9f3f13..3fdc91c 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -636,6 +636,7 @@ __update_runq_load(const struct scheduler *ops,
 
 ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
 
+if ( unlikely(tb_init_done) )
 {
 struct {
 uint64_t rq_avgload, b_avgload;
@@ -646,9 +647,9 @@ __update_runq_load(const struct scheduler *ops,
 d.rq_avgload = rqd->avgload;
 d.b_avgload = rqd->b_avgload;
 d.shift = P;
-trace_var(TRC_CSCHED2_UPDATE_RUNQ_LOAD, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_UPDATE_RUNQ_LOAD, 1,
+sizeof(d),
+(unsigned char *));
 }
 }
 
@@ -691,6 +692,7 @@ __update_svc_load(const struct scheduler *ops,
 }
 svc->load_last_update = now;
 
+if ( unlikely(tb_init_done) )
 {
 struct {
 uint64_t v_avgload;
@@ -701,9 +703,9 @@ __update_svc_load(const struct scheduler *ops,
 d.vcpu = svc->vcpu->vcpu_id;
 d.v_avgload = svc->avgload;
 d.shift = P;
-trace_var(TRC_CSCHED2_UPDATE_VCPU_LOAD, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_UPDATE_VCPU_LOAD, 1,
+sizeof(d),
+(unsigned char *));
 }
 }
 
@@ -759,6 +761,7 @@ runq_insert(const struct scheduler *ops, struct 
csched2_vcpu *svc)
 
 pos = __runq_insert(runq, svc);
 
+if ( unlikely(tb_init_done) )
 {
 struct {
 unsigned vcpu:16, dom:16;
@@ -767,9 +770,9 @@ runq_insert(const struct scheduler *ops, struct 
csched2_vcpu *svc)
 d.dom = svc->vcpu->domain->domain_id;
 d.vcpu = svc->vcpu->vcpu_id;
 d.pos = pos;
-trace_var(TRC_CSCHED2_RUNQ_POS, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_RUNQ_POS, 1,
+sizeof(d),
+(unsigned char *));
 }
 
 return;
@@ -812,7 +815,7 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 
 ASSERT(new->rqd == rqd);
 
-/* TRACE */
+if ( unlikely(tb_init_done) )
 {
 struct {
 unsigned vcpu:16, dom:16;
@@ -822,9 +825,9 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 d.vcpu = new->vcpu->vcpu_id;
 d.processor = new->vcpu->processor;
 d.credit = new->credit;
-trace_var(TRC_CSCHED2_TICKLE_NEW, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_TICKLE_NEW, 1,
+sizeof(d),
+(unsigned char *));
 }
 
 /*
@@ -882,7 +885,8 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 lowest = cur->credit;
 }
 
-/* TRACE */ {
+if ( unlikely(tb_init_done) )
+{
 struct {
 unsigned vcpu:16, dom:16;
 unsigned credit;
@@ -890,9 +894,9 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 d.dom = cur->vcpu->domain->domain_id;
 d.vcpu = cur->vcpu->vcpu_id;
 d.credit = cur->credit;
-trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
+sizeof(d),
+(unsigned char *));
 }
 }
 
@@ -910,14 +914,15 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
  tickle:
 BUG_ON(ipid == -1);
 
-/* TRACE */ {
+if ( unlikely(tb_init_done) )
+{
 struct {
 unsigned cpu:16, pad:16;
 } d;
 d.cpu = ipid; d.pad = 0;
-trace_var(TRC_CSCHED2_TICKLE, 1,
-  sizeof(d),
-  (unsigned char *));
+__trace_var(TRC_CSCHED2_TICKLE, 1,
+sizeof(d),
+(unsigned char *));
 }
 __cpumask_set_cpu(ipid, >tickled);
 cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
@@ -979,7 +984,8 @@ static void reset_credit(const struct scheduler *ops, int 
cpu, s_time_t now,
 
 svc->start_time = now;
 
-/* TRACE */ {
+if ( unlikely(tb_init_done) )
+{
 struct {
 unsigned vcpu:16, dom:16;
 unsigned credit_start, credit_end;
@@ -990,9 

[Xen-devel] [PATCH 17/19] xen: credit2: the private scheduler lock can be an rwlock.

2016-06-17 Thread Dario Faggioli
In fact, the data it protects only change either at init-time,
during cpupools manipulation, or when changing domains' weights.
In all other cases (namely, load balancing, reading weights
and status dumping), information is only read.

Therefore, let the lock be an read/write one. This means there
is no full serialization point for the whole scheduler and
for all the pCPUs of the host any longer.

This is particularly good for scalability (especially when doing
load balancing).

Also, update the high level description of the locking discipline,
and take the chance for rewording it a little bit (as well as
for adding a couple of locking related ASSERT()-s).

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |  133 ++--
 1 file changed, 80 insertions(+), 53 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 3fdc91c..93943fa 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -85,17 +85,37 @@
 
 /*
  * Locking:
- * - Schedule-lock is per-runqueue
- *  + Protects runqueue data, runqueue insertion, 
- *  + Also protects updates to private sched vcpu structure
- *  + Must be grabbed using vcpu_schedule_lock_irq() to make sure 
vcpu->processr
- *doesn't change under our feet.
- * - Private data lock
- *  + Protects access to global domain list
- *  + All other private data is written at init and only read afterwards.
+ *
+ * - runqueue lock
+ *  + it is per-runqueue, so:
+ *   * cpus in a runqueue take the runqueue lock, when using
+ * pcpu_schedule_lock() / vcpu_schedule_lock() (and friends),
+ *   * a cpu may (try to) take a "remote" runqueue lock, e.g., for
+ * load balancing;
+ *  + serializes runqueue operations (removing and inserting vcpus);
+ *  + protects runqueue-wide data in csched2_runqueue_data;
+ *  + protects vcpu parameters in csched2_vcpu for the vcpu in the
+ *runqueue.
+ *
+ * - Private scheduler lock
+ *  + protects scheduler-wide data in csched2_private, such as:
+ *   * the list of domains active in this scheduler,
+ *   * what cpus and what runqueues are active and in what
+ * runqueue each cpu is;
+ *  + serializes the operation of changing the weights of domains;
+ *
+ * - Type:
+ *  + runqueue locks are 'regular' spinlocks;
+ *  + the private scheduler lock can be an rwlock. In fact, data
+ *it protects is modified only during initialization, cpupool
+ *manipulation and when changing weights, and read in all
+ *other cases (e.g., during load balancing).
+ *
  * Ordering:
- * - We grab private->schedule when updating domain weight; so we
- *  must never grab private if a schedule lock is held.
+ *  + tylock must be used when wanting to take a runqueue lock,
+ *if we already hold another one;
+ *  + if taking both a runqueue lock and the private scheduler
+ *lock is, the latter must always be taken for first.
  */
 
 /*
@@ -342,7 +362,7 @@ struct csched2_runqueue_data {
  * System-wide private data
  */
 struct csched2_private {
-spinlock_t lock;
+rwlock_t lock;
 cpumask_t initialized; /* CPU is initialized for this pool */
 
 struct list_head sdom; /* Used mostly for dump keyhandler. */
@@ -1300,13 +1320,14 @@ static void
 csched2_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
 {
 struct csched2_vcpu * const svc = CSCHED2_VCPU(vc);
+unsigned int cpu = vc->processor;
 s_time_t now;
 
-/* Schedule lock should be held at this point. */
+ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
 
 ASSERT(!is_idle_vcpu(vc));
 
-if ( unlikely(curr_on_cpu(vc->processor) == vc) )
+if ( unlikely(curr_on_cpu(cpu) == vc) )
 {
 SCHED_STAT_CRANK(vcpu_wake_running);
 goto out;
@@ -1397,7 +1418,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 ASSERT(!cpumask_empty(>active_queues));
 
 /* Locking:
- * - vc->processor is already locked
+ * - Runqueue lock of vc->processor is already locked
  * - Need to grab prv lock to make sure active runqueues don't
  *   change
  * - Need to grab locks for other runqueues while checking
@@ -1410,8 +1431,9 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
  * just grab the prv lock.  Instead, we'll have to trylock, and
  * do something else reasonable if we fail.
  */
+ASSERT(spin_is_locked(per_cpu(schedule_data, 
vc->processor).schedule_lock));
 
-if ( !spin_trylock(>lock) )
+if ( !read_trylock(>lock) )
 {
 /* We may be here because someon requested us to migrate */
 __clear_bit(__CSFLAG_runq_migrate_request, >flags);
@@ -1493,7 +1515,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 }
 
 out_up:
-spin_unlock(>lock);
+

[Xen-devel] [PATCH 10/19] xen: credit2: rework load tracking logic

2016-06-17 Thread Dario Faggioli
The existing load tracking code was hard to understad and
maintain, and not entirely consistent. This is due to a
number of reasons:
 - code and comments were not in perfect sync, making it
   difficult to figure out what the intent of a particular
   choice was (e.g., the choice of 18 for load_window_shift);
 - the math, although effective, was not entirely consistent.
   In fact, we were doing (if W is the lenght of the window):

avgload = (delta*load*W + (W - delta)*avgload)/W
avgload = avgload + delta*load - delta*avgload/W

   which does not match any known variant of 'smoothing
   moving average'. In fact, it should have been:

avgload = avgload + delta*load/W - delta*avgload/W

   (for details on why, see the doc comments inside this
   patch.). Furthermore, with

avgload ~= avgload + W*load - avgload
avgload ~= W*load

The reason why the formula above sort of worked was because
the number of bits used for the fractional parts of the
values used in fixed point math and the number of bits used
for the lenght of the window were the same (load_window_shift
was being used for both).

This may look handy, but it introduced a (not especially well
documented) dependency between the lenght of the window and
the precision of the calculations, which really should be
two independent things. Especially if treating them as such
(like it is done in this patch) does not lead to more
complex maths (same number of multiplications and shifts, and
there is still room for some optimization).

Therefore, in this patch, we:
 - split length of the window and precision (and, since there
   is already a command line parameter for length of window,
   introduce one for precision too),
 - align the math with one proper incarnation of exponential
   smoothing (at no added cost),
 - add comments, about the details of the algorithm and the
   math used.

While there fix a couple of style issues as well (pointless
initialization, long lines, comments).

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 docs/misc/xen-command-line.markdown |   30 +++
 xen/common/sched_credit2.c  |  328 ++-
 2 files changed, 310 insertions(+), 48 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index fed732c..29a554b 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -477,9 +477,39 @@ the address range the area should fall into.
 ### credit2\_balance\_under
 > `= `
 
+### credit2\_load\_precision\_shift
+> `= `
+
+> Default: `18`
+
+Specify the number of bits to use for the fractional part of the
+values involved in Credit2 load tracking and load balancing math.
+
 ### credit2\_load\_window\_shift
 > `= `
 
+> Default: `30`
+
+Specify the number of bits to use for represent the length of the
+window (in nanoseconds) we use for load tracking inside Credit2.
+This means that, with the default value (30), we use
+2^30 nsec ~= 1 sec long window.
+
+Load tracking is done by means of a variation of exponentially
+weighted moving average (EWMA). The window length defined here
+is what tells for how long we give value to previous history
+of the load itself. In fact, after a full window has passed,
+what happens is that we discard all previous history entirely.
+
+A short window will make the load balancer quick at reacting
+to load changes, but also short-sighted about previous history
+(and hence, e.g., long term load trends). A long window will
+make the load balancer thoughtful of previous history (and
+hence capable of capturing, e.g., long term load trends), but
+also slow in responding to load changes.
+
+The default value of `1 sec` is rather long.
+
 ### credit2\_runqueue
 > `= core | socket | node | all`
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index c534f9c..230a512 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -173,16 +173,88 @@ integer_param("sched_credit2_migrate_resist", 
opt_migrate_resist);
 #define RQD(_ops, _cpu) (_PRIV(_ops)->rqd[c2r(_ops, _cpu)])
 
 /*
- * Shifts for load average.
- * - granularity: Reduce granularity of time by a factor of 1000, so we can 
use 32-bit maths
- * - window shift: Given granularity shift, make the window about 1 second
- * - scale shift: Shift up load by this amount rather than using fractions; 
128 corresponds 
- *   to a load of 1.
+ * Load tracking and load balancing
+ *
+ * Load history of runqueues and vcpus is accounted for by using an
+ * exponential weighted moving average algorithm. However, instead of using
+ * fractions,we shift everything to left by the number of bits we want to
+ * use for representing the fractional part (Q-format).
+ *
+ * We may also want to reduce the precision of time accounting, to
+ * accommodate 'longer  

[Xen-devel] [PATCH 14/19] xen: credit2: add yet some more tracing

2016-06-17 Thread Dario Faggioli
(and fix the style of two labels as well.)

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   58 +---
 1 file changed, 54 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ba3a78a..e9f3f13 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -46,6 +46,9 @@
 #define TRC_CSCHED2_TICKLE_NEW   TRC_SCHED_CLASS_EVT(CSCHED2, 13)
 #define TRC_CSCHED2_RUNQ_MAX_WEIGHT  TRC_SCHED_CLASS_EVT(CSCHED2, 14)
 #define TRC_CSCHED2_MIGRATE  TRC_SCHED_CLASS_EVT(CSCHED2, 15)
+#define TRC_CSCHED2_LOAD_CHECK   TRC_SCHED_CLASS_EVT(CSCHED2, 16)
+#define TRC_CSCHED2_LOAD_BALANCE TRC_SCHED_CLASS_EVT(CSCHED2, 17)
+#define TRC_CSCHED2_PICKED_CPU   TRC_SCHED_CLASS_EVT(CSCHED2, 19)
 
 /*
  * WARNING: This is still in an experimental phase.  Status and work can be 
found at the
@@ -709,6 +712,8 @@ update_load(const struct scheduler *ops,
 struct csched2_runqueue_data *rqd,
 struct csched2_vcpu *svc, int change, s_time_t now)
 {
+trace_var(TRC_CSCHED2_UPDATE_LOAD, 1, 0,  NULL);
+
 __update_runq_load(ops, rqd, change, now);
 if ( svc )
 __update_svc_load(ops, svc, change, now);
@@ -1484,6 +1489,23 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
vcpu *vc)
 out_up:
 spin_unlock(>lock);
 
+/* TRACE */
+{
+struct {
+uint64_t b_avgload;
+unsigned vcpu:16, dom:16;
+unsigned rq_id:16, new_cpu:16;
+   } d;
+d.b_avgload = prv->rqd[min_rqi].b_avgload;
+d.dom = vc->domain->domain_id;
+d.vcpu = vc->vcpu_id;
+d.rq_id = c2r(ops, new_cpu);
+d.new_cpu = new_cpu;
+trace_var(TRC_CSCHED2_PICKED_CPU, 1,
+  sizeof(d),
+  (unsigned char *));
+}
+
 return new_cpu;
 }
 
@@ -1611,7 +1633,7 @@ static void balance_load(const struct scheduler *ops, int 
cpu, s_time_t now)
 bool_t inner_load_updated = 0;
 
 balance_state_t st = { .best_push_svc = NULL, .best_pull_svc = NULL };
-
+
 /*
  * Basic algorithm: Push, pull, or swap.
  * - Find the runqueue with the furthest load distance
@@ -1677,6 +1699,20 @@ static void balance_load(const struct scheduler *ops, 
int cpu, s_time_t now)
 if ( i > cpus_max )
 cpus_max = i;
 
+/* TRACE */
+{
+struct {
+unsigned lrq_id:16, orq_id:16;
+unsigned load_delta;
+} d;
+d.lrq_id = st.lrqd->id;
+d.orq_id = st.orqd->id;
+d.load_delta = st.load_delta;
+trace_var(TRC_CSCHED2_LOAD_CHECK, 1,
+  sizeof(d),
+  (unsigned char *));
+}
+
 /*
  * If we're under 100% capacaty, only shift if load difference
  * is > 1.  otherwise, shift if under 12.5%
@@ -1705,6 +1741,21 @@ static void balance_load(const struct scheduler *ops, 
int cpu, s_time_t now)
 if ( unlikely(st.orqd->id < 0) )
 goto out_up;
 
+/* TRACE */
+{
+struct {
+uint64_t lb_avgload, ob_avgload;
+unsigned lrq_id:16, orq_id:16;
+} d;
+d.lrq_id = st.lrqd->id;
+d.lb_avgload = st.lrqd->b_avgload;
+d.orq_id = st.orqd->id;
+d.ob_avgload = st.orqd->b_avgload;
+trace_var(TRC_CSCHED2_LOAD_BALANCE, 1,
+  sizeof(d),
+  (unsigned char *));
+}
+
 now = NOW();
 
 /* Look for "swap" which gives the best load average
@@ -1756,10 +1807,9 @@ static void balance_load(const struct scheduler *ops, 
int cpu, s_time_t now)
 if ( st.best_pull_svc )
 migrate(ops, st.best_pull_svc, st.lrqd, now);
 
-out_up:
+ out_up:
 spin_unlock(>lock);
-
-out:
+ out:
 return;
 }
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 13/19] xen: credit2: make the code less experimental

2016-06-17 Thread Dario Faggioli
Mainly, almost all of the BUG_ON-s can be converted into
ASSERTS, and the debug printk either removed or turned
into tracing.

The 'TODO' list, in a comment at the beginning of the file,
was also stale, so remove items that were still there but
are actually done.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |  244 +++-
 1 file changed, 126 insertions(+), 118 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 2ca63ae..ba3a78a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -27,9 +27,6 @@
 #include 
 #include 
 
-#define d2printk(x...)
-//#define d2printk printk
-
 /*
  * Credit2 tracing events ("only" 512 available!). Check
  * include/public/trace.h for more details.
@@ -46,16 +43,16 @@
 #define TRC_CSCHED2_RUNQ_ASSIGN  TRC_SCHED_CLASS_EVT(CSCHED2, 10)
 #define TRC_CSCHED2_UPDATE_VCPU_LOAD TRC_SCHED_CLASS_EVT(CSCHED2, 11)
 #define TRC_CSCHED2_UPDATE_RUNQ_LOAD TRC_SCHED_CLASS_EVT(CSCHED2, 12)
+#define TRC_CSCHED2_TICKLE_NEW   TRC_SCHED_CLASS_EVT(CSCHED2, 13)
+#define TRC_CSCHED2_RUNQ_MAX_WEIGHT  TRC_SCHED_CLASS_EVT(CSCHED2, 14)
+#define TRC_CSCHED2_MIGRATE  TRC_SCHED_CLASS_EVT(CSCHED2, 15)
 
 /*
  * WARNING: This is still in an experimental phase.  Status and work can be 
found at the
  * credit2 wiki page:
  *  http://wiki.xen.org/wiki/Credit2_Scheduler_Development
+ *
  * TODO:
- * + Multiple sockets
- *  - Simple load balancer / runqueue assignment
- *  - Runqueue load measurement
- *  - Load-based load balancer
  * + Hyperthreading
  *  - Look for non-busy core if possible
  *  - "Discount" time run on a thread with busy siblings
@@ -608,8 +605,8 @@ __update_runq_load(const struct scheduler *ops,
 delta = now - rqd->load_last_update;
 if ( unlikely(delta < 0) )
 {
-d2printk("%s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
- __func__, now, rqd->load_last_update);
+printk("WARNING: %s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
+   __func__, now, rqd->load_last_update);
 delta = 0;
 }
 
@@ -680,8 +677,8 @@ __update_svc_load(const struct scheduler *ops,
 delta = now - svc->load_last_update;
 if ( unlikely(delta < 0) )
 {
-d2printk("%s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
- __func__, now, svc->load_last_update);
+printk("WARNING: %s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
+   __func__, now, svc->load_last_update);
 delta = 0;
 }
 
@@ -723,23 +720,18 @@ __runq_insert(struct list_head *runq, struct csched2_vcpu 
*svc)
 struct list_head *iter;
 int pos = 0;
 
-d2printk("rqi %pv\n", svc->vcpu);
-
-BUG_ON(>rqd->runq != runq);
-/* Idle vcpus not allowed on the runqueue anymore */
-BUG_ON(is_idle_vcpu(svc->vcpu));
-BUG_ON(svc->vcpu->is_running);
-BUG_ON(svc->flags & CSFLAG_scheduled);
+ASSERT(>rqd->runq == runq);
+ASSERT(!is_idle_vcpu(svc->vcpu));
+ASSERT(!svc->vcpu->is_running);
+ASSERT(!(svc->flags & CSFLAG_scheduled));
 
 list_for_each( iter, runq )
 {
 struct csched2_vcpu * iter_svc = __runq_elem(iter);
 
 if ( svc->credit > iter_svc->credit )
-{
-d2printk(" p%d %pv\n", pos, iter_svc->vcpu);
 break;
-}
+
 pos++;
 }
 
@@ -755,10 +747,10 @@ runq_insert(const struct scheduler *ops, struct 
csched2_vcpu *svc)
 struct list_head * runq = (ops, cpu)->runq;
 int pos = 0;
 
-ASSERT( spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock) );
+ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
 
-BUG_ON( __vcpu_on_runq(svc) );
-BUG_ON( c2r(ops, cpu) != c2r(ops, svc->vcpu->processor) );
+ASSERT(!__vcpu_on_runq(svc));
+ASSERT(c2r(ops, cpu) == c2r(ops, svc->vcpu->processor));
 
 pos = __runq_insert(runq, svc);
 
@@ -781,7 +773,7 @@ runq_insert(const struct scheduler *ops, struct 
csched2_vcpu *svc)
 static inline void
 __runq_remove(struct csched2_vcpu *svc)
 {
-BUG_ON( !__vcpu_on_runq(svc) );
+ASSERT(__vcpu_on_runq(svc));
 list_del_init(>runq_elem);
 }
 
@@ -806,16 +798,29 @@ void burn_credits(struct csched2_runqueue_data *rqd, 
struct csched2_vcpu *, s_ti
 static void
 runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t 
now)
 {
-int i, ipid=-1;
-s_time_t lowest=(1<<30);
+int i, ipid = -1;
+s_time_t lowest = (1<<30);
 unsigned int cpu = new->vcpu->processor;
 struct csched2_runqueue_data *rqd = RQD(ops, cpu);
 cpumask_t mask;
 struct csched2_vcpu * cur;
 
-d2printk("rqt %pv curr %pv\n", 

[Xen-devel] [PATCH 09/19] xen: credit2: avoid calling __update_svc_load() multiple times on the same vcpu

2016-06-17 Thread Dario Faggioli
by not resetting the variable that should guard against
that at the beginning of each step of the outer loop.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index af28e7b..c534f9c 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1378,6 +1378,7 @@ static void balance_load(const struct scheduler *ops, int 
cpu, s_time_t now)
 struct csched2_private *prv = CSCHED2_PRIV(ops);
 int i, max_delta_rqi = -1;
 struct list_head *push_iter, *pull_iter;
+bool_t inner_load_updated = 0;
 
 balance_state_t st = { .best_push_svc = NULL, .best_pull_svc = NULL };
 
@@ -1478,7 +1479,6 @@ static void balance_load(const struct scheduler *ops, int 
cpu, s_time_t now)
 /* Reuse load delta (as we're trying to minimize it) */
 list_for_each( push_iter, >svc )
 {
-int inner_load_updated = 0;
 struct csched2_vcpu * push_svc = list_entry(push_iter, struct 
csched2_vcpu, rqd_elem);
 
 __update_svc_load(ops, push_svc, 0, now);
@@ -1490,10 +1490,8 @@ static void balance_load(const struct scheduler *ops, 
int cpu, s_time_t now)
 {
 struct csched2_vcpu * pull_svc = list_entry(pull_iter, struct 
csched2_vcpu, rqd_elem);
 
-if ( ! inner_load_updated )
-{
+if ( !inner_load_updated )
 __update_svc_load(ops, pull_svc, 0, now);
-}
 
 if ( !vcpu_is_migrateable(pull_svc, st.lrqd) )
 continue;


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 12/19] xen: credit2: use non-atomic cpumask and bit operations

2016-06-17 Thread Dario Faggioli
as all the accesses to both the masks and the flags are
serialized by the runqueues locks already.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 230a512..2ca63ae 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -909,7 +909,7 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
   sizeof(d),
   (unsigned char *));
 }
-cpumask_set_cpu(ipid, >tickled);
+__cpumask_set_cpu(ipid, >tickled);
 cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
 }
 
@@ -1277,7 +1277,7 @@ csched2_vcpu_sleep(const struct scheduler *ops, struct 
vcpu *vc)
 __runq_remove(svc);
 }
 else if ( svc->flags & CSFLAG_delayed_runq_add )
-clear_bit(__CSFLAG_delayed_runq_add, >flags);
+__clear_bit(__CSFLAG_delayed_runq_add, >flags);
 }
 
 static void
@@ -1314,7 +1314,7 @@ csched2_vcpu_wake(const struct scheduler *ops, struct 
vcpu *vc)
  * after the context has been saved. */
 if ( unlikely(svc->flags & CSFLAG_scheduled) )
 {
-set_bit(__CSFLAG_delayed_runq_add, >flags);
+__set_bit(__CSFLAG_delayed_runq_add, >flags);
 goto out;
 }
 
@@ -1347,7 +1347,7 @@ csched2_context_saved(const struct scheduler *ops, struct 
vcpu *vc)
 BUG_ON( !is_idle_vcpu(vc) && svc->rqd != RQD(ops, vc->processor));
 
 /* This vcpu is now eligible to be put on the runqueue again */
-clear_bit(__CSFLAG_scheduled, >flags);
+__clear_bit(__CSFLAG_scheduled, >flags);
 
 /* If someone wants it on the runqueue, put it there. */
 /*
@@ -1357,7 +1357,7 @@ csched2_context_saved(const struct scheduler *ops, struct 
vcpu *vc)
  * it seems a bit pointless; especially as we have plenty of
  * bits free.
  */
-if ( test_and_clear_bit(__CSFLAG_delayed_runq_add, >flags)
+if ( __test_and_clear_bit(__CSFLAG_delayed_runq_add, >flags)
  && likely(vcpu_runnable(vc)) )
 {
 BUG_ON(__vcpu_on_runq(svc));
@@ -1399,10 +1399,10 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
vcpu *vc)
 
 if ( !spin_trylock(>lock) )
 {
-if ( test_and_clear_bit(__CSFLAG_runq_migrate_request, >flags) )
+if ( __test_and_clear_bit(__CSFLAG_runq_migrate_request, >flags) )
 {
 d2printk("%pv -\n", svc->vcpu);
-clear_bit(__CSFLAG_runq_migrate_request, >flags);
+__clear_bit(__CSFLAG_runq_migrate_request, >flags);
 }
 
 return get_fallback_cpu(svc);
@@ -1410,7 +1410,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 
 /* First check to see if we're here because someone else suggested a place
  * for us to move. */
-if ( test_and_clear_bit(__CSFLAG_runq_migrate_request, >flags) )
+if ( __test_and_clear_bit(__CSFLAG_runq_migrate_request, >flags) )
 {
 if ( unlikely(svc->migrate_rqd->id < 0) )
 {
@@ -1545,8 +1545,8 @@ static void migrate(const struct scheduler *ops,
 d2printk("%pv %d-%d a\n", svc->vcpu, svc->rqd->id, trqd->id);
 /* It's running; mark it to migrate. */
 svc->migrate_rqd = trqd;
-set_bit(_VPF_migrating, >vcpu->pause_flags);
-set_bit(__CSFLAG_runq_migrate_request, >flags);
+__set_bit(_VPF_migrating, >vcpu->pause_flags);
+__set_bit(__CSFLAG_runq_migrate_request, >flags);
 SCHED_STAT_CRANK(migrate_requested);
 }
 else
@@ -2079,7 +2079,7 @@ csched2_schedule(
 
 /* Clear "tickled" bit now that we've been scheduled */
 if ( cpumask_test_cpu(cpu, >tickled) )
-cpumask_clear_cpu(cpu, >tickled);
+__cpumask_clear_cpu(cpu, >tickled);
 
 /* Update credits */
 burn_credits(rqd, scurr, now);
@@ -2115,7 +2115,7 @@ csched2_schedule(
 if ( snext != scurr
  && !is_idle_vcpu(scurr->vcpu)
  && vcpu_runnable(current) )
-set_bit(__CSFLAG_delayed_runq_add, >flags);
+__set_bit(__CSFLAG_delayed_runq_add, >flags);
 
 ret.migrated = 0;
 
@@ -2134,7 +2134,7 @@ csched2_schedule(
cpu, snext->vcpu, snext->vcpu->processor, scurr->vcpu);
 BUG();
 }
-set_bit(__CSFLAG_scheduled, >flags);
+__set_bit(__CSFLAG_scheduled, >flags);
 }
 
 /* Check for the reset condition */
@@ -2146,7 +2146,7 @@ csched2_schedule(
 
 /* Clear the idle mask if necessary */
 if ( cpumask_test_cpu(cpu, >idle) )
-cpumask_clear_cpu(cpu, >idle);
+__cpumask_clear_cpu(cpu, >idle);
 
 snext->start_time = now;
 
@@ -2168,10 +2168,10 @@ csched2_schedule(
 if ( 

[Xen-devel] [PATCH 11/19] tools: tracing: adapt Credit2 load tracking events to new format

2016-06-17 Thread Dario Faggioli
in both xenalyze and formats (for xentrace_format).

In particular, in xenalyze, now that we have the precision
of the fixed point load values in the tracepoint, show both
the raw value and the (easier to interpreet) percentage.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anshul Makkar 
---
 tools/xentrace/formats|4 ++--
 tools/xentrace/xenalyze.c |   25 ++---
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index d204351..2e58d03 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -53,8 +53,8 @@
 0x00022208  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:sched_tasklet
 0x00022209  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:update_load
 0x0002220a  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_assign[ 
dom:vcpu = 0x%(1)08x, rq_id = %(2)d ]
-0x0002220b  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_vcpu_load [ 
dom:vcpu = 0x%(1)08x, avgload = %(2)d ]
-0x0002220c  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_runq_load [ 
rq_load[4]:rq_avgload[28] = 0x%(1)08x, rq_id[4]:b_avgload[28] = 0x%(2)08x ]
+0x0002220b  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_vcpu_load [ 
dom:vcpu = 0x%(3)08x, vcpuload = 0x%(2)08x%(1)08x, wshift = %(4)d ]
+0x0002220c  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:updt_runq_load [ 
rq_load[16]:rq_id[8]:wshift[8] = 0x%(5)08x, rq_avgload = 0x%(2)08x%(1)08x, 
b_avgload = 0x%(4)08x%(3)08x ]
 
 0x00022801  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:tickle[ cpu = 
%(1)d ]
 0x00022802  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:runq_pick [ dom:vcpu 
= 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index 01ead8b..f2f97bd 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7802,25 +7802,36 @@ void sched_process(struct pcpu_info *p)
 case TRC_SCHED_CLASS_EVT(CSCHED2, 11): /* UPDATE_VCPU_LOAD */
 if(opt.dump_all) {
 struct {
+uint64_t vcpuload;
 unsigned int vcpuid:16, domid:16;
-unsigned int avgload;
+unsigned int shift;
 } *r = (typeof(r))ri->d;
+double vcpuload;
 
-printf(" %s csched2:update_vcpu_load d%uv%u, avg_load = %u\n",
-   ri->dump_header, r->domid, r->vcpuid, r->avgload);
+vcpuload = (r->vcpuload * 100.0) / (1ULL << r->shift);
+
+printf(" %s csched2:update_vcpu_load d%uv%u, "
+   "vcpu_load = %4.3f%% (%"PRIu64")\n",
+   ri->dump_header, r->domid, r->vcpuid, vcpuload,
+   r->vcpuload);
 }
 break;
 case TRC_SCHED_CLASS_EVT(CSCHED2, 12): /* UPDATE_RUNQ_LOAD */
 if(opt.dump_all) {
 struct {
-unsigned int rq_load:4, rq_avgload:28;
-unsigned int rq_id:4, b_avgload:28;
+uint64_t rq_avgload, b_avgload;
+unsigned int rq_load:16, rq_id:8, shift:8;
 } *r = (typeof(r))ri->d;
+double avgload, b_avgload;
+
+avgload = (r->rq_avgload * 100.0) / (1ULL << r->shift);
+b_avgload = (r->b_avgload * 100.0) / (1ULL << r->shift);
 
 printf(" %s csched2:update_rq_load rq# %u, load = %u, "
-   "avgload = %u, b_avgload = %u\n",
+   "avgload = %4.3f%% (%"PRIu64"), "
+   "b_avgload = %4.3f%% (%"PRIu64")\n",
ri->dump_header, r->rq_id, r->rq_load,
-   r->rq_avgload, r->b_avgload);
+   avgload, r->rq_avgload, b_avgload, r->b_avgload);
 }
 break;
 /* RTDS (TRC_RTDS_xxx) */


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 08/19] xen: credit2: when tickling, check idle cpus first

2016-06-17 Thread Dario Faggioli
If there are idle pCPUs, it's always better to try to
"ship" the new vCPU there, instead than letting it
preempting on a currently busy one.

This commit also adds a cpumask_test_or_cycle() helper
function, to make it easier to code the preference for
the pCPU where the vCPU was running before.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   68 +---
 xen/include/xen/cpumask.h  |8 +
 2 files changed, 53 insertions(+), 23 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b73d034..af28e7b 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -568,9 +568,23 @@ __runq_remove(struct csched2_vcpu *svc)
 
 void burn_credits(struct csched2_runqueue_data *rqd, struct csched2_vcpu *, 
s_time_t);
 
-/* Check to see if the item on the runqueue is higher priority than what's
- * currently running; if so, wake up the processor */
-static /*inline*/ void
+/*
+ * Check what processor it is best to 'wake', for picking up a vcpu that has
+ * just been put (back) in the runqueue. Logic is as follows:
+ *  1. if there are idle processors in the runq, wake one of them;
+ *  2. if there aren't idle processor, check the one were the vcpu was
+ * running before to see if we can preempt what's running there now
+ * (and hence doing just one migration);
+ *  3. last stand: check all processors and see if the vcpu is in right
+ * of preempting any of the other vcpus running on them (this requires
+ * two migrations, and that's indeed why it is left as the last stand).
+ *
+ * Note that when we say 'idle processors' what we really mean is (pretty
+ * much always) both _idle_ and _not_already_tickled_. In fact, if a
+ * processor has been tickled, it will run csched2_schedule() shortly, and
+ * pick up some work, so it would be wrong to consider it idle.
+ */
+static void
 runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t 
now)
 {
 int i, ipid=-1;
@@ -584,22 +598,14 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 
 BUG_ON(new->rqd != rqd);
 
-/* Look at the cpu it's running on first */
-cur = CSCHED2_VCPU(curr_on_cpu(cpu));
-burn_credits(rqd, cur, now);
-
-if ( cur->credit < new->credit )
-{
-ipid = cpu;
-goto tickle;
-}
-
-/* Get a mask of idle, but not tickled, that new is allowed to run on. */
+/*
+ * Get a mask of idle, but not tickled, processors that new is
+ * allowed to run on. If that's not empty, choose someone from there
+ * (preferrably, the one were new was running on already).
+ */
 cpumask_andnot(, >idle, >tickled);
 cpumask_and(, , new->vcpu->cpu_hard_affinity);
-
-/* If it's not empty, choose one */
-i = cpumask_cycle(cpu, );
+i = cpumask_test_or_cycle(cpu, );
 if ( i < nr_cpu_ids )
 {
 SCHED_STAT_CRANK(tickled_idle_cpu);
@@ -607,12 +613,26 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 goto tickle;
 }
 
-/* Otherwise, look for the non-idle cpu with the lowest credit,
- * skipping cpus which have been tickled but not scheduled yet,
- * that new is allowed to run on. */
+/*
+ * Otherwise, look for the non-idle (and non-tickled) processors with
+ * the lowest credit, among the ones new is allowed to run on. Again,
+ * the cpu were it was running on would be the best candidate.
+ */
 cpumask_andnot(, >active, >idle);
 cpumask_andnot(, , >tickled);
 cpumask_and(, , new->vcpu->cpu_hard_affinity);
+if ( cpumask_test_cpu(cpu, ) )
+{
+cur = CSCHED2_VCPU(curr_on_cpu(cpu));
+burn_credits(rqd, cur, now);
+
+if ( cur->credit < new->credit )
+{
+SCHED_STAT_CRANK(tickled_busy_cpu);
+ipid = cpu;
+goto tickle;
+}
+}
 
 for_each_cpu(i, )
 {
@@ -624,7 +644,7 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 
 BUG_ON(is_idle_vcpu(cur->vcpu));
 
-/* Update credits for current to see if we want to preempt */
+/* Update credits for current to see if we want to preempt. */
 burn_credits(rqd, cur, now);
 
 if ( cur->credit < lowest )
@@ -647,8 +667,10 @@ runq_tickle(const struct scheduler *ops, struct 
csched2_vcpu *new, s_time_t now)
 }
 }
 
-/* Only switch to another processor if the credit difference is greater
- * than the migrate resistance */
+/*
+ * Only switch to another processor if the credit difference is
+ * greater than the migrate resistance.
+ */
 if ( ipid == -1 || lowest + CSCHED2_MIGRATE_RESIST > new->credit )
 {
 

[Xen-devel] [PATCH 07/19] xen: credit2: prevent load balancing to go mad if time goes backwards

2016-06-17 Thread Dario Faggioli
This really should not happen, but:
 1. it does happen! Investigation is ongoing here:
http://lists.xen.org/archives/html/xen-devel/2016-06/msg00922.html
 2. even when 1 will be fixed it makes sense and is easy enough
to have a 'safety catch' for it.

The reason why this is particularly bad for Credit2 is that
negative values of delta mean out of scale high load (because
of the conversion to unsigned). This, for instance in the
case of runqueue load, results in a runqueue having its load
updated to values of the order of 1% or so, which in turns
means that the load balancer will migrate everything off from
the pCPUs in the runqueue, and leave them idle until the load
gets back to something sane... which may indeed take a while!

This is not a fix for the problem of time going backwards. In
fact, if that happens a lot, load tracking accuracy is still
compromized, but at least the effect is a lot less bad than
before.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 50f8dfd..b73d034 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -404,6 +404,12 @@ __update_runq_load(const struct scheduler *ops,
 else
 {
 delta = now - rqd->load_last_update;
+if ( unlikely(delta < 0) )
+{
+d2printk("%s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
+ __func__, now, rqd->load_last_update);
+delta = 0;
+}
 
 rqd->avgload =
 ( ( delta * ( (unsigned long long)rqd->load << 
prv->load_window_shift ) )
@@ -455,6 +461,12 @@ __update_svc_load(const struct scheduler *ops,
 else
 {
 delta = now - svc->load_last_update;
+if ( unlikely(delta < 0) )
+{
+d2printk("%s: Time went backwards? now %"PRI_stime" llu 
%"PRI_stime"\n",
+ __func__, now, svc->load_last_update);
+delta = 0;
+}
 
 svc->avgload =
 ( ( delta * ( (unsigned long long)vcpu_load << 
prv->load_window_shift ) )


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 06/19] xen: credit2: read NOW() with the proper runq lock held

2016-06-17 Thread Dario Faggioli
Yet another situation very similar to 779511f4bf5ae
("sched: avoid races on time values read from NOW()").

In fact, when more than one runqueue is involved, we need
to make sure that the following does not happen:
 1. take the lock of 1st runq
 2. now = NOW()
 3. take the lock of 2nd runq
 4. use now

as, if we have to wait at step 3, the value in now may
be stale when we get to use it at step 4.

While there, fix the style of a label.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 9e8e561..50f8dfd 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1361,7 +1361,7 @@ static void balance_load(const struct scheduler *ops, int 
cpu, s_time_t now)
 
 __update_runq_load(ops, st.lrqd, 0, now);
 
-retry:
+ retry:
 if ( !spin_trylock(>lock) )
 return;
 
@@ -1377,7 +1377,8 @@ retry:
  || !spin_trylock(>lock) )
 continue;
 
-__update_runq_load(ops, st.orqd, 0, now);
+/* Use a value of NOW() sampled after taking orqd's lock. */
+__update_runq_load(ops, st.orqd, 0, NOW());
 
 delta = st.lrqd->b_avgload - st.orqd->b_avgload;
 if ( delta < 0 )
@@ -1435,6 +1436,8 @@ retry:
 if ( unlikely(st.orqd->id < 0) )
 goto out_up;
 
+now = NOW();
+
 /* Look for "swap" which gives the best load average
  * FIXME: O(n^2)! */
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 04/19] xen: credit2: kill useless helper function choose_cpu

2016-06-17 Thread Dario Faggioli
In fact, it has the same signature of csched2_cpu_pick,
which also is its uniqe caller.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 5881583..ef199e3 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -321,7 +321,7 @@ struct csched2_dom {
 /*
  * When a hard affinity change occurs, we may not be able to check some
  * (any!) of the other runqueues, when looking for the best new processor
- * for svc (as trylock-s in choose_cpu() can fail). If that happens, we
+ * for svc (as trylock-s in csched2_cpu_pick() can fail). If that happens, we
  * pick, in order of decreasing preference:
  *  - svc's current pcpu;
  *  - another pcpu from svc's current runq;
@@ -1116,7 +1116,7 @@ csched2_context_saved(const struct scheduler *ops, struct 
vcpu *vc)
 
 #define MAX_LOAD (1ULL<<60);
 static int
-choose_cpu(const struct scheduler *ops, struct vcpu *vc)
+csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
 {
 struct csched2_private *prv = CSCHED2_PRIV(ops);
 int i, min_rqi = -1, new_cpu;
@@ -1490,16 +1490,6 @@ out:
 return;
 }
 
-static int
-csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
-{
-int new_cpu;
-
-new_cpu = choose_cpu(ops, vc);
-
-return new_cpu;
-}
-
 static void
 csched2_vcpu_migrate(
 const struct scheduler *ops, struct vcpu *vc, unsigned int new_cpu)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 05/19] xen: credit2: do not warn if calling burn_credits more than once

2016-06-17 Thread Dario Faggioli
on the same vcpu, without NOW() having changed.

This is, in fact, a legitimate use case. If it happens,
we should just do nothing, without producing any warning
or debug message.

While there, fix style and add a couple of branching
hints.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ef199e3..9e8e561 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -738,14 +738,15 @@ static void reset_credit(const struct scheduler *ops, int 
cpu, s_time_t now,
 /* No need to resort runqueue, as everyone's order should be the same. */
 }
 
-void burn_credits(struct csched2_runqueue_data *rqd, struct csched2_vcpu *svc, 
s_time_t now)
+void burn_credits(struct csched2_runqueue_data *rqd,
+  struct csched2_vcpu *svc, s_time_t now)
 {
 s_time_t delta;
 
 /* Assert svc is current */
 ASSERT(svc==CSCHED2_VCPU(curr_on_cpu(svc->vcpu->processor)));
 
-if ( is_idle_vcpu(svc->vcpu) )
+if ( unlikely(is_idle_vcpu(svc->vcpu)) )
 {
 BUG_ON(svc->credit != CSCHED2_IDLE_CREDIT);
 return;
@@ -753,13 +754,16 @@ void burn_credits(struct csched2_runqueue_data *rqd, 
struct csched2_vcpu *svc, s
 
 delta = now - svc->start_time;
 
-if ( delta > 0 ) {
+if ( likely(delta > 0) )
+{
 SCHED_STAT_CRANK(burn_credits_t2c);
 t2c_update(rqd, delta, svc);
 svc->start_time = now;
 
 d2printk("b %pv c%d\n", svc->vcpu, svc->credit);
-} else {
+}
+else if ( delta < 0 )
+{
 d2printk("%s: Time went backwards? now %"PRI_stime" start 
%"PRI_stime"\n",
__func__, now, svc->start_time);
 }


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 00/19] xen: sched: assorted fixes and improvements to Credit2

2016-06-17 Thread Dario Faggioli
Hi everyone,

Here you go a collection of pseudo-random fixes and improvement to Credit2.

In the process of working on Soft Affinity and Caps support, I stumbled upon
them, one after the other, and decided to take care.

It's been hard to test and run benchmark, due to the "time goes backwards" bug
I uncovered [1], and this is at least part of the reason why the code for
affinity and caps is still missing. I've got it already, but need to refine a
couple of things, after double checking benchmark results. So, now that we have
Jan's series [2] (thanks! [*]), and that I managed to indeed run some tests on
this preliminary set of patches, I decided I better set this first group free,
while working on finishing the rest.

The various patches do a wide range of different things, so, please, refer to
individual changelogs for more detailed explanation.

About the numbers I could collect so far, here's the situation. I've run rather
simple benchmarks such as: - Xen build inside a VM. Metric is how log that
takes (in seconds), so lower is better.  - Iperf from a VM to its host. Metric
is total aggregate throughput, so higher is better.

The host is a 16 pCPUs / 2 NUMA nodes Xeon E5620, 6GB RAM per node. The VM had
16 vCPUs and 4GB of memory. Dom0 had 16 vCPUs as well, and 1GB of RAM.

The Xen build, I did it one time with -j4 --representative of low VM load-- and
another time with -j24 --representative of high VM laod. The Iperf test, I've
only used 8 parallel streams (I wanted to do 4 and 8, but there was a bug in my
scripts! :-/).

I've run the above both with and without disturbing external (from the point of
view of the VM) load. Such load were just generated by means of running
processes in dom0. It's rather basic, but it certainly keeps dom0's vCPUs busy
and stress the scheduler. This "noise", when present, was composed by:
 - 8 (v)CPU hog process (`yes &> /dev/null'), running in dom0
 - 4 processes alternating computation and sleep with a duty cycle of 35%.

So, there basically were 12 vCPUs of dom0 kept busy, in an heterogeneous 
fashion.

I benchmarked Credit2 with runqueues arranged per-core (the current default)
and per-socket, and also Credit1, for reference. The baseline was current
staging plus Jan's monotonicity series.

Actual numbers:

|===|
| CREDIT 1 (for reference)  |
|===|
| Xen build, low VM load, no noise|
|-|
|   32.207|
|-|-|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-|-|
|   18.500| 22.633  |
|-|-|
| Xen build, low VM load, with noise  |
|-|
|   38.700|
|-|-|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-|-|
|   80.317| 21.300
|===|
| CREDIT 2  |
|===|
| Xen build, low VM load, no noise| 
|-|
|runq=core   runq=socket  |
| baseline 34.543   38.070|
| patched  35.200   33.433|
|-|-|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-|-|
|runq=core   runq=socket  |   runq=core runq=socket |
| baseline 18.710   19.397| baseline21.300 21.933   |
| patched  18.013   18.530| patched 23.200 23.466   |
|-|-|
| Xen build, low VM load, with noise  |
|-|
|runq=core   runq=socket  |
| baseline 44.483   40.747|
| patched  45.866   39.493|
|-|-|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-|-|
|runq=core   runq=socket  |   runq=core runq=socket |
| baseline 41.466   30.630| baseline20.333 20.633   |
| patched  36.840   29.080| patched 19.967 21.000   |

[Xen-devel] [PATCH 03/19] xen: credit2: insert and tickle don't need a cpu parameter

2016-06-17 Thread Dario Faggioli
In fact, they always operate on the svc->processor of
the csched2_vcpu passed to them.

No functional change intended.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit2.c |   19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 0246453..5881583 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -518,8 +518,9 @@ __runq_insert(struct list_head *runq, struct csched2_vcpu 
*svc)
 }
 
 static void
-runq_insert(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu 
*svc)
+runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
 {
+unsigned int cpu = svc->vcpu->processor;
 struct list_head * runq = (ops, cpu)->runq;
 int pos = 0;
 
@@ -558,17 +559,17 @@ void burn_credits(struct csched2_runqueue_data *rqd, 
struct csched2_vcpu *, s_ti
 /* Check to see if the item on the runqueue is higher priority than what's
  * currently running; if so, wake up the processor */
 static /*inline*/ void
-runq_tickle(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu 
*new, s_time_t now)
+runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t 
now)
 {
 int i, ipid=-1;
 s_time_t lowest=(1<<30);
+unsigned int cpu = new->vcpu->processor;
 struct csched2_runqueue_data *rqd = RQD(ops, cpu);
 cpumask_t mask;
 struct csched2_vcpu * cur;
 
 d2printk("rqt %pv curr %pv\n", new->vcpu, current);
 
-BUG_ON(new->vcpu->processor != cpu);
 BUG_ON(new->rqd != rqd);
 
 /* Look at the cpu it's running on first */
@@ -1071,8 +1072,8 @@ csched2_vcpu_wake(const struct scheduler *ops, struct 
vcpu *vc)
 update_load(ops, svc->rqd, svc, 1, now);
 
 /* Put the VCPU on the runq */
-runq_insert(ops, vc->processor, svc);
-runq_tickle(ops, vc->processor, svc, now);
+runq_insert(ops, svc);
+runq_tickle(ops, svc, now);
 
 out:
 d2printk("w-\n");
@@ -1104,8 +1105,8 @@ csched2_context_saved(const struct scheduler *ops, struct 
vcpu *vc)
 {
 BUG_ON(__vcpu_on_runq(svc));
 
-runq_insert(ops, vc->processor, svc);
-runq_tickle(ops, vc->processor, svc, now);
+runq_insert(ops, svc);
+runq_tickle(ops, svc, now);
 }
 else if ( !is_idle_vcpu(vc) )
 update_load(ops, svc->rqd, svc, -1, now);
@@ -1313,8 +1314,8 @@ static void migrate(const struct scheduler *ops,
 if ( on_runq )
 {
 update_load(ops, svc->rqd, NULL, 1, now);
-runq_insert(ops, svc->vcpu->processor, svc);
-runq_tickle(ops, svc->vcpu->processor, svc, now);
+runq_insert(ops, svc);
+runq_tickle(ops, svc, now);
 SCHED_STAT_CRANK(migrate_on_runq);
 }
 else


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 01/19] xen: sched: leave CPUs doing tasklet work alone.

2016-06-17 Thread Dario Faggioli
In both Credit1 and Credit2, stop considering a pCPU idle,
if the reason why the idle vCPU is being selected, is to
do tasklet work.

Not doing so means that the tickling and load balancing
logic, seeing the pCPU as idle, considers it a candidate
for picking up vCPUs. But the pCPU won't actually pick
up or schedule any vCPU, which would then remain in the
runqueue, which is bas, especially if there were other,
truly idle pCPUs, that could execute it.

The only drawback is that we can't assume that a pCPU is
in always marked as idle when being removed from an
instance of the Credit2 scheduler (csched2_deinit_pdata).
In fact, if we are in stop-machine (i.e., during suspend
or shutdown), the pCPUs are running the stopmachine_tasklet
and hence are actually marked as busy. On the other hand,
when removing a pCPU from a Credit2 pool, it will indeed
be idle. The only thing we can do, therefore, is to
remove the BUG_ON() check.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit.c  |   12 ++--
 xen/common/sched_credit2.c |   14 ++
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index a38a63d..a6645a2 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1819,24 +1819,24 @@ csched_schedule(
 else
 snext = csched_load_balance(prv, cpu, snext, );
 
+ out:
 /*
  * Update idlers mask if necessary. When we're idling, other CPUs
  * will tickle us when they get extra work.
  */
-if ( snext->pri == CSCHED_PRI_IDLE )
+if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
 {
-if ( !cpumask_test_cpu(cpu, prv->idlers) )
-cpumask_set_cpu(cpu, prv->idlers);
+if ( cpumask_test_cpu(cpu, prv->idlers) )
+cpumask_clear_cpu(cpu, prv->idlers);
 }
-else if ( cpumask_test_cpu(cpu, prv->idlers) )
+else if ( !cpumask_test_cpu(cpu, prv->idlers) )
 {
-cpumask_clear_cpu(cpu, prv->idlers);
+cpumask_set_cpu(cpu, prv->idlers);
 }
 
 if ( !is_idle_vcpu(snext->vcpu) )
 snext->start_time += now;
 
-out:
 /*
  * Return task to run next...
  */
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 1933ff1..cf8455c 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1910,8 +1910,16 @@ csched2_schedule(
 }
 else
 {
-/* Update the idle mask if necessary */
-if ( !cpumask_test_cpu(cpu, >idle) )
+/*
+ * Update the idle mask if necessary. Note that, if we're scheduling
+ * idle in order to carry on some tasklet work, we want to play busy!
+ */
+if ( tasklet_work_scheduled )
+{
+if ( cpumask_test_cpu(cpu, >idle) )
+cpumask_clear_cpu(cpu, >idle);
+}
+else if ( !cpumask_test_cpu(cpu, >idle) )
 cpumask_set_cpu(cpu, >idle);
 /* Make sure avgload gets updated periodically even
  * if there's no activity */
@@ -2291,8 +2299,6 @@ csched2_deinit_pdata(const struct scheduler *ops, void 
*pcpu, int cpu)
 /* No need to save IRQs here, they're already disabled */
 spin_lock(>lock);
 
-BUG_ON(!cpumask_test_cpu(cpu, >idle));
-
 printk("Removing cpu %d from runqueue %d\n", cpu, rqi);
 
 cpumask_clear_cpu(cpu, >idle);


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 02/19] xen: sched: make the 'tickled' perf counter clearer

2016-06-17 Thread Dario Faggioli
In fact, what we have right now, i.e., tickle_idlers_none
and tickle_idlers_some, is not good enough for describing
what really happens in the various tickling functions of
the various scheduler.

Switch to a more descriptive set of counters, such as:
 - tickled_no_cpu: for when we don't tickle anyone
 - tickled_idle_cpu: for when we tickle one or more
 idler
 - tickled_busy_cpu: for when we tickle one or more
 non-idler

While there, fix style of an "out:" label in sched_rt.c.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Meng Xu 
Cc: Anshul Makkar 
Cc: David Vrabel 
---
 xen/common/sched_credit.c|   10 +++---
 xen/common/sched_credit2.c   |   12 +---
 xen/common/sched_rt.c|8 +---
 xen/include/xen/perfc_defn.h |5 +++--
 4 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index a6645a2..a54bb2d 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -385,7 +385,9 @@ static inline void __runq_tickle(struct csched_vcpu *new)
  || (idlers_empty && new->pri > cur->pri) )
 {
 if ( cur->pri != CSCHED_PRI_IDLE )
-SCHED_STAT_CRANK(tickle_idlers_none);
+SCHED_STAT_CRANK(tickled_busy_cpu);
+else
+SCHED_STAT_CRANK(tickled_idle_cpu);
 __cpumask_set_cpu(cpu, );
 }
 else if ( !idlers_empty )
@@ -444,13 +446,13 @@ static inline void __runq_tickle(struct csched_vcpu *new)
 set_bit(_VPF_migrating, >vcpu->pause_flags);
 }
 /* Tickle cpu anyway, to let new preempt cur. */
-SCHED_STAT_CRANK(tickle_idlers_none);
+SCHED_STAT_CRANK(tickled_busy_cpu);
 __cpumask_set_cpu(cpu, );
 }
 else if ( !new_idlers_empty )
 {
 /* Which of the idlers suitable for new shall we wake up? */
-SCHED_STAT_CRANK(tickle_idlers_some);
+SCHED_STAT_CRANK(tickled_idle_cpu);
 if ( opt_tickle_one_idle )
 {
 this_cpu(last_tickle_cpu) =
@@ -479,6 +481,8 @@ static inline void __runq_tickle(struct csched_vcpu *new)
 /* Send scheduler interrupts to designated CPUs */
 cpumask_raise_softirq(, SCHEDULE_SOFTIRQ);
 }
+else
+SCHED_STAT_CRANK(tickled_no_cpu);
 }
 
 static void
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index cf8455c..0246453 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -589,6 +589,7 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, 
struct csched2_vcpu *
 i = cpumask_cycle(cpu, );
 if ( i < nr_cpu_ids )
 {
+SCHED_STAT_CRANK(tickled_idle_cpu);
 ipid = i;
 goto tickle;
 }
@@ -637,11 +638,12 @@ runq_tickle(const struct scheduler *ops, unsigned int 
cpu, struct csched2_vcpu *
  * than the migrate resistance */
 if ( ipid == -1 || lowest + CSCHED2_MIGRATE_RESIST > new->credit )
 {
-SCHED_STAT_CRANK(tickle_idlers_none);
-goto no_tickle;
+SCHED_STAT_CRANK(tickled_no_cpu);
+return;
 }
 
-tickle:
+SCHED_STAT_CRANK(tickled_busy_cpu);
+ tickle:
 BUG_ON(ipid == -1);
 
 /* TRACE */ {
@@ -654,11 +656,7 @@ tickle:
   (unsigned char *));
 }
 cpumask_set_cpu(ipid, >tickled);
-SCHED_STAT_CRANK(tickle_idlers_some);
 cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
-
-no_tickle:
-return;
 }
 
 /*
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 5b077d7..dd1c4d3 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1140,6 +1140,7 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu 
*new)
 /* 1) if new's previous cpu is idle, kick it for cache benefit */
 if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
 {
+SCHED_STAT_CRANK(tickled_idle_cpu);
 cpu_to_tickle = new->vcpu->processor;
 goto out;
 }
@@ -1151,6 +1152,7 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu 
*new)
 iter_vc = curr_on_cpu(cpu);
 if ( is_idle_vcpu(iter_vc) )
 {
+SCHED_STAT_CRANK(tickled_idle_cpu);
 cpu_to_tickle = cpu;
 goto out;
 }
@@ -1164,14 +1166,15 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu 
*new)
 if ( latest_deadline_vcpu != NULL &&
  new->cur_deadline < latest_deadline_vcpu->cur_deadline )
 {
+SCHED_STAT_CRANK(tickled_busy_cpu);
 cpu_to_tickle = latest_deadline_vcpu->vcpu->processor;
 goto out;
 }
 
 /* didn't tickle any cpu */
-SCHED_STAT_CRANK(tickle_idlers_none);
+SCHED_STAT_CRANK(tickled_no_cpu);
 return;
-out:
+ out:
 /* 

Re: [Xen-devel] [PATCH 00/19] Assorted fixes and improvements to Credit2

2016-06-17 Thread Dario Faggioli
On Fri, 2016-06-17 at 19:32 +0200, Dario Faggioli wrote:
> Hi everyone,
> 
Mmm... I'm not sure why, but this time, 'stg mail' only managed to send
the cover letter, then it terminated with no errors! :-O

In any case, I'm resending... apologies for the noise.

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 95842: trouble: blocked/broken/fail/pass

2016-06-17 Thread osstest service owner
flight 95842 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95842/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops 3 host-install(3) broken REGR. vs. 95801

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen   6 xen-buildfail   like 95801
 build-i386-rumpuserxen6 xen-buildfail   like 95801
 test-amd64-amd64-xl-rtds  9 debian-install   fail   like 95801
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 95801
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 95801
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 95801
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 95801

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass

version targeted for testing:
 xen  759b9618b8a22ddd87d01c0bff5366814b17eea7
baseline version:
 xen  759b9618b8a22ddd87d01c0bff5366814b17eea7

Last test of basis95842  2016-06-17 00:49:45 Z0 days
Testing same since0  1970-01-01 00:00:00 Z 16969 days0 attempts

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvopspass
 build-armhf-pvopsbroken  
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  blocked 
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 

Re: [Xen-devel] [PATCH v5 3/9] monitor: Rename vm_event_monitor_guest_request

2016-06-17 Thread Tamas K Lengyel
On Thu, Jun 2, 2016 at 4:52 PM, Tamas K Lengyel  wrote:
> Mechanical renaming and relocation to the monitor subsystem.
>
> Signed-off-by: Tamas K Lengyel 
> Acked-by: Razvan Cojocaru 
> Acked-by: Jan Beulich 
> ---
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> Cc: Andrew Cooper 

Pinging the rest of the maintainers to get an update on this patch.

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/9] vm_event: clear up return value of vm_event_monitor_traps

2016-06-17 Thread Tamas K Lengyel
On Thu, Jun 2, 2016 at 4:52 PM, Tamas K Lengyel  wrote:
> The return value has not been clearly defined, with the function
> never returning 0 which seemingly indicated a condition where the
> guest should crash. In this patch we define -rc as error condition
> where the notification was not sent, 0 where the notification was sent
> and the vCPU is not paused and 1 that the notification was sent and
> that the vCPU is paused.
>
> Signed-off-by: Tamas K Lengyel 
> ---
> Cc: Jun Nakajima 
> Cc: Kevin Tian 
> Cc: Andrew Cooper 

Pinging the rest of the maintainers to get an update on this patch.
Current status is:

Acked-by: Razvan Cojocaru 
Reviewed-by: Jan Beulich 

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/9] monitor: Rename vm_event_monitor_get_capabilities

2016-06-17 Thread Tamas K Lengyel
On Thu, Jun 2, 2016 at 4:52 PM, Tamas K Lengyel  wrote:
> The monitor_get_capabilities check actually belongs to the monitor subsystem 
> so
> relocating and renaming it to sanitize the code's name and location. 
> Mechanical
> patch, no code changes introduced.
>
> Signed-off-by: Tamas K Lengyel 
> Acked-by: Andrew Cooper 
> Acked-by: Razvan Cojocaru 
> ---
> Cc: Jan Beulich 
> Cc: Stefano Stabellini 
> Cc: Julien Grall 

Pinging the rest of the maintainers to get an update on this patch.

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/7] minor (formatting) fixes

2016-06-17 Thread Tamas K Lengyel
On Fri, Jun 17, 2016 at 3:33 AM, Jan Beulich  wrote:
 On 17.06.16 at 10:36,  wrote:
>> On 06/17/2016 11:33 AM, Corneliu ZUZU wrote:
>>> Ah, ok. Didn't that patch make it to staging yet? I pulled the latest.
>>> Since you already have a patch for that I guess it's ok to remove those
>>> comments and leave the rest as it is and merge later when one of these
>>> patches makes it to staging?
>>
>> I think there's a delay caused by the 4.7 release. I also have an acked
>> patch waiting to go in that might cause you to have to rebase (part of)
>> the series: https://patchwork.kernel.org/patch/9033771/
>
> Well, there's a slight difference here: Yours is indeed delayed until
> 4.7 is set in stone, for being more intrusive. Tamas'es, otoh, I would
> probably have put in already if it had all necessary acks (or they
> had at least been pinged for, at which point we could waive the need
> for some trivial adjustments).

Patches 1 through 4 of my v5 series already has acks, so I guess I'll
issue a ping for the other maintainers to check.

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.10 test] 95840: regressions - FAIL

2016-06-17 Thread osstest service owner
flight 95840 linux-3.10 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95840/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1fail REGR. vs. 86412
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1  fail REGR. vs. 86412

Tests which are failing intermittently (not blocking):
 test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken in 95804 pass in 
95840
 test-amd64-amd64-i386-pvgrub  3 host-install(3)  broken in 95804 pass in 95840
 test-amd64-i386-freebsd10-amd64 10 guest-start  fail pass in 95804

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen   6 xen-buildfail   like 86412
 build-i386-rumpuserxen6 xen-buildfail   like 86412
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 86412
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 86412
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 86412
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 86412

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass

version targeted for testing:
 linuxca1199fccf14540e86f6da955333e31d6fec5f3e
baseline version:
 linux326a1b2d50cbe1f890e56ab60b9215cd30053f5a

Last test of basis86412  2016-03-16 16:24:33 Z   93 days
Testing same since95665  2016-06-13 14:51:45 Z4 days4 attempts


People who touched revisions under test:
  Aaro Koskinen 
  Adrian Hunter 
  Al Viro 
  Alan Stern 
  Alex Deucher 
  Alexandre Belloni 
  Alexandre Bounine 
  Alexey Khoroshilov 
  Allain Legacy 
  Andi Kleen 
  Andrew Morton 
  Andrey Gelman 
  Andy Lutomirski 
  Andy Lutomirski  # On a Dell XPS 13 9350
  Anton Blanchard 
  Antonio Quartulli 
  Aristeu Rozanski 
  Arnaldo Carvalho de Melo 
  Arnd Bergmann 
  Artem Bityutskiy 
  Aurelien Jacquiot 
  Behan Webster 
  Ben Hutchings 
  Bill Sommerfeld 
  Bjorn Helgaas 
  Bjørn Mork 
  Bob Moore 
  Borislav Petkov 
  Brian King 
  Brian Norris 
  Chanwoo Choi 
  Chas Williams <3ch...@gmail.com>
  Chris Friesen 
  Dan Carpenter 
  Dan Streetman 
  Daniel Fraga 
  Daniel Lezcano 
  David S. Miller 
  Diego Viola 
  Dinh Nguyen 
  Dmitry Ivanov 
  Dmitry Ivanov 
  Dmitry Torokhov 
  Douglas Gilbert 
  Eric Wheeler 
  Eric Wheeler 
  Eryu Guan 
  Felipe Balbi 
  Florian Westphal 
  Gabriel Krisman Bertazi 
  Geert Uytterhoeven 
  Greg Kroah-Hartman 
  Guenter Roeck 
  Guillaume Nault 
  H. Nikolaus Schaller 
  H. Peter Anvin 
  Haibo Chen 
  Haishuang Yan 

[Xen-devel] [qemu-mainline test] 95838: regressions - FAIL

2016-06-17 Thread osstest service owner
flight 95838 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95838/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 94856
 test-amd64-i386-freebsd10-amd64 10 guest-startfail REGR. vs. 94856
 test-armhf-armhf-xl-cubietruck  6 xen-bootfail REGR. vs. 94856
 test-armhf-armhf-xl   6 xen-boot  fail REGR. vs. 94856
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1fail REGR. vs. 94856
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. 
vs. 94856

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 94856
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 94856
 test-amd64-amd64-xl-rtds  9 debian-install   fail   like 94856

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass

version targeted for testing:
 qemuu585fcd4b11070b3220685fc54ecca1991cdeb161
baseline version:
 qemuud6550e9ed2e1a60d889dfb721de00d9a4e3bafbe

Last test of basis94856  2016-05-27 20:14:49 Z   20 days
Failing since 94983  2016-05-31 09:40:12 Z   17 days   21 attempts
Testing same since95838  2016-06-16 18:53:25 Z0 days1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Bennée 
  Alex Bligh 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alistair Francis 
  Amit Shah 
  Andrea Arcangeli 
  Andrew Jones 
  Anthony PERARD 
  Anton Blanchard 
  Ard Biesheuvel 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Cao jin 
  Changlong Xie 
  Chao Peng 
  Chen Fan 
  Christian Borntraeger 
  Christophe Lyon 
  Cole Robinson 
  Colin Lord 
  Corey Minyard 
  Cornelia Huck 
  Cédric Le Goater 
  Daniel P. Berrange 
  David Gibson 
  David 

[Xen-devel] [PATCH 00/19] Assorted fixes and improvements to Credit2

2016-06-17 Thread Dario Faggioli
Hi everyone,

Here you go a collection of pseudo-random fixes and improvement to Credit2.

In the process of working on Soft Affinity and Caps support, I stumbled upon
them, one after the other, and decided to take care.

It's been hard to test and run benchmark, due to the "time goes backwards" bug
I uncovered [1], and this is at least part of the reason why the code for
affinity and caps is still missing. I've got it already, but need to refine a
couple of things, after double checking benchmark results. So, now that we have
Jan's series [2] (thanks! [*]), and that I managed to indeed run some tests on
this preliminary set of patches, I decided I better set this first group free,
while working on finishing the rest.

The various patches do a wide range of different things, so, please, refer to
individual changelogs for more detailed explanation.

About the numbers I could collect so far, here's the situation. I've run rather
simple benchmarks such as: - Xen build inside a VM. Metric is how log that
takes (in seconds), so lower is better.  - Iperf from a VM to its host. Metric
is total aggregate throughput, so higher is better.

The host is a 16 pCPUs / 2 NUMA nodes Xeon E5620, 6GB RAM per node. The VM had
16 vCPUs and 4GB of memory. Dom0 had 16 vCPUs as well, and 1GB of RAM.

The Xen build, I did it one time with -j4 --representative of low VM load-- and
another time with -j24 --representative of high VM laod. The Iperf test, I've
only used 8 parallel streams (I wanted to do 4 and 8, but there was a bug in my
scripts! :-/).

I've run the above both with and without disturbing external (from the point of
view of the VM) load. Such load were just generated by means of running
processes in dom0. It's rather basic, but it certainly keeps dom0's vCPUs busy
and stress the scheduler. This "noise", when present, was composed by:
 - 8 (v)CPU hog process (`yes &> /dev/null'), running in dom0
 - 4 processes alternating computation and sleep with a duty cycle of 35%.

So, there basically were 12 vCPUs of dom0 kept busy, in an heterogeneous 
fashion.

I benchmarked Credit2 with runqueues arranged per-core (the current default)
and per-socket, and also Credit1, for reference. The baseline was current
staging plus Jan's monotonicity series.

Actual numbers:

|===|
| CREDIT 1 (for reference)  |
|===|
| Xen build, low VM load, no noise|
|-|
|   32.207|
|-|-|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-|-|
|   18.500| 22.633  |
|-|-|
| Xen build, low VM load, with noise  |
|-|
|   38.700|
|-|-|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-|-|
|   80.317| 21.300
|===|
| CREDIT 2  |
|===|
| Xen build, low VM load, no noise| 
|-|
|runq=core   runq=socket  |
| baseline 34.543   38.070|
| patched  35.200   33.433|
|-|-|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-|-|
|runq=core   runq=socket  |   runq=core runq=socket |
| baseline 18.710   19.397| baseline21.300 21.933   |
| patched  18.013   18.530| patched 23.200 23.466   |
|-|-|
| Xen build, low VM load, with noise  |
|-|
|runq=core   runq=socket  |
| baseline 44.483   40.747|
| patched  45.866   39.493|
|-|-|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-|-|
|runq=core   runq=socket  |   runq=core runq=socket |
| baseline 41.466   30.630| baseline20.333 20.633   |
| patched  36.840   29.080| patched 19.967 21.000   |

Re: [Xen-devel] [PATCH v2 1/2] libs, libxc: Interface for grant copy operation

2016-06-17 Thread Paulina Szubarczyk
On Fri, 2016-06-17 at 17:43 +0100, Wei Liu wrote:
> On Thu, Jun 16, 2016 at 01:16:54PM +0100, Wei Liu wrote:
> [...]
> [...]
> > > diff --git a/tools/libs/gnttab/private.h b/tools/libs/gnttab/private.h
> > > index d286c86..22ad53a 100644
> > > --- a/tools/libs/gnttab/private.h
> > > +++ b/tools/libs/gnttab/private.h
> > > @@ -9,6 +9,20 @@ struct xengntdev_handle {
> > >  int fd;
> > >  };
> > >
> > > +struct xengnttab_copy_grant_segment {
> > > +union xengnttab_copy_ptr {
> > > +void *virt;
> > > +struct {
> > > +uint32_t ref;
> > > +uint16_t offset;
> > > +uint16_t domid;
> > > +} foreign;
> > > +} source, dest;
> > > +uint16_t len;
> > > +uint16_t flags;
> > > +int16_t status;
> > > +};
> > >
> >
> > The struct is more or less a direct copy of Linux structure. It is
> > probably fine as-is, but I don't want to risk making this library Linux
> > centric. If you look at other functions, they accept a bunch of discrete
> > arguments then assemble those arguments into OS dependent structure in
> > osdep functions. I know having discrete arguments for the API you want
> > to introduce seems cumbersome, but I want to at least tell you all the
> > background information needed for a thorough discussion. I would be
> > interested in Roger's view on this.
> >
> > I apologise for not having commented on your series earlier.
> >
> 
> After checking various places I'm convinced that this structure is fine
> as-is.
> 
> BSDes have not yet had a user space grant table driver, so I don't
> really worry about that at this point.
> 
> As I have asked you to removed all the stuff in xenctrl_compat.h, you
> will need to move this to libs/gnttab/xengnttab.h.
> 
> Also I have one further comment for code:
> 
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +uint32_t count,
> > +xengnttab_grant_copy_segment_t* segs)
> > +{
> > +int fd = xgt->fd;
> > +struct ioctl_gntdev_grant_copy copy;
> > +
> > +copy.segments = (struct ioctl_gntdev_grant_copy_segment*)segs;
> 
> Please create an array of ioctl structure and use explicit field by
> field assignment here.
> 
> Casting like this can easily introduce bug -- just imagine ioctl gets
> changed, or the segment structure gets changed.
> 
> Wei.

Hi Wei, 

Thank you for all the remarks. I am working on correcting the patch.

Paulina



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 95866: tolerable all pass - PUSHED

2016-06-17 Thread osstest service owner
flight 95866 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95866/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  08754333892407f415045c05659783baeb8fc5d4
baseline version:
 xen  2f7b56c1a7b1ee92ff5f92888723e380412dd3ab

Last test of basis95858  2016-06-17 12:02:01 Z0 days
Testing same since95866  2016-06-17 15:01:34 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Dario Faggioli 
  George Dunlap 
  Jan Beulich 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=08754333892407f415045c05659783baeb8fc5d4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
08754333892407f415045c05659783baeb8fc5d4
+ branch=xen-unstable-smoke
+ revision=08754333892407f415045c05659783baeb8fc5d4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.7-testing
+ '[' x08754333892407f415045c05659783baeb8fc5d4 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print 

Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Konrad Rzeszutek Wilk
On Fri, Jun 17, 2016 at 01:18:47PM -0400, Daniel De Graaf wrote:
> On 06/17/2016 01:14 PM, Konrad Rzeszutek Wilk wrote:
> >On Fri, Jun 17, 2016 at 01:04:01PM -0400, Daniel De Graaf wrote:
> >>On 06/17/2016 11:50 AM, Konrad Rzeszutek Wilk wrote:
> >>>On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:
> Since FLASK is the only implementation of XSM hooks in Xen, using an
> iterated initcall dispatch for setup is overly complex.  Change this to
> >>>
> >>>As such, should the Kconfig file enable the FLASK by default?
> >>>Or make the XSM entry have the configuration for FLASK?
> >>>
> >>>Or perhaps make the FLASK be the visible one and select XSM?
> >>
> >>XSM has previously been the configuration option to enable.  If XSM is
> >>enabled (by choice), FLASK will then be enabled by default.
> >
> >Ah, OK. I need to check, but could you disable FLASK and still have XSM 
> >enabled?
> 
> Yes, but it won't do anything except slow down the hypervisor a bit.
> It's the same behavior as enabling flask and passing "flask=disabled" on
> the hypervisor command line.

Right. Would it make sense to squash XSM and FLASK together in the Kconfig?

> 
> If someone wanted to write another XSM implementation (or resurrect the
> old ACM module), this would be the starting point for that.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/15] flask: improve unknown permission handling

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 01:13 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Jun 17, 2016 at 01:02:58PM -0400, Daniel De Graaf wrote:

On 06/17/2016 11:45 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 09, 2016 at 10:47:14AM -0400, Daniel De Graaf wrote:

When an unknown domctl, sysctl, or other operation is encountered in the
FLASK security server, use the allow_unknown bit in the security policy
to decide if the permission should be allowed or denied.  This bit is
off by default, but it can be set by using checkpolicy -U allow when
compiling the policy.  This allows new operations to be tested without
needing to immediately add security checks; however, it is not flexible
enough to avoid adding the actual permission checks.  An error message
is printed to the hypervisor console when this fallback is encountered.


.. and the operation is permitted.


The error message is printed either way (with a different priority).  Were


correct.

you suggesting I expand this explanation to include both the error and
warning messages separately?


It just that the patch changes the behavior. That is in the past if
you had created an policy using checkpolicy -U allow it would print an
error and return -EPERM.

But now it will print an error and return 0 and pass the XSM check
(aka operation ends being permitted).


I would be surprised if someone actually used allow_unknown before now,
since it did nothing and required manually enabling.  But if they did,
this is a functionality change.  I'll add a note of that.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 01:14 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Jun 17, 2016 at 01:04:01PM -0400, Daniel De Graaf wrote:

On 06/17/2016 11:50 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:

Since FLASK is the only implementation of XSM hooks in Xen, using an
iterated initcall dispatch for setup is overly complex.  Change this to


As such, should the Kconfig file enable the FLASK by default?
Or make the XSM entry have the configuration for FLASK?

Or perhaps make the FLASK be the visible one and select XSM?


XSM has previously been the configuration option to enable.  If XSM is
enabled (by choice), FLASK will then be enabled by default.


Ah, OK. I need to check, but could you disable FLASK and still have XSM enabled?


Yes, but it won't do anything except slow down the hypervisor a bit.
It's the same behavior as enabling flask and passing "flask=disabled" on
the hypervisor command line.

If someone wanted to write another XSM implementation (or resurrect the
old ACM module), this would be the starting point for that.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Konrad Rzeszutek Wilk
On Fri, Jun 17, 2016 at 01:04:01PM -0400, Daniel De Graaf wrote:
> On 06/17/2016 11:50 AM, Konrad Rzeszutek Wilk wrote:
> >On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:
> >>Since FLASK is the only implementation of XSM hooks in Xen, using an
> >>iterated initcall dispatch for setup is overly complex.  Change this to
> >
> >As such, should the Kconfig file enable the FLASK by default?
> >Or make the XSM entry have the configuration for FLASK?
> >
> >Or perhaps make the FLASK be the visible one and select XSM?
> 
> XSM has previously been the configuration option to enable.  If XSM is
> enabled (by choice), FLASK will then be enabled by default.

Ah, OK. I need to check, but could you disable FLASK and still have XSM enabled?

> 
> Logically, FLASK is an implementation of XSM, and while it would be
> possible to swap them, this would probably need to be done by hiding the
> XSM option from the user.

Right.

> 
> >>a direct function call to a globally visible function; if additional XSM
> >>hooks are added in the future, a switching mechanism will be needed
> >>regardless, and that can be placed in xsm_core.c.
> >

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/15] flask: improve unknown permission handling

2016-06-17 Thread Konrad Rzeszutek Wilk
On Fri, Jun 17, 2016 at 01:02:58PM -0400, Daniel De Graaf wrote:
> On 06/17/2016 11:45 AM, Konrad Rzeszutek Wilk wrote:
> >On Thu, Jun 09, 2016 at 10:47:14AM -0400, Daniel De Graaf wrote:
> >>When an unknown domctl, sysctl, or other operation is encountered in the
> >>FLASK security server, use the allow_unknown bit in the security policy
> >>to decide if the permission should be allowed or denied.  This bit is
> >>off by default, but it can be set by using checkpolicy -U allow when
> >>compiling the policy.  This allows new operations to be tested without
> >>needing to immediately add security checks; however, it is not flexible
> >>enough to avoid adding the actual permission checks.  An error message
> >>is printed to the hypervisor console when this fallback is encountered.
> >
> >.. and the operation is permitted.
> 
> The error message is printed either way (with a different priority).  Were

correct.
> you suggesting I expand this explanation to include both the error and
> warning messages separately?

It just that the patch changes the behavior. That is in the past if
you had created an policy using checkpolicy -U allow it would print an
error and return -EPERM.

But now it will print an error and return 0 and pass the XSM check 
(aka operation ends being permitted).

> 
> >>
> >>Signed-off-by: Daniel De Graaf 
> >>---
> >> xen/xsm/flask/hooks.c| 44 
> >> +---
> >> xen/xsm/flask/include/security.h |  2 ++
> >> xen/xsm/flask/ss/policydb.c  |  1 +
> >> xen/xsm/flask/ss/policydb.h  |  6 ++
> >> xen/xsm/flask/ss/services.c  |  5 +
> >> 5 files changed, 42 insertions(+), 16 deletions(-)
> >>
> >>diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> >>index a8d45e7..3ab3fbf 100644
> >>--- a/xen/xsm/flask/hooks.c
> >>+++ b/xen/xsm/flask/hooks.c
> >>@@ -136,6 +136,23 @@ static int get_irq_sid(int irq, u32 *sid, struct 
> >>avc_audit_data *ad)
> >> return 0;
> >> }
> >>
> >>+static int avc_unknown_permission(const char *name, int id)
> >>+{
> >>+int rc;
> >
> >I would add a new line here.
> 
> OK
> 
> >>+if ( !flask_enforcing || security_get_allow_unknown() )
> >>+{
> >>+printk(XENLOG_G_WARNING "FLASK: Allowing unknown %s: %d.\n", name, 
> >>id);
> >>+rc = 0;
> >>+}
> >>+else
> >>+{
> >>+printk(XENLOG_G_ERR "FLASK: Denying unknown %s: %d.\n", name, id);
> >>+rc = -EPERM;
> >>+}
> >>+
> >>+return rc;
> >>+}
> >>+
> >
> >The rest looks OK, but I have a question: Is this how Linux operates?
> 
> Yes; selinux_nlmsg_perm for an unknown netlink message seems to be an
> example there.
> 
> -- 
> Daniel De Graaf
> National Security Agency

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 11:50 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:

Since FLASK is the only implementation of XSM hooks in Xen, using an
iterated initcall dispatch for setup is overly complex.  Change this to


As such, should the Kconfig file enable the FLASK by default?
Or make the XSM entry have the configuration for FLASK?

Or perhaps make the FLASK be the visible one and select XSM?


XSM has previously been the configuration option to enable.  If XSM is
enabled (by choice), FLASK will then be enabled by default.

Logically, FLASK is an implementation of XSM, and while it would be
possible to swap them, this would probably need to be done by hiding the
XSM option from the user.


a direct function call to a globally visible function; if additional XSM
hooks are added in the future, a switching mechanism will be needed
regardless, and that can be placed in xsm_core.c.



+config FLASK
+   bool "FLux Advanced Security Kernel support"
+   default y
+   depends on XSM


So this would be 'select XSM' ?

+   ---help---
+ Enables FLASK (FLux Advanced Security Kernel) as the access control
+ mechanism used by the XSM framework.  This provides a mandatory access
+ control framework by which security enforcement, isolation, and
+ auditing can be achieved with fine granular control via a security
+ policy.
+
+ If unsure, say Y.
+
+config FLASK_AVC_STATS
+   def_bool y
+   depends on FLASK
+   ---help---
+ Maintain statistics on the access vector cache
+
 # Enable schedulers
 menu "Schedulers"
visible if EXPERT = "y"
diff --git a/xen/xsm/xsm_core.c b/xen/xsm/xsm_core.c
index 634ec98..cb2fdb6 100644
--- a/xen/xsm/xsm_core.c
+++ b/xen/xsm/xsm_core.c
@@ -38,13 +38,7 @@ static inline int verify(struct xsm_operations *ops)

 static void __init do_xsm_initcalls(void)
 {
-xsm_initcall_t *call;
-call = __xsm_initcall_start;
-while ( call < __xsm_initcall_end )
-{
-(*call) ();
-call++;
-}
+flask_init();


Could you delete do_xsm_initcalls() and make xsm_init() call flask_init() ?


Sure.

--
Daniel De Graaf
National Security Agency

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/15] flask: improve unknown permission handling

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 11:45 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 09, 2016 at 10:47:14AM -0400, Daniel De Graaf wrote:

When an unknown domctl, sysctl, or other operation is encountered in the
FLASK security server, use the allow_unknown bit in the security policy
to decide if the permission should be allowed or denied.  This bit is
off by default, but it can be set by using checkpolicy -U allow when
compiling the policy.  This allows new operations to be tested without
needing to immediately add security checks; however, it is not flexible
enough to avoid adding the actual permission checks.  An error message
is printed to the hypervisor console when this fallback is encountered.


.. and the operation is permitted.


The error message is printed either way (with a different priority).  Were
you suggesting I expand this explanation to include both the error and
warning messages separately?



Signed-off-by: Daniel De Graaf 
---
 xen/xsm/flask/hooks.c| 44 +---
 xen/xsm/flask/include/security.h |  2 ++
 xen/xsm/flask/ss/policydb.c  |  1 +
 xen/xsm/flask/ss/policydb.h  |  6 ++
 xen/xsm/flask/ss/services.c  |  5 +
 5 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index a8d45e7..3ab3fbf 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -136,6 +136,23 @@ static int get_irq_sid(int irq, u32 *sid, struct 
avc_audit_data *ad)
 return 0;
 }

+static int avc_unknown_permission(const char *name, int id)
+{
+int rc;


I would add a new line here.


OK


+if ( !flask_enforcing || security_get_allow_unknown() )
+{
+printk(XENLOG_G_WARNING "FLASK: Allowing unknown %s: %d.\n", name, id);
+rc = 0;
+}
+else
+{
+printk(XENLOG_G_ERR "FLASK: Denying unknown %s: %d.\n", name, id);
+rc = -EPERM;
+}
+
+return rc;
+}
+


The rest looks OK, but I have a question: Is this how Linux operates?


Yes; selinux_nlmsg_perm for an unknown netlink message seems to be an
example there.

--
Daniel De Graaf
National Security Agency

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/15] flask/policy: move user definitions and constraints into modules

2016-06-17 Thread Daniel De Graaf

On 06/17/2016 11:28 AM, Konrad Rzeszutek Wilk wrote:

diff --git a/tools/flask/policy/modules/modules.conf 
b/tools/flask/policy/modules/modules.conf
index d875dbf..9aac6a0 100644
--- a/tools/flask/policy/modules/modules.conf
+++ b/tools/flask/policy/modules/modules.conf
@@ -34,6 +34,13 @@ nomigrate = on
 nic_dev = on

 # This allows any domain type to be created using the system_r role.  When it 
is
-# disabled, domains not using the default types (dom0_t and domU_t) must use
-# another role (such as vm_r) from the vm_role module.
+# disabled, domains not using the default types (dom0_t, domU_t, dm_dom_t) must
+# use another role (such as vm_r from the vm_role module below).
 all_system_role = on
+
+# Example users, roles, and constraints for user-based separation.
+#
+# The three users defined here can set up grant/event channel communication
+# (vchan, device frontend/backend) between their own VMs, but cannot set up a
+# channel to a VM under a different user.
+vm_role = on


So should this be off? As by default we would want all_system_role ?

Ah wait, it can be loaded - even if not used.


Yes, enabling both of these modules gives you flexibility to use either or
both types for domains.  Enabling only one would be useful to enforce its
use, and disabling both doesn't make much sense unless you were adding
another module.

--
Daniel De Graaf
National Security Agency

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] libs, libxc: Interface for grant copy operation

2016-06-17 Thread Wei Liu
On Thu, Jun 16, 2016 at 01:16:54PM +0100, Wei Liu wrote:
[...]
[...]
> > diff --git a/tools/libs/gnttab/private.h b/tools/libs/gnttab/private.h
> > index d286c86..22ad53a 100644
> > --- a/tools/libs/gnttab/private.h
> > +++ b/tools/libs/gnttab/private.h
> > @@ -9,6 +9,20 @@ struct xengntdev_handle {
> >  int fd;
> >  };
> >
> > +struct xengnttab_copy_grant_segment {
> > +union xengnttab_copy_ptr {
> > +void *virt;
> > +struct {
> > +uint32_t ref;
> > +uint16_t offset;
> > +uint16_t domid;
> > +} foreign;
> > +} source, dest;
> > +uint16_t len;
> > +uint16_t flags;
> > +int16_t status;
> > +};
> >
>
> The struct is more or less a direct copy of Linux structure. It is
> probably fine as-is, but I don't want to risk making this library Linux
> centric. If you look at other functions, they accept a bunch of discrete
> arguments then assemble those arguments into OS dependent structure in
> osdep functions. I know having discrete arguments for the API you want
> to introduce seems cumbersome, but I want to at least tell you all the
> background information needed for a thorough discussion. I would be
> interested in Roger's view on this.
>
> I apologise for not having commented on your series earlier.
>

After checking various places I'm convinced that this structure is fine
as-is.

BSDes have not yet had a user space grant table driver, so I don't
really worry about that at this point.

As I have asked you to removed all the stuff in xenctrl_compat.h, you
will need to move this to libs/gnttab/xengnttab.h.

Also I have one further comment for code:

> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> +uint32_t count,
> +xengnttab_grant_copy_segment_t* segs)
> +{
> +int fd = xgt->fd;
> +struct ioctl_gntdev_grant_copy copy;
> +
> +copy.segments = (struct ioctl_gntdev_grant_copy_segment*)segs;

Please create an array of ioctl structure and use explicit field by
field assignment here.

Casting like this can easily introduce bug -- just imagine ioctl gets
changed, or the segment structure gets changed.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/4] x86/vMSI-X: defer intercept handler registration

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 18:13,  wrote:
> On Wed, Jun 08, 2016 at 06:52:58AM -0600, Jan Beulich wrote:
>> There's no point in registering the internal MSI-X table intercept
>> functions on all domains - it is sufficient to do so once a domain gets
>> an MSI-X capable device assigned.
> 
> I think this will break on SR-IOV devices that are created and we
> had not setup MCFG.

How does MCFG get into the picture here? (And anyway, this series
is about vMSI-X, not host MSI-X.)

> There is this Intel board (which I have) where the MCFG is setup
> only via ACPI and in the past we had issues - where we never
> detect that the VF had MSI-X - b/c we could not access the
> configuration registers past 0xff.

The MSI-X capability necessarily lives in the low 256 bytes. Do you
mean the SR-IOV capability?

> I am pretty sure that Linux now uploads the MCFG data during
> bootup from the ACPI AML code, but earlier kernels may not.
> 
> And that would mean the device would try to use MSI-X while
> code would be !pdev->msix.
> 
> Hmm, thought I wonder - with all those improvements of capturing
> the MSI-X and validating it in the hypervisor - would this
> even matter?
> 
> Let me stash an 82576 card in that box and see what happens with
> Xen 4.7.

Thanks.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/4] x86/vMSI-X: defer intercept handler registration

2016-06-17 Thread Konrad Rzeszutek Wilk
On Wed, Jun 08, 2016 at 06:52:58AM -0600, Jan Beulich wrote:
> There's no point in registering the internal MSI-X table intercept
> functions on all domains - it is sufficient to do so once a domain gets
> an MSI-X capable device assigned.

I think this will break on SR-IOV devices that are created and we
had not setup MCFG.

There is this Intel board (which I have) where the MCFG is setup
only via ACPI and in the past we had issues - where we never
detect that the VF had MSI-X - b/c we could not access the
configuration registers past 0xff.

I am pretty sure that Linux now uploads the MCFG data during
bootup from the ACPI AML code, but earlier kernels may not.

And that would mean the device would try to use MSI-X while
code would be !pdev->msix.

Hmm, thought I wonder - with all those improvements of capturing
the MSI-X and validating it in the hypervisor - would this
even matter?

Let me stash an 82576 card in that box and see what happens with
Xen 4.7.

> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -644,8 +644,6 @@ int hvm_domain_initialise(struct domain
>  
>  rtc_init(d);
>  
> -msixtbl_init(d);
> -
>  register_portio_handler(d, 0xe9, 1, hvm_print_line);
>  
>  if ( hvm_tsc_scaling_supported )
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -553,7 +553,8 @@ found:
>  
>  void msixtbl_init(struct domain *d)
>  {
> -if ( !has_vlapic(d) )
> +if ( !has_hvm_container_domain(d) || !has_vlapic(d) ||
> + d->arch.hvm_domain.msixtbl_list.next )
>  return;
>  
>  INIT_LIST_HEAD(>arch.hvm_domain.msixtbl_list);
> @@ -567,7 +568,7 @@ void msixtbl_pt_cleanup(struct domain *d
>  struct msixtbl_entry *entry, *temp;
>  unsigned long flags;
>  
> -if ( !has_vlapic(d) )
> +if ( !d->arch.hvm_domain.msixtbl_list.next )
>  return;
>  
>  /* msixtbl_list_lock must be acquired with irq_disabled for check_lock() 
> */
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1380,6 +1380,9 @@ static int assign_device(struct domain *
>  goto done;
>  }
>  
> +if ( pdev->msix )
> +msixtbl_init(d);
> +
>  pdev->fault.count = 0;
>  
>  if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), 
> flag)) )
> 
> 
> 

> x86/vMSI-X: defer intercept handler registration
> 
> There's no point in registering the internal MSI-X table intercept
> functions on all domains - it is sufficient to do so once a domain gets
> an MSI-X capable device assigned.
> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -644,8 +644,6 @@ int hvm_domain_initialise(struct domain
>  
>  rtc_init(d);
>  
> -msixtbl_init(d);
> -
>  register_portio_handler(d, 0xe9, 1, hvm_print_line);
>  
>  if ( hvm_tsc_scaling_supported )
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -553,7 +553,8 @@ found:
>  
>  void msixtbl_init(struct domain *d)
>  {
> -if ( !has_vlapic(d) )
> +if ( !has_hvm_container_domain(d) || !has_vlapic(d) ||
> + d->arch.hvm_domain.msixtbl_list.next )
>  return;
>  
>  INIT_LIST_HEAD(>arch.hvm_domain.msixtbl_list);
> @@ -567,7 +568,7 @@ void msixtbl_pt_cleanup(struct domain *d
>  struct msixtbl_entry *entry, *temp;
>  unsigned long flags;
>  
> -if ( !has_vlapic(d) )
> +if ( !d->arch.hvm_domain.msixtbl_list.next )
>  return;
>  
>  /* msixtbl_list_lock must be acquired with irq_disabled for check_lock() 
> */
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1380,6 +1380,9 @@ static int assign_device(struct domain *
>  goto done;
>  }
>  
> +if ( pdev->msix )
> +msixtbl_init(d);
> +
>  pdev->fault.count = 0;
>  
>  if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), 
> flag)) )

> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for 32/64 word size mix

2016-06-17 Thread Stefano Stabellini
On Fri, 17 Jun 2016, Paul Durrant wrote:
> > -Original Message-
> > From: Juergen Gross [mailto:jgr...@suse.com]
> > Sent: 17 June 2016 11:40
> > To: Paul Durrant; Jan Beulich
> > Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> > de...@nongnu.org; kra...@redhat.com
> > Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for
> > 32/64 word size mix
> > 
> > On 17/06/16 12:15, Paul Durrant wrote:
> > >> -Original Message-
> > >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> > >> Juergen Gross
> > >> Sent: 17 June 2016 11:08
> > >> To: Paul Durrant; Jan Beulich
> > >> Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> > >> de...@nongnu.org; kra...@redhat.com
> > >> Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for
> > >> 32/64 word size mix
> > >>
> > >> On 17/06/16 11:50, Paul Durrant wrote:
> >  -Original Message-
> >  From: Juergen Gross [mailto:jgr...@suse.com]
> >  Sent: 17 June 2016 10:46
> >  To: Paul Durrant; Jan Beulich
> >  Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> >  de...@nongnu.org; kra...@redhat.com
> >  Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD
> > for
> >  32/64 word size mix
> > 
> >  On 17/06/16 11:37, Paul Durrant wrote:
> > >> -Original Message-
> > >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On
> > Behalf
> > >> Of
> >  Jan
> > >> Beulich
> > >> Sent: 17 June 2016 10:26
> > >> To: Juergen Gross
> > >> Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> > >> de...@nongnu.org; kra...@redhat.com
> > >> Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk
> > BLKIF_OP_DISCARD
> > >> for
> > >> 32/64 word size mix
> > >>
> > > On 17.06.16 at 11:14,  wrote:
> > >>> In case the word size of the domU and qemu running the qdisk
> > >> backend
> > >>> differ BLKIF_OP_DISCARD will not work reliably, as the request
> > >>> structure in the ring have different layouts for different word 
> > >>> size.
> > >>>
> > >>> Correct this by copying the request structure in case of different
> > >>> word size element by element in the BLKIF_OP_DISCARD case, too.
> > >>>
> > >>> The easiest way to achieve this is to resync hw/block/xen_blkif.h
> > with
> > >>> its original source from the Linux kernel.
> > >>>
> > >>> Signed-off-by: Juergen Gross 
> > >>> ---
> > >>> V2: resync with Linux kernel version of hw/block/xen_blkif.h as
> > >>> suggested by Paul Durrant
> > >>
> > >> Oh, I didn't realize he suggested syncing with the Linux variant.
> > >> Why not with the canonical one? I have to admit that I particularly
> > >> dislike Linux'es strange union-izng, mainly because of it requiring
> > >> this myriad of __attribute__((__packed__)).
> > >>
> > >
> > > Yes, it's truly grotesque and such things should be blown away with
> >  extreme prejudice.
> > 
> >  Sorry, I'm confused now.
> > 
> >  Do you still mandate for the resync or not?
> > 
> >  Resyncing with elimination of all the __packed__ stuff seems not to be
> >  a proper alternative as this would require a major rework.
> > >>>
> > >>> Why? Replacing the existing horribleness with the canonical header
> > (fixed
> > >> for style) might mean a large diff but it should be functionally the 
> > >> same or
> > >> something has gone very seriously wrong. If the extra part you need is
> > not in
> > >> the canonical header then adding this as a second patch seems like a
> > >> reasonable plan.
> > >>
> > >> I think you don't realize that qemu is built using the public headers
> > >> from the Xen build environment. So there is no way to resync with the
> > >> canonical header as this isn't part of the qemu tree.
> > >>
> > >
> > > Now I'm confused... you're posting a patch to hw/block/xen_blkif.h. That's
> > in the QEMU source, right? That's not a Xen public header but is a Linux
> > mangled variant of a Xen public header. So, actually, I guess the question 
> > is
> > why can't this header just go away and QEMU use the canonical header
> > directly from Xen?
> > 
> > No, hw/block/xen_blkif.h is based on the Linux header
> > drivers/block/xen-blkback/common.h which is an add-on header to the
> > canonical-based Linux header include/xen/interface/io/blkif.h
> >
> > >> The header in question is originating from the Linux one which is an
> > >> add-on of the canonical header containing the explicit 32- and 64-bit
> > >> variants of the xenbus protocol and the conversion routines between
> > >> those.
> > >>
> > >> It would be possible to add these parts to the canonical header, but
> > >> do we really want that?
> > >>
> > >
> > > No, we shouldn't be taking Linux brokenness into the canonical header.
> > 
> > Okay, so then 

Re: [Xen-devel] [PATCH] xen: fix qdisk BLKIF_OP_DISCARD for 32/64 word size mix

2016-06-17 Thread Stefano Stabellini
On Thu, 16 Jun 2016, Juergen Gross wrote:
> On 16/06/16 15:07, Stefano Stabellini wrote:
> > On Thu, 16 Jun 2016, Juergen Gross wrote:
> >> On 16/06/16 12:54, Jan Beulich wrote:
> >> On 16.06.16 at 12:02,  wrote:
>  In case the word size of the domU and qemu running the qdisk backend
>  differ BLKIF_OP_DISCARD will not work reliably, as the request
>  structure in the ring have different layouts for different word size.
> 
>  Correct this by copying the request structure in case of different
>  word size element by element in the BLKIF_OP_DISCARD case, too.
> 
>  Signed-off-by: Juergen Gross 
> >>>
> >>> With the indentation (tabs vs blanks) fixed
> >>
> >> Hmm, qemu coding style is to use blanks. I could:
> >> a) leave the patch as is (changed lines indented with blanks)
> >> b) use tabs to indent (style of the modified file up to now)
> >> c) change the style of the file in this patch
> >> d) change the style of the file in a separate patch
> >>
> >> Any preferences?
> > 
> > I would go with d), fixing it to conform with QEMU coding style.
> 
> As I'm about to resync the header with the Linux one as Paul suggested
> it will be more like c). :-)
 
That's not a good idea, as you probably have noticed by now on the other
thread. This patch is OK, you just need to fix the intendation
separately.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 00/15] XSM/FLASK updates for 4.8

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:03AM -0400, Daniel De Graaf wrote:
> Some of these patches have been posted before (patch 11 was posted in
> 2014; an earlier variant of 1-6 and 15 were posted recently as RFC).
> The rest are mostly removal of unused code or other cleanup.

Hey Daniel,

I reviewed most (all?) of the patches. Thanks for posting them!
> 
> FLASK policy updates:
> [PATCH 01/15] flask/policy: split into modules
> [PATCH 02/15] flask/policy: split out rules for system_r
> [PATCH 03/15] flask/policy: move user definitions and constraints
> [PATCH 04/15] flask/policy: remove unused support for binary modules
> [PATCH 05/15] flask/policy: xenstore stubdom policy
> [PATCH 06/15] flask/policy: remove unused example
> 
> Hypervisor updates to the FLASK security server:
> [PATCH 07/15] flask: unify {get,set}vcpucontext permissions
> [PATCH 08/15] flask: remove unused secondary context in ocontext
> [PATCH 09/15] flask: remove unused AVC callback functions
> [PATCH 10/15] flask: remove xen_flask_userlist operation
> [PATCH 11/15] flask: improve unknown permission handling
> 
> Hypervisor updates to the XSM framework:
> [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section
> [PATCH 13/15] xsm: annotate setup functions with __init
> [PATCH 14/15] xsm: clean up unregistration
> [PATCH 15/15] xsm: add a default policy to .init.data
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/15] xsm: add a default policy to .init.data

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:18AM -0400, Daniel De Graaf wrote:
> This adds a Kconfig option and support for including the XSM policy from
> tools/flask/policy in the hypervisor so that the bootloader does not
> need to provide a policy to get sane behavior from an XSM-enabled
> hypervisor.  The policy provided by the bootloader, if present, will
> override the built-in policy.
> 
> Enabling this option only builds the policy if checkpolicy is available
> during compilation of the hypervisor; otherwise, it does nothing.  The
> XSM policy is not moved out of tools because that remains the primary
> location for installing and configuring the policy.
> 
> Signed-off-by: Daniel De Graaf 

Reviewed-by: Konrad Rzeszutek Wilk 

Thanks!

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 14/15] xsm: clean up unregistration

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:17AM -0400, Daniel De Graaf wrote:
> The only possible value of original_ops was _xsm_ops, and
> unregister_xsm was never used.
> 
> Signed-off-by: Daniel De Graaf 

Reviewed-by: Konrad Rzeszutek Wilk 

> ---
>  xen/include/xsm/xsm.h|  1 -
>  xen/xsm/flask/flask_op.c |  4 +---
>  xen/xsm/flask/hooks.c|  3 ---
>  xen/xsm/xsm_core.c   | 16 
>  4 files changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> index 0d525ec..4b8843d 100644
> --- a/xen/include/xsm/xsm.h
> +++ b/xen/include/xsm/xsm.h
> @@ -750,7 +750,6 @@ extern bool has_xsm_magic(paddr_t);
>  #endif
>  
>  extern int register_xsm(struct xsm_operations *ops);
> -extern int unregister_xsm(struct xsm_operations *ops);
>  
>  extern struct xsm_operations dummy_xsm_ops;
>  extern void xsm_fixup_ops(struct xsm_operations *ops);
> diff --git a/xen/xsm/flask/flask_op.c b/xen/xsm/flask/flask_op.c
> index 3ad4bdc..719c2d7 100644
> --- a/xen/xsm/flask/flask_op.c
> +++ b/xen/xsm/flask/flask_op.c
> @@ -58,8 +58,6 @@ static int flask_security_make_bools(void);
>  
>  extern int ss_initialized;
>  
> -extern struct xsm_operations *original_ops;
> -
>  static void __init parse_flask_param(char *s)
>  {
>  if ( !strcmp(s, "enforcing") )
> @@ -243,7 +241,7 @@ static int flask_disable(void)
>  flask_disabled = 1;
>  
>  /* Reset xsm_ops to the original module. */
> -xsm_ops = original_ops;
> +xsm_ops = _xsm_ops;
>  
>  return 0;
>  }
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 596ac0a..5e81ed4 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -35,8 +35,6 @@
>  #include 
>  #include 
>  
> -struct xsm_operations *original_ops = NULL;
> -
>  static u32 domain_sid(struct domain *dom)
>  {
>  struct domain_security_struct *dsec = dom->ssid;
> @@ -1841,7 +1839,6 @@ __init void flask_init(void)
>  
>  avc_init();
>  
> -original_ops = xsm_ops;
>  if ( register_xsm(_ops) )
>  panic("Flask: Unable to register with XSM");
>  
> diff --git a/xen/xsm/xsm_core.c b/xen/xsm/xsm_core.c
> index 2c4d576..4a264c2 100644
> --- a/xen/xsm/xsm_core.c
> +++ b/xen/xsm/xsm_core.c
> @@ -149,22 +149,6 @@ int __init register_xsm(struct xsm_operations *ops)
>  return 0;
>  }
>  
> -
> -int unregister_xsm(struct xsm_operations *ops)
> -{
> -if ( ops != xsm_ops )
> -{
> -printk("%s: trying to unregister "
> -   "a security_opts structure that is not "
> -   "registered, failing.\n", __FUNCTION__);
> -return -EINVAL;
> -}
> -
> -xsm_ops = _xsm_ops;
> -
> -return 0;
> -}
> -
>  #endif
>  
>  long do_xsm_op (XEN_GUEST_HANDLE_PARAM(xsm_op_t) op)
> -- 
> 2.5.5
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 12/15] xen/xsm: remove .xsm_initcall.init section

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:15AM -0400, Daniel De Graaf wrote:
> Since FLASK is the only implementation of XSM hooks in Xen, using an
> iterated initcall dispatch for setup is overly complex.  Change this to

As such, should the Kconfig file enable the FLASK by default?
Or make the XSM entry have the configuration for FLASK?

Or perhaps make the FLASK be the visible one and select XSM?

> a direct function call to a globally visible function; if additional XSM
> hooks are added in the future, a switching mechanism will be needed
> regardless, and that can be placed in xsm_core.c.

> +config FLASK
> + bool "FLux Advanced Security Kernel support"
> + default y
> + depends on XSM

So this would be 'select XSM' ?
> + ---help---
> +   Enables FLASK (FLux Advanced Security Kernel) as the access control
> +   mechanism used by the XSM framework.  This provides a mandatory access
> +   control framework by which security enforcement, isolation, and
> +   auditing can be achieved with fine granular control via a security
> +   policy.
> +
> +   If unsure, say Y.
> +
> +config FLASK_AVC_STATS
> + def_bool y
> + depends on FLASK
> + ---help---
> +   Maintain statistics on the access vector cache
> +
>  # Enable schedulers
>  menu "Schedulers"
>   visible if EXPERT = "y"
> diff --git a/xen/xsm/xsm_core.c b/xen/xsm/xsm_core.c
> index 634ec98..cb2fdb6 100644
> --- a/xen/xsm/xsm_core.c
> +++ b/xen/xsm/xsm_core.c
> @@ -38,13 +38,7 @@ static inline int verify(struct xsm_operations *ops)
>  
>  static void __init do_xsm_initcalls(void)
>  {
> -xsm_initcall_t *call;
> -call = __xsm_initcall_start;
> -while ( call < __xsm_initcall_end )
> -{
> -(*call) ();
> -call++;
> -}
> +flask_init();

Could you delete do_xsm_initcalls() and make xsm_init() call flask_init() ?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/15] flask: improve unknown permission handling

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:14AM -0400, Daniel De Graaf wrote:
> When an unknown domctl, sysctl, or other operation is encountered in the
> FLASK security server, use the allow_unknown bit in the security policy
> to decide if the permission should be allowed or denied.  This bit is
> off by default, but it can be set by using checkpolicy -U allow when
> compiling the policy.  This allows new operations to be tested without
> needing to immediately add security checks; however, it is not flexible
> enough to avoid adding the actual permission checks.  An error message
> is printed to the hypervisor console when this fallback is encountered.

.. and the operation is permitted.

> 
> Signed-off-by: Daniel De Graaf 
> ---
>  xen/xsm/flask/hooks.c| 44 
> +---
>  xen/xsm/flask/include/security.h |  2 ++
>  xen/xsm/flask/ss/policydb.c  |  1 +
>  xen/xsm/flask/ss/policydb.h  |  6 ++
>  xen/xsm/flask/ss/services.c  |  5 +
>  5 files changed, 42 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index a8d45e7..3ab3fbf 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -136,6 +136,23 @@ static int get_irq_sid(int irq, u32 *sid, struct 
> avc_audit_data *ad)
>  return 0;
>  }
>  
> +static int avc_unknown_permission(const char *name, int id)
> +{
> +int rc;

I would add a new line here.
> +if ( !flask_enforcing || security_get_allow_unknown() )
> +{
> +printk(XENLOG_G_WARNING "FLASK: Allowing unknown %s: %d.\n", name, 
> id);
> +rc = 0;
> +}
> +else
> +{
> +printk(XENLOG_G_ERR "FLASK: Denying unknown %s: %d.\n", name, id);
> +rc = -EPERM;
> +}
> +
> +return rc;
> +}
> +

The rest looks OK, but I have a question: Is this how Linux operates?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 07/15] flask: unify {get, set}vcpucontext permissions

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:10AM -0400, Daniel De Graaf wrote:
> These permissions were initially split because they were in separate
> domctls, but this split is very unlikely to actually provide security
> benefits: it would require a carefully contrived situation for a domain
> to both need access to one type of CPU register and also need to be
> prohibited from accessing another type.
> 

CC-ing Andrew as I believe has been looking in this code when doing
miration and may have an opinion on this.

> Signed-off-by: Daniel De Graaf 

Reviewed-by: Konrad Rzeszutek Wilk 
> ---
>  tools/flask/policy/modules/dom0.te  |  1 -
>  tools/flask/policy/modules/xen.if   |  7 +++
>  xen/xsm/flask/hooks.c   | 20 ++--
>  xen/xsm/flask/policy/access_vectors | 16 ++--
>  4 files changed, 15 insertions(+), 29 deletions(-)
> 
> diff --git a/tools/flask/policy/modules/dom0.te 
> b/tools/flask/policy/modules/dom0.te
> index ef6a986..d228b24 100644
> --- a/tools/flask/policy/modules/dom0.te
> +++ b/tools/flask/policy/modules/dom0.te
> @@ -34,7 +34,6 @@ allow dom0_t dom0_t:domain {
>   setvcpucontext max_vcpus setaffinity getaffinity getscheduler
>   getdomaininfo getvcpuinfo getvcpucontext setdomainmaxmem setdomainhandle
>   setdebugging hypercall settime setaddrsize getaddrsize trigger
> - getextvcpucontext setextvcpucontext getvcpuextstate setvcpuextstate
>   getpodtarget setpodtarget set_misc_info set_virq_handler
>  };
>  allow dom0_t dom0_t:domain2 {
> diff --git a/tools/flask/policy/modules/xen.if 
> b/tools/flask/policy/modules/xen.if
> index 00d1bbb..fd96303 100644
> --- a/tools/flask/policy/modules/xen.if
> +++ b/tools/flask/policy/modules/xen.if
> @@ -47,9 +47,8 @@ define(`declare_build_label', `
>  
>  define(`create_domain_common', `
>   allow $1 $2:domain { create max_vcpus setdomainmaxmem setaddrsize
> - getdomaininfo hypercall setvcpucontext setextvcpucontext
> - getscheduler getvcpuinfo getvcpuextstate getaddrsize
> - getaffinity setaffinity setvcpuextstate };
> + getdomaininfo hypercall setvcpucontext getscheduler
> + getvcpuinfo getaddrsize getaffinity setaffinity };
>   allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
>   set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
>   psr_cmt_op psr_cat_op soft_reset };
> @@ -94,7 +93,7 @@ define(`migrate_domain_out', `
>   allow $1 domxen_t:mmu map_read;
>   allow $1 $2:hvm { gethvmc getparam irqlevel };
>   allow $1 $2:mmu { stat pageinfo map_read };
> - allow $1 $2:domain { getaddrsize getvcpucontext getextvcpucontext 
> getvcpuextstate pause destroy };
> + allow $1 $2:domain { getaddrsize getvcpucontext pause destroy };
>   allow $1 $2:domain2 gettsc;
>   allow $1 $2:shadow { enable disable logdirty };
>  ')
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 20d46c8..a8d45e7 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -630,10 +630,16 @@ static int flask_domctl(struct domain *d, int cmd)
>  case XEN_DOMCTL_setdomainhandle:
>  return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__SETDOMAINHANDLE);
>  
> +case XEN_DOMCTL_set_ext_vcpucontext:
> +case XEN_DOMCTL_set_vcpu_msrs:
>  case XEN_DOMCTL_setvcpucontext:
> +case XEN_DOMCTL_setvcpuextstate:
>  return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__SETVCPUCONTEXT);
>  
> +case XEN_DOMCTL_get_ext_vcpucontext:
> +case XEN_DOMCTL_get_vcpu_msrs:
>  case XEN_DOMCTL_getvcpucontext:
> +case XEN_DOMCTL_getvcpuextstate:
>  return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__GETVCPUCONTEXT);
>  
>  case XEN_DOMCTL_getvcpuinfo:
> @@ -675,20 +681,6 @@ static int flask_domctl(struct domain *d, int cmd)
>  case XEN_DOMCTL_pin_mem_cacheattr:
>  return current_has_perm(d, SECCLASS_HVM, HVM__CACHEATTR);
>  
> -case XEN_DOMCTL_set_ext_vcpucontext:
> -case XEN_DOMCTL_set_vcpu_msrs:
> -return current_has_perm(d, SECCLASS_DOMAIN, 
> DOMAIN__SETEXTVCPUCONTEXT);
> -
> -case XEN_DOMCTL_get_ext_vcpucontext:
> -case XEN_DOMCTL_get_vcpu_msrs:
> -return current_has_perm(d, SECCLASS_DOMAIN, 
> DOMAIN__GETEXTVCPUCONTEXT);
> -
> -case XEN_DOMCTL_setvcpuextstate:
> -return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__SETVCPUEXTSTATE);
> -
> -case XEN_DOMCTL_getvcpuextstate:
> -return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__GETVCPUEXTSTATE);
> -
>  case XEN_DOMCTL_sendtrigger:
>  return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__TRIGGER);
>  
> diff --git a/xen/xsm/flask/policy/access_vectors 
> b/xen/xsm/flask/policy/access_vectors
> index 3d29042..7e69ede 100644
> --- a/xen/xsm/flask/policy/access_vectors
> +++ b/xen/xsm/flask/policy/access_vectors
> @@ 

Re: [Xen-devel] [PATCH 06/15] flask/policy: remove unused example

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:09AM -0400, Daniel De Graaf wrote:
> The access vectors defined here have never been used by xenstore.
> 
> Signed-off-by: Daniel De Graaf 

Reviewed-by: Konrad Rzeszutek Wilk 
> ---
>  tools/flask/policy/policy/access_vectors   | 23 ++-
>  tools/flask/policy/policy/security_classes |  1 -
>  2 files changed, 2 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/flask/policy/policy/access_vectors 
> b/tools/flask/policy/policy/access_vectors
> index 4fd61f1..d9c69c0 100644
> --- a/tools/flask/policy/policy/access_vectors
> +++ b/tools/flask/policy/policy/access_vectors
> @@ -1,24 +1,5 @@
>  # Locally defined access vectors
>  #
> -# Define access vectors for the security classes defined in security_classes
> +# Define access vectors for the security classes defined in security_classes.
> +# Access vectors defined in this file should not be used by the hypervisor.
>  #
> -
> -# Note: this is an example; the xenstore daemon provided with Xen does
> -# not yet include XSM support, and the exact permissions may be defined
> -# differently if such support is added.
> -class xenstore {
> - # read from keys owned by the target domain (if permissions allow)
> - read
> - # write to keys owned by the target domain (if permissions allow)
> - write
> - # change permissions of a key owned by the target domain
> - chmod
> - # change the owner of a key which was owned by the target domain
> - chown_from
> - # change the owner of a key to the target domain
> - chown_to
> - # access a key owned by the target domain without permission
> - override
> - # introduce a domain
> - introduce
> -}
> diff --git a/tools/flask/policy/policy/security_classes 
> b/tools/flask/policy/policy/security_classes
> index 56595e8..0f0f9f3 100644
> --- a/tools/flask/policy/policy/security_classes
> +++ b/tools/flask/policy/policy/security_classes
> @@ -5,4 +5,3 @@
>  # security policy.
>  #
>  # Access vectors for these classes must be defined in the access_vectors 
> file.
> -class xenstore
> -- 
> 2.5.5
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 05/15] flask/policy: xenstore stubdom policy

2016-06-17 Thread Konrad Rzeszutek Wilk
On Thu, Jun 09, 2016 at 10:47:08AM -0400, Daniel De Graaf wrote:
> This adds the xenstore_t type to the example policy for use by a
> xenstore stub domain.
> 
> Signed-off-by: Daniel De Graaf 
> ---
>  tools/flask/policy/modules/modules.conf |  3 +++
>  tools/flask/policy/modules/xenstore.te  | 24 
>  2 files changed, 27 insertions(+)
>  create mode 100644 tools/flask/policy/modules/xenstore.te
> 
> diff --git a/tools/flask/policy/modules/modules.conf 
> b/tools/flask/policy/modules/modules.conf
> index 9aac6a0..dd10884 100644
> --- a/tools/flask/policy/modules/modules.conf
> +++ b/tools/flask/policy/modules/modules.conf
> @@ -33,6 +33,9 @@ nomigrate = on
>  # Example device policy.  Also see policy/device_contexts.
>  nic_dev = on
>  
> +# Xenstore stub domain.

I would also add "(see init-xenstore-domain)."

And with that Reviewed-by: Konrad Rzeszutek Wilk 

(albeit take it with a grain of salt.  I am just barely touching the SELinux 
policy
language).

> +xenstore = on
> +
>  # This allows any domain type to be created using the system_r role.  When 
> it is
>  # disabled, domains not using the default types (dom0_t, domU_t, dm_dom_t) 
> must
>  # use another role (such as vm_r from the vm_role module below).
> diff --git a/tools/flask/policy/modules/xenstore.te 
> b/tools/flask/policy/modules/xenstore.te
> new file mode 100644
> index 000..519566a
> --- /dev/null
> +++ b/tools/flask/policy/modules/xenstore.te
> @@ -0,0 +1,24 @@
> +
> +#
> +# Xenstore stubdomain
> +#
> +
> +declare_singleton_domain(xenstore_t)
> +create_domain(dom0_t, xenstore_t)
> +manage_domain(dom0_t, xenstore_t)
> +
> +# Xenstore requires the global VIRQ for domain destroy operations
> +allow dom0_t xenstore_t:domain set_virq_handler;
> +# Current xenstore stubdom uses the hypervisor console, not "xl console"
> +allow xenstore_t xen_t:xen writeconsole;
> +# Xenstore queries domaininfo on all domains
> +allow xenstore_t domain_type:domain getdomaininfo;
> +
> +# As a shortcut, the following 3 rules are used instead of adding a 
> domain_comms
> +# rule between xenstore_t and every domain type that talks to xenstore
> +create_channel(xenstore_t, domain_type, xenstore_t_channel)
> +allow event_type xenstore_t: event bind;
> +allow xenstore_t domain_type:grant { map_read map_write unmap };
> +
> +# Xenstore is a utility domain, so it should use the system role
> +role system_r types xenstore_t;
> -- 
> 2.5.5
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/15] flask/policy: move user definitions and constraints into modules

2016-06-17 Thread Konrad Rzeszutek Wilk
> diff --git a/tools/flask/policy/modules/modules.conf 
> b/tools/flask/policy/modules/modules.conf
> index d875dbf..9aac6a0 100644
> --- a/tools/flask/policy/modules/modules.conf
> +++ b/tools/flask/policy/modules/modules.conf
> @@ -34,6 +34,13 @@ nomigrate = on
>  nic_dev = on
>  
>  # This allows any domain type to be created using the system_r role.  When 
> it is
> -# disabled, domains not using the default types (dom0_t and domU_t) must use
> -# another role (such as vm_r) from the vm_role module.
> +# disabled, domains not using the default types (dom0_t, domU_t, dm_dom_t) 
> must
> +# use another role (such as vm_r from the vm_role module below).
>  all_system_role = on
> +
> +# Example users, roles, and constraints for user-based separation.
> +# 
> +# The three users defined here can set up grant/event channel communication
> +# (vchan, device frontend/backend) between their own VMs, but cannot set up a
> +# channel to a VM under a different user.
> +vm_role = on

So should this be off? As by default we would want all_system_role ?

Ah wait, it can be loaded - even if not used.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Wei Liu
On Fri, Jun 17, 2016 at 08:55:04AM -0600, Jan Beulich wrote:
> >>> On 17.06.16 at 16:44,  wrote:
> > On Fri, Jun 17, 2016 at 08:40:40AM -0600, Jan Beulich wrote:
> >> >>> On 17.06.16 at 16:32,  wrote:
> >> > On Fri, Jun 17, 2016 at 08:28:34AM -0600, Jan Beulich wrote:
> >> >> >>> On 17.06.16 at 15:51,  wrote:
> >> >> > On 17/06/16 12:34, Jan Beulich wrote:
> >> >> > On 17.06.16 at 13:05,  wrote:
> >> >> >>> --- a/xen/arch/x86/Kconfig
> >> >> >>> +++ b/xen/arch/x86/Kconfig
> >> >> >>> @@ -59,6 +59,20 @@ config BIGMEM
> >> >> >>>  
> >> >> >>>   If unsure, say N.
> >> >> >>>  
> >> >> >>> +config HVM_FEP
> >> >> >>> +   bool "HVM Forced Emulation Prefix support"
> >> >> >> And no "if EXPERT"?
> >> >> > 
> >> >> > Is that wise? Does that mean we get a default on option which can't be
> >> >> > selected in menuconfig?
> >> >> 
> >> >> Oh, that's true. The default should be off in any event.
> >> >> 
> >> > 
> >> > So it depends on config EXPERT, defaults to off.
> >> > 
> >> > Fine by me in any event, just need a final decision and I will make the
> >> > change.
> >> 
> >> One more adjustment: I guess it should default to DEBUG, to retain
> >> current behavior.
> >> 
> > 
> > I think I wrote this series to make this feature available to non-debug
> > builds.  I don't really get the rationale behind this request. Did you
> > change your mind during the last few days?
> 
> Making it available doesn't mean enable it by default for everyone.
> People wanting it in non-debug builds should be able to get it, but
> unaware people shouldn't be caught by surprise. (And btw.,
> defaulting it to DEBUG is a relaxation over defaulting it to off.)
> 

I see. I misread as "it should depend on DEBUG". Sorry about that.

I will make the change as requested.

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 16:44,  wrote:
> On Fri, Jun 17, 2016 at 08:40:40AM -0600, Jan Beulich wrote:
>> >>> On 17.06.16 at 16:32,  wrote:
>> > On Fri, Jun 17, 2016 at 08:28:34AM -0600, Jan Beulich wrote:
>> >> >>> On 17.06.16 at 15:51,  wrote:
>> >> > On 17/06/16 12:34, Jan Beulich wrote:
>> >> > On 17.06.16 at 13:05,  wrote:
>> >> >>> --- a/xen/arch/x86/Kconfig
>> >> >>> +++ b/xen/arch/x86/Kconfig
>> >> >>> @@ -59,6 +59,20 @@ config BIGMEM
>> >> >>>  
>> >> >>> If unsure, say N.
>> >> >>>  
>> >> >>> +config HVM_FEP
>> >> >>> + bool "HVM Forced Emulation Prefix support"
>> >> >> And no "if EXPERT"?
>> >> > 
>> >> > Is that wise? Does that mean we get a default on option which can't be
>> >> > selected in menuconfig?
>> >> 
>> >> Oh, that's true. The default should be off in any event.
>> >> 
>> > 
>> > So it depends on config EXPERT, defaults to off.
>> > 
>> > Fine by me in any event, just need a final decision and I will make the
>> > change.
>> 
>> One more adjustment: I guess it should default to DEBUG, to retain
>> current behavior.
>> 
> 
> I think I wrote this series to make this feature available to non-debug
> builds.  I don't really get the rationale behind this request. Did you
> change your mind during the last few days?

Making it available doesn't mean enable it by default for everyone.
People wanting it in non-debug builds should be able to get it, but
unaware people shouldn't be caught by surprise. (And btw.,
defaulting it to DEBUG is a relaxation over defaulting it to off.)

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 15:50,  wrote:
> On 17/06/16 12:05, Wei Liu wrote:
>> @@ -182,6 +183,28 @@ static int __init hvm_enable(void)
>>  if ( !opt_altp2m_enabled )
>>  hvm_funcs.altp2m_supported = 0;
>>  
>> +if ( opt_hvm_fep )
>> +{
>> +unsigned int i, j;
>> +
>> +printk("**\n");
>> +printk("*** WARNING: HVM FORCED EMULATION PREFIX IS 
>> AVAILABLE\n");
>> +printk("*** This option is *ONLY* intended to aid testing of 
>> Xen.\n");
>> +printk("*** It has implications on the security of the 
>> system.\n");
>> +printk("*** Please *DO NOT* use this in production.\n");
>> +printk("**\n");
>> +for ( i = 0; i < 3; i++ )
>> +{
>> +printk("%u... ", 3 - i);
>> +for ( j = 0; j < 100; j++ )
>> +{
>> +process_pending_softirqs();
>> +mdelay(10);
>> +}
>> +}
> 
> I would drop this 3s wait, and I plan to drop it from sync_console as
> well.  It isn't of any practical use, even if you are watching the
> serial console on boot, and just leads to an unnecessary delay.  Worse,
> the two delays are cumulative.

I think the delay serves a purpose (for the messages to not scroll by
unnoticed), but I do appreciate that having two such delays is not
really desirable. Considering that I would also like these messages to
go into .init.rodata (also those for the sync_console warning), perhaps
worth a little bit of abstraction?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Wei Liu
On Fri, Jun 17, 2016 at 08:40:40AM -0600, Jan Beulich wrote:
> >>> On 17.06.16 at 16:32,  wrote:
> > On Fri, Jun 17, 2016 at 08:28:34AM -0600, Jan Beulich wrote:
> >> >>> On 17.06.16 at 15:51,  wrote:
> >> > On 17/06/16 12:34, Jan Beulich wrote:
> >> > On 17.06.16 at 13:05,  wrote:
> >> >>> --- a/xen/arch/x86/Kconfig
> >> >>> +++ b/xen/arch/x86/Kconfig
> >> >>> @@ -59,6 +59,20 @@ config BIGMEM
> >> >>>  
> >> >>>  If unsure, say N.
> >> >>>  
> >> >>> +config HVM_FEP
> >> >>> +  bool "HVM Forced Emulation Prefix support"
> >> >> And no "if EXPERT"?
> >> > 
> >> > Is that wise? Does that mean we get a default on option which can't be
> >> > selected in menuconfig?
> >> 
> >> Oh, that's true. The default should be off in any event.
> >> 
> > 
> > So it depends on config EXPERT, defaults to off.
> > 
> > Fine by me in any event, just need a final decision and I will make the
> > change.
> 
> One more adjustment: I guess it should default to DEBUG, to retain
> current behavior.
> 

I think I wrote this series to make this feature available to non-debug
builds.  I don't really get the rationale behind this request. Did you
change your mind during the last few days?

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 16:32,  wrote:
> On Fri, Jun 17, 2016 at 08:28:34AM -0600, Jan Beulich wrote:
>> >>> On 17.06.16 at 15:51,  wrote:
>> > On 17/06/16 12:34, Jan Beulich wrote:
>> > On 17.06.16 at 13:05,  wrote:
>> >>> --- a/xen/arch/x86/Kconfig
>> >>> +++ b/xen/arch/x86/Kconfig
>> >>> @@ -59,6 +59,20 @@ config BIGMEM
>> >>>  
>> >>>If unsure, say N.
>> >>>  
>> >>> +config HVM_FEP
>> >>> +bool "HVM Forced Emulation Prefix support"
>> >> And no "if EXPERT"?
>> > 
>> > Is that wise? Does that mean we get a default on option which can't be
>> > selected in menuconfig?
>> 
>> Oh, that's true. The default should be off in any event.
>> 
> 
> So it depends on config EXPERT, defaults to off.
> 
> Fine by me in any event, just need a final decision and I will make the
> change.

One more adjustment: I guess it should default to DEBUG, to retain
current behavior.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Wei Liu
On Fri, Jun 17, 2016 at 08:28:34AM -0600, Jan Beulich wrote:
> >>> On 17.06.16 at 15:51,  wrote:
> > On 17/06/16 12:34, Jan Beulich wrote:
> > On 17.06.16 at 13:05,  wrote:
> >>> --- a/xen/arch/x86/Kconfig
> >>> +++ b/xen/arch/x86/Kconfig
> >>> @@ -59,6 +59,20 @@ config BIGMEM
> >>>  
> >>> If unsure, say N.
> >>>  
> >>> +config HVM_FEP
> >>> + bool "HVM Forced Emulation Prefix support"
> >> And no "if EXPERT"?
> > 
> > Is that wise? Does that mean we get a default on option which can't be
> > selected in menuconfig?
> 
> Oh, that's true. The default should be off in any event.
> 

So it depends on config EXPERT, defaults to off.

Fine by me in any event, just need a final decision and I will make the
change.

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] qemu-traditional build problem for Xen 4.7.0 RC6

2016-06-17 Thread Ian Jackson
Hao, Xudong writes ("RE: [Xen-devel] qemu-traditional build problem for Xen 
4.7.0 RC6"):
> Actually it's not compile error, it's a repo clone issue. Error log is:
> 
> ...
> Initialized empty Git repository in 
> /home/www/builds_xen_unstable/xen-src-xen-4.7.0-rc6-20160616/tools/qemu-xen-traditional-dir-remote.tmp/.git/
> fatal: reference is not a tree: 698d6d6f8d095edadb0c23612b552a89dd3eee4c
> make[2]: *** [qemu-xen-traditional-dir-find] Error 128
> ...
> 
> The issue get root caused and resolved now. 
> The failure is that we set up a local repo mirror for qemu-xen-traditional, 
> but the mirror only sync "master" branch, not include "stable-4.7" branch, so 
> commit 698d6d6* didn't be identified when doing checkout. 

OK, thanks for clearing that up.

Indeed, the "master" branch won't (in general) contain all the
relevant commits, because stable branches like 4.7 do diverge.

Regards,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 15:51,  wrote:
> On 17/06/16 12:34, Jan Beulich wrote:
> On 17.06.16 at 13:05,  wrote:
>>> --- a/xen/arch/x86/Kconfig
>>> +++ b/xen/arch/x86/Kconfig
>>> @@ -59,6 +59,20 @@ config BIGMEM
>>>  
>>>   If unsure, say N.
>>>  
>>> +config HVM_FEP
>>> +   bool "HVM Forced Emulation Prefix support"
>> And no "if EXPERT"?
> 
> Is that wise? Does that mean we get a default on option which can't be
> selected in menuconfig?

Oh, that's true. The default should be off in any event.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: sched: use default scheduler upon an invalid "sched="

2016-06-17 Thread Wei Liu
On Fri, Jun 17, 2016 at 12:31:00PM +0200, Dario Faggioli wrote:
> instead of just the first scheduler we find in the array.
> 
> In fact, right now, if someone makes a typo when passing
> the "sched=" command line option to Xen, we (with all
> schedulers configured in) pick ARINC653, which is most
> likely not what one would expect.
> 
> Go for the default scheduler instead.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Doug Goldstein 
> Cc: Jonathan Creekmore 
> ---
> I don't think I'm going to ask for this to be put in 4.7,
> as I don't want to give Wei an heart attack... :-)
> 

I will just say "Meh, let's backport this." :-P

> Considering it for backporting should be enough, IMO.
> ---
>  xen/common/schedule.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 5e35310..7ac12d3 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1625,7 +1625,8 @@ void __init scheduler_init(void)
>  {
>  printk("Could not find scheduler: %s\n", opt_sched);
>  for ( i = 0; i < NUM_SCHEDULERS; i++ )
> -if ( schedulers[i] )
> +if ( schedulers[i] &&
> + !strcmp(schedulers[i]->opt_name, CONFIG_SCHED_DEFAULT) )
>  {
>  ops = *schedulers[i];
>  break;
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] hvmloader: don't hard-code IO-APIC parameters

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 13:51,  wrote:
> On 06/17/2016 02:00 AM, Jan Beulich wrote:
> On 16.06.16 at 18:49,  wrote:
>>> On 06/16/2016 05:40 AM, Jan Beulich wrote:
 The IO-APIC address has variable bits determined by the PCI-to-ISA
 bridge, and the IO-APIC version should be read from the IO-APIC. (Note
 that there's still implicit rather than explicit agreement on the
 IO-APIC base address between qemu and the hypervisor.)
>>> It probably doesn't matter now for PVHv2 since we are not going to
>>> initially emulate IOAPIC but when/if we do, how do we query for base
>>> address and version? Should we just rely on the same implicit agreement?
>> If this refers to the sentence in parentheses, then this was really
>> meant to indicate a work item. I.e. qemu should negotiate with
>> Xen. For PVHv2 this would then probably mean for hvmloader to
>> query Xen (via new hypercall).
> 
> We are not using hvmloader for PVH so I am not sure how that would work.
> You mean the toolstack?

Oh, right. Whoever puts the ACPI tables in place.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH XTF] tests: add fep test

2016-06-17 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 tests/fep/Makefile | 12 
 tests/fep/main.c   | 31 +++
 2 files changed, 43 insertions(+)
 create mode 100644 tests/fep/Makefile
 create mode 100644 tests/fep/main.c

diff --git a/tests/fep/Makefile b/tests/fep/Makefile
new file mode 100644
index 000..8702123
--- /dev/null
+++ b/tests/fep/Makefile
@@ -0,0 +1,12 @@
+MAKEFLAGS += -r
+ROOT := $(abspath $(CURDIR)/../..)
+
+include $(ROOT)/build/common.mk
+
+NAME  := fep
+CATEGORY  := utility
+TEST-ENVS := $(HVM_ENVIRONMENTS)
+
+obj-perenv += main.o
+
+include $(ROOT)/build/gen.mk
diff --git a/tests/fep/main.c b/tests/fep/main.c
new file mode 100644
index 000..34a93c0
--- /dev/null
+++ b/tests/fep/main.c
@@ -0,0 +1,31 @@
+/**
+ * @file tests/fep/main.c
+ * @ref test-fep
+ *
+ * @page test-fep FEP
+ *
+ * Returns SUCCESS if FEP is available, FAILURE if not.
+ *
+ * @sa tests/fep/main.c
+ */
+#include 
+
+void test_main(void)
+{
+printk("Test availability of HVM forced emulation prefix\n");
+
+if ( xtf_has_fep )
+xtf_success(NULL);
+else
+xtf_failure(NULL);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: sched: use default scheduler upon an invalid "sched="

2016-06-17 Thread George Dunlap
On 17/06/16 11:31, Dario Faggioli wrote:
> instead of just the first scheduler we find in the array.
> 
> In fact, right now, if someone makes a typo when passing
> the "sched=" command line option to Xen, we (with all
> schedulers configured in) pick ARINC653, which is most
> likely not what one would expect.
> 
> Go for the default scheduler instead.
> 
> Signed-off-by: Dario Faggioli 

Acked-by: George Dunlap 

I'll check it in here shortly.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: sched: use default scheduler upon an invalid "sched="

2016-06-17 Thread Doug Goldstein
On 6/17/16 5:31 AM, Dario Faggioli wrote:
> instead of just the first scheduler we find in the array.
> 
> In fact, right now, if someone makes a typo when passing
> the "sched=" command line option to Xen, we (with all
> schedulers configured in) pick ARINC653, which is most
> likely not what one would expect.
> 
> Go for the default scheduler instead.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Doug Goldstein 
> Cc: Jonathan Creekmore 

Reviewed-by: Doug Goldstein 

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 95858: tolerable all pass - PUSHED

2016-06-17 Thread osstest service owner
flight 95858 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95858/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2f7b56c1a7b1ee92ff5f92888723e380412dd3ab
baseline version:
 xen  c1cee3ff93c1be525443472dd4eebe230b57ece9

Last test of basis95853  2016-06-17 09:14:43 Z0 days
Testing same since95858  2016-06-17 12:02:01 Z0 days1 attempts


People who touched revisions under test:
  David Scott 
  Dongli Zhang 
  Ian Jackson 
  Juergen Gross 
  Julien Grall 
  Peng Fan 
  Roger Pau Monné 
  Shanker Donthineni 
  Shannon Zhao 
  Stefano Stabellini 
  Wei Chen 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=2f7b56c1a7b1ee92ff5f92888723e380412dd3ab
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
2f7b56c1a7b1ee92ff5f92888723e380412dd3ab
+ branch=xen-unstable-smoke
+ revision=2f7b56c1a7b1ee92ff5f92888723e380412dd3ab
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.7-testing
+ '[' x2f7b56c1a7b1ee92ff5f92888723e380412dd3ab = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo 

Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Andrew Cooper
On 17/06/16 12:05, Wei Liu wrote:
> Originally hvm_fep was guarded by NDEBUG, which means it was only
> available to debug builds.
>
> However there is value to have it for non-debug builds as well. User can
> use that to run tests in setup that replicates production setup.
>
> Make it clear with a sync_console style warning that this option can't
> be used in production setup. Update command line documentation
> accordingly. Finally mark Xen as tainted when this option is enabled.
>
> Add a kconfig option under x86 to configure hvm_fep.
>
> Signed-off-by: Wei Liu 
> ---
> Cc: Andrew Cooper 
> Cc: Jan Beulich 
> Cc: Doug Goldstein 
>
> v2:
> 1. unsigned -> unsigned int
> 2. %d -> %u
> 3. Add spaces around "-"
> 4. Update warning message
> 5. Only taint hv when fep is used
> 6. Add kconfig option
> ---
>  docs/misc/xen-command-line.markdown |  8 ++--
>  xen/arch/x86/Kconfig| 14 ++
>  xen/arch/x86/hvm/hvm.c  | 28 ++--
>  xen/common/kernel.c |  6 --
>  xen/include/asm-x86/hvm/hvm.h   |  2 +-
>  xen/include/xen/lib.h   |  1 +
>  6 files changed, 52 insertions(+), 7 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown 
> b/docs/misc/xen-command-line.markdown
> index fed732c..dc53e24 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -878,8 +878,12 @@ Recognized in debug builds of the hypervisor only.
>  Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
>  arbitrary instructions.
>  
> -This option is intended for development purposes, and is only available in
> -debug builds of the hypervisor.
> +This option is intended for development and testing purposes.
> +
> +*Warning*
> +As this feature opens up the instruction emulator to HVM guest, don't

"to arbitrary instruction from an HVM guest,".  It is an important
distinction, because all guest can enter the emulator in other ways as
part of normal operation.

> +use this in production system. No security support is provided when
> +this flag is set.
>  
>  ### hvm\_port80
>  > `= `
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 73f79cc..5e3b04a 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -59,6 +59,20 @@ config BIGMEM
>  
> If unsure, say N.
>  
> +config HVM_FEP
> + bool "HVM Forced Emulation Prefix support"
> + default y
> + ---help---
> +
> +   Compiles in a feature that allows HVM guest to enter
> +   instruction emulator with forced emulation prefix.

"allows HVM guests to arbitrarily exercise the instruction emulator." 
There are other ways in which guests can enter the instruction emulator,
but they are specifically limited in nature.

I would also have a separate paragraph stating:

"This is strictly for testing purposes, and not appropriate for use in
production."

> +
> +   This feature can only be enabled during boot time with
> +   appropriate hypervisor command line option. Please read
> +   hypervisor command line documentation before trying to use
> +   this feature.
> +
> +   If unsure, say Y.
>  endmenu
>  
>  source "common/Kconfig"
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 78db903..373b78e 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -95,9 +96,9 @@ unsigned long __section(".bss.page_aligned")
>  static bool_t __initdata opt_hap_enabled = 1;
>  boolean_param("hap", opt_hap_enabled);
>  
> -#ifndef opt_hvm_fep
> +#if CONFIG_HVM_FEP
>  /* Permit use of the Forced Emulation Prefix in HVM guests */
> -bool_t opt_hvm_fep;
> +bool_t __read_mostly opt_hvm_fep;
>  boolean_param("hvm_fep", opt_hvm_fep);
>  #endif
>  
> @@ -182,6 +183,28 @@ static int __init hvm_enable(void)
>  if ( !opt_altp2m_enabled )
>  hvm_funcs.altp2m_supported = 0;
>  
> +if ( opt_hvm_fep )
> +{
> +unsigned int i, j;
> +
> +printk("**\n");
> +printk("*** WARNING: HVM FORCED EMULATION PREFIX IS 
> AVAILABLE\n");
> +printk("*** This option is *ONLY* intended to aid testing of 
> Xen.\n");
> +printk("*** It has implications on the security of the 
> system.\n");
> +printk("*** Please *DO NOT* use this in production.\n");
> +printk("**\n");
> +for ( i = 0; i < 3; i++ )
> +{
> +printk("%u... ", 3 - i);
> +for ( j = 0; j < 100; j++ )
> +{
> +process_pending_softirqs();
> +mdelay(10);
> +}
> +}

I would drop this 3s wait, and I plan to drop it from sync_console as
well.  It 

Re: [Xen-devel] [PATCH] xen: sched: use default scheduler upon an invalid "sched="

2016-06-17 Thread Jonathan Creekmore

> On Jun 17, 2016, at 5:31 AM, Dario Faggioli  wrote:
> 
> instead of just the first scheduler we find in the array.
> 
> In fact, right now, if someone makes a typo when passing
> the "sched=" command line option to Xen, we (with all
> schedulers configured in) pick ARINC653, which is most
> likely not what one would expect.
> 
> Go for the default scheduler instead.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Doug Goldstein 
> Cc: Jonathan Creekmore 

Reviewed-By: Jonathan Creekmore 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 95833: regressions - FAIL

2016-06-17 Thread osstest service owner
flight 95833 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95833/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail REGR. 
vs. 94748
 test-amd64-amd64-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail 
REGR. vs. 94748

version targeted for testing:
 ovmf d8d217c57632549d99256f134a4cee6be8dc67bb
baseline version:
 ovmf dc99315b8732b6e3032d01319d3f534d440b43d0

Last test of basis94748  2016-05-24 22:43:25 Z   23 days
Failing since 94750  2016-05-25 03:43:08 Z   23 days   40 attempts
Testing same since95833  2016-06-16 12:53:07 Z1 days1 attempts


People who touched revisions under test:
  Ard Biesheuvel 
  Chao Zhang 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Darbin Reyes 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Fu Siyuan 
  Fu, Siyuan 
  Gary Li 
  Gary Lin 
  Giri P Mudusuru 
  Hao Wu 
  Hegde Nagaraj P 
  hegdenag 
  Heyi Guo 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Katie Dellaquila 
  Laszlo Ersek 
  Liming Gao 
  lushifex 
  Marvin H?user 
  Marvin Haeuser 
  Maurice Ma 
  Michael Zimmermann 
  Ruiyu Ni 
  Ryan Harkin 
  Sami Mujawar 
  Satya Yarlagadda 
  Sriram Subramanian 
  Star Zeng 
  Tapan Shah 
  Thomas Palmer 
  Yonghong Zhu 
  Zhang, Chao B 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 2140 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Wei Liu
On Fri, Jun 17, 2016 at 05:34:07AM -0600, Jan Beulich wrote:
> >>> On 17.06.16 at 13:05,  wrote:
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -59,6 +59,20 @@ config BIGMEM
> >  
> >   If unsure, say N.
> >  
> > +config HVM_FEP
> > +   bool "HVM Forced Emulation Prefix support"
> 
> And no "if EXPERT"?
> 

Done.

> > @@ -95,9 +96,9 @@ unsigned long __section(".bss.page_aligned")
> >  static bool_t __initdata opt_hap_enabled = 1;
> >  boolean_param("hap", opt_hap_enabled);
> >  
> > -#ifndef opt_hvm_fep
> > +#if CONFIG_HVM_FEP
> 
> #ifdef please. And I think this would better be left alone anyway
> (and then the comment only applies to the other instance).
> 

Change this instance back to what is was and fix the other.

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue (+ crash logic )

2016-06-17 Thread Julien Grall

Hi Quan,

On 17/06/16 09:51, Xu, Quan wrote:

+ arm/amd maintainers..

On June 01, 2016 5:05 PM, Xu, Quan  wrote:

If Device-TLB flush timed out, we hide the target ATS device immediately and
crash the domain owning this ATS device. If impacted domain is hardware
domain, just throw out a warning.

By hiding the device, we make sure it can't be assigned to any domain any
longer (see device_assigned).


DomU (other than Dom0) gets crashed when a device IOTLB flush times out. I 
suppose that's what you will want on ARM/AMD then too.


Correct it is what we want for ARM :).


We need to move up the crash logic , as similar as iommu_map_page() / 
iommu_unmap_page().

 - add the crash logic to iommu_iotlb_flush() / iommu_iotlb_flush_all().

 - when IOMMU/MMU share page tables, we need to fix it one by one.
 -- on amd, we need to add the crash logic to amd_iommu_flush_pages().
 -- on intel, we need to add the crash logic to iommu_pte_flush().
 -- on arm, we benefit that we add the crash logic to 
iommu_iotlb_flush().


Right.




Taken together, we need to add crash logic to
  iommu_iotlb_flush() / iommu_iotlb_flush_all() / 
iommu_pte_flush() / amd_iommu_flush_pages().


For iommu_iotlb_* yes as it is common code. I don't know for the others.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RESEND 05/14] libxl/arm: Construct ACPI GTDT table

2016-06-17 Thread Julien Grall



On 17/06/16 10:15, Stefano Stabellini wrote:

On Fri, 17 Jun 2016, Julien Grall wrote:

Hello Shannon,

On 17/06/16 04:29, Shannon Zhao wrote:

On 2016/6/6 19:40, Stefano Stabellini wrote:

ACPI tables for ARM guests should be user configurable: acpi=1 in the VM
config file enables them, with default off.

While the configuration "acpi" already exists for HVM guest
configuration, do we plan to reuse it or use a new name, e.g. arm_acpi?


We always try to re-use options unless their meaning are completely different.


I agree. "acpi" is a good name for it. I would avoid introducing
something like "arm_acpi".


BTW, this value should be default false for at least 32-bit domain.

I am not sure if we agreed what should be the default for 64-bit domain.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/7] vm-event: introduce vm_event_vcpu_enter

2016-06-17 Thread Julien Grall

Hello,

On 17/06/16 12:40, Corneliu ZUZU wrote:

On 6/17/2016 11:55 AM, Julien Grall wrote:

On 16/06/16 15:08, Corneliu ZUZU wrote:

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index d31f821..ba248c8 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -19,6 +19,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 
  #include 
@@ -251,6 +252,8 @@ static void schedule_tail(struct vcpu *prev)

  ctxt_switch_to(current);

+vm_event_vcpu_enter(current);
+
  local_irq_enable();

  context_saved(prev);
@@ -296,7 +299,7 @@ void context_switch(struct vcpu *prev, struct
vcpu *next)

  void continue_running(struct vcpu *same)
  {
-/* Nothing to do */
+vm_event_vcpu_enter(same);
  }

  void sync_local_execstate(void)


From my understanding of the commit message, vm_event_vcpu_enter
should be called before returning to the guest. The scheduling
functions are not called every-time Xen is returning to the guest. So
if you want to do every time Xen is re-entering to the guest, then
this should be done in leave_hypervisor_tail.

Regards,



Which is also done, vm_event_vcpu_enter is called from
leave_hypervisor_tail as well.


Sorry I did not spot this one.


Are you saying that calling from leave_hypervisor_tail is enough and
that the schedule_tail call is not necessary?


leave_hypervisor_tail will always be called before returning to the 
guest. So a call to vm_event_vcpu_enter in this function will be sufficient.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] vm-event/arm: implement support for control-register write vm-events

2016-06-17 Thread Julien Grall

Hello Corneliu,

On 17/06/16 11:36, Corneliu ZUZU wrote:

On 6/16/2016 7:49 PM, Julien Grall wrote:

On 16/06/16 15:13, Corneliu ZUZU wrote:


[...]


Please mention that PRRR and NMRR are aliased to respectively MAIR0
and MAIR1. This will avoid to spend time trying to understanding why
the spec says they are trapped but you don't "handle" them.


I mentioned that in traps.h (see "AKA" comments). Will put the same
comment here then.


I noticed it later. But it was not obvious to find.

[...]


diff --git a/xen/arch/arm/vm_event.c b/xen/arch/arm/vm_event.c
new file mode 100644
index 000..3f23fec
--- /dev/null
+++ b/xen/arch/arm/vm_event.c
@@ -0,0 +1,112 @@


[...]


+#include 
+#include 
+
+#if CONFIG_ARM_64
+
+#define MWSINF_SCTLR32,SCTLR_EL1
+#define MWSINF_TTBR064,TTBR0_EL1
+#define MWSINF_TTBR164,TTBR1_EL1
+#define MWSINF_TTBCR64,TCR_EL1
+
+#elif CONFIG_ARM_32
+
+#define MWSINF_SCTLR32,SCTLR
+#define MWSINF_TTBR064,TTBR0
+#define MWSINF_TTBR164,TTBR1


The values are the same as for arm64 (*_EL1 is aliased to * for
arm32). Please avoid duplication.


(see comment below about later reply)


Your later reply explain why you did not expose TTBR*_32 to ARM64, but 
does not explain why the 3 define above is the same as the ARM64.







+#define MWSINF_TTBR0_32 32,TTBR0_32
+#define MWSINF_TTBR1_32 32,TTBR1_32
+#define MWSINF_TTBCR32,TTBCR
+
+#endif
+
+#define MWS_EMUL_(val, sz, r...) WRITE_SYSREG##sz((uint##sz##_t)
(val), r)


The cast is not necessary.


+#define MWS_EMUL(r) CALL_MACRO(MWS_EMUL_, w->value, MWSINF_##r)
+
+static inline void vcpu_enter_write_data(struct vcpu *v)
+{
+struct monitor_write_data *w = >arch.vm_event.write_data;
+
+if ( likely(MWS_NOWRITE == w->status) )
+return;
+
+switch ( w->status )
+{
+case MWS_SCTLR:
+MWS_EMUL(SCTLR);
+break;
+case MWS_TTBR0:
+MWS_EMUL(TTBR0);
+break;
+case MWS_TTBR1:
+MWS_EMUL(TTBR1);
+break;
+#if CONFIG_ARM_32
+case MWS_TTBR0_32:
+MWS_EMUL(TTBR0_32);
+break;
+case MWS_TTBR1_32:
+MWS_EMUL(TTBR1_32);
+break;
+#endif


Aarch32 kernel can return on an AArch64 Xen. This means that
TTBR{0,1}_32 could be trapped and the write therefore be properly
emulated.


MCR TTBRx writes from AArch32 guests appear as writes of the low 32-bits
of AArch64 TTBRx_EL1 (architecturally mapped).
MCRR TTBRx writes from AArch32 guests appear as writes to AArch64
TTBRx_EL1.
Therefore, in the end the register we need to write is still TTBRx_EL1.


Why would you want to notify write to TTBR*_32 when Xen is running in 
AArch32 mode and none in Aaarch64 mode?


The vm-events for an AArch32 guests should be exactly the same
regardless of Xen mode (i.e AArch32 vs AArch64).

[...]




@@ -0,0 +1,253 @@


[...]


+/*
+ * Emulation of system-register trapped writes that do not cause
+ * VM_EVENT_REASON_WRITE_CTRLREG monitor vm-events.
+ * Such writes are collaterally trapped due to setting the
HCR_EL2.TVM bit.
+ *
+ * Regarding aarch32 domains, note that from Xen's perspective
system-registers
+ * of such domains are architecturally-mapped to aarch64 registers
in one of
+ * three ways:
+ *  - low 32-bits mapping   (e.g. aarch32 DFAR -> aarch64
FAR_EL1[31:0])
+ *  - high 32-bits mapping  (e.g. aarch32 IFAR -> aarch64
FAR_EL1[63:32])
+ *  - full mapping  (e.g. aarch32 SCTLR -> aarch64 SCTLR_EL1)
+ *
+ * Hence we define 2 macro variants:
+ *  - TVM_EMUL_SZ variant, for full mappings
+ *  - TVM_EMUL_LH variant, for low/high 32-bits mappings
+ */
+#define TVM_EMUL_SZ(regs, hsr, val, sz,
r...)   \
+{ \
+if ( psr_mode_is_user(regs)
)   \


Those registers are not accessible at EL0.


Hmm, I have this question noted.
I remember finding from the manuals that a write from user-mode of those
regs would still trap to EL2, but I didn't test that yet.
Will put that to test and come back with results for v2.


Testing will not tell you if a trap will occur or not. The ARM ARM may 
define it as IMPLEMENTATION DEFINED.


From my understanding of the ARMv7 spec (B1.14.1 and B1.14.13 in ARM 
DDI 0406C.c), the instruction at PL0 (user mode) will not trap to the 
hypervisor:


"Setting HCR.TVM to 1 means that any attempt, to write to a Non-secure 
memory control register from a Non-secure
PL1 or PL0 mode, that this reference manual does not describe as 
UNDEFINED , generates a Hyp Trap exception."


For ARMv8 (See description of HCR_EL2.TVM D7-1971 in ARM DDI 0487A.j), 
only NS EL1 write to the registers will be trapped.





+return inject_undef_exception(regs,
hsr);   \
+WRITE_SYSREG##sz((uint##sz##_t) (val), r);


[...]


diff --git a/xen/include/asm-arm/vm_event.h
b/xen/include/asm-arm/vm_event.h
index 4e5a272..edf9654 100644
--- a/xen/include/asm-arm/vm_event.h
+++ b/xen/include/asm-arm/vm_event.h
@@ -30,6 +30,12 @@ static inline int 

[Xen-devel] [linux-3.14 test] 95836: regressions - trouble: blocked/broken/fail/pass

2016-06-17 Thread osstest service owner
flight 95836 linux-3.14 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95836/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-amd64 10 guest-startfail REGR. vs. 95164
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1fail REGR. vs. 95164
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1  fail REGR. vs. 95164

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 3 host-install(3) broken pass in 
95800
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 3 host-install(3) broken 
pass in 95800
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail in 
95800 pass in 95836
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-bootfail pass in 95800

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen6 xen-buildfail   like 95164
 build-amd64-rumpuserxen   6 xen-buildfail   like 95164
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 95164
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 95164
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 95164
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 95164

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass

version targeted for testing:
 linux022aafd04696a3c6b7ad47ab83a9650646287bf8
baseline version:
 linuxf06cb456a442c7df95a4ba6e2f3a341cf925d7cf

Last test of basis95164  2016-06-01 19:46:34 Z   15 days
Testing same since95407  2016-06-08 00:46:14 Z9 days8 attempts


People who touched revisions under test:
  Andrew Morton 
  Ben Hutchings 
  Bjorn Helgaas 
  Daniel Lezcano 
  Daniel Vetter 
  Dave Chinner 
  Dave Chinner 
  Dave Gerlach 
  David Vrabel 
  Dmitry Torokhov 
  Greg Kroah-Hartman 
  Hari Bathini 
  Itai Handler 
  J. Bruce Fields 
  James Hogan 
  Jeff Layton 
  Joseph Salisbury 
  Kalle Valo 
  Kalle Valo 
  Larry Finger 
  Linus Torvalds 
  Lyude 
  Mahesh Salgaonkar 
  Martin K. Petersen 
  Matthias Schiffer 
  Michael Ellerman 
  Nicolai Stange 
  Patrik Jakobsson 
  Paul Burton 
  Prarit Bhargava 
  Rafael J. Wysocki 
  Raghava Aditya Renukunta 
  Ralf Baechle 
  Ricky Liang 
  Ross Lagerwall 
  Shyam Kaushik 
  Theodore Ts'o 
  Tomáš Trnka 
  Ville Syrjälä 
  Wang YanQing 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops 

Re: [Xen-devel] [PATCH 3/7] vm-event: introduce vm_event_vcpu_enter

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 2:27 PM, Jan Beulich wrote:

On 17.06.16 at 13:13,  wrote:

On 6/17/2016 10:17 AM, Jan Beulich wrote:

(And to be clear, I much appreciate any form of reduction of the
sometimes extremely long lists of #include-s, just not [apparently
or really] randomly mixed with other, substantial changes. That's
namely because it's not clear whether source files should explicitly
include everything they need, or instead be allowed to rely on
headers they include to include further headers they also
_explicitly_ rely on.

Personally I prefer the former since I think it also cuts down
compilation time.
Having header H include every header Ni needed by source S makes H
unnecessarily bulky at compilation time for other sources <> S that
don't need headers Ni but which depend on H nonetheless.

I nowhere said every _header_ should include everything any of
its _consumers_ would require. My point was solely about source
files. For example, if S depends on both H1 and H2, and H2
already includes H1, whether S then needs to just include H2, or
should also explicitly include H1 (such that S doesn't need changing
when the inclusion of H1 by H2 goes away).

Jan




Ah, ok got it.
I restate my view of treating these "issues" with automation tools 
rather than leaving the programmer to do primitive work that he 
shouldn't be required to do with a nowadays programming language.
Clang for example offers a powerful parsing library (can parse GCC too) 
with python bindings, it would be nice to take more advantage of that.


Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] hvmloader: don't hard-code IO-APIC parameters

2016-06-17 Thread Boris Ostrovsky
On 06/17/2016 02:00 AM, Jan Beulich wrote:
 On 16.06.16 at 18:49,  wrote:
>> On 06/16/2016 05:40 AM, Jan Beulich wrote:
>>> The IO-APIC address has variable bits determined by the PCI-to-ISA
>>> bridge, and the IO-APIC version should be read from the IO-APIC. (Note
>>> that there's still implicit rather than explicit agreement on the
>>> IO-APIC base address between qemu and the hypervisor.)
>> It probably doesn't matter now for PVHv2 since we are not going to
>> initially emulate IOAPIC but when/if we do, how do we query for base
>> address and version? Should we just rely on the same implicit agreement?
> If this refers to the sentence in parentheses, then this was really
> meant to indicate a work item. I.e. qemu should negotiate with
> Xen. For PVHv2 this would then probably mean for hvmloader to
> query Xen (via new hypercall).

We are not using hvmloader for PVH so I am not sure how that would work.
You mean the toolstack?

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] vm-event/arm: implement support for control-register write vm-events

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 12:28 AM, Julien Grall wrote:



On 16/06/2016 20:24, Corneliu ZUZU wrote:

On 6/16/2016 5:26 PM, Julien Grall wrote:
Hi Julien. Yes, I agree that it's complex, I would have preferred to
split it up too and I actually tried, but the changes are tightly
coupled and they don't seem to be 'split-able'.


After I reviewed this patch I think it could be simplified a lot.

Most of the registers not trapped by vm-event could be emulated with 
one-line WRITE_SYSREGxx and does not require specific case depending 
on the architecture.


You can give a look how we handle the domain context switch in C 
(arch/arm/domain.c).


Only few of the registers will require more than one-line (such as 
MAIR*, *FAR) which could be over a macro.


Regards,



Ok, I'll take a look again but I'm not that optimistic, I remember 
trying pretty hard to simplify this (and kind of failing), I got the 
best I could.
I think it had to do with the fact the SCTLR macros & such got expanded 
too early (where I wanted to use them as prefixes to construct other 
macros).
Having said this, I'll look into your suggestions and reinspect the code 
to recall why those macros got to be the way the are and get back with 
conclusions.


Regards,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/7] vm-event: introduce vm_event_vcpu_enter

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 11:55 AM, Julien Grall wrote:

Hello,

On 16/06/16 15:08, Corneliu ZUZU wrote:

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index d31f821..ba248c8 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -19,6 +19,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 
  #include 
@@ -251,6 +252,8 @@ static void schedule_tail(struct vcpu *prev)

  ctxt_switch_to(current);

+vm_event_vcpu_enter(current);
+
  local_irq_enable();

  context_saved(prev);
@@ -296,7 +299,7 @@ void context_switch(struct vcpu *prev, struct 
vcpu *next)


  void continue_running(struct vcpu *same)
  {
-/* Nothing to do */
+vm_event_vcpu_enter(same);
  }

  void sync_local_execstate(void)


From my understanding of the commit message, vm_event_vcpu_enter 
should be called before returning to the guest. The scheduling 
functions are not called every-time Xen is returning to the guest. So 
if you want to do every time Xen is re-entering to the guest, then 
this should be done in leave_hypervisor_tail.


Regards,



Which is also done, vm_event_vcpu_enter is called from 
leave_hypervisor_tail as well.
Are you saying that calling from leave_hypervisor_tail is enough and 
that the schedule_tail call is not necessary?


Thanks,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 13:05,  wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -59,6 +59,20 @@ config BIGMEM
>  
> If unsure, say N.
>  
> +config HVM_FEP
> + bool "HVM Forced Emulation Prefix support"

And no "if EXPERT"?

> @@ -95,9 +96,9 @@ unsigned long __section(".bss.page_aligned")
>  static bool_t __initdata opt_hap_enabled = 1;
>  boolean_param("hap", opt_hap_enabled);
>  
> -#ifndef opt_hvm_fep
> +#if CONFIG_HVM_FEP

#ifdef please. And I think this would better be left alone anyway
(and then the comment only applies to the other instance).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/7] vm-event/arm: move hvm_event_cr->common vm_event_monitor_cr

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 11:38 AM, Jan Beulich wrote:

On 17.06.16 at 10:25,  wrote:

On 6/16/2016 6:16 PM, Jan Beulich wrote:

And looking at all the uses of this variable I get the impression that
you really want a shorthand for >arch.monitor (if any such
helper variable is worthwhile to have here in the first place).

Well, this was a simple cut-paste operation, not very old content aware :)
Personally I prefer the current shorthand (ad) (seems more intuitive and
is consistent with the other XEN_DOMCTL_MONITOR_EVENT_* cases), but if
you prefer I'll change that shorthand to am = >arch.monitor?

I'd prefer either no shorthand, or one eliminating the longest common
prefix across all uses.


am = >arch.monitor it is then.


--- a/xen/include/asm-x86/monitor.h
+++ b/xen/include/asm-x86/monitor.h
@@ -24,8 +24,6 @@
   
   #include 
   
-#define monitor_ctrlreg_bitmask(ctrlreg_index) (1U << (ctrlreg_index))

-
   static inline
   int arch_monitor_domctl_op(struct domain *d, struct xen_domctl_monitor_op 
*mop)
   {
--- a/xen/include/xen/monitor.h
+++ b/xen/include/xen/monitor.h
@@ -25,6 +25,10 @@
   struct domain;
   struct xen_domctl_monitor_op;
   
+#if CONFIG_X86

+#define monitor_ctrlreg_bitmask(ctrlreg_index) (1U << (ctrlreg_index))
+#endif

What's the point in removing this from the x86 header if then it
needs to be put in such a conditional? If the conditional gets
dropped in the next patch, then I think you have two options:
Leave it where it is here, and move it there. Or move it here,
but omit the conditional right away - I can't see this definition
being present to harm the ARM build in any way.

Meld comment above.

I don't follow - having split the patches for reviewability was
a good idea. I merely commented on some oddities resulting
from that split, which - afaict - could be easily corrected without
folding the patches together.

Jan


Ooh, haha, sorry I've re-read your comment now. You're right, there's no 
point in that change, I'll leave it on the X86 side until the ARM part 
is actually implemented (last patch).


Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/7] vm-event: introduce vm_event_vcpu_enter

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 13:13,  wrote:
> On 6/17/2016 10:17 AM, Jan Beulich wrote:
>> (And to be clear, I much appreciate any form of reduction of the
>> sometimes extremely long lists of #include-s, just not [apparently
>> or really] randomly mixed with other, substantial changes. That's
>> namely because it's not clear whether source files should explicitly
>> include everything they need, or instead be allowed to rely on
>> headers they include to include further headers they also
>> _explicitly_ rely on.
> 
> Personally I prefer the former since I think it also cuts down 
> compilation time.
> Having header H include every header Ni needed by source S makes H 
> unnecessarily bulky at compilation time for other sources <> S that 
> don't need headers Ni but which depend on H nonetheless.

I nowhere said every _header_ should include everything any of
its _consumers_ would require. My point was solely about source
files. For example, if S depends on both H1 and H2, and H2
already includes H1, whether S then needs to just include H2, or
should also explicitly include H1 (such that S doesn't need changing
when the inclusion of H1 by H2 goes away).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 95853: tolerable all pass - PUSHED

2016-06-17 Thread osstest service owner
flight 95853 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/95853/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c1cee3ff93c1be525443472dd4eebe230b57ece9
baseline version:
 xen  759b9618b8a22ddd87d01c0bff5366814b17eea7

Last test of basis95786  2016-06-15 16:07:33 Z1 days
Testing same since95853  2016-06-17 09:14:43 Z0 days1 attempts


People who touched revisions under test:
  George Dunlap 
  Jan Beulich 
  Julien Grall 
  Kevin Tian 
  Quan Xu 
  Suravee Suthikulpanit 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=c1cee3ff93c1be525443472dd4eebe230b57ece9
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
c1cee3ff93c1be525443472dd4eebe230b57ece9
+ branch=xen-unstable-smoke
+ revision=c1cee3ff93c1be525443472dd4eebe230b57ece9
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.7-testing
+ '[' xc1cee3ff93c1be525443472dd4eebe230b57ece9 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use 

Re: [Xen-devel] [PATCH 4/7] vm-event/x86: use vm_event_vcpu_enter properly

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 10:20 AM, Jan Beulich wrote:

On 16.06.16 at 22:20,  wrote:

On 6/16/2016 6:00 PM, Jan Beulich wrote:

On 16.06.16 at 16:09,  wrote:

@@ -1432,18 +1430,16 @@ static void vmx_update_guest_cr(struct vcpu *v, 
unsigned int cr)
   if ( paging_mode_hap(v->domain) )
   {
   /* Manage GUEST_CR3 when CR0.PE=0. */
+uint32_t old_ctls = v->arch.hvm_vmx.exec_control;
   uint32_t cr3_ctls = (CPU_BASED_CR3_LOAD_EXITING |
CPU_BASED_CR3_STORE_EXITING);
+
   v->arch.hvm_vmx.exec_control &= ~cr3_ctls;
   if ( !hvm_paging_enabled(v) && !vmx_unrestricted_guest(v) )
   v->arch.hvm_vmx.exec_control |= cr3_ctls;
   
-/* Trap CR3 updates if CR3 memory events are enabled. */

-if ( v->domain->arch.monitor.write_ctrlreg_enabled &
- monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3) )
-v->arch.hvm_vmx.exec_control |= CPU_BASED_CR3_LOAD_EXITING;
-
-vmx_update_cpu_exec_control(v);
+if ( old_ctls != v->arch.hvm_vmx.exec_control )
+vmx_update_cpu_exec_control(v);
   }

How does this match up with the rest of this patch?

And by 'this' you mean slightly optimizing this sequence by adding in
old_ctls?
It seems pretty straight-forward to me, I figured if I am to move the
monitor.write_ctrlreg_enabled part from here
it wouldn't be much of a stretch to also do this little
optimization...what would have been appropriate?
To do this in a separate patch? To mention it in the commit message?

At least the latter, and perhaps better the former. Without even
mentioning it the readers (reviewers) have to guess whether this
is an integral part of the change, or - as you now confirm - just a
minor optimization done along the road.

Jan


Ack, will split in separate patch in v2.
You're right, I've got to be more attentive to always separate unrelated 
code changes, however minor they are :)


Thanks,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen: use native disk xenbus protocol if possible

2016-06-17 Thread Juergen Gross
The qdisk implementation is using the native xenbus protocol only in
case of no protocol specified at all. As using the explicit 32- or
64-bit protocol is slower than the native one due to copying requests
not by memcpy but element for element, this is not optimal.

Correct this by using the native protocol in case word sizes of
frontend and backend match.

Signed-off-by: Juergen Gross 
---
 hw/block/xen_disk.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 2862b59..0961fcb 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -976,14 +976,14 @@ static int blk_connect(struct XenDevice *xendev)
 blkdev->feature_persistent = !!pers;
 }
 
-blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
-if (blkdev->xendev.protocol) {
-if (strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_32) == 0) {
-blkdev->protocol = BLKIF_PROTOCOL_X86_32;
-}
-if (strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_64) == 0) {
-blkdev->protocol = BLKIF_PROTOCOL_X86_64;
-}
+if (!blkdev->xendev.protocol) {
+blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
+} else if (strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_NATIVE) == 0) {
+blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
+} else if (strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_32) == 0) {
+blkdev->protocol = BLKIF_PROTOCOL_X86_32;
+} else if (strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_64) == 0) {
+blkdev->protocol = BLKIF_PROTOCOL_X86_64;
 }
 
 blkdev->sring = xengnttab_map_grant_ref(blkdev->xendev.gnttabdev,
-- 
2.6.6


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/7] vm-event: introduce vm_event_vcpu_enter

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 10:17 AM, Jan Beulich wrote:

On 16.06.16 at 22:10,  wrote:

On 6/16/2016 5:51 PM, Jan Beulich wrote:

On 16.06.16 at 16:08,  wrote:

@@ -509,6 +508,8 @@ void hvm_do_resume(struct vcpu *v)
   }
   }
   
+vm_event_vcpu_enter(v);

Why here?

Why indeed. It made sense because monitor_write_data handling was
originally there and then the plan was to move it to vm_event_vcpu_enter
(which happens in the following commit).
The question is though, why was monitor_write_data handled there in the
first place? Why was it not put e.g. in vmx_do_resume immediately after
the call to hvm_do_resume and just before
the reset_stack_and_jump...? And what happens with handling
monitor_write_data if this:

if ( !handle_hvm_io_completion(v) )
  return;

causes a return?

I see Razvan responded to this. I don't have a strong opinion
either way, my only request if for the call to be in exactly one
place.


--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -36,7 +36,6 @@
   #include 
   #include 
   #include 
-#include 
   #include 

There are way too many of these #include adjustments here. If
you really mean to clean these up, please don't randomly throw
this into various unrelated patches.

I haven't thrown anything "randomly into unrelated patches", please
first ask for my reasoning and then draw such conclusions.

See patch 1.


"Sorry, that was done out of reflex, should have stated the reasoning."


Plus I don't think I (or in fact any reviewer) should ask
for such reasoning: Instead you should state extra cleanup you do
to unrelated (to the purpose of your patch) files in the description.


Is that still the case when that reasoning is obvious? (at least it 
seemed to me)


but anyway..


Or even better, split it off to a follow-on, purely cleanup patch.


I agree with this. Will keep in mind with v2.


(And to be clear, I much appreciate any form of reduction of the
sometimes extremely long lists of #include-s, just not [apparently
or really] randomly mixed with other, substantial changes. That's
namely because it's not clear whether source files should explicitly
include everything they need, or instead be allowed to rely on
headers they include to include further headers they also
_explicitly_ rely on.


Personally I prefer the former since I think it also cuts down 
compilation time.
Having header H include every header Ni needed by source S makes H 
unnecessarily bulky at compilation time for other sources <> S that 
don't need headers Ni but which depend on H nonetheless.



IOW there's likely a discussion to be had for
this kind of cleanup, and such a discussion should be a separate
thread from the one on the functional adjustments here.)


Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 1/2] xen/kernel: document 'C' in print_tainted

2016-06-17 Thread Wei Liu
Signed-off-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
 xen/common/kernel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 1a6823a..dae7e35 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -174,6 +174,7 @@ int __init parse_bool(const char *s)
  *  'S' - SMP with CPUs not designed for SMP.
  *  'M' - Machine had a machine check experience.
  *  'B' - System has hit bad_page.
+ *  'C' - Console output is synchronous.
  *
  *  The string is overwritten by the next call to print_taint().
  */
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 0/2] Make hvm_fep available to non-debug build

2016-06-17 Thread Wei Liu
Wei Liu (2):
  xen/kernel: document 'C' in print_tainted
  xen: make available hvm_fep to non-debug build as well

 docs/misc/xen-command-line.markdown |  8 ++--
 xen/arch/x86/Kconfig| 14 ++
 xen/arch/x86/hvm/hvm.c  | 28 ++--
 xen/common/kernel.c |  7 +--
 xen/include/asm-x86/hvm/hvm.h   |  2 +-
 xen/include/xen/lib.h   |  1 +
 6 files changed, 53 insertions(+), 7 deletions(-)

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 2/2] xen: make available hvm_fep to non-debug build as well

2016-06-17 Thread Wei Liu
Originally hvm_fep was guarded by NDEBUG, which means it was only
available to debug builds.

However there is value to have it for non-debug builds as well. User can
use that to run tests in setup that replicates production setup.

Make it clear with a sync_console style warning that this option can't
be used in production setup. Update command line documentation
accordingly. Finally mark Xen as tainted when this option is enabled.

Add a kconfig option under x86 to configure hvm_fep.

Signed-off-by: Wei Liu 
---
Cc: Andrew Cooper 
Cc: Jan Beulich 
Cc: Doug Goldstein 

v2:
1. unsigned -> unsigned int
2. %d -> %u
3. Add spaces around "-"
4. Update warning message
5. Only taint hv when fep is used
6. Add kconfig option
---
 docs/misc/xen-command-line.markdown |  8 ++--
 xen/arch/x86/Kconfig| 14 ++
 xen/arch/x86/hvm/hvm.c  | 28 ++--
 xen/common/kernel.c |  6 --
 xen/include/asm-x86/hvm/hvm.h   |  2 +-
 xen/include/xen/lib.h   |  1 +
 6 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index fed732c..dc53e24 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -878,8 +878,12 @@ Recognized in debug builds of the hypervisor only.
 Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
 arbitrary instructions.
 
-This option is intended for development purposes, and is only available in
-debug builds of the hypervisor.
+This option is intended for development and testing purposes.
+
+*Warning*
+As this feature opens up the instruction emulator to HVM guest, don't
+use this in production system. No security support is provided when
+this flag is set.
 
 ### hvm\_port80
 > `= `
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 73f79cc..5e3b04a 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -59,6 +59,20 @@ config BIGMEM
 
  If unsure, say N.
 
+config HVM_FEP
+   bool "HVM Forced Emulation Prefix support"
+   default y
+   ---help---
+
+ Compiles in a feature that allows HVM guest to enter
+ instruction emulator with forced emulation prefix.
+
+ This feature can only be enabled during boot time with
+ appropriate hypervisor command line option. Please read
+ hypervisor command line documentation before trying to use
+ this feature.
+
+ If unsure, say Y.
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 78db903..373b78e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -95,9 +96,9 @@ unsigned long __section(".bss.page_aligned")
 static bool_t __initdata opt_hap_enabled = 1;
 boolean_param("hap", opt_hap_enabled);
 
-#ifndef opt_hvm_fep
+#if CONFIG_HVM_FEP
 /* Permit use of the Forced Emulation Prefix in HVM guests */
-bool_t opt_hvm_fep;
+bool_t __read_mostly opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
@@ -182,6 +183,28 @@ static int __init hvm_enable(void)
 if ( !opt_altp2m_enabled )
 hvm_funcs.altp2m_supported = 0;
 
+if ( opt_hvm_fep )
+{
+unsigned int i, j;
+
+printk("**\n");
+printk("*** WARNING: HVM FORCED EMULATION PREFIX IS AVAILABLE\n");
+printk("*** This option is *ONLY* intended to aid testing of 
Xen.\n");
+printk("*** It has implications on the security of the system.\n");
+printk("*** Please *DO NOT* use this in production.\n");
+printk("**\n");
+for ( i = 0; i < 3; i++ )
+{
+printk("%u... ", 3 - i);
+for ( j = 0; j < 100; j++ )
+{
+process_pending_softirqs();
+mdelay(10);
+}
+}
+printk("\n");
+}
+
 /*
  * Allow direct access to the PC debug ports 0x80 and 0xed (they are
  * often used for I/O delays, but the vmexits simply slow things down).
@@ -3905,6 +3928,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
 {
 regs->eip += sizeof(sig);
 regs->eflags &= ~X86_EFLAGS_RF;
+add_taint(TAINT_HVM_FEP);
 }
 }
 
diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index dae7e35..5bf77aa 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -175,6 +175,7 @@ int __init parse_bool(const char *s)
  *  'M' - Machine had a machine check experience.
  *  'B' - System has hit bad_page.
  *  'C' - Console output is synchronous.
+ *  'H' - HVM forced emulation prefix is permitted.
  *
  *  The string is overwritten by the 

Re: [Xen-devel] [PATCH] xen: sched: use default scheduler upon an invalid "sched="

2016-06-17 Thread Andrew Cooper
On 17/06/16 11:31, Dario Faggioli wrote:
> instead of just the first scheduler we find in the array.
>
> In fact, right now, if someone makes a typo when passing
> the "sched=" command line option to Xen, we (with all
> schedulers configured in) pick ARINC653, which is most
> likely not what one would expect.
>
> Go for the default scheduler instead.
>
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Doug Goldstein 
> Cc: Jonathan Creekmore 

Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for 32/64 word size mix

2016-06-17 Thread Paul Durrant
> -Original Message-
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 17 June 2016 11:40
> To: Paul Durrant; Jan Beulich
> Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> de...@nongnu.org; kra...@redhat.com
> Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for
> 32/64 word size mix
> 
> On 17/06/16 12:15, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> >> Juergen Gross
> >> Sent: 17 June 2016 11:08
> >> To: Paul Durrant; Jan Beulich
> >> Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> >> de...@nongnu.org; kra...@redhat.com
> >> Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD for
> >> 32/64 word size mix
> >>
> >> On 17/06/16 11:50, Paul Durrant wrote:
>  -Original Message-
>  From: Juergen Gross [mailto:jgr...@suse.com]
>  Sent: 17 June 2016 10:46
>  To: Paul Durrant; Jan Beulich
>  Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
>  de...@nongnu.org; kra...@redhat.com
>  Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk BLKIF_OP_DISCARD
> for
>  32/64 word size mix
> 
>  On 17/06/16 11:37, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On
> Behalf
> >> Of
>  Jan
> >> Beulich
> >> Sent: 17 June 2016 10:26
> >> To: Juergen Gross
> >> Cc: Anthony Perard; xen-devel; sstabell...@kernel.org; qemu-
> >> de...@nongnu.org; kra...@redhat.com
> >> Subject: Re: [Xen-devel] [PATCH v2] xen: fix qdisk
> BLKIF_OP_DISCARD
> >> for
> >> 32/64 word size mix
> >>
> > On 17.06.16 at 11:14,  wrote:
> >>> In case the word size of the domU and qemu running the qdisk
> >> backend
> >>> differ BLKIF_OP_DISCARD will not work reliably, as the request
> >>> structure in the ring have different layouts for different word size.
> >>>
> >>> Correct this by copying the request structure in case of different
> >>> word size element by element in the BLKIF_OP_DISCARD case, too.
> >>>
> >>> The easiest way to achieve this is to resync hw/block/xen_blkif.h
> with
> >>> its original source from the Linux kernel.
> >>>
> >>> Signed-off-by: Juergen Gross 
> >>> ---
> >>> V2: resync with Linux kernel version of hw/block/xen_blkif.h as
> >>> suggested by Paul Durrant
> >>
> >> Oh, I didn't realize he suggested syncing with the Linux variant.
> >> Why not with the canonical one? I have to admit that I particularly
> >> dislike Linux'es strange union-izng, mainly because of it requiring
> >> this myriad of __attribute__((__packed__)).
> >>
> >
> > Yes, it's truly grotesque and such things should be blown away with
>  extreme prejudice.
> 
>  Sorry, I'm confused now.
> 
>  Do you still mandate for the resync or not?
> 
>  Resyncing with elimination of all the __packed__ stuff seems not to be
>  a proper alternative as this would require a major rework.
> >>>
> >>> Why? Replacing the existing horribleness with the canonical header
> (fixed
> >> for style) might mean a large diff but it should be functionally the same 
> >> or
> >> something has gone very seriously wrong. If the extra part you need is
> not in
> >> the canonical header then adding this as a second patch seems like a
> >> reasonable plan.
> >>
> >> I think you don't realize that qemu is built using the public headers
> >> from the Xen build environment. So there is no way to resync with the
> >> canonical header as this isn't part of the qemu tree.
> >>
> >
> > Now I'm confused... you're posting a patch to hw/block/xen_blkif.h. That's
> in the QEMU source, right? That's not a Xen public header but is a Linux
> mangled variant of a Xen public header. So, actually, I guess the question is
> why can't this header just go away and QEMU use the canonical header
> directly from Xen?
> 
> No, hw/block/xen_blkif.h is based on the Linux header
> drivers/block/xen-blkback/common.h which is an add-on header to the
> canonical-based Linux header include/xen/interface/io/blkif.h
>
> >> The header in question is originating from the Linux one which is an
> >> add-on of the canonical header containing the explicit 32- and 64-bit
> >> variants of the xenbus protocol and the conversion routines between
> >> those.
> >>
> >> It would be possible to add these parts to the canonical header, but
> >> do we really want that?
> >>
> >
> > No, we shouldn't be taking Linux brokenness into the canonical header.
> 
> Okay, so then back to the first approach using hw/block/xen_blkif.h as
> today and adapting the style first and then doing the necessary code
> correction?
> 

I guess re-syncing with the Linux header as in your v2 patch is the least worst 
option then.

  Paul

> 
> Juergen


Re: [Xen-devel] [PATCH RESEND 01/14] libxl/arm: Fix the function name in error log

2016-06-17 Thread Wei Liu
On Fri, Jun 03, 2016 at 08:25:05PM +0100, Wei Liu wrote:
> On Tue, May 31, 2016 at 01:02:53PM +0800, Shannon Zhao wrote:
> > From: Shannon Zhao 
> > 
> > It should be xc_dom_devicetree_mem instead of xc_dom_devicetree_file.
> > 
> > Signed-off-by: Shannon Zhao 
> 
> Acked-by: Wei Liu 
> 

I've pushed this to staging to reduce the length of your patch queue.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/7] minor (formatting) fixes

2016-06-17 Thread Corneliu ZUZU

On 6/17/2016 10:06 AM, Jan Beulich wrote:

On 16.06.16 at 21:19,  wrote:

On 6/16/2016 5:24 PM, Jan Beulich wrote:

On 16.06.16 at 16:06,  wrote:

--- a/xen/arch/x86/hvm/event.c
+++ b/xen/arch/x86/hvm/event.c
@@ -23,6 +23,7 @@
   
   #include 

   #include 
+#include 
   #include 
   #include 
   #include 
diff --git a/xen/common/monitor.c b/xen/common/monitor.c
index d950a7c..b30857a 100644
--- a/xen/common/monitor.c
+++ b/xen/common/monitor.c
@@ -22,7 +22,6 @@
   #include 
   #include 
   #include 
-#include 
   #include 
   #include 

These two adjustments clearly don't fit title / description. I certainly
don't mind unnecessary inclusions to be dropped, but the addition of
one clearly needs explanation (after all thing build fine without it).

Sorry, that was done out of reflex, should have stated the reasoning.
Generally, if:
- event.c includes event.h
- event.c needs paging.h
- event.h -doesn't need- paging.h
then I prefer to include paging.h in event.c, not in event.h (include
strictly -where- needed).
Also since xen/paging.h included asm/paging.h and event.c only needs the
asm/paging.h part, I also changed xen/paging.h inclusion -> asm/paging.h
inclusion (include strictly -what's- needed).
But I can revert that if you want me to (not that important and these
little things are better done with automatic tools anyway), or should I
leave the change and update the commit message?

Well, I'd leave the ultimate decision to the maintainers, but I'd say
this doesn't belong in a patch dealing with formatting issues. I.e.
I'm fine with the cleanup, but either under a different title (and
with at least an abbreviated version of what you said above), or
in a separate patch. Considering that you appear to do the same
in later patches, perhaps consolidating all the #include adjustments
into one patch would a good idea?


I tend to prefer organizing what's easy to understand/minor first (to 
let reviewers get rid of those in a flash), so I'd prefer to leave it in 
this patch with the other minor fixes and update the commit message instead.

But either way it's fine by me.


--- a/xen/include/xen/vm_event.h
+++ b/xen/include/xen/vm_event.h
@@ -85,7 +85,6 @@ void vm_event_monitor_guest_request(void);
   
   #endif /* __VM_EVENT_H__ */
   
-

   /*
* Local variables:
* mode: C

Why don't you remove the other stray blank line at once?

What stray line? Shouldn't there be -one- blank line between the #endif
and the local vars block?
Looking @ other files that rule seems to hold...

Oh, you're right. I thought I had frequently seen it the other way
around (and it would seem more logical to me), but I must have been
misremembering.

Jan




Thanks,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >