[Xen-devel] [qemu-mainline test] 59832: regressions - trouble: broken/fail/pass

2015-07-23 Thread osstest service owner
flight 59832 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59832/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 12 guest-saverestore   fail REGR. vs. 59059
 test-amd64-amd64-xl  20 guest-start/debian.repeat fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-ovmf-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 
59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore fail REGR. 
vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 11 guest-saverestore fail REGR. vs. 
59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 11 guest-saverestore fail REGR. vs. 
59059
 test-amd64-amd64-xl-qemuu-ovmf-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-winxpsp3 11 guest-saverestore   fail REGR. vs. 59059
 test-amd64-i386-freebsd10-amd64 12 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3 11 guest-saverestorefail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore fail REGR. 
vs. 59059
 test-amd64-amd64-xl-qemuu-win7-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-win7-amd64 11 guest-saverestore  fail REGR. vs. 59059

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  8 leak-check/basis(8)   fail REGR. vs. 59059

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass

version targeted for testing:
 qemuub69b30532e0a80e25449244c01b0cbed000c99a3
baseline version:
 qemuu35360642d043c2a5366e8a04a10e5545e7353bd5

Last test of basis59059  2015-07-05 10:39:20 Z   18 days
Failing since 59109  2015-07-06 14:58:21 Z   17 days   24 attempts
Testing same since59832  2015-07-23 10:53:08 Z0 days1 attempts


People who touched revisions under test:
  Alberto Garcia be...@igalia.com
  Alex Williamson alex.william...@redhat.com
  Alexander Graf ag...@suse.de
  Alexey Kardashevskiy a...@ozlabs.ru
  Alvise Rigo a.r...@virtualopensystems.com
  Amit Shah amit.s...@redhat.com
  Andreas Färber afaer...@suse.de
  Andrew Bennett andrew.benn...@imgtec.com
  Andrew Jones drjo...@redhat.com
  Artyom Tarasenko atar4q...@gmail.com
  Aurelien Jarno aurel...@aurel32.net
  Benjamin Herrenschmidt b...@kernel.crashing.org
  Bharata B Rao bhar...@linux.vnet.com
  Bharata B Rao bhar...@linux.vnet.ibm.com
  Brian Kress kre...@moose.net
  Chen Hanxiao chenhanx...@cn.fujitsu.com
  Christian Borntraeger borntrae...@de.ibm.com
  Christoffer Dall christoffer.d...@linaro.org
  Christoph Hellwig h...@lst.de
  Christophe Fergeau cferg...@redhat.com
  Claudio Fontana claudio.font...@huawei.com
  Cormac O'Brien i.am.cormac.obr...@gmail.com
  Cornelia Huck cornelia.h...@de.ibm.com
  Dana Rubin dana.ru...@ravellosystems.com
  Daniel P. Berrange berra...@redhat.com
  David Gibson da...@gibson.dropbear.id.au
  Denis V. Lunev d...@openvz.org
  Dmitry Osipenko dig...@gmail.com
  Dr. David Alan Gilbert dgilb...@redhat.com
  Eduardo Habkost ehabk...@redhat.com
  Eric Auger eric.au...@linaro.org
  Fam Zheng f...@redhat.com
  Frediano Ziglio fzig...@redhat.com
  Gabriel Laupre glau...@chelsio.com
  Gavin Shan gws...@linux.vnet.ibm.com
  Gerd Hoffmann kra...@redhat.com
  Gonglei arei.gong...@huawei.com
  Greg Kurz gk...@linux.vnet.ibm.com
  Hannes Reinecke h...@suse.de
  Hervé Poussineau hpous...@reactos.org
  Igor Mammedov imamm...@redhat.com
  James Hogan james.ho...@imgtec.com
  Jan Kiszka jan.kis...@siemens.com
  Jason Wang jasow...@redhat.com
  Jeff Cody jc...@redhat.com
  Johannes Schlatow schla...@ida.ing.tu-bs.de
  John Snow js...@redhat.com
  Josh Durgin jdur...@redhat.com
  Juan 

Re: [Xen-devel] [PATCH v3 2/3] x86/ldt: Make modify_ldt optional

2015-07-23 Thread Willy Tarreau
On Thu, Jul 23, 2015 at 04:40:14PM -0700, Andy Lutomirski wrote:
 On Thu, Jul 23, 2015 at 4:36 PM, Kees Cook keesc...@chromium.org wrote:
  I've been pondering something like this that is even MORE generic, for
  any syscall. Something like a syscalls directory under
  /proc/sys/kernel, with 1 entry per syscall. 0 is available, 1 is
  disabled, and -1 disabled until next boot.
 
 
 It might want to be /proc/sys/kernel/syscalls/[abi]/[name], possibly
 with more than just those options.  We might want disabled, returns
 ENOSYS, disabled, returns EPERM, and a lock bit.
 
 On x86 at least, the implementation's easy -- we can just poke the
 syscall table.

I wouldn't do it these days. Around 2000-2001, with a friend we designed
a module with its userland counterpart which was called overloader. The
principle was to intercept syscalls in order to enforce some form of
policies, log values, or remap paths, etc. The first use was to log all
file creations during a make install to more easily build packages. It
was at the era where it was easy to modify the syscall table from a module,
in kernel 2.2.

We quickly found that beyond logging/rewriting syscall arguments, it had
limited use cases when used as a syscall firewall because many syscalls
are still too coarse to decide whether you want to enable/disable them.
I remember that socketcall() and ioctl() were among the annoying ones.
Either you totally enable or totally disable. In the end, the only valid
use cases we found for enabling/disabling a syscall were limited to a very
small set for debugging purposes, in order to force some application code
to detect a missing implementation and switch to an alternative (eg: these
days if you suspect a bug in epoll you could disable it and force the app
to use poll instead). It was still useful to disable module loading and
FS mounting but that was about all by then.

All this to say that probably only a handful of tricky syscalls would
need an on/off switch but clearly not all of them at all, so I'd rather
add a few entries just for the relevant ones, mainly to fix compatibility
issues and nothing more. Eg: what's the point of disabling exit(), wait(),
kill(), fork() or getpid()... It would only increase the difficulty to
sort out bug reports.

Just my opinion,
Willy


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/3] x86/ldt: Make modify_ldt optional

2015-07-23 Thread Kees Cook
On Thu, Jul 23, 2015 at 4:58 PM, Willy Tarreau w...@1wt.eu wrote:
 On Thu, Jul 23, 2015 at 04:40:14PM -0700, Andy Lutomirski wrote:
 On Thu, Jul 23, 2015 at 4:36 PM, Kees Cook keesc...@chromium.org wrote:
  I've been pondering something like this that is even MORE generic, for
  any syscall. Something like a syscalls directory under
  /proc/sys/kernel, with 1 entry per syscall. 0 is available, 1 is
  disabled, and -1 disabled until next boot.
 

 It might want to be /proc/sys/kernel/syscalls/[abi]/[name], possibly
 with more than just those options.  We might want disabled, returns
 ENOSYS, disabled, returns EPERM, and a lock bit.

 On x86 at least, the implementation's easy -- we can just poke the
 syscall table.

 I wouldn't do it these days. Around 2000-2001, with a friend we designed
 a module with its userland counterpart which was called overloader. The
 principle was to intercept syscalls in order to enforce some form of
 policies, log values, or remap paths, etc. The first use was to log all
 file creations during a make install to more easily build packages. It
 was at the era where it was easy to modify the syscall table from a module,
 in kernel 2.2.

 We quickly found that beyond logging/rewriting syscall arguments, it had
 limited use cases when used as a syscall firewall because many syscalls
 are still too coarse to decide whether you want to enable/disable them.
 I remember that socketcall() and ioctl() were among the annoying ones.
 Either you totally enable or totally disable. In the end, the only valid
 use cases we found for enabling/disabling a syscall were limited to a very
 small set for debugging purposes, in order to force some application code
 to detect a missing implementation and switch to an alternative (eg: these
 days if you suspect a bug in epoll you could disable it and force the app
 to use poll instead). It was still useful to disable module loading and
 FS mounting but that was about all by then.

 All this to say that probably only a handful of tricky syscalls would
 need an on/off switch but clearly not all of them at all, so I'd rather
 add a few entries just for the relevant ones, mainly to fix compatibility
 issues and nothing more. Eg: what's the point of disabling exit(), wait(),
 kill(), fork() or getpid()... It would only increase the difficulty to
 sort out bug reports.

 Just my opinion,

Well, I would really like to have something like this around so that I
can trivially globally disable syscalls when they have security risks.
My hack[1] to disable kexec_load, for example, was terrible while I
waited for a kernel that supported the disable_kexec_load sysctl.

-Kees

[1] https://outflux.net/blog/archives/2013/12/10/live-patching-the-kernel/

-- 
Kees Cook
Chrome OS Security

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/3] x86/ldt: Make modify_ldt optional

2015-07-23 Thread Andy Lutomirski
On Thu, Jul 23, 2015 at 4:36 PM, Kees Cook keesc...@chromium.org wrote:
 On Thu, Jul 23, 2015 at 3:24 AM, Willy Tarreau w...@1wt.eu wrote:
  #ifdef CONFIG_SMP
  static void flush_ldt(void *current_mm)
  {
 @@ -254,6 +260,9 @@ asmlinkage int sys_modify_ldt(int func, void __user *ptr,
  {
 int ret = -ENOSYS;

 +   if (!sysctl_modify_ldt)
 +   return ret;
 +
 switch (func) {
 case 0:
 ret = read_ldt(ptr, bytecount);
 diff --git a/kernel/sysctl.c b/kernel/sysctl.c
 index 2082b1a..60270c6 100644
 --- a/kernel/sysctl.c
 +++ b/kernel/sysctl.c
 @@ -111,6 +111,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max;
  #ifndef CONFIG_MMU
  extern int sysctl_nr_trim_pages;
  #endif
 +#ifdef CONFIG_X86
 +extern int sysctl_modify_ldt;
 +#endif

  /* Constants used for minimum and  maximum */
  #ifdef CONFIG_LOCKUP_DETECTOR
 @@ -962,6 +965,13 @@ static struct ctl_table kern_table[] = {
 .mode   = 0644,
 .proc_handler   = proc_dointvec,
 },
 +   {
 +   .procname   = modify_ldt,
 +   .data   = sysctl_modify_ldt,
 +   .maxlen = sizeof(int),
 +   .mode   = 0644,
 +   .proc_handler   = proc_dointvec,
 +   },
  #endif
  #if defined(CONFIG_MMU)
 {

 I've been pondering something like this that is even MORE generic, for
 any syscall. Something like a syscalls directory under
 /proc/sys/kernel, with 1 entry per syscall. 0 is available, 1 is
 disabled, and -1 disabled until next boot.


It might want to be /proc/sys/kernel/syscalls/[abi]/[name], possibly
with more than just those options.  We might want disabled, returns
ENOSYS, disabled, returns EPERM, and a lock bit.

On x86 at least, the implementation's easy -- we can just poke the
syscall table.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [MINUTES] Monthly Xen.org Technical Call (2015-07-22)

2015-07-23 Thread Andrew Cooper
On 23/07/15 12:17, Roger Pau Monné wrote:
 El 23/07/15 a les 12.59, Konrad Rzeszutek Wilk ha escrit:
 [...]
 We forgot to speak about dom0. This work outlined will lay out how to do it 
 - but the pieces for dom0 are not implemented and would need work (which 
 actually may be following most of the is_pvh in the hypervisor).
 Let's do that once we have DomU nailed down and the guest ABI marked as
 stable.

 Some work will indeed be needed for Dom0, for example I would like to
 avoid using the current Dom0 builder (construct_dom0, that is suited for
 PV, but all the PVH hacks just make it almost impossible to understand)
 and start from scratch with a more sane approach.

+1 on both points.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 13:45, ian.jack...@eu.citrix.com wrote:
 Ian Jackson writes ([PATCH 03/16] xen/passthrough: extend hypercall to 
 support rdm reservation policy):
 From: Tiejun Chen tiejun.c...@intel.com
 
 This patch extends the existing hypercall to support rdm reservation policy.
 We return error or just throw out a warning message depending on whether
 the policy is strict or relaxed when reserving RDM regions in pfn space.
 Note in some special cases, e.g. add a device to hwdomain, and remove a
 device from user domain, 'relaxed' is fine enough since this is always safe
 to hwdomain.
 
 This patch breaks the build on ARM:
 
 gcc -O1 -fno-omit-frame-pointer -marm -g -fno-strict-aliasing -std=gnu99 
 -Wall 
 -Wstrict-prototypes -Wdeclaration-after-statement 
 -Wno-unused-but-set-variable 
 -Wno-unused-local-typedefs   -I/local/scratch/ianj/xen.git/xen/include 
 -fno-stack-protector -fno-exceptions -Wnested-externs -msoft-float 
 -mcpu=cortex-a15 
 -DGCC_HAS_VISIBILITY_ATTRIBUTE -fno-builtin -fno-common -Werror 
 -Wredundant-decls 
 -Wno-pointer-arith -pipe -g -D__XEN__ -include 
 /local/scratch/ianj/xen.git/xen/include/xen/config.h -nostdinc 
 -fno-optimize-sibling-calls -DVERBOSE -DHAS_PASSTHROUGH -DHAS_DEVICE_TREE 
 -DHAS_MEM_ACCESS -DHAS_PDX -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER 
 -MMD -MF 
 .smmu.o.d -c smmu.c -o smmu.o
 smmu.c: In function 'arm_smmu_reassign_dev':
 smmu.c:2712:9: error: too few arguments to function 'arm_smmu_assign_dev'
ret = arm_smmu_assign_dev(t, devfn, dev);
  ^
 smmu.c:2607:12: note: declared here
  static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
 ^
 /local/scratch/ianj/xen.git/xen/Rules.mk:168: recipe for target 'smmu.o' 
 failed
 make[6]: *** [smmu.o] Error 1
 
 I had a quick look but it's not a simple matter of plumbing through an
 additional flags parameter becuase the reassign_device method
 apparently doesn't take flags.

Considering that the parameter is ignored anyway, I'd suggest
following what was done for various of the rmrr_identity_mapping()
callers - just pass zero.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic

2015-07-23 Thread Wei Liu
On Thu, Jul 23, 2015 at 12:42:25PM +0100, Andrew Cooper wrote:
 On 23/07/15 12:38, Wei Liu wrote:
  On Thu, Jul 23, 2015 at 12:09:38PM +0100, Andrew Cooper wrote:
  During review of the libxl migration v2 series, I changes the task
  joining logic, but clearly didn't think the result through
  properly. This patch fixes several errors.
 
  1) Do not call check_all_finished() in the success cases of
  libxl__xc_domain_{save,restore}_done().  It serves no specific purpose
  as the save helper state will report itself as inactive by this point,
  and avoids triggering a second stream-completion_callback() in the case
  that write_toolstack_record()/stream_continue() record errors
  synchronously themselves.
 
  2) Only ever set stream-rc in stream_complete().  The first version of
  the migration v2 series had separate rc and joined_rc parameters, where
  this logic worked.  However when combining the two, the teardown path
  fails to trigger if stream_done() records stream-rc itself.  A side
  effect of this is that stream_done() needs to take an rc parameter.
 
  3) Avoid stacking of check_all_finished() via synchronous teardown of
  tasks.  If the _abort() functions call back synchronously,
  stream-completion_callback() ends up getting called twice, as first and
  last check_all_finished() frames observe each task being finished.
 
  Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
  CC: Ian Campbell ian.campb...@citrix.com
  CC: Ian Jackson ian.jack...@eu.citrix.com
  CC: Wei Liu wei.l...@citrix.com
 
  ---
  I found this while working to fix the toolstack record issue, but am
  posting this bugfix ahead of the other work as OSSTest has tripped over
  the issue.
  This change itself doesn't seem to have anything to do with libxc. In
  OSSTest the error that triggers this knock on effect is the failure of
  xc_map_foreign_bulk. Does that mean this patch only fix half of the
  problem seen in OSSTest?
 
 Saving to image new xl format (info 0x3/0x0/1797)
 xc: info: In experimental xc_domain_save2
 xc: info: Saving domain 2, type x86 HVM
 xc: progress: Memory: 67584/13445135%
 xc: progress: Memory: 135168/1344513   10%
 xc: progress: Memory: 201728/1344513   15%
 xc: progress: Memory: 269312/1344513   20%
 xc: progress: Memory: 336896/1344513   25%
 xc: progress: Memory: 403456/1344513   30%
 xc: progress: Memory: 471040/1344513   35%
 xc: progress: Memory: 538624/1344513   40%
 xc: progress: Memory: 605184/1344513   45%
 xc: progress: Memory: 672768/1344513   50%
 xc: progress: Memory: 740352/1344513   55%
 xc: error: xc_map_foreign_bulk: mmap failed (12 = Cannot allocate
 memory): Internal error
 xc: error: Failed to map guest pages (12 = Cannot allocate memory):
 Internal error
 xc: error: Save failed (12 = Cannot allocate memory): Internal error
 libxl: error: libxl_stream_write.c:267:libxl__xc_domain_save_done:
 saving domain: domain responded to suspend request: Cannot allocate memory
 xl: libxl_event.c:1866: libxl__ao_inprogress_gc: Assertion
 `!ao-complete' failed.
 
 
 I don't know why xc_map_foreign_bulk() failed, but at a guess I would
 day dom0 is out of RAM.
 

I don't think so. There is no OOM error in serial log and dmesg.

A similar failure is discovered in QEMU mainline test. That one is using
OVMF + QEMU upstream. So this failure is not related to firmware
or device model.

http://logs.test-lab.xenproject.org/osstest/logs/59819/test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm/info.html

It should be a real bug manifests only on 32bit toolstack + guest with
large amount of ram (5000 in this case).

Looks like we have one more bug to hunt down.

Wei.

 This patch fixes the libxl assertion.
 
 ~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 16/17] vmx: Add some scheduler hooks for VT-d posted interrupts

2015-07-23 Thread Dario Faggioli
On Thu, 2015-07-23 at 19:35 +0800, Feng Wu wrote:
 This patch adds the following arch hooks in scheduler:

So, preliminary question: does this mean that you have identified (and
fixed) the differences in behavior wrt the runstate based model, which
was causing performance issue?

I've been sidetracked a bit in looking at your previous patch, but did
found a couple of differences, which I was about to report... But I
guess that's no longer necessary, I guess, is it?

Dario

 - vmx_pre_ctx_switch_pi():
 It is called before context switch, we update the posted
 interrupt descriptor when the vCPU is preempted, go to sleep,
 or is blocked.
 
 - vmx_post_ctx_switch_pi()
 It is called after context switch, we update the posted
 interrupt descriptor when the vCPU is going to run.
 
 - arch_vcpu_wake()
 It will be called when waking up the vCPU, we update
 the posted interrupt descriptor when the vCPU is unblocked.

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/HVM: atomically access pointers in bufioreq handling

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 12:04, stefano.stabell...@eu.citrix.com wrote:
 On Thu, 23 Jul 2015, Jan Beulich wrote:
  On 22.07.15 at 19:24, stefano.stabell...@eu.citrix.com wrote:
  I'll queue this change up for the next QEMU release cycle.
 
 Thanks - v2 (with the adjusted description) just sent.
 
 It would however be nice for our variant in 4.6 to also gain this,
 perhaps independent of upstream's schedule.
 
 Is the hypervisor side going to go in 4.6? Do we need a freeze
 exception or do we consider this a bug fix?

As said in the updated description in v2, the hypervisor side went in
already some time ago.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [MINUTES] Monthly Xen.org Technical Call (2015-07-22)

2015-07-23 Thread Roger Pau Monné
El 23/07/15 a les 12.59, Konrad Rzeszutek Wilk ha escrit:
[...]
 We forgot to speak about dom0. This work outlined will lay out how to do it - 
 but the pieces for dom0 are not implemented and would need work (which 
 actually may be following most of the is_pvh in the hypervisor).

Let's do that once we have DomU nailed down and the guest ABI marked as
stable.

Some work will indeed be needed for Dom0, for example I would like to
avoid using the current Dom0 builder (construct_dom0, that is suited for
PV, but all the PVH hacks just make it almost impossible to understand)
and start from scratch with a more sane approach.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] hvmloader: don't build with __XEN_TOOLS__ defined

2015-07-23 Thread Andrew Cooper
On 22/07/15 15:38, Jan Beulich wrote:
 This being an in-guest component, it shouldn't get to see (and even
 less so use) tools-only public interfaces.

 Signed-off-by: Jan Beulich jbeul...@suse.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com


 --- a/tools/firmware/hvmloader/Makefile
 +++ b/tools/firmware/hvmloader/Makefile
 @@ -31,6 +31,9 @@ SMBIOS_REL_DATE ?= $(shell date +%m/%d/%
  
  CFLAGS += $(CFLAGS_xeninclude)
  
 +# We mustn't use tools-only public interfaces.
 +CFLAGS += -U__XEN_TOOLS__ 
 -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__
 +
  OBJS  = hvmloader.o mp_tables.o util.o smbios.o 
  OBJS += smp.o cacheattr.o xenbus.o vnuma.o
  OBJS += e820.o pci.o pir.o ctype.o





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/6] libxl: Remove linux udev rules

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 13:55 +0200, Roger Pau Monné wrote:
 El 14/07/15 a les 18.48, Ian Campbell ha escrit:
  On Tue, 2015-07-14 at 17:40 +0100, Wei Liu wrote:
   On Tue, Jul 14, 2015 at 05:35:23PM +0100, George Dunlap wrote:
On 07/14/2015 05:21 PM, Ian Campbell wrote:
 On Tue, 2015-07-14 at 12:13 -0400, Konrad Rzeszutek Wilk 
 wrote:
  On Tue, Jul 07, 2015 at 12:39:52PM +0100, Wei Liu wrote:
   On Mon, Jul 06, 2015 at 11:51:39AM +0100, George Dunlap 
   wrote:
They are no longer needed, having been replaced by a 
daemon for
driverdomains which will run scripts as necessary.
  
  This introduces an regression. The 'daemon for 
  driverdomains'
  is xenbackendd - which only gets built for NetBSD.
 
 It's not (any more), it's xl devd, which should be built 
 everywhere.

Aha!  I knew I'd seen it somewhere.

In that case, do we actually still need xenbackendd?  Or does
xenbackendd do something slightly different?

   
   That's NetBSD-only thing as far as I remember. Not sure if what 
   it does
   is NetBSD specific.
  
  My understanding was it had been subsumed by xl devd, but I don't 
  know
  why it hasn't been removed. Probably best to wait until Roger is 
  back...
 
 Some time ago I proposed to the NetBSD folks to remove xenbackendd,
 because I think it's useless now, but the reply was to leave it as 
 is:
 
 http://mail-index.netbsd.org/port-xen/2015/01/29/msg008483.html

It's redundant with xl devd, I don't see any reason for use to
continue carrying it in xen.git. If xl devd has some short coming
relative to xenbackendd then we should address that. But otherwise
if folks want to keep xenbackend despite it having been replaced then
they can carry the patch IMHO.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-23 Thread Ian Jackson
Ian Jackson writes ([PATCH 03/16] xen/passthrough: extend hypercall to support 
rdm reservation policy):
 From: Tiejun Chen tiejun.c...@intel.com
 
 This patch extends the existing hypercall to support rdm reservation policy.
 We return error or just throw out a warning message depending on whether
 the policy is strict or relaxed when reserving RDM regions in pfn space.
 Note in some special cases, e.g. add a device to hwdomain, and remove a
 device from user domain, 'relaxed' is fine enough since this is always safe
 to hwdomain.

This patch breaks the build on ARM:

gcc -O1 -fno-omit-frame-pointer -marm -g -fno-strict-aliasing -std=gnu99 -Wall 
-Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -I/local/scratch/ianj/xen.git/xen/include 
-fno-stack-protector -fno-exceptions -Wnested-externs -msoft-float 
-mcpu=cortex-a15 -DGCC_HAS_VISIBILITY_ATTRIBUTE -fno-builtin -fno-common 
-Werror -Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include 
/local/scratch/ianj/xen.git/xen/include/xen/config.h -nostdinc 
-fno-optimize-sibling-calls -DVERBOSE -DHAS_PASSTHROUGH -DHAS_DEVICE_TREE 
-DHAS_MEM_ACCESS -DHAS_PDX -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER -MMD 
-MF .smmu.o.d -c smmu.c -o smmu.o
smmu.c: In function 'arm_smmu_reassign_dev':
smmu.c:2712:9: error: too few arguments to function 'arm_smmu_assign_dev'
   ret = arm_smmu_assign_dev(t, devfn, dev);
 ^
smmu.c:2607:12: note: declared here
 static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
^
/local/scratch/ianj/xen.git/xen/Rules.mk:168: recipe for target 'smmu.o' failed
make[6]: *** [smmu.o] Error 1

I had a quick look but it's not a simple matter of plumbing through an
additional flags parameter becuase the reassign_device method
apparently doesn't take flags.


Tiejun, please can you send a patch to fix this up.  Please send just
a revised version of this patch.  I think the rest of the series will
rebase just fine on top of it.  (If I'm wrong then we will need to do
something more complex.)

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/MSI: drop bogus NULL check from pci_restore_msi_state()

2015-07-23 Thread Jan Beulich
Commit 372900faf8 (x86/MSI-X: reduce fiddling with control register
during restore) introduced de-references of pdev before it gets
checked against NULL. Instead of deferring the de-references, drop
the pointless check - both call sites do that check already.

Reported-by: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1354,9 +1354,6 @@ int pci_restore_msi_state(struct pci_dev
 if ( !use_msi )
 return -EOPNOTSUPP;
 
-if ( !pdev )
-return -EINVAL;
-
 ret = xsm_resource_setup_pci(XSM_PRIV,
 (pdev-seg  16) | (pdev-bus  8) |
 pdev-devfn);



x86/MSI: drop bogus NULL check from pci_restore_msi_state()

Commit 372900faf8 (x86/MSI-X: reduce fiddling with control register
during restore) introduced de-references of pdev before it gets
checked against NULL. Instead of deferring the de-references, drop
the pointless check - both call sites do that check already.

Reported-by: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1354,9 +1354,6 @@ int pci_restore_msi_state(struct pci_dev
 if ( !use_msi )
 return -EOPNOTSUPP;
 
-if ( !pdev )
-return -EINVAL;
-
 ret = xsm_resource_setup_pci(XSM_PRIV,
 (pdev-seg  16) | (pdev-bus  8) |
 pdev-devfn);
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 15/17] arm: add a dummy arch hooks for scheduler

2015-07-23 Thread Julien Grall
Hi,

On 23/07/15 12:35, Feng Wu wrote:
 Add a dummy arch hooks for scheduler to make the build successful
 in ARM.

I would make more sense if you introduce the dummy hooks in the patch
where you effectively use it (i.e #16).

Actually you also introduce arch_vcpu_wake for x86 there too.

 CC: Ian Campbell ian.campb...@citrix.com
 CC: Stefano Stabellini stefano.stabell...@citrix.com
 CC: Tim Deegan t...@xen.org
 Signed-off-by: Feng Wu feng...@intel.com
 ---
 v4:
 - Newly added
 
  xen/include/asm-arm/domain.h | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
 index f1a087e..7f25fd5 100644
 --- a/xen/include/asm-arm/domain.h
 +++ b/xen/include/asm-arm/domain.h
 @@ -265,6 +265,8 @@ static inline unsigned int domain_max_vcpus(const struct 
 domain *d)
  return MAX_VIRT_CPUS;
  }
  
 +static void arch_vcpu_wake(struct vcpu *v) {}
 +

Please use static inline here.

  #endif /* __ASM_DOMAIN_H__ */
  
  /*
 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/6] libxl: Remove linux udev rules

2015-07-23 Thread Roger Pau Monné
El 14/07/15 a les 18.48, Ian Campbell ha escrit:
 On Tue, 2015-07-14 at 17:40 +0100, Wei Liu wrote:
 On Tue, Jul 14, 2015 at 05:35:23PM +0100, George Dunlap wrote:
 On 07/14/2015 05:21 PM, Ian Campbell wrote:
 On Tue, 2015-07-14 at 12:13 -0400, Konrad Rzeszutek Wilk wrote:
 On Tue, Jul 07, 2015 at 12:39:52PM +0100, Wei Liu wrote:
 On Mon, Jul 06, 2015 at 11:51:39AM +0100, George Dunlap wrote:
 They are no longer needed, having been replaced by a daemon for
 driverdomains which will run scripts as necessary.

 This introduces an regression. The 'daemon for driverdomains'
 is xenbackendd - which only gets built for NetBSD.

 It's not (any more), it's xl devd, which should be built everywhere.

 Aha!  I knew I'd seen it somewhere.

 In that case, do we actually still need xenbackendd?  Or does
 xenbackendd do something slightly different?


 That's NetBSD-only thing as far as I remember. Not sure if what it does
 is NetBSD specific.
 
 My understanding was it had been subsumed by xl devd, but I don't know
 why it hasn't been removed. Probably best to wait until Roger is back...

Some time ago I proposed to the NetBSD folks to remove xenbackendd,
because I think it's useless now, but the reply was to leave it as is:

http://mail-index.netbsd.org/port-xen/2015/01/29/msg008483.html

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 12:45 +0100, Ian Jackson wrote:
 
 I had a quick look but it's not a simple matter of plumbing through 
 an additional flags parameter becuase the reassign_device method
 apparently doesn't take flags.

The vtd version (reassign_device_ownership) does:
/*
 * Any RMRR flag is always ignored when remove a device,
 * but its always safe and strict to set 0.
 */
ret = rmrr_identity_mapping(source, 0, rmrr, 0);

Where that second 0 there was the flag when it was called from
(intel_iommu_assign_device).

Given that arm_smmu_assign_dev currently ignores the flag, and that ARM
doesn't currently have RDM/RMRR at all I think just passing 0 in
arm_smmu_reassign_dev would be tolerable.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic

2015-07-23 Thread Andrew Cooper
On 23/07/15 12:38, Wei Liu wrote:
 On Thu, Jul 23, 2015 at 12:09:38PM +0100, Andrew Cooper wrote:
 During review of the libxl migration v2 series, I changes the task
 joining logic, but clearly didn't think the result through
 properly. This patch fixes several errors.

 1) Do not call check_all_finished() in the success cases of
 libxl__xc_domain_{save,restore}_done().  It serves no specific purpose
 as the save helper state will report itself as inactive by this point,
 and avoids triggering a second stream-completion_callback() in the case
 that write_toolstack_record()/stream_continue() record errors
 synchronously themselves.

 2) Only ever set stream-rc in stream_complete().  The first version of
 the migration v2 series had separate rc and joined_rc parameters, where
 this logic worked.  However when combining the two, the teardown path
 fails to trigger if stream_done() records stream-rc itself.  A side
 effect of this is that stream_done() needs to take an rc parameter.

 3) Avoid stacking of check_all_finished() via synchronous teardown of
 tasks.  If the _abort() functions call back synchronously,
 stream-completion_callback() ends up getting called twice, as first and
 last check_all_finished() frames observe each task being finished.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com

 ---
 I found this while working to fix the toolstack record issue, but am
 posting this bugfix ahead of the other work as OSSTest has tripped over
 the issue.
 This change itself doesn't seem to have anything to do with libxc. In
 OSSTest the error that triggers this knock on effect is the failure of
 xc_map_foreign_bulk. Does that mean this patch only fix half of the
 problem seen in OSSTest?

Saving to image new xl format (info 0x3/0x0/1797)
xc: info: In experimental xc_domain_save2
xc: info: Saving domain 2, type x86 HVM
xc: progress: Memory: 67584/13445135%
xc: progress: Memory: 135168/1344513   10%
xc: progress: Memory: 201728/1344513   15%
xc: progress: Memory: 269312/1344513   20%
xc: progress: Memory: 336896/1344513   25%
xc: progress: Memory: 403456/1344513   30%
xc: progress: Memory: 471040/1344513   35%
xc: progress: Memory: 538624/1344513   40%
xc: progress: Memory: 605184/1344513   45%
xc: progress: Memory: 672768/1344513   50%
xc: progress: Memory: 740352/1344513   55%
xc: error: xc_map_foreign_bulk: mmap failed (12 = Cannot allocate
memory): Internal error
xc: error: Failed to map guest pages (12 = Cannot allocate memory):
Internal error
xc: error: Save failed (12 = Cannot allocate memory): Internal error
libxl: error: libxl_stream_write.c:267:libxl__xc_domain_save_done:
saving domain: domain responded to suspend request: Cannot allocate memory
xl: libxl_event.c:1866: libxl__ao_inprogress_gc: Assertion
`!ao-complete' failed.


I don't know why xc_map_foreign_bulk() failed, but at a guess I would
day dom0 is out of RAM.

This patch fixes the libxl assertion.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 05:29 -0600, Jan Beulich wrote:
  
On 23.07.15 at 12:25, roger@citrix.com wrote:
  El 13/07/15 a les 16.01, Jan Beulich ha escrit:
  On 03.07.15 at 13:34, roger@citrix.com wrote:
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -795,6 +795,15 @@ int arch_set_info_guest(
   c.nat-fs_base || c.nat-gs_base_user)) )
 return -EINVAL;
 }
+else if ( is_hvm_domain(d) )
+{
+if ( c(ctrlreg[0]) || c(ctrlreg[1]) || c(ctrlreg[2]) 
||
+ c(ctrlreg[3]) || c(ctrlreg[4]) || c(ctrlreg[5]) 
||
+ c(ctrlreg[6]) || c(ctrlreg[7]) || c(ldt_base) ||
+ c(ldt_ents) || c(kernel_ss) || c(kernel_sp) ||
+ c(gdt_ents) )
+return -EINVAL;
+}
   
   In addition to what Andrew said - is the use of c() here really 
   correct
   considering
   
   compat = is_pv_32bit_domain(d);
   
   #define c(fld) (compat ? (c.cmp-fld) : (c.nat-fld))
   
   near the beginning of the function?
  
  I've been thinking about this. Since HVM vCPUs are always started 
  in
  32bit mode, I would ideally like to use the 
  vcpu_guest_context_x86_32_t
  struct to set the vCPU context. This means that for HVM guests 
  compat
  should always be true.
  
  The problem is that inside of the guest there's no notion of
  vcpu_guest_context_x86_32_t or vcpu_guest_context_x86_64_t, there's 
  only
  vcpu_guest_context, which defaults to the native bitness of the 
  guest.
  So if your guest is running in 64bit mode vcpu_guest_context is 
  aliased
  to vcpu_guest_context_x86_64_t by default.
  
  TBH I'm not sure about the best way to solve this. Should
  vcpu_guest_context_x86_32_t and vcpu_guest_context_x86_64_t be 
  exposed
  to the guest like it's done for the tools?
 
 Why? We're in arch_set_info_guest(), which is unreachable for a
 HVM guest on itself. Or is this in preparation of allowing HVM
 guests to use VCPUOP_initialise? In that case, exposing might be
 an option, but remember that the compat mode layout even today
 is used only for PV guests. So I'd rather avoid exposing both
 layouts to guests, and do translation instead inside the hypervisor.

On ARM we deliberately arranged that the vcpu_guest_context was the
same for both arm32 and arm64. Moving to PVH seems like an ideal
opportunity to do something similar for x86, if not by standardising on
the x86_64 layout then by declaring a new struct which can encompass
both and declaring the old ones PV legacy.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 13:19 +0100, Ian Jackson wrote:
 Chen, Tiejun writes (Re: [PATCH 03/16] xen/passthrough: extend 
 hypercall to support rdm reservation policy):
   Tiejun, please can you send a patch to fix this up.  Please send 
   just
   a revised version of this patch.  I think the rest of the series 
   will
   rebase just fine on top of it.  (If I'm wrong then we will need 
   to do
   something more complex.)
  
  Sorry to this.
  
  
   On ARM side the flag field doesn't take any affect.
  
   Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 
 OK, thanks.  I wasn't sure that just passing 0 was right.
 
 I will take Jan's and Ian's suggestions as acks for your patch, which
 looks correct to me to implement them.

Can you avoid the mention of DT in the comment please, since PCI will
eventually go that path. Something like No flags are defined for ARM
would suffice I think.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] oxenstored: link in the systemd system library

2015-07-23 Thread Jonathan Creekmore
If systemd is configured for use AND you are building oxenstored, the C
systemd library must be linked in to the systemd.cxma library.

Signed-off-by: Jonathan Creekmore jonathan.creekm...@gmail.com

---
Changed since v1:
  * Link the systemd library in to the systemd.cxma instead of the final
  binary.
---
 tools/ocaml/xenstored/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile
index d861f11..59875f7 100644
--- a/tools/ocaml/xenstored/Makefile
+++ b/tools/ocaml/xenstored/Makefile
@@ -30,6 +30,8 @@ systemd_OBJS = systemd
 systemd_C_OBJS = systemd_stubs
 OCAML_LIBRARY += systemd
 
+LIBS_systemd += $(LDFLAGS-y)
+
 OBJS = define \
stdext \
trie \
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.18 test] 59825: regressions - FAIL

2015-07-23 Thread osstest service owner
flight 59825 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59825/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-debianhvm-amd64 11 guest-saverestore fail REGR. vs. 
58581

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64 15 
rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 58558
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore fail 
baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2 
fail baseline untested
 test-armhf-armhf-xl-rtds 14 guest-start.2   fail baseline untested
 test-armhf-armhf-xl-credit2   6 xen-boot fail   like 58581
 test-armhf-armhf-xl   6 xen-boot fail   like 58581
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  like 58581
 test-armhf-armhf-xl-xsm   6 xen-boot fail   like 58581
 test-armhf-armhf-libvirt-xsm  6 xen-boot fail   like 58581
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58581
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58581

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass

version targeted for testing:
 linux22a6cbf9f36ee3ae2878efcbdde33e6ca00b9c4b
baseline version:
 linuxd048c068d00da7d4cfa5ea7651933b99026958cf

Last test of basis58581  2015-06-15 09:42:22 Z   38 days
Failing since 58976  2015-06-29 19:43:23 Z   23 days   33 attempts
Testing same since59825  2015-07-22 17:19:07 Z0 days1 attempts


346 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmfail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  fail
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd 

Re: [Xen-devel] [v4 16/17] vmx: Add some scheduler hooks for VT-d posted interrupts

2015-07-23 Thread Wu, Feng


 -Original Message-
 From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
 Sent: Thursday, July 23, 2015 8:50 PM
 To: Wu, Feng
 Cc: xen-devel@lists.xen.org; Tian, Kevin; Keir Fraser; George Dunlap; Andrew
 Cooper; Jan Beulich
 Subject: Re: [Xen-devel] [v4 16/17] vmx: Add some scheduler hooks for VT-d
 posted interrupts
 
 On Thu, 2015-07-23 at 19:35 +0800, Feng Wu wrote:
  This patch adds the following arch hooks in scheduler:
 
 So, preliminary question: does this mean that you have identified (and
 fixed) the differences in behavior wrt the runstate based model, which
 was causing performance issue?

Yes, the issue has been fixed in this version.

 
 I've been sidetracked a bit in looking at your previous patch, but did
 found a couple of differences, which I was about to report... But I
 guess that's no longer necessary, I guess, is it?

Thanks for your effort, Dario! Maybe you can give a review about this
new version, if you have time.

Thanks,
Feng

 
 Dario
 
  - vmx_pre_ctx_switch_pi():
  It is called before context switch, we update the posted
  interrupt descriptor when the vCPU is preempted, go to sleep,
  or is blocked.
 
  - vmx_post_ctx_switch_pi()
  It is called after context switch, we update the posted
  interrupt descriptor when the vCPU is going to run.
 
  - arch_vcpu_wake()
  It will be called when waking up the vCPU, we update
  the posted interrupt descriptor when the vCPU is unblocked.
 
 --
 This happens because I choose it to happen! (Raistlin Majere)
 -
 Dario Faggioli, Ph.D, http://about.me/dario.faggioli
 Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch V4 1/3] usb: Add Xen pvUSB protocol description

2015-07-23 Thread Juergen Gross

On 07/23/2015 09:08 PM, Greg KH wrote:

On Thu, Jul 23, 2015 at 08:46:17AM +0200, Juergen Gross wrote:

On 07/23/2015 06:36 AM, Greg KH wrote:

On Thu, Jul 23, 2015 at 06:04:39AM +0200, Juergen Gross wrote:

On 07/23/2015 01:46 AM, Greg KH wrote:

On Tue, Jun 23, 2015 at 08:53:23AM +0200, Juergen Gross wrote:

Add the definition of pvUSB protocol used between the pvUSB frontend in
a Xen domU and the pvUSB backend in a Xen driver domain (usually Dom0).

This header was originally provided by Fujitsu for Xen based on Linux
2.6.18.

Changes are:
- adapt to Linux style guide

Signed-off-by: Juergen Gross jgr...@suse.com
---
  include/xen/interface/io/usbif.h | 252 +++


Why is this a different interface than the existing ones we have today
(i.e. usbip?)  Where is it documented?  Do the Xen developers /


The interface definition is living in the Xen git repository for several
years now:

git://xenbits.xen.org/xen.git - xen/include/public/io/usbif.h


That's header file, not a document describing the api here.


I suppose you want to tell me I should add something like:

Documentation/DocBook/usb/API-struct-urb.html


Somewhere that people can refer to that describes this public-facing API
that must not ever be broken or changed.  If you want to put it in a
documentation file, or a .h file, I don't care.



It is used e.g. in SUSE's xen kernel since 2.6.18.


I am very aware of the amount of Xen crap in SuSE's kernel, don't use
that as an excuse for me to merge it to mainline :)


:-)

Wasn't meant as an excuse, just a hint why the interface can't be the
same as for usbip. We have to ensure compatibility with those kernels


This shouldn't be a kernel/kernel compability issue, as the api talks
between Xen and the OS, not between different OSs, right?


Depends on where the backend is living. It's the backend the frontend is
talking to.

There is a backend in SUSE's kernels up to SLE12. So compatibility is
to be maintained to those kernels.

Looks as if in future there will be one in qemu.


and possibly other operating systems (BSD?, Windows?) which already
might be using pvUSB with a Dom0 based on the SUSE xen kernel.


Are there other operating system drivers today that use this API?  Is
this an API in the Xen core today that we have to support?


Yes.


Some more background / descriptions would be nice to have.


I guess a documentation file giving a brief explanation about the
interfaces of Xen wouldn't be a bad idea. This could avoid discussions
like this.

It shouldn't define each interface, but the classes of interfaces which
are existing (between kernel and hypervisor, frontends and backends)
and the stability requirements. Headers like the one we are discussing
here could then refer to this document.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 04/17] iommu: Add iommu_intpost to control VT-d Posted-Interrupts feature

2015-07-23 Thread Wu, Feng


 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: Thursday, July 23, 2015 10:05 PM
 To: Wu, Feng; xen-devel@lists.xen.org
 Cc: Tian, Kevin; Jan Beulich
 Subject: Re: [Xen-devel] [v4 04/17] iommu: Add iommu_intpost to control VT-d
 Posted-Interrupts feature
 
 On 23/07/15 15:01, Andrew Cooper wrote:
  On 23/07/15 12:35, Feng Wu wrote:
  VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
  With VT-d Posted-Interrupts enabled, external interrupts from
  direct-assigned devices can be delivered to guests without VMM
  intervention when guest is running in non-root mode.
 
  This patch adds variable 'iommu_intpost' to control whether enable VT-d
  posted-interrupt or not in the generic IOMMU code.
 
  CC: Jan Beulich jbeul...@suse.com
  CC: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Feng Wu feng...@intel.com
  Reviewed-by: Kevin Tian kevin.t...@intel.com
  Please patch docs/misc/xen-command-line.markdown as you are
 
  ---
  v4:
  - No changes
 
  v3:
  - Remove pointless initializer for 'iommu_intpost'.
  - Some adjustment for if no intremap then no intpost logic.
  * For parse_iommu_param(), move it to the end of the function,
so we don't need to add the some logic when introduing the
new kernel parameter 'intpost' in later patch.
  * Add this logic in iommu_setup() after iommu_hardware_setup()
is called.
 
   xen/drivers/passthrough/iommu.c | 17 -
   xen/include/xen/iommu.h |  2 +-
   2 files changed, 17 insertions(+), 2 deletions(-)
 
  diff --git a/xen/drivers/passthrough/iommu.c
 b/xen/drivers/passthrough/iommu.c
  index 06cb38f..536b69f 100644
  --- a/xen/drivers/passthrough/iommu.c
  +++ b/xen/drivers/passthrough/iommu.c
  @@ -39,6 +39,7 @@ static void iommu_dump_p2m_table(unsigned char
 key);
*   no-snoop   Disable VT-d Snoop Control
*   no-qinval  Disable VT-d Queued Invalidation
*   no-intremapDisable VT-d Interrupt Remapping
  + *   no-intpost Disable VT-d Interrupt posting
*/
   custom_param(iommu, parse_iommu_param);
   bool_t __initdata iommu_enable = 1;
  @@ -51,6 +52,14 @@ bool_t __read_mostly iommu_passthrough;
   bool_t __read_mostly iommu_snoop = 1;
   bool_t __read_mostly iommu_qinval = 1;
   bool_t __read_mostly iommu_intremap = 1;
  +
  +/*
  + * In the current implementation of VT-d posted interrupts, in some
 extreme
  + * cases, the per cpu list which saves the blocked vCPU will be very long,
  + * and this will affect the interrupt latency, so let this feature off by
  + * default until we find a good solution to resolve it.
  + */
  +bool_t __read_mostly iommu_intpost;
   bool_t __read_mostly iommu_hap_pt_share = 1;
   bool_t __read_mostly iommu_debug;
   bool_t __read_mostly amd_iommu_perdev_intremap = 1;
  @@ -112,6 +121,9 @@ static void __init parse_iommu_param(char *s)
 
   s = ss + 1;
   } while ( ss );
  +
  +if ( !iommu_intremap )
  +iommu_intpost = 0;
  This hunk is still wrong, as I said in v3.
 
 Furthermore, you appear to have no way in v4 to enable intpost on the
 Xen command line.

Oh, sorry, since this feature is required default off, I forget to send out the 
last
patch, which add the Xen command line. I will re-add it in the next version.

Thanks,
Feng

 
 ~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 59836: regressions - FAIL

2015-07-23 Thread osstest service owner
flight 59836 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59836/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm   13 guest-saverestore fail REGR. vs. 59254
 test-amd64-i386-xl   13 guest-saverestore fail REGR. vs. 59254
 test-amd64-i386-pair   21 guest-migrate/src_host/dst_host fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 
59254
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate 
fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
like 59423-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass

version targeted for testing:
 linuxc5dfd654d0ec0a28fe81e7bd4d4fd984a9855e09
baseline version:
 linux45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis59254  2015-07-09 04:20:48 Z   14 days
Failing since 59348  2015-07-10 04:24:05 Z   13 days   14 attempts
Testing same since59836  2015-07-23 10:55:54 Z0 days1 attempts


452 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   fail
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmfail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   fail
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 

Re: [Xen-devel] [v4 15/17] arm: add a dummy arch hooks for scheduler

2015-07-23 Thread Wu, Feng


 -Original Message-
 From: Julien Grall [mailto:julien.gr...@citrix.com]
 Sent: Thursday, July 23, 2015 7:54 PM
 To: Wu, Feng; xen-devel@lists.xen.org
 Cc: Stefano Stabellini; Ian Campbell; Tim Deegan
 Subject: Re: [Xen-devel] [v4 15/17] arm: add a dummy arch hooks for scheduler
 
 Hi,
 
 On 23/07/15 12:35, Feng Wu wrote:
  Add a dummy arch hooks for scheduler to make the build successful
  in ARM.
 
 I would make more sense if you introduce the dummy hooks in the patch
 where you effectively use it (i.e #16).
 
 Actually you also introduce arch_vcpu_wake for x86 there too.

I split it just because this is the only changes to ARM. I will make them
one patch any way.

Thanks,
Feng

 
  CC: Ian Campbell ian.campb...@citrix.com
  CC: Stefano Stabellini stefano.stabell...@citrix.com
  CC: Tim Deegan t...@xen.org
  Signed-off-by: Feng Wu feng...@intel.com
  ---
  v4:
  - Newly added
 
   xen/include/asm-arm/domain.h | 2 ++
   1 file changed, 2 insertions(+)
 
  diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
  index f1a087e..7f25fd5 100644
  --- a/xen/include/asm-arm/domain.h
  +++ b/xen/include/asm-arm/domain.h
  @@ -265,6 +265,8 @@ static inline unsigned int domain_max_vcpus(const
 struct domain *d)
   return MAX_VIRT_CPUS;
   }
 
  +static void arch_vcpu_wake(struct vcpu *v) {}
  +
 
 Please use static inline here.
 
   #endif /* __ASM_DOMAIN_H__ */
 
   /*
 
 
 Regards,
 
 --
 Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch V4 1/3] usb: Add Xen pvUSB protocol description

2015-07-23 Thread Pasi Kärkkäinen
On Thu, Jul 23, 2015 at 12:08:01PM -0700, Greg KH wrote:
 
 Somewhere that people can refer to that describes this public-facing API
 that must not ever be broken or changed.  If you want to put it in a
 documentation file, or a .h file, I don't care.
 
  It is used e.g. in SUSE's xen kernel since 2.6.18.
  
  I am very aware of the amount of Xen crap in SuSE's kernel, don't use
  that as an excuse for me to merge it to mainline :)
  
  :-)
  
  Wasn't meant as an excuse, just a hint why the interface can't be the
  same as for usbip. We have to ensure compatibility with those kernels
 
 This shouldn't be a kernel/kernel compability issue, as the api talks
 between Xen and the OS, not between different OSs, right?
 
  and possibly other operating systems (BSD?, Windows?) which already
  might be using pvUSB with a Dom0 based on the SUSE xen kernel.
 
 Are there other operating system drivers today that use this API?  Is
 this an API in the Xen core today that we have to support?
 
 Some more background / descriptions would be nice to have.


For example Xen GPLPV drivers for Windows do have PVUSB frontend driver..


-- Pasi

 
 thanks,
 
 greg k-h
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 04/17] iommu: Add iommu_intpost to control VT-d Posted-Interrupts feature

2015-07-23 Thread Andrew Cooper
On 23/07/15 12:35, Feng Wu wrote:
 VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
 With VT-d Posted-Interrupts enabled, external interrupts from
 direct-assigned devices can be delivered to guests without VMM
 intervention when guest is running in non-root mode.

 This patch adds variable 'iommu_intpost' to control whether enable VT-d
 posted-interrupt or not in the generic IOMMU code.

 CC: Jan Beulich jbeul...@suse.com
 CC: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Feng Wu feng...@intel.com
 Reviewed-by: Kevin Tian kevin.t...@intel.com

Please patch docs/misc/xen-command-line.markdown as you are

 ---
 v4:
 - No changes

 v3:
 - Remove pointless initializer for 'iommu_intpost'.
 - Some adjustment for if no intremap then no intpost logic.
 * For parse_iommu_param(), move it to the end of the function,
   so we don't need to add the some logic when introduing the
   new kernel parameter 'intpost' in later patch.
 * Add this logic in iommu_setup() after iommu_hardware_setup()
   is called.

  xen/drivers/passthrough/iommu.c | 17 -
  xen/include/xen/iommu.h |  2 +-
  2 files changed, 17 insertions(+), 2 deletions(-)

 diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
 index 06cb38f..536b69f 100644
 --- a/xen/drivers/passthrough/iommu.c
 +++ b/xen/drivers/passthrough/iommu.c
 @@ -39,6 +39,7 @@ static void iommu_dump_p2m_table(unsigned char key);
   *   no-snoop   Disable VT-d Snoop Control
   *   no-qinval  Disable VT-d Queued Invalidation
   *   no-intremapDisable VT-d Interrupt Remapping
 + *   no-intpost Disable VT-d Interrupt posting
   */
  custom_param(iommu, parse_iommu_param);
  bool_t __initdata iommu_enable = 1;
 @@ -51,6 +52,14 @@ bool_t __read_mostly iommu_passthrough;
  bool_t __read_mostly iommu_snoop = 1;
  bool_t __read_mostly iommu_qinval = 1;
  bool_t __read_mostly iommu_intremap = 1;
 +
 +/*
 + * In the current implementation of VT-d posted interrupts, in some extreme
 + * cases, the per cpu list which saves the blocked vCPU will be very long,
 + * and this will affect the interrupt latency, so let this feature off by
 + * default until we find a good solution to resolve it.
 + */
 +bool_t __read_mostly iommu_intpost;
  bool_t __read_mostly iommu_hap_pt_share = 1;
  bool_t __read_mostly iommu_debug;
  bool_t __read_mostly amd_iommu_perdev_intremap = 1;
 @@ -112,6 +121,9 @@ static void __init parse_iommu_param(char *s)
  
  s = ss + 1;
  } while ( ss );
 +
 +if ( !iommu_intremap )
 +iommu_intpost = 0;

This hunk is still wrong, as I said in v3.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] oxenstored: link in the systemd system library

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 09:03 -0500, Jonathan Creekmore wrote:
  
  On Jul 23, 2015, at 9:00 AM, Ian Campbell ian.campb...@citrix.com 
  wrote:
  
  On Thu, 2015-07-23 at 08:40 -0500, Jonathan Creekmore wrote:
   If systemd is configured for use AND you are building oxenstored, 
   the 
   C
   systemd library must be linked in to the systemd.cxma library.
   
   Signed-off-by: Jonathan Creekmore jonathan.creekm...@gmail.com
  
  Acked-by: Ian Campbell ian.campb...@citrix.com
  
  OOI what was the failure mode for you without this? Runtime rather 
  than
  build time I suspect?
 
 No, it was build-time. It failed to link the oxenstored binary due to 
 missing symbols. 

Interesting, that (surprisingly) works for me. I suspect differing
toolchain versions then!

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xenbackendd: remove xenbackendd

2015-07-23 Thread Roger Pau Monne
The functionality provided by xenbackendd has been integrated into the xl
toolstack under the devd command on all platforms, because of that it no
longer makes sense to maintain xenbackendd.

Init scripts have also been provided for all platforms in order to launch xl
devd in the background, mimicking the functionality provided by xenbackendd.
Furthermore, xenbackendd is not started by default on NetBSD since a couple
of releases ago (4.2 was the first release to disable the automatic startup
of xenbackendd in xencommons).

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: port-...@netbsd.org
---
 tools/Makefile |   1 -
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in |   6 -
 tools/hotplug/NetBSD/block |   2 +-
 tools/hotplug/NetBSD/rc.d/xencommons.in|   2 -
 tools/hotplug/NetBSD/vif-bridge|   2 +-
 tools/hotplug/NetBSD/vif-ip|   2 +-
 tools/xenbackendd/Makefile |  41 ---
 tools/xenbackendd/xenbackendd.c| 325 -
 8 files changed, 3 insertions(+), 378 deletions(-)
 delete mode 100644 tools/xenbackendd/Makefile
 delete mode 100644 tools/xenbackendd/xenbackendd.c

diff --git a/tools/Makefile b/tools/Makefile
index 45cb4b2..633fb38 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -17,7 +17,6 @@ SUBDIRS-y += xenmon
 SUBDIRS-y += xenstat
 SUBDIRS-$(CONFIG_Linux) += memshr 
 SUBDIRS-$(CONFIG_BLKTAP2) += blktap2
-SUBDIRS-$(CONFIG_NetBSD) += xenbackendd
 SUBDIRS-y += libfsimage
 SUBDIRS-$(CONFIG_Linux) += libvchan
 
diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in 
b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index f0fa98d..f6f35cd 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -40,11 +40,5 @@ XENSTORED_ARGS=
 # Running xenstored on XENSTORED_ROOTDIR
 #XENSTORED_ROOTDIR=@XEN_LIB_STORED@
 
-## Type: string
-## Default: Not defined, xenbackendd debug mode off
-#
-# Running xenbackendd in debug mode
-#XENBACKENDD_DEBUG=[yes|on|1]
-
 # qemu path
 #QEMU_XEN=@LIBEXEC_BIN@/qemu-system-i386
diff --git a/tools/hotplug/NetBSD/block b/tools/hotplug/NetBSD/block
index 32c20b6..fc8f668 100644
--- a/tools/hotplug/NetBSD/block
+++ b/tools/hotplug/NetBSD/block
@@ -1,7 +1,7 @@
 #!/bin/sh -e
 
 # $NetBSD: block-nbsd,v 1.1.1.1 2008/08/07 20:26:57 cegger Exp $
-# Called by xenbackendd
+#
 # Usage: block xsdir_backend_path state
 
 DIR=$(dirname $0)
diff --git a/tools/hotplug/NetBSD/rc.d/xencommons.in 
b/tools/hotplug/NetBSD/rc.d/xencommons.in
index d7552cd..9f45cfd 100644
--- a/tools/hotplug/NetBSD/rc.d/xencommons.in
+++ b/tools/hotplug/NetBSD/rc.d/xencommons.in
@@ -22,8 +22,6 @@ required_files=/kern/xen/privcmd
 
 XENSTORED_PIDFILE=/var/run/xenstored.pid
 XENCONSOLED_PIDFILE=/var/run/xenconsoled.pid
-XENBACKENDD_PIDFILE=/var/run/xenbackendd.pid
-#XENBACKENDD_DEBUG=1
 #XENCONSOLED_TRACE=/var/log/xen/xenconsole-trace.log
 #XENSTORED_TRACE=/var/log/xen/xenstore-trace.log
 
diff --git a/tools/hotplug/NetBSD/vif-bridge b/tools/hotplug/NetBSD/vif-bridge
index b58e922..f12edf0 100644
--- a/tools/hotplug/NetBSD/vif-bridge
+++ b/tools/hotplug/NetBSD/vif-bridge
@@ -1,7 +1,7 @@
 #!/bin/sh -e
 
 # $NetBSD: vif-bridge-nbsd,v 1.1.1.1 2008/08/07 20:26:57 cegger Exp $
-# Called by xenbackendd
+#
 # Usage: vif-bridge xsdir_backend_path state
 
 DIR=$(dirname $0)
diff --git a/tools/hotplug/NetBSD/vif-ip b/tools/hotplug/NetBSD/vif-ip
index 83cbfe2..44200e3 100644
--- a/tools/hotplug/NetBSD/vif-ip
+++ b/tools/hotplug/NetBSD/vif-ip
@@ -1,7 +1,7 @@
 #!/bin/sh -e
 
 # $NetBSD: vif-ip-nbsd,v 1.1.1.1 2008/08/07 20:26:57 cegger Exp $
-# Called by xenbackendd
+#
 # Usage: vif-ip xsdir_backend_path state
 
 DIR=$(dirname $0)
diff --git a/tools/xenbackendd/Makefile b/tools/xenbackendd/Makefile
deleted file mode 100644
index f52be74..000
--- a/tools/xenbackendd/Makefile
+++ /dev/null
@@ -1,41 +0,0 @@
-# Copyright (c) 2009 Advanced Micro Devices, Inc.
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; under version 2 of the License.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-
-XEN_ROOT=$(CURDIR)/../..
-include $(XEN_ROOT)/tools/Rules.mk
-
-CFLAGS  += -Werror
-CFLAGS  += $(CFLAGS_libxenstore)
-CPPFLAGS += -DXEN_SCRIPT_DIR=\$(XEN_SCRIPT_DIR)\
-LDLIBS  += $(LDLIBS_libxenstore)
-
-.PHONY: all
-all: build
-
-.PHONY: build
-build: 

Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Ian Campbell
Forgot to CC the list, now done...

On Thu, 2015-07-23 at 14:56 +0100, Ian Campbell wrote:
 The version of OVMF in 4.5 (and presumably earlier) doesn't build with
 the gcc in modern Linux distros (I noticed it with 4.9 from
 Debian/Jessie).
 
 I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
 cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable branches. That
 version is the one currently in master.
 
 One aspect of the specific failure I've tripped over (Debian Jessie) is
 that it will be a blocker for updating osstest to using the new version
 of Debian on the test hosts, which in turn is a prerequisite for arm64
 support.
 
 Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:40, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 2:43 AM
 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
 +struct domain *d = v-domain;
 +u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
 +
 +if ( !cpu_has_vmx_vmfunc )
 +return;
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 +
 +vmx_vmcs_enter(v);
 +
 +if ( !d-is_dying  altp2m_active(d) )
 +{
 +v-arch.hvm_vmx.secondary_exec_control |= mask;
 +__vmwrite(VM_FUNCTION_CONTROL,
VMX_VMFUNC_EPTP_SWITCHING);
 +__vmwrite(EPTP_LIST_ADDR,
 + virt_to_maddr(d-arch.altp2m_eptp));
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +{
 +p2m_type_t t;
 +mfn_t mfn;
 +
 +mfn = get_gfn_query_unlocked(d,
 + gfn_x(vcpu_altp2m(v).veinfo_gfn), t);
 +
 +if ( mfn_x(mfn) != INVALID_MFN )
 +__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) 
 + PAGE_SHIFT);

Considering that the VMCS field holds a byte-aligned address, why do you
have the (later introduced) hvmop specify a GFN instead of a GPA?

 
 The SDM states:
  If the EPT-violation #VE VM-execution control is 1, the 
 virtualization-exception information address must
 satisfy the following checks:
 - Bits 11:0 of the address must be 0.
 - The address must not set any bits beyond the processor's physical-address 
 width.

Ah, interesting. Knowing what to search for, I was able to find this.
But to be honest, it should also be stated in the section talking
about #VE, not just in the one talking about checks being done.

With that, just like for the other patch my ack depends on the
promise to deal with all the remaining issues.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 7:54 AM

 On 23.07.15 at 16:36, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 2:22 AM
 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 +out:


 Missed sorry.


Just to repeat - labels should be indented by at least one space.

But you realize this isn't the only case, and it looked like you adjusted only
exactly the one place where the comment was made?


I agree these kinds of things are hard to catch - we did go in and scan our v7 
for comment styles, indentation, case indentation, spaces in if conditional 
expressions etc,
But of course we missed some and this could be addressed - Andrew was also 
suggesting some sort of tool to do this would be good to use.

thanks
Ravi

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Fabio Fantoni

Il 23/07/2015 16:51, Jan Beulich ha scritto:

On 23.07.15 at 16:43, ian.campb...@citrix.com wrote:

On Thu, 2015-07-23 at 08:26 -0600, Jan Beulich wrote:

On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:

The version of OVMF in 4.5 (and presumably earlier) doesn't build
with
the gcc in modern Linux distros (I noticed it with 4.9 from
Debian/Jessie).

I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable branches.
That
version is the one currently in master.

To be honest I didn't even know we support use of OVMF in any
released version (nor in unstable). When did this change?

It was present in 4.3. I don't think we build it by default since its
toolchain is a bit picky, but we turn it on and test it in osstest from
4.4 onwards.

You can ask for it with bios=ovmf in your hvm guest's config.

Well, that means it can be used by the adventurous, but it doesn't
mean we consider it supported.

Jan




Even if used only by the adventurous left it unable to build with newer 
gcc don't seems very good, even if the adventurous should be able to 
change Config.mk to use update ovmf git that don't fail the build.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [MINUTES] Monthly Xen.org Technical Call (2015-07-22)

2015-07-23 Thread David Vrabel
On 23/07/15 10:08, Ian Campbell wrote:
 
   * For Linux:
   * The Xen entrypoint shall point to a stub in
 the same vein of the UEFI stub.
   * The stub will set up a basic initial set of
 page tables, fills in bootinfo and then jump to
 the native 32- or 64-bit entry point as
 appropriate. 
   * The stub can/should live in linux.git
 (presumably arch/x86/xen) but should be self
 -contained and isolated from the main body of
 Linux code. It does setup to impedance match
 the Xen entry point to the Linux native entry
 point.
   * Other things (e.g. lack of ACPI) should be
 addressed by fixing the native Linux entry path
 to be able to cope without them. Likewise
 installing PV hooks may need hypervisor
 detection to be moved earlier in the native
 boot path.

Agreed.

The goal here is to use as much of the native code paths as possible to
reduce the chance of Xen support being broken by x86 changes.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/4] x86/compat: Test both PV and PVH guests for compat mode

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:13, ian.campb...@citrix.com wrote:
 On Thu, 2015-07-23 at 08:07 -0600, Jan Beulich wrote:
 what we called no-pm on yesterday's call?
 
 FYI it was no-dm (no device model).

Of course it was - fingers typing disconnected from brain, I guess.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:
 The version of OVMF in 4.5 (and presumably earlier) doesn't build with
 the gcc in modern Linux distros (I noticed it with 4.9 from
 Debian/Jessie).
 
 I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
 cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable branches. That
 version is the one currently in master.

To be honest I didn't even know we support use of OVMF in any
released version (nor in unstable). When did this change?

Jan

 One aspect of the specific failure I've tripped over (Debian Jessie) is
 that it will be a blocker for updating osstest to using the new version
 of Debian on the test hosts, which in turn is a prerequisite for arm64
 support.
 
 Ian.




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 3:22 AM

 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 Signed-off-by: Ed White edmund.h.wh...@intel.com

 Acked-by: Jan Beulich jbeul...@suse.com

And I have to withdraw this ack pending clarification of (and perhaps
adjustment to) the #VE info address interface.


Could we have the ack back :-) I clarified the #VE info address interface in 
the other email - repeating here:

 If the EPT-violation #VE VM-execution control is 1, the 
virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address 
width.


 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
  return rc;
  }

 +static int do_altp2m_op(
 +XEN_GUEST_HANDLE_PARAM(void) arg) {
 +struct xen_hvm_altp2m_op a;
 +struct domain *d = NULL;
 +int rc = 0;
 +
 +if ( !hvm_altp2m_supported() )
 +return -EOPNOTSUPP;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +if ( a.pad1 || a.pad2 ||
 + (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
 + (a.cmd  HVMOP_altp2m_get_domain_state) ||
 + (a.cmd  HVMOP_altp2m_change_gfn) )

I'm afraid such a change invalidates any earlier ack, even if ti is correct. 
Instead
of this, why don't you start numbering of the sub-ops at zero? Or if you really
have a reason to start at 1, why not simply check a.cmd against zero (without
using any particular sub-op value)? And then it escapes me why this can't be
handled in a default case in the switch statement below anyway.

Hmm - is that a requirement per se? we are consistently checking per the sub-op 
definition we have.
Would like this to be considered as is. 

As I said in the cover letter we have constraints on how much more we can do 
this week now - 
so requesting the maintainers to accept v7 with the review comments you have on 
those recorded as pending to be addressed by us.

Thanks much!
Ravi




Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Help

2015-07-23 Thread Akash Talole
Hello,

Issue in tapdisk-vbd.c

Why td_queue_write(parent,treq); is called in  static void
__tapdisk_vbd_reissue_td_request(td_vbd_t *vbd,td_image_t *image,
td_request_t treq) function as we can't write in parent vhd because it is
read only.

I have attached the code of tapdisk-vbd.c and block-vhd.c.

And please tell me about how to write into child vhd after reading from
parent vhd in block-vhd.c in case of VHD_BM_BIT_CLEAR in function
vhd_queue_read().

Thanks.
/*
 * Copyright (C) Citrix Systems Inc.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; version 2.1 only
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */

#ifdef HAVE_CONFIG_H
#include config.h
#endif

#include stdio.h
#include errno.h
#include fcntl.h
#include regex.h
#include unistd.h
#include stdlib.h
#include libgen.h
#include sys/mman.h
#include sys/ioctl.h
#include sys/stat.h
#include sys/types.h

#include debug.h
#include libvhd.h
#include tapdisk-blktap.h
#include tapdisk-image.h
#include tapdisk-driver.h
#include tapdisk-server.h
#include tapdisk-vbd.h
#include tapdisk-disktype.h
#include tapdisk-interface.h
#include tapdisk-stats.h
#include tapdisk-storage.h
#include tapdisk-nbdserver.h
#include td-stats.h
#include tapdisk-utils.h
#include md5.h

#define DBG(_level, _f, _a...) tlog_write(_level, _f, ##_a)
#define ERR(_err, _f, _a...) tlog_error(_err, _f, ##_a)

#define INFO(_f, _a...)tlog_syslog(TLOG_INFO, vbd:  _f, ##_a)
#define ERROR(_f, _a...)   tlog_syslog(TLOG_WARN, vbd:  _f, ##_a)

#define TD_VBD_EIO_RETRIES  10
#define TD_VBD_EIO_SLEEP1
#define TD_VBD_WATCHDOG_TIMEOUT 10

static void tapdisk_vbd_complete_vbd_request(td_vbd_t *, td_vbd_request_t *);
static int  tapdisk_vbd_queue_ready(td_vbd_t *);
static void tapdisk_vbd_check_queue_state(td_vbd_t *);

/*
 * initialization
 */

static void
tapdisk_vbd_mark_progress(td_vbd_t *vbd)
{
	gettimeofday(vbd-ts, NULL);
}

td_vbd_t*
tapdisk_vbd_create(uint16_t uuid)
{
	td_vbd_t *vbd;

	vbd = calloc(1, sizeof(td_vbd_t));
	if (!vbd) {
		EPRINTF(failed to allocate tapdisk state\n);
		return NULL;
	}

shm_init(vbd-rrd.shm);

	vbd-uuid= uuid;
	vbd-req_timeout = TD_VBD_REQUEST_TIMEOUT;

	INIT_LIST_HEAD(vbd-images);
	INIT_LIST_HEAD(vbd-new_requests);
	INIT_LIST_HEAD(vbd-pending_requests);
	INIT_LIST_HEAD(vbd-failed_requests);
	INIT_LIST_HEAD(vbd-completed_requests);
	INIT_LIST_HEAD(vbd-next);
INIT_LIST_HEAD(vbd-rings);
INIT_LIST_HEAD(vbd-dead_rings);
	tapdisk_vbd_mark_progress(vbd);

	return vbd;
}

int
tapdisk_vbd_initialize(int rfd, int wfd, uint16_t uuid)
{
	td_vbd_t *vbd;

	vbd = tapdisk_server_get_vbd(uuid);
	if (vbd) {
		EPRINTF(duplicate vbds! %u\n, uuid);
		return -EEXIST;
	}

	vbd = tapdisk_vbd_create(uuid);

	tapdisk_server_add_vbd(vbd);

	return 0;
}

static inline void
tapdisk_vbd_add_image(td_vbd_t *vbd, td_image_t *image)
{
	list_add_tail(image-next, vbd-images);
}

static inline int
tapdisk_vbd_is_last_image(td_vbd_t *vbd, td_image_t *image)
{
	return list_is_last(image-next, vbd-images);
}

static inline td_image_t *
tapdisk_vbd_first_image(td_vbd_t *vbd)
{
	td_image_t *image = NULL;
	if (!list_empty(vbd-images))
		image = list_entry(vbd-images.next, td_image_t, next);
	return image;
}

static inline td_image_t *
tapdisk_vbd_last_image(td_vbd_t *vbd)
{
	td_image_t *image = NULL;
	if (!list_empty(vbd-images))
		image = list_entry(vbd-images.prev, td_image_t, next);
	return image;
}

static inline td_image_t *
tapdisk_vbd_next_image(td_image_t *image)
{
	return list_entry(image-next.next, td_image_t, next);
}

static int
tapdisk_vbd_validate_chain(td_vbd_t *vbd)
{
	return tapdisk_image_validate_chain(vbd-images);
}

static int
vbd_stats_destroy(td_vbd_t *vbd) {

int err = 0;

ASSERT(vbd);

err = shm_destroy(vbd-rrd.shm);
if (unlikely(err)) {
EPRINTF(failed to destroy RRD file: %s\n, strerror(err));
goto out;
}

free(vbd-rrd.shm.path);
vbd-rrd.shm.path = NULL;

out:
return -err;
}

static int
vbd_stats_create(td_vbd_t *vbd) {

int err;

ASSERT(vbd);

	err = mkdir(/dev/shm/metrics, S_IRUSR | S_IWUSR);
	if (likely(err)) {
err = errno;
if (unlikely(err != EEXIST))
		goto out;
else
err = 0;
}

/*
 * FIXME Rename this to something like vbd3-domid-devid. Consider
 * consolidating this with the io_ring shared memory file. Check if blkback
 * exports the same information 

Re: [Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Andrew Cooper
On 23/07/15 12:35, Feng Wu wrote:
 +GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, p);
 +
 +old_ire = new_ire = *p;
 +
 +/* Setup/Update interrupt remapping table entry. */
 +setup_posted_irte(new_ire, pi_desc, gvec);
 +ret = cmpxchg16b(p, old_ire, new_ire);
 +
 +ASSERT(ret == *(__uint128_t *)old_ire);

This cannot be correct.  Either the cmpxchg() is required and you must
cope with it failing, or the cmpxchg() is not required and this should
be a plain write.

It looks as if the cmpxchg() is required.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic

2015-07-23 Thread Ian Jackson
Andrew Cooper writes (Re: [PATCH] tools/libxl: Fixes to stream v2 task joining 
logic):

 Migration v2 only ever has 1024 guest pages ever mapped at once (see
 xc_sr_save.c:write_batch() ), and doesn't leak the mapping, which means
 that the kernel decided that it was out of memory.

Maybe it was being asked for too much.  I don't think this can really
be an OOM condition in the debianhvm test.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/4] x86/compat: Test both PV and PVH guests for compat mode

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 08:07 -0600, Jan Beulich wrote:
 what we called no-pm on yesterday's call?

FYI it was no-dm (no device model).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 02/13] libxl: properly clean up array in libxl_list_cpupool failure path

2015-07-23 Thread Dario Faggioli
On Thu, 2015-07-23 at 10:22 +0100, Ian Campbell wrote:
 On Thu, 2015-07-23 at 08:59 +0100, Wei Liu wrote:

  There is change of behaviour. Previously if memory allocation fails 
  the
  said function returns NULL. Now memory allocation failure is fatal. 
  This
  is in line with how we deal with memory allocation failure in other
  places in libxl though.
 
 I think this function would benefit from making the out: label be the
 error path and the success case just a return ptr (just before the
 label).
 
 Then your error handling for cpupool_info would become
 libxl_cpupoolinfo_dispose(info);
 if (errno != ENOENT) goto out;
 break;
 
 and the if (failed) block would be at the out label (without the
 iff).
 
I was about to give my Reviewed-by, but yes, I like Ian's suggestion
too.

In any case, apart from how error handling is done, I think this patch
is fine, as far as dealing with cpupools goes.

Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 2:43 AM

 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 @@ -1770,6 +1771,105 @@ static bool_t
vmx_is_singlestep_supported(void)
  return cpu_has_monitor_trap_flag;  }

 +static void vmx_vcpu_update_eptp(struct vcpu *v) {
 +struct domain *d = v-domain;
 +struct p2m_domain *p2m = NULL;
 +struct ept_data *ept;
 +
 +if ( altp2m_active(d) )
 +p2m = p2m_get_altp2m(v);
 +if ( !p2m )
 +p2m = p2m_get_hostp2m(d);
 +
 +ept = p2m-ept;
 +ept-asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
 +
 +vmx_vmcs_enter(v);
 +
 +__vmwrite(EPT_POINTER, ept_get_eptp(ept));
 +
 +if ( v-arch.hvm_vmx.secondary_exec_control 
 +SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )

Indentation.


Ok.

 +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
 +struct domain *d = v-domain;
 +u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
 +
 +if ( !cpu_has_vmx_vmfunc )
 +return;
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 +
 +vmx_vmcs_enter(v);
 +
 +if ( !d-is_dying  altp2m_active(d) )
 +{
 +v-arch.hvm_vmx.secondary_exec_control |= mask;
 +__vmwrite(VM_FUNCTION_CONTROL,
VMX_VMFUNC_EPTP_SWITCHING);
 +__vmwrite(EPTP_LIST_ADDR,
 + virt_to_maddr(d-arch.altp2m_eptp));
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +{
 +p2m_type_t t;
 +mfn_t mfn;
 +
 +mfn = get_gfn_query_unlocked(d,
 + gfn_x(vcpu_altp2m(v).veinfo_gfn), t);
 +
 +if ( mfn_x(mfn) != INVALID_MFN )
 +__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) 
 + PAGE_SHIFT);

Considering that the VMCS field holds a byte-aligned address, why do you
have the (later introduced) hvmop specify a GFN instead of a GPA?


The SDM states:
 If the EPT-violation #VE VM-execution control is 1, the 
virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address 
width.

Also you shouldn't be open coding pfn_to_paddr().


ok

thanks,
Ravi

 +else
 +v-arch.hvm_vmx.secondary_exec_control =
 +~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 +}
 +}
 +else
 +v-arch.hvm_vmx.secondary_exec_control = ~mask;
 +
 +__vmwrite(SECONDARY_VM_EXEC_CONTROL,
 +v-arch.hvm_vmx.secondary_exec_control);

Indentation again.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 1/2] xen: sched: reorganize cpu_disable_scheduler()

2015-07-23 Thread Dario Faggioli
The function is called both when we want to remove a cpu
from a cpupool, and during cpu teardown, for suspend or
shutdown. If, however, the boot cpu (cpu 0, most of the
times) is not present in the default cpupool, during
suspend or shutdown, Xen crashes like this:

  root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ...
  (XEN) Xen call trace:
  (XEN)[82d0801238de] _csched_cpu_pick+0x156/0x61f
  (XEN)[82d080123db5] csched_cpu_pick+0xe/0x10
  (XEN)[82d08012de3c] vcpu_migrate+0x18e/0x321
  (XEN)[82d08012e4f8] cpu_disable_scheduler+0x1cf/0x2ac
  (XEN)[82d08018bb8d] __cpu_disable+0x313/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d08013097a] stopmachine_action+0x70/0x99
  (XEN)[82d0801325f0] do_tasklet_work+0x78/0xab
  (XEN)[82d080132926] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 15:
  (XEN) Assertion 'cpu  nr_cpu_ids' failed at 
...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
  (XEN) 

There also are problems when we try to suspend or shutdown
with a cpupool configured with just one cpu (no matter, in
this case, whether that is the boot cpu or not):

  root@Zhaman:~# xl create /etc/xen/test.cfg
  root@Zhaman:~# xl cpupool-migrate test Pool-1
  root@Zhaman:~# xl cpupool-list -c
  Name   CPU list
  Pool-0 0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
  Pool-1 12
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  (XEN) CPU:12
  ...
  (XEN) Xen call trace:
  (XEN)[82d08018bb91] __cpu_disable+0x317/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d08013097a] stopmachine_action+0x70/0x99
  (XEN)[82d0801325f0] do_tasklet_work+0x78/0xab
  (XEN)[82d080132926] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Xen BUG at smpboot.c:895
  (XEN) 

In both cases, the problem is the scheduler not being able
to:
 - move all the vcpus to the boot cpu (as the boot cpu is
   not in the cpupool), in the former;
 - move the vcpus away from a cpu at all (as that is the
   only one cpu in the cpupool), in the latter.

Solution is to distinguish, inside cpu_disable_scheduler(),
the two cases of cpupool manipulation and teardown. For
cpupool manipulation, it is correct to ask the scheduler to
take an action, as pathological situation (like there not
being any cpu in the pool where to send vcpus) are taken
care of (i.e., forbidden!) already. For suspend and shutdown,
we don't want the scheduler to be involved at all, as the
final goal is pretty simple: send all the vcpus to the
boot cpu ASAP, so we just go for it.

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
Acked-by: George Dunlap george.dun...@eu.citrix.com
---
Changes from v3:
 * fix typos and language in comments, as requested during
   review

Changes from v2:
 * add a missing spin_unlock, most likely eaten by a
   forgotten `stg refresh' (sorry!)
 * fix a typo

Changes from v1:
 * BUG_ON() if, in the suspend/shutdown case, the mask of
   online pCPUs will ever get empty, as suggested during
   review;
 * reorganize and improve comments inside cpu_disable_scheduler()
   as suggested during review;
 * make it more clear that vcpu_move_nosched() (name
   changed, as suggested during review), should only be
   called from quite contextes, such us, during suspend
   or shutdown. Do that via both comments and asserts,
   as requested during review;
 * reorganize cpu_disable_scheduler() and vcpu_move_nosched()
   so that calling to sleep and wakeup functions are only
   called when necessary (i.e., *not* in case we are
   suspending/shutting down, as requested during review.
---
Cc: Juergen Gross jgr...@suse.com
---
 xen/common/schedule.c |  104 +
 1 file changed, 88 insertions(+), 16 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index df8c1d0..df64268 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -454,9 +454,10 @@ void vcpu_unblock(struct vcpu *v)
  * Do the actual movement of a vcpu from old to new CPU. Locks for *both*
  * CPUs needs to have been taken already when calling this!
  */
-static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
-  unsigned int new_cpu)
+static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
 {
+unsigned int old_cpu = v-processor;
+
 /*
  * Transfer urgency status to new CPU before switching CPUs, as
  * once the switch occurs, v-is_urgent is no longer protected by

[Xen-devel] [PATCH v4 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-23 Thread Dario Faggioli
Take 4, with only typos and language fixes in comments, in patch 1.

This now have all the Ack-s it requires, I think.

Regards,
Dario
---
Dario Faggioli (2):
  xen: sched: reorganize cpu_disable_scheduler()
  xen: sched/cpupool: properly update affinity when removing a cpu from a 
cpupool

 xen/common/cpupool.c  |   18 
 xen/common/schedule.c |  111 +
 2 files changed, 112 insertions(+), 17 deletions(-)
--
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool

2015-07-23 Thread Dario Faggioli
And this time, do it right. In fact, a similar change was
attempted in 93be8285a79c6 (cpupools: update domU's node-affinity
on the cpupool_unassign_cpu() path). But that was buggy, and got
reverted with 8395b67ab0b8a86.

However, even though reverting was the right thing to do, it
remains true that:
 - calling the function is better done in the cpupool cpu removal
   code, even if just for simmetry with the cpupool cpu adding path;
 - it is not necessary to call it during cpu teardown (for suspend
   or shutdown) code as we either are going down and will never
   come up (shutdown) or, when coming up, we want everything to be
   as before the tearing down process started, and so we would just
   undo any update made during the process.
 - calling it from the teardown path is not only unnecessary, but
   it can trigger an ASSERT(), in case we get, during the process,
   to remove the last online pcpu of a domain's node affinity:

  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ... ... ...
  (XEN) Xen call trace:
  (XEN)[82d0801055b9] domain_update_node_affinity+0x113/0x240
  (XEN)[82d08012e676] cpu_disable_scheduler+0x334/0x3f2
  (XEN)[82d08018bb8d] __cpu_disable+0x313/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d080130ad9] stopmachine_action+0x70/0x99
  (XEN)[82d08013274f] do_tasklet_work+0x78/0xab
  (XEN)[82d080132a85] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) 

Therefore, for all these reasons, move the call from
cpu_disable_schedule() to cpupool_unassign_cpu_helper().

While there, add some sanity checking (in the latter function), and
make sure that scanning the domain list is done with domlist_read_lock
held, at least when the system is 'live'.

I re-tested the scenario described in here:
 http://permalink.gmane.org/gmane.comp.emulators.xen.devel/235310

which is what led to the revert of 93be8285a79c6, and that is
working ok after this commit.

Signed-off-by: dario.faggi...@citrix.com
Acked-by: George Dunlap george.dun...@eu.citrix.com
Acked-by: Juergen Gross jgr...@suse.com
---
 xen/common/cpupool.c  |   18 ++
 xen/common/schedule.c |7 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index b48ae17..69b984c 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -297,12 +297,25 @@ static int cpupool_assign_cpu_locked(struct cpupool *c, 
unsigned int cpu)
 static long cpupool_unassign_cpu_helper(void *info)
 {
 int cpu = cpupool_moving_cpu;
+struct cpupool *c = info;
+struct domain *d;
 long ret;
 
 cpupool_dprintk(cpupool_unassign_cpu(pool=%d,cpu=%d)\n,
 cpupool_cpu_moving-cpupool_id, cpu);
 
 spin_lock(cpupool_lock);
+if ( c != cpupool_cpu_moving )
+{
+ret = -EBUSY;
+goto out;
+}
+
+/*
+ * We need this for scanning the domain list, both in
+ * cpu_disable_scheduler(), and at the bottom of this function.
+ */
+rcu_read_lock(domlist_read_lock);
 ret = cpu_disable_scheduler(cpu);
 cpumask_set_cpu(cpu, cpupool_free_cpus);
 if ( !ret )
@@ -319,6 +332,11 @@ static long cpupool_unassign_cpu_helper(void *info)
 cpupool_cpu_moving = NULL;
 }
 
+for_each_domain_in_cpupool(d, c)
+{
+domain_update_node_affinity(d);
+}
+rcu_read_unlock(domlist_read_lock);
 out:
 spin_unlock(cpupool_lock);
 cpupool_dprintk(cpupool_unassign_cpu ret=%ld\n, ret);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index df64268..3eefed7 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -650,6 +650,12 @@ int cpu_disable_scheduler(unsigned int cpu)
 if ( c == NULL )
 return ret;
 
+/*
+ * We'd need the domain RCU lock, but:
+ *  - when we are called from cpupool code, it's acquired there already;
+ *  - when we are called for CPU teardown, we're in stop-machine context,
+ *so that's not be a problem.
+ */
 for_each_domain_in_cpupool ( d, c )
 {
 for_each_vcpu ( d, v )
@@ -735,7 +741,6 @@ int cpu_disable_scheduler(unsigned int cpu)
 ret = -EAGAIN;
 }
 }
-domain_update_node_affinity(d);
 }
 
 return ret;


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 08:51 -0600, Jan Beulich wrote:
  
On 23.07.15 at 16:43, ian.campb...@citrix.com wrote:
  On Thu, 2015-07-23 at 08:26 -0600, Jan Beulich wrote:

  On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:
The version of OVMF in 4.5 (and presumably earlier) doesn't 
build 
with
the gcc in modern Linux distros (I noticed it with 4.9 from
Debian/Jessie).

I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable 
branches. 
That
version is the one currently in master.
   
   To be honest I didn't even know we support use of OVMF in any
   released version (nor in unstable). When did this change?
  
  It was present in 4.3. I don't think we build it by default since 
  its
  toolchain is a bit picky, but we turn it on and test it in osstest 
  from
  4.4 onwards.
  
  You can ask for it with bios=ovmf in your hvm guest's config.
 
 Well, that means it can be used by the adventurous, but it doesn't
 mean we consider it supported.

TBH, I'm more concerned that it is a blocker for upgrade osstest to
Debian Jessie, which in turn is a prerequisite for testing arm64.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/4] x86/pvh: Set 32b PVH guest mode in XEN_DOMCTL_set_address_size

2015-07-23 Thread Jan Beulich
 On 11.07.15 at 00:20, boris.ostrov...@oracle.com wrote:
 Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com

Reviewed-by: Jan Beulich jbeul...@suse.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] oxenstored: link in the systemd system library

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 08:40 -0500, Jonathan Creekmore wrote:
 If systemd is configured for use AND you are building oxenstored, the 
 C
 systemd library must be linked in to the systemd.cxma library.
 
 Signed-off-by: Jonathan Creekmore jonathan.creekm...@gmail.com

Acked-by: Ian Campbell ian.campb...@citrix.com

OOI what was the failure mode for you without this? Runtime rather than
build time I suspect?

 ---
 Changed since v1:
   * Link the systemd library in to the systemd.cxma instead of the 
 final
   binary.
 ---
  tools/ocaml/xenstored/Makefile | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/tools/ocaml/xenstored/Makefile 
 b/tools/ocaml/xenstored/Makefile
 index d861f11..59875f7 100644
 --- a/tools/ocaml/xenstored/Makefile
 +++ b/tools/ocaml/xenstored/Makefile
 @@ -30,6 +30,8 @@ systemd_OBJS = systemd
  systemd_C_OBJS = systemd_stubs
  OCAML_LIBRARY += systemd
  
 +LIBS_systemd += $(LDFLAGS-y)
 +
  OBJS = define \
   stdext \
   trie \

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] oxenstored: link in the systemd system library

2015-07-23 Thread Jonathan Creekmore

 On Jul 23, 2015, at 9:00 AM, Ian Campbell ian.campb...@citrix.com wrote:
 
 On Thu, 2015-07-23 at 08:40 -0500, Jonathan Creekmore wrote:
 If systemd is configured for use AND you are building oxenstored, the 
 C
 systemd library must be linked in to the systemd.cxma library.
 
 Signed-off-by: Jonathan Creekmore jonathan.creekm...@gmail.com
 
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 OOI what was the failure mode for you without this? Runtime rather than
 build time I suspect?

No, it was build-time. It failed to link the oxenstored binary due to missing 
symbols. 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-23 Thread Dario Faggioli
On Thu, 2015-07-23 at 06:43 +0200, Juergen Gross wrote:
 On 07/22/2015 04:44 PM, Boris Ostrovsky wrote:
  On 07/22/2015 10:09 AM, Juergen Gross wrote:

  I think we have 2 possible solutions:
 
  1. Try to handle this all in the hypervisor via CPUID mangling.
 
  2. Add PV-topology support to the guest and indicate this capability
  via
 elfnote; only enable PV-numa if this note is present.
 
  I'd prefer the second solution. If you are okay with this, I'd try
  to do
  some patches for the pvops kernel.
 
  Why do you think that kernel patches are preferable to CPUID management?
  This would be all in tools, I'd think. (Well, one problem that I can
  think of is that AMD sometimes pokes at MSRs and/or Northbridge's PCI
  registers to figure out nodeID --- that we may need to have to address
  in the hypervisor)
 
 Doing it via CPUID is more HW specific. Trying to fake a topology for
 the guest from outside might lead to weird decisions in the guest e.g.
 regarding licenses based on socket counts.
 
I do see the value of this, I think...

 If you are doing it in the guest itself you are able to address the
 different problems (scheduling, licensing) in different ways.
 
... but, at least in the case of vNUMA for instance, there still need to
be a correlation between the vNUMA topology, and the CPUID topology,
and vNUMA topology is decided by toolstack.

Then, if you mean, within all the possible solutions that matches (i.e.,
that does not cause problems to!) the vNUMA setup we've been given,
let's pick up one that also is best for this xxx other purpose, then I
agree.

What I'm not sure I see, although, is how you would be specifying the
other purpose, e.g., in this case, are you thinking to another parameter
saying that we want to minimize the socket count?

 Depending on the licensing model
 playing with CPUID is either good or bad. I can even imagine the CPUID
 configuration capabilities in xl are in use today for exactly this
 purpose. Using them for pv-NUMA as well will make this feature unusable
 for those users.

Yeah, well... So, you want a VM with only one socket, because of
whatever reasons (say licensing), and you're using libxl's CPUID
fiddling capability to do that. Now, if you specify, for such a VM, a
vNUMA layout with 2 vnodes, well, I'd call this asking for troubles. I
know, strictly speaking, socket != (v)NUMA node. Still, I think this
will be a corner case, way less common than just a user specifying a
vNUMA topology, and getting only a fraction of the vcpus being
used/usable! :-/

In summary, I probably know too few of CPUID handling to have a clear
view on whether something like 'making it match the topology' --which
also means, if no vNUMA, CPUID should say flat, for some definition of
flat-- should leave in tools or in kernel... I just know that we need to
do something *consistent*.

FWIW, I was thinking that the kernel were a better place, as Juergen is
saying, while now I'm more convinced that tools would be more
appropriate, as Boris is saying.

Thanks and Regards,
Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/4] x86/compat: Test both PV and PVH guests for compat mode

2015-07-23 Thread Jan Beulich
 On 11.07.15 at 00:20, boris.ostrov...@oracle.com wrote:
 Add is_pvh_32bit_domain() macro and use it alongside is_pv_32bit_domain() 
 where necessary.
 
 Since PVH guests cannot change execution mode, has_32bit_shinfo is a good
 indicator of whether the guest is PVH and 32-bit.
 
 Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com

Relative to what is in the tree right now this is fine, but ...

 @@ -771,7 +771,7 @@ int arch_set_info_guest(
  
  /* The context is a compat-mode one if the target domain is compat-mode;
   * we expect the tools to DTRT even in compat-mode callers. */
 -compat = is_pv_32bit_domain(d);
 +compat = is_pv_32bit_domain(d) || is_pvh_32bit_domain(d);

... won't this and ...

 @@ -1203,7 +1204,7 @@ void arch_get_info_guest(struct vcpu *v, 
 vcpu_guest_context_u c)
  {
  unsigned int i;
  const struct domain *d = v-domain;
 -bool_t compat = is_pv_32bit_domain(d);
 +bool_t compat = is_pv_32bit_domain(d) || is_pvh_32bit_domain(d);

... this get in the way of what we called no-pm on yesterday's call?
I would assume that for the transitional period both ought to be able
to co-exist.

Plus - is this in line with what the tools are doing? Aren't they
assuming !PV = native format context? I.e. don't you need
to treat differently v-domain == current-domain and its
opposite? Roger btw. raised a similar question on IRC earlier
today...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 04/17] iommu: Add iommu_intpost to control VT-d Posted-Interrupts feature

2015-07-23 Thread Andrew Cooper
On 23/07/15 15:01, Andrew Cooper wrote:
 On 23/07/15 12:35, Feng Wu wrote:
 VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
 With VT-d Posted-Interrupts enabled, external interrupts from
 direct-assigned devices can be delivered to guests without VMM
 intervention when guest is running in non-root mode.

 This patch adds variable 'iommu_intpost' to control whether enable VT-d
 posted-interrupt or not in the generic IOMMU code.

 CC: Jan Beulich jbeul...@suse.com
 CC: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Feng Wu feng...@intel.com
 Reviewed-by: Kevin Tian kevin.t...@intel.com
 Please patch docs/misc/xen-command-line.markdown as you are

 ---
 v4:
 - No changes

 v3:
 - Remove pointless initializer for 'iommu_intpost'.
 - Some adjustment for if no intremap then no intpost logic.
 * For parse_iommu_param(), move it to the end of the function,
   so we don't need to add the some logic when introduing the
   new kernel parameter 'intpost' in later patch.
 * Add this logic in iommu_setup() after iommu_hardware_setup()
   is called.

  xen/drivers/passthrough/iommu.c | 17 -
  xen/include/xen/iommu.h |  2 +-
  2 files changed, 17 insertions(+), 2 deletions(-)

 diff --git a/xen/drivers/passthrough/iommu.c 
 b/xen/drivers/passthrough/iommu.c
 index 06cb38f..536b69f 100644
 --- a/xen/drivers/passthrough/iommu.c
 +++ b/xen/drivers/passthrough/iommu.c
 @@ -39,6 +39,7 @@ static void iommu_dump_p2m_table(unsigned char key);
   *   no-snoop   Disable VT-d Snoop Control
   *   no-qinval  Disable VT-d Queued Invalidation
   *   no-intremapDisable VT-d Interrupt Remapping
 + *   no-intpost Disable VT-d Interrupt posting
   */
  custom_param(iommu, parse_iommu_param);
  bool_t __initdata iommu_enable = 1;
 @@ -51,6 +52,14 @@ bool_t __read_mostly iommu_passthrough;
  bool_t __read_mostly iommu_snoop = 1;
  bool_t __read_mostly iommu_qinval = 1;
  bool_t __read_mostly iommu_intremap = 1;
 +
 +/*
 + * In the current implementation of VT-d posted interrupts, in some extreme
 + * cases, the per cpu list which saves the blocked vCPU will be very long,
 + * and this will affect the interrupt latency, so let this feature off by
 + * default until we find a good solution to resolve it.
 + */
 +bool_t __read_mostly iommu_intpost;
  bool_t __read_mostly iommu_hap_pt_share = 1;
  bool_t __read_mostly iommu_debug;
  bool_t __read_mostly amd_iommu_perdev_intremap = 1;
 @@ -112,6 +121,9 @@ static void __init parse_iommu_param(char *s)
  
  s = ss + 1;
  } while ( ss );
 +
 +if ( !iommu_intremap )
 +iommu_intpost = 0;
 This hunk is still wrong, as I said in v3.

Furthermore, you appear to have no way in v4 to enable intpost on the
Xen command line.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 3/4] x86/pvh: Handle hypercalls for 32b PVH guests

2015-07-23 Thread Jan Beulich
 On 11.07.15 at 00:20, boris.ostrov...@oracle.com wrote:
 Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
 ---
 Changes in v3:
 * Defined compat_mmuext_op(). (XEN_GUEST_HANDLE_PARAM(mmuext_op_compat_t)
   is not defined in header files so I used 'void' type.

How is it not? It's in compat/xen.h (which is a generated header).

 @@ -4951,6 +4950,29 @@ static hvm_hypercall_t *const 
 pvh_hypercall64_table[NR_hypercalls] = {
  [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
  };
  
 +extern int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(void) cmp_uops,
 +unsigned int count,
 +XEN_GUEST_HANDLE_PARAM(uint) pdone,
 +unsigned int foreigndom);
 +static hvm_hypercall_t *const pvh_hypercall32_table[NR_hypercalls] = {
 +HYPERCALL(platform_op),
 +COMPAT_CALL(memory_op),
 +HYPERCALL(xen_version),
 +HYPERCALL(console_io),
 +[ __HYPERVISOR_grant_table_op ]  =
 +(hvm_hypercall_t *)hvm_grant_table_op_compat32,
 +COMPAT_CALL(vcpu_op),
 +COMPAT_CALL(mmuext_op),
 +HYPERCALL(xsm_op),
 +COMPAT_CALL(sched_op),
 +HYPERCALL(event_channel_op),
 +[ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op_compat32,
 +HYPERCALL(hvm_op),
 +HYPERCALL(sysctl),
 +HYPERCALL(domctl),
 +[ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation

Looks like you didn't fully sync with staging - did you forget that it
was you who added xenpmu_op to the 64-bit counterpart?
Without that ...

 @@ -4981,7 +5003,7 @@ int hvm_do_hypercall(struct cpu_user_regs *regs)
  return viridian_hypercall(regs);
  
  if ( (eax = NR_hypercalls) ||
 - (is_pvh_domain(currd) ? !pvh_hypercall64_table[eax]
 + (is_pvh_domain(currd) ? !pvh_hypercall32_table[eax]
 : !hvm_hypercall32_table[eax]) )

... this will break (as we're assuming 32- and 64-bit tables to be fully
in sync here; there's still the pending work item of constructing these
tables so that this has a better chance of not getting broken).

Also, since you're touching this here anyway, could I ask you to
adjust this along the lines of the change you do further down,
i.e. to become

 !(is_pvh_domain(currd) ? pvh_hypercall32_table
: hvm_hypercall32_table)[eax] )

Thanks, Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-23 Thread Juergen Gross

On 07/23/2015 04:07 PM, Dario Faggioli wrote:

On Thu, 2015-07-23 at 06:43 +0200, Juergen Gross wrote:

On 07/22/2015 04:44 PM, Boris Ostrovsky wrote:

On 07/22/2015 10:09 AM, Juergen Gross wrote:



I think we have 2 possible solutions:

1. Try to handle this all in the hypervisor via CPUID mangling.

2. Add PV-topology support to the guest and indicate this capability
via
elfnote; only enable PV-numa if this note is present.

I'd prefer the second solution. If you are okay with this, I'd try
to do
some patches for the pvops kernel.


Why do you think that kernel patches are preferable to CPUID management?
This would be all in tools, I'd think. (Well, one problem that I can
think of is that AMD sometimes pokes at MSRs and/or Northbridge's PCI
registers to figure out nodeID --- that we may need to have to address
in the hypervisor)


Doing it via CPUID is more HW specific. Trying to fake a topology for
the guest from outside might lead to weird decisions in the guest e.g.
regarding licenses based on socket counts.


I do see the value of this, I think...


If you are doing it in the guest itself you are able to address the
different problems (scheduling, licensing) in different ways.


... but, at least in the case of vNUMA for instance, there still need to
be a correlation between the vNUMA topology, and the CPUID topology,
and vNUMA topology is decided by toolstack.

Then, if you mean, within all the possible solutions that matches (i.e.,
that does not cause problems to!) the vNUMA setup we've been given,
let's pick up one that also is best for this xxx other purpose, then I
agree.

What I'm not sure I see, although, is how you would be specifying the
other purpose, e.g., in this case, are you thinking to another parameter
saying that we want to minimize the socket count?


Depending on the licensing model
playing with CPUID is either good or bad. I can even imagine the CPUID
configuration capabilities in xl are in use today for exactly this
purpose. Using them for pv-NUMA as well will make this feature unusable
for those users.


Yeah, well... So, you want a VM with only one socket, because of
whatever reasons (say licensing), and you're using libxl's CPUID
fiddling capability to do that. Now, if you specify, for such a VM, a
vNUMA layout with 2 vnodes, well, I'd call this asking for troubles. I
know, strictly speaking, socket != (v)NUMA node. Still, I think this
will be a corner case, way less common than just a user specifying a
vNUMA topology, and getting only a fraction of the vcpus being
used/usable! :-/

In summary, I probably know too few of CPUID handling to have a clear
view on whether something like 'making it match the topology' --which
also means, if no vNUMA, CPUID should say flat, for some definition of
flat-- should leave in tools or in kernel... I just know that we need to
do something *consistent*.

FWIW, I was thinking that the kernel were a better place, as Juergen is
saying, while now I'm more convinced that tools would be more
appropriate, as Boris is saying.


I'm just about to write down all sources of information the linux kernel
is using to build topology information and for what purpose this data is
being used. This should help to decide which way to go.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 2:22 AM

 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 @@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
  v-arch.hvm_vcpu.single_step = !v-arch.hvm_vcpu.single_step;  }

 +void altp2m_vcpu_update_p2m(struct vcpu *v) {
 +if ( hvm_funcs.altp2m_vcpu_update_p2m )
 +hvm_funcs.altp2m_vcpu_update_p2m(v);
 +}
 +
 +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v) {
 +if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
 +hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
 +}
 +
 +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v) {
 +if ( hvm_funcs.altp2m_vcpu_emulate_ve )
 +return hvm_funcs.altp2m_vcpu_emulate_ve(v);
 +return 0;
 +}

These are _still_ not inline functions (in hvm.h), and 00/15 also doesn't
mention that this was intentionally left out.


Yup possibly missed this one.

 @@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
 goto out;
  }

 +if ( hvm_altp2m_supported() )
 +{
 +/* Init alternate p2m data */
 +if ( (d-arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
 +{
 +rv = -ENOMEM;
 +goto out;
 +}
 +
 +for ( i = 0; i  MAX_EPTP; i++ )
 +d-arch.altp2m_eptp[i] = INVALID_MFN;

And there is _still_ EPT-specific code left in a generic file.


This was mentioned explicitly in the cover letter and in the patch header - 
Altough our error was that the comment was only in patch 10, and should have 
been in patch 5 also.
Patch 10:   ... not done - abstracting some ept functionality from p2m

 @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct
domain *d)
  }
  }

 +static void p2m_teardown_altp2m(struct domain *d) {
 +unsigned int i;
 +struct p2m_domain *p2m;
 +
 +for ( i = 0; i  MAX_ALTP2M; i++ )
 +{
 +if ( !d-arch.altp2m_p2m[i] )
 +continue;
 +p2m = d-arch.altp2m_p2m[i];
 +p2m_free_one(p2m);
 +d-arch.altp2m_p2m[i] = NULL;

If you already think it's necessary to latch the array entry into a local 
variable,
why don't you zap the array entry _before_ freeing the structure?


Could be done.

 @@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
  return err;
  }

 +unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
 +{
 +struct p2m_domain *p2m;
 +struct ept_data *ept;
 +unsigned int i;
 +
 +altp2m_list_lock(d);
 +
 +for ( i = 0; i  MAX_ALTP2M; i++ )
 +{
 +if ( d-arch.altp2m_eptp[i] == INVALID_MFN )
 +continue;
 +
 +p2m = d-arch.altp2m_p2m[i];
 +ept = p2m-ept;
 +
 +if ( eptp == ept_get_eptp(ept) )
 +goto out;
 +}
 +
 +i = INVALID_ALTP2M;
 +
 +out:


Missed sorry.


Just to repeat - labels should be indented by at least one space.

And you already know what the comment is regarding this being EPT-specific
code (and here one can't even debate whether it's just unfortunate naming,
since ept_get_eptp() is _very_ EPT- specific, and that macro - if headers were
properly structured - shouldn't even be visible here).


Was mentioned albeit in another patch (should have been mentioned here also)

 --- /dev/null
 +++ b/xen/include/asm-x86/altp2m.h
 @@ -0,0 +1,38 @@
 +/*
 + * Alternate p2m HVM
 + * Copyright (c) 2014, Intel Corporation.
 + *
 + * This program is free software; you can redistribute it and/or
 +modify it
 + * under the terms and conditions of the GNU General Public License,
 + * version 2, as published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope it will be useful, but
 +WITHOUT
 + * ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY
 +or
 + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
 +License for
 + * more details.
 + *
 + * You should have received a copy of the GNU General Public License
 +along with
 + * this program; if not, write to the Free Software Foundation, Inc.,
 +59 Temple
 + * Place - Suite 330, Boston, MA 02111-1307 USA.
 + */
 +
 +#ifndef _ALTP2M_H
 +#define _ALTP2M_H

I'm sure I said so before - this is not a specific enough guard symbol.
It should have at least an _X86 prefix.


Right (don't think I saw this before but comment makes sense)

 +#include xen/types.h
 +#include xen/sched.h /* for struct vcpu, struct domain */
 +#include asm/hvm/vcpu.h  /* for vcpu_altp2m */

I don't see the type mentioned in the comment used anywhere below - is this
a stray include?

 --- a/xen/include/asm-x86/domain.h
 +++ b/xen/include/asm-x86/domain.h
 @@ -233,6 +233,10 @@ struct paging_vcpu {  typedef xen_domctl_cpuid_t
 cpuid_input_t;

  #define MAX_NESTEDP2M 10
 +
 +#define MAX_ALTP2M  ((uint16_t)10)
 +#define INVALID_ALTP2M  ((uint16_t)~0)

Didn't you claim to have removed all stray casts?

Considering how late we are with this, this patch can have my ack only
provided you promise 

Re: [Xen-devel] [PATCH OSSTEST v8 00/14] add distro domU testing flight

2015-07-23 Thread Ian Campbell
On Wed, 2015-07-08 at 13:30 +0100, Ian Campbell wrote:
 
 I retained acks even when changing things due to either the moving of
 the make-*flight filter or the moving of the runvar docs to the 
 script, otherwise I dropped them, I hope that is ok.
 
 Summary of (A)cks, (M)odified and (N)ew (NM==Replaced something):

Ian acked the rest and after discussion with him I've pushed this to
pretest.

Cheers,
Ian.

 
 AMmfi-common: Allow make-*flight to filter the set of build 
 jobs to include
  MTestSupport: Add helper to fetch a URL on a host
 AMdistros: add support for installing Debian PV guests via d
 -i, flight and jobs
 AMdistros: support booting Debian PV (d-i installed) guests 
 with pvgrub.
  Mdistros: Support pvgrub for Wheezy too.
 A Test pygrub and pvgrub on the regular flights
 A distros: add branch infrastructure
   N   crontab-cambridge: Use hard tabs for alignment.
  Mdistros: Run one suite per day on a weekly basis
 A Debian: Handle lack of bootloader support in d-i on ARM.
 A ts-debian-di-install: Refactor root_disk specification
 A make-flight: refactor PV debian tests
  MAdd testing of file backed disk formats
   make-distros-flight: Use ftp.debian.org directly
 
 Results for an adhoc xen-unstable flight are at 
 http://osstest.xs.citrite.net/~osstest/testlogs/logs/37711/
 And for Jessie:
 http://osstest.xs.citrite.net/~osstest/testlogs/logs/37717/
 
 Ian.
 
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 8:00 AM
To: Sahita, Ravi
Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
Nakajima, Jun; xen-devel@lists.xen.org; tleng...@novetta.com; Daniel De
Graaf; Tim Deegan
Subject: RE: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP
switching and #VE.

 On 23.07.15 at 16:40, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 2:43 AM
 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
 +struct domain *d = v-domain;
 +u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
 +
 +if ( !cpu_has_vmx_vmfunc )
 +return;
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 +
 +vmx_vmcs_enter(v);
 +
 +if ( !d-is_dying  altp2m_active(d) )
 +{
 +v-arch.hvm_vmx.secondary_exec_control |= mask;
 +__vmwrite(VM_FUNCTION_CONTROL,
VMX_VMFUNC_EPTP_SWITCHING);
 +__vmwrite(EPTP_LIST_ADDR,
 + virt_to_maddr(d-arch.altp2m_eptp));
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +{
 +p2m_type_t t;
 +mfn_t mfn;
 +
 +mfn = get_gfn_query_unlocked(d,
 + gfn_x(vcpu_altp2m(v).veinfo_gfn), t);
 +
 +if ( mfn_x(mfn) != INVALID_MFN )
 +__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) 
 + PAGE_SHIFT);

Considering that the VMCS field holds a byte-aligned address, why do
you have the (later introduced) hvmop specify a GFN instead of a GPA?


 The SDM states:
  If the EPT-violation #VE VM-execution control is 1, the
 virtualization-exception information address must satisfy the
 following checks:
 - Bits 11:0 of the address must be 0.
 - The address must not set any bits beyond the processor's
 physical-address width.

Ah, interesting. Knowing what to search for, I was able to find this.
But to be honest, it should also be stated in the section talking about #VE, 
not
just in the one talking about checks being done.

With that, just like for the other patch my ack depends on the promise to deal
with all the remaining issues.

Yes our SDM has information that's not the best organized...
Agreed on the remaining issues

Thanks
Ravi


Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 10/15] x86/altp2m: add remaining support routines.

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:51, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 3:06 AM
 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
vcpu
 *v, unsigned int idx);
  /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx);

The context here suggests that I overlooked a still not replaced uint16_t.

 
 Ok.
 
 Just wanted to make sure these are also ok to do post 4.6

I guess so, but you realize you're stretching it? Plus the patch can't
go in as is anyway, considering the off-by-one bugs found.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Roger Pau Monné
El 23/07/15 a les 13.41, Ian Campbell ha escrit:
 On Thu, 2015-07-23 at 05:29 -0600, Jan Beulich wrote:

 On 23.07.15 at 12:25, roger@citrix.com wrote:
 El 13/07/15 a les 16.01, Jan Beulich ha escrit:
 On 03.07.15 at 13:34, roger@citrix.com wrote:
 --- a/xen/arch/x86/domain.c
 +++ b/xen/arch/x86/domain.c
 @@ -795,6 +795,15 @@ int arch_set_info_guest(
c.nat-fs_base || c.nat-gs_base_user)) )
  return -EINVAL;
  }
 +else if ( is_hvm_domain(d) )
 +{
 +if ( c(ctrlreg[0]) || c(ctrlreg[1]) || c(ctrlreg[2]) 
 ||
 + c(ctrlreg[3]) || c(ctrlreg[4]) || c(ctrlreg[5]) 
 ||
 + c(ctrlreg[6]) || c(ctrlreg[7]) || c(ldt_base) ||
 + c(ldt_ents) || c(kernel_ss) || c(kernel_sp) ||
 + c(gdt_ents) )
 +return -EINVAL;
 +}

 In addition to what Andrew said - is the use of c() here really 
 correct
 considering

 compat = is_pv_32bit_domain(d);

 #define c(fld) (compat ? (c.cmp-fld) : (c.nat-fld))

 near the beginning of the function?

 I've been thinking about this. Since HVM vCPUs are always started 
 in
 32bit mode, I would ideally like to use the 
 vcpu_guest_context_x86_32_t
 struct to set the vCPU context. This means that for HVM guests 
 compat
 should always be true.

 The problem is that inside of the guest there's no notion of
 vcpu_guest_context_x86_32_t or vcpu_guest_context_x86_64_t, there's 
 only
 vcpu_guest_context, which defaults to the native bitness of the 
 guest.
 So if your guest is running in 64bit mode vcpu_guest_context is 
 aliased
 to vcpu_guest_context_x86_64_t by default.

 TBH I'm not sure about the best way to solve this. Should
 vcpu_guest_context_x86_32_t and vcpu_guest_context_x86_64_t be 
 exposed
 to the guest like it's done for the tools?

 Why? We're in arch_set_info_guest(), which is unreachable for a
 HVM guest on itself. Or is this in preparation of allowing HVM
 guests to use VCPUOP_initialise? In that case, exposing might be
 an option, but remember that the compat mode layout even today
 is used only for PV guests. So I'd rather avoid exposing both
 layouts to guests, and do translation instead inside the hypervisor.
 
 On ARM we deliberately arranged that the vcpu_guest_context was the
 same for both arm32 and arm64. Moving to PVH seems like an ideal
 opportunity to do something similar for x86, if not by standardising on
 the x86_64 layout then by declaring a new struct which can encompass
 both and declaring the old ones PV legacy.

IMHO introducing a new structure that gets rid of all the PV-only 
fields seems like a good option:

struct vcpu_hvm_context {
#define _VGCF_online   5
#define VGCF_online(1_VGCF_online)
uint32_t flags; /* VGCF_* flags */
struct cpu_hvm_user_regs user_regs; /* User-level CPU registers */
/* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
uint32_t ctrlreg[8];/* CR0-CR7 (control registers)  */
uint32_t debugreg[8];   /* DB0-DB7 (debug registers)*/
};

I'm also seriously considering getting rid of ctrlreg and debugreg.
Since HVM VCPUs will always be started in 32bit flat mode, it doesn't
make sense IMHO to have both the 32 and the 64 version of the 
registers, so cpu_hvm_user_regs is always going to be:

struct cpu_hvm_user_regs {
uint32_t ebx;
uint32_t ecx;
uint32_t edx;
uint32_t esi;
uint32_t edi;
uint32_t ebp;
uint32_t eax;
uint32_t eip;
uint32_t esp;
uint32_t eflags;
uint16_t cs;
uint16_t ss;
uint16_t es;
uint16_t ds;
uint16_t fs;
uint16_t gs;
};

We could however do something similar to what's done in ARM and have
a union of both the 32 and the 64bit registers in case we want to
start the vCPU in 64bit mode sometime in the future.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-07-23 Thread Sahita, Ravi
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 8:09 AM

 On 23.07.15 at 16:56, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 3:22 AM

 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 Signed-off-by: Ed White edmund.h.wh...@intel.com

 Acked-by: Jan Beulich jbeul...@suse.com

And I have to withdraw this ack pending clarification of (and perhaps
adjustment to) the #VE info address interface.


 Could we have the ack back :-) I clarified the #VE info address
 interface in the other email - repeating here:

  If the EPT-violation #VE VM-execution control is 1, the
 virtualization-exception information address must satisfy the
 following checks:
 - Bits 11:0 of the address must be 0.
 - The address must not set any bits beyond the processor's
 physical-address width.

Yes, for this aspect.

 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
  return rc;
  }

 +static int do_altp2m_op(
 +XEN_GUEST_HANDLE_PARAM(void) arg) {
 +struct xen_hvm_altp2m_op a;
 +struct domain *d = NULL;
 +int rc = 0;
 +
 +if ( !hvm_altp2m_supported() )
 +return -EOPNOTSUPP;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +if ( a.pad1 || a.pad2 ||
 + (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
 + (a.cmd  HVMOP_altp2m_get_domain_state) ||
 + (a.cmd  HVMOP_altp2m_change_gfn) )

I'm afraid such a change invalidates any earlier ack, even if ti is correct.
 Instead
of this, why don't you start numbering of the sub-ops at zero? Or if
you
 really
have a reason to start at 1, why not simply check a.cmd against zero
(without using any particular sub-op value)? And then it escapes me
why this can't be handled in a default case in the switch statement below
anyway.

 Hmm - is that a requirement per se? we are consistently checking per
 the sub-op definition we have.

Well, in a way. But doing range checks like this means future additions of sub-
ops would always need to touch this code. Quite different from doing it in the
default case of a switch statement.
Plus, can you see how the expression is going to look like if in interface 
version
2 you need to remove one or two of the current entries, replacing them with
new, higher numbers?


Yes in a revision I would handle this with the default case.

 Would like this to be considered as is.

 As I said in the cover letter we have constraints on how much more we
 can do this week now - so requesting the maintainers to accept v7 with
 the review comments you have on those recorded as pending to be
 addressed by us.

Yes, on that basis, albeit extremely hesitantly to be honest. If any other
maintainer would be as hesitant as I am about this, I would likely put the two
together to yield a NAK.

Thanks for your flexibility and honesty on this - 
hope other maintainers are ok with v7 in the same way, with keeping the other 
comments you have logged already as pending for post 4.6.

Ravi


Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 17:55, andrew.coop...@citrix.com wrote:
 On 23/07/15 16:52, Jan Beulich wrote:
 On 23.07.15 at 15:51, andrew.coop...@citrix.com wrote:
 On 23/07/15 12:35, Feng Wu wrote:
 +GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, 
 p);
 +
 +old_ire = new_ire = *p;
 +
 +/* Setup/Update interrupt remapping table entry. */
 +setup_posted_irte(new_ire, pi_desc, gvec);
 +ret = cmpxchg16b(p, old_ire, new_ire);
 +
 +ASSERT(ret == *(__uint128_t *)old_ire);
 This cannot be correct.  Either the cmpxchg() is required and you must
 cope with it failing, or the cmpxchg() is not required and this should
 be a plain write.
 Not exactly: The cmpxchg() is required for this to be an atomic
 128-bit write. And hence I would view the ASSERT() as
 appropriate - it validates that the entry didn't change behind our
 back.
 
 But p is an active descriptor, which means hardware is liable to change
 it behind our back.

I inquired about this on the previous round and was told
hardware doesn't alter the descriptor. Comparing this with
the spec, I couldn't spot any fields that I would suspect
getting written. Which fields do you have in mind?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 17:48 +0200, Roger Pau Monné wrote:
 El 23/07/15 a les 17.32, Jan Beulich ha escrit:
 On 23.07.15 at 17:10, roger@citrix.com wrote:
   IMHO introducing a new structure that gets rid of all the PV-only 
   
   fields seems like a good option:
   
   struct vcpu_hvm_context {
   #define _VGCF_online   5
   #define VGCF_online(1_VGCF_online)
   uint32_t flags; /* VGCF_* flags  
  
   */
   struct cpu_hvm_user_regs user_regs; /* User-level CPU 
   registers 
   */
   /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
   uint32_t ctrlreg[8];/* CR0-CR7 (control 
   registers)  
   */
   uint32_t debugreg[8];   /* DB0-DB7 (debug 
   registers)
   */
   };
   
   I'm also seriously considering getting rid of ctrlreg and 
   debugreg.
   Since HVM VCPUs will always be started in 32bit flat mode, it 
   doesn't
   make sense IMHO to have both the 32 and the 64 version of the 
   registers, so cpu_hvm_user_regs is always going to be:
   
   struct cpu_hvm_user_regs {
   uint32_t ebx;
   uint32_t ecx;
   uint32_t edx;
   uint32_t esi;
   uint32_t edi;
   uint32_t ebp;
   uint32_t eax;
   uint32_t eip;
   uint32_t esp;
   uint32_t eflags;
   uint16_t cs;
   uint16_t ss;
   uint16_t es;
   uint16_t ds;
   uint16_t fs;
   uint16_t gs;
   };
   
   We could however do something similar to what's done in ARM and 
   have
   a union of both the 32 and the 64bit registers in case we want to
   start the vCPU in 64bit mode sometime in the future.
  
  What you gave above is suitable only for VCPUOP_initialise afaict.
  Did you intend this to be the case?
 
 Certainly not, this should also be used by XEN_DOMCTL_setvcpucontext.
 I'm afraid I'm missing something obvious, but I don't see why this
 couldn't be used by XEN_DOMCTL_setvcpucontext TBH, can you please 
 clarify?

set/getvcpucontext need to pickle all state for save/restore/migration,
not just the start of day state.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] target_run_apt: Correctly escape line wrapping

2015-07-23 Thread Ian Campbell
Otherwise the envvars are on the preceding line and therefore not set
in the apt-get process.

I broke this in 6fea4be08306 apt: lock osstest's usages of apt-get
against each other, committed in January.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
 Osstest/TestSupport.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index 0fbc519..a8b66ef 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -432,7 +432,7 @@ sub target_putfile_root (;$) {
 sub target_run_apt {
 my ($ho, @aptopts) = @_;
 target_cmd_root($ho,
-DEBIAN_PRIORITY=critical UCF_FORCE_CONFFOLD=y \
+DEBIAN_PRIORITY=critical UCF_FORCE_CONFFOLD=y \\
 with-lock-ex -w /var/lock/osstest-apt apt-get @aptopts, 3000);
 }
 sub target_install_packages {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 10/15] x86/altp2m: add remaining support routines.

2015-07-23 Thread George Dunlap
On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
 +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
 +struct p2m_domain *p2m;
 +int rc = -EINVAL;
 +
 +if ( !idx || idx  MAX_ALTP2M )

 = (and then also elsewhere further down)?

 
 Right.

[snip]

 Just wanted to make sure these are also ok to do post 4.6

Well the off-by-one errors certainly need to be fixed for 4.6.

If this was the only thing holding it up, the committer could fix it up
on check-in, or we could take a fix-up patch afterwards.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v3 02/13] Introduce ts-saverestore-support-check

2015-07-23 Thread Ian Campbell
On Wed, 2015-07-22 at 11:13 +0100, Wei Liu wrote:
 We need this script because we're going to separate the concept of 
 save
 / restore and migration later.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:58, ian.campb...@citrix.com wrote:
 On Thu, 2015-07-23 at 08:51 -0600, Jan Beulich wrote:
  
On 23.07.15 at 16:43, ian.campb...@citrix.com wrote:
  On Thu, 2015-07-23 at 08:26 -0600, Jan Beulich wrote:

  On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:
The version of OVMF in 4.5 (and presumably earlier) doesn't 
build 
with
the gcc in modern Linux distros (I noticed it with 4.9 from
Debian/Jessie).

I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable 
branches. 
That
version is the one currently in master.
   
   To be honest I didn't even know we support use of OVMF in any
   released version (nor in unstable). When did this change?
  
  It was present in 4.3. I don't think we build it by default since 
  its
  toolchain is a bit picky, but we turn it on and test it in osstest 
  from
  4.4 onwards.
  
  You can ask for it with bios=ovmf in your hvm guest's config.
 
 Well, that means it can be used by the adventurous, but it doesn't
 mean we consider it supported.
 
 TBH, I'm more concerned that it is a blocker for upgrade osstest to
 Debian Jessie, which in turn is a prerequisite for testing arm64.

If it was explicitly unsupported, I could say do whatever you need
to do to it, but with not being sure what status it has I'm kind of
wondering what other changes (i.e. apart from fixing the build) the
tree(s) would get which then may break someone. But in the end
it's probably okay to do the change.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 17:10, roger@citrix.com wrote:
 IMHO introducing a new structure that gets rid of all the PV-only 
 fields seems like a good option:
 
 struct vcpu_hvm_context {
 #define _VGCF_online   5
 #define VGCF_online(1_VGCF_online)
 uint32_t flags; /* VGCF_* flags 
 */
 struct cpu_hvm_user_regs user_regs; /* User-level CPU registers 
 */
 /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
 uint32_t ctrlreg[8];/* CR0-CR7 (control registers)  
 */
 uint32_t debugreg[8];   /* DB0-DB7 (debug registers)
 */
 };
 
 I'm also seriously considering getting rid of ctrlreg and debugreg.
 Since HVM VCPUs will always be started in 32bit flat mode, it doesn't
 make sense IMHO to have both the 32 and the 64 version of the 
 registers, so cpu_hvm_user_regs is always going to be:
 
 struct cpu_hvm_user_regs {
 uint32_t ebx;
 uint32_t ecx;
 uint32_t edx;
 uint32_t esi;
 uint32_t edi;
 uint32_t ebp;
 uint32_t eax;
 uint32_t eip;
 uint32_t esp;
 uint32_t eflags;
 uint16_t cs;
 uint16_t ss;
 uint16_t es;
 uint16_t ds;
 uint16_t fs;
 uint16_t gs;
 };
 
 We could however do something similar to what's done in ARM and have
 a union of both the 32 and the 64bit registers in case we want to
 start the vCPU in 64bit mode sometime in the future.

What you gave above is suitable only for VCPUOP_initialise afaict.
Did you intend this to be the case?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:45, dario.faggi...@citrix.com wrote:
 Take 4, with only typos and language fixes in comments, in patch 1.
 
 This now have all the Ack-s it requires, I think.

So it does.

 Dario Faggioli (2):
   xen: sched: reorganize cpu_disable_scheduler()
   xen: sched/cpupool: properly update affinity when removing a cpu from 
 a cpupool

Especially for the first one, though, the title suggests mere cleanup
(i.e. not to go in now), while the description of it looks more like a
bug fix. Considering this I'd prefer to have a release ack.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Roger Pau Monné
El 23/07/15 a les 17.32, Jan Beulich ha escrit:
 On 23.07.15 at 17:10, roger@citrix.com wrote:
 IMHO introducing a new structure that gets rid of all the PV-only 
 fields seems like a good option:

 struct vcpu_hvm_context {
 #define _VGCF_online   5
 #define VGCF_online(1_VGCF_online)
 uint32_t flags; /* VGCF_* flags 
 */
 struct cpu_hvm_user_regs user_regs; /* User-level CPU registers 
 */
 /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
 uint32_t ctrlreg[8];/* CR0-CR7 (control registers)  
 */
 uint32_t debugreg[8];   /* DB0-DB7 (debug registers)
 */
 };

 I'm also seriously considering getting rid of ctrlreg and debugreg.
 Since HVM VCPUs will always be started in 32bit flat mode, it doesn't
 make sense IMHO to have both the 32 and the 64 version of the 
 registers, so cpu_hvm_user_regs is always going to be:

 struct cpu_hvm_user_regs {
 uint32_t ebx;
 uint32_t ecx;
 uint32_t edx;
 uint32_t esi;
 uint32_t edi;
 uint32_t ebp;
 uint32_t eax;
 uint32_t eip;
 uint32_t esp;
 uint32_t eflags;
 uint16_t cs;
 uint16_t ss;
 uint16_t es;
 uint16_t ds;
 uint16_t fs;
 uint16_t gs;
 };

 We could however do something similar to what's done in ARM and have
 a union of both the 32 and the 64bit registers in case we want to
 start the vCPU in 64bit mode sometime in the future.
 
 What you gave above is suitable only for VCPUOP_initialise afaict.
 Did you intend this to be the case?

Certainly not, this should also be used by XEN_DOMCTL_setvcpucontext.
I'm afraid I'm missing something obvious, but I don't see why this
couldn't be used by XEN_DOMCTL_setvcpucontext TBH, can you please clarify?

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 17:48, roger@citrix.com wrote:
 El 23/07/15 a les 17.32, Jan Beulich ha escrit:
 On 23.07.15 at 17:10, roger@citrix.com wrote:
 IMHO introducing a new structure that gets rid of all the PV-only 
 fields seems like a good option:

 struct vcpu_hvm_context {
 #define _VGCF_online   5
 #define VGCF_online(1_VGCF_online)
 uint32_t flags; /* VGCF_* flags 
 */
 struct cpu_hvm_user_regs user_regs; /* User-level CPU registers 
 */
 /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
 uint32_t ctrlreg[8];/* CR0-CR7 (control registers)  
 */
 uint32_t debugreg[8];   /* DB0-DB7 (debug registers)
 */
 };

 I'm also seriously considering getting rid of ctrlreg and debugreg.
 Since HVM VCPUs will always be started in 32bit flat mode, it doesn't
 make sense IMHO to have both the 32 and the 64 version of the 
 registers, so cpu_hvm_user_regs is always going to be:

 struct cpu_hvm_user_regs {
 uint32_t ebx;
 uint32_t ecx;
 uint32_t edx;
 uint32_t esi;
 uint32_t edi;
 uint32_t ebp;
 uint32_t eax;
 uint32_t eip;
 uint32_t esp;
 uint32_t eflags;
 uint16_t cs;
 uint16_t ss;
 uint16_t es;
 uint16_t ds;
 uint16_t fs;
 uint16_t gs;
 };

 We could however do something similar to what's done in ARM and have
 a union of both the 32 and the 64bit registers in case we want to
 start the vCPU in 64bit mode sometime in the future.
 
 What you gave above is suitable only for VCPUOP_initialise afaict.
 Did you intend this to be the case?
 
 Certainly not, this should also be used by XEN_DOMCTL_setvcpucontext.
 I'm afraid I'm missing something obvious, but I don't see why this
 couldn't be used by XEN_DOMCTL_setvcpucontext TBH, can you please clarify?

Just like for HVM (and even more so with the plan to allow PVH to
freely switch between 32- and 64-bit modes) the full 64-bit context
should be got and set in a single domctl imo. If the caller was to
first figure out what bitness a to be migrated guest was paused in
(and that may differ between vCPU-s), things would likely become
quite ugly. Not to speak of the need for a 64-bit tool stack to then
have a way to specify which format the structure is in for a set
operation.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 16:56, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 23, 2015 3:22 AM

 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 Signed-off-by: Ed White edmund.h.wh...@intel.com

 Acked-by: Jan Beulich jbeul...@suse.com

And I have to withdraw this ack pending clarification of (and perhaps
adjustment to) the #VE info address interface.

 
 Could we have the ack back :-) I clarified the #VE info address interface in 
 the other email - repeating here:
 
  If the EPT-violation #VE VM-execution control is 1, the 
 virtualization-exception information address must
 satisfy the following checks:
 - Bits 11:0 of the address must be 0.
 - The address must not set any bits beyond the processor's physical-address 
 width.

Yes, for this aspect.

 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
  return rc;
  }

 +static int do_altp2m_op(
 +XEN_GUEST_HANDLE_PARAM(void) arg) {
 +struct xen_hvm_altp2m_op a;
 +struct domain *d = NULL;
 +int rc = 0;
 +
 +if ( !hvm_altp2m_supported() )
 +return -EOPNOTSUPP;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +if ( a.pad1 || a.pad2 ||
 + (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
 + (a.cmd  HVMOP_altp2m_get_domain_state) ||
 + (a.cmd  HVMOP_altp2m_change_gfn) )

I'm afraid such a change invalidates any earlier ack, even if ti is correct. 
 Instead
of this, why don't you start numbering of the sub-ops at zero? Or if you 
 really
have a reason to start at 1, why not simply check a.cmd against zero (without
using any particular sub-op value)? And then it escapes me why this can't be
handled in a default case in the switch statement below anyway.
 
 Hmm - is that a requirement per se? we are consistently checking per the 
 sub-op definition we have.

Well, in a way. But doing range checks like this means future
additions of sub-ops would always need to touch this code. Quite
different from doing it in the default case of a switch statement.
Plus, can you see how the expression is going to look like if in
interface version 2 you need to remove one or two of the current
entries, replacing them with new, higher numbers?

 Would like this to be considered as is. 
 
 As I said in the cover letter we have constraints on how much more we can do 
 this week now - 
 so requesting the maintainers to accept v7 with the review comments you have 
 on those recorded as pending to be addressed by us.

Yes, on that basis, albeit extremely hesitantly to be honest. If any
other maintainer would be as hesitant as I am about this, I would
likely put the two together to yield a NAK.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xenconsole: Allow non-interactive use

2015-07-23 Thread Martin Lucina
On Thursday, 23.07.2015 at 09:48, Ian Campbell wrote:
 On Wed, 2015-07-22 at 19:08 +0200, Martin Lucina wrote:
  If xenconsole is run with stdin closed or redirected to /dev/null,
  console_loop() will return immediately due to failure to read from
  STDIN_FILENO. This patch tests if stdin and stdout are both connected 
  to
  a TTY and, if not, xenconsole will not attempt to read from stdin or
  modify stdout terminal attributes.
  
  Existing behaviour when xenconsole is run from a terminal does not
  change.
  
  This allows for non-interactive use, eg. running xl create -c under
  systemd or piping the output of xl console to another command.
  
  Signed-off-by: Martin Lucina mar...@lucina.net
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Wei Liu wei.l...@citrix.com
 
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 WRT the 4.6 freeze I'm torn between calling this a feature or a bugfix.
 
 A pair of nits, which probably aren't worth acting on:
 
  @@ -176,8 +177,13 @@ static int console_loop(int fd, struct xs_handle
  *xs, char *pty_path)
  fd_set fds;
   
  FD_ZERO(fds);
  -   FD_SET(STDIN_FILENO, fds);
  -   max_fd = STDIN_FILENO;
  +   if (interactive) {
  +   FD_SET(STDIN_FILENO, fds);
  +   max_fd = STDIN_FILENO;
  +   }
  +   else {
  +   max_fd = -1;
  +   }
 
 Looking at the rest of the file and tools/console subtree it seems the
 prevailing coding style is:
 
   } else
   max_fd = -1;
 
 (i.e. } brace on the same line as the else and no {} for single
 statements after an else).
 
 But maybe it would be better to set max_fd = -1 on declaration and do
 the max dance here as with the following cases?

Declaring max_fd = -1 is indeed clearer, I can do a v2 with that change if
you like.

One other bug that my change makes potentially easier to trigger is that
you can run xl console DOMID multiple times with the same DOMID and the
result is broken; each instance gets part of the data written to the
console.

Any ideas on how to address this in a simple fashion?

Martin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xenconsole: Allow non-interactive use

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 17:09 +0200, Martin Lucina wrote:

  But maybe it would be better to set max_fd = -1 on declaration and 
  do
  the max dance here as with the following cases?
 
 Declaring max_fd = -1 is indeed clearer, I can do a v2 with that 
 change if you like.

If you are happy to then yes, please.

 One other bug that my change makes potentially easier to trigger is that
 you can run xl console DOMID multiple times with the same DOMID and the
 result is broken; each instance gets part of the data written to the
 console.
 
 Any ideas on how to address this in a simple fashion?

Perhaps the client should take some exclusive lock (fcntl based?) on an
fd of an open file with domid in the name. Failure to get the lock
should result in the client exiting with some message indicating the
console is in use.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 17:31, wei.l...@citrix.com wrote:
 On Thu, Jul 23, 2015 at 09:28:15AM -0600, Jan Beulich wrote:
  On 23.07.15 at 16:58, ian.campb...@citrix.com wrote:
  On Thu, 2015-07-23 at 08:51 -0600, Jan Beulich wrote:
   
 On 23.07.15 at 16:43, ian.campb...@citrix.com wrote:
   On Thu, 2015-07-23 at 08:26 -0600, Jan Beulich wrote:
 
   On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:
 The version of OVMF in 4.5 (and presumably earlier) doesn't 
 build 
 with
 the gcc in modern Linux distros (I noticed it with 4.9 from
 Debian/Jessie).
 
 I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
 cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable 
 branches. 
 That
 version is the one currently in master.

To be honest I didn't even know we support use of OVMF in any
released version (nor in unstable). When did this change?
   
   It was present in 4.3. I don't think we build it by default since 
   its
   toolchain is a bit picky, but we turn it on and test it in osstest 
   from
   4.4 onwards.
   
   You can ask for it with bios=ovmf in your hvm guest's config.
  
  Well, that means it can be used by the adventurous, but it doesn't
  mean we consider it supported.
  
  TBH, I'm more concerned that it is a blocker for upgrade osstest to
  Debian Jessie, which in turn is a prerequisite for testing arm64.
 
 If it was explicitly unsupported, I could say do whatever you need
 to do to it, but with not being sure what status it has I'm kind of
 wondering what other changes (i.e. apart from fixing the build) the
 tree(s) would get which then may break someone. But in the end
 it's probably okay to do the change.
 
 
 It's marked as experimental, not supported and not built by default --
 if that makes you feel more comfortable.

It does. So Ian, feel free to go ahead.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 15:51, andrew.coop...@citrix.com wrote:
 On 23/07/15 12:35, Feng Wu wrote:
 +GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, p);
 +
 +old_ire = new_ire = *p;
 +
 +/* Setup/Update interrupt remapping table entry. */
 +setup_posted_irte(new_ire, pi_desc, gvec);
 +ret = cmpxchg16b(p, old_ire, new_ire);
 +
 +ASSERT(ret == *(__uint128_t *)old_ire);
 
 This cannot be correct.  Either the cmpxchg() is required and you must
 cope with it failing, or the cmpxchg() is not required and this should
 be a plain write.

Not exactly: The cmpxchg() is required for this to be an atomic
128-bit write. And hence I would view the ASSERT() as
appropriate - it validates that the entry didn't change behind our
back.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v3 01/13] toolstack: save / restore check

2015-07-23 Thread Ian Campbell
On Wed, 2015-07-22 at 11:13 +0100, Wei Liu wrote:
 Introduce check_for_command function and use it to check save / 
 restore
 functionality.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Wei Liu
On Thu, Jul 23, 2015 at 09:28:15AM -0600, Jan Beulich wrote:
  On 23.07.15 at 16:58, ian.campb...@citrix.com wrote:
  On Thu, 2015-07-23 at 08:51 -0600, Jan Beulich wrote:
   
 On 23.07.15 at 16:43, ian.campb...@citrix.com wrote:
   On Thu, 2015-07-23 at 08:26 -0600, Jan Beulich wrote:
 
   On 23.07.15 at 15:56, ian.campb...@citrix.com wrote:
 The version of OVMF in 4.5 (and presumably earlier) doesn't 
 build 
 with
 the gcc in modern Linux distros (I noticed it with 4.9 from
 Debian/Jessie).
 
 I think we should update Config.mk:OVMF_UPSTREAM_REVISION to
 cb9a7ebabcd6b8a49dc0854b2f9592d732b5afbd in all stable 
 branches. 
 That
 version is the one currently in master.

To be honest I didn't even know we support use of OVMF in any
released version (nor in unstable). When did this change?
   
   It was present in 4.3. I don't think we build it by default since 
   its
   toolchain is a bit picky, but we turn it on and test it in osstest 
   from
   4.4 onwards.
   
   You can ask for it with bios=ovmf in your hvm guest's config.
  
  Well, that means it can be used by the adventurous, but it doesn't
  mean we consider it supported.
  
  TBH, I'm more concerned that it is a blocker for upgrade osstest to
  Debian Jessie, which in turn is a prerequisite for testing arm64.
 
 If it was explicitly unsupported, I could say do whatever you need
 to do to it, but with not being sure what status it has I'm kind of
 wondering what other changes (i.e. apart from fixing the build) the
 tree(s) would get which then may break someone. But in the end
 it's probably okay to do the change.
 

It's marked as experimental, not supported and not built by default --
if that makes you feel more comfortable.

Wei.

 Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Andrew Cooper
On 23/07/15 16:52, Jan Beulich wrote:
 On 23.07.15 at 15:51, andrew.coop...@citrix.com wrote:
 On 23/07/15 12:35, Feng Wu wrote:
 +GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, 
 p);
 +
 +old_ire = new_ire = *p;
 +
 +/* Setup/Update interrupt remapping table entry. */
 +setup_posted_irte(new_ire, pi_desc, gvec);
 +ret = cmpxchg16b(p, old_ire, new_ire);
 +
 +ASSERT(ret == *(__uint128_t *)old_ire);
 This cannot be correct.  Either the cmpxchg() is required and you must
 cope with it failing, or the cmpxchg() is not required and this should
 be a plain write.
 Not exactly: The cmpxchg() is required for this to be an atomic
 128-bit write. And hence I would view the ASSERT() as
 appropriate - it validates that the entry didn't change behind our
 back.

But p is an active descriptor, which means hardware is liable to change
it behind our back.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 09:37 -0600, Jan Beulich wrote:
  
  It's marked as experimental, not supported and not built by default 
  --
  if that makes you feel more comfortable.
 
 It does. So Ian, feel free to go ahead.

Thanks. I suppose you were addressing Ian J here since I don't
typically do stable release work.

I'm happy to do the grunt work on this occasion though...

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 01:01, edmund.h.wh...@intel.com wrote:
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
  nestedhvm_vcpu_destroy(v);
  break;
  case HVM_PARAM_ALTP2M:
 +rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
 +if ( rc )
 +break;
  if ( a.value  1 )
  rc = -EINVAL;
  if ( a.value 
 @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
  goto out;
  }
  
 +if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current-domain)) )
 +goto out;
 +

Shouldn't this have got updated after the changes to patch 11?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Andrew Cooper
On 23/07/15 17:00, Jan Beulich wrote:
 On 23.07.15 at 17:55, andrew.coop...@citrix.com wrote:
 On 23/07/15 16:52, Jan Beulich wrote:
 On 23.07.15 at 15:51, andrew.coop...@citrix.com wrote:
 On 23/07/15 12:35, Feng Wu wrote:
 +GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, 
 p);
 +
 +old_ire = new_ire = *p;
 +
 +/* Setup/Update interrupt remapping table entry. */
 +setup_posted_irte(new_ire, pi_desc, gvec);
 +ret = cmpxchg16b(p, old_ire, new_ire);
 +
 +ASSERT(ret == *(__uint128_t *)old_ire);
 This cannot be correct.  Either the cmpxchg() is required and you must
 cope with it failing, or the cmpxchg() is not required and this should
 be a plain write.
 Not exactly: The cmpxchg() is required for this to be an atomic
 128-bit write. And hence I would view the ASSERT() as
 appropriate - it validates that the entry didn't change behind our
 back.
 But p is an active descriptor, which means hardware is liable to change
 it behind our back.
 I inquired about this on the previous round and was told
 hardware doesn't alter the descriptor.

Ah - apologies for not noticing this.

 Comparing this with
 the spec, I couldn't spot any fields that I would suspect
 getting written. Which fields do you have in mind?

None in particular.

I called it out because ASSERT(cmpxchg(...) == old) reads as if it is buggy.

If hardware will genuinely never update the descriptor, then a comment
should be put in here explaining why the assert is safe in this instance.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Stable backport request for new OVMF version

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 17:59, ian.campb...@citrix.com wrote:
 On Thu, 2015-07-23 at 09:37 -0600, Jan Beulich wrote:
  
  It's marked as experimental, not supported and not built by default 
  --
  if that makes you feel more comfortable.
 
 It does. So Ian, feel free to go ahead.
 
 Thanks. I suppose you were addressing Ian J here since I don't
 typically do stable release work.

No, since this was mainly for ARM (for which I recall you doing
backports), I did indeed mean you.

 I'm happy to do the grunt work on this occasion though...

Thanks.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Andrew Cooper
On 23/07/15 17:00, Ian Campbell wrote:
 On Thu, 2015-07-23 at 17:48 +0200, Roger Pau Monné wrote:
 El 23/07/15 a les 17.32, Jan Beulich ha escrit:
 On 23.07.15 at 17:10, roger@citrix.com wrote:
 IMHO introducing a new structure that gets rid of all the PV-only 

 fields seems like a good option:

 struct vcpu_hvm_context {
 #define _VGCF_online   5
 #define VGCF_online(1_VGCF_online)
 uint32_t flags; /* VGCF_* flags  

 */
 struct cpu_hvm_user_regs user_regs; /* User-level CPU 
 registers 
 */
 /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
 uint32_t ctrlreg[8];/* CR0-CR7 (control 
 registers)  
 */
 uint32_t debugreg[8];   /* DB0-DB7 (debug 
 registers)
 */
 };

 I'm also seriously considering getting rid of ctrlreg and 
 debugreg.
 Since HVM VCPUs will always be started in 32bit flat mode, it 
 doesn't
 make sense IMHO to have both the 32 and the 64 version of the 
 registers, so cpu_hvm_user_regs is always going to be:

 struct cpu_hvm_user_regs {
 uint32_t ebx;
 uint32_t ecx;
 uint32_t edx;
 uint32_t esi;
 uint32_t edi;
 uint32_t ebp;
 uint32_t eax;
 uint32_t eip;
 uint32_t esp;
 uint32_t eflags;
 uint16_t cs;
 uint16_t ss;
 uint16_t es;
 uint16_t ds;
 uint16_t fs;
 uint16_t gs;
 };

 We could however do something similar to what's done in ARM and 
 have
 a union of both the 32 and the 64bit registers in case we want to
 start the vCPU in 64bit mode sometime in the future.
 What you gave above is suitable only for VCPUOP_initialise afaict.
 Did you intend this to be the case?
 Certainly not, this should also be used by XEN_DOMCTL_setvcpucontext.
 I'm afraid I'm missing something obvious, but I don't see why this
 couldn't be used by XEN_DOMCTL_setvcpucontext TBH, can you please 
 clarify?
 set/getvcpucontext need to pickle all state for save/restore/migration,
 not just the start of day state.

HVM migration doesn't use set/getvcpucontext.

I would expect an HVMLite-based-PVH to follow suit and just use
set/gethvmcontext.

(In fact, while doing migration v2 I was half thinking of making PV
follow HVM suit somewhat, to simplify the toolstack side of things)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v3 05/13] toolstack: distinguish local and remote migration support

2015-07-23 Thread Ian Campbell
On Wed, 2015-07-22 at 11:13 +0100, Wei Liu wrote:
 
 +# Mode should be either 1 (local) or 0 (remote)
 +our ($whhost, $gn, $mode) = @ARGV;

I think $mode would be better given a more boolean name, in this case
$local.

However that's a minor nit so with or without that changed:

Acked-by: Ian Campbell ian.campb...@citrix.com

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [MINUTES] Monthly Xen.org Technical Call (2015-07-22)

2015-07-23 Thread Ian Campbell
On Thu, 2015-07-23 at 13:17 +0200, Roger Pau Monné wrote:
 El 23/07/15 a les 12.59, Konrad Rzeszutek Wilk ha escrit:
 [...]
  We forgot to speak about dom0. This work outlined will lay out how 
  to do it - but the pieces for dom0 are not implemented and would 
  need work (which actually may be following most of the is_pvh in 
  the hypervisor).
 
 Let's do that once we have DomU nailed down and the guest ABI marked 
 as stable.

IIRC someone on the call pointed out that one large piece of the Dom0
work would be very similar to the PCI passthrough work, so it makes
sense to defer them together.

 
 Some work will indeed be needed for Dom0, for example I would like to
 avoid using the current Dom0 builder (construct_dom0, that is suited for
 PV, but all the PVH hacks just make it almost impossible to understand)
 and start from scratch with a more sane approach.
 
 Roger.
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-23 Thread Jan Beulich
 On 23.07.15 at 12:25, roger@citrix.com wrote:
 El 13/07/15 a les 16.01, Jan Beulich ha escrit:
 On 03.07.15 at 13:34, roger@citrix.com wrote:
 --- a/xen/arch/x86/domain.c
 +++ b/xen/arch/x86/domain.c
 @@ -795,6 +795,15 @@ int arch_set_info_guest(
c.nat-fs_base || c.nat-gs_base_user)) )
  return -EINVAL;
  }
 +else if ( is_hvm_domain(d) )
 +{
 +if ( c(ctrlreg[0]) || c(ctrlreg[1]) || c(ctrlreg[2]) ||
 + c(ctrlreg[3]) || c(ctrlreg[4]) || c(ctrlreg[5]) ||
 + c(ctrlreg[6]) || c(ctrlreg[7]) || c(ldt_base) ||
 + c(ldt_ents) || c(kernel_ss) || c(kernel_sp) ||
 + c(gdt_ents) )
 +return -EINVAL;
 +}
 
 In addition to what Andrew said - is the use of c() here really correct
 considering
 
 compat = is_pv_32bit_domain(d);
 
 #define c(fld) (compat ? (c.cmp-fld) : (c.nat-fld))
 
 near the beginning of the function?
 
 I've been thinking about this. Since HVM vCPUs are always started in
 32bit mode, I would ideally like to use the vcpu_guest_context_x86_32_t
 struct to set the vCPU context. This means that for HVM guests compat
 should always be true.
 
 The problem is that inside of the guest there's no notion of
 vcpu_guest_context_x86_32_t or vcpu_guest_context_x86_64_t, there's only
 vcpu_guest_context, which defaults to the native bitness of the guest.
 So if your guest is running in 64bit mode vcpu_guest_context is aliased
 to vcpu_guest_context_x86_64_t by default.
 
 TBH I'm not sure about the best way to solve this. Should
 vcpu_guest_context_x86_32_t and vcpu_guest_context_x86_64_t be exposed
 to the guest like it's done for the tools?

Why? We're in arch_set_info_guest(), which is unreachable for a
HVM guest on itself. Or is this in preparation of allowing HVM
guests to use VCPUOP_initialise? In that case, exposing might be
an option, but remember that the compat mode layout even today
is used only for PV guests. So I'd rather avoid exposing both
layouts to guests, and do translation instead inside the hypervisor.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic

2015-07-23 Thread Wei Liu
On Thu, Jul 23, 2015 at 12:09:38PM +0100, Andrew Cooper wrote:
 During review of the libxl migration v2 series, I changes the task
 joining logic, but clearly didn't think the result through
 properly. This patch fixes several errors.
 
 1) Do not call check_all_finished() in the success cases of
 libxl__xc_domain_{save,restore}_done().  It serves no specific purpose
 as the save helper state will report itself as inactive by this point,
 and avoids triggering a second stream-completion_callback() in the case
 that write_toolstack_record()/stream_continue() record errors
 synchronously themselves.
 
 2) Only ever set stream-rc in stream_complete().  The first version of
 the migration v2 series had separate rc and joined_rc parameters, where
 this logic worked.  However when combining the two, the teardown path
 fails to trigger if stream_done() records stream-rc itself.  A side
 effect of this is that stream_done() needs to take an rc parameter.
 
 3) Avoid stacking of check_all_finished() via synchronous teardown of
 tasks.  If the _abort() functions call back synchronously,
 stream-completion_callback() ends up getting called twice, as first and
 last check_all_finished() frames observe each task being finished.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 
 ---
 I found this while working to fix the toolstack record issue, but am
 posting this bugfix ahead of the other work as OSSTest has tripped over
 the issue.

This change itself doesn't seem to have anything to do with libxc. In
OSSTest the error that triggers this knock on effect is the failure of
xc_map_foreign_bulk. Does that mean this patch only fix half of the
problem seen in OSSTest?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v4 11/17] vt-d: Add API to update IRTE when VT-d PI is used

2015-07-23 Thread Feng Wu
This patch adds an API which is used to update the IRTE
for posted-interrupt when guest changes MSI/MSI-X information.

CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v4:
- Don't inline setup_posted_irte()
- const struct pi_desc *pi_desc for setup_posted_irte()
- return -EINVAL when pirq_spin_lock_irq_desc() fails.
- Make some variables const
- Release irq desc lock earlier in pi_update_irte()
- Remove the pointless do-while() loop when doing cmpxchg16b()

v3:
- Remove adding PDA_MASK() when updating 'pda_l' and 'pda_h' for IRTE.
- Change the return type of pi_update_irte() to int.
- Remove some pointless printk message in pi_update_irte().
- Use structure assignment instead of memcpy() for irte copy.

 xen/drivers/passthrough/vtd/intremap.c | 107 +
 xen/drivers/passthrough/vtd/iommu.h|   2 +
 xen/include/asm-x86/iommu.h|   2 +
 3 files changed, 111 insertions(+)

diff --git a/xen/drivers/passthrough/vtd/intremap.c 
b/xen/drivers/passthrough/vtd/intremap.c
index b7a42f6..724ba83 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -900,3 +900,110 @@ void iommu_disable_x2apic_IR(void)
 for_each_drhd_unit ( drhd )
 disable_qinval(drhd-iommu);
 }
+
+static void setup_posted_irte(
+struct iremap_entry *new_ire, const struct pi_desc *pi_desc, uint8_t gvec)
+{
+new_ire-post.urg = 0;
+new_ire-post.vector = gvec;
+new_ire-post.pda_l = virt_to_maddr(pi_desc)  (32 - PDA_LOW_BIT);
+new_ire-post.pda_h = virt_to_maddr(pi_desc)  32;
+
+new_ire-post.res_1 = 0;
+new_ire-post.res_2 = 0;
+new_ire-post.res_3 = 0;
+new_ire-post.res_4 = 0;
+new_ire-post.res_5 = 0;
+
+new_ire-post.im = 1;
+}
+
+/*
+ * This function is used to update the IRTE for posted-interrupt
+ * when guest changes MSI/MSI-X information.
+ */
+int pi_update_irte(struct vcpu *v, struct pirq *pirq, uint8_t gvec)
+{
+struct irq_desc *desc;
+const struct msi_desc *msi_desc;
+int remap_index;
+int rc = 0;
+const struct pci_dev *pci_dev;
+const struct acpi_drhd_unit *drhd;
+struct iommu *iommu;
+struct ir_ctrl *ir_ctrl;
+struct iremap_entry *iremap_entries = NULL, *p = NULL;
+struct iremap_entry new_ire, old_ire;
+const struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc;
+unsigned long flags;
+__uint128_t ret;
+
+desc = pirq_spin_lock_irq_desc(pirq, NULL);
+if ( !desc )
+return -EINVAL;
+
+msi_desc = desc-msi_desc;
+if ( !msi_desc )
+{
+rc = -EBADSLT;
+goto unlock_out;
+}
+
+pci_dev = msi_desc-dev;
+if ( !pci_dev )
+{
+rc = -ENODEV;
+goto unlock_out;
+}
+
+remap_index = msi_desc-remap_index;
+
+/*
+ * For performance concern, we will store the 'iommu' pointer in
+ * 'struct msi_desc' in some other place, so we don't need to waste
+ * time searching it here. I will fix this soon.
+ */
+drhd = acpi_find_matched_drhd_unit(pci_dev);
+if ( !drhd )
+{
+rc = -ENODEV;
+goto unlock_out;
+}
+
+iommu = drhd-iommu;
+ir_ctrl = iommu_ir_ctrl(iommu);
+if ( !ir_ctrl )
+{
+rc = -ENODEV;
+goto unlock_out;
+}
+
+spin_unlock_irq(desc-lock);
+
+spin_lock_irqsave(ir_ctrl-iremap_lock, flags);
+
+GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, remap_index, iremap_entries, p);
+
+old_ire = new_ire = *p;
+
+/* Setup/Update interrupt remapping table entry. */
+setup_posted_irte(new_ire, pi_desc, gvec);
+ret = cmpxchg16b(p, old_ire, new_ire);
+
+ASSERT(ret == *(__uint128_t *)old_ire);
+
+iommu_flush_cache_entry(p, sizeof(struct iremap_entry));
+iommu_flush_iec_index(iommu, 0, remap_index);
+
+if ( iremap_entries )
+unmap_vtd_domain_page(iremap_entries);
+
+spin_unlock_irqrestore(ir_ctrl-iremap_lock, flags);
+
+return 0;
+
+ unlock_out:
+spin_unlock_irq(desc-lock);
+
+return rc;
+}
diff --git a/xen/drivers/passthrough/vtd/iommu.h 
b/xen/drivers/passthrough/vtd/iommu.h
index 0be6c2e..692473f 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -330,6 +330,8 @@ struct iremap_entry {
   };
 };
 
+#define PDA_LOW_BIT26
+
 /* Max intr remapping table page order is 8, as max number of IRTEs is 64K */
 #define IREMAP_PAGE_ORDER  8
 
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index e7a65da..2a1523e 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -32,6 +32,8 @@ int iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
 
+int pi_update_irte(struct vcpu *v, struct pirq *pirq, uint8_t gvec);
+
 #endif /* 

[Xen-devel] [v4 13/17] vmx: posted-interrupt handling when vCPU is blocked

2015-07-23 Thread Feng Wu
This patch includes the following aspects:
- Add a global vector to wake up the blocked vCPU
  when an interrupt is being posted to it (This
  part was sugguested by Yang Zhang yang.z.zh...@intel.com).
- Adds a new per-vCPU tasklet to wakeup the blocked
  vCPU. It can be used in the case vcpu_unblock
  cannot be called directly.
- Define two per-cpu variables:
  * pi_blocked_vcpu:
  A list storing the vCPUs which were blocked on this pCPU.

  * pi_blocked_vcpu_lock:
  The spinlock to protect pi_blocked_vcpu.

CC: Kevin Tian kevin.t...@intel.com
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Use local variables in pi_wakeup_interrupt()
- Remove vcpu from the blocked list when pi_desc.on==1, this
- avoid kick vcpu multiple times.
- Remove tasklet

v3:
- This patch is generated by merging the following three patches in v2:
   [RFC v2 09/15] Add a new per-vCPU tasklet to wakeup the blocked vCPU
   [RFC v2 10/15] vmx: Define two per-cpu variables
   [RFC v2 11/15] vmx: Add a global wake-up vector for VT-d Posted-Interrupts
- rename 'vcpu_wakeup_tasklet' to 'pi_vcpu_wakeup_tasklet'
- Move the definition of 'pi_vcpu_wakeup_tasklet' to 'struct arch_vmx_struct'
- rename 'vcpu_wakeup_tasklet_handler' to 'pi_vcpu_wakeup_tasklet_handler'
- Make pi_wakeup_interrupt() static
- Rename 'blocked_vcpu_list' to 'pi_blocked_vcpu_list'
- move 'pi_blocked_vcpu_list' to 'struct arch_vmx_struct'
- Rename 'blocked_vcpu' to 'pi_blocked_vcpu'
- Rename 'blocked_vcpu_lock' to 'pi_blocked_vcpu_lock'

 xen/arch/x86/hvm/vmx/vmcs.c|  3 ++
 xen/arch/x86/hvm/vmx/vmx.c | 63 ++
 xen/include/asm-x86/hvm/vmx/vmcs.h |  3 ++
 xen/include/asm-x86/hvm/vmx/vmx.h  |  5 +++
 4 files changed, 74 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 11dc1b5..0c5ce3f 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -631,6 +631,9 @@ int vmx_cpu_up(void)
 if ( cpu_has_vmx_vpid )
 vpid_sync_all();
 
+INIT_LIST_HEAD(per_cpu(pi_blocked_vcpu, cpu));
+spin_lock_init(per_cpu(pi_blocked_vcpu_lock, cpu));
+
 return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index cd5bfab..d137ea0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,7 +82,15 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
 
+/*
+ * We maintian a per-CPU linked-list of vCPU, so in PI wakeup handler we
+ * can find which vCPU should be waken up.
+ */
+DEFINE_PER_CPU(struct list_head, pi_blocked_vcpu);
+DEFINE_PER_CPU(spinlock_t, pi_blocked_vcpu_lock);
+
 uint8_t __read_mostly posted_intr_vector;
+uint8_t __read_mostly pi_wakeup_vector;
 
 static int vmx_domain_initialise(struct domain *d)
 {
@@ -105,6 +113,9 @@ static int vmx_vcpu_initialise(struct vcpu *v)
 
 spin_lock_init(v-arch.hvm_vmx.vmcs_lock);
 
+INIT_LIST_HEAD(v-arch.hvm_vmx.pi_blocked_vcpu_list);
+INIT_LIST_HEAD(v-arch.hvm_vmx.pi_vcpu_on_set_list);
+
 v-arch.schedule_tail= vmx_do_resume;
 v-arch.ctxt_switch_from = vmx_ctxt_switch_from;
 v-arch.ctxt_switch_to   = vmx_ctxt_switch_to;
@@ -1849,6 +1860,53 @@ static struct hvm_function_table __initdata 
vmx_function_table = {
 .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
 };
 
+/*
+ * Handle VT-d posted-interrupt when VCPU is blocked.
+ */
+static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
+{
+struct arch_vmx_struct *vmx, *tmp;
+struct vcpu *v;
+spinlock_t *lock = this_cpu(pi_blocked_vcpu_lock);
+struct list_head *blocked_vcpus = this_cpu(pi_blocked_vcpu);
+LIST_HEAD(list);
+
+spin_lock(lock);
+
+/*
+ * XXX: The length of the list depends on how many vCPU is current
+ * blocked on this specific pCPU. This may hurt the interrupt latency
+ * if the list grows to too many entries.
+ */
+list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocked_vcpu_list)
+{
+if ( pi_test_on(vmx-pi_desc) )
+{
+list_del_init(vmx-pi_blocked_vcpu_list);
+
+/*
+ * We cannot call vcpu_unblock here, since it also needs
+ * 'pi_blocked_vcpu_lock', we store the vCPUs with ON
+ * set in another list and unblock them after we release
+ * 'pi_blocked_vcpu_lock'.
+ */
+list_add_tail(vmx-pi_vcpu_on_set_list, list);
+}
+}
+
+spin_unlock(lock);
+
+list_for_each_entry_safe(vmx, tmp, list, pi_vcpu_on_set_list)
+{
+v = container_of(vmx, struct vcpu, arch.hvm_vmx);
+list_del_init(vmx-pi_vcpu_on_set_list);
+vcpu_unblock(v);
+}
+
+ack_APIC_irq();
+

[Xen-devel] [v4 08/17] vmx: Initialize VT-d Posted-Interrupts Descriptor

2015-07-23 Thread Feng Wu
This patch initializes the VT-d Posted-interrupt Descriptor.

CC: Kevin Tian kevin.t...@intel.com
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v4:
- No changes

v3:
- Move pi_desc_init() to xen/arch/x86/hvm/vmx/vmcs.c
- Remove the 'inline' flag of pi_desc_init()

 xen/arch/x86/hvm/vmx/vmcs.c   | 18 ++
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 3aff365..11dc1b5 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -40,6 +40,7 @@
 #include asm/flushtlb.h
 #include asm/shadow.h
 #include asm/tboot.h
+#include asm/apic.h
 
 static bool_t __read_mostly opt_vpid_enabled = 1;
 boolean_param(vpid, opt_vpid_enabled);
@@ -921,6 +922,20 @@ void virtual_vmcs_vmwrite(void *vvmcs, u32 vmcs_encoding, 
u64 val)
 virtual_vmcs_exit(vvmcs);
 }
 
+static void pi_desc_init(struct vcpu *v)
+{
+uint32_t dest;
+
+v-arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
+
+dest = cpu_physical_id(v-processor);
+
+if ( x2apic_enabled )
+v-arch.hvm_vmx.pi_desc.ndst = dest;
+else
+v-arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest, PI_xAPIC_NDST_MASK);
+}
+
 static int construct_vmcs(struct vcpu *v)
 {
 struct domain *d = v-domain;
@@ -1054,6 +1069,9 @@ static int construct_vmcs(struct vcpu *v)
 
 if ( cpu_has_vmx_posted_intr_processing )
 {
+if ( iommu_intpost )
+pi_desc_init(v);
+
 __vmwrite(PI_DESC_ADDR, virt_to_maddr(v-arch.hvm_vmx.pi_desc));
 __vmwrite(POSTED_INTR_NOTIFICATION_VECTOR, posted_intr_vector);
 }
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h 
b/xen/include/asm-x86/hvm/vmx/vmx.h
index 2eb7cbd..b53973e 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -89,6 +89,8 @@ typedef enum {
 #define EPT_EMT_WB  6
 #define EPT_EMT_RSV27
 
+#define PI_xAPIC_NDST_MASK  0xFF00
+
 void vmx_asm_vmexit_handler(struct cpu_user_regs);
 void vmx_asm_do_vmentry(void);
 void vmx_intr_assist(void);
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v4 12/17] Update IRTE according to guest interrupt config changes

2015-07-23 Thread Feng Wu
When guest changes its interrupt configuration (such as, vector, etc.)
for direct-assigned devices, we need to update the associated IRTE
with the new guest vector, so external interrupts from the assigned
devices can be injected to guests without VM-Exit.

For lowest-priority interrupts, we use vector-hashing mechamisn to find
the destination vCPU. This follows the hardware behavior, since modern
Intel CPUs use vector hashing to handle the lowest-priority interrupt.

For multicast/broadcast vCPU, we cannot handle it via interrupt posting,
still use interrupt remapping.

CC: Jan Beulich jbeul...@suse.com
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Make some 'int' variables 'unsigned int' in pi_find_dest_vcpu()
- Make 'dest_id' uint32_t
- Rename 'size' to 'bitmap_array_size'
- find_next_bit() and find_first_bit() always return unsigned int,
  so no need to check whether the return value is less than 0.
- Message error level XENLOG_G_WARNING - XENLOG_G_INFO
- Remove useless warning message
- Create a seperate function vector_hashing_dest() to find the
- destination of lowest-priority interrupts.
- Change some comments

v3:
- Use bitmap to store the all the possible destination vCPUs of an
  interrupt, then trying to find the right destination from the bitmap
- Typo and some small changes

 xen/drivers/passthrough/io.c | 124 ++-
 1 file changed, 123 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 9b77334..546b962 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -26,6 +26,7 @@
 #include asm/hvm/iommu.h
 #include asm/hvm/support.h
 #include xen/hvm/irq.h
+#include asm/io_apic.h
 
 static DEFINE_PER_CPU(struct list_head, dpci_list);
 
@@ -199,6 +200,108 @@ void free_hvm_irq_dpci(struct hvm_irq_dpci *dpci)
 xfree(dpci);
 }
 
+/*
+ * This routine handles lowest-priority interrupts using vector-hashing
+ * mechanism. As an example, modern Intel CPUs use this method to handle
+ * lowest-priority interrupts.
+ *
+ * Here is the details about the vector-hashing mechanism:
+ * 1. For lowest-priority interrupts, store all the possible destination
+ *vCPUs in an array.
+ * 2. Use gvec % max number of destination vCPUs to find the right
+ *destination vCPU in the array for the lowest-priority interrupt.
+ */
+static struct vcpu *vector_hashing_dest(const struct domain *d,
+uint32_t dest_id,
+bool_t dest_mode,
+uint8_t gvec)
+
+{
+unsigned long *dest_vcpu_bitmap;
+unsigned int dest_vcpu_num = 0, idx;
+unsigned int bitmap_array_size = BITS_TO_LONGS(d-max_vcpus);
+struct vcpu *v, *dest = NULL;
+unsigned int i;
+
+dest_vcpu_bitmap = xzalloc_array(unsigned long, bitmap_array_size);
+if ( !dest_vcpu_bitmap )
+{
+dprintk(XENLOG_G_INFO,
+dom%d: failed to allocate memory\n, d-domain_id);
+return NULL;
+}
+
+for_each_vcpu ( d, v )
+{
+if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
+dest_id, dest_mode) )
+continue;
+
+__set_bit(v-vcpu_id, dest_vcpu_bitmap);
+dest_vcpu_num++;
+}
+
+if ( dest_vcpu_num != 0 )
+{
+idx = 0;
+
+for ( i = gvec % dest_vcpu_num; i = 0; i--)
+{
+idx = find_next_bit(dest_vcpu_bitmap, d-max_vcpus, idx) + 1;
+BUG_ON(idx = d-max_vcpus);
+}
+idx--;
+
+dest = d-vcpu[idx];
+}
+
+xfree(dest_vcpu_bitmap);
+
+return dest;
+}
+
+/*
+ * The purpose of this routine is to find the right destination vCPU for
+ * an interrupt which will be delivered by VT-d posted-interrupt. There
+ * are several cases as below:
+ *
+ * - For lowest-priority interrupts, use vector-hashing mechanism to find
+ *   the destination.
+ * - Otherwise, for single destination interrupt, it is straightforward to
+ *   find the destination vCPU and return true.
+ * - For multicast/broadcast vCPU, we cannot handle it via interrupt posting,
+ *   so return NULL.
+ */
+static struct vcpu *pi_find_dest_vcpu(const struct domain *d, uint32_t dest_id,
+  bool_t dest_mode, uint8_t delivery_mode,
+  uint8_t gvec)
+{
+unsigned int dest_vcpu_num = 0;
+struct vcpu *v, *dest = NULL;
+
+if ( delivery_mode == dest_LowestPrio )
+return vector_hashing_dest(d, dest_id, dest_mode, gvec);
+
+for_each_vcpu ( d, v )
+{
+if ( !vlapic_match_dest(vcpu_vlapic(v), NULL, 0,
+dest_id, dest_mode) )
+continue;
+
+dest_vcpu_num++;
+dest = v;
+}
+
+/*
+ * For fixed destination, we only handle single-destination
+ * interrupts.
+ */
+if ( dest_vcpu_num == 1 )
+   return dest;
+
+

[Xen-devel] [v4 14/17] vmx: Properly handle notification event when vCPU is running

2015-07-23 Thread Feng Wu
When a vCPU is running in Root mode and a notification event
has been injected to it. we need to set VCPU_KICK_SOFTIRQ for
the current cpu, so the pending interrupt in PIRR will be
synced to vIRR before VM-Exit in time.

CC: Kevin Tian kevin.t...@intel.com
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v4:
- Coding style.

v3:
- Make pi_notification_interrupt() static

 xen/arch/x86/hvm/vmx/vmx.c | 47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d137ea0..f2b7fe0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1907,6 +1907,51 @@ static void pi_wakeup_interrupt(struct cpu_user_regs 
*regs)
 this_cpu(irq_count)++;
 }
 
+/* Handle VT-d posted-interrupt when VCPU is running. */
+static void pi_notification_interrupt(struct cpu_user_regs *regs)
+{
+/*
+ * We get here when a vCPU is running in root-mode (such as via hypercall,
+ * or any other reasons which can result in VM-Exit), and before vCPU is
+ * back to non-root, external interrupts from an assigned device happen
+ * and a notification event is delivered to this logical CPU.
+ *
+ * we need to set VCPU_KICK_SOFTIRQ for the current cpu, just like
+ * __vmx_deliver_posted_interrupt(). So the pending interrupt in PIRR will
+ * be synced to vIRR before VM-Exit in time.
+ *
+ * Please refer to the following code fragments from
+ * xen/arch/x86/hvm/vmx/entry.S:
+ *
+ * .Lvmx_do_vmentry
+ *
+ *  ..
+ *  point 1
+ *
+ *  cmp  %ecx,(%rdx,%rax,1)
+ *  jnz  .Lvmx_process_softirqs
+ *
+ *  ..
+ *
+ *  je   .Lvmx_launch
+ *
+ *  ..
+ *
+ * .Lvmx_process_softirqs:
+ *  sti
+ *  call do_softirq
+ *  jmp  .Lvmx_do_vmentry
+ *
+ * If VT-d engine issues a notification event at point 1 above, it cannot
+ * be delivered to the guest during this VM-entry without raising the
+ * softirq in this notification handler.
+ */
+raise_softirq(VCPU_KICK_SOFTIRQ);
+
+ack_APIC_irq();
+this_cpu(irq_count)++;
+}
+
 const struct hvm_function_table * __init start_vmx(void)
 {
 set_in_cr4(X86_CR4_VMXE);
@@ -1944,7 +1989,7 @@ const struct hvm_function_table * __init start_vmx(void)
 
 if ( cpu_has_vmx_posted_intr_processing )
 {
-alloc_direct_apic_vector(posted_intr_vector, event_check_interrupt);
+alloc_direct_apic_vector(posted_intr_vector, 
pi_notification_interrupt);
 
 if ( iommu_intpost )
 alloc_direct_apic_vector(pi_wakeup_vector, pi_wakeup_interrupt);
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v4 16/17] vmx: Add some scheduler hooks for VT-d posted interrupts

2015-07-23 Thread Feng Wu
This patch adds the following arch hooks in scheduler:
- vmx_pre_ctx_switch_pi():
It is called before context switch, we update the posted
interrupt descriptor when the vCPU is preempted, go to sleep,
or is blocked.

- vmx_post_ctx_switch_pi()
It is called after context switch, we update the posted
interrupt descriptor when the vCPU is going to run.

- arch_vcpu_wake()
It will be called when waking up the vCPU, we update
the posted interrupt descriptor when the vCPU is unblocked.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Kevin Tian kevin.t...@intel.com
CC: George Dunlap george.dun...@eu.citrix.com
CC: Dario Faggioli dario.faggi...@citrix.com
Sugguested-by: Dario Faggioli dario.faggi...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Newly added

 xen/arch/x86/domain.c  |  10 +++
 xen/arch/x86/hvm/vmx/vmx.c | 145 +
 xen/common/schedule.c  |   2 +
 xen/include/asm-x86/domain.h   |   3 +
 xen/include/asm-x86/hvm/hvm.h  |   2 +
 xen/include/asm-x86/hvm/vmx/vmcs.h |   8 ++
 6 files changed, 170 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index db073a6..36fd2d5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1550,9 +1550,19 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 
 set_current(next);
 
+/*
+ * We need to update posted interrupt descriptor for each context switch,
+ * hence cannot use the lazy context switch for this.
+ */
+if ( !is_idle_vcpu(prev)  prev-arch.pi_ctxt_switch_from )
+prev-arch.pi_ctxt_switch_from(prev);
+
 if ( (per_cpu(curr_vcpu, cpu) == next) ||
  (is_idle_vcpu(next)  cpu_online(cpu)) )
 {
+if ( !is_idle_vcpu(next)  next-arch.pi_ctxt_switch_to )
+next-arch.pi_ctxt_switch_to(next);
+
 local_irq_enable();
 }
 else
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index f2b7fe0..f8b3601 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -67,6 +67,8 @@ enum handler_return { HNDL_done, HNDL_unhandled, 
HNDL_exception_raised };
 
 static void vmx_ctxt_switch_from(struct vcpu *v);
 static void vmx_ctxt_switch_to(struct vcpu *v);
+static void vmx_pre_ctx_switch_pi(struct vcpu *v);
+static void vmx_post_ctx_switch_pi(struct vcpu *v);
 
 static int  vmx_alloc_vlapic_mapping(struct domain *d);
 static void vmx_free_vlapic_mapping(struct domain *d);
@@ -116,10 +118,17 @@ static int vmx_vcpu_initialise(struct vcpu *v)
 INIT_LIST_HEAD(v-arch.hvm_vmx.pi_blocked_vcpu_list);
 INIT_LIST_HEAD(v-arch.hvm_vmx.pi_vcpu_on_set_list);
 
+v-arch.hvm_vmx.pi_block_cpu = -1;
+
+spin_lock_init(v-arch.hvm_vmx.pi_lock);
+
 v-arch.schedule_tail= vmx_do_resume;
 v-arch.ctxt_switch_from = vmx_ctxt_switch_from;
 v-arch.ctxt_switch_to   = vmx_ctxt_switch_to;
 
+v-arch.pi_ctxt_switch_from = vmx_pre_ctx_switch_pi;
+v-arch.pi_ctxt_switch_to = vmx_post_ctx_switch_pi;
+
 if ( (rc = vmx_create_vmcs(v)) != 0 )
 {
 dprintk(XENLOG_WARNING,
@@ -715,6 +724,141 @@ static void vmx_fpu_leave(struct vcpu *v)
 }
 }
 
+void arch_vcpu_wake(struct vcpu *v)
+{
+unsigned long gflags;
+
+if ( !iommu_intpost || !is_hvm_vcpu(v) || !has_arch_pdevs(v-domain) )
+return;
+
+spin_lock_irqsave(v-arch.hvm_vmx.pi_lock, gflags);
+
+if ( likely(vcpu_runnable(v)) ||
+ !test_bit(_VPF_blocked, v-pause_flags) )
+{
+struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc;
+unsigned long flags;
+
+/*
+ * We don't need to send notification event to a non-running
+ * vcpu, the interrupt information will be delivered to it before
+ * VM-ENTRY when the vcpu is scheduled to run next time.
+ */
+pi_set_sn(pi_desc);
+
+/*
+ * Set 'NV' feild back to posted_intr_vector, so the
+ * Posted-Interrupts can be delivered to the vCPU by
+ * VT-d HW after it is scheduled to run.
+ */
+write_atomic((uint8_t*)pi_desc-nv, posted_intr_vector);
+
+/*
+ * Delete the vCPU from the related block list
+ * if we are resuming from blocked state
+ */
+if ( v-arch.hvm_vmx.pi_block_cpu != -1 )
+{
+spin_lock_irqsave(per_cpu(pi_blocked_vcpu_lock,
+  v-arch.hvm_vmx.pi_block_cpu), flags);
+list_del_init(v-arch.hvm_vmx.pi_blocked_vcpu_list);
+spin_unlock_irqrestore(per_cpu(pi_blocked_vcpu_lock,
+v-arch.hvm_vmx.pi_block_cpu), flags);
+}
+}
+
+spin_unlock_irqrestore(v-arch.hvm_vmx.pi_lock, gflags);
+}
+
+static void vmx_pre_ctx_switch_pi(struct vcpu *v)
+{
+struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc;
+struct pi_desc old, new;
+unsigned long flags, gflags;
+
+if ( 

[Xen-devel] [v4 17/17] VT-d: Dump the posted format IRTE

2015-07-23 Thread Feng Wu
Add the utility to dump the posted format IRTE.

CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Newly added

 xen/drivers/passthrough/vtd/utils.c | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/utils.c 
b/xen/drivers/passthrough/vtd/utils.c
index a5fe237..7f73a84 100644
--- a/xen/drivers/passthrough/vtd/utils.c
+++ b/xen/drivers/passthrough/vtd/utils.c
@@ -211,6 +211,9 @@ static void dump_iommu_info(unsigned char key)
 ecap_intr_remap(iommu-ecap) ?  : not ,
 (status  DMA_GSTS_IRES) ?  and enabled :  );
 
+printk(  Interrupt Posting: %ssupported.\n,
+cap_intr_post(iommu-ecap) ?  : not );
+
 if ( status  DMA_GSTS_IRES )
 {
 /* Dump interrupt remapping table. */
@@ -221,6 +224,7 @@ static void dump_iommu_info(unsigned char key)
 
 printk(  Interrupt remapping table (nr_entry=%#x. 
 Only dump P=1 entries here):\n, nr_entry);
+printk (Entries for remapped format:\n);
 printk(   SVT  SQ   SID  DST  V  AVL DLM TM RH DM 
FPD P\n);
 for ( i = 0; i  nr_entry; i++ )
@@ -238,7 +242,7 @@ static void dump_iommu_info(unsigned char key)
 else
 p = iremap_entries[i % (1  IREMAP_ENTRY_ORDER)];
 
-if ( !p-remap.p )
+if ( !p-remap.p || p-remap.im )
 continue;
 printk(  %04x:  %x   %x  %04x %08x %02x%x   %x  %x  %x  
%x
%x %x\n, i,
@@ -248,6 +252,41 @@ static void dump_iommu_info(unsigned char key)
 (u32)p-remap.dm, (u32)p-remap.fpd, (u32)p-remap.p);
 print_cnt++;
 }
+
+if ( iremap_entries )
+unmap_vtd_domain_page(iremap_entries);
+
+iremap_entries = NULL;
+printk (\nEntries for posted format:\n);
+printk(   SVT  SQ   SID  PDA  V  URG AVL FPD 
P\n);
+for ( i = 0; i  nr_entry; i++ )
+{
+struct iremap_entry *p;
+if ( i % (1  IREMAP_ENTRY_ORDER) == 0 )
+{
+/* This entry across page boundry */
+if ( iremap_entries )
+unmap_vtd_domain_page(iremap_entries);
+
+GET_IREMAP_ENTRY(iremap_maddr, i,
+ iremap_entries, p);
+}
+else
+p = iremap_entries[i % (1  IREMAP_ENTRY_ORDER)];
+
+if ( !p-post.p || !p-post.im )
+continue;
+
+printk(  %04x:  %x   %x  %04x %16lx %02x%x   %x  %x  
%x\n,
+i,
+(u32)p-post.svt, (u32)p-post.sq, (u32)p-post.sid,
+(((u64)p-post.pda_h  32) | (p-post.pda_l  6)),
+(u32)p-post.vector, (u32)p-post.urg,
+(u32)p-post.avail, (u32)p-post.fpd, (u32)p-post.p);
+
+print_cnt++;
+}
+
 if ( iremap_entries )
 unmap_vtd_domain_page(iremap_entries);
 if ( iommu_ir_ctrl(iommu)-iremap_num != print_cnt )
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v4 10/17] vt-d: Extend struct iremap_entry to support VT-d Posted-Interrupts

2015-07-23 Thread Feng Wu
Extend struct iremap_entry according to VT-d Posted-Interrupts Spec.

CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Feng Wu feng...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v4:
- res_4 is not a bitfiled, correct it.
- Expose 'im' to remapped irte as well.

v3:
- Use u32 instead of u64 to define the bitfields in 'struct iremap_entry'
- Limit using bitfield if possible

 xen/drivers/passthrough/vtd/intremap.c | 92 +-
 xen/drivers/passthrough/vtd/iommu.h| 43 ++--
 xen/drivers/passthrough/vtd/utils.c| 10 ++--
 3 files changed, 81 insertions(+), 64 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c 
b/xen/drivers/passthrough/vtd/intremap.c
index 0333686..b7a42f6 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -123,9 +123,9 @@ static u16 hpetid_to_bdf(unsigned int hpet_id)
 static void set_ire_sid(struct iremap_entry *ire,
 unsigned int svt, unsigned int sq, unsigned int sid)
 {
-ire-hi.svt = svt;
-ire-hi.sq = sq;
-ire-hi.sid = sid;
+ire-remap.svt = svt;
+ire-remap.sq = sq;
+ire-remap.sid = sid;
 }
 
 static void set_ioapic_source_id(int apic_id, struct iremap_entry *ire)
@@ -220,7 +220,7 @@ static unsigned int alloc_remap_entry(struct iommu *iommu, 
unsigned int nr)
 else
 p = iremap_entries[i % (1  IREMAP_ENTRY_ORDER)];
 
-if ( p-lo_val || p-hi_val ) /* not a free entry */
+if ( p-lo || p-hi ) /* not a free entry */
 found = 0;
 else if ( ++found == nr )
 break;
@@ -254,7 +254,7 @@ static int remap_entry_to_ioapic_rte(
 GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, index,
  iremap_entries, iremap_entry);
 
-if ( iremap_entry-hi_val == 0  iremap_entry-lo_val == 0 )
+if ( iremap_entry-hi == 0  iremap_entry-lo == 0 )
 {
 dprintk(XENLOG_ERR VTDPREFIX,
 %s: index (%d) get an empty entry!\n,
@@ -264,13 +264,13 @@ static int remap_entry_to_ioapic_rte(
 return -EFAULT;
 }
 
-old_rte-vector = iremap_entry-lo.vector;
-old_rte-delivery_mode = iremap_entry-lo.dlm;
-old_rte-dest_mode = iremap_entry-lo.dm;
-old_rte-trigger = iremap_entry-lo.tm;
+old_rte-vector = iremap_entry-remap.vector;
+old_rte-delivery_mode = iremap_entry-remap.dlm;
+old_rte-dest_mode = iremap_entry-remap.dm;
+old_rte-trigger = iremap_entry-remap.tm;
 old_rte-__reserved_2 = 0;
 old_rte-dest.logical.__reserved_1 = 0;
-old_rte-dest.logical.logical_dest = iremap_entry-lo.dst  8;
+old_rte-dest.logical.logical_dest = iremap_entry-remap.dst  8;
 
 unmap_vtd_domain_page(iremap_entries);
 spin_unlock_irqrestore(ir_ctrl-iremap_lock, flags);
@@ -318,27 +318,28 @@ static int ioapic_rte_to_remap_entry(struct iommu *iommu,
 if ( rte_upper )
 {
 if ( x2apic_enabled )
-new_ire.lo.dst = value;
+new_ire.remap.dst = value;
 else
-new_ire.lo.dst = (value  24)  8;
+new_ire.remap.dst = (value  24)  8;
 }
 else
 {
 *(((u32 *)new_rte) + 0) = value;
-new_ire.lo.fpd = 0;
-new_ire.lo.dm = new_rte.dest_mode;
-new_ire.lo.tm = new_rte.trigger;
-new_ire.lo.dlm = new_rte.delivery_mode;
+new_ire.remap.fpd = 0;
+new_ire.remap.dm = new_rte.dest_mode;
+new_ire.remap.tm = new_rte.trigger;
+new_ire.remap.dlm = new_rte.delivery_mode;
 /* Hardware require RH = 1 for LPR delivery mode */
-new_ire.lo.rh = (new_ire.lo.dlm == dest_LowestPrio);
-new_ire.lo.avail = 0;
-new_ire.lo.res_1 = 0;
-new_ire.lo.vector = new_rte.vector;
-new_ire.lo.res_2 = 0;
+new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
+new_ire.remap.avail = 0;
+new_ire.remap.res_1 = 0;
+new_ire.remap.vector = new_rte.vector;
+new_ire.remap.res_2 = 0;
 
 set_ioapic_source_id(IO_APIC_ID(apic), new_ire);
-new_ire.hi.res_1 = 0;
-new_ire.lo.p = 1; /* finally, set present bit */
+new_ire.remap.res_3 = 0;
+new_ire.remap.res_4 = 0;
+new_ire.remap.p = 1; /* finally, set present bit */
 
 /* now construct new ioapic rte entry */
 remap_rte-vector = new_rte.vector;
@@ -511,7 +512,7 @@ static int remap_entry_to_msi_msg(
 GET_IREMAP_ENTRY(ir_ctrl-iremap_maddr, index,
  iremap_entries, iremap_entry);
 
-if ( iremap_entry-hi_val == 0  iremap_entry-lo_val == 0 )
+if ( iremap_entry-hi == 0  iremap_entry-lo == 0 )
 {
 dprintk(XENLOG_ERR VTDPREFIX,
 %s: index (%d) get an empty entry!\n,
@@ -524,25 +525,25 @@ static int remap_entry_to_msi_msg(
 msg-address_hi = MSI_ADDR_BASE_HI;
 msg-address_lo =
 MSI_ADDR_BASE_LO |
-((iremap_entry-lo.dm == 0) 

[Xen-devel] [v4 15/17] arm: add a dummy arch hooks for scheduler

2015-07-23 Thread Feng Wu
Add a dummy arch hooks for scheduler to make the build successful
in ARM.

CC: Ian Campbell ian.campb...@citrix.com
CC: Stefano Stabellini stefano.stabell...@citrix.com
CC: Tim Deegan t...@xen.org
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Newly added

 xen/include/asm-arm/domain.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index f1a087e..7f25fd5 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -265,6 +265,8 @@ static inline unsigned int domain_max_vcpus(const struct 
domain *d)
 return MAX_VIRT_CPUS;
 }
 
+static void arch_vcpu_wake(struct vcpu *v) {}
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v4 03/17] Add cmpxchg16b support for x86-64

2015-07-23 Thread Feng Wu
This patch adds cmpxchg16b support for x86-64, so software
can perform 128-bit atomic write/read.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Feng Wu feng...@intel.com
---
v4:
- Use pointer as the parameter of __cmpxchg16b().
- Use gcc's __uint128_t built-in type
- Make the parameters of __cmpxchg16b() void *

v3:
- Newly added.

 xen/include/asm-x86/x86_64/system.h | 32 
 1 file changed, 32 insertions(+)

diff --git a/xen/include/asm-x86/x86_64/system.h 
b/xen/include/asm-x86/x86_64/system.h
index 662813a..0267797 100644
--- a/xen/include/asm-x86/x86_64/system.h
+++ b/xen/include/asm-x86/x86_64/system.h
@@ -6,6 +6,38 @@
(unsigned long)(n),sizeof(*(ptr
 
 /*
+ * Atomic 16 bytes compare and exchange.  Compare OLD with MEM, if
+ * identical, store NEW in MEM.  Return the initial value in MEM.
+ * Success is indicated by comparing RETURN with OLD.
+ *
+ * This function can only be called when cpu_has_cx16 is ture.
+ */
+
+static always_inline __uint128_t __cmpxchg16b(
+volatile void *ptr, void *old, void *new)
+{
+uint64_t prev_high, prev_low;
+uint64_t new_high = (*(__uint128_t *)new)  64;
+uint64_t new_low = (uint64_t)(*(__uint128_t *)new);
+uint64_t old_high = (*(__uint128_t *)old)  64;
+uint64_t old_low = (uint64_t)(*(__uint128_t *)old);
+
+ASSERT(cpu_has_cx16);
+
+asm volatile ( lock; cmpxchg16b %4
+   : =d (prev_high), =a (prev_low)
+   : c (new_high), b (new_low),
+   m (*__xg((volatile void *)ptr)),
+   0 (old_high), 1 (old_low)
+   : memory );
+
+return ( ((__uint128_t)prev_high)  64 | prev_low );
+}
+
+#define cmpxchg16b(ptr,o,n) \
+__cmpxchg16b((ptr), (__uint128_t *)(o), (__uint128_t *)(n))
+
+/*
  * This function causes value _o to be changed to _n at location _p.
  * If this access causes a fault then we return 1, otherwise we return 0.
  * If no fault occurs then _o is updated to the value we saw at _p. If this
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   3   >